Deep learning is a subset of machine learning based on neural networks that process data using a method inspired by the human brain. In traditional computing, a program directs the computer to perform a task by following explicit step-by-step instructions. In deep learning, the computer is not explicitly told how to solve a particular task. Instead, it uses a learning algorithm to extract patterns in the data that relate the input data to the desired output. Deep learning models can learn to perform classification tasks by recognizing complex patterns in various forms of data, such as images, text, and audio. They can also be used to automate tasks that previously required human intelligence, such as describing images or transcribing audio files.
The use of deep learning has seen significant growth with the development of deeply layered neural networks, the use of GPUs to accelerate execution, and access to large datasets for training. Deep learning technology is driving many AI applications across a range of fields, including computer vision, natural language processing, speech recognition, recommendation engines, digital assistants, text generation, and industrial automation, as well as emerging technologies such as self-driving cars and virtual reality.
Deep learning isn't a single approach but rather a class of algorithms that can be applied to a broad spectrum of problems. There are many architectures and algorithms used in deep learning. These can be divided into supervised deep learning (convolutional neural networks, recurrent neural networks, long short-term memory networks, and gated recurrent networks) and unsupervised deep learning (self-organizing maps, autoencoders, and restricted Boltzmann machines).
French mathematician Adrien-Marie Legendre published what is now often referred to as a linear neural network in 1805. Johann Carl Friedrich Gauss was later also credited for similar unpublished work done in 1795. Legendre's neural network consisted of two layers: an input and an output. Each input unit holds a number and is connected to the output by a weight. The output is the sum of the products of the inputs and their weights. Given a training set of input vectors and a desired target value for each of them, the weights are adjusted so that the sum of the squared errors between the outputs and the corresponding targets is minimized.
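As an illustration of the procedure just described, the sketch below fits such a two-layer linear model with NumPy; the input vectors and target values are invented purely for the example, and the closed-form least-squares solver stands in for whatever fitting procedure was used historically.

```python
import numpy as np

# A "linear neural network" in the sense described above: the output is the
# weighted sum of the inputs, and the weights are chosen to minimize the sum
# of squared errors over a training set. All values are made up for illustration.
X = np.array([[1.0, 2.0],
              [2.0, 0.5],
              [3.0, 1.0],
              [4.0, 3.0]])            # input vectors, one row per example
t = np.array([5.0, 3.0, 5.5, 10.0])   # desired target value for each example

# Closed-form least-squares solution for the weights
w, *_ = np.linalg.lstsq(X, t, rcond=None)

outputs = X @ w                       # output = sum of (input * weight)
sse = np.sum((outputs - t) ** 2)      # the sum of squared errors being minimized
print(w, sse)
```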
The first non-learning recurrent neural network architecture was introduced and analyzed by physicists Ernst Ising and Wilhelm Lenz in the 1920s. It settles into an equilibrium state in response to input conditions. Warren McCulloch and Walter Pitts proposed an artificial neuron (a computational model of the “nerve net” in the brain) in 1943. In 1958, American psychologist Frank Rosenblatt introduced the idea of the perceptron, a device that mimicked the neural structure of the brain and demonstrated an ability to learn. His multilayer perceptrons (MLPs) had a non-learning first layer with randomized weights and an adaptive output layer. While only the last layer could "learn," Rosenblatt developed ideas that would go on to become extreme learning machines. In 1962, Rosenblatt would also write about "back-propagating errors" in MLPs with a hidden layer. However, he would not go on to develop a general deep learning algorithm using MLPs.
Soviet mathematician Alexey Ivakhnenko (with help from his associate V.G. Lapa) demonstrated successful learning in deep feedforward network architectures in 1965. Their work introduced the first general learning algorithms for deep MLPs with arbitrarily many hidden layers. They went on to publish a paper in 1971 describing a deep learning net with eight layers, trained by their method. Using a training set of input vectors with corresponding target output vectors, layers are incrementally grown and trained by regression analysis. Similar to later deep neural networks, Ivakhnenko's nets learned to create hierarchical, distributed, internal representations of incoming data.
In 1986, Geoffrey Hinton, along with colleagues David Rumelhart and Ronald Williams, published the back-propagation training algorithm as a method of training multilayer neural networks. However, some in the field credit Finnish mathematician Seppo Linnainmaa with inventing back-propagation in 1970. Yann LeCun pioneered the use of neural networks for image recognition tasks, and in his 1998 paper he defined the concept of convolutional neural networks, which mimic the human visual cortex. Earlier, in 1982, John Hopfield had popularized recurrent neural networks with what became known as the Hopfield network. Jürgen Schmidhuber and Sepp Hochreiter expanded on this line of work, introducing long short-term memory (LSTM) in 1997. The architecture improved the efficiency of recurrent neural networks.
In 2000, Yoshua Bengio introduced high-dimensional word embeddings as a representation of word meaning. His group also introduced a form of attention mechanism that led to breakthroughs in machine translation. In 2012, Hinton and two of his students (Alex Krizhevsky and Ilya Sutskever) demonstrated the power of deep learning with significant results in the ImageNet competition. Their work was based on a dataset collated by Fei-Fei Li and others. Around the same time, Jeffrey Dean and Andrew Ng made significant breakthroughs in large-scale image recognition at Google Brain.
Yoshua Bengio, Geoffrey Hinton, and Yann LeCun were the recipients of the 2018 ACM A.M. Turing Award for "conceptual and engineering breakthroughs that have made deep neural networks a critical component of computing."
Deep learning is a branch of AI that enables a system to automatically recognize salient features at multiple levels of abstraction in any dataset given to it. This differs from most other AI systems, which use mathematical modeling and computation to process a dataset but have to be trained with known data annotations.
For example, a non-deep-learning AI program for identifying handwritten numbers typically needs a learning algorithm, a training dataset with annotations (e.g., figure A is a '9'), and a test dataset to verify accuracy. The system uses the annotations to learn which inputs correspond to the appropriate outputs. Often the training and normal-use datasets have to be heavily preprocessed to achieve high accuracy, and there are many domain-specific shortcuts that can be used as well.
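As a sketch of this annotation-driven workflow, the example below uses scikit-learn and its small built-in handwritten-digit dataset; the particular library and model (logistic regression) are assumptions chosen purely for illustration.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

digits = load_digits()                     # images plus annotations ("this image is a 9")
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

model = LogisticRegression(max_iter=2000)  # the learning algorithm
model.fit(X_train, y_train)                # learn input -> output from the annotated training set
print(model.score(X_test, y_test))         # verify accuracy on the held-out test dataset
```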
Deep learning can be defined as a neural network with three or more layers. While a single-layer neural network can make approximate predictions, adding hidden layers helps refine the analysis and improve the accuracy of the outputs. Neural networks attempt to simulate the behavior of the human brain, allowing deep learning models to "learn" from large amounts of data. The name "deep learning" comes from the fact that it requires neural networks with additional layers to learn from the data.
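A minimal sketch of such a network is shown below, assuming NumPy and made-up layer sizes; it only runs the forward pass through an input layer, two hidden layers, and an output layer, with training covered further down.

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.normal(size=(1, 8))                        # one input example with 8 features

W1, b1 = rng.normal(size=(8, 16)), np.zeros(16)    # input layer  -> hidden layer 1
W2, b2 = rng.normal(size=(16, 16)), np.zeros(16)   # hidden 1     -> hidden layer 2
W3, b3 = rng.normal(size=(16, 3)), np.zeros(3)     # hidden 2     -> output layer (3 classes)

relu = lambda z: np.maximum(z, 0)                  # nonlinear transformation at each layer

h1 = relu(x @ W1 + b1)                             # first level of abstraction
h2 = relu(h1 @ W2 + b2)                            # second level, built on the first
scores = h2 @ W3 + b3                              # raw output scores (weights are untrained here)
print(scores)
```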
Deep learning differs from classical machine learning in the type of data it uses and the methods by which it learns. Machine learning leverages structured, labeled data to make predictions; specific features are defined in the input data for the model. To use unstructured data, machine learning generally requires some form of pre-processing and organization. Deep learning removes much of this pre-processing, enabling the ingestion and processing of unstructured data. Deep learning also automates feature extraction, removing some of the reliance on human experts. For example, a deep learning algorithm could identify features in order to classify images in a dataset without human intervention. Then, using processes such as gradient descent and backpropagation, the algorithm adjusts its weights and improves in accuracy. In contrast, classical machine learning would require a hierarchy of features defined manually by a human expert.
Deep learning neural networks are constructed of multiple layers of software nodes (artificial neurons) mimicking the interconnected neurons of the human brain. Deep learning models can learn by example, clarifying complex abstractions by building a hierarchy in which each level of abstraction is created using knowledge gained from the preceding layer. Each layer builds upon the last to refine and optimize predictions and classifications. Deep learning performs nonlinear transformations on its input to create a statistical model as an output. Iterations continue until the output reaches an acceptable level of accuracy. Achieving this level requires a deep learning model that has been trained on a large dataset and significant processing power.
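The sketch below makes this loop concrete, with the gradient descent and backpropagation steps mentioned above written out by hand in NumPy; the toy task (XOR), layer sizes, learning rate, and iteration count are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training data: the XOR function (chosen only because it needs a hidden layer).
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

W1, b1 = rng.normal(scale=0.5, size=(2, 8)), np.zeros((1, 8))
W2, b2 = rng.normal(scale=0.5, size=(8, 1)), np.zeros((1, 1))
lr = 1.0
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for step in range(10_000):                 # iterate until the error is acceptably small
    # Forward pass: nonlinear transformations, layer by layer.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    loss = np.mean((out - y) ** 2)

    # Backward pass (backpropagation): push the error back through each layer.
    d_out = 2 * (out - y) / len(X) * out * (1 - out)
    d_W2, d_b2 = h.T @ d_out, d_out.sum(axis=0, keepdims=True)
    d_h = (d_out @ W2.T) * h * (1 - h)
    d_W1, d_b1 = X.T @ d_h, d_h.sum(axis=0, keepdims=True)

    # Gradient descent: adjust the weights to reduce the error.
    W1 -= lr * d_W1; b1 -= lr * d_b1
    W2 -= lr * d_W2; b2 -= lr * d_b2

# Predictions should approach [0, 1, 1, 0]; more iterations or a different
# random seed may be needed depending on the initialization.
print(np.round(out, 2), loss)
```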
In contrast to the annotation-heavy approach described earlier, a deep learning program simply needs a lot of data. It gradually recognizes patterns in the data which, out of context, seemingly have no meaning. For the handwritten numbers, the algorithm might first notice areas of light and darkness, or lines that are vertical, horizontal, or curved. These observations are then passed into a higher layer, which automatically combines the line observations into shapes. This process can continue with progressively higher levels of abstraction, for example into numbers, then into groups of numbers (e.g., phone numbers or other arrangements with meaning). The way this is completed can be optimized in various ways, such as through gradient descent.
A deep learning framework is an interface, library, or tool that allows users to build deep learning models more easily and quickly, without getting into the details of the underlying algorithms. Such libraries are useful for people who want to implement deep learning techniques but don't have robust fluency in back-propagation, linear algebra, or numerical computing. These libraries provide pre-written code for functions and modules that can be reused for deep learning training for different purposes.
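The sketch below shows what this looks like in practice, using PyTorch purely as one example of such a framework; the layer sizes and the dummy batch of data are assumptions for illustration.

```python
import torch
from torch import nn

# The layers are declared, not hand-coded; back-propagation and gradient
# descent are provided by the framework.
model = nn.Sequential(
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Linear(32, 10),
)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(16, 64)            # a dummy batch of 16 examples with 64 features
y = torch.randint(0, 10, (16,))    # dummy integer class labels

for _ in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)    # forward pass and error measurement
    loss.backward()                # back-propagation, handled by the library
    optimizer.step()               # one gradient descent update
```

These few lines replace the hand-written backward pass shown in the earlier NumPy sketch, which is the convenience such frameworks are meant to provide.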