
Deep Learning

Deep learning is an artificial intelligence (AI) function that imitates the workings of the human brain in processing data and creating patterns for use in decision making. Deep learning is a subset of machine learning in artificial intelligence whose networks are capable of learning unsupervised from data that is unstructured or unlabeled. It is also known as deep neural learning or a deep neural network.

What is Deep Learning

Deep learning (also known as deep structured learning) is part of a broader family of machine learning methods based on artificial neural networks with representation learning. Learning can be supervised, semi-supervised, or unsupervised.
Deep learning architectures such as deep neural networks, deep belief networks, recurrent neural networks, and convolutional neural networks have been applied to fields including computer vision, machine vision, speech recognition, natural language processing, audio recognition, social network filtering, machine translation, bioinformatics, drug design, medical image analysis, material inspection, and board game programs, where they have produced results comparable to and in some cases surpassing human expert performance.
Artificial neural networks (ANNs) were inspired by information processing and distributed communication nodes in biological systems. ANNs have various differences from biological brains. Specifically, neural networks tend to be static and symbolic, while the biological brain of most living organisms is dynamic (plastic) and analog.


History of Deep Learning

The history of deep learning dates back to 1943, when Warren McCulloch and Walter Pitts created a computer model based on the neural networks of the human brain. They used a combination of mathematics and algorithms they called threshold logic to mimic the thought process. Since then, deep learning has evolved steadily, with two significant breaks in its development.

Henry J. Kelley is given credit for developing the basics of a continuous Back Propagation Model in 1960. In 1962, a simpler version based only on the chain rule was developed by Stuart Dreyfus. While the concept of backpropagation (the backward propagation of errors for purposes of training) did exist in the early 1960s, it was clumsy and inefficient, and would not become useful until 1985.

The earliest efforts to develop deep learning algorithms came from Alexey Grigoryevich Ivakhnenko (who developed the Group Method of Data Handling) and Valentin Grigorʹevich Lapa (author of Cybernetics and Forecasting Techniques) in 1965. They used models with polynomial (complicated equation) activation functions, which were then analyzed statistically. From each layer, the best statistically chosen features were forwarded on to the next layer (a slow, manual process).

Types of Deep Learning

Deep Belief Net

A deep belief network is a solution to the problem of handling non-convex objective functions and local minima that arise when using the typical multilayer perceptron. It is an alternative type of deep learning consisting of multiple layers of latent variables with connections between the layers. A deep belief network can be viewed as a stack of restricted Boltzmann machines (RBMs), where each subnetwork's hidden layer acts as the visible input layer for the next subnetwork. The lowest visible layer thus serves as a training set for the adjacent layer of the network. This way, each layer of the network is trained independently and greedily: the hidden variables of one layer are used as the observed variables to train the next layer of the deep structure. The training algorithm for such a deep belief network is as follows:
  • Consider a vector of inputs
  • Train a restricted Boltzmann machine using the input vector and obtain the weight matrix
  • Train the lower two layers of the network using this weight matrix
  • Generate new input vector by using the network (RBM) through sampling or mean activation of the hidden units
  • Repeat the procedure till the top two layers of the network are reached
The fine-tuning of a deep belief network is very similar to that of the multilayer perceptron. Such deep belief networks are useful in acoustic modeling.
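The greedy, layer-by-layer procedure above can be sketched in NumPy. This is a minimal illustration only: biases, momentum, and the final fine-tuning pass are omitted, and the update uses 1-step contrastive divergence, the standard way an RBM's weight matrix is obtained in practice.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=50, lr=0.1, rng=None):
    """Train one restricted Boltzmann machine with 1-step contrastive divergence."""
    if rng is None:
        rng = np.random.default_rng(0)
    n_visible = data.shape[1]
    W = rng.normal(0, 0.01, (n_visible, n_hidden))
    for _ in range(epochs):
        # Positive phase: infer hidden units from the input vector.
        h_prob = sigmoid(data @ W)
        h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
        # Negative phase: reconstruct the visibles, then re-infer the hiddens.
        v_recon = sigmoid(h_sample @ W.T)
        h_recon = sigmoid(v_recon @ W)
        # Contrastive-divergence weight update.
        W += lr * (data.T @ h_prob - v_recon.T @ h_recon) / len(data)
    return W

def train_dbn(data, layer_sizes):
    """Greedy layer-wise training: each RBM's hidden activations
    become the visible input of the next RBM in the stack."""
    weights, layer_input = [], data
    for n_hidden in layer_sizes:
        W = train_rbm(layer_input, n_hidden)
        weights.append(W)
        # Generate the new input vector via mean activation of the hidden units.
        layer_input = sigmoid(layer_input @ W)
    return weights

X = (np.random.default_rng(1).random((32, 20)) > 0.5).astype(float)
weights = train_dbn(X, layer_sizes=[16, 8])
```

Each loop iteration in `train_dbn` corresponds to one bullet cycle above: train an RBM, keep its weight matrix, and feed its mean hidden activations forward as the next layer's input.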

Recurrent Neural Networks

The convolutional model works on a fixed number of inputs and generates a fixed-size vector as output in a predefined number of steps. Recurrent networks, by contrast, allow us to operate over sequences of vectors at both input and output. In a recurrent neural network, the connections between units form a directed cycle. Unlike in a traditional neural network, the inputs and outputs are not independent but related. Further, the recurrent neural network shares the same parameters at every step. One can train a recurrent network using the backpropagation method, much as with a traditional neural network.

Here, the calculation of the gradient depends not only on the current step but also on previous steps. A variant called the bidirectional recurrent neural network is also used for many applications; it considers not only the previous inputs but also the expected future output. In both simple and bidirectional recurrent neural networks, deep learning can be achieved by introducing multiple hidden layers. Such deep networks provide higher learning capacity given large amounts of training data. Speech, image processing, and natural language processing are some of the candidate areas where recurrent neural networks can be used.
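A minimal forward pass makes the two defining properties concrete: the hidden state forms the directed cycle that carries history between steps, and the same three weight matrices are reused at every position in the sequence (parameter sharing). The sizes and random weights below are arbitrary illustrations, not trained values.

```python
import numpy as np

def rnn_forward(inputs, W_xh, W_hh, W_hy):
    """Run a vanilla RNN over a sequence of input vectors.
    The same weight matrices are applied at every time step."""
    h = np.zeros(W_hh.shape[0])
    outputs = []
    for x in inputs:                      # one step per sequence element
        h = np.tanh(W_xh @ x + W_hh @ h)  # hidden state carries past context
        outputs.append(W_hy @ h)          # an output at every step
    return outputs, h

rng = np.random.default_rng(0)
W_xh = rng.normal(0, 0.1, (8, 4))   # input -> hidden
W_hh = rng.normal(0, 0.1, (8, 8))   # hidden -> hidden (the directed cycle)
W_hy = rng.normal(0, 0.1, (3, 8))   # hidden -> output
sequence = [rng.random(4) for _ in range(5)]
outputs, final_h = rnn_forward(sequence, W_xh, W_hh, W_hy)
```

Training would backpropagate through these steps (backpropagation through time), which is why the gradient at one step depends on all previous steps.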


Autoencoders

An autoencoder is an artificial neural network that is capable of learning various coding patterns. The simple form of the autoencoder is just like the multilayer perceptron, containing an input layer, one or more hidden layers, and an output layer. The significant difference between a typical multilayer perceptron or feedforward neural network and an autoencoder lies in the number of nodes at the output layer: in an autoencoder, the output layer contains the same number of nodes as the input layer. Instead of predicting target values, the autoencoder has to predict its own inputs. The broad outline of the learning mechanism is as follows.
For each input x:
  • Do a feedforward pass to compute the activations at all hidden layers and the output layer
  • Find the deviation between the calculated values and the inputs using an appropriate error function
  • Backpropagate the error to update the weights
  • Repeat until the output is satisfactory
If the number of nodes in the hidden layers is smaller than the number of input/output nodes, then the activations of the last hidden layer can be considered a compressed representation of the inputs. When there are more hidden-layer nodes than input nodes, an autoencoder can potentially learn the identity function and become useless in the majority of cases.
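The learning mechanism above can be sketched as a one-hidden-layer autoencoder in NumPy. This is a simplified illustration (no biases, plain gradient descent, squared-error loss assumed); the bottleneck of 3 hidden units against 8 inputs is what forces a compressed representation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_autoencoder(X, n_hidden, epochs=200, lr=0.5, seed=0):
    """Output layer has as many nodes as the input layer,
    and the training target is the input itself."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    W1 = rng.normal(0, 0.1, (n, n_hidden))   # encoder weights
    W2 = rng.normal(0, 0.1, (n_hidden, n))   # decoder weights
    for _ in range(epochs):
        H = sigmoid(X @ W1)          # feedforward pass: hidden activations
        Y = sigmoid(H @ W2)          # feedforward pass: output activations
        err = Y - X                  # deviation between outputs and inputs
        # Backpropagate the error to update both weight matrices.
        dY = err * Y * (1 - Y)
        dH = (dY @ W2.T) * H * (1 - H)
        W2 -= lr * H.T @ dY / len(X)
        W1 -= lr * X.T @ dH / len(X)
    return W1, W2

X = np.eye(8)                        # 8 one-hot input patterns
W1, W2 = train_autoencoder(X, n_hidden=3)
codes = sigmoid(X @ W1)              # compressed representation of the inputs
```

Because 3 hidden units must reconstruct 8-dimensional inputs, the network cannot simply copy its input through; it has to learn a compact code.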

Convolutional Neural Networks

A convolutional neural network (CNN) is another variant of the feedforward multilayer perceptron. It is a type of feedforward neural network in which the individual neurons are ordered so that they respond to overlapping regions of the visual field.
A deep CNN works by consecutively modeling small pieces of information and combining them deeper in the network. One way to understand this is that the first layer will try to identify edges and form templates for edge detection. Subsequent layers will then try to combine these into more complex shapes, and eventually into templates for different object positions, illuminations, scales, etc. The final layers will match an input image against all the templates, and the final prediction is like a weighted sum of all of them. Thus, deep CNNs can model complex variations and behavior, giving highly accurate predictions.
Such a network follows the visual mechanism of living organisms. The cells in the visual cortex are sensitive to small subregions of the visual field, called receptive fields. The subregions are arranged to cover the entire visual area, and the cells act as local filters over the input space. The backpropagation algorithm is used to train the parameters of each convolution kernel, and each kernel is replicated over the entire image with the same parameters. These convolutional operators extract distinctive features of the input. Besides the convolutional layer, the network contains a rectified linear unit layer, pooling layers that compute the max or average value of a feature over a region of the image, and a loss layer consisting of application-specific loss functions. Image recognition, video analysis, and natural language processing are major applications of such networks.
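The three core layer types described above (convolution with a shared kernel, rectified linear unit, and max pooling) can be sketched directly in NumPy. The kernel below is an arbitrary toy edge detector, not a trained one; a real CNN learns its kernel values via backpropagation.

```python
import numpy as np

def conv2d(image, kernel):
    """Slide one kernel over the image; the same parameters are
    replicated at every position (weight sharing / local filters)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Each output value sees only a small receptive field.
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Take the max value of the feature over each size x size region."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    return fmap[:h*size, :w*size].reshape(h, size, w, size).max(axis=(1, 3))

image = np.random.default_rng(0).random((8, 8))
edge_kernel = np.array([[1.0, -1.0]])             # crude vertical-edge detector
fmap = np.maximum(conv2d(image, edge_kernel), 0)  # ReLU layer: clamp negatives
pooled = max_pool(fmap)                           # pooling layer
```

Stacking many such convolution/ReLU/pooling stages, each feeding the next, is what lets deeper layers build complex templates out of the simple edge responses computed here.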

Top 6 Applications of Deep Learning

Voice Search & Voice-Activated Assistants

One of the most popular usage areas of deep learning is voice search and voice-activated intelligent assistants. With big tech giants having already made significant investments in this area, voice-activated assistants can now be found on nearly every smartphone. Apple's Siri has been on the market since October 2011. Google Now, the voice-activated assistant for Android, was launched less than a year after Siri. The newest of the voice-activated intelligent assistants is Microsoft's Cortana.

Image recognition

A common evaluation set for image classification is the MNIST database. MNIST is composed of handwritten digits and includes 60,000 training examples and 10,000 test examples. As with the TIMIT speech corpus, its small size lets users test multiple configurations. A comprehensive list of results on this set is available.
Deep learning-based image recognition has become “superhuman”, producing more accurate results than human contestants. This first occurred in 2011.
Deep learning-trained vehicles now interpret 360° camera views. Another example is Facial Dysmorphology Novel Analysis (FDNA) used to analyze cases of human malformation connected to a large database of genetic syndromes.

Automatic Machine Translation

This is the task of automatically translating given words, phrases, or sentences from one language into another.
Automatic machine translation has been around for a long time, but deep learning is achieving top results in two specific areas:
  1. Automatic Translation of Text
  2. Automatic Translation of Images
Text translation can be performed without any pre-processing of the sequence, allowing the algorithm to learn the dependencies between words and their mapping to a new language.

Visual art processing

Closely related to the progress that has been made in image recognition is the increasing application of deep learning techniques to various visual art tasks. DNNs have proven themselves capable, for example, of a) identifying the style period of a given painting, b) Neural Style Transfer – capturing the style of a given artwork and applying it in a visually pleasing manner to an arbitrary photograph or video, and c) generating striking imagery based on random visual input fields.

Automatic Colorization

Image colorization is the problem of adding color to black-and-white photographs. Deep learning can use the objects and their context within the photograph to color the image, much as a human operator might approach the problem. This capability leverages the large, high-quality convolutional neural networks trained for ImageNet and co-opted for the problem of image colorization. Generally, the approach involves very large convolutional neural networks with supervised layers that recreate the image with the addition of color.

Mobile advertising

Finding the appropriate audience for mobile advertising is always challenging, since many data points must be considered and analyzed before a target segment can be created and used in ad serving by an ad server. Deep learning has been used to interpret large, many-dimensional advertising datasets. Many data points are collected during the request/serve/click internet advertising cycle, and this information can form the basis of machine learning to improve ad selection.
