
DEEP LEARNING

UNIT-1
Topics: Introduction to Deep Learning, Historical Trends in Deep Learning,
Deep Feed-forward Networks, Gradient-Based Learning, Hidden Units,
Architecture Design, Back-propagation Networks.

Introduction to Deep Learning


What is Deep Learning?
Before we go deeper into deep learning, its applications, and platforms, the first thing this
introduction to deep learning tutorial will help you understand is what exactly deep learning is.
Deep learning is a subfield of machine learning that deals with algorithms inspired by the structure
and function of the brain. Deep learning is a subset of machine learning, which in turn is a part of
artificial intelligence (AI).

Artificial intelligence is the ability of a machine to imitate intelligent human behaviour. Machine learning allows
a system to learn and improve from experience automatically.
Deep learning is an application of machine learning that uses complex algorithms and deep neural nets to train a
model.

Artificial intelligence
"It is a branch of computer science by which we can create intelligent machines which can behave like a
human, think like humans, and able to make decisions."

Artificial Intelligence exists when a machine can have human based skills such as learning, reasoning,
and solving problems

Importance of Artificial Intelligence

Following are some main reasons to learn about AI:

o With the help of AI, you can create software or devices that can solve real-world problems
easily and accurately, such as health issues, marketing, traffic issues, etc.
o With the help of AI, you can create your own personal virtual assistant, such as Cortana, Google
Assistant, Siri, etc.
o With the help of AI, you can build robots that can work in environments where human
survival would be at risk.
o AI opens a path for other new technologies, new devices, and new opportunities.

Applications of Artificial Intelligence

Following are some sectors which have the application of Artificial Intelligence:

1. AI in Astronomy
2. AI in Healthcare
3. AI in Gaming
4. AI in Finance
5. AI in Data Security
6. AI in Social Media
7. AI in Travel & Transport
8. AI in Automotive Industry
9. AI in Robotics
10. AI in Entertainment
11. AI in Agriculture
12. AI in E-commerce
13. AI in Education

Advantages of Artificial Intelligence


Following are some main advantages of Artificial Intelligence:

o High accuracy with fewer errors: AI machines or systems make fewer errors and achieve
high accuracy because they take decisions based on prior experience or information.
o High speed: AI systems can make decisions at very high speed; because of this, AI
systems can beat a chess champion at chess.
o High reliability: AI machines are highly reliable and can perform the same action
multiple times with high accuracy.
o Useful for risky areas: AI machines can be helpful in situations such as defusing a
bomb or exploring the ocean floor, where employing a human would be risky.
o Digital assistant: AI can be very useful for providing digital assistance to users; for example,
AI technology is currently used by various e-commerce websites to show products that match
customer requirements.
o Useful as a public utility: AI can be very useful for public utilities, such as self-driving cars
that make our journeys safer and hassle-free, facial recognition for security purposes, and
natural language processing to communicate with humans in human language.

Disadvantages of Artificial Intelligence


Every technology has some disadvantages, and the same goes for artificial intelligence. Even though
it is a highly advantageous technology, it still has some disadvantages that we need to keep in mind
while creating an AI system. Following are the disadvantages of AI:

o High cost: The hardware and software requirements of AI are very costly, as AI systems require a
lot of maintenance to meet current-world requirements.
o Can't think out of the box: Even though we are making smarter machines with AI, they still cannot
work outside what they were built for; a robot will only do the work for which it was trained or programmed.
o No feelings and emotions: An AI machine can be an outstanding performer, but it has no feelings,
so it cannot form any kind of emotional attachment with humans, and it may sometimes be harmful
to users if proper care is not taken.
o Increased dependency on machines: As technology advances, people are becoming more dependent
on devices and hence are exercising their own mental capabilities less.
o No original creativity: Humans are creative and can imagine new ideas, but AI machines cannot
match this power of human intelligence and cannot be creative and imaginative.

Machine learning

Machine learning enables a machine to automatically learn from data, improve performance
from experiences, and predict things without being explicitly programmed.

A Machine Learning system learns from historical data, builds the prediction models, and
whenever it receives new data, predicts the output for it. The accuracy of predicted output depends
upon the amount of data, as the huge amount of data helps to build a better model which predicts the
output more accurately.

Machine learning has changed the way we think about solving problems: instead of writing explicit
rules, we train a model on data and then use it to make predictions, as sketched below.
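To make the idea concrete, here is a minimal sketch, assuming scikit-learn is available; the tiny
dataset and the choice of a linear model are made up purely for illustration. The model is fitted to
historical data and then predicts the output for new data it has not seen before.

```python
# A minimal machine-learning sketch: learn from historical data, predict on new data.
# Assumes scikit-learn is installed; the data below is made up for illustration.
from sklearn.linear_model import LinearRegression

# Historical data: hours studied (feature) and exam score (target).
X_train = [[1.0], [2.0], [3.0], [4.0], [5.0]]
y_train = [52, 58, 65, 71, 78]

model = LinearRegression()
model.fit(X_train, y_train)          # build the prediction model from past data

print(model.predict([[6.0]]))        # predict the output for new, unseen data
```

The more (and better) historical data the model sees, the more accurate its predictions tend to be,
which is exactly the point made above.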

Features of Machine Learning:


o Machine learning uses data to detect various patterns in a given dataset.
o It can learn from past data and improve automatically.
o It is a data-driven technology.
o Machine learning is similar to data mining, as both deal with huge amounts of data.

Classification of Machine Learning

At a broad level, machine learning can be classified into three types:


1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning

Applications of Machine learning

1. Image Recognition
2. Speech Recognition
3. Traffic Prediction
4. Product Recommendations
5. Self-driving Cars
6. Email Spam and Malware Filtering
7. Virtual Personal Assistant
8. Online Fraud Detection
9. Stock Market Trading
10. Medical Diagnosis

Deep Learning
Applications of Deep Learning
Deep learning is widely used to make predictions about natural events such as rain, earthquakes, and
tsunamis, which helps in taking the necessary precautions.

With deep learning, machines can comprehend speech and provide the required output. It enables
machines to recognize people and objects in the images fed to them.

Deep learning models also help advertisers leverage data to perform real-time bidding and targeted
display advertising.

In the next section of this tutorial, we will cover the need for and importance of deep learning.

Importance of Deep Learning

Machine learning works only with sets of structured and semi-structured data, while deep learning works
with both structured and unstructured data.

Deep learning algorithms can perform complex operations efficiently, while machine learning
algorithms cannot.

Machine learning algorithms use labelled sample data to extract patterns, while deep learning accepts
large volumes of data as input and analyses the input data to extract features of an object.

Artificial neural networks


Artificial neural networks are built on the principles of the structure and operation of human
neurons. They are also known as neural networks or neural nets.

An artificial neural network’s input layer, which is the first layer, receives input from external
sources and passes it on to the hidden layer, which is the second layer.

Each neuron in the hidden layer gets information from the neurons in the previous layer, computes the
weighted total, and then transfers it to the neurons in the next layer.

These connections are weighted, which means that the impact of each input from the preceding layer
is scaled up or down by giving it a distinct weight. These weights are then adjusted during the
training process to enhance the performance of the model.
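A minimal sketch of this computation, assuming NumPy and small made-up weights, shows one hidden
layer receiving inputs, forming the weighted total, and passing the result on to the next layer:

```python
import numpy as np

# Toy forward pass through one hidden layer and one output layer.
# The sizes and weight values are made up for illustration.
x = np.array([0.5, -1.2, 3.0])            # input layer: 3 external inputs

W1 = np.random.randn(3, 4) * 0.1          # weighted connections input -> hidden
b1 = np.zeros(4)
hidden = np.maximum(0, x @ W1 + b1)       # weighted total + nonlinearity

W2 = np.random.randn(4, 2) * 0.1          # weighted connections hidden -> output
b2 = np.zeros(2)
output = hidden @ W2 + b2                 # values passed to the next layer

print(output)
```

Training would then adjust W1, W2, b1, and b2 so that the output moves closer to the desired values.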

Difference between Machine Learning and Deep Learning :

Machine learning and deep learning are both subsets of artificial intelligence, but there are many
similarities and differences between them.

Machine Learning vs. Deep Learning:

o Machine learning applies statistical algorithms to learn the hidden patterns and relationships in
the dataset; deep learning uses artificial neural network architectures to learn them.
o Machine learning can work on a smaller amount of data; deep learning requires a larger volume of
data.
o Machine learning is better for low-label tasks; deep learning is better for complex tasks like
image processing, natural language processing, etc.
o Machine learning takes less time to train a model; deep learning takes more time.
o In machine learning, a model is created from relevant features that are manually extracted from
images to detect an object in the image; in deep learning, relevant features are automatically
extracted from images in an end-to-end learning process.
o Machine learning models are less complex and their results are easy to interpret; deep learning
models are more complex, work like a black box, and their results are not easy to interpret.
o Machine learning can work on a CPU or requires less computing power; deep learning requires a
high-performance computer with a GPU.

Challenges in Deep Learning


Deep learning has made significant advancements in various fields, but there are still some
challenges that need to be addressed. Here are some of the main challenges in deep learning:
1. Data availability: Deep learning requires large amounts of data to learn from, and gathering
enough data for training is a major concern.
2. Computational resources: Training a deep learning model is computationally expensive because it
requires specialized hardware like GPUs and TPUs.
3. Time-consuming: Training, especially on sequential data, can take a very long time, even days or
months, depending on the computational resources available.
4. Interpretability: Deep learning models are complex and work like a black box, so it is very
difficult to interpret their results.
5. Overfitting: When the model is trained again and again, it becomes too specialized for the
training data, leading to overfitting and poor performance on new data.

Advantages of Deep Learning:

1. High accuracy:
Deep Learning algorithms can achieve state-of-the-art performance in various tasks, such as
image recognition and natural language processing.
2. Automated feature engineering:
Deep Learning algorithms can automatically discover and learn relevant features from data
without the need for manual feature engineering.
3. Scalability:
Deep Learning models can scale to handle large and complex datasets, and can learn
from massive amounts of data.
4. Flexibility:
Deep Learning models can be applied to a wide range of tasks and can handle
various types of data, such as images, text, and speech.
5. Continual improvement:
Deep Learning models can continually improve their performance as more data becomes
available.
Disadvantages of Deep Learning:

1. High computational requirements:


Deep Learning models require large amounts of data and computational resources to train and
optimize.
2. Requires large amounts of labeled data:
Deep Learning models often require a large amount of labeled data for training, which can be
expensive and time-consuming to acquire.
3. Interpretability:
Deep Learning models can be challenging to interpret, making it difficult to understand how
they make decisions.
4. Overfitting:
Deep Learning models can sometimes overfit to the training data, resulting in poor
performance on new and unseen data.
5. Black-box nature:
Deep Learning models are often treated as black boxes, making it difficult to understand how
they work and how they arrived at their predictions.

In summary, while Deep Learning offers many advantages, including high accuracy and
scalability, it also has some disadvantages, such as high computational requirements, the need for large
amounts of labeled data, and interpretability challenges. These limitations need to be carefully
considered when deciding whether to use Deep Learning for a specific task.

Historical Trends in Deep Learning


Deep learning is a more evolved branch of machine learning that uses layers of algorithms
to process data, imitate the thinking process, and develop abstractions.

The history of deep learning can be traced back to 1943, when Walter Pitts and Warren
McCulloch created a computer model based on the neural networks of the human brain.

They used a combination of algorithms and mathematics they called "threshold logic" to
mimic the thought process. Since that time, deep learning has evolved steadily, with only
two significant breaks in its development, both tied to the infamous artificial intelligence
winters.

The 1960s
Henry J. Kelley is given credit for developing the basics of a continuous backpropagation
model in 1960.
The 1970s
During the 1970’s the first AI winter kicked in, the result of promises that couldn’t be kept.
The impact of this lack of funding limited both DL and AI research. Fortunately, there were
individuals who carried on the research without funding.
The 1980s and 90s
In 1989, Yann LeCun provided the first practical demonstration of backpropagation at Bell
Labs. He combined convolutional neural networks with backpropagation to read
"handwritten" digits. This system was eventually used to read the numbers on handwritten
checks.

This time is also when the second AI winter (1985-1990s) kicked in, which also affected
research on neural networks and deep learning.

2000-2010
Around the year 2000, the vanishing gradient problem appeared. It was discovered that
"features" (lessons) formed in lower layers were not being learned by the upper layers,
because no learning signal reached those layers.

2011-2020
By 2011, the speed of GPUs had increased significantly, making it possible to train
convolutional neural networks without layer-by-layer pre-training. With the increased
computing speed, it became obvious that deep learning had significant advantages in terms of
efficiency and speed.

Convolutional neural network architectures won several international competitions during 2011
and 2012. Rectified linear units were used to enhance speed, and dropout was used to reduce
overfitting. The Generative Adversarial Network (GAN) was introduced in 2014. GAN was
created by Ian Goodfellow. With GAN, two neural networks play against each other in a
game.

The Future of Deep learning and Business


Deep learning has provided image-based product searches (eBay, Etsy) and efficient ways
to inspect products on the assembly line. The first supports consumer convenience, while the
second improves quality control in manufacturing. Currently, the evolution of artificial
intelligence depends on deep learning, and deep learning is still evolving and in need of
creative ideas.

Semantics technology is being used with deep learning to take artificial intelligence to the next level,
providing more natural sounding, human-like conversations.

Banks and financial services are using deep learning to automate trading, reduce risk, detect fraud, and
provide AI/chatbot advice to investors. A report from the EIU (Economist Intelligence Unit) suggests 86%
of financial services are planning to increase their artificial intelligence investments by 2025.

Deep learning and artificial intelligence are influencing the creation of new business models. These
businesses are creating new corporate cultures that embrace deep learning, artificial intelligence, and
modern technology

Deep Feed- forward networks


What is a Feed Forward Neural Network

A feed forward neural network is an artificial neural network in which the connections between nodes
do not form a cycle. The opposite of a feed forward neural network is a recurrent neural network, in
which certain pathways are cycled. The feed forward model is the simplest form of neural network, as
information is only processed in one direction. While the data may pass through multiple hidden nodes, it
always moves in one direction and never backwards.

How does a Feed Forward Neural Network work?

A Feed Forward Neural Network is commonly seen in its simplest form as a single layer
perceptron. In this model, a series of inputs enter the layer and are multiplied by the weights. Each
value is then added together to get a sum of the weighted input values. If the sum of the values is
above a specific threshold, usually set at zero, the
value produced is often 1, whereas if the sum falls below the threshold, the output value is -1. The
single layer perceptron is an important model of feed forward neural networks and is often used in
classification tasks.

Furthermore, single layer perceptrons can incorporate aspects of machine learning. Using a
property known as the delta rule, the neural network can compare the outputs of its nodes with the
intended values, thus allowing the network to adjust its weights through training in order to
produce more accurate output values. This process of training and learning produces a form of
gradient descent. In multi-layered perceptrons, the process of updating weights is nearly
analogous; however, the process is defined more specifically as back-propagation. In such cases,
each hidden layer within the network is adjusted according to the output values produced by the
final layer.
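The following sketch, using NumPy and a made-up, linearly separable toy dataset, illustrates the
single layer perceptron described above: a weighted sum is thresholded at zero to produce +1 or -1,
and a delta-style update nudges the weights toward the intended values.

```python
import numpy as np

# Single layer perceptron with a threshold at zero and outputs of +1 / -1.
# The tiny dataset below (AND-like labels in {-1, +1}) is made up for illustration.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([-1, -1, -1, 1])                  # intended values

w = np.zeros(2)
b = 0.0
lr = 0.1                                       # learning rate

def predict(x):
    s = x @ w + b                              # weighted sum of the inputs
    return 1 if s > 0 else -1                  # threshold at zero

for epoch in range(20):
    for x_i, t_i in zip(X, t):
        y_i = predict(x_i)
        # Delta-rule style update: compare the output with the intended value
        # and adjust the weights to reduce the difference.
        w += lr * (t_i - y_i) * x_i
        b += lr * (t_i - y_i)

print([predict(x_i) for x_i in X])             # matches t after training
```

In a multi-layered perceptron the same idea is carried further by back-propagation, which adjusts
every hidden layer according to the error produced at the final layer.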
Applications of Feed Forward Neural Networks

While Feed Forward Neural Networks are fairly straightforward, their simplified architecture can
be used as an advantage in particular machine learning applications.
For example, one may set up a series of feed forward neural networks with the intention of running
them independently from each other, but with a mild intermediary for moderation. Like the human
brain, this process relies on many individual neurons in order to handle and process larger tasks. As
the individual networks perform their tasks independently, the results can be combined at the end
to produce a synthesized, and cohesive output.

Gradient-Based Learning
1. Gradient-Based Learning
Gradient-based learning is a type of machine learning in which the optimization algorithm uses gradients to
update the model parameters during training.
This approach is commonly used in deep learning and neural networks because it allows the
model to learn complex representations of the input data.

2. Implementing Gradient Descent:

Gradient descent is a common optimization algorithm used in gradient-based learning to update the
model parameters.
There are several variants of gradient descent, such as batch, stochastic, and mini-batch gradient
descent.
Batch gradient descent computes the gradient over the entire training set, while stochastic gradient
descent computes the gradient over a single training example.
Mini-batch gradient descent computes the gradient over a small subset of the training set.
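A minimal NumPy sketch of mini-batch gradient descent on a toy linear-regression problem (the data,
batch size, and learning rate are made up for illustration); batch and stochastic gradient descent
correspond to setting the batch size to the full dataset or to 1, respectively.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear-regression data: y = 3*x + 2 plus noise (made up for illustration).
X = rng.uniform(-1, 1, size=(200, 1))
y = 3 * X[:, 0] + 2 + 0.1 * rng.normal(size=200)

w, b = 0.0, 0.0
lr = 0.1
batch_size = 16          # full dataset -> batch GD, 1 -> stochastic GD

for epoch in range(100):
    idx = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]
        xb, yb = X[batch, 0], y[batch]
        err = (w * xb + b) - yb
        # Gradients of the mean squared error over this mini-batch only.
        grad_w = 2 * np.mean(err * xb)
        grad_b = 2 * np.mean(err)
        w -= lr * grad_w          # update parameters using the gradients
        b -= lr * grad_b

print(round(w, 2), round(b, 2))   # should be close to 3 and 2
```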
3. Vanishing and Exploding Gradients:
Vanishing and exploding gradients are common issues that can occur during gradient-based
learning in deep neural networks.
Vanishing gradients occur when the gradients become very small as they propagate through the
network, which can make it difficult for the model to learn long-term dependencies.
Exploding gradients occur when the gradients become very large as they propagate through the
network, which can cause the optimization process to become unstable and lead to numerical
issues.
Techniques such as weight initialization, activation functions, and gradient clipping can be used to
help mitigate these issues.
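As a sketch of one of these mitigations, gradient clipping by global norm can be written in a few
lines of NumPy (the threshold value below is arbitrary):

```python
import numpy as np

def clip_by_global_norm(grads, max_norm=1.0):
    """Rescale a list of gradient arrays if their combined norm is too large.

    This is a common remedy for exploding gradients: the update direction is
    kept, but its magnitude is capped at max_norm.
    """
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    if total_norm > max_norm:
        scale = max_norm / (total_norm + 1e-12)
        grads = [g * scale for g in grads]
    return grads

# Example: an artificially "exploding" gradient gets rescaled to norm <= 1.
grads = [np.array([30.0, -40.0]), np.array([5.0])]
clipped = clip_by_global_norm(grads, max_norm=1.0)
print([g.tolist() for g in clipped])
```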
4. Sentiment Analysis:
Sentiment analysis is a natural language processing task that involves identifying the sentiment
or emotion expressed in a piece of text.
Deep learning techniques, such as convolutional neural networks (CNNs) and recurrent
neural networks (RNNs), have been successfully applied to sentiment analysis tasks. In a
typical sentiment analysis pipeline, the input text is first preprocessed and transformed into a
numerical representation, such as a bag-of-words or word embeddings.
The transformed data is then fed into a deep neural network, which learns to classify the text
into one of several sentiment categories.
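A toy sketch of such a pipeline, using scikit-learn's bag-of-words vectorizer and a small
feed-forward classifier in place of a CNN or RNN; the example sentences and labels are made up
purely for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.neural_network import MLPClassifier

# Tiny, made-up sentiment dataset: 1 = positive, 0 = negative.
texts = ["I loved this movie", "great acting and story",
         "what a terrible film", "boring and far too long"]
labels = [1, 1, 0, 0]

# Preprocess: turn raw text into a numerical bag-of-words representation.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

# A small neural network classifies the vectors into sentiment categories.
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0)
clf.fit(X, labels)

print(clf.predict(vectorizer.transform(["this movie was great"])))
```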

HIDDEN UNITS
A hidden unit refers to the components comprising the layers of processors between the input and
output units in a connectionist system.

The main functionality of hidden units

A lot of the objects we studied so far appear in both machine learning and deep learning, but
hidden units and output units are often additional objects in deep learning. These objects, hidden
units, can be one of many types.

Since this is an area of active research, there are many more being studied and probably yet to be
discovered. Because the area is still young, the principles and definitions are not set in stone.
The closest thing to a formal definition is: a hidden unit takes in a vector/tensor x, computes an
affine transformation z = W^T x + b, and then applies an element-wise nonlinear function g(z).
The way hidden units are differentiated from each other is based on their activation function,
g(z):

ReLU
ELU
GELU
Maxout
PReLU
Absolute value rectification
LeakyReLU
Logistic Sigmoid
Hyperbolic Tangent
Hard Hyperbolic Tangent
Identity
Softplus
Softmax
RBF

Here we explore the different types of hidden units so that when it's time to choose one for an
application you're developing, you have some intuition about which one to use. When you're in
the initial stages of development, don't be afraid to experiment through trial and error.

What’s ReLU?

ReLU stands for Rectified Linear Unit. Rectified linear units are pretty much the standard that
everyone defaults to, but they are only one out of the many options. This activation function looks
like: g(z) = max(0, z).
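In code, a ReLU hidden unit is just the affine transformation followed by an element-wise max with
zero; a minimal NumPy sketch with made-up weights:

```python
import numpy as np

def relu(z):
    # g(z) = max(0, z), applied element-wise
    return np.maximum(0, z)

x = np.array([1.0, -2.0, 0.5])          # input vector
W = np.array([[0.2, -0.5],
              [0.4,  0.1],
              [-0.3, 0.8]])             # made-up weights
b = np.array([0.1, -0.1])

z = W.T @ x + b                         # affine transformation z = W^T x + b
h = relu(z)                             # hidden unit output g(z)
print(z, h)
```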
What’s Maxout?

Maxout is a flavour of ReLU, which itself is a subset of activation functions, which is a
component of a hidden unit. As such, we know that a hidden unit will apply an affine
transformation to a vector and then apply a nonlinear element-wise activation function. Since
Maxout is a flavour of ReLU, you are right to assume it uses max(0, z). But remember that an
element-wise max function is not differentiable everywhere, so to make it practically
differentiable, we group the elements of z into groups of k values and select the maximum of each
group, which makes a sharp point less likely.

The Maxout unit is then the maximum element of one of these groups:

g(z)_i = max_{j in G(i)} z_j

where G(i) is the set of indices of the inputs for group i.

With a large enough k, a Maxout unit can learn to approximate any convex function with arbitrary
fidelity. In particular, a Maxout layer with two pieces can learn to implement the same function of
the input as ReLU, PReLU, absolute value rectification, or LeakyReLU.

The caveat here is that a Maxout unit is parametrized by k weight vectors instead of one, and so
requires more regularization, unless the training set is large enough. In general, although there is
no limit on k, lower is better as it requires less regularization.
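A small NumPy sketch of the maxout activation, grouping the elements of z into groups of k and
taking the maximum of each group (the pre-activation values are made up; a real layer would learn
the k weight vectors that produce them):

```python
import numpy as np

def maxout(z, k):
    """Maxout activation: split z into groups of k values and return the max of each."""
    z = np.asarray(z)
    assert z.size % k == 0, "length of z must be divisible by k"
    return z.reshape(-1, k).max(axis=1)   # g(z)_i = max over the i-th group

z = np.array([0.3, -1.2, 2.5, 0.0, -0.7, 1.1])   # made-up pre-activations
print(maxout(z, k=3))    # two maxout units, each the max of a group of 3
```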
What’s Logistic Sigmoid?

If the ReLU is the reigning queen of activation functions, then the logistic sigmoid is the
former queen, denoted: σ(z) = 1 / (1 + e^(-z)).

What’s Hyperbolic Tangent?

A close relative of the logistic sigmoid is the hyperbolic tangent, related to the logistic sigmoid
by: tanh(z) = 2σ(2z) - 1.

See the relation? They both saturate to a constant value at extreme inputs; more on this later. The
difference between them is that the sigmoid is 1/2 at 0, whereas tanh is 0 at 0. In that sense, tanh
is more like the identity function, at least around 0.
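The relation can be checked numerically; a short NumPy sketch of both functions:

```python
import numpy as np

def sigmoid(z):
    # Logistic sigmoid: squashes z into (0, 1); sigmoid(0) = 0.5
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    # Hyperbolic tangent: squashes z into (-1, 1); tanh(0) = 0
    return np.tanh(z)

z = np.linspace(-3, 3, 7)
# tanh(z) = 2 * sigmoid(2z) - 1, so the two rows below should match.
print(tanh(z))
print(2 * sigmoid(2 * z) - 1)
```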

Training a deep neural network that uses tanh activations therefore resembles training a linear
model, as long as the activations of the network can be kept small.

Sigmoidal activation functions are more common in recurrent networks, probabilistic models, and
autoencoders, because these have additional requirements that rule out piecewise linear activation
functions. However, functions like these, which have horizontal asymptotes, saturate and give
gradient descent a difficult time.

What’s RBF?

This function, the radial basis function, becomes more active as x approaches a certain value
(template) vector and saturates to 0 everywhere else, which can be annoying for gradient descent:
h_i = exp(-(1/σ_i^2) ||W:,i - x||^2).

What’s Softplus?

This one, g(z) = log(1 + e^z), is discouraged from use based on empirical evidence, which is
counter-intuitive, since it is meant to be an improvement on ReLU that is differentiable
everywhere. In practice, however, it does worse.

What’s the hard hyperbolic tangent, or hard tanh?

It looks like the tanh or the rectifier, but unlike the rectifier, it is bounded. It is
computationally cheaper than many of the alternatives. It is defined as g(a) = max(-1, min(1, a)):
the output is -1 for a < -1, the line a for -1 <= a <= 1, and 1 for a > 1.
What’s Identity?

Having an identity function as the activation function is exactly like having no activation function.
A linear unit can be a useful output unit, but it can also be a decent hidden unit.

If every layer of the network is a linear transformation, then by composition the whole network is
also a linear transformation.

Generally, multiplying and adding vectors and matrices acts as a linear transformation that
stretches, combines, rotates, or compresses the input vector or matrix.

We just learned that neural networks consist entirely of tensor operations, and all of these tensor
operations are just geometric transformations of the input data.

It follows, then, that neural networks are just geometric transformations of the input data.

Remember that a hidden unit is h = g(W^T x + b). Suppose our network has n inputs and p outputs, so
the weight matrix W has np parameters. With this approach we replace the single matrix W with two
linear layers: h = g(V^T U^T x + b).

The first layer uses the matrix U and the second layer uses the weight matrix V. If the first layer,
U, produces q outputs, together these layers have (n + p)q parameters, whereas W alone would have
np parameters. Linear hidden units therefore offer an effective way to reduce the number of
parameters in a network.
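A quick sketch of this parameter-count comparison (the sizes n, p, q below are arbitrary), showing
how factoring W into U and V reduces the number of parameters when q is small:

```python
import numpy as np

n, p, q = 100, 50, 10           # made-up layer sizes: n inputs, p outputs, rank q

W = np.random.randn(n, p)       # direct linear map  h = W^T x        -> n*p parameters
U = np.random.randn(n, q)       # first linear layer (no nonlinearity) -> n*q parameters
V = np.random.randn(q, p)       # second linear layer                  -> q*p parameters

print("W parameters:", W.size)                  # n*p       = 5000
print("U and V parameters:", U.size + V.size)   # (n + p)*q = 1500

x = np.random.randn(n)
# The two-layer linear factorization computes a (low-rank) linear map of x.
h_factored = V.T @ (U.T @ x)
print(h_factored.shape)
```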

What’s Softmax?

These hidden units are often used in architectures where your goal is to learn to manipulate
memory. Softmax is also the unit to use as an output when there is a classification problem and you
need to pick one of multiple categories: softmax(z)_i = e^(z_i) / Σ_j e^(z_j). It boosts the largest
category and drags the other categories down. This will be studied later.
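A numerically stable softmax sketch in NumPy, showing how the largest category is boosted and the
others are dragged down:

```python
import numpy as np

def softmax(z):
    # Subtracting the max is a standard trick for numerical stability;
    # it does not change the result.
    e = np.exp(z - np.max(z))
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])   # made-up scores for three categories
probs = softmax(scores)
print(probs, probs.sum())            # largest score gets the largest probability; sums to 1
```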
Deep Learning architectures

RNN: Recurrent Neural Networks


RNN is one of the fundamental network architectures from which other deep learning
architectures are built. RNNs consist of a rich set of deep learning architectures. They can use
their internal state (memory) to process variable-length sequences of inputs; we can say that RNNs
have a memory. Every piece of processed information is captured, stored, and utilized to calculate
the final outcome. This makes them useful for, for instance, speech recognition[1]. Moreover, a
recurrent network may have connections that feed back into prior layers (or even into the same
layer). This feedback allows them to maintain a memory of past inputs and solve problems in time.

RNNs are very useful in fields where the sequence of presented information is key. They are
commonly used in NLP (e.g., chatbots), speech synthesis, and machine translation.

Currently, we can indicate two types of RNN:

Bidirectional RNN: They work two ways; the output layer can get information from past and
future states simultaneously[2].

Deep RNN: Multiple layers are present. As a result, the DL model can extract more hierarchical
information.
LSTM: Long Short-Term Memory
It’s also a type of RNN. However, LSTM has feedback connections. This means that it can
process not only single data points (such as images) but also entire sequences of data (such as
audio or video files)[3].

LSTM derives from neural network architectures and is based on the concept of a memory cell.
The memory cell can retain its value for a short or long time as a function of its inputs, which
allows the cell to remember what’s essential and not just its last computed value.
A typical LSTM architecture is composed of a cell, an input gate, an output gate, and a forget
gate. The cell remembers values over arbitrary time intervals, and these three gates regulate the
flow of information into and out of the cell.

The input gate controls when new information can flow into the memory.

The output gate controls when the information that is contained in the cell is used in the output.

The forget gate controls when a piece of information can be forgotten, allowing the cell to process
new data.
Today, LSTMs are commonly used in such fields as text compression, handwriting recognition,
speech recognition, gesture recognition, and image captioning[4].
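A single LSTM time step can be sketched in NumPy to show how the input, forget, and output gates
regulate the memory cell; the weight shapes, random values, and input sequence below are made up,
and real implementations batch these operations for efficiency.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step. W, U, b hold parameters for the three gates and the candidate."""
    f = sigmoid(W["f"] @ x + U["f"] @ h_prev + b["f"])   # forget gate
    i = sigmoid(W["i"] @ x + U["i"] @ h_prev + b["i"])   # input gate
    o = sigmoid(W["o"] @ x + U["o"] @ h_prev + b["o"])   # output gate
    g = np.tanh(W["g"] @ x + U["g"] @ h_prev + b["g"])   # candidate cell value
    c = f * c_prev + i * g          # cell keeps old memory and lets new information in
    h = o * np.tanh(c)              # output gate decides what the cell exposes
    return h, c

n_in, n_hid = 3, 4                  # made-up sizes
rng = np.random.default_rng(0)
W = {k: rng.normal(size=(n_hid, n_in)) * 0.1 for k in "fiog"}
U = {k: rng.normal(size=(n_hid, n_hid)) * 0.1 for k in "fiog"}
b = {k: np.zeros(n_hid) for k in "fiog"}

h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.normal(size=(5, n_in)):        # a short made-up input sequence
    h, c = lstm_step(x, h, c, W, U, b)
print(h)
```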

GRU

This abbreviation stands for Gated Recurrent Unit. It is a variation on the LSTM. The major
difference is that a GRU has fewer parameters than an LSTM, as it lacks an output gate[5]. GRUs are
used for smaller and less frequent datasets, where they show better performance.

CNN: Convolutional Neural Networks


This architecture is commonly used for image processing, image recognition, video analysis, and
NLP.

CNN can take in an input image, assign importance to various aspects/objects in the image, and
be able to differentiate one from the others[6]. The name ‘convolutional’ derives from a
mathematical operation involving the convolution of different functions. CNNs consist of an
input and an output layer, as well as multiple hidden layers. The CNN’s hidden layers typically
consist of a series of convolutional layers.

Here's how CNNs work: first, the input is received by the network. Each input (for instance, an
image) passes through a series of convolution layers with various filters, and intermediate layers
control how the signal flows from one layer to the next. Next, the output is flattened and fed into
the fully connected layer, where every neuron from the preceding layer is connected to the neurons
of the subsequent layer. As a result, the output can be classified.
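A minimal PyTorch sketch of this flow, assuming PyTorch is installed; the layer sizes, image size,
and number of classes are arbitrary. It shows convolution layers with filters, flattening, and a
fully connected layer that produces the classification scores.

```python
import torch
import torch.nn as nn

# A tiny CNN for 28x28 grayscale images with 10 output classes (sizes made up).
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),   # convolution layer with 8 filters
    nn.ReLU(),
    nn.MaxPool2d(2),                             # downsample the feature maps
    nn.Conv2d(8, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),                                # flatten the output...
    nn.Linear(16 * 7 * 7, 10),                   # ...and feed it to a fully connected layer
)

x = torch.randn(4, 1, 28, 28)    # a made-up batch of 4 images
print(model(x).shape)            # torch.Size([4, 10]) -> one score per class
```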

DBN: Deep Belief Network


DBN is a multilayer network (typically deep, including many hidden layers) in which each pair of
connected layers is a Restricted Boltzmann Machine (RBM). Therefore, we can state that DBN is
a stack of RBMs. DBN is composed of multiple layers of latent variables (“hidden units”), with
connections between the layers but not between units within each layer[7]. DBNs use
probabilities and unsupervised learning to produce outputs. Unlike other models, each layer in
DBN learns the entire input. In CNNs, the first layers only filter inputs for basic features, and the
latter layers recombine all the simple patterns found by the previous layers. DBNs work
holistically and regulate each layer in order.

DBNs can be used, for example, in image recognition and NLP.


DSN: Deep Stacking Network
We saved DSN for last because this deep learning architecture is different from the others. DSNs are
also frequently called DCNs (Deep Convex Networks). A DSN/DCN comprises a deep network, but it is
actually a set of individual deep networks. Each network within a DSN has its own hidden layers that
process data.
This architecture has been designed to address the training problem, which is quite complicated for
traditional deep learning models. Thanks to its many modules, a DSN treats training not as a single
problem that has to be solved but as a set of individual problems.

According to a paper “An Evaluation of Deep Learning Miniature Concerning in Soft Computing”[8]
published in 2015, “the central idea of the DSN design relates to the concept of stacking, as proposed
originally, where simple modules of functions or classifiers are composed first and then they are stacked
on top of each other in order to learn complex functions or classifiers.”
Typically, DSNs consist of three or more modules. Each module consists of an input layer, a hidden
layer, and an output layer. These modules are stacked one on top of another, which means that the
input of a given module is based on the output of prior modules/layers. This construction enables
DSNs to learn more complex classifications than would be possible with just one module.
Deep Learning Architecture – Autoencoders
Autoencoders are a specific type of feedforward neural network. The general idea is that the input
and the output are pretty much the same. What does that mean? Simply put, autoencoders condense the
input into a lower-dimensional code, and the outcome is produced based on this code. In this model,
the code is a compact version of the input. One of autoencoders' main tasks is to identify and
determine what constitutes regular data and then identify the anomalies or aberrations.

Autoencoders comprise three components:

Encoder (condenses the input and produces the code)

Code

Decoder (rebuilds the input using the code)

Autoencoders are mainly used for dimensionality reduction and, naturally, anomaly detection (for
instance, fraud). Simplicity is one of their greatest advantages. They are easy to build and train.
However, there's also the other side of the coin. You need high-quality, representative training
data; if you don't, the information that comes out of the autoencoder can be unclear or biased.
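A minimal PyTorch sketch of the three components (encoder, code, decoder); the dimensions are
arbitrary and training is omitted for brevity.

```python
import torch
import torch.nn as nn

input_dim, code_dim = 64, 8          # made-up sizes: the code is lower-dimensional

encoder = nn.Sequential(             # condenses the input and produces the code
    nn.Linear(input_dim, 32), nn.ReLU(),
    nn.Linear(32, code_dim),
)
decoder = nn.Sequential(             # rebuilds the input using the code
    nn.Linear(code_dim, 32), nn.ReLU(),
    nn.Linear(32, input_dim),
)

x = torch.randn(5, input_dim)        # a made-up batch of inputs
code = encoder(x)                    # compact version of the input
reconstruction = decoder(code)
# Training would minimize the reconstruction error, e.g. nn.MSELoss()(reconstruction, x)
print(code.shape, reconstruction.shape)
```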

Deep Learning architecture – conclusion

As you can see, although deep learning architectures are, generally speaking, based on the same
idea, there are various ways to achieve a goal. That's why it's so important to choose a deep
learning architecture correctly.

BACK PROPAGATION NETWORK


What is a backpropagation algorithm?
Backpropagation, or backward propagation of errors, is an algorithm that is designed to test for
errors working back from output nodes to input nodes. It is an important mathematical tool for
improving the accuracy of predictions in data mining and machine learning. Essentially,
backpropagation is an algorithm used to calculate derivatives quickly.

There are two leading types of backpropagation networks:

1. Static backpropagation. Static backpropagation is a network developed to map static inputs


for static outputs. Static backpropagation networks can solve static classification problems, such
as optical character recognition (OCR).

2. Recurrent backpropagation. The recurrent backpropagation network is used for fixed-point


learning. Recurrent backpropagation activation feeds forward until it reaches a fixed value.
The key difference here is that static backpropagation offers instant mapping and recurrent
backpropagation does not.

What is a backpropagation algorithm in a neural network?


Artificial neural networks use backpropagation as a learning algorithm to compute a gradient
descent with respect to weight values for the various inputs. By comparing desired outputs to
achieved system outputs, the systems are tuned by adjusting connection weights to narrow
the difference between the two as much as possible.

The algorithm gets its name because the weights are updated backward, from output to input.

The advantages of using a backpropagation algorithm are as follows:

It does not have any parameters to tune except for the number of inputs.

It is highly adaptable and efficient and does not require any prior knowledge about the
network.

It is a standard process that usually works well. It is user-friendly, fast and easy to program.

Users do not need to learn any special functions.

The disadvantages of using a backpropagation algorithm are as follows:

It prefers a matrix-based approach over a mini-batch approach.

Data mining is sensitive to noise and irregularities.

Performance is highly dependent on input data.


Training is time- and resource-intensive.
What is the objective of a backpropagation algorithm?
Backpropagation algorithms are used extensively to train feedforward neural networks in
areas such as deep learning. They efficiently compute the gradient of the loss function
with respect to the network weights. This approach eliminates the inefficient process of
directly computing the gradient with respect to each individual weight. It enables the use
of gradient methods, like gradient descent or stochastic gradient descent, to train
multilayer networks and update weights to minimize loss.
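A compact NumPy sketch of backpropagation for a two-layer network, showing how the error at the
output is propagated backward to compute gradients for every weight, which gradient descent then
uses to update them; the toy XOR data, layer sizes, and learning rate are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy XOR problem (made up for illustration).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)     # input -> hidden
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)     # hidden -> output
lr = 0.5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(5000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass: propagate the output error back through the network.
    d_out = (out - y) * out * (1 - out)           # error signal at the output layer
    d_h = (d_out @ W2.T) * h * (1 - h)            # error signal at the hidden layer

    # Gradient descent update using the backpropagated gradients.
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)

print(out.round(2).ravel())    # typically approaches [0, 1, 1, 0]
```

In practice, deep learning frameworks compute these gradients automatically via automatic
differentiation, but the flow of error signals from the output back to the input is the same.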
