Unit-5 Deep Learning (1)
Several neural networks can help solve different business problems. Let’s look at a few of
them.
Convolutional Neural Network: Used for object detection and image classification.
RNN: Used for speech recognition, voice recognition, time series prediction, and
natural language processing.
A recurrent network can be described by a state update of the form s_t = f(s_{t-1}, x_t; θ), where we see that the state now contains information about the whole past sequence.
Recurrent neural networks can be built in many different ways. Much as almost any function
can be considered a feedforward neural network, essentially any function involving
recurrence can be considered a recurrent neural network.
Many recurrent neural networks use this equation, or a similar one, to define the values of their hidden units. To indicate that the state is the hidden units of the network, we now rewrite the equation using the variable h to represent the state:
h_t = f(h_{t-1}, x_t; θ)
Typical RNNs will add extra architectural features such as output layers that read information out of the state to make predictions.
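To make the recurrence concrete, here is a minimal NumPy sketch (the dimensions, the tanh non-linearity, and the linear output layer are illustrative assumptions, not taken from the text): the same parameters θ are applied at every time step, and an output layer reads predictions out of the state h_t.

```python
import numpy as np

# Illustrative sizes (assumptions for this sketch only)
input_size, hidden_size, output_size, seq_len = 4, 8, 3, 5
rng = np.random.default_rng(0)

# The shared parameters theta, reused at every time step
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden -> hidden
W_hy = rng.normal(scale=0.1, size=(output_size, hidden_size))  # hidden -> output
b_h, b_y = np.zeros(hidden_size), np.zeros(output_size)

def rnn_forward(xs):
    """Apply h_t = f(h_{t-1}, x_t; theta) at every step and read an output from the state."""
    h = np.zeros(hidden_size)            # initial state h_0
    outputs = []
    for x_t in xs:                       # the same f with the same parameters at each step
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
        outputs.append(W_hy @ h + b_y)   # output layer reads information out of the state
    return np.stack(outputs), h

xs = rng.normal(size=(seq_len, input_size))   # a toy input sequence
ys, h_final = rnn_forward(xs)
print(ys.shape)                               # (5, 3): one output per time step
```

Because the loop reuses the same weight matrices at every step, the model size does not depend on the sequence length, which is exactly the parameter-sharing advantage discussed below.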
When the recurrent network is trained to perform a task that requires predicting the future from the past, the network typically learns to use h_t as a kind of lossy summary of the task-relevant aspects of the past sequence of inputs up to time t.
The most demanding situation is when we ask h_t to be rich enough to allow one to
approximately recover the input sequence, as in auto-encoder frameworks.
This recurrence can be drawn in two different ways. One way to draw the RNN is with a diagram
containing one node for every component that might exist in a physical
implementation of the model, such as a biological neural network.
In this view, the network defines a circuit that operates in real time, with physical
parts whose current state can influence their future state.
The other way to draw the RNN is as an unfolded computational graph, in which each
component is represented by many different variables, with one variable per time step,
representing the state of the component at that point in time.
Each variable for each time step is drawn as a separate node of the computational
graph, as in the right side of the figure above.
What we call unfolding is the operation that maps a circuit as in the left side of the
figure to a computational graph with repeated pieces as in the right side. The
unfolded graph now has a size that depends on the sequence length.
The function g_t takes the whole past sequence (x_t, x_{t-1}, ..., x_1) as input and produces the current state, but the unfolded recurrent structure allows us to factorize g_t into repeated application of a function f. The unfolding process thus introduces two major
advantages:
Regardless of the sequence length, the learned model always has the same input size,
because it is specified in terms of transition from one state to another state, rather than
specified in terms of a variable-length history of states.
It is possible to use the same transition function with the same parameters at every
time step.
These two factors make it possible to learn a single model that operates on all time
steps and all sequence lengths, rather than needing to learn a separate model for
all possible time steps. Learning a single, shared model allows generalization to
sequence lengths that did not appear in the training set, and allows the model to be
estimated with far fewer training examples than would be required without parameter
sharing.
Both the recurrent graph and the unrolled graph have their uses. The recurrent graph
is succinct. The unfolded graph provides an explicit description of which
computations to perform. The unfolded graph also helps to illustrate the idea of
information flow forward in time (computing outputs and losses) and backward in
time (computing gradients) by explicitly showing the path along which this
information flows.
Hence these three layers can be joined together such that the weights and biases of all the hidden layers are the same, forming a single recurrent layer. The network is then trained through the following steps; a short training-loop sketch follows the list.
1. A single time step of the input is provided to the network.
2. Its current state is calculated using the current input and the previous state.
3. The current state h_t becomes h_{t-1} for the next time step.
4. One can go through as many time steps as the problem requires, joining the information from all the previous states.
5. Once all the time steps are completed the final current state is used to calculate the
output.
6. The output is then compared to the actual output, i.e., the target output, and the error is generated.
7. The error is then back-propagated to the network to update the weights and hence the
network (RNN) is trained.
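The seven steps above can be condensed into a short training loop. The sketch below uses PyTorch as one possible implementation; the toy regression task, the MSE loss, the SGD optimizer, and all sizes are assumptions made only for illustration.

```python
import torch
import torch.nn as nn

# Toy setup (assumed for illustration): predict one value from a whole sequence
torch.manual_seed(0)
seq_len, input_size, hidden_size = 10, 3, 16
x = torch.randn(1, seq_len, input_size)   # one toy input sequence (steps 1-4: time steps fed to the network)
target = torch.randn(1, 1)                # the actual (target) output

rnn = nn.RNN(input_size, hidden_size, batch_first=True)  # recurrent state updates
readout = nn.Linear(hidden_size, 1)                      # step 5: output from the final state
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(list(rnn.parameters()) + list(readout.parameters()), lr=0.01)

for epoch in range(100):
    optimizer.zero_grad()
    states, h_final = rnn(x)          # h_final: final current state after all time steps
    y_pred = readout(h_final[-1])     # step 5: output computed from the final state
    loss = loss_fn(y_pred, target)    # step 6: compare with the target, generate the error
    loss.backward()                   # step 7: back-propagate the error through time
    optimizer.step()                  #         and update the weights
```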
Why RNN?
RNNs were created because the feed-forward neural network has a few limitations: it cannot handle sequential data, it considers only the current input, and it cannot memorize previous inputs. An RNN addresses these issues: it can handle sequential data, accepting both the current input and previously received inputs, and it can memorize previous inputs thanks to its internal memory.
Advantages of RNN
1. An RNN remembers each piece of information through time. This ability to remember previous inputs is what makes it useful for time series prediction; it is taken further in Long Short-Term Memory (LSTM) networks.
2. Recurrent neural networks are even used with convolutional layers to extend the
effective pixel neighborhood.
Disadvantages of RNN
1. RNNs suffer from vanishing and exploding gradient problems.
2. They are slow because they process the sequence step by step: to calculate the current state, the previous state must already be known. Training an RNN is therefore a difficult task.
3. They cannot process very long sequences when tanh or ReLU is used as the activation function.
Applications of RNN
1. Language Modelling and Generating Text
2. Speech Recognition
3. Machine Translation
Bidirectional RNN
In a Bi-RNN, the input data is passed through two separate RNNs: one processes the data in
the forward direction, while the other processes it in the reverse direction. The outputs of
these two RNNs are then combined in some way to produce the final output.
One common way to combine the outputs of the forward and reverse RNNs is to concatenate
them. Still, other methods, such as element-wise addition or multiplication, can also be used.
The choice of combination method can depend on the specific task and the desired properties
of the final output.
A standard uni-directional RNN, by contrast, can only use information from earlier time steps when making predictions at later time steps.
This can be limiting, as the network may not capture important contextual information
relevant to the output prediction.
For example, in natural language processing tasks, a uni-directional RNN may not
accurately predict the next word in a sentence if the previous words provide important
context for the current word.
Consider an example where we could use the recurrent network to predict the masked word
in a sentence.
In the first sentence, the answer could be fruit, company, or phone. But it cannot be a fruit in the second and third sentences.
A Recurrent Neural Network that can only process the inputs from left to right may not accurately predict the right answer for the sentences discussed above.
Processing the sequence in both directions, as a bidirectional RNN does, can be useful for tasks such as language processing, where understanding the context of a word or phrase can be important for making accurate predictions.
In general, bidirectional RNNs can help improve a model's performance on various sequence-
based tasks.
Concretely, a bidirectional RNN consists of two RNNs:
1. One that processes the input sequence from left to right.
2. Another one that processes the input sequence from right to left.
These two RNNs are typically called the forward and backward RNNs, respectively. A small sketch of this forward/backward combination follows.
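Here is a minimal NumPy sketch of that combination; the sizes, the tanh non-linearity, and concatenation as the combination method are illustrative assumptions.

```python
import numpy as np

# Minimal bidirectional RNN layer sketch; sizes and tanh are illustrative assumptions.
rng = np.random.default_rng(1)
input_size, hidden_size, seq_len = 4, 6, 5

def make_params():
    return (rng.normal(scale=0.1, size=(hidden_size, input_size)),
            rng.normal(scale=0.1, size=(hidden_size, hidden_size)),
            np.zeros(hidden_size))

fwd_params, bwd_params = make_params(), make_params()   # two separate RNNs

def run_rnn(xs, params):
    W_xh, W_hh, b = params
    h, states = np.zeros(hidden_size), []
    for x_t in xs:
        h = np.tanh(W_xh @ x_t + W_hh @ h + b)
        states.append(h)
    return states

xs = rng.normal(size=(seq_len, input_size))
h_fwd = run_rnn(xs, fwd_params)              # forward RNN: left to right
h_bwd = run_rnn(xs[::-1], bwd_params)[::-1]  # backward RNN: right to left, re-aligned

# Combine the two directions, here by concatenation (addition would also work)
h_bi = [np.concatenate([f, b]) for f, b in zip(h_fwd, h_bwd)]
print(len(h_bi), h_bi[0].shape)   # 5 time steps, each with 2 * hidden_size features
```

Concatenation doubles the feature size per time step, whereas element-wise addition keeps it unchanged, which is why the choice of combination method depends on the downstream task.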
The Encoder-Decoder architecture with recurrent neural networks has become an effective
and standard approach for both neural machine translation (NMT) and sequence-to-sequence
(seq2seq) prediction in general.
The key benefits of the approach are the ability to train a single end-to-end model directly on
source and target sentences and the ability to handle variable length input and output
sequences of text.
An encoder network first reads the source sentence and encodes it into a fixed-length internal representation; a decoder network then uses this internal representation to output words until the end-of-sequence token is reached. LSTM networks were used for both the encoder and the decoder.
The most fundamental building block used to construct the encoder-decoder architecture is the neural network. Different kinds of neural networks, including RNNs, LSTMs, and CNNs, can be used within the encoder-decoder architecture.
In this architecture, the input data is first fed through what is called an encoder network. The encoder network maps the input data into a numerical representation that captures the important information from the input. This numerical representation of the input data is also called the hidden state. The numerical representation (hidden state) is then fed into what is called the decoder network. The decoder network generates the output by producing one element of the output sequence at a time. Note that both the input and the output sequence of data can be of varying length.
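The sketch below illustrates this encoder/decoder flow with two LSTMs in PyTorch. The vocabulary sizes, embedding sizes, the greedy decoding loop, and the end-of-sequence token id are all assumptions for illustration, not a full translation system.

```python
import torch
import torch.nn as nn

# Minimal encoder-decoder sketch with LSTMs; sizes and token ids are assumptions.
torch.manual_seed(0)
src_vocab, tgt_vocab, emb, hidden = 50, 60, 32, 64
EOS = 0   # assumed end-of-sequence (and start) token id

src_embed = nn.Embedding(src_vocab, emb)
tgt_embed = nn.Embedding(tgt_vocab, emb)
encoder = nn.LSTM(emb, hidden, batch_first=True)
decoder = nn.LSTM(emb, hidden, batch_first=True)
readout = nn.Linear(hidden, tgt_vocab)

def translate(src_ids, max_len=10):
    # Encoder: map the variable-length input to a hidden state (the internal representation)
    _, state = encoder(src_embed(src_ids))
    # Decoder: generate one output element at a time, starting from the encoder state
    token = torch.tensor([[EOS]])
    out_ids = []
    for _ in range(max_len):
        step, state = decoder(tgt_embed(token), state)
        token = readout(step).argmax(dim=-1)   # greedy choice of the next word
        if token.item() == EOS:                # stop at the end-of-sequence token
            break
        out_ids.append(token.item())
    return out_ids

print(translate(torch.randint(1, src_vocab, (1, 7))))   # a random 7-token "sentence"
```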
A popular form of neural network architecture called the autoencoder is a type of encoder-decoder architecture. An autoencoder is a neural network that uses an
encoder to compress an input into a lower-dimensional representation, and a decoder to
reconstruct the original input from the compressed representation. It is primarily used for
unsupervised learning and data compression. The other types of encoder-decoder architecture
can be used for supervised learning tasks, such as machine translation, image captioning, and
speech recognition. In this architecture, the encoder maps the input to a fixed-length
representation, which is then passed to the decoder to generate the output. So while the
encoder-decoder architecture and autoencoder have similar components, their main purposes
and applications differ.
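For contrast, here is a minimal autoencoder sketch in PyTorch, with assumed layer sizes for flattened 28x28 inputs: the same encoder/decoder components appear, but the target is the input itself.

```python
import torch
import torch.nn as nn

# Minimal autoencoder sketch: encoder compresses, decoder reconstructs. Sizes are assumptions.
encoder = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 16))
decoder = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 784))

x = torch.rand(8, 784)                    # a batch of flattened inputs
code = encoder(x)                         # compressed, lower-dimensional representation
x_hat = decoder(code)                     # reconstruction of the original input
loss = nn.functional.mse_loss(x_hat, x)   # unsupervised: the input is its own target
```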
The computation in most RNNs can be decomposed into three blocks of parameters and associated transformations:
1. from the input to the hidden state,
2. from the previous hidden state to the next hidden state, and
3. from the hidden state to the output.
With the RNN architecture of above fig., each of these three blocks is associated with a single
weight matrix. In other words, when the network is unfolded, each of these corresponds to a
shallow transformation.
Graves et al. were the first to show a significant benefit of decomposing the state of an RNN
into multiple layers as in below Fig.(left).
We can think of the lower layers in the hierarchy depicted in above Fig (a) as playing a role
in transforming the raw input into a representation that is more appropriate, at the higher
levels of the hidden state.
In general, it is easier to optimize shallower architectures, and adding the extra depth of above Fig. (b) makes the shortest path from a variable in time step t to a variable in time step t + 1 become longer.
However, this can be mitigated by introducing skip connections in the hidden-to-hidden path,
as illustrated in above Fig.(c).
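A deep (stacked) RNN of this kind can be sketched with PyTorch's built-in stacked RNN; the sizes and number of layers here are illustrative assumptions.

```python
import torch
import torch.nn as nn

# A deep RNN: stacking recurrent layers adds depth to the hidden-to-hidden path.
deep_rnn = nn.RNN(input_size=8, hidden_size=32, num_layers=3, batch_first=True)
x = torch.randn(4, 20, 8)              # batch of 4 sequences, 20 time steps each
states, h_n = deep_rnn(x)
print(states.shape)                    # (4, 20, 32): top-layer state at each time step
print(h_n.shape)                       # (3, 4, 32): final state of each of the 3 layers
```

The lower layers transform the raw input into intermediate representations that are fed, time step by time step, to the layers above; only the top layer's states are returned in `states`.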
Recursive Neural Networks (RvNNs) are deep neural networks used for natural language
processing. We get a Recursive Neural Network when the same weights are applied
recursively on a structured input to obtain a structured prediction.
Recursive Neural Networks (RvNNs) are a class of deep neural networks that can learn
detailed and structured information. With RvNN, you can get a structured prediction by
recursively applying the same set of weights on structured inputs. The word recursive indicates that the neural network is applied to its own output.
Due to their deep tree-like structure, Recursive Neural Networks can handle hierarchical data.
The tree structure means combining child nodes and producing parent nodes. Each child-
parent bond has a weight matrix, and similar children have the same weights. The number of
children for every node in the tree is fixed to enable it to perform recursive operations and
use the same weights. RvNNs are used when there's a need to parse an entire sentence.
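The sketch below shows the core recursive idea in NumPy: one shared weight matrix composes two child vectors into a parent vector, applied bottom-up over a hand-built binary parse tree. The tree, the dimensions, and the tanh non-linearity are illustrative assumptions.

```python
import numpy as np

# Recursive network sketch: the same weights combine child vectors into parent vectors.
rng = np.random.default_rng(0)
dim = 8
W = rng.normal(scale=0.1, size=(dim, 2 * dim))   # shared child -> parent weights
b = np.zeros(dim)

def compose(node):
    """A leaf is a word vector; an internal node is a pair (left, right)."""
    if isinstance(node, np.ndarray):
        return node
    left, right = node
    children = np.concatenate([compose(left), compose(right)])
    return np.tanh(W @ children + b)             # the same W is reused at every node

# Toy parse tree for a 4-word sentence: ((w1 w2) (w3 w4))
words = [rng.normal(size=dim) for _ in range(4)]
tree = ((words[0], words[1]), (words[2], words[3]))
sentence_vector = compose(tree)
print(sentence_vector.shape)   # (8,): a structured representation of the whole sentence
```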
Neural network optimization faces a difficulty when computational graphs become deep, e.g., in RNNs that repeatedly apply the same operation at each time step of a long temporal sequence.
Gradients propagated over many stages tend to either vanish (most of the time) or explode
(damaging optimization)
The difficulty with long-term dependencies arises from the exponentially smaller weights given to long-term interactions (which involve the multiplication of many Jacobians).
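The following toy NumPy computation illustrates the point numerically: back-propagating through t steps of a linear recurrence multiplies the gradient by the same Jacobian-like matrix W at each step, so its norm shrinks or grows exponentially. The specific matrices and scales are assumptions chosen only to show the effect.

```python
import numpy as np

rng = np.random.default_rng(0)
grad = rng.normal(size=16)                   # some gradient arriving at the last time step

for scale in (0.9, 1.1):                     # contracting vs. expanding recurrent weights
    W = scale * np.linalg.qr(rng.normal(size=(16, 16)))[0]   # scaled orthogonal matrix
    for t in (10, 50, 100):
        g = grad.copy()
        for _ in range(t):
            g = W.T @ g                      # one Jacobian multiplication per time step
        print(f"scale={scale}, steps={t}, gradient norm={np.linalg.norm(g):.3e}")
# With scale 0.9 the norm vanishes with depth; with scale 1.1 it explodes.
```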
Echo state network is a type of Recurrent Neural Network, part of the reservoir computing
framework, which has the following particularities:
the weights between the input and the hidden layer (the 'reservoir'), Win, and the weights within the reservoir, Wr, are randomly assigned and not trainable;
the weights of the output neurons (the 'readout' layer) are trainable and can be learned so that the network can reproduce specific temporal patterns;
the hidden layer (the reservoir) is very sparsely connected (typically < 10% connectivity);
the reservoir architecture creates a recurrent non-linear embedding (denoted H) of the input, which can then be connected to the desired output; only these final readout weights are trainable. A small sketch of this setup follows.
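The NumPy sketch below follows that recipe: Win and a sparse reservoir Wr are fixed and random, the reservoir states H form the non-linear embedding, and only the readout weights are fitted (here with ridge regression). The toy sine task, the spectral-radius rescaling to 0.9, the 10% sparsity, and the ridge parameter are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_res, T = 1, 200, 1000

Win = rng.uniform(-0.5, 0.5, size=(n_res, n_in))           # input weights: random, not trained
Wr = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
Wr *= rng.random((n_res, n_res)) < 0.1                      # keep ~10% of connections (sparse)
Wr *= 0.9 / max(abs(np.linalg.eigvals(Wr)))                 # scale spectral radius below 1

u = np.sin(np.arange(T) * 0.1)[:, None]                     # toy input signal
y = np.sin(np.arange(T) * 0.1 + 0.5)                        # toy target: a shifted version

# Run the reservoir to collect the non-linear embedding H of the input
h = np.zeros(n_res)
H = np.zeros((T, n_res))
for t in range(T):
    h = np.tanh(Win @ u[t] + Wr @ h)
    H[t] = h

# Train only the readout weights, here with ridge regression
ridge = 1e-6
Wout = np.linalg.solve(H.T @ H + ridge * np.eye(n_res), H.T @ y)
print("train MSE:", np.mean((H @ Wout - y) ** 2))
```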
LSTM
Bidirectional LSTM
Need of LSTM
LSTM networks are an extension of recurrent neural networks (RNNs) mainly introduced to
handle situations where RNNs fail.
An RNN fails to store information for a longer period of time. At times, a reference to information stored quite a long time ago is required to predict the current output, but RNNs are incapable of handling such "long-term dependencies".
There is no finer control over which part of the context needs to be carried forward
and how much of the past needs to be ‘forgotten’.
Other issues with RNNs are exploding and vanishing gradients (discussed earlier), which occur during the training of the network through backpropagation.
Thus, Long Short-Term Memory (LSTM) was brought into the picture. It has been so
designed that the vanishing gradient problem is almost completely removed, while the
training model is left unaltered. Long-time lags in certain problems are bridged using LSTMs
which also handle noise, distributed representations, and continuous values. With LSTMs,
there is no need to keep a finite number of states from beforehand as required in the hidden
Markov model (HMM). LSTMs provide us with a large range of parameters such as learning
rates, and input and output biases.
Structure of LSTM
The basic difference between the architectures of RNNs and LSTMs is that the hidden layer
of LSTM is a gated unit or gated cell. It consists of four layers that interact with one another
in a way to produce the output of that cell along with the cell state. These two things are then
passed on to the next hidden layer. Unlike RNNs, which have only a single neural network layer of tanh, LSTMs comprise three logistic sigmoid gates and one tanh layer. Gates have been
introduced in order to limit the information that is passed through the cell. They determine
which part of the information will be needed by the next cell and which part is to be
discarded. The gate outputs lie in the range 0-1, where '0' means 'reject all' and '1' means 'include all'.
Information is retained by the cells and the memory manipulations are done by the gates.
There are three gates which are explained below:
Forget Gate The information that is no longer useful in the cell state is removed with the
forget gate. Two inputs x_t (input at the particular time) and h_t-1 (previous cell output) are
fed to the gate and multiplied with weight matrices followed by the addition of bias. The
resultant is passed through a sigmoid activation function, which gives an output between 0 and 1. If, for a particular cell state, the output is close to 0, the piece of information is forgotten; for an output close to 1, the information is retained for future use.
Input gate The addition of useful information to the cell state is done by the input gate. First,
the information is regulated using the sigmoid function, which filters the values to be remembered (similar to the forget gate) using the inputs h_t-1 and x_t. Then, a vector is created
using the tanh function that gives an output from -1 to +1, which contains all the possible
values from h_t-1 and x_t. At last, the values of the vector and the regulated values are
multiplied to obtain useful information.
Output gate The task of extracting useful information from the current cell state to be
presented as output is done by the output gate. First, a vector is generated by applying the
tanh function to the cell state. Then, the information is regulated using the sigmoid function, which filters the values to be remembered using the inputs h_t-1 and x_t. At last, the values of the vector and the regulated values are multiplied and sent as the output of the cell and as input to the next
cell.
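The three gates described above can be written out directly. Below is a single LSTM cell step in NumPy; the dimensions, the random parameters, and the use of one weight matrix per gate acting on the concatenation of h_t-1 and x_t are illustrative assumptions.

```python
import numpy as np

# One step of an LSTM cell, mirroring the forget/input/output gate descriptions above.
rng = np.random.default_rng(0)
n_in, n_hid = 4, 8

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One weight matrix per gate, each acting on the concatenation [h_{t-1}, x_t]
Wf, Wi, Wc, Wo = (rng.normal(scale=0.1, size=(n_hid, n_hid + n_in)) for _ in range(4))
bf = bi = bc = bo = np.zeros(n_hid)

def lstm_step(x_t, h_prev, c_prev):
    z = np.concatenate([h_prev, x_t])    # both inputs are fed to every gate
    f = sigmoid(Wf @ z + bf)             # forget gate: what to drop from the cell state
    i = sigmoid(Wi @ z + bi)             # input gate: what new information to admit
    c_tilde = np.tanh(Wc @ z + bc)       # candidate values in the range -1 to +1
    c = f * c_prev + i * c_tilde         # updated cell state
    o = sigmoid(Wo @ z + bo)             # output gate: what to expose from the cell state
    h = o * np.tanh(c)                   # new hidden state, passed to the next cell
    return h, c

h, c = lstm_step(rng.normal(size=n_in), np.zeros(n_hid), np.zeros(n_hid))
print(h.shape, c.shape)   # (8,) (8,)
```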
Bidirectional LSTM
Bidirectional LSTM or BiLSTM is a term used for a sequence model which contains two
LSTM layers, one for processing input in the forward direction and the other for processing
in the backward direction. It is usually used in NLP-related tasks. The intuition behind this
approach is that by processing data in both directions, the model is able to better understand
the relationship between sequences (e.g. knowing the following and preceding words in a
sentence).
To better understand this let us see an example. The first statement is “Server can you bring
me this dish” and the second statement is “He crashed the server”. In both these statements,
the word server has different meanings and this relationship depends on the following and
preceding words in the statement. A bidirectional LSTM helps the model understand this relationship better than a unidirectional LSTM does. This ability of BiLSTM
makes it a suitable architecture for tasks like sentiment analysis, text classification, and
machine translation.
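A BiLSTM sentence classifier of the kind described above can be sketched in a few lines of PyTorch; the vocabulary size, embedding size, mean pooling over time, and the two-class (sentiment-style) readout are assumptions made for illustration.

```python
import torch
import torch.nn as nn

# Minimal BiLSTM sketch for a sequence classification task; all sizes are assumptions.
vocab, emb, hidden = 1000, 50, 64
embed = nn.Embedding(vocab, emb)
bilstm = nn.LSTM(emb, hidden, batch_first=True, bidirectional=True)
classifier = nn.Linear(2 * hidden, 2)      # forward + backward features per time step

tokens = torch.randint(0, vocab, (1, 12))  # e.g. "He crashed the server ..." as token ids
states, _ = bilstm(embed(tokens))          # (1, 12, 2 * hidden): both directions combined
logits = classifier(states.mean(dim=1))    # pool over time, then classify the sentence
print(logits.shape)                        # (1, 2)
```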