
Neural Networks & Deep Learning

Unit-6
Dr. D. SUDHEER
Assistant Professor
Department of CSE
VNR VJIET (NAAC: A++, NIRF: 113)
Hyderabad, Telangana.



How to apply NN over an Image?
Multi-layer Neural Network & Image
• Stretch the pixels into a single column vector.

Problems?
• High dimensionality
• Loss of local relationships between pixels
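As a rough illustration of both problems (the 32x32 RGB image and the 100 hidden units below are assumptions, not values from the slides):

import numpy as np

image = np.random.rand(32, 32, 3)   # a small hypothetical RGB image
x = image.reshape(-1)                # stretch the pixels into a single column vector
print(x.shape)                       # (3072,) -> 3072 input neurons

hidden_units = 100
# A fully connected first layer alone needs 3072 * 100 weights,
# and flattening discards which pixels were neighbours in the image.
print(x.shape[0] * hidden_units)     # 307200 parameters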


Solutions?


Convolutional Neural Networks
• Also known as
CNN, ConvNet, DCN
• CNN = a multi-layer neural network with
1. Local connectivity 2. Weight sharing



Convolutional Neural Network (CNN)


For the convolution and pooling operations, open the "CNN layers unit5.pdf" file.


CNN Local and Global connectivity
Input neurons: 7, Hidden units: 3

Number of parameters:
• Global connectivity: 3 * 7 = 21
• Local connectivity: 3 * 3 = 9


CNN Local and Global connectivity
Input neurons: 7, Hidden units: 3

Number of parameters:
• Without weight sharing: 3 * 3 = 9
• With weight sharing: 3 * 1 = 3
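A minimal NumPy sketch of the same counting argument, assuming a shared filter of size 3 slid with stride 2 so that three local windows cover the seven inputs (the stride and the weight values are illustrative assumptions):

import numpy as np

x = np.arange(7.0)                       # 7 input neurons
w_shared = np.array([0.2, 0.5, 0.3])     # one shared filter of size 3 -> only 3 parameters

hidden = np.array([np.dot(w_shared, x[i:i + 3])   # each hidden unit sees a local 3-neuron window
                   for i in (0, 2, 4)])            # window starts for stride 2
print(hidden)                            # 3 hidden units computed from 3 shared weights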


Layers in CNN

• Input layer (e.g., the input image)
• Convolution layer
• Non-linearity layer
• Pooling layer
• Fully connected layer
• Classification layer
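The layer order listed above can be sketched in Keras roughly as follows (the 28x28x1 input size, the filter counts, and the 10-class output are illustrative assumptions, not values from the slides):

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),        # input layer (e.g. a grayscale input image)
    layers.Conv2D(16, (3, 3)),              # convolution layer
    layers.Activation("relu"),              # non-linearity layer
    layers.MaxPooling2D((2, 2)),            # pooling layer
    layers.Flatten(),
    layers.Dense(64, activation="relu"),    # fully connected layer
    layers.Dense(10, activation="softmax"), # classification layer
])
model.summary()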



3D ConvNet



Figure 3: Difference between 2D convolution and 3D convolution [2]


Difference: 2D convolution and 3D convolution
• 2D convolution applied on an image will output an image.
• 2D convolution applied on multiple images (treating them as different channels) also results in an image.
• Hence, 2D ConvNets lose the temporal information of the input signal right after every convolution operation.
• Only 3D convolution preserves the temporal information of the input signals, resulting in an output volume.
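A small Keras sketch of this shape difference (the 16-frame, 112x112 clip size is an assumption for illustration):

import tensorflow as tf
from tensorflow.keras import layers

clip = tf.random.normal((1, 16, 112, 112, 3))                    # batch, frames, height, width, channels

out_2d = layers.Conv2D(8, (3, 3), padding="same")(clip[:, 0])    # one frame in -> an image out
print(out_2d.shape)                                              # (1, 112, 112, 8): no time axis left

out_3d = layers.Conv3D(8, (3, 3, 3), padding="same")(clip)       # whole clip in -> a volume out
print(out_3d.shape)                                              # (1, 16, 112, 112, 8): time axis preserved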



Batch normalization and layers:
• To accelerate training in CNNs we can normalize the activations of the previous
layer at each batch.
• This technique applies a transformation that keeps the mean activation close to
0.0 while also keeping the activation standard deviation close to 1.0.
• By applying normalization for each training mini-batch of input records, we can
use much higher learning rates.
• Batch normalization also reduces the sensitivity of training toward weight
initialization and acts as a regularizer.
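A minimal Keras sketch of where a batch-normalization layer is typically placed (the layer sizes are illustrative assumptions, not values from the slides):

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(16, (3, 3)),
    layers.BatchNormalization(),     # keeps mean activation near 0.0 and std near 1.0 per mini-batch
    layers.Activation("relu"),
])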
Fully Connected Layers:
• We use this layer to compute class scores that we’ll use as output of the
network.
• Fully connected layers perform transformations on the input data volume that
are a function of the activations in the input volume and the parameters.
Applications of CNN:
• MRI data
• 3D shape data
• Graph data
• NLP applications
Recurrent Neural Networks

• Historically, these networks have been difficult to train, but more recently,
advances in research (optimization, network architectures, parallelism, and
graphics processing units [GPUs]) have made them more approachable for the
practitioner.
• Recurrent Neural Networks take each vector from a sequence of input
vectors and model them one at a time.
• Modeling the time dimension is a hallmark of Recurrent Neural Networks.
Modeling the Time Dimension:
• Recurrent Neural Networks are considered Turing complete and can simulate
arbitrary programs (with weights).
• Recurrent neural networks are well suited for modeling functions for which
the input and/or output is composed of vectors that involve a time dependency
between the values.
• Recurrent neural networks model the time aspect of data by creating cycles
in the network (hence, the "recurrent" part of the name).
Lost in Time:
• Many classification tools (support vector machines, logistic regression, and
regular feed-forward networks) have been applied successfully without
modeling the time dimension, assuming independence.
• Other variations of these tools capture the time dynamic by modeling a
sliding window of the input (e.g., the previous, current, and next input together
as a single input vector).
• A drawback of these tools is that assuming independence in the time
connection between model inputs does not allow our model to capture long-
range time dependencies.
• Sliding window techniques have a limited window width and will fail to
capture any effects larger than the fixed window size.
• A good example is automated replies generated by machines for conversations over time.



Temporal feedback and loops in connections:
• Recurrent Neural Networks can have loops in the connections.
• This allows them to model temporal behavior and gain accuracy in domains such
as time-series, language, audio, and text.
• Data in these domains are inherently ordered and context sensitive where later
values depend on previous ones.
• A Recurrent Neural Network includes a feedback loop that it uses to learn
from sequences, including sequences of varying lengths.
• Recurrent Neural Networks contain an extra parameter matrix for the
connections between time-steps, which are used/trained to capture the temporal
relationships in the data.
• Recurrent Neural Networks are trained to generate sequences, in which the
output at each time-step is based on both the current input and the input at all
previous time steps.
• Recurrent Neural Networks compute a gradient with an algorithm called
backpropagation through time (BPTT).
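A minimal NumPy sketch of this extra time-step parameter matrix (the sizes and the tanh non-linearity are assumptions for illustration, not the slides' own model):

import numpy as np

input_size, hidden_size = 4, 3
W_x = np.random.randn(hidden_size, input_size)    # input-to-hidden weights
W_h = np.random.randn(hidden_size, hidden_size)   # extra matrix for connections between time-steps
b = np.zeros(hidden_size)

h = np.zeros(hidden_size)                         # initial hidden state
for x_t in np.random.randn(5, input_size):        # 5 time-steps
    h = np.tanh(W_x @ x_t + W_h @ h + b)          # output depends on the current input and all past inputs
print(h)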
Applications for Sequences and time-series data:

• Image captioning
• Speech synthesis
• Music generation
• Playing video games
• Language modeling
• Character-level text generation models

Understanding model input and output:


• Recurrent Neural Networks replace the fixed-size input with a dynamic one:
multiple input vectors, one for each time-step, and each vector can have many
columns.
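A small sketch of this dynamic input shape in Keras (the batch of 2 sequences, 10 time-steps, and 8 features per step are illustrative assumptions):

import numpy as np
from tensorflow.keras import layers

x = np.random.rand(2, 10, 8).astype("float32")    # (sequences, time-steps, features per step)
out = layers.SimpleRNN(16)(x)                     # many-to-one: one output vector per sequence
print(out.shape)                                  # (2, 16)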



• One-to-many: sequence output. For example, image captioning takes an
image and outputs a sequence of words.
• Many-to-one: sequence input. For example, sentiment analysis, where a given
sentence is the input.
• Many-to-many: sequence input and sequence output. For example, video
classification: label each frame.



Traditional RNN



LSTM Neuron


• It is useful to remember past data along with the present data when making a
decision.
• For example, in a sentence the beginning words can be more important than
the last words for understanding the meaning.
• An LSTM stores all the words along with the recent words to make its decision.

LSTM = Long-term memory + Short-term memory


• Long-term memory represents all the words starting from the first word.
• Short-term memory represents the recent words from the past state of the model.
• When the LSTM keeps on storing data, it may reach a point where it cannot store
any more.
• It will remove the unwanted information from time to time.
• The removing or keeping of data is implemented by logic gates.

Figure: LSTM gates: (1) the forget gate forgets irrelevant information; (2) the input
gate adds new updated information; (3) the output gate passes on the updated
information.
Layers of RNN
There are two important layers: 1. Embedding  2. LSTM
1. Embedding
• It is useful for converting positive integers into vectors of values.
• A fixed range of input values should be provided to this layer.
• It is especially useful in language translation to capture the meaning of words.

Embedding(input_dim, output_dim, input_length)
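A hedged Keras example of this signature (the vocabulary size of 1000, the 64-dimensional output vectors, and the length-10 input are assumptions; newer Keras versions infer the input length, so the input_length argument can be omitted):

import numpy as np
from tensorflow.keras import layers

embed = layers.Embedding(input_dim=1000, output_dim=64)     # input_length is optional in newer Keras
word_ids = np.array([[4, 17, 250, 3, 0, 0, 0, 0, 0, 0]])    # positive integers (word indices), length 10
vectors = embed(word_ids)
print(vectors.shape)                                         # (1, 10, 64): one 64-value vector per word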



LSTM:
• The LSTM network is different from a classical MLP.
• Input data is propagated through the network in order to make a prediction.
• Like RNNs, the LSTMs have recurrent connections so that the state from
previous activations of the neuron from the previous time step is used as
context for formulating an output.
• But unlike other RNNs, the LSTM has a unique formulation that allows it to
avoid the problems that prevent the training and scaling of other RNNs.
• LSTM overcomes problems such as vanishing gradients and exploding
gradients.
LSTM Gates
• Forget Gate: Decides what information to discard from the cell.
• Input Gate: Decides which values from the input to update the memory
state.
• Output Gate: Decides what to output based on the input and the memory of the
cell.
• The forget gate and input gate are used in the updating of the internal state.
• The output gate is a final limiter on what the cell actually outputs.
• It is these gates and the consistent data flow, called the constant error
carrousel or CEC, that keep each cell stable (neither exploding nor vanishing).
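A conceptual NumPy sketch of one LSTM step with these three gates (all weights are random placeholders and biases are omitted; a real LSTM learns these parameters during training):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

hidden = 4
x_t = np.random.randn(hidden)              # current input (same size as the state, for simplicity)
h_prev = np.zeros(hidden)                  # previous output
c_prev = np.zeros(hidden)                  # previous cell (memory) state
W = {g: np.random.randn(hidden, 2 * hidden) for g in ("f", "i", "o", "c")}
v = np.concatenate([h_prev, x_t])

f = sigmoid(W["f"] @ v)                    # forget gate: what to discard from the cell
i = sigmoid(W["i"] @ v)                    # input gate: which new values to write to the state
o = sigmoid(W["o"] @ v)                    # output gate: what the cell actually outputs
c = f * c_prev + i * np.tanh(W["c"] @ v)   # updated internal state (the constant error carrousel path)
h = o * np.tanh(c)                         # cell output at this time-step
print(h)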

Applications of LSTM:
• Image caption generation.
• Text translation.
• Handwriting recognition.
Limitations of LSTM:
• In time series forecasting, often the information relevant for making a
forecast is within a small window of past observations. Often an MLP with a
window or a linear model may be a less complex and more suitable model.
• An important limitation of LSTMs is the memory

