Deep Learning – (RNN)
Recurrent Neural Networks
Content
• Architecture for an RNN
• Forward propagation
• Back propagation
• Long Short-Term Memory Networks (LSTM)
Motivation: what about sequence prediction?
What can I do when input size and output size vary?
Architecture for an RNN
RNNs are very powerful because they combine two properties:
• Distributed hidden state that allows them to store a lot of information about the past efficiently.
• Non-linear dynamics that allow them to update their hidden state in complicated ways.
Notation: shared parameters/weights (matrix Wi), input layer (vectors x1, x2, …, xn), hidden layer (vectors h0, h1, h2, …, hn), output layer (vectors y1, y2, …, yn), neurons (aij), time step (t).
[Figure: an unrolled RNN — some information is passed from one subunit to the next; start-of-sequence and end-of-sequence markers delimit the input; a sequence of inputs maps to a sequence of outputs.]
Architecture for an RNN
• The recurrent net can be viewed as a layered feed-forward net with shared weights; we then train that feed-forward net with weight constraints.
The training algorithm works in the time domain:
1. Forward propagation: compute predictions
• The forward pass builds up a stack of the activities of all the units at each time step.
• An input vector xᵢ (i = 1, 2, …, n) is fed to the hidden state for each element in the input sequence.
• The input text must be encoded into numerical values; e.g., every letter in the word “dogs” becomes a one-hot encoded vector of dimension (4×1). The input x can also be a word embedding or another numerical representation.
• The hidden-state output vector, after the activation function is applied, is passed from the previous state to the next state of hidden nodes.
• At time t, the architecture computes the hidden vector hₜ from the hidden vector of the previous state and the input x at time t; i.e., it takes information from the inputs that sequentially precede the current input.
• The h₀ vector always starts as a vector of 0’s, because the algorithm has no information preceding the first element in the sequence.
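The forward pass above is commonly written as follows (a standard formulation; the weight names W_h, W_x, W_y and biases are chosen here for illustration):

```latex
h_t = \tanh(W_h\, h_{t-1} + W_x\, x_t + b_h), \qquad
\hat{y}_t = \operatorname{softmax}(W_y\, h_t + b_y)
```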
Back propagation: sharing weights
To constrain w₁ = w₂, we need Δw₁ = Δw₂. We compute ∂E/∂w₁ and ∂E/∂w₂, and use their sum ∂E/∂w₁ + ∂E/∂w₂ to update both w₁ and w₂.
Example for reference
Consider the word “dogs,” where we want to train an RNN to predict the letter “s” given the letters “d”-“o”-“g”. We use 3 hidden nodes in our RNN (d = 3). The dimensions of each variable are as follows: here, k = 4, because each input x is a 4-dimensional one-hot vector over the letters in “dogs.”
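A minimal sketch of this forward pass in NumPy, using the dimensions above (k = 4 input size, d = 3 hidden units; the weight names Wx, Wh, Wy and the random initialisation are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

k, d = 4, 3                    # one-hot input size, hidden size
letters = "dogs"
idx = {ch: i for i, ch in enumerate(letters)}

# Illustrative parameters, randomly initialised for the sketch.
Wx = rng.normal(scale=0.1, size=(d, k))   # input -> hidden
Wh = rng.normal(scale=0.1, size=(d, d))   # hidden -> hidden (shared across time)
Wy = rng.normal(scale=0.1, size=(k, d))   # hidden -> output

def one_hot(ch):
    v = np.zeros(k)
    v[idx[ch]] = 1.0
    return v

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

h = np.zeros(d)                # h0 starts as a vector of zeros
for ch in "dog":               # feed "d", "o", "g" one step at a time
    h = np.tanh(Wx @ one_hot(ch) + Wh @ h)

y = softmax(Wy @ h)            # probability distribution over {d, o, g, s}
```

Note that the same Wx, Wh, Wy are reused at every time step — that is the weight sharing the architecture slides describe.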
Architecture for an RNN
• Activation functions: tanh at the hidden layers, softmax
at the output layer.
2. Compute the loss using cross-entropy.
3. Back propagation: compute the gradient (error
derivatives) at each time step.
• Update the weights to minimize the loss.
For the hidden state at t = 2, the input is
the output from t = 1 and x at t = 2.
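The cross-entropy loss at a single time step can be sketched as follows (the variable names and the example prediction are illustrative):

```python
import numpy as np

def cross_entropy(y_hat, y):
    # y_hat: predicted probabilities (softmax output); y: one-hot target.
    # The small epsilon guards against log(0).
    return -np.sum(y * np.log(y_hat + 1e-12))

y_hat = np.array([0.1, 0.2, 0.1, 0.6])   # prediction over {d, o, g, s}
y = np.array([0.0, 0.0, 0.0, 1.0])       # target letter "s"
loss = cross_entropy(y_hat, y)           # -log(0.6) ≈ 0.511
```

For a sequence, these per-step losses are summed (or averaged) before computing gradients back through time.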
Applications of Recurrent Neural Networks (RNNs)
1. Prediction problems
2. Language Modelling and Generating Text
3. Machine Translation
4. Speech Recognition
5. Generating Image Descriptions
6. Video Tagging
7. Text Summarization
8. Call Center Analysis
9. Image-recognition applications such as face detection and OCR
10. Other applications, such as music composition
Long Short Term Memory Networks (LSTMs)
• LSTMs are a type of recurrent neural network (RNN) that can learn and memorize long-term dependencies.
• LSTMs retain past information for long periods of time, which makes them very useful in time-series prediction.
• LSTMs have a chain-like structure in which four interacting layers (memory cell, forget, input, output) communicate in a
unique way.
• LSTM has three gates (forget, input, output) to protect and control the cell state.
LSTMs Working principle:
• First, they forget irrelevant information from the previous state and keep the relevant information.
• Next, they selectively update the memory cell-state values.
• The memory cell state carries relevant information from earlier time steps to later time steps throughout the processing of
the sequence, reducing the effects of short-term memory.
• As the cell state goes on its journey, information gets added to or removed from the cell state via gates.
• The gates are separate neural networks that decide which information is allowed in the cell state; they learn which
information is relevant to keep or forget during training.
• Finally, the LSTM outputs certain parts of the cell state.
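The working principle above corresponds to the standard LSTM update equations (σ denotes the sigmoid, ∗ elementwise multiplication, and [h_{t−1}, x_t] the concatenation of the previous hidden state and the current input):

```latex
\begin{aligned}
f_t &= \sigma(W_f [h_{t-1}, x_t] + b_f) &&\text{(forget gate)}\\
i_t &= \sigma(W_i [h_{t-1}, x_t] + b_i) &&\text{(input gate)}\\
\tilde{C}_t &= \tanh(W_C [h_{t-1}, x_t] + b_C) &&\text{(candidate values)}\\
C_t &= f_t * C_{t-1} + i_t * \tilde{C}_t &&\text{(cell-state update)}\\
o_t &= \sigma(W_o [h_{t-1}, x_t] + b_o) &&\text{(output gate)}\\
h_t &= o_t * \tanh(C_t) &&\text{(filtered output)}
\end{aligned}
```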
LSTM Architecture
LSTM Unit Structure
Components in LSTM:
• Three gates: forget gate, input gate
and output gate.
• Memory cell state
• Forward propagation: processes the data,
passing information forward; the difference
from a plain RNN lies in the operations
within the LSTM’s cells.
• These operations allow the LSTM
to keep or forget information.
Backward Propagation
• Update the parameters to reduce the error.
LSTM layers working principle
• Gates are composed of a sigmoid neural-net layer and a pointwise multiplication operation.
• The sigmoid layer outputs values between zero and one that describe how much of each component to pass
through: a value of zero means “let nothing through,” while a value of one means “pass everything.”
Forget gate layer:
• Decides what information to throw away from the memory cell state.
• A 1 represents “completely keep this,” while a 0 represents “completely reject this.”
Input gate layer:
• The next step is to decide what new information we’re going to store in the cell state.
• This has two parts. First, a sigmoid layer called the “input gate layer” decides which values we’ll update.
• Next, a tanh layer creates a vector of new candidate values, C̃ₜ, that could be added to the state.
• In the next step, we’ll combine these two to create an update to the state.
Memory Cell State:
• Update the old cell state, Cₜ₋₁, into the new cell state Cₜ.
• Multiply the old state by fₜ, forgetting the things we decided to forget earlier.
• Then add iₜ ∗ C̃ₜ: the new candidate values, scaled by how much we decided to update each state value.
Output gate layer:
• Decides the output based on the cell state, as a filtered version of it.
• First, run a sigmoid layer that decides what parts of the cell state to output.
• Then, put the cell state through tanh (to push the values to be between −1 and 1) and
multiply it by the output of the sigmoid gate, so that we only output the parts we decided to.
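One full LSTM time step — forget gate, input gate, candidate values, cell-state update, and output gate — can be sketched in NumPy as follows (the stacked weight matrix W and the size choices are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, C_prev, W, b):
    """One LSTM time step. W maps [h_prev, x] to the four stacked
    gate pre-activations (forget, input, candidate, output)."""
    d = h_prev.size
    z = W @ np.concatenate([h_prev, x]) + b   # shape (4d,)
    f = sigmoid(z[0:d])             # forget gate: what to throw away
    i = sigmoid(z[d:2*d])           # input gate: which values to update
    C_tilde = np.tanh(z[2*d:3*d])   # candidate values
    o = sigmoid(z[3*d:4*d])         # output gate: what parts to emit
    C = f * C_prev + i * C_tilde    # cell-state update
    h = o * np.tanh(C)              # filtered output
    return h, C

rng = np.random.default_rng(1)
d, k = 3, 4                         # hidden size, input size (illustrative)
W = rng.normal(scale=0.1, size=(4 * d, d + k))
b = np.zeros(4 * d)
h, C = lstm_step(rng.normal(size=k), np.zeros(d), np.zeros(d), W, b)
```

Because hₜ = oₜ ∗ tanh(Cₜ), every component of the hidden state stays strictly between −1 and 1, while the cell state Cₜ itself is unbounded and can carry information across many steps.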
Types of LSTM models based on input and output
• One input to one output - e.g., giving a label to an image
• One input to many outputs - e.g., giving a description/caption to an image
(the description is a sequence of words - many outputs)
• Many inputs to one output - e.g., predicting the next word of an incomplete sentence
• Many inputs to many outputs - e.g., stock market prediction for the following days based on past data
Applications of LSTM
1. Speech Recognition (Input is audio and output is text) - Google Assistant, Microsoft
Cortana, Apple Siri
2. Machine Translation (Input is text and output is also text) - Google Translate
3. Image Captioning (Input is image and output is text)
4. Sentiment Analysis (Input is text and output is rating)
5. Music Generation/Synthesis ( input music notes and output is music)
6. Video Activity Recognition (input is video and output is type of activity)
Ad

More Related Content

What's hot (20)

Deep neural networks
Deep neural networksDeep neural networks
Deep neural networks
Si Haem
 
Rnn and lstm
Rnn and lstmRnn and lstm
Rnn and lstm
Shreshth Saxena
 
Transfer Learning: An overview
Transfer Learning: An overviewTransfer Learning: An overview
Transfer Learning: An overview
jins0618
 
Recurrent neural network
Recurrent neural networkRecurrent neural network
Recurrent neural network
Syed Annus Ali SHah
 
Introduction to Deep learning
Introduction to Deep learningIntroduction to Deep learning
Introduction to Deep learning
leopauly
 
Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)
Gaurav Mittal
 
Deep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural NetworksDeep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural Networks
Christian Perone
 
Recurrent Neural Networks (RNNs)
Recurrent Neural Networks (RNNs)Recurrent Neural Networks (RNNs)
Recurrent Neural Networks (RNNs)
Abdullah al Mamun
 
Introduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkIntroduction to Recurrent Neural Network
Introduction to Recurrent Neural Network
Yan Xu
 
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
Simplilearn
 
Lstm
LstmLstm
Lstm
Mehrnaz Faraz
 
LSTM Basics
LSTM BasicsLSTM Basics
LSTM Basics
Akshay Sehgal
 
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioLecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Marina Santini
 
Activation function
Activation functionActivation function
Activation function
RakshithGowdakodihal
 
Hyperparameter Tuning
Hyperparameter TuningHyperparameter Tuning
Hyperparameter Tuning
Jon Lederman
 
CNN and its applications by ketaki
CNN and its applications by ketakiCNN and its applications by ketaki
CNN and its applications by ketaki
Ketaki Patwari
 
Deep learning
Deep learningDeep learning
Deep learning
Ratnakar Pandey
 
Deep Learning - Overview of my work II
Deep Learning - Overview of my work IIDeep Learning - Overview of my work II
Deep Learning - Overview of my work II
Mohamed Loey
 
LSTM Tutorial
LSTM TutorialLSTM Tutorial
LSTM Tutorial
Ralph Schlosser
 
Deep learning presentation
Deep learning presentationDeep learning presentation
Deep learning presentation
Tunde Ajose-Ismail
 
Deep neural networks
Deep neural networksDeep neural networks
Deep neural networks
Si Haem
 
Transfer Learning: An overview
Transfer Learning: An overviewTransfer Learning: An overview
Transfer Learning: An overview
jins0618
 
Introduction to Deep learning
Introduction to Deep learningIntroduction to Deep learning
Introduction to Deep learning
leopauly
 
Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)
Gaurav Mittal
 
Deep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural NetworksDeep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural Networks
Christian Perone
 
Recurrent Neural Networks (RNNs)
Recurrent Neural Networks (RNNs)Recurrent Neural Networks (RNNs)
Recurrent Neural Networks (RNNs)
Abdullah al Mamun
 
Introduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkIntroduction to Recurrent Neural Network
Introduction to Recurrent Neural Network
Yan Xu
 
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
Simplilearn
 
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioLecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Marina Santini
 
Hyperparameter Tuning
Hyperparameter TuningHyperparameter Tuning
Hyperparameter Tuning
Jon Lederman
 
CNN and its applications by ketaki
CNN and its applications by ketakiCNN and its applications by ketaki
CNN and its applications by ketaki
Ketaki Patwari
 
Deep Learning - Overview of my work II
Deep Learning - Overview of my work IIDeep Learning - Overview of my work II
Deep Learning - Overview of my work II
Mohamed Loey
 

Similar to Recurrent neural networks rnn (20)

RNN and LSTM model description and working advantages and disadvantages
RNN and LSTM model description and working advantages and disadvantagesRNN and LSTM model description and working advantages and disadvantages
RNN and LSTM model description and working advantages and disadvantages
AbhijitVenkatesh1
 
Long short term memory on tensorflow using python
Long short term memory on tensorflow using pythonLong short term memory on tensorflow using python
Long short term memory on tensorflow using python
rahulk2004
 
RNN-LSTM.pptx
RNN-LSTM.pptxRNN-LSTM.pptx
RNN-LSTM.pptx
ssuserc755f1
 
RNN-LSTM.pptx
RNN-LSTM.pptxRNN-LSTM.pptx
RNN-LSTM.pptx
ssuserc755f1
 
Machine Learning - Introduction to Recurrent Neural Networks
Machine Learning - Introduction to Recurrent Neural NetworksMachine Learning - Introduction to Recurrent Neural Networks
Machine Learning - Introduction to Recurrent Neural Networks
Andrew Ferlitsch
 
Sequencing and Attention Models - 2nd Version
Sequencing and Attention Models - 2nd VersionSequencing and Attention Models - 2nd Version
Sequencing and Attention Models - 2nd Version
ssuserbd372d
 
Recurrent Neural Networks
Recurrent Neural NetworksRecurrent Neural Networks
Recurrent Neural Networks
Sharath TS
 
recurrent_neural_networks_april_2020.pptx
recurrent_neural_networks_april_2020.pptxrecurrent_neural_networks_april_2020.pptx
recurrent_neural_networks_april_2020.pptx
SagarTekwani4
 
Artificial neutral network cousre of AI.ppt
Artificial neutral network cousre of AI.pptArtificial neutral network cousre of AI.ppt
Artificial neutral network cousre of AI.ppt
attaurahman
 
lec10newwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
lec10newwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwlec10newwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
lec10newwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
khushbu maurya
 
Lec 6-bp
Lec 6-bpLec 6-bp
Lec 6-bp
Taymoor Nazmy
 
Advanced Machine Learning
Advanced Machine LearningAdvanced Machine Learning
Advanced Machine Learning
ANANDBABUGOPATHOTI1
 
Lec10new
Lec10newLec10new
Lec10new
Ananda Gopathoti
 
rnn BASICS
rnn BASICSrnn BASICS
rnn BASICS
Priyanka Reddy
 
lec10new.ppt
lec10new.pptlec10new.ppt
lec10new.ppt
SumantKuch
 
Long and short term memory presesntation
Long and short term memory presesntationLong and short term memory presesntation
Long and short term memory presesntation
chWaqasZahid
 
M5 Topic 1 - Encoder Decoder MODEL-JEC.pdf
M5 Topic 1 - Encoder Decoder MODEL-JEC.pdfM5 Topic 1 - Encoder Decoder MODEL-JEC.pdf
M5 Topic 1 - Encoder Decoder MODEL-JEC.pdf
KeshavSen4
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learning
Junaid Bhat
 
RNN JAN 2025 ppt fro scratch looking from basic.pptx
RNN JAN 2025 ppt fro scratch looking from basic.pptxRNN JAN 2025 ppt fro scratch looking from basic.pptx
RNN JAN 2025 ppt fro scratch looking from basic.pptx
webseriesnit
 
An Introduction to Long Short-term Memory (LSTMs)
An Introduction to Long Short-term Memory (LSTMs)An Introduction to Long Short-term Memory (LSTMs)
An Introduction to Long Short-term Memory (LSTMs)
EmmanuelJosterSsenjo
 
RNN and LSTM model description and working advantages and disadvantages
RNN and LSTM model description and working advantages and disadvantagesRNN and LSTM model description and working advantages and disadvantages
RNN and LSTM model description and working advantages and disadvantages
AbhijitVenkatesh1
 
Long short term memory on tensorflow using python
Long short term memory on tensorflow using pythonLong short term memory on tensorflow using python
Long short term memory on tensorflow using python
rahulk2004
 
Machine Learning - Introduction to Recurrent Neural Networks
Machine Learning - Introduction to Recurrent Neural NetworksMachine Learning - Introduction to Recurrent Neural Networks
Machine Learning - Introduction to Recurrent Neural Networks
Andrew Ferlitsch
 
Sequencing and Attention Models - 2nd Version
Sequencing and Attention Models - 2nd VersionSequencing and Attention Models - 2nd Version
Sequencing and Attention Models - 2nd Version
ssuserbd372d
 
Recurrent Neural Networks
Recurrent Neural NetworksRecurrent Neural Networks
Recurrent Neural Networks
Sharath TS
 
recurrent_neural_networks_april_2020.pptx
recurrent_neural_networks_april_2020.pptxrecurrent_neural_networks_april_2020.pptx
recurrent_neural_networks_april_2020.pptx
SagarTekwani4
 
Artificial neutral network cousre of AI.ppt
Artificial neutral network cousre of AI.pptArtificial neutral network cousre of AI.ppt
Artificial neutral network cousre of AI.ppt
attaurahman
 
lec10newwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
lec10newwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwlec10newwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
lec10newwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
khushbu maurya
 
Long and short term memory presesntation
Long and short term memory presesntationLong and short term memory presesntation
Long and short term memory presesntation
chWaqasZahid
 
M5 Topic 1 - Encoder Decoder MODEL-JEC.pdf
M5 Topic 1 - Encoder Decoder MODEL-JEC.pdfM5 Topic 1 - Encoder Decoder MODEL-JEC.pdf
M5 Topic 1 - Encoder Decoder MODEL-JEC.pdf
KeshavSen4
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learning
Junaid Bhat
 
RNN JAN 2025 ppt fro scratch looking from basic.pptx
RNN JAN 2025 ppt fro scratch looking from basic.pptxRNN JAN 2025 ppt fro scratch looking from basic.pptx
RNN JAN 2025 ppt fro scratch looking from basic.pptx
webseriesnit
 
An Introduction to Long Short-term Memory (LSTMs)
An Introduction to Long Short-term Memory (LSTMs)An Introduction to Long Short-term Memory (LSTMs)
An Introduction to Long Short-term Memory (LSTMs)
EmmanuelJosterSsenjo
 
Ad

More from Kuppusamy P (20)

Deep learning
Deep learningDeep learning
Deep learning
Kuppusamy P
 
Image segmentation
Image segmentationImage segmentation
Image segmentation
Kuppusamy P
 
Image enhancement
Image enhancementImage enhancement
Image enhancement
Kuppusamy P
 
Feature detection and matching
Feature detection and matchingFeature detection and matching
Feature detection and matching
Kuppusamy P
 
Image processing, Noise, Noise Removal filters
Image processing, Noise, Noise Removal filtersImage processing, Noise, Noise Removal filters
Image processing, Noise, Noise Removal filters
Kuppusamy P
 
Flowchart design for algorithms
Flowchart design for algorithmsFlowchart design for algorithms
Flowchart design for algorithms
Kuppusamy P
 
Algorithm basics
Algorithm basicsAlgorithm basics
Algorithm basics
Kuppusamy P
 
Problem solving using Programming
Problem solving using ProgrammingProblem solving using Programming
Problem solving using Programming
Kuppusamy P
 
Parts of Computer, Hardware and Software
Parts of Computer, Hardware and Software Parts of Computer, Hardware and Software
Parts of Computer, Hardware and Software
Kuppusamy P
 
Strings in java
Strings in javaStrings in java
Strings in java
Kuppusamy P
 
Java methods or Subroutines or Functions
Java methods or Subroutines or FunctionsJava methods or Subroutines or Functions
Java methods or Subroutines or Functions
Kuppusamy P
 
Java arrays
Java arraysJava arrays
Java arrays
Kuppusamy P
 
Java iterative statements
Java iterative statementsJava iterative statements
Java iterative statements
Kuppusamy P
 
Java conditional statements
Java conditional statementsJava conditional statements
Java conditional statements
Kuppusamy P
 
Java data types
Java data typesJava data types
Java data types
Kuppusamy P
 
Java introduction
Java introductionJava introduction
Java introduction
Kuppusamy P
 
Logistic regression in Machine Learning
Logistic regression in Machine LearningLogistic regression in Machine Learning
Logistic regression in Machine Learning
Kuppusamy P
 
Anomaly detection (Unsupervised Learning) in Machine Learning
Anomaly detection (Unsupervised Learning) in Machine LearningAnomaly detection (Unsupervised Learning) in Machine Learning
Anomaly detection (Unsupervised Learning) in Machine Learning
Kuppusamy P
 
Machine Learning Performance metrics for classification
Machine Learning Performance metrics for classificationMachine Learning Performance metrics for classification
Machine Learning Performance metrics for classification
Kuppusamy P
 
Machine learning Introduction
Machine learning IntroductionMachine learning Introduction
Machine learning Introduction
Kuppusamy P
 
Image segmentation
Image segmentationImage segmentation
Image segmentation
Kuppusamy P
 
Image enhancement
Image enhancementImage enhancement
Image enhancement
Kuppusamy P
 
Feature detection and matching
Feature detection and matchingFeature detection and matching
Feature detection and matching
Kuppusamy P
 
Image processing, Noise, Noise Removal filters
Image processing, Noise, Noise Removal filtersImage processing, Noise, Noise Removal filters
Image processing, Noise, Noise Removal filters
Kuppusamy P
 
Flowchart design for algorithms
Flowchart design for algorithmsFlowchart design for algorithms
Flowchart design for algorithms
Kuppusamy P
 
Algorithm basics
Algorithm basicsAlgorithm basics
Algorithm basics
Kuppusamy P
 
Problem solving using Programming
Problem solving using ProgrammingProblem solving using Programming
Problem solving using Programming
Kuppusamy P
 
Parts of Computer, Hardware and Software
Parts of Computer, Hardware and Software Parts of Computer, Hardware and Software
Parts of Computer, Hardware and Software
Kuppusamy P
 
Java methods or Subroutines or Functions
Java methods or Subroutines or FunctionsJava methods or Subroutines or Functions
Java methods or Subroutines or Functions
Kuppusamy P
 
Java iterative statements
Java iterative statementsJava iterative statements
Java iterative statements
Kuppusamy P
 
Java conditional statements
Java conditional statementsJava conditional statements
Java conditional statements
Kuppusamy P
 
Java introduction
Java introductionJava introduction
Java introduction
Kuppusamy P
 
Logistic regression in Machine Learning
Logistic regression in Machine LearningLogistic regression in Machine Learning
Logistic regression in Machine Learning
Kuppusamy P
 
Anomaly detection (Unsupervised Learning) in Machine Learning
Anomaly detection (Unsupervised Learning) in Machine LearningAnomaly detection (Unsupervised Learning) in Machine Learning
Anomaly detection (Unsupervised Learning) in Machine Learning
Kuppusamy P
 
Machine Learning Performance metrics for classification
Machine Learning Performance metrics for classificationMachine Learning Performance metrics for classification
Machine Learning Performance metrics for classification
Kuppusamy P
 
Machine learning Introduction
Machine learning IntroductionMachine learning Introduction
Machine learning Introduction
Kuppusamy P
 
Ad

Recently uploaded (20)

Anti-Depressants pharmacology 1slide.pptx
Anti-Depressants pharmacology 1slide.pptxAnti-Depressants pharmacology 1slide.pptx
Anti-Depressants pharmacology 1slide.pptx
Mayuri Chavan
 
SCI BIZ TECH QUIZ (OPEN) PRELIMS XTASY 2025.pptx
SCI BIZ TECH QUIZ (OPEN) PRELIMS XTASY 2025.pptxSCI BIZ TECH QUIZ (OPEN) PRELIMS XTASY 2025.pptx
SCI BIZ TECH QUIZ (OPEN) PRELIMS XTASY 2025.pptx
Ronisha Das
 
How to manage Multiple Warehouses for multiple floors in odoo point of sale
How to manage Multiple Warehouses for multiple floors in odoo point of saleHow to manage Multiple Warehouses for multiple floors in odoo point of sale
How to manage Multiple Warehouses for multiple floors in odoo point of sale
Celine George
 
New Microsoft PowerPoint Presentation.pptx
New Microsoft PowerPoint Presentation.pptxNew Microsoft PowerPoint Presentation.pptx
New Microsoft PowerPoint Presentation.pptx
milanasargsyan5
 
How to Subscribe Newsletter From Odoo 18 Website
How to Subscribe Newsletter From Odoo 18 WebsiteHow to Subscribe Newsletter From Odoo 18 Website
How to Subscribe Newsletter From Odoo 18 Website
Celine George
 
One Hot encoding a revolution in Machine learning
One Hot encoding a revolution in Machine learningOne Hot encoding a revolution in Machine learning
One Hot encoding a revolution in Machine learning
momer9505
 
Geography Sem II Unit 1C Correlation of Geography with other school subjects
Geography Sem II Unit 1C Correlation of Geography with other school subjectsGeography Sem II Unit 1C Correlation of Geography with other school subjects
Geography Sem II Unit 1C Correlation of Geography with other school subjects
ProfDrShaikhImran
 
LDMMIA Reiki Master Spring 2025 Mini Updates
LDMMIA Reiki Master Spring 2025 Mini UpdatesLDMMIA Reiki Master Spring 2025 Mini Updates
LDMMIA Reiki Master Spring 2025 Mini Updates
LDM Mia eStudios
 
Michelle Rumley & Mairéad Mooney, Boole Library, University College Cork. Tra...
Michelle Rumley & Mairéad Mooney, Boole Library, University College Cork. Tra...Michelle Rumley & Mairéad Mooney, Boole Library, University College Cork. Tra...
Michelle Rumley & Mairéad Mooney, Boole Library, University College Cork. Tra...
Library Association of Ireland
 
Social Problem-Unemployment .pptx notes for Physiotherapy Students
Social Problem-Unemployment .pptx notes for Physiotherapy StudentsSocial Problem-Unemployment .pptx notes for Physiotherapy Students
Social Problem-Unemployment .pptx notes for Physiotherapy Students
DrNidhiAgarwal
 
Operations Management (Dr. Abdulfatah Salem).pdf
Operations Management (Dr. Abdulfatah Salem).pdfOperations Management (Dr. Abdulfatah Salem).pdf
Operations Management (Dr. Abdulfatah Salem).pdf
Arab Academy for Science, Technology and Maritime Transport
 
Biophysics Chapter 3 Methods of Studying Macromolecules.pdf
Biophysics Chapter 3 Methods of Studying Macromolecules.pdfBiophysics Chapter 3 Methods of Studying Macromolecules.pdf
Biophysics Chapter 3 Methods of Studying Macromolecules.pdf
PKLI-Institute of Nursing and Allied Health Sciences Lahore , Pakistan.
 
Marie Boran Special Collections Librarian Hardiman Library, University of Gal...
Marie Boran Special Collections Librarian Hardiman Library, University of Gal...Marie Boran Special Collections Librarian Hardiman Library, University of Gal...
Marie Boran Special Collections Librarian Hardiman Library, University of Gal...
Library Association of Ireland
 
Political History of Pala dynasty Pala Rulers NEP.pptx
Political History of Pala dynasty Pala Rulers NEP.pptxPolitical History of Pala dynasty Pala Rulers NEP.pptx
Political History of Pala dynasty Pala Rulers NEP.pptx
Arya Mahila P. G. College, Banaras Hindu University, Varanasi, India.
 
Handling Multiple Choice Responses: Fortune Effiong.pptx
Handling Multiple Choice Responses: Fortune Effiong.pptxHandling Multiple Choice Responses: Fortune Effiong.pptx
Handling Multiple Choice Responses: Fortune Effiong.pptx
AuthorAIDNationalRes
 
Multi-currency in odoo accounting and Update exchange rates automatically in ...
Multi-currency in odoo accounting and Update exchange rates automatically in ...Multi-currency in odoo accounting and Update exchange rates automatically in ...
Multi-currency in odoo accounting and Update exchange rates automatically in ...
Celine George
 
Presentation of the MIPLM subject matter expert Erdem Kaya
Presentation of the MIPLM subject matter expert Erdem KayaPresentation of the MIPLM subject matter expert Erdem Kaya
Presentation of the MIPLM subject matter expert Erdem Kaya
MIPLM
 
Presentation on Tourism Product Development By Md Shaifullar Rabbi
Presentation on Tourism Product Development By Md Shaifullar RabbiPresentation on Tourism Product Development By Md Shaifullar Rabbi
Presentation on Tourism Product Development By Md Shaifullar Rabbi
Md Shaifullar Rabbi
 
Odoo Inventory Rules and Routes v17 - Odoo Slides
Odoo Inventory Rules and Routes v17 - Odoo SlidesOdoo Inventory Rules and Routes v17 - Odoo Slides
Odoo Inventory Rules and Routes v17 - Odoo Slides
Celine George
 
K12 Tableau Tuesday - Algebra Equity and Access in Atlanta Public Schools
K12 Tableau Tuesday  - Algebra Equity and Access in Atlanta Public SchoolsK12 Tableau Tuesday  - Algebra Equity and Access in Atlanta Public Schools
K12 Tableau Tuesday - Algebra Equity and Access in Atlanta Public Schools
dogden2
 
Anti-Depressants pharmacology 1slide.pptx
Anti-Depressants pharmacology 1slide.pptxAnti-Depressants pharmacology 1slide.pptx
Anti-Depressants pharmacology 1slide.pptx
Mayuri Chavan
 
SCI BIZ TECH QUIZ (OPEN) PRELIMS XTASY 2025.pptx
SCI BIZ TECH QUIZ (OPEN) PRELIMS XTASY 2025.pptxSCI BIZ TECH QUIZ (OPEN) PRELIMS XTASY 2025.pptx
SCI BIZ TECH QUIZ (OPEN) PRELIMS XTASY 2025.pptx
Ronisha Das
 
How to manage Multiple Warehouses for multiple floors in odoo point of sale
How to manage Multiple Warehouses for multiple floors in odoo point of saleHow to manage Multiple Warehouses for multiple floors in odoo point of sale
How to manage Multiple Warehouses for multiple floors in odoo point of sale
Celine George
 
New Microsoft PowerPoint Presentation.pptx
New Microsoft PowerPoint Presentation.pptxNew Microsoft PowerPoint Presentation.pptx
New Microsoft PowerPoint Presentation.pptx
milanasargsyan5
 
How to Subscribe Newsletter From Odoo 18 Website
How to Subscribe Newsletter From Odoo 18 WebsiteHow to Subscribe Newsletter From Odoo 18 Website
How to Subscribe Newsletter From Odoo 18 Website
Celine George
 
One Hot encoding a revolution in Machine learning
One Hot encoding a revolution in Machine learningOne Hot encoding a revolution in Machine learning
One Hot encoding a revolution in Machine learning
momer9505
 
Geography Sem II Unit 1C Correlation of Geography with other school subjects
Geography Sem II Unit 1C Correlation of Geography with other school subjectsGeography Sem II Unit 1C Correlation of Geography with other school subjects
Geography Sem II Unit 1C Correlation of Geography with other school subjects
ProfDrShaikhImran
 
LDMMIA Reiki Master Spring 2025 Mini Updates

Recurrent Neural Networks (RNN)

  • 1. Deep Learning – Recurrent Neural Networks (RNN)
  • 2. Content
    • Architecture for an RNN
    • Forward propagation
    • Back propagation
    • Long Short Term Memory Networks (LSTM)
  • 3. Motivation: what about sequence prediction? What can I do when input size and output size vary?
  • 4. Architecture for an RNN
    RNNs are very powerful because they combine two properties:
    • A distributed hidden state that allows them to store a lot of information about the past efficiently.
    • Non-linear dynamics that allow them to update their hidden state in complicated ways.
    Notation:
    • Shared parameters/weights (matrix Wi).
    • Input layer (vectors x1, x2, …, xn), hidden layer (vectors h0, h1, h2, …, hn), output layer (vectors y1, y2, …, yn), neurons (aij), time step (t).
    [Diagram: a sequence of inputs flows through a chain of subunits, each passing some information to the next, producing a sequence of outputs; start-of-sequence and end-of-sequence markers delimit the sequence.]
  • 5. Architecture for an RNN
    • The recurrent net can be viewed as a layered feed-forward net with shared weights; we then train that feed-forward net with weight constraints.
    The training algorithm works in the time domain:
    1. Forward propagation: compute predictions.
    • The forward pass builds up a stack of the activities of all the units at each time step.
    • An input vector xᵢ is fed to the hidden state for each element i = 1, 2, …, n of the input sequence.
    • The input text must be encoded into numerical values, e.g. every letter in the word “dogs” is a one-hot encoded vector of dimension (4×1). Similarly, x can also be a word embedding or another numerical representation.
    • The hidden-state output vector, after the activation function is applied, is passed from the previous state to the next state of hidden nodes.
    • At time t, the architecture computes the hidden vector hₜ from the hidden vector of the previous state (t−1) and the input x at time t, i.e. it carries information from the inputs that come sequentially before the current one.
    • The h0 vector always starts as a vector of 0’s, because the algorithm has no information preceding the first element in the sequence.
    Sharing weights (back propagation with constrained weights): to constrain w1 = w2 we need Δw1 = Δw2, so we compute ∂E/∂w1 and ∂E/∂w2 and use ∂E/∂w1 + ∂E/∂w2 to update both w1 and w2.
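The forward pass described above can be sketched in NumPy. This is a minimal illustration, not the deck's own code: the variable names (W_xh, W_hh) and the random toy weights are assumptions, but the recurrence hₜ = tanh(W_xh·xₜ + W_hh·hₜ₋₁ + b) and the reuse of the same weight matrices at every time step follow the slide.

```python
import numpy as np

np.random.seed(0)
k, d = 4, 3                          # input (one-hot) size and hidden size, as in the "dogs" example
W_xh = np.random.randn(d, k) * 0.1   # input-to-hidden weights, shared across all time steps
W_hh = np.random.randn(d, d) * 0.1   # hidden-to-hidden weights, shared across all time steps
b_h = np.zeros((d, 1))

def forward(xs):
    """Run the forward pass, collecting the hidden states h_1..h_n.

    xs: list of (k, 1) input column vectors x_1..x_n.
    """
    h = np.zeros((d, 1))             # h_0 starts as zeros: nothing precedes the first element
    hs = []
    for x in xs:                     # the same weight matrices are reused at every step
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
        hs.append(h)
    return hs

xs = [np.eye(k)[:, [i]] for i in range(3)]   # toy one-hot inputs for "d", "o", "g"
hs = forward(xs)
print(len(hs), hs[0].shape)          # 3 (3, 1)
```

Note that the stack of hidden states `hs` is exactly what back propagation through time later walks backwards over.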
  • 6. Example for reference: the word “dogs,” where we want to train an RNN to predict the letter “s” given the letters “d”-“o”-“g”. We use 3 hidden nodes in our RNN (d = 3). The dimensions for each of our variables follow from this; here, k = 4, because our input x is a 4-dimensional one-hot vector over the letters in “dogs.”
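The one-hot encoding for the “dogs” example can be sketched as follows. The alphabet ordering and helper name are illustrative assumptions; the slide only fixes k = 4 and the (4×1) shape.

```python
import numpy as np

# One index per distinct letter in "dogs" (ordering is an assumption).
chars = ['d', 'o', 'g', 's']                 # k = 4 input dimensions
char_to_idx = {c: i for i, c in enumerate(chars)}

def one_hot(ch, k=4):
    """Return a (k, 1) column vector with a 1 at the letter's index."""
    v = np.zeros((k, 1))
    v[char_to_idx[ch]] = 1.0
    return v

x_d = one_hot('d')
print(x_d.ravel())    # [1. 0. 0. 0.]
print(x_d.shape)      # (4, 1)
```

With d = 3 hidden nodes, the shapes follow directly: the input-to-hidden matrix is (3×4), the hidden-to-hidden matrix is (3×3), and the hidden-to-output matrix is (4×3).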
  • 7. Architecture for an RNN
    • Activation functions: tanh at the hidden layers, softmax at the output layer.
    2. Compute the loss using cross entropy.
    3. Back propagation: compute the gradient (error derivatives) at each time step.
    • Update the weights to minimize the loss.
    • For the hidden state at t = 2, the inputs are the hidden output from t = 1 and x at t = 2.
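The softmax output and cross-entropy loss from steps 2–3 can be sketched like this. The weight matrix W_hy and the toy hidden state are assumptions for illustration; the softmax, the −log p[target] loss, and the (y − one-hot target) gradient at the logits are the standard quantities the slide refers to.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(p, target_idx):
    """Loss -log p[target] for a one-hot target."""
    return -np.log(p[target_idx, 0])

np.random.seed(1)
d, k = 3, 4
W_hy = np.random.randn(k, d) * 0.1     # hidden-to-output weights
h_t = np.tanh(np.random.randn(d, 1))   # an example hidden state
y_t = softmax(W_hy @ h_t)              # predicted distribution over the 4 letters

loss = cross_entropy(y_t, target_idx=3)   # target letter "s"
# The gradient of the loss w.r.t. the pre-softmax logits is simply (y - one_hot(target)),
# which is what back propagation pushes into the hidden layers at each time step.
dlogits = y_t.copy()
dlogits[3] -= 1.0
print(float(loss))
```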
  • 8. Applications of Recurrent Neural Networks (RNNs)
    1. Prediction problems
    2. Language modelling and generating text
    3. Machine translation
    4. Speech recognition
    5. Generating image descriptions
    6. Video tagging
    7. Text summarization
    8. Call center analysis
    9. Face detection, OCR applications such as image recognition
    10. Other applications, such as music composition
  • 9. Long Short Term Memory Networks (LSTMs)
    • LSTMs are a type of recurrent neural network (RNN) that can learn and memorize long-term dependencies.
    • LSTMs retain past information for long periods of time, which makes them very useful in time-series prediction.
    • LSTMs have a chain-like structure in which four interacting layers (memory cell, forget, input, output) communicate in a unique way.
    • An LSTM has three gates (forget, input, output) to protect and control the cell state.
    LSTM working principle:
    • First, they forget irrelevant information from the previous state, keeping only the relevant information.
    • Next, they selectively update the memory cell-state values.
    • The memory cell state carries relevant information from earlier time steps to later time steps throughout the processing of the sequence, reducing the effects of short-term memory.
    • As the cell state goes on its journey, information gets added to or removed from it via gates.
    • The gates are different neural networks that decide which information is allowed onto the cell state; during training the gates learn what information is relevant to keep or forget.
    • Finally, the LSTM outputs certain parts of the cell state.
  • 10. LSTM Architecture
    LSTM unit structure. Components in an LSTM:
    • Three gates: forget gate, input gate and output gate.
    • Memory cell state.
    Forward propagation: processes the data, passing on information. The difference from a plain RNN lies in the operations within the LSTM’s cells; these operations allow the LSTM to keep or forget information.
    Backward propagation: update the parameters to reduce the error.
  • 11. LSTM layers working principle
    • Gates are composed of a sigmoid neural-net layer and a pointwise multiplication operation.
    • The sigmoid layer outputs values between zero and one that describe how much of each component to pass through or remove. A value of zero means “let nothing through,” while a value of one means “pass everything.”
    Forget gate layer:
    • Decides what information to throw away from the memory cell state.
    • A 1 represents “completely keep this,” while a 0 represents “completely reject this.”
    Input gate layer:
    • The next step is to decide what new information we’re going to store in the cell state. This has two parts.
    • First, a sigmoid layer called the “input gate layer” decides which values we’ll update.
    • Next, a tanh layer creates a vector of new candidate values, C̃t, that could be added to the state.
    • In the next step, we combine these two to create an update to the state.
    Memory cell state:
    • Update the old cell state, Ct−1, into the new cell state Ct.
    • Multiply the old state by ft, forgetting the things we decided to forget earlier.
    • Then we add it ∗ C̃t, the new candidate values scaled by how much we decided to update each state value.
    Output gate layer:
    • Decides the output based on the cell state, but as a filtered version.
    • First, run a sigmoid layer that decides what parts of the cell state to output.
    • Then put the cell state through tanh (to push the values to between −1 and 1) and multiply it by the output of the sigmoid gate, so that we only output the parts we decided to.
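A single LSTM time step following the gate equations above can be sketched as follows. This is an illustrative sketch, not the deck's code: the dict-of-matrices layout, the concatenation of [h_prev; x], and the toy random weights are assumptions, while the forget/input/candidate/output computations mirror the slide step for step.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, C_prev, W, b):
    """One LSTM time step.

    W: dict of (d, d + k) weight matrices applied to the concatenation [h_prev; x];
    b: dict of (d, 1) biases. The gate names f, i, c, o are illustrative.
    """
    z = np.vstack([h_prev, x])                # concatenate previous hidden state and input
    f = sigmoid(W['f'] @ z + b['f'])          # forget gate: what to drop from C_prev
    i = sigmoid(W['i'] @ z + b['i'])          # input gate: which values to update
    C_tilde = np.tanh(W['c'] @ z + b['c'])    # candidate values that could be added
    C = f * C_prev + i * C_tilde              # new cell state: forget, then add scaled candidates
    o = sigmoid(W['o'] @ z + b['o'])          # output gate: what parts of the state to emit
    h = o * np.tanh(C)                        # new hidden state: filtered view of the cell state
    return h, C

np.random.seed(2)
k, d = 4, 3
W = {g: np.random.randn(d, d + k) * 0.1 for g in 'fico'}
b = {g: np.zeros((d, 1)) for g in 'fico'}
h, C = lstm_step(np.eye(k)[:, [0]], np.zeros((d, 1)), np.zeros((d, 1)), W, b)
print(h.shape, C.shape)   # (3, 1) (3, 1)
```

Note how the cell state C is updated only by pointwise multiplication and addition, which is what lets relevant information travel across many time steps.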
  • 12. Types of LSTM models based on input and output
    • One input to one output, e.g. giving a label to an image.
    • One input to many outputs, e.g. giving a description/caption to an image (the description is a sequence of words, i.e. many outputs).
    • Many inputs to one output, e.g. predicting the next word given an incomplete statement.
    • Many inputs to many outputs, e.g. stock market prediction for the following days based on past data.
  • 13. Applications of LSTM
    1. Speech recognition (input is audio, output is text): Google Assistant, Microsoft Cortana, Apple Siri
    2. Machine translation (input is text, output is also text): Google Translate
    3. Image captioning (input is an image, output is text)
    4. Sentiment analysis (input is text, output is a rating)
    5. Music generation/synthesis (input is music notes, output is music)
    6. Video activity recognition (input is video, output is the type of activity)