LONG SHORT-TERM MEMORY
Ssenjovu Emmanuel Joster
Student-No: 2001200052
Bsc. Information Technology
Faculty of Technoscience
Dept. Computer Science & Electrical Engineering
Muni University || P.O.Box 725, Arua(UG)
First things first...
Artificial Neurons
Artificial neurons are inspired by and modeled after the biological structure of the human brain.
Artificial neural networks (ANNs) are capable of learning to solve problems in a way our brains do naturally.
Connected neurons then form a network, hence the name neural network, consisting of an input layer, a hidden layer, and an output layer.
How do ANNs work?
A case of feedforward ANNs
a) Forward Propagation
● First step in a neural network.
● The network makes a prediction on what the output would be given an input.
● To propagate the input across the layers, each layer computes a weighted sum of its inputs plus a bias and applies an activation function to the result.
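Forward propagation can be sketched in a few lines of Python. This is a minimal illustration, not the deck's own code: the sigmoid activation, the function names (dense, forward) and the weights are all assumptions made up for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases):
    """One layer: weighted sum of the inputs plus a bias, then sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Propagate the input through each (weights, biases) layer in turn."""
    activation = inputs
    for weights, biases in layers:
        activation = dense(activation, weights, biases)
    return activation

# Hypothetical network: 2 inputs, 2 hidden neurons, 1 output (made-up weights).
layers = [
    ([[0.5, -0.5], [0.25, 0.75]], [0.0, 0.0]),  # hidden layer
    ([[1.0, -1.0]], [0.0]),                     # output layer
]
prediction = forward([1.0, 1.0], layers)
```

The prediction is a one-element list holding a value between 0 and 1, because the output neuron also uses the sigmoid.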
How do ANNs work?
b) Back Propagation
● Comes into play in the training phase of a neural network.
● Involves adjusting the weights until the network can produce the desired outputs.
● We calculate the error and its gradients with respect to each weight at each layer, and subsequently move each weight a small step against its gradient.
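The weight adjustment can be illustrated on the smallest possible case, a single sigmoid neuron. This is a sketch under assumptions that are not in the slides: a squared-error loss, plain gradient descent, and a made-up learning rate, input and target.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(w, b, x, target, lr):
    """One gradient-descent update for a single sigmoid neuron."""
    y = sigmoid(w * x + b)             # forward pass
    error = 0.5 * (y - target) ** 2    # squared error (assumed loss)
    # Chain rule: dE/dw = dE/dy * dy/dz * dz/dw
    delta = (y - target) * y * (1.0 - y)
    grad_w = delta * x
    grad_b = delta
    # Move each parameter a small step against its gradient.
    return w - lr * grad_w, b - lr * grad_b, error

w, b = 0.0, 0.0
errors = []
for _ in range(200):
    w, b, err = train_step(w, b, x=1.0, target=0.9, lr=1.0)
    errors.append(err)
```

The list errors records the error before each update, so comparing its first and last entries shows how far training has reduced it.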
Traditional RNN
● Built upon ANNs like feedforward neural networks.
● Have additional connections between layers that form feedback loops.
● Ideal for sequential inputs, e.g. text, music, speech, handwriting, and price changes in stock markets.
a) Forward Propagation
The previous step's hidden layer and final outputs are fed back into the network and used as input to the next step's hidden layer, which means the network remembers the past while it repeatedly predicts what will happen next.
Memory heavy, and hard to train for long-term temporal dependencies.
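The feedback loop can be sketched with a single recurrent unit. This is a simplified illustration: the scalar weights w_x and w_h, the tanh activation and the function name rnn_forward are assumptions, not taken from the slides.

```python
import math

def rnn_forward(xs, w_x, w_h, b, h0=0.0):
    """Run a scalar recurrent unit over a sequence and return every hidden state."""
    h = h0
    states = []
    for x in xs:
        # The previous hidden state h is fed back in alongside the new input x.
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states

states = rnn_forward([1.0, 0.0, 0.0], w_x=1.0, w_h=0.5, b=0.0)
```

Only the first input is non-zero, yet the later hidden states are non-zero too: the network is still carrying the first input forward through its hidden state.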
Traditional RNN
a) Back Propagation Through Time (BPTT)
● We calculate the error the weight matrices W generate, and then adjust their weights until the error cannot go any lower.
● To compute the gradient for the current W, we need to apply the chain rule through a series of previous time steps. Because of this, we call the process back propagation through time (BPTT). If the sequences are quite long, BPTT can take a long time.
In practice many people truncate the backpropagation to a few steps instead of going all the way to the beginning.
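The chain rule behind BPTT can be written out as follows. This is a sketch in assumed notation that does not appear in the slides: h_t is the hidden state at time step t, E_t is the error at step t, and \partial^{+} h_k / \partial W is the immediate derivative of h_k with respect to W, treating h_{k-1} as a constant.

```latex
\frac{\partial E_t}{\partial W}
  = \sum_{k=1}^{t}
    \frac{\partial E_t}{\partial h_t}
    \left( \prod_{j=k+1}^{t} \frac{\partial h_j}{\partial h_{j-1}} \right)
    \frac{\partial^{+} h_k}{\partial W}
```

Truncating the backpropagation to the last \tau steps corresponds to starting the sum at k = t - \tau + 1 instead of k = 1.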
The Vanishing Gradient
In multilayered ANNs, e.g. RNNs, the vanishing gradient problem refers to the situation where the gradients (derivatives) of the loss function with respect to the weights of early layers in a deep neural network become too small to allow training for tasks which require long-term dependency. For example, if we are predicting the next word in a longer multi-sentence paragraph, it is less likely that the model will be able to remember the first words at the beginning of the paragraph.
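The effect can be seen numerically. The sketch below assumes a scalar tanh RNN with made-up weights (w_x, w_h) and a constant input; it multiplies together the per-step derivatives that the chain rule produces and compares a short sequence with a long one.

```python
import math

def gradient_through_time(xs, w_x, w_h):
    """Return d h_T / d h_0 for a scalar tanh RNN, using the chain rule."""
    h = 0.0
    grad = 1.0
    for x in xs:
        h = math.tanh(w_x * x + w_h * h)
        grad *= w_h * (1.0 - h * h)   # d h_t / d h_{t-1}
    return grad

short_grad = gradient_through_time([0.5] * 5, w_x=1.0, w_h=0.5)
long_grad = gradient_through_time([0.5] * 50, w_x=1.0, w_h=0.5)
```

Every factor here is at most 0.5, so the gradient after 50 steps is many orders of magnitude smaller than after 5 steps, which is why the earliest inputs barely influence the weight updates.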
LONG SHORT-TERM MEMORY
● LSTM is an improved RNN architecture that addresses the issues of the general RNN.
● Enables long-range dependencies.
● Has better memory through linear memory cells surrounded by a set of gate units used to control the flow of information.
● The memory cell's recurrent self-connection applies no squashing activation function, so the gradient along that path is much less prone to vanishing during back propagation.
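Why the linear memory cell helps can be seen from its update rule. This is a sketch in assumed notation that is not in the slides: c_t is the memory cell, f_t and i_t are the forget and input gate activations, \tilde{c}_t is the vector of candidate values, and \odot is element-wise multiplication.

```latex
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t
\qquad\Longrightarrow\qquad
\frac{\partial c_t}{\partial c_{t-1}} = \operatorname{diag}(f_t)
\quad \text{(direct cell-to-cell path, gates held fixed)}
```

When the forget gate stays close to 1, the gradient along this path passes from step to step almost unchanged instead of being squashed at every step.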
The “Hidden Layer” of an LSTM
The hidden layer is made of a cell which is surrounded by gates that control the flow of information.
The gates of an LSTM
The gates perform the following functions to control the flow of information:
1. Forget Gate
● First step, in which the sigmoid function outputs a value ranging from 0 to 1 that determines how much of the previous memory cell to retain, based on the previous hidden state and the current input.
● LSTM does not necessarily need to remember everything that has happened in the past.
2. Input Gate
Next step, which involves two parts:
● First, the input gate determines what new information to store in the memory cell.
● Next, a tanh layer creates a vector of new candidate values to be added to the state.
3. Output Gate
● To determine what to output from the memory cell, we again apply the sigmoid function to the previous hidden state and current input, then multiply that with tanh applied to the new memory cell (tanh keeps the values between -1 and 1).
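The three gates can be put together into one time step. This is a minimal scalar sketch, not a full implementation: the function name lstm_step, the parameter layout and the parameter values are all made up for illustration, and real LSTMs use weight matrices over vectors rather than single numbers.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One time step of a scalar LSTM cell.

    p maps each of "f", "i", "g", "o" to a (w_x, w_h, b) tuple (assumed layout).
    """
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("f"))      # forget gate: how much of c_prev to keep
    i = sigmoid(pre("i"))      # input gate: how much new information to store
    g = math.tanh(pre("g"))    # candidate values, between -1 and 1
    c = f * c_prev + i * g     # new memory cell
    o = sigmoid(pre("o"))      # output gate
    h = o * math.tanh(c)       # new hidden state
    return h, c

# Made-up parameters, chosen only for illustration.
params = {name: (0.5, 0.5, 0.0) for name in ("f", "i", "g", "o")}
h, c = 0.0, 0.0
for x in [1.0, -1.0, 0.5]:
    h, c = lstm_step(x, h, c, params)
```

The hidden state h always stays between -1 and 1, since it is a sigmoid output multiplied by a tanh output.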
WHY LSTMs?
Memory
LSTM has an actual memory built into the architecture, which a traditional RNN lacks.
Vanishing/Exploding gradients
LSTMs can deal with these problems.
Accuracy
More accurate predictions for larger sequential data.
Long-term dependency
Able to capture complex patterns in huge datasets.
Thank you
E.J.Ssenjovu
An Introduction to Long Short-term Memory (LSTMs)
