
LSTM Presentation

The document provides an overview of Long Short-Term Memory (LSTM) networks, which are a type of recurrent neural network designed to process sequential data while addressing issues like long-term dependencies and the vanishing gradient problem. It details the history of LSTM development, its structure including memory cells and gates, and contrasts it with traditional RNNs. Key features of LSTMs include the input, forget, and output gates that manage information flow, enabling effective learning from sequential data.


2024
Long Short-Term Memory (LSTM) Model
Group Members
Burhan Ahmed (bai222001)
Saif Ur Rehman (bcs222001)
Halima Sadia (bse222002)
CONTENT

1. Introduction to LSTM
2. Features of LSTM
1. Introduction to LSTM
Overview of Long Short-Term Memory Networks

What is LSTM?
• LSTM is a type of recurrent neural network (RNN) that can process and analyze sequential data, such as text, speech, and time series.
• LSTMs use a memory cell and gates to control the flow of information.
• The memory cell stores information from previous time steps and uses it to influence the output of the cell at the current time step.
• The output of each LSTM cell is passed to the next cell in the network, allowing the LSTM to process and analyze sequential data over multiple time steps (see the sketch below).
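As a minimal sketch of this step-by-step processing (an illustration added here, not code from the presentation; it assumes PyTorch is installed, and the layer sizes and random input are made up):

import torch
import torch.nn as nn

# Hypothetical sizes: 8 input features per time step, 16 hidden/memory units.
lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)

x = torch.randn(4, 10, 8)          # 4 sequences, 10 time steps, 8 features each
outputs, (h_n, c_n) = lstm(x)      # hidden and cell state are carried from step to step internally

print(outputs.shape)               # torch.Size([4, 10, 16]) -> one hidden state per time step
print(h_n.shape)                   # torch.Size([1, 4, 16])  -> final hidden state
print(c_n.shape)                   # torch.Size([1, 4, 16])  -> final memory-cell state

Each row of outputs is the hidden state the cell emits at one time step, i.e. the per-step output that is passed along the sequence as described above.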

LSTM
HISTORY
• Early 1990s: LSTM concept developed by Sepp Hochreiter and Jürgen Schmidhuber.
• 1997: LSTM paper published, introducing the design with input and output gates; the forget gate was added by Gers et al. in 1999.
• 2015: Rise of attention mechanisms (and, from 2017, Transformers) challenging LSTMs.
• 2020: New architectures and training algorithms.
• 2021: Introduction of Corrector LSTM for accurate predictions.
• 2024: NXAI introduced xLSTM (Extended LSTM), scaling to billions of parameters.
Recurrent Neural Network (RNN)
What is an RNN? Basic Structure
Definition: Recurrent Neural Networks (RNNs) are neural networks designed to process sequential data with temporal dependencies.
Key Characteristics:
• Analyze data with a temporal dimension (e.g., time series, speech, text).
• Use a hidden state passed from one timestep to the next.
• The hidden state updates based on the current input and the previous hidden state (see the update rule written out below).
Strengths: Excellent at capturing short-term dependencies.
Challenges: Struggle to handle long-term dependencies due to vanishing or exploding gradients.
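For reference, the hidden-state update described above is usually written as follows (standard vanilla-RNN notation, not shown on the slide itself):

h_t = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b_h)

where x_t is the input at timestep t, h_{t-1} is the previous hidden state, and W_{xh}, W_{hh}, and b_h are learned parameters shared across all timesteps.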
Vanilla RNN
PROBLEMS IN RNN
Long-Term Dependency Issue in RNN

Let us consider a sentence:

"I am a data science student and I love machine _______."

We know the blank has to be filled with 'learning'. But had there been many terms after "I am a data science student", such as "I am a data science student pursuing an MS from the University of ...... and I love machine _______", the RNN fails: the context it needs ("data science student") now lies far back in the sequence, separated from the blank by information that is not needed, like "pursuing an MS from the University of ......".
What LSTMs do is leverage their forget gate to eliminate the unnecessary information, which helps them handle long-term dependencies.
VANISHING GRADIENT PROBLEM
• RNNs use the tanh (hyperbolic tangent) activation function.
• Its output range is [-1, 1]; its derivative lies in (0, 1].
• As the input sequence gets longer, RNNs perform repeated matrix multiplications.
• Backpropagation through time applies the chain rule of differentiation across all timesteps.
• Multiplying many small numbers together makes the gradients exponentially smaller.
• Gradient values approach 0, effectively halting weight updates.
• The result is poor training and an inability to capture long-term dependencies (see the numeric sketch below).
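To make the "multiplying small numbers" point concrete, here is a minimal numeric sketch (added for illustration, not from the slides; it ignores the recurrent weight matrix and uses arbitrary pre-activation values):

import numpy as np

def tanh_derivative(z):
    # d/dz tanh(z) = 1 - tanh(z)^2, always in (0, 1]
    return 1.0 - np.tanh(z) ** 2

rng = np.random.default_rng(0)
pre_activations = rng.normal(0.0, 1.0, size=50)   # one arbitrary pre-activation per timestep

gradient = 1.0
for t, z in enumerate(pre_activations, start=1):
    gradient *= tanh_derivative(z)                # chain rule adds one tanh' factor per timestep
    if t in (5, 10, 25, 50):
        print(f"after {t:2d} steps: gradient factor is about {gradient:.2e}")

# The printed factor shrinks roughly exponentially with the number of timesteps,
# which is why gradients from distant timesteps barely update the weights.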
Vanishing Gradient Problem
RNN VS LSTM
Graphs of Sigmoid and Tanh Functions
2. Features of LSTM
Structure, Gates, Memory Cell

Structure of LSTM
Memory Cell
• Memory cells maintain information across time steps, enabling LSTMs to learn and utilize long-term dependencies in data.
• The content of the memory cell is updated or modified by the interaction of three gates (a combined sketch follows this list):
• Input Gate: Determines what new information to add to the memory.
• Forget Gate: Decides which information to erase.
• Output Gate: Controls what part of the memory is used for the current output.
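As a minimal sketch of how these three gates interact inside one memory cell (standard LSTM equations written in plain NumPy; the weight shapes and random inputs are illustrative assumptions, not material from the presentation):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell_step(x_t, h_prev, c_prev, W, b):
    # One LSTM timestep. W maps [h_prev, x_t] to the four gate pre-activations.
    concat = np.concatenate([h_prev, x_t])
    z = W @ concat + b
    H = h_prev.shape[0]
    f = sigmoid(z[0:H])          # forget gate: what to erase from the memory
    i = sigmoid(z[H:2*H])        # input gate: what new information to add
    g = np.tanh(z[2*H:3*H])      # candidate values for the memory cell
    o = sigmoid(z[3*H:4*H])      # output gate: what part of the memory to expose
    c_t = f * c_prev + i * g     # updated memory-cell content
    h_t = o * np.tanh(c_t)       # hidden state / output at this timestep
    return h_t, c_t

# Hypothetical sizes: 3 input features, hidden size 4.
rng = np.random.default_rng(0)
input_size, hidden_size = 3, 4
W = rng.normal(0.0, 0.1, size=(4 * hidden_size, hidden_size + input_size))
b = np.zeros(4 * hidden_size)

h, c = np.zeros(hidden_size), np.zeros(hidden_size)
for x_t in rng.normal(size=(5, input_size)):   # a short sequence of 5 timesteps
    h, c = lstm_cell_step(x_t, h, c, W, b)
print(h)   # final hidden state after the sequence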
Memory Cell
LSTM Memory Cell
Input Gate

As discussed earlier, the input gate selectively admits information that is relevant at the current timestep. It is the gate that determines, using the sigmoid activation function, which parts of the current input are necessary and which are not, and stores that selection for the current cell state. Next comes the tanh activation, which computes a vector of candidate values; scaled by the input-gate values, these candidates are added to the cell state (see the equations below).
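In standard notation (written out here for reference, not shown on the slide), with [h_{t-1}, x_t] denoting the previous hidden state concatenated with the current input:

i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)
\tilde{C}_t = \tanh(W_C [h_{t-1}, x_t] + b_C)

The product i_t \odot \tilde{C}_t is the new information that gets added to the memory cell.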
Forget Gate

We already discussed, while introducing the gates, that the hidden state is responsible for producing the output. The output generated from the hidden state at timestep (t-1) is h(t-1). The forget gate receives the current input x(t) together with h(t-1), multiplies them by its weight matrix, and applies a sigmoid activation, which generates scores between 0 and 1. These scores help the cell determine which stored information is useful and which is irrelevant and should be erased (see the equation below).
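Again in standard notation (an added reference, not taken from the slide):

f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)
C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t

Entries of f_t near 0 erase the corresponding entries of the previous cell state C_{t-1}, while entries near 1 keep them.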
THANK YOU!