Recurrent Neural Networks
Fei-Fei Li & Justin Johnson & Serena Yeung, Lecture 10, May 2, 2019

Last Time: CNN Architectures
- AlexNet
- GoogLeNet
- ResNet
- SENet
Comparing complexity...
Figures copyright Alfredo Canziani, Adam Paszke, Eugenio Culurciello, 2017. Reproduced with permission.
Efficient networks...
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications [Howard et al. 2017]
Meta-learning: Learning to learn network architectures...
Learning Transferable Architectures for Scalable Image Recognition [Zoph et al. 2017]
- Applying neural architecture search (NAS) to a large dataset like ImageNet is expensive
- Design a search space of building blocks (“cells”) that can be flexibly stacked
- NASNet: Use NAS to find the best cell structure on the smaller CIFAR-10 dataset, then transfer the architecture to ImageNet
- Many follow-up works in this space, e.g. AmoebaNet (Real et al. 2019) and ENAS (Pham, Guan et al. 2018)
Today: Recurrent Neural Networks
“Vanilla” Neural Network
Recurrent Neural Networks: Process Sequences
[Figure: one-to-one, one-to-many, many-to-one, and many-to-many input/output patterns, e.g. image captioning (one-to-many), sentiment classification (many-to-one), machine translation and per-frame video classification (many-to-many)]
Sequential Processing of Non-Sequence Data
Classify images by taking a series of “glimpses”:
Ba, Mnih, and Kavukcuoglu, “Multiple Object Recognition with Visual Attention”, ICLR 2015
Generate images one piece at a time!
Gregor et al, “DRAW: A Recurrent Neural Network For Image Generation”, ICML 2015
Figure copyright Karol Gregor, Ivo Danihelka, Alex Graves, Danilo Jimenez Rezende, and Daan Wierstra, 2015. Reproduced with permission.
Recurrent Neural Network

Key idea: RNNs have an “internal state” that is updated as a sequence is processed. We can process a sequence of vectors x by applying a recurrence formula at every time step:

    h_t = f_W(h_{t-1}, x_t)

where h_t is the new state, h_{t-1} is the old state, x_t is the input vector at some time step, and f_W is some function with parameters W. The same function and the same set of parameters are used at every time step.

(Vanilla) Recurrent Neural Network: the state is a single “hidden” vector h, with

    h_t = tanh(W_hh h_{t-1} + W_xh x_t)
    y_t = W_hy h_t

Sometimes called a “Vanilla RNN” or an “Elman RNN” after Prof. Jeffrey Elman.
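A minimal sketch of this step in NumPy (illustrative only, not the course's reference code; the weight names mirror the formulas above and the sizes are made up):

import numpy as np

def rnn_step(x, h_prev, Wxh, Whh, Why, bh, by):
    # New state from old state and current input, squashed by tanh
    h = np.tanh(Wxh @ x + Whh @ h_prev + bh)
    # Output read out from the new state
    y = Why @ h + by
    return h, y

# Toy sizes: 3-dim input, 5-dim hidden state, 4-dim output
rng = np.random.default_rng(0)
Wxh = 0.1 * rng.standard_normal((5, 3))
Whh = 0.1 * rng.standard_normal((5, 5))
Why = 0.1 * rng.standard_normal((4, 5))
h, y = rnn_step(rng.standard_normal(3), np.zeros(5), Wxh, Whh, Why, np.zeros(5), np.zeros(4))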
RNN: Computational Graph
[Figure: the recurrence unrolled over time: an initial state h0 and inputs x1, x2, x3, ..., with the same f_W and the same weight matrix W re-used at every step; each step produces a hidden state h_t and an output y_t, and during training a per-step loss L_t is computed from each y_t and summed into the total loss L.]
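A sketch of what the unrolled graph computes in a many-to-many setup, assuming integer class targets and a softmax cross-entropy loss at every step (all names here are illustrative):

import numpy as np

def softmax(s):
    e = np.exp(s - s.max())
    return e / e.sum()

def unrolled_forward(xs, targets, h0, Wxh, Whh, Why, bh, by):
    # The SAME weights are reused at every time step
    h, total_loss = h0, 0.0
    for x, t in zip(xs, targets):
        h = np.tanh(Wxh @ x + Whh @ h + bh)   # hidden state h_t
        p = softmax(Why @ h + by)             # output distribution at step t
        total_loss += -np.log(p[t])           # per-step loss L_t, summed into L
    return total_loss, h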
Sequence to Sequence: Many-to-one + One-to-many
[Figure: an encoder RNN with weights W_1 reads the input sequence x_1 ... x_T and summarizes it in a single hidden vector; a decoder RNN with weights W_2 then unrolls from that vector to produce the output sequence.]
Sutskever et al, “Sequence to Sequence Learning with Neural Networks”, NIPS 2014
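A rough sketch of the idea with separate encoder and decoder weights (all names are illustrative, not Sutskever et al.'s implementation):

import numpy as np

def encode(xs, Wxh1, Whh1, b1):
    # Many-to-one: compress the whole input sequence into one vector
    h = np.zeros(Whh1.shape[0])
    for x in xs:
        h = np.tanh(Wxh1 @ x + Whh1 @ h + b1)
    return h

def decode(h_enc, steps, Whh2, Why2, b2, by2):
    # One-to-many: unroll a second RNN from the encoder's summary vector
    # (simplified: no feedback of the previous output in this sketch)
    h, ys = h_enc, []
    for _ in range(steps):
        h = np.tanh(Whh2 @ h + b2)
        ys.append(Why2 @ h + by2)
    return ys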
Example: Character-level Language Model
Vocabulary: [h, e, l, o]
Example training sequence: “hello”
[Figure: each character is fed in as a one-hot vector; at every step the model outputs scores for the next character, and training pushes up the score of the correct next character.]
Example: Character-level Language Model Sampling
[Figure: at test time the output scores are pushed through a softmax to give a probability distribution over {h, e, l, o}; a character is sampled from that distribution and fed back in as the next input.]
Vocabulary: [h, e, l, o]
At test time, sample characters one at a time and feed each sampled character back to the model as the next input.
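A sampling loop in the spirit of min-char-rnn (a sketch, not a copy; the weights and the seed character index are placeholders):

import numpy as np

def sample(h, seed_ix, n, Wxh, Whh, Why, bh, by):
    vocab_size = Why.shape[0]
    x = np.zeros(vocab_size)
    x[seed_ix] = 1                                # one-hot encoding of the seed character
    sampled = []
    for _ in range(n):
        h = np.tanh(Wxh @ x + Whh @ h + bh)
        scores = Why @ h + by
        p = np.exp(scores - scores.max())
        p /= p.sum()                              # softmax over the vocabulary
        ix = np.random.choice(vocab_size, p=p)    # sample one character
        x = np.zeros(vocab_size)
        x[ix] = 1                                 # feed the sampled character back in
        sampled.append(ix)
    return sampled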
min-char-rnn (https://gist.github.com/karpathy/d4dee566867f8291f086)
[Figure: training loss curve, and text sampled from an RNN trained on Shakespeare; with more training (“train more”) the samples become increasingly plausible, e.g. “VIOLA: I'll drink it.”]
Searching for interpretable cells
[Figure: the activation of individual hidden cells visualized over input text; some cells track interpretable features, e.g. a quote/comment cell and an if-statement cell.]
Karpathy, Johnson, and Fei-Fei, “Visualizing and Understanding Recurrent Networks”, ICLR Workshop 2016
Figures copyright Karpathy, Johnson, and Fei-Fei, 2015; reproduced with permission
Image Captioning
[Figure: a test image is passed through a CNN; the resulting feature vector v conditions an RNN language model that generates the caption one word at a time.]

The image information enters the recurrence as an extra additive term:

    before: h = tanh(Wxh * x + Whh * h)
    now:    h = tanh(Wxh * x + Whh * h + Wih * v)

Generation starts from a special <START> token x0: the first output distribution y0 is sampled to get the first word (e.g. “straw”), that word is fed back in as the next input, the next distribution y1 is sampled (e.g. “hat”), and so on through hidden states h0, h1, h2, ..., until the model samples an <END> token, which finishes the caption.
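A sketch of the conditioned step (the weight names follow the slide; everything else, including where v comes from, is illustrative):

import numpy as np

def caption_step(x, h_prev, v, Wxh, Whh, Wih, b):
    # before: h = tanh(Wxh @ x + Whh @ h_prev)
    # now:    the image feature vector v adds one extra term at every step
    return np.tanh(Wxh @ x + Whh @ h_prev + Wih @ v + b)

At generation time this step would be called inside the same sample-and-feed-back loop as the character-level model above, stopping when <END> is sampled.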
Image Captioning: Example Results
- A cat sitting on a suitcase on the floor
- A cat is sitting on a tree branch
- A dog is running in the grass with a frisbee
- A white teddy bear sitting in the grass
- Two people walking on the beach with surfboards
- A tennis player in action on the court
- Two giraffes standing in a grassy field
- A man riding a dirt bike on a dirt track

Image Captioning: Failure Cases
- A bird is perched on a tree branch
- A man in a baseball uniform throwing a ball
- A woman standing on a beach holding a surfboard
- A person holding a computer mouse on a desk
Image Captioning with Attention
Xu et al, “Show, Attend, and Tell: Neural Image Caption Generation with Visual Attention”, ICML 2015
Figure copyright Kelvin Xu, Jimmy Lei Ba, Jamie Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard S. Zemel, and Yoshua Bengio, 2015. Reproduced with permission.

[Figure: the CNN turns the H x W x 3 image into an L x D grid of features (one D-dimensional vector per spatial location). From h0 the model computes an attention distribution a1 over the L locations; the attention-weighted combination of features gives a D-dimensional vector z1; z1 and the first word y1 feed the next step h1, which outputs the next attention distribution a2 and a distribution d1 over words, and so on.]
Soft attention: take a weighted combination over all image locations; this is differentiable, so the model trains with standard backpropagation (see the sketch below).
Hard attention: attend to a single sampled location at each step; the discrete choice is not differentiable, so training needs something like reinforcement learning.
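A minimal sketch of the soft-attention computation (the score function and the names here are simplified and illustrative relative to Xu et al.):

import numpy as np

def soft_attention(features, h, Wa):
    # features: (L, D) grid of CNN features; h: current RNN hidden state
    scores = features @ (Wa @ h)        # one scalar score per location
    a = np.exp(scores - scores.max())
    a /= a.sum()                        # attention distribution over the L locations
    z = a @ features                    # weighted combination of features, shape (D,)
    return a, z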
Multilayer RNNs
[Figure: hidden states arranged on a grid over depth and time; the same stacking applies to LSTMs.]
Vanilla RNN Gradient Flow
[Figure: a single vanilla RNN cell, where h_{t-1} and x_t are stacked, multiplied by W, and passed through tanh to produce h_t, followed by an unrolled chain h0 → h1 → h2 → h3 → h4.]

Backpropagation from h_t to h_{t-1} multiplies by W^T (more precisely, by W_hh^T). Computing the gradient of h0 therefore involves many repeated factors of W (and repeated tanh): if the largest singular value of W_hh is greater than 1 the gradients tend to explode, and if it is less than 1 they tend to vanish. Exploding gradients can be controlled with gradient clipping; for vanishing gradients we change the RNN architecture (see the LSTM below).
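A small numerical illustration of this effect (made-up numbers, not from the lecture): repeatedly multiplying a gradient by W_hh^T blows it up or shrinks it toward zero depending on the largest singular value.

import numpy as np

rng = np.random.default_rng(0)
upstream = rng.standard_normal(64)          # gradient arriving at the last time step

for scale in (1.1, 0.9):                    # largest singular value of Whh
    Whh = scale * np.eye(64)                # toy weight matrix with known singular values
    g = upstream.copy()
    for _ in range(50):                     # backprop through 50 time steps
        g = Whh.T @ g                       # each step multiplies by Whh^T (tanh ignored)
    print(f"largest singular value {scale}: gradient norm after 50 steps = {np.linalg.norm(g):.4g}")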
Long Short Term Memory (LSTM)
[Hochreiter and Schmidhuber, 1997]

From the stacked vector (h_{t-1}; x_t) of size 2h, a single weight matrix W of shape 4h x 2h produces four h-dimensional gate vectors:

    i = sigmoid(...)   input gate: whether to write to the cell
    f = sigmoid(...)   forget gate: whether to erase the cell
    o = sigmoid(...)   output gate: how much to reveal the cell
    g = tanh(...)      “gate gate”: how much to write to the cell

    c_t = f ⊙ c_{t-1} + i ⊙ g
    h_t = o ⊙ tanh(c_t)

[Figure: the LSTM cell diagram, with c_{t-1} flowing through an elementwise multiply by f and an elementwise add of i ⊙ g to give c_t, which then produces h_t.]
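A sketch of a single LSTM step following these definitions (a toy NumPy version, assuming x has the same dimension h as the hidden state so that W is 4h x 2h):

import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def lstm_step(x, h_prev, c_prev, W, b):
    hdim = h_prev.shape[0]
    gates = W @ np.concatenate([h_prev, x]) + b   # shape (4h,): chunks for i, f, o, g
    i = sigmoid(gates[0*hdim:1*hdim])             # input gate: write to the cell?
    f = sigmoid(gates[1*hdim:2*hdim])             # forget gate: erase the cell?
    o = sigmoid(gates[2*hdim:3*hdim])             # output gate: reveal the cell?
    g = np.tanh(gates[3*hdim:4*hdim])             # candidate values to write
    c = f * c_prev + i * g                        # cell update: elementwise ops only
    h = o * np.tanh(c)                            # hidden state read out from the cell
    return h, c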
Long Short Term Memory (LSTM): Gradient Flow
[Hochreiter and Schmidhuber, 1997]

Backpropagation from c_t to c_{t-1} involves only an elementwise multiplication by f; there is no matrix multiply by W on this path.

[Figure: chaining LSTM cells gives an uninterrupted gradient path along the cell states c0 → c1 → c2 → c3 → ...]
Uninterrupted gradient flow!
Similar to ResNet!
[Figure: the ResNet stack (Input, 7x7 conv 64 / 2, Pool, many 3x3 conv blocks, ..., Pool, FC 1000, Softmax) with identity skip connections: the additive shortcuts give the same kind of uninterrupted gradient flow through depth that the LSTM cell state gives through time.]

In between: Highway Networks, which gate between a transformed input and the identity:

    g = T(x, W_T)
    y = g ⊙ H(x, W_H) + (1 - g) ⊙ x

Srivastava et al, “Highway Networks”, ICML DL Workshop 2015
Other RNN Variants
GRU [Cho et al., “Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation”, 2014]
[Jozefowicz et al., “An Empirical Exploration of Recurrent Network Architectures”, 2015]
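For reference, a sketch of one GRU step (two gates and no separate cell state; gate conventions vary between papers, and all names here are illustrative):

import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def gru_step(x, h_prev, Wr, Wz, Wh):
    hx = np.concatenate([h_prev, x])
    r = sigmoid(Wr @ hx)                                      # reset gate
    z = sigmoid(Wz @ hx)                                      # update gate
    h_tilde = np.tanh(Wh @ np.concatenate([r * h_prev, x]))   # candidate state
    return (1 - z) * h_prev + z * h_tilde                     # interpolating (additive) update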
Recently in Natural Language Processing…
New paradigms for reasoning over sequences
[“Attention Is All You Need”, Vaswani et al., 2017]
- The new “Transformer” architecture no longer processes inputs sequentially; instead it operates over all positions of a sequence in parallel through an attention mechanism, sketched below.
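The core operation can be sketched as scaled dot-product attention, where every position attends to every other position at once (a generic illustration, not the full Transformer of Vaswani et al.):

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: (seq_len, d) query, key, and value matrices for the whole sequence
    scores = Q @ K.T / np.sqrt(Q.shape[-1])            # (seq_len, seq_len) pairwise scores
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)                 # softmax over key positions
    return w @ V                                       # each output mixes all values in parallel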
Summary
- RNNs allow a lot of flexibility in architecture design
- Vanilla RNNs are simple but don’t work very well
- Common to use LSTM or GRU: their additive interactions improve gradient flow
- Backward flow of gradients in an RNN can explode or vanish. Exploding is controlled with gradient clipping; vanishing is controlled with additive interactions (LSTM)
- Better/simpler architectures are a hot topic of current research, as well as new paradigms for reasoning over sequences
- Better understanding (both theoretical and empirical) is needed
Next time: Midterm!