
FUNDAMENTALS OF DEEP LEARNING
Part 1: An Introduction to Deep Learning

WELCOME!
THE GOALS OF THIS COURSE

• Get you up and on your feet quickly
• Build a foundation to tackle a deep learning project right away
• We won’t cover the whole field, but we’ll get a great head start
• A foundation from which to read articles, follow tutorials, and take further classes
AGENDA

Part 1: An Introduction to Deep Learning
Part 2: How a Neural Network Trains
Part 3: Convolutional Neural Networks
Part 4: Data Augmentation and Deployment
Part 5: Pre-trained Models
Part 6: Advanced Architectures

AGENDA – PART 1

• History of AI
• The Deep Learning Revolution
• What is Deep Learning?
• How Deep Learning is Transforming the World
• Overview of the Course
• First Exercise

HAVE FUN!
HUMAN VS MACHINE LEARNING
Relaxed Alertness

Human              Machine
Rest and Digest    Training
Fight-or-flight    Prediction
LET’S GET STARTED

HISTORY OF AI
BEGINNING OF ARTIFICIAL INTELLIGENCE

COMPUTERS ARE MADE IN PART TO COMPLETE TASKS
EARLY ON, GENERALIZED HUMAN INTELLIGENCE LOOKED POSSIBLE
IT TURNED OUT TO BE HARDER THAN EXPECTED
EARLY NEURAL NETWORKS

Inspired by biology

Created in the 1950s

Outclassed by the Von Neumann architecture
EXPERT SYSTEMS

Highly complex

Programmed by hundreds of engineers

Rigorous programming of many rules


EXPERT SYSTEMS - LIMITATIONS

What are these three images?

HOW DO CHILDREN LEARN?

• Expose them to lots of data
• Give them the “correct answer”
• They will pick up the important patterns on their own

THE DEEP LEARNING REVOLUTION
DATA

- Networks need a lot of information to learn from
- The digital era and the internet have supplied that data
COMPUTING POWER
Need a way for our artificial “brain” to observe lots of data
within a practical amount of time.

THE IMPORTANCE OF THE GPU

(Side-by-side images: a rendered image and a neural network.)

WHAT IS DEEP LEARNING?
DEEP LEARNING FLIPS TRADITIONAL PROGRAMMING ON ITS HEAD
TRADITIONAL PROGRAMMING
Building a Classifier

1. Define a set of rules for classification
2. Program those rules into the computer
3. Feed it examples, and the program uses the rules to classify
MACHINE LEARNING
Building a Classifier

1. Show the model examples with the answer of how to classify
2. The model takes guesses, and we tell it if it’s right or not
3. The model learns to correctly categorize as it trains; the system learns the rules on its own
THIS IS A FUNDAMENTAL SHIFT
WHEN TO CHOOSE DEEP LEARNING

Classic Programming: if the rules are clear and straightforward, it is often better to just program them.

Deep Learning: if the rules are nuanced, complex, or difficult to discern, use deep learning.
DEEP LEARNING COMPARED TO OTHER AI

Depth and complexity of networks

Up to billions of parameters (and growing)

Many layers in a model

Important for learning complex rules


HOW DEEP LEARNING IS TRANSFORMING THE WORLD
COMPUTER VISION

ROBOTICS AND MANUFACTURING · OBJECT DETECTION · SELF-DRIVING CARS
NATURAL LANGUAGE PROCESSING

REAL-TIME TRANSLATION · VOICE RECOGNITION · VIRTUAL ASSISTANTS
RECOMMENDER SYSTEMS

CONTENT CURATION · TARGETED ADVERTISING · SHOPPING RECOMMENDATIONS
REINFORCEMENT LEARNING

ALPHAGO BEATS WORLD CHAMPION IN GO · AI BOTS BEAT PROFESSIONAL VIDEO GAMERS · STOCK-TRADING ROBOTS
OVERVIEW OF THE COURSE
HANDS-ON EXERCISES

• Get comfortable with the process of deep learning
• Exposure to different models and data types
• Get a jump-start to tackle your own projects
STRUCTURE OF THE COURSE
“Hello World” of Deep Learning

Train a more complicated model

New architectures and techniques to improve performance

Pre-trained models

Transfer learning
PLATFORM OF THE COURSE

GPU powered cloud server

JupyterLab platform

Jupyter notebooks for interactive coding


SOFTWARE OF THE COURSE

• Major deep learning platforms:
  • TensorFlow + Keras (Google)
  • PyTorch (Facebook)
  • MXNet (Apache)
• We’ll be using TensorFlow and Keras
• It’s a good idea to gain exposure to others moving forward
FIRST EXERCISE: CLASSIFY HANDWRITTEN DIGITS
HELLO NEURAL NETWORKS

Train a network to correctly classify handwritten digits
• A historically important and difficult task for computers

Try learning like a neural network
• Get exposed to the example, and try to figure out the rules for how it works
LET’S GO!

FUNDAMENTALS OF DEEP LEARNING
Part 2: How a Neural Network Trains
AGENDA

Part 1: An Introduction to Deep Learning
Part 2: How a Neural Network Trains
Part 3: Convolutional Neural Networks
Part 4: Data Augmentation and Deployment
Part 5: Pre-trained Models
Part 6: Advanced Architectures
AGENDA – PART 2
• Recap

• A Simpler Model

• From Neuron to Network

• Activation Functions

• Overfitting

• From Neuron to Classification

RECAP OF THE EXERCISE
What just happened?

Loaded and visualized our data

Edited our data (reshaped, normalized, to categorical)

Created our model

Compiled our model

Trained the model on our data

DATA PREPARATION
Input as an array

A 28 × 28 image is flattened into a single array of pixel values:
[0,0,0,24,75,184,185,78,32,55,0,0,0…]
DATA PREPARATION
Targets as categories

0 → [1,0,0,0,0,0,0,0,0,0]
1 → [0,1,0,0,0,0,0,0,0,0]
2 → [0,0,1,0,0,0,0,0,0,0]
3 → [0,0,0,1,0,0,0,0,0,0]
…
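As a minimal sketch, the two preparation steps above look like this in Keras (MNIST loaded the way the exercise does; variable names are illustrative):

```python
# Flatten, normalize, and one-hot encode the MNIST data.
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

(x_train, y_train), (x_valid, y_valid) = mnist.load_data()

# Flatten each 28 x 28 image into a 784-element vector
x_train = x_train.reshape(-1, 784)
x_valid = x_valid.reshape(-1, 784)

# Normalize pixel values from [0, 255] to [0, 1]
x_train = x_train / 255.0
x_valid = x_valid / 255.0

# One-hot encode the targets: 3 -> [0,0,0,1,0,0,0,0,0,0]
y_train = to_categorical(y_train, num_classes=10)
y_valid = to_categorical(y_valid, num_classes=10)
```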
AN UNTRAINED MODEL

Layer sizes:
Input   (784,)
Dense   (512,)
Dense   (512,)
Output  (10,)
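As a sketch, the untrained model above can be written as a Keras Sequential network (no activation functions yet — those come later in this part):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(512, input_shape=(784,)),  # first hidden layer
    Dense(512),                      # second hidden layer
    Dense(10),                       # one output per digit class
])
model.summary()
```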
A SIMPLER MODEL
48
A SIMPLER MODEL

ŷ = mx + b

Data points:
x | y
1 | 3
2 | 5

(Plot: the two points, with slope m = ? and intercept b = ? to be learned.)
A SIMPLER MODEL

ŷ = mx + b

Start with random parameters: m = −1, b = 5

x | y | ŷ
1 | 3 | 4
2 | 5 | 3
A SIMPLER MODEL

ŷ = mx + b

x | y | ŷ | err²
1 | 3 | 4 | 1
2 | 5 | 3 | 4

MSE = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2 = 2.5

RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2} \approx 1.6
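A quick NumPy check of the numbers above (the predictions come from the random start m = −1, b = 5):

```python
import numpy as np

y = np.array([3, 5])       # targets
y_hat = np.array([4, 3])   # predictions: -1 * x + 5 for x = 1, 2

mse = np.mean((y - y_hat) ** 2)  # (1 + 4) / 2 = 2.5
rmse = np.sqrt(mse)              # ~1.58, which rounds to 1.6
print(mse, rmse)
```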
THE LOSS CURVE

(Plot: the loss surface, with MSE on the vertical axis.)
THE LOSS CURVE

(Plots: the line with m = −1, b = 5 against the data, and the current MSE marked on the loss curve, far from the target.)
THE LOSS CURVE

(Moving b from 5 to 4 with m = −1: the current MSE slides from the old point toward the target.)
THE LOSS CURVE

(Moving m from −1 to 0 with b = 4: the current MSE moves further down the loss curve toward the target.)
THE LOSS CURVE

The Gradient: the direction in which loss decreases the most
λ (the learning rate): how far to travel
Epoch: a model update with the full dataset
Batch: a sample of the full dataset
Step: an update to the weight parameters
OPTIMIZERS

(Plot: loss with a momentum optimizer.)

• Adam
• Adagrad
• RMSprop
• SGD
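As a sketch, picking an optimizer in Keras is a one-line decision at compile time, reusing the model from the earlier sketch (the learning-rate values here are illustrative, not from the slides):

```python
from tensorflow.keras.optimizers import Adam, SGD

optimizer = Adam(learning_rate=0.001)
# optimizer = SGD(learning_rate=0.01, momentum=0.9)  # SGD with momentum

model.compile(optimizer=optimizer, loss='mse')
```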
FROM NEURON TO NETWORK
BUILDING A NETWORK

• Scales to more inputs

(Diagram: two weighted inputs, w1 and w2, feeding a single neuron that outputs ŷ.)
BUILDING A NETWORK

• Scales to more inputs
• Can chain neurons

(Diagram: inputs x1, x2 connect through weights w1–w4 to two neurons, whose outputs feed a final neuron through w5, w6 to produce ŷ.)
BUILDING A NETWORK

• Scales to more inputs
• Can chain neurons
• If all the regressions are linear, the output will also be a linear regression

(Same diagram as above.)
ACTIVATION FUNCTIONS
ACTIVATION FUNCTIONS

Linear:  \hat{y} = wx + b

ReLU:    \hat{y} = \begin{cases} wx + b & \text{if } wx + b > 0 \\ 0 & \text{otherwise} \end{cases}

Sigmoid: \hat{y} = \frac{1}{1 + e^{-(wx + b)}}

(Plots of each function over the range −10 to 10.)
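The three activation functions read directly into code; a minimal NumPy sketch:

```python
import numpy as np

def linear(z):
    return z

def relu(z):
    return np.maximum(0, z)      # 0 below zero, identity above

def sigmoid(z):
    return 1 / (1 + np.exp(-z))  # squashes any input into (0, 1)

z = np.linspace(-10, 10, 5)
print(linear(z), relu(z), sigmoid(z), sep="\n")
```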
ACTIVATION FUNCTIONS

Linear · ReLU · Sigmoid
ACTIVATION FUNCTIONS

(Diagram: a small network where each neuron applies its own activation; combining the curved outputs through weights lets ŷ take a more complex, non-linear shape.)
OVERFITTING

Why not have a super large neural network?
OVERFITTING
Which Trendline is Better?

(Two fits of the same points on a 0–1 scale: a wiggly curve that hits every point with MSE = .0000, and a straight line with MSE = .0113.)
OVERFITTING
Which Trendline is Better?

(The same two trendlines scored on additional points: the wiggly curve’s MSE rises to .0308, while the straight line’s falls to .0062.)
TRAINING VS VALIDATION DATA
Avoid memorization

Training data
• Core dataset for the model to learn on

Validation data
• New data, used to see if the model truly understands (can generalize)

Overfitting
• When the model performs well on the training data, but not the validation data (evidence of memorization)
• Ideally, accuracy and loss should be similar between both datasets

(Chart: MSE per epoch — training MSE falls steadily, while validation MSE either tracks it (expected) or climbs again (overfitting).)
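In Keras, passing validation data to fit() is what produces the two curves in the chart; a sketch, assuming the arrays and compiled model from the earlier sketches:

```python
history = model.fit(
    x_train, y_train,
    validation_data=(x_valid, y_valid),
    epochs=10,
)

# Diverging curves signal overfitting:
print(history.history['loss'])      # training loss per epoch
print(history.history['val_loss'])  # validation loss per epoch
```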
FROM REGRESSION TO CLASSIFICATION
AN MNIST MODEL

Input   (784,)
Dense   (512,)
Dense   (512,)
Output  (10,)
AN MNIST MODEL

Input            (784,)
Dense + ReLU     (512,)
Dense + ReLU     (512,)
Output + Sigmoid (10,)
AN MNIST MODEL

Input            (784,)
Dense + ReLU     (512,)
Dense + ReLU     (512,)
Output + Softmax (10,)
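A sketch of the final MNIST model above in Keras, with ReLU hidden layers and a softmax output:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(512, activation='relu', input_shape=(784,)),
    Dense(512, activation='relu'),
    Dense(10, activation='softmax'),  # probabilities across the 10 digits
])
```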
RMSE FOR PROBABILITIES?

(Plot: classification targets on a 0–4 number line — squared error gives little useful signal for probability outputs, motivating a different loss.)
CROSS ENTROPY

(Chart: loss vs. the probability assigned to the blue point. “Loss if True” falls toward 0 as the assigned probability approaches 1; “Loss if False” grows without bound.)
CROSS ENTROPY

Loss = -\left[\, t(x)\log(p(x)) + (1 - t(x))\log(1 - p(x)) \,\right]

t(x) = target (0 if False, 1 if True)
p(x) = probability prediction for point x

(Same chart as above.)
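The formula reads directly into code; a NumPy sketch:

```python
import numpy as np

def cross_entropy(t, p, eps=1e-12):
    """t: target (0 if False, 1 if True); p: predicted probability."""
    p = np.clip(p, eps, 1 - eps)  # avoid log(0)
    return -(t * np.log(p) + (1 - t) * np.log(1 - p))

print(cross_entropy(1, 0.9))  # ~0.105: confident and correct -> small loss
print(cross_entropy(1, 0.1))  # ~2.303: confident and wrong -> large loss
```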
BRINGING IT TOGETHER
THE NEXT EXERCISE
The American Sign Language Alphabet

LET’S GO!
APPENDIX: GRADIENT DESCENT
HELPING THE COMPUTER CHEAT CALCULUS
LEARNING FROM ERROR

MSE = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2 = \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - (mx_i + b)\bigr)^2

MSE = \frac{1}{2}\Bigl(\bigl(3 - (m \cdot 1 + b)\bigr)^2 + \bigl(5 - (m \cdot 2 + b)\bigr)^2\Bigr)

\frac{\partial MSE}{\partial m} = 5m + 3b - 13 \qquad \frac{\partial MSE}{\partial b} = 3m + 2b - 8

At m = -1, b = 5:

\frac{\partial MSE}{\partial m} = -3 \qquad \frac{\partial MSE}{\partial b} = -1
THE LOSS CURVE

(Loss surface with the current position and the target marked.)
THE LOSS CURVE

\frac{\partial MSE}{\partial m} = -7 \qquad \frac{\partial MSE}{\partial b} = -3
THE LOSS CURVE

\frac{\partial MSE}{\partial m} = -7 \qquad \frac{\partial MSE}{\partial b} = -3

m := m - \lambda \frac{\partial MSE}{\partial m} \qquad b := b - \lambda \frac{\partial MSE}{\partial b}
THE LOSS CURVE

With λ = .6, the update steps are large and can overshoot the target.
THE LOSS CURVE

With λ = .005, the update steps are tiny and progress toward the target is slow.
THE LOSS CURVE

With λ = .1:

m := -1 + 7λ = -0.3
b := 5 + 3λ = 4.7
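The full loop, as a sketch in plain Python on the two data points from this appendix:

```python
xs, ys = [1, 2], [3, 5]
m, b = -1.0, 5.0   # the random start from the slides
lr = 0.1           # lambda, the learning rate
n = len(xs)

for step in range(1000):
    # Gradients of MSE = (1/n) * sum((y - (m*x + b))^2)
    dm = sum(-2 * x * (y - (m * x + b)) for x, y in zip(xs, ys)) / n
    db = sum(-2 * (y - (m * x + b)) for x, y in zip(xs, ys)) / n
    m -= lr * dm
    b -= lr * db

print(m, b)  # converges toward m = 2, b = 1, which fits both points exactly
```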
FUNDAMENTALS OF DEEP LEARNING
Part 3: Convolutional Neural Networks
AGENDA

Part 1: An Introduction to Deep Learning
Part 2: How a Neural Network Trains
Part 3: Convolutional Neural Networks
Part 4: Data Augmentation and Deployment
Part 5: Pre-trained Models
Part 6: Advanced Architectures
AGENDA – PART 3
• Kernels and Convolution

• Kernels and Neural Networks

• Other Layers in the Model

RECAP OF THE EXERCISE

Trained a dense neural network model

Training accuracy was high

Validation accuracy was low

Evidence of overfitting

KERNELS AND CONVOLUTION
KERNELS AND CONVOLUTION

(Image panels: the original image shown with blur, sharpen, brighten, and darken kernels applied.)
KERNELS AND CONVOLUTION

Blur            Sharpen
.06 .13 .06      0 -1  0
.13 .25 .13     -1  5 -1
.06 .13 .06      0 -1  0

Brighten        Darken
0  0   0        0  0   0
0  1.5 0        0  0.5 0
0  0   0        0  0   0
KERNELS AND CONVOLUTION

Blur Kernel          Original Image

.06 .13 .06          1 0 1 1 0 1
.13 .25 .13          0 1 0 0 1 0
.06 .13 .06          0 1 1 1 1 0
                     0 1 1 1 1 0
                     1 0 1 1 0 1
                     1 1 0 0 1 1

The kernel slides across the image. At each position, the pixels under the kernel are multiplied element-wise by the kernel weights and totaled to produce one value of the convolved image (the top-left position totals .56, the next .57, and so on).

Convolved Image

.56 .57 .57 .56
.70 .82 .82 .70
.69 .95 .95 .69
.64 .69 .69 .64
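A sketch of that sliding-window computation (stride 1, no padding), which reproduces the convolved values:

```python
import numpy as np

kernel = np.array([[.06, .13, .06],
                   [.13, .25, .13],
                   [.06, .13, .06]])

image = np.array([[1, 0, 1, 1, 0, 1],
                  [0, 1, 0, 0, 1, 0],
                  [0, 1, 1, 1, 1, 0],
                  [0, 1, 1, 1, 1, 0],
                  [1, 0, 1, 1, 0, 1],
                  [1, 1, 0, 0, 1, 1]])

out_h = image.shape[0] - kernel.shape[0] + 1
out_w = image.shape[1] - kernel.shape[1] + 1
output = np.zeros((out_h, out_w))

for i in range(out_h):
    for j in range(out_w):
        window = image[i:i + 3, j:j + 3]        # 3x3 patch under the kernel
        output[i, j] = np.sum(window * kernel)  # multiply and total

print(output.round(2))  # top-left value is .56, as in the walkthrough
```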
STRIDE

The stride is how far the kernel moves between output values. For the first output row of the example image:

Stride 1: .56 .57 .57 .56
Stride 2: .56 .57
Stride 3: .56 .56
PADDING

Original Image     Zero Padding

                   0 0 0 0 0 0 0 0
1 0 1 1 0 1        0 1 0 1 1 0 1 0
0 1 0 0 1 0        0 0 1 0 0 1 0 0
0 1 1 1 1 0        0 0 1 1 1 1 0 0
0 1 1 1 1 0        0 0 1 1 1 1 0 0
1 0 1 1 0 1        0 1 0 1 1 0 1 0
1 1 0 0 1 1        0 1 1 0 0 1 1 0
                   0 0 0 0 0 0 0 0
PADDING

Original Image     Mirror Padding

                   1 1 0 1 1 0 1 1
1 0 1 1 0 1        1 1 0 1 1 0 1 1
0 1 0 0 1 0        0 0 1 0 0 1 0 0
0 1 1 1 1 0        0 0 1 1 1 1 0 0
0 1 1 1 1 0        0 0 1 1 1 1 0 0
1 0 1 1 0 1        1 1 0 1 1 0 1 1
1 1 0 0 1 1        1 1 1 0 0 1 1 1
                   1 1 1 0 0 1 1 1
KERNELS AND NEURAL NETWORKS

A kernel is a grid of learnable weights:

w1 w2 w3
w4 w5 w6
w7 w8 w9
KERNELS AND NEURAL NETWORKS

(Diagram: the kernel’s weights w1–w9 act like a neuron’s weights — input pixels are multiplied by the weights and summed to produce ŷ.)
KERNELS AND NEURAL NETWORKS

Image Input (28, 28, 1)
→ Kernels (3, 3, 1, 2) → Stacked Images (28, 28, 2)
→ Kernels (3, 3, 2, 2) → Stacked Images (28, 28, 2)
→ Flattened Image Vector (1568)
→ Dense (512) → Dense (512)
→ Output Prediction (24)
FINDING EDGES

Vertical Edges     Horizontal Edges
1 0 -1              1  2  1
2 0 -2              0  0  0
1 0 -1             -1 -2 -1

(Each kernel, applied to the original image, highlights edges in its direction.)
NEURAL NETWORK PERCEPTION

Input → Convolution (edges) → Convolution (textures) → Convolution (objects) → Dense → Dense → Output
OTHER LAYERS IN THE MODEL
MAX POOLING

Input                  Output (2×2 pool)
110 256 153  67
 12  89  88  43        256 153
 10  15  50  55         23  55
 23   9  49  23
DROPOUT

(Diagrams: the same network with dropout rate = 0, .2, and .4 — more neurons are randomly disabled each step as the rate increases.)
WHOLE ARCHITECTURE

Input
→ Convolution → Max Pooling
→ Convolution → Dropout → Max Pooling
→ Convolution → Max Pooling
→ Dense → Dense → Output
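A sketch of the whole architecture above in Keras (filter counts and kernel sizes are illustrative assumptions, not specified on the slide):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, Dropout, Flatten, Dense

model = Sequential([
    Conv2D(32, (3, 3), activation='relu', padding='same',
           input_shape=(28, 28, 1)),
    MaxPool2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu', padding='same'),
    Dropout(0.2),
    MaxPool2D((2, 2)),
    Conv2D(128, (3, 3), activation='relu', padding='same'),
    MaxPool2D((2, 2)),
    Flatten(),
    Dense(512, activation='relu'),
    Dense(24, activation='softmax'),  # 24 ASL letter classes
])
```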
LET’S GO!

FUNDAMENTALS OF DEEP LEARNING
Part 4: Data Augmentation and Deployment
AGENDA

Part 1: An Introduction to Deep Learning
Part 2: How a Neural Network Trains
Part 3: Convolutional Neural Networks
Part 4: Data Augmentation and Deployment
Part 5: Pre-trained Models
Part 6: Advanced Architectures
AGENDA – PART 4
• Data Augmentation

• Model Deployment

RECAP OF THE EXERCISE

Analysis
• The CNN increased validation accuracy
• Training accuracy is still higher than validation accuracy

Solution
• Clean data provides better examples
• Dataset variety helps the model generalize
DATA AUGMENTATION
IMAGE FLIPPING
Horizontal Flip

Vertical Flip

ROTATION

(Example: the same image rotated to 0°, 45°, 90°, 180°, and 270°.)
ZOOMING

WIDTH AND HEIGHT SHIFTING
HOMOGRAPHY

BRIGHTNESS

CHANNEL SHIFTING
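Most of the augmentations above map onto Keras' ImageDataGenerator; a sketch (parameter values are illustrative assumptions, and homography is not covered by this utility):

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    horizontal_flip=True,            # image flipping
    rotation_range=10,               # rotation, in degrees
    zoom_range=0.1,                  # zooming
    width_shift_range=0.1,           # width shifting
    height_shift_range=0.1,          # height shifting
    brightness_range=(0.8, 1.2),     # brightness
    channel_shift_range=20.0,        # channel shifting
)

# Train on augmented batches drawn from the generator:
# model.fit(datagen.flow(x_train, y_train, batch_size=32), epochs=10)
```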
MODEL DEPLOYMENT
MODEL DEPLOYMENT

Image Input (28, 28, 1)
→ Kernels (3, 3, 1, 2) → Stacked Images (28, 28, 2)
→ Kernels (3, 3, 2, 2) → Stacked Images (28, 28, 2)
→ Flattened Image Vector (1568)
→ Dense (512) → Dense (512)
→ Output Prediction (24)
MODEL DEPLOYMENT

(During training, the model consumes batches: Batch Input → Convolution → Max Pooling → … A deployed model must receive new data in the same batched shape.)
MODEL DEPLOYMENT

(287, 433, 3) → Resize → (220, 155, 3) → Greyscale → (220, 155, 1) → “Batch” of one → (1, 220, 155, 1)
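A sketch of those steps for a single image at prediction time (the filename is hypothetical, and this targets the 28 × 28 greyscale shape the course model expects rather than the slide's example dimensions):

```python
import numpy as np
from tensorflow.keras.preprocessing import image as image_utils

img = image_utils.load_img('my_sign.png', color_mode='grayscale',
                           target_size=(28, 28))  # resize + greyscale
arr = image_utils.img_to_array(img) / 255.0       # shape (28, 28, 1)
batch = np.expand_dims(arr, axis=0)               # shape (1, 28, 28, 1): a "batch" of one

prediction = model.predict(batch)
```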
LET’S TRY IT OUT!
FUNDAMENTALS OF DEEP LEARNING
Part 5: Pre-trained Models
AGENDA

Part 1: An Introduction to Deep Learning
Part 2: How a Neural Network Trains
Part 3: Convolutional Neural Networks
Part 4: Data Augmentation and Deployment
Part 5: Pre-trained Models
Part 6: Advanced Architectures
AGENDA – PART 5
• Review so far

• Pre-trained Models

• Transfer Learning

REVIEW SO FAR

• Learning Rate
• Number of Layers
• Neurons per Layer
• Activation Functions
• Dropout
• Data
PRE-TRAINED MODELS

PyTorch Hub
PRE-TRAINED MODELS

ImageNet
THE NEXT CHALLENGE
An Automated Doggy Door

TRANSFER LEARNING
THE CHALLENGE AFTER
An Automated Presidential Doggy Door

TRANSFER LEARNING
TRANSFER LEARNING

Image Input (28, 28, 1)
→ Kernels (3, 3, 1, 2) → Stacked Images (28, 28, 2)
→ Kernels (3, 3, 2, 2) → Stacked Images (28, 28, 2)
→ Flattened Image Vector (1568)
→ Dense (512) → Dense (512)
→ Output Prediction (10)
TRANSFER LEARNING

Input                                   ← more generalized
→ Convolution → Max Pooling
→ Convolution → Dropout → Max Pooling
→ Convolution → Max Pooling
→ Dense → Dense → Output                ← more specialized

Early layers learn general features; later layers are more specialized to the original task.
TRANSFER LEARNING
Freezing the Model?
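Freezing in Keras is a one-line flag on the pre-trained base; a sketch (VGG16 is an illustrative choice of base, not necessarily the one the course uses):

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense

base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze the generalized early layers

model = Sequential([
    base,
    GlobalAveragePooling2D(),
    Dense(1, activation='sigmoid'),  # e.g., "this specific dog" vs. everything else
])
```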
LET’S GET STARTED!
FUNDAMENTALS OF DEEP LEARNING
Part 6: Advanced Architectures
AGENDA

Part 1: An Introduction to Deep Learning
Part 2: How a Neural Network Trains
Part 3: Convolutional Neural Networks
Part 4: Data Augmentation and Deployment
Part 5: Pre-trained Models
Part 6: Advanced Architectures
AGENDA – PART 6
• Moving Forward

• Natural Language Processing

• Recurrent Neural Networks

• Other Architectures

• Closing Thoughts

MOVING FORWARD
FIELDS OF AI

Computer Vision
• Optometry

Natural Language Processing
• Linguistics

Reinforcement Learning
• Game Theory
• Psychology

Anomaly Detection
• Security
• Medicine
NATURAL LANGUAGE PROCESSING
FROM WORDS TO NUMBERS

“A dog barked at a cat.” → [1, 10, 7, 4, 1, 8]

Dictionary:
1. A      7. Barked
2. An     8. Cat
3. And    9. Cats
4. At     10. Dog
5. Ate    11. Dogs
6. Bark   12. Eat
FROM WORDS TO NUMBERS

Inputs: the dictionary words (A, An, And, At, Ate, Bark, Barked, Cat, Cats, Dog, Dogs, Eat)
Outputs: the same word list
FROM WORDS TO NUMBERS

A one-hot input for “Dog” (a 1 in position 10, 0s elsewhere) yields output probabilities over the dictionary:
And 10% · At 5% · Ate 35% · Barked 50% · all others 0%
FROM WORDS TO NUMBERS

Big
Giraffe
(.9, .9)

Llama
Bigger Dictionary
(-.9, .1)
1. A 31. Ate 61. Cats
2. An 32. Bark 62. Dog
3. And 33. Barked 63. Dogs
4. At 34. Cat 64. Eat
5. Ate 35. Cats 65. Eaten
6. Bark 36. Dog 66. A
7. Barked 37. Dogs 67. An
8. Cat 38. Eat 68. And
Domestic Wild 9.
10.
Cats
Dog
39.
40.
Eaten
A
69.
70.
At
Ate
11. Dogs 41. An 71. Bark
12. Eat 42. And 72. Barked
13. Eaten 43. At 73. Cat
14. A 44. Ate 74. Cats
15. An 45. Bark 75. Dog
16. And 46. Barked 76. Dogs

Falcon 17.
18.
At
Ate
47.
48.
Cat
Cats
77.
78.
Eat
Eaten
19. Bark 49. Dog 79. …
(.15, -.4) 20.
21.
Barked
Cat
50.
51.
Dogs
Eat
80.
81.


22. Cats 52. Eaten 82. …

Puffin
23. Dog 53. A

Kitty 24.
25.
Dogs
Eat
54.
55.
An
And

(-.75, -.8) (.85, -.65) 26.


27.
Eaten
A
56.
57.
At
Ate

Small 28.
29.
An
And
58.
59.
Bark
Barked
30. At 60. Cat

173
FROM WORDS TO NUMBERS

(The input mapping from words to vectors is technically an embedding layer, sitting between the inputs and the rest of the network.)
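A sketch of that embedding layer in Keras, turning each word index into a learned coordinate vector (the vocabulary size and dimensions are illustrative):

```python
import numpy as np
from tensorflow.keras.layers import Embedding

embedding = Embedding(input_dim=13, output_dim=2)  # 12 words (+ padding index) -> 2-D points
sentence = np.array([[1, 10, 7, 4, 1, 8]])         # "A dog barked at a cat."
vectors = embedding(sentence)                      # shape: (1, 6, 2)
```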
RECURRENT NEURAL NETWORKS
RECURRENT NEURAL NETWORKS

“Cats say ___.”   “Dogs say ___.”

Dictionary: 1. Cats  2. Dogs  3. Meow  4. Say  5. Woof

Each word is one-hot encoded, embedded, and fed to the RNN, which outputs probabilities for the next word. Given only the input “Say”, the model cannot tell which animal is speaking, so it splits its prediction 50% Meow, 50% Woof.

The RNN therefore keeps a hidden state — e.g., (.1, −.5, .6) after reading “Cats” — and feeds it back in alongside the next input. With that state plus “Say”, the output becomes 100% Meow.
RECURRENT NEURAL NETWORKS

(Diagrams: an RNN cell and an LSTM cell, each feeding its output back into its input.)
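A sketch of the next-word model above using a Keras LSTM (layer sizes are illustrative assumptions):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

model = Sequential([
    Embedding(input_dim=6, output_dim=3),  # 5 dictionary words (+ padding) -> 3-D embeddings
    LSTM(16),                              # hidden state carries earlier words forward
    Dense(5, activation='softmax'),        # probability of each possible next word
])
```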
OTHER ARCHITECTURES
AUTOENCODERS

Inputs → Encoder → compact code (e.g., (−.3, .6)) → Decoder → Outputs

(The network is trained to reconstruct its inputs, forcing the information through the narrow code in the middle.)
GENERATIVE ADVERSARIAL NETWORKS (GANS)

Real Images ─────────────────→ Discriminator → Prediction: Real / Fake
Noise → Generator → Fake Images ──↗
REINFORCEMENT LEARNING

(Diagram: an Agent interacting with an Environment in a feedback loop.)
NEXT STEPS
ENABLING PORTABILITY WITH NGC CONTAINERS

Extensive
- NGC Deep Learning Containers
- Diverse range of workloads and industry-specific use cases

Optimized
- DL containers updated monthly
- Packed with latest features and superior performance

Secure & Reliable
- Scanned for vulnerabilities and crypto
- Tested on workstations, servers, & cloud instances

Scalable
- Supports multi-GPU & multi-node systems

Designed for Enterprise & HPC
- Supports Docker, Singularity & other runtimes

Run Anywhere
- Bare metal, VMs, Kubernetes
- x86, ARM, POWER
- Multi-cloud, on-prem, hybrid, edge

Learn more about NGC Containers


NEXT STEPS FOR THIS CLASS

Step 1: Sign up for NGC
https://docs.nvidia.com/dgx/ngc-registry-for-dgx-user-guide/index.html

Step 2: Visit the NGC Catalog
https://catalog.ngc.nvidia.com/orgs/nvidia/containers/dli-dl-fundamentals

Step 3: Pull and Run the Container
Visit localhost:8888 to check out a JupyterLab environment with a Next Steps Project
CLOSING THOUGHTS
COPYING ROCKET SCIENCE

LET’S GET STARTED!