Lecture 9. Neural Networks
Andrew Ng
One hidden layer Neural Network
Neural Networks Overview

What is a Neural Network?
[Diagram: logistic regression drawn as a single unit, and a one-hidden-layer network, both taking inputs x1, x2, x3 and producing ŷ.]

Logistic regression (parameters w, b):
z = w^T x + b  →  a = σ(z)  →  ℒ(a, y)

Neural network (parameters W^[1], b^[1], W^[2], b^[2]):
z^[1] = W^[1] x + b^[1]  →  a^[1] = σ(z^[1])  →  z^[2] = W^[2] a^[1] + b^[2]  →  a^[2] = σ(z^[2])  →  ℒ(a^[2], y)
One hidden layer Neural Network
Neural Network Representation

[Diagram: a 2-layer network with inputs x1, x2, x3, one hidden layer of four units, and output ŷ; the layers are the input layer, hidden layer, and output layer.]
One hidden layer Neural Network
Computing a Neural Network's Output
Neural Network Representation
[Diagram: zooming in on a single unit of the network: given inputs x1, x2, x3 with parameters w and b, the unit computes z = w^T x + b and then a = σ(z), so that ŷ = a. Every node in the hidden layer performs exactly this two-step computation.]
z = w^T x + b
a = σ(z)
Neural Network Representation
𝑎 [11 ] )
𝑥1 𝑎 [21 ] )
𝑥2 ^
𝑦
)
𝑎 [31 ]
𝑥3
𝑎 [41 ] )
Andrew Ng
Neural Network Representation

Given input x:
z^[1] = W^[1] x + b^[1]
a^[1] = σ(z^[1])
z^[2] = W^[2] a^[1] + b^[2]
a^[2] = σ(z^[2])
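A minimal numpy sketch of this forward pass for a single example, assuming 3 inputs and 4 hidden units (all names and sizes here are illustrative, not from the slides):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(0)
x  = rng.standard_normal((3, 1))          # one example as a column vector (n_x = 3)
W1 = rng.standard_normal((4, 3)) * 0.01   # W^[1]: 4 hidden units, 3 inputs
b1 = np.zeros((4, 1))                     # b^[1]
W2 = rng.standard_normal((1, 4)) * 0.01   # W^[2]: 1 output unit, 4 hidden units
b2 = np.zeros((1, 1))                     # b^[2]

z1 = W1 @ x + b1      # z^[1] = W^[1] x + b^[1]
a1 = sigmoid(z1)      # a^[1] = sigma(z^[1])
z2 = W2 @ a1 + b2     # z^[2] = W^[2] a^[1] + b^[2]
a2 = sigmoid(z2)      # a^[2] = y-hat, the network's output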
One hidden layer Neural Network
Vectorizing across multiple examples
For a single training example x:
z^[1] = W^[1] x + b^[1]
a^[1] = σ(z^[1])
z^[2] = W^[2] a^[1] + b^[2]
a^[2] = σ(z^[2])
Vectorizing across multiple examples
for i = 1 to m:
    z^[1](i) = W^[1] x^(i) + b^[1]
    a^[1](i) = σ(z^[1](i))
    z^[2](i) = W^[2] a^[1](i) + b^[2]
    a^[2](i) = σ(z^[2](i))
One hidden layer Neural Network
Explanation for vectorized implementation

Justification for vectorized implementation
Recap of vectorizing across multiple examples
for i = 1 to m:
    z^[1](i) = W^[1] x^(i) + b^[1]
    a^[1](i) = σ(z^[1](i))
    z^[2](i) = W^[2] a^[1](i) + b^[2]
    a^[2](i) = σ(z^[2](i))

With the examples stacked as columns, X = [x^(1) x^(2) … x^(m)] and A^[1] = [a^[1](1) a^[1](2) … a^[1](m)], the loop becomes:
Z^[1] = W^[1] X + b^[1]
A^[1] = σ(Z^[1])
Z^[2] = W^[2] A^[1] + b^[2]
A^[2] = σ(Z^[2])
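A hedged numpy sketch of the vectorized version, assuming X has shape (n_x, m) with one example per column (function and variable names are illustrative):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def forward_pass(X, W1, b1, W2, b2):
    # X: (n_x, m), one training example per column.
    # b1 and b2 are column vectors and broadcast across the m columns.
    Z1 = W1 @ X + b1       # Z^[1], shape (n_h, m)
    A1 = sigmoid(Z1)       # A^[1]
    Z2 = W2 @ A1 + b2      # Z^[2], shape (1, m)
    A2 = sigmoid(Z2)       # A^[2] = Y-hat for all m examples
    return A2, (Z1, A1, Z2, A2)   # cache intermediates for backprop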
One hidden layer Neural Network
Activation functions

Activation functions
Given x:
z^[1] = W^[1] x + b^[1]
a^[1] = σ(z^[1])
z^[2] = W^[2] a^[1] + b^[2]
a^[2] = σ(z^[2])
(So far σ has been used everywhere; any of these σ's can be replaced by a different activation function g.)
Pros and cons of activation functions
[Plots of activation functions a = g(z): sigmoid, tanh, ReLU, Leaky ReLU.]
sigmoid: a = 1 / (1 + e^(−z))
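For reference, the four activations as numpy one-liners (a sketch; the 0.01 slope for Leaky ReLU follows the convention used in the lecture):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))            # output in (0, 1); mainly useful for the output layer

def tanh(z):
    return np.tanh(z)                      # zero-centered output in (-1, 1)

def relu(z):
    return np.maximum(0, z)                # common default for hidden layers

def leaky_relu(z, slope=0.01):
    return np.where(z > 0, z, slope * z)   # small negative slope avoids "dead" units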
One hidden layer Neural Network
Why do you need non-linear activation functions?

Activation function
Given x:
z^[1] = W^[1] x + b^[1]
a^[1] = g^[1](z^[1])
z^[2] = W^[2] a^[1] + b^[2]
a^[2] = g^[2](z^[2])
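A sketch of the argument this slide makes: if the hidden activation is the identity, g^[1](z) = z, the two layers collapse into a single linear function of x.

\begin{aligned}
a^{[1]} &= z^{[1]} = W^{[1]}x + b^{[1]} \\
a^{[2]} &= W^{[2]}a^{[1]} + b^{[2]}
         = W^{[2]}\bigl(W^{[1]}x + b^{[1]}\bigr) + b^{[2]}
         = \underbrace{W^{[2]}W^{[1]}}_{W'}\,x + \underbrace{W^{[2]}b^{[1]} + b^{[2]}}_{b'}
\end{aligned}

So with linear activations the network computes only a linear function of x no matter how many hidden layers it has; the non-linearity is what makes hidden layers useful. (A linear activation is reasonable mainly in the output layer of a regression problem.)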
One hidden layer Neural Network
Derivatives of activation functions

Sigmoid activation function
g(z) = 1 / (1 + e^(−z)),  g′(z) = g(z)(1 − g(z))

Tanh activation function
g(z) = tanh(z),  g′(z) = 1 − tanh(z)²

ReLU and Leaky ReLU
ReLU: g(z) = max(0, z),  g′(z) = 0 if z < 0, 1 if z > 0
Leaky ReLU: g(z) = max(0.01z, z),  g′(z) = 0.01 if z < 0, 1 if z > 0
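The corresponding derivatives as a small numpy sketch (names are illustrative):

import numpy as np

def sigmoid_grad(z):
    a = 1 / (1 + np.exp(-z))
    return a * (1 - a)                  # g'(z) = g(z)(1 - g(z))

def tanh_grad(z):
    return 1 - np.tanh(z) ** 2          # g'(z) = 1 - tanh(z)^2

def relu_grad(z):
    return (z > 0).astype(float)        # 0 for z < 0, 1 for z > 0 (the value at z = 0 rarely matters)

def leaky_relu_grad(z, slope=0.01):
    return np.where(z > 0, 1.0, slope)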
One hidden layer Neural Network

Formulas for computing derivatives
One hidden layer Neural Network
Backpropagation intuition (Optional)

Computing gradients
Logistic regression
[Computation graph: x, w, b → z = w^T x + b → a = σ(z) → ℒ(a, y); the derivatives da, dz, dw, db are computed by walking the graph backwards.]
Neural network gradients
[Computation graph: x, W^[1], b^[1] → z^[1], a^[1]; then W^[2], b^[2] → z^[2], a^[2] → ℒ(a^[2], y); the gradients summarized below come from walking this graph backwards.]
Summary of gradient descent
dz^[2] = a^[2] − y
dW^[2] = dz^[2] a^[1]T
db^[2] = dz^[2]
dz^[1] = W^[2]T dz^[2] ∗ g^[1]′(z^[1])
dW^[1] = dz^[1] x^T
db^[1] = dz^[1]
Summary of gradient descent
Single example:                          Vectorized over m examples:
dz^[2] = a^[2] − y                       dZ^[2] = A^[2] − Y
dW^[2] = dz^[2] a^[1]T                   dW^[2] = (1/m) dZ^[2] A^[1]T
db^[2] = dz^[2]                          db^[2] = (1/m) np.sum(dZ^[2], axis=1, keepdims=True)
dz^[1] = W^[2]T dz^[2] ∗ g^[1]′(z^[1])   dZ^[1] = W^[2]T dZ^[2] ∗ g^[1]′(Z^[1])
dW^[1] = dz^[1] x^T                      dW^[1] = (1/m) dZ^[1] X^T
db^[1] = dz^[1]                          db^[1] = (1/m) np.sum(dZ^[1], axis=1, keepdims=True)
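A hedged numpy sketch of the vectorized equations, written to pair with the forward_pass sketch above (so the hidden layer is assumed to use sigmoid, giving g^[1]′(Z^[1]) = A^[1](1 − A^[1]); names are illustrative):

import numpy as np

def backward_pass(X, Y, cache, W2):
    # X: (n_x, m) inputs, Y: (1, m) labels, cache = (Z1, A1, Z2, A2) from forward_pass above.
    Z1, A1, Z2, A2 = cache
    m = X.shape[1]

    dZ2 = A2 - Y                                         # dZ^[2] = A^[2] - Y
    dW2 = (1 / m) * dZ2 @ A1.T                           # dW^[2]
    db2 = (1 / m) * np.sum(dZ2, axis=1, keepdims=True)   # db^[2]

    dZ1 = (W2.T @ dZ2) * (A1 * (1 - A1))                 # dZ^[1]; A1(1 - A1) is g^[1]'(Z^[1]) for sigmoid
    dW1 = (1 / m) * dZ1 @ X.T                            # dW^[1]
    db1 = (1 / m) * np.sum(dZ1, axis=1, keepdims=True)   # db^[1]
    return dW1, db1, dW2, db2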
One hidden layer Neural Network
Random Initialization

What happens if you initialize weights to zero?
[Diagram: a network with inputs x1, x2, hidden units a^[1]_1, a^[1]_2, and output a^[2]_1 = ŷ.]
If W^[1] starts at all zeros, both hidden units compute the same function and receive identical gradient updates, so they stay identical ("symmetric") no matter how long you train; the extra hidden units add nothing.
Random initialization
[Same network as above.]
Break the symmetry by initializing the weights to small random values; the biases can safely start at zero.
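A minimal sketch of this initialization for layer sizes n_x, n_h, n_y (the 0.01 scale keeps z small so sigmoid/tanh units do not start out saturated):

import numpy as np

def initialize_parameters(n_x, n_h, n_y, scale=0.01):
    W1 = np.random.randn(n_h, n_x) * scale   # small random values break the symmetry
    b1 = np.zeros((n_h, 1))                  # biases can start at zero
    W2 = np.random.randn(n_y, n_h) * scale
    b2 = np.zeros((n_y, 1))
    return W1, b1, W2, b2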
Multi-class classification
Softmax regression

Recognizing cats, dogs, and baby chicks
[Images labeled with classes 0–3 (other = 0, cat = 1, dog = 2, baby chick = 3); the example labels shown are 3 1 2 0 3 2 0 1.]
The network maps input X to ŷ, a vector of C = 4 class probabilities produced by a softmax output layer.
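A sketch of the softmax output layer for C-class classification (the max-subtraction is a standard numerical-stability trick, not something from the slide; the example values are illustrative):

import numpy as np

def softmax(z):
    # z: (C, m) pre-activations; each column becomes C probabilities that sum to 1.
    t = np.exp(z - np.max(z, axis=0, keepdims=True))
    return t / np.sum(t, axis=0, keepdims=True)

z = np.array([[5.0], [2.0], [-1.0], [3.0]])   # C = 4 classes, one example
print(softmax(z).ravel())                     # 4 probabilities summing to 1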
Deep Neural Networks
Deep L-layer neural network
Deep Neural Networks
Getting your matrix dimensions right

Parameters W^[l] and b^[l]
[Diagram: a deep network with inputs x1, x2 and output ŷ. For layer l with n^[l] units, W^[l] has shape (n^[l], n^[l−1]) and b^[l] has shape (n^[l], 1).]
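A short sketch of building the parameters for a deep network and asserting the dimensions above (the layer sizes are illustrative):

import numpy as np

layer_dims = [2, 5, 5, 3, 1]   # n^[0] = 2 inputs, three hidden layers, n^[L] = 1 output

parameters = {}
for l in range(1, len(layer_dims)):
    parameters["W" + str(l)] = np.random.randn(layer_dims[l], layer_dims[l - 1]) * 0.01
    parameters["b" + str(l)] = np.zeros((layer_dims[l], 1))
    # W^[l] must be (n^[l], n^[l-1]) and b^[l] must be (n^[l], 1):
    assert parameters["W" + str(l)].shape == (layer_dims[l], layer_dims[l - 1])
    assert parameters["b" + str(l)].shape == (layer_dims[l], 1)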
Vectorized implementation
[Same network. With m examples stacked as columns, Z^[l] and A^[l] have shape (n^[l], m), while W^[l] and b^[l] keep the shapes above (b^[l] broadcasts across the m columns).]
Deep Neural Networks
Why deep representations?

Intuition about deep representation
[Diagram: earlier layers detect simple features (e.g. edges in a face image) and later layers compose them into more complex ones (facial parts, then whole faces) before the network outputs ŷ.]
Circuit theory and deep learning
Informally: there are functions you can compute with a "small" L-layer deep neural network that shallower networks require exponentially more hidden units to compute. (Example: the parity/XOR of n inputs needs only O(log n) layers of small XOR gates, but a single hidden layer needs on the order of 2^n units.)
Deep Neural Networks
Building blocks of deep neural networks

Forward and backward functions
[Diagram: a deep network with inputs x1 … x4 and output ŷ. For each layer l there is a forward function that takes a^[l−1] and outputs a^[l] (caching z^[l]), and a backward function that takes da^[l] plus the cache and outputs da^[l−1], dW^[l], db^[l]; chaining these blocks gives one iteration of gradient descent.]
Deep Neural Networks
Forward and backward propagation

Forward propagation for layer l
Input a^[l−1]; output a^[l], cache z^[l]:
Z^[l] = W^[l] A^[l−1] + b^[l]
A^[l] = g^[l](Z^[l])

Backward propagation for layer l
Input da^[l]; output da^[l−1], dW^[l], db^[l]:
dZ^[l] = dA^[l] ∗ g^[l]′(Z^[l])
dW^[l] = (1/m) dZ^[l] A^[l−1]T
db^[l] = (1/m) np.sum(dZ^[l], axis=1, keepdims=True)
dA^[l−1] = W^[l]T dZ^[l]

Summary
[Diagram: the forward blocks chain from X = A^[0] to ŷ = A^[L]; the backward blocks chain the gradients from dA^[L] back to layer 1.]
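A hedged sketch of one layer's forward and backward building blocks; g and g_prime are the layer's activation and its derivative (e.g. relu and relu_grad from the earlier sketches), and the cache layout is illustrative:

import numpy as np

def layer_forward(A_prev, W, b, g):
    # Forward function for layer l: input A^[l-1], output A^[l] plus a cache for backprop.
    Z = W @ A_prev + b
    A = g(Z)
    return A, (A_prev, W, Z)

def layer_backward(dA, cache, g_prime):
    # Backward function for layer l: input dA^[l] and the cache, output dA^[l-1], dW^[l], db^[l].
    A_prev, W, Z = cache
    m = A_prev.shape[1]
    dZ = dA * g_prime(Z)
    dW = (1 / m) * dZ @ A_prev.T
    db = (1 / m) * np.sum(dZ, axis=1, keepdims=True)
    dA_prev = W.T @ dZ
    return dA_prev, dW, db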
Deep Neural Networks
Parameters vs Hyperparameters

What are hyperparameters?
Parameters: W^[1], b^[1], W^[2], b^[2], …
Hyperparameters: learning rate α, number of iterations, number of hidden layers L, number of hidden units n^[1], n^[2], …, choice of activation function (and later: momentum, mini-batch size, regularization parameters, …).

Applied deep learning is a very empirical process
[Diagram: the Idea → Code → Experiment loop; you try hyperparameter values, watch the cost over iterations, and refine the idea.]
Deep Neural Networks
What does this have to do with the brain?

Forward and backward propagation
Forward:
Z^[1] = W^[1] X + b^[1]
A^[1] = g^[1](Z^[1])
Z^[2] = W^[2] A^[1] + b^[2]
A^[2] = g^[2](Z^[2])
…
A^[L] = g^[L](Z^[L]) = Ŷ

Backward:
dZ^[L] = A^[L] − Y
dW^[L] = (1/m) dZ^[L] A^[L−1]T
db^[L] = (1/m) np.sum(dZ^[L], axis=1, keepdims=True)
dZ^[L−1] = W^[L]T dZ^[L] ∗ g^[L−1]′(Z^[L−1])
…
dZ^[1] = W^[2]T dZ^[2] ∗ g^[1]′(Z^[1])
dW^[1] = (1/m) dZ^[1] X^T
db^[1] = (1/m) np.sum(dZ^[1], axis=1, keepdims=True)

[Image: a biological neuron shown alongside a single logistic unit.]
Setting up your ML application
Train/dev/test sets

Applied ML is a highly iterative process
Choices to make: # layers, # hidden units, learning rates, activation functions, …
[Diagram: the Idea → Code → Experiment loop.]

Train/dev/test sets

Mismatched train/test distribution
Bias/Variance

Bias and Variance
[Plots: high-bias (underfitting), "just right", and high-variance (overfitting) classifiers on a 2D dataset.]

Bias and Variance
Cat classification

High bias and high variance
[Plot: a classifier in (x1, x2) that both underfits the overall shape of the data and overfits a few individual points.]
Setting up your ML application
Basic "recipe" for machine learning

Basic recipe for machine learning
Regularizing your neural network
Regularization

Logistic regression
min over w, b of J(w, b)
J(w, b) = (1/m) Σ_{i=1}^{m} ℒ(ŷ^(i), y^(i)) + (λ/2m) ‖w‖₂²

Neural network
J(W^[1], b^[1], …, W^[L], b^[L]) = (1/m) Σ_{i=1}^{m} ℒ(ŷ^(i), y^(i)) + (λ/2m) Σ_{l=1}^{L} ‖W^[l]‖_F²   (Frobenius norm)
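A small sketch of L2 (Frobenius-norm) regularization in code, assuming a cross-entropy cost has already been computed; the weight-decay term added to each dW^[l] is shown as a comment:

import numpy as np

def cost_with_l2(cross_entropy_cost, weight_matrices, lambd, m):
    # Adds (lambda / 2m) * sum_l ||W^[l]||_F^2 to the unregularized cost.
    l2_term = (lambd / (2 * m)) * sum(np.sum(np.square(W)) for W in weight_matrices)
    return cross_entropy_cost + l2_term

# In backprop, each gradient picks up an extra "weight decay" term:
#   dWl = (1 / m) * dZl @ A_prev.T + (lambd / m) * Wl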
How does regularization prevent overfitting?
[Diagram: the network with inputs x1, x2, x3 and output ŷ.]
Intuition: a large λ pushes the weights W^[l] toward zero, effectively reducing the influence of many hidden units and moving the network toward a simpler function that is less prone to overfit.
Regularizing your neural network
Why regularization reduces overfitting

How does regularization prevent overfitting?
[Diagram: the same network.]
A second intuition: with λ large the weights are small, so z^[l] stays in the roughly linear region of tanh; every layer is then nearly linear, and the whole network cannot fit a very complicated, overfit decision boundary.
Regularizing your neural network
Dropout regularization

Dropout regularization
[Diagram: the original network with inputs x1 … x4 and output ŷ (left), and the same network after dropout (right), where each hidden unit has been kept or removed at random, leaving a smaller "thinned" network that is trained on this example.]
Implementing dropout ("Inverted dropout")
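A sketch of the inverted-dropout step for a hypothetical layer 3 with activations a3 of shape (n^[3], m), following the keep_prob convention used in the lecture:

import numpy as np

keep_prob = 0.8                               # probability of keeping each unit in this layer
a3 = np.random.rand(50, 100)                  # stand-in for the layer-3 activations, shape (n^[3], m)

d3 = np.random.rand(*a3.shape) < keep_prob    # dropout mask: True with probability keep_prob
a3 = a3 * d3                                  # zero out the dropped units
a3 = a3 / keep_prob                           # "inverted" step: rescale so E[a3] is unchanged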
Making predictions at test time
Do not apply dropout at test time; because of the inverted-dropout scaling during training, the expected activations are unchanged and no extra scaling is needed.
Regularizing your neural network
Understanding dropout

Why does drop-out work?
Intuition: a unit can't rely on any one feature, because any of its inputs could be dropped, so it has to spread out its weights. Spreading the weights shrinks their squared norm, giving an effect similar to L2 regularization.
[Diagram: the network with inputs x1, x2, x3 and output ŷ.]
Regularizing your neural network
Other regularization methods

Data augmentation
[Images: extra training examples created by flipping, randomly cropping, or slightly distorting existing images (e.g. a mirrored cat photo, a distorted digit 4).]

Early stopping
[Plot: training error keeps decreasing with # iterations while dev-set error starts rising; stop training around the point where the dev error is lowest.]
Setting up your optimization problem
Normalizing inputs

Normalizing training sets
Subtract the mean, then divide each feature by its standard deviation: x := (x − μ)/σ. Use the same μ and σ, computed on the training set, to normalize the dev and test sets.
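A minimal sketch of the normalization step, assuming X has shape (n_x, m); the small epsilon guards against constant features and is not part of the slide:

import numpy as np

def normalize_train(X):
    mu = np.mean(X, axis=1, keepdims=True)            # per-feature mean over the m examples
    sigma = np.std(X, axis=1, keepdims=True) + 1e-8   # per-feature standard deviation
    return (X - mu) / sigma, mu, sigma

# Use the training-set mu and sigma for dev/test data too:
#   X_dev = (X_dev - mu) / sigma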
Why normalize inputs?
J(w, b) = (1/m) Σ_{i=1}^{m} ℒ(ŷ^(i), y^(i))
[Contour plots of J over (w, b): with unnormalized inputs the contours are elongated, forcing a small learning rate and many oscillating steps; with normalized inputs the contours are more symmetric and gradient descent can take larger, more direct steps.]
Setting up your optimization problem
Vanishing/exploding gradients

Vanishing/exploding gradients
[Diagram: a very deep network computing ŷ. With near-linear activations, ŷ is roughly the product W^[L] W^[L−1] … W^[1] x; weights slightly larger than the identity make the activations and gradients grow exponentially with depth, and weights slightly smaller make them shrink exponentially.]

Single neuron example
a = g(z),  z = w₁x₁ + w₂x₂ + … + wₙxₙ  (b = 0)
The larger the number of inputs n, the smaller each wᵢ should be; this motivates scaling the random initialization by the fan-in, e.g. Var(wᵢ) = 1/n (or 2/n for ReLU).
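A tiny numeric illustration of the effect (my own, not from the slide), plus the fan-in-scaled initialization the lecture points to as a partial fix:

import numpy as np

L = 50
print(1.5 ** L, 0.5 ** L)   # ~6.4e8 vs ~8.9e-16: per-layer factors slightly above or below 1 explode or vanish

def fan_in_init(n_out, n_in):
    # Var(w_i) = 2/n ("He" initialization), suited to ReLU layers; use 1/n for tanh.
    return np.random.randn(n_out, n_in) * np.sqrt(2.0 / n_in)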
Setting up your optimization problem
Numerical approximation of gradients

Checking your derivative computation
Two-sided difference: g(θ) ≈ (f(θ + ε) − f(θ − ε)) / (2ε), with approximation error O(ε²), versus O(ε) for the one-sided difference (f(θ + ε) − f(θ)) / ε.
[Plot: f(θ) = θ³ near θ = 1; with ε = 0.01 the two-sided estimate 3.0001 is much closer to the true derivative 3 than the one-sided estimate 3.0301.]
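The two-sided difference in code, reproducing the θ³ example (printed values are approximate):

def two_sided_diff(f, theta, eps=1e-2):
    # Numerical approximation of f'(theta) with error O(eps^2).
    return (f(theta + eps) - f(theta - eps)) / (2 * eps)

f = lambda t: t ** 3
print(two_sided_diff(f, 1.0))             # ~3.0001, close to the true derivative 3
print((f(1.0 + 1e-2) - f(1.0)) / 1e-2)    # one-sided estimate: ~3.0301, noticeably worse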
Setting up your optimization problem
Gradient Checking

Gradient check for a neural network
Reshape and concatenate all parameters W^[1], b^[1], …, W^[L], b^[L] into one big vector θ, and all gradients dW^[1], db^[1], … into dθ, so that J = J(θ).

Gradient checking (Grad check)
For each i: dθ_approx[i] = (J(θ₁, …, θᵢ + ε, …) − J(θ₁, …, θᵢ − ε, …)) / (2ε).
Then check ‖dθ_approx − dθ‖₂ / (‖dθ_approx‖₂ + ‖dθ‖₂); with ε = 10⁻⁷, a value around 10⁻⁷ is great, around 10⁻⁵ deserves a closer look, and around 10⁻³ or larger suggests a bug.
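A hedged sketch of grad check, assuming the cost J can be evaluated as a function of the flattened parameter vector θ (the flattening helpers are left out):

import numpy as np

def grad_check(J, theta, dtheta, eps=1e-7):
    # Compare the analytic gradient dtheta against a numerical estimate, component by component.
    dtheta_approx = np.zeros_like(theta)
    for i in range(theta.size):
        theta_plus = theta.copy()
        theta_plus[i] += eps
        theta_minus = theta.copy()
        theta_minus[i] -= eps
        dtheta_approx[i] = (J(theta_plus) - J(theta_minus)) / (2 * eps)
    num = np.linalg.norm(dtheta_approx - dtheta)
    den = np.linalg.norm(dtheta_approx) + np.linalg.norm(dtheta)
    return num / den    # ~1e-7 is great, ~1e-5 is worth a look, >= 1e-3 is probably a bug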
Setting up your optimization problem
Gradient Checking implementation notes

Gradient checking implementation notes
- Don't use in training – only to debug (it is far too slow to run on every iteration).
- Remember regularization: the numerical check is against J including the regularization term, so the analytic gradients must include it too.