Lecture 02 - Artificial Neural Network

The document discusses the Delta Learning Rule and Backpropagation Algorithm in neural networks, focusing on how to minimize error through weight adjustments during training. It explains the importance of learning constants, the stochastic approximation to gradient descent, and the challenges of local minima. Additionally, it covers the architecture of Multi Layer Perceptrons (MLPs) and their application in tasks like face recognition and autonomous vehicle steering.


NEURAL NETWORKS

Delta Learning Rule

Let E = accumulative error over a data set. It is a function of
the neuron weights

E = Σ over training samples (d - O)²

d is the desired output and O is the actual output

The error is squared so that the positive and negative errors
may not cancel each other out during summation
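
As a small illustrative aside (not from the original slides), a few made-up error values show why the squaring matters:

import numpy as np

errors = np.array([2.0, -2.0, 1.0, -1.0])   # d - O for four hypothetical training samples
print(errors.sum())          # 0.0  -> positive and negative errors cancel out
print((errors ** 2).sum())   # 10.0 -> squared errors accumulate instead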
NEURAL NETWORKS

Delta Learning Rule

Each weight configuration can be represented by a point on
an error surface
NEURAL NETWORKS

Delta Learning Rule

Starting from a random weight configuration, we want our
training algorithm to move in the direction where the error is
reduced most rapidly

The delta rule attempts to minimize the local error and uses the
derivative of the error to find the slope of the error surface in
the region local to a particular point
NEURAL NETWORKS

Delta Learning Rule

Delta rule:
Δwi = -c (∂Error/∂wi)

If the learning constant "c" is large (more than 0.5), the weights
move quickly to their optimal values, but there is a risk of
overshooting the minimum or of oscillation around the optimum
weights

If "c" is small, the training is less prone to these problems,
but the system does not learn quickly; also, the algorithm may
get stuck in local minima
NEURAL NETWORKS

Delta Learning Rule

The weights are updated incrementally, following the
presentation of each training example

This corresponds to a stochastic approximation to gradient
descent

To obtain the true gradient of the error, one would consider all
of the training examples before altering the weight values

The stochastic approximation avoids costly computations per
weight update
NEURAL NETWORKS
Delta Learning Rule

To calculate the weight change, we use the chain rule

The Error is only indirectly dependent on wi, but it is directly
dependent on the variable O

∂Error/∂wi = (∂Error/∂O) . (∂O/∂wi)

∂Error/∂O = rate of change of error w.r.t. the output

Now ∂Error/∂O = ∂(d - O)²/∂O = -2(d - O)

For ∂O/∂wi we have (∂O/∂act) (∂act/∂wi)

(∂O/∂act) = (∂f(act)/∂act) = f'(act)
(∂act/∂wi) = (∂ Σi xi wi /∂wi) = xi

Hence Δwi = -c (∂Error/∂wi) = -c[-2(d - O) . f'(act) . xi]
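
To make the update concrete, here is a minimal Python sketch (added for illustration, not part of the original slides) of the delta rule for a single sigmoidal neuron; the function and variable names are assumed, and the squashing parameter is taken as 1:

import numpy as np

def sigmoid(act):
    # logistic activation with the squashing parameter assumed to be 1
    return 1.0 / (1.0 + np.exp(-act))

def delta_rule_update(weights, x, d, c=0.1):
    # one incremental delta-rule step for a single sigmoidal neuron
    act = np.dot(weights, x)        # act = sum_i x_i * w_i
    O = sigmoid(act)                # actual output
    f_prime = O * (1.0 - O)         # f'(act) = f(act)(1 - f(act))
    # dError/dw_i = -2 (d - O) f'(act) x_i, so delta_w_i = -c * dError/dw_i
    return weights + 2.0 * c * (d - O) * f_prime * x

# toy usage: one training sample with three inputs
w = np.zeros(3)
w = delta_rule_update(w, np.array([1.0, 0.5, -0.3]), d=1.0)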
NEURAL NETWORKS

Delta Learning Rule

A typical activation function is the logistic function (which is a
type of sigmoidal function)

f(act) = 1/(1 + e^(-λ act))

If the value of λ (the squashing parameter) is large, we have a unit
step function; if it is small, we have almost a straight line
between the two saturation limits

f'(act) = λ f(act)(1 - f(act))
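
A short sketch (assumed code, not from the slides) of the logistic function with an explicit squashing parameter λ, showing that a large λ approaches a unit step while a small λ stays nearly linear between the saturation limits:

import numpy as np

def logistic(act, lam=1.0):
    # f(act) = 1 / (1 + exp(-lam * act))
    return 1.0 / (1.0 + np.exp(-lam * act))

def logistic_derivative(act, lam=1.0):
    f = logistic(act, lam)
    return lam * f * (1.0 - f)      # f'(act) = lam * f(act) * (1 - f(act))

acts = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(logistic(acts, lam=10.0))     # close to a unit step function
print(logistic(acts, lam=0.1))      # almost a straight line near the origin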


NEURAL NETWORKS

Multi Layer Perceptron: Architecture & Forward Pass

[Diagram: MLP architecture showing hidden units and output units]


NEURAL NETWORKS

Delta Learning Rule

Now the accumulative error E over a data set will be

E = Σ over training samples Σj (dj - Oj)²

dj is the desired output of node j and Oj is the actual output


NEURAL NETWORKS

Backpropagation Algorithm

• Set up the architecture & initialize the weights of the network
• Apply the training pairs (input-output vectors) from the
training set, one by one
• For each training pair, calculate the output of the network
• Calculate the error between actual output & desired output
• Propagate the error backwards & adjust the weights in such
a way that the error is minimized
• Repeat the above steps for each pair in the training set until
the error for the set is lower than the required minimum error
(a training-loop sketch is given below)
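
The sketch below (an assumed illustration in Python with made-up names and XOR as a stand-in training set, not the code used in the lecture) shows these steps as incremental backpropagation for one hidden layer of sigmoid units:

import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_backprop(X, D, n_hidden=4, c=0.5, epochs=10000, tol=0.01, seed=0):
    # Incremental (stochastic) backprop for a 1-hidden-layer MLP.
    # Bias terms are handled by appending a constant 1 to the inputs and hidden outputs.
    rng = np.random.default_rng(seed)
    n_in, n_out = X.shape[1] + 1, D.shape[1]
    W_hid = rng.uniform(-0.5, 0.5, (n_in, n_hidden))        # (input + bias) -> hidden
    W_out = rng.uniform(-0.5, 0.5, (n_hidden + 1, n_out))   # (hidden + bias) -> output

    for _ in range(epochs):
        sq_error = 0.0
        for x, d in zip(X, D):                      # present training pairs one by one
            x1 = np.append(x, 1.0)                  # input with bias term
            h = sigmoid(x1 @ W_hid)                 # hidden outputs O_i
            h1 = np.append(h, 1.0)                  # hidden outputs with bias term
            o = sigmoid(h1 @ W_out)                 # network outputs O_j
            err = d - o
            sq_error += np.sum(err ** 2)
            delta_out = err * o * (1.0 - o)                       # (d_j - O_j) f'(act_j)
            delta_hid = (W_out[:-1] @ delta_out) * h * (1.0 - h)  # weighted sum of output deltas
            W_out += c * np.outer(h1, delta_out)    # adjust hidden -> output weights
            W_hid += c * np.outer(x1, delta_hid)    # adjust input -> hidden weights
        if sq_error < tol:                          # stop when the set error is below the required minimum
            break
    return W_hid, W_out

# Toy usage: learn XOR as a stand-in training set
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
D = np.array([[0], [1], [1], [0]], dtype=float)
W_hid, W_out = train_backprop(X, D)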
NEURAL NETWORKS

Multi Layer Perceptron: Training of Hidden Layer Weights

[Diagram: weight wki feeding unit i, with the hidden layer and output layer shown]


NEURAL NETWORKS

Multi Layer Perceptron: Training

Since training examples provide target values only for the
network outputs, no target values are directly available
to indicate the error of the hidden units' values

Instead, the error term for a hidden unit is calculated by
taking the weighted sum of the error terms of the output
units influenced by it

Each such weight characterizes the degree to which the hidden unit
is responsible for the error in that output unit
NEURAL NETWORKS

Multi Layer Perceptron: Training of Hidden Layer Weights

Adjustment of the kth weight of node "i"

Δwki = -c (∂Error/∂wki)

∂Error/∂wki = (∂Error/∂Oi) . (∂Oi/∂wki)

∂Error/∂Oi = rate of change of error w.r.t. the output of node i
           = ∂ Σj Errorj /∂Oi

Since each Errorj is dependent upon Oi but all Errorj are
independent of each other (each has its own independent
weight set),
hence ∂ Σj Errorj /∂Oi = Σj (∂Errorj/∂Oi)
NEURAL NETWORKS

Multi Layer Perceptron: Training of Hidden Layer Weights

[Diagram: hidden unit i, with output Oi = xi, connected to output unit j by weight wij]


NEURAL NETWORKS

Multi Layer Perceptron: Training of Hidden Layer Weights

Hence Σj (∂Errorj/∂Oi) = Σj [(∂Errorj/∂actj) . (∂actj/∂Oi)]

∂Errorj/∂actj = (∂Errorj/∂Oj) (∂Oj/∂actj)

∂Errorj/∂Oj = ∂(dj - Oj)²/∂Oj = -2(dj - Oj)
∂Oj/∂actj = ∂f(actj)/∂actj = f'(actj)

(∂actj/∂Oi) = (∂ Σi xi wij /∂Oi)

Since Oi = xi,
hence ∂actj/∂Oi = wij
NEURAL NETWORKS

Multi Layer Perceptron: Training of Hidden Layer Weights

Furthermore, ∂Oi/∂wki = (∂Oi/∂acti)(∂acti/∂wki)

(∂acti/∂wki) = (∂ Σk xk wki /∂wki) = xk

(∂Oi/∂acti) = (∂f(acti)/∂acti) = f'(acti)

Hence Δwki = -c (∂Error/∂wki)
           = -c[-2 Σj {(dj - Oj) f'(actj) wij} f'(acti) xk]
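
As a hedged illustration (variable names are assumed), the hidden-layer update above can be coded directly:

import numpy as np

def hidden_weight_update(x_k, f_prime_act_i, d, O, f_prime_act_j, w_i_to_j, c=0.1):
    # delta_w_ki = -c * dError/dw_ki
    #            = 2c * sum_j[(d_j - O_j) f'(act_j) w_ij] * f'(act_i) * x_k
    backpropagated = np.sum((d - O) * f_prime_act_j * w_i_to_j)
    return 2.0 * c * backpropagated * f_prime_act_i * x_k

# toy numbers: two output nodes j, one hidden node i, one incoming value x_k
dw = hidden_weight_update(x_k=0.8, f_prime_act_i=0.2,
                          d=np.array([1.0, 0.0]), O=np.array([0.6, 0.3]),
                          f_prime_act_j=np.array([0.24, 0.21]),
                          w_i_to_j=np.array([0.5, -0.4]))
print(dw)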
NEURAL NETWORKS

Multi Layer Perceptron: Training

This approach is called "gradient descent learning"

A requirement of this approach is that the activation function
must be differentiable (and hence continuous)

The number of input and output neurons is fixed

But the selection of the number of hidden layers and the number
of neurons in the hidden layers is done by trial and error
NEURAL NETWORKS

Multi Layer Perceptron: Training

Gradient descent is not guaranteed to converge to the
global optimum

The algorithm we have discussed is the incremental gradient
descent (or stochastic gradient descent) version of
backpropagation
NEURAL NETWORKS

Multi Layer Perceptron: Face Recognition Example

Images of 20 different people

32 images per person, with
varying expressions (happy, sad, angry, neutral),
looking in various directions (left, right, straight, up),
and with and without sunglasses

Grayscale images (intensity between 0 and 255) with a
size (resolution) of 120 x 128 pixels
NEURAL NETWORKS

Multi Layer Perceptron: Face Recognition Example


NEURAL NETWORKS

Multi Layer Perceptron: Face Recognition Example

An ANN can be trained on any one of a variety of target
functions using this image data, e.g.
- identity of the person
- direction in which the person is looking
- gender of the person
- whether or not they are wearing sunglasses
NEURAL NETWORKS

Multi Layer Perceptron: Face Recognition Example

Design Choices:

Separate the data into
training (260 images) and test (364 images) sets

Input Encoding
- 30 x 32 pixel image
- A coarse-resolution version of the 120 x 128 pixel image
- Every 4 x 4 block of pixels is replaced by its mean value
- The pixel intensity is linearly scaled from 0 to 1 so
that inputs, hidden units and output units have
the same range
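
A minimal sketch of this input encoding (variable names assumed; random data stands in for a real face image):

import numpy as np

def encode_image(img):
    # coarsen a 120 x 128 grayscale image to 30 x 32: each 4 x 4 block is
    # replaced by its mean, then intensities 0..255 are scaled to 0..1
    h, w = img.shape
    coarse = img.reshape(h // 4, 4, w // 4, 4).mean(axis=(1, 3))
    return coarse / 255.0

img = np.random.randint(0, 256, size=(120, 128))
x = encode_image(img)
print(x.shape)      # (30, 32) -> 960 network inputs when flattened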
NEURAL NETWORKS

Multi Layer Perceptron: Face Recognition Example

Design Choices:

Output Encoding
- Learning Task: Direction in which the person is looking
- Only one neuron could have been used, with outputs
0.2, 0.4, 0.6, and 0.8 encoding the four possible
values
- But we use 4 output neurons, so that a measure of
confidence in the ANN's decision can be
obtained
- Output vector:
1 for true & 0 for false; e.g. [1, 0, 0, 0]
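
A short sketch (assumed names) of the 1-of-4 target encoding and of reading a decision plus confidence from the four outputs:

import numpy as np

DIRECTIONS = ["left", "right", "straight", "up"]

def target_vector(direction):
    # 1 for the true direction, 0 for the others, e.g. "left" -> [1, 0, 0, 0]
    t = np.zeros(4)
    t[DIRECTIONS.index(direction)] = 1.0
    return t

def decode(outputs):
    # the highest output gives the decision; its value serves as a confidence measure
    i = int(np.argmax(outputs))
    return DIRECTIONS[i], float(outputs[i])

print(target_vector("left"))                    # [1. 0. 0. 0.]
print(decode(np.array([0.9, 0.1, 0.2, 0.1])))   # ('left', 0.9)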
NEURAL NETWORKS

Multi Layer Perceptron: Face Recognition Example

Design Choices:

Network Structure
- How many layers?
Usually one hidden layer is enough
- How many units in the hidden layer?
More units than necessary result in over-fitting
Too few units result in failure of training
Trial & error: start with a number and prune
the units with the help of a cross-validation set
NEURAL NETWORKS

Example of MLP based Systems:


ALVINN

The system ALVINN uses a
trained ANN to steer an
autonomous vehicle driving at
normal speeds on public
highways

The input is a 30 x 32 grid of
pixel intensities obtained from a
forward-pointing camera
mounted on the vehicle
NEURAL NETWORKS

Example of MLP based Systems:


ALVINN

The network output is the
direction in which the vehicle is
steered

The training examples have been
obtained from the steering
commands of a human driver
NEURAL NETWORKS

Appropriate Problems for MLPs

MLPs are appropriate for problems with the following
characteristics:

1. The input is high-dimensional, and the input attributes may
be highly correlated or they may be independent

2. The output vector may have discrete or real-valued
attributes

3. The training data may contain noise (errors)

4. Long training times are acceptable


NEURAL NETWORKS

Appropriate MLP Problems

5. Fast system response is required (a trained NN evaluates the
outputs quickly)

6. The ability of humans to understand the learned target
function is not important
NEURAL NETWORKS

Adding Momentum

A popular variation of the backpropagation algorithm is to
use the following weight update rule:

Δwi(n) = -c (∂Error/∂wi) + α Δwi(n-1)

The weight update on the nth iteration is partially dependent
upon the update that occurred during the (n-1)th iteration

The constant α ("alpha") is called the momentum and its value is
between 0 & 1

The first term on the right of the equation is just the weight
update rule described before
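
A minimal sketch of the momentum rule (assumed names; grad stands for ∂Error/∂w computed for the current example):

import numpy as np

def momentum_update(w, grad, prev_delta_w, c=0.1, alpha=0.9):
    # delta_w(n) = -c * dError/dw + alpha * delta_w(n-1), with 0 < alpha < 1
    delta_w = -c * grad + alpha * prev_delta_w
    return w + delta_w, delta_w

# toy usage: two consecutive updates with the same gradient
w, prev = np.zeros(3), np.zeros(3)
g = np.array([0.2, -0.1, 0.05])
w, prev = momentum_update(w, g, prev)
w, prev = momentum_update(w, g, prev)   # the step grows where the gradient is unchanging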
NEURAL NETWORKS

Adding Momentum

To understand the effect of the
momentum term, consider that the
gradient descent search trajectory
is analogous to that of a ball rolling
down the error surface

Without the momentum term, the
ball is pushed only by the weight
change term
NEURAL NETWORKS

Adding Momentum

The effect of momentum is a
tendency to keep the ball rolling in
the same direction from one
iteration to the next

It keeps the ball rolling through
small local minima in the error
surface and along flat regions of
the surface where the ball would normally stop

It also has the effect of gradually increasing the step size of
the search in regions where the gradient is unchanging,
thereby speeding convergence
NEURAL NETWORKS

Backpropagation Algorithm: Convergence & Local Minima

Gradient descent may become trapped in a local minimum
- However, due to the high dimensionality of the weight space,
the chances of this are low
- Sometimes a local minimum near the global minimum
is good enough

- Remedy: Add momentum
- Remedy: Stochastic gradient descent
- Remedy: Train multiple nets with different initial
weights
Reading Assignment & References

Chapter 4 of T. Mitchell

http://www-2.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/neural/systems/nevprop/np.c
Assignment

• "Global Optimization Algorithm for
training Product Unit Neural Networks"
• Read the paper
• Make a one-page summary
• Implement the paper
NEURAL NETWORKS

Homework

Read the 2nd chapter of Engelbrecht's book
