8th Lecture: Delta Rule Learning

Outline
• Delta Rule Learning (one neuron)
• Widrow-Hoff Algorithm
• Perceptron example
• Example
• MATLAB example
• Backpropagation
Tarek A. Tutunji
Delta Rule Learning (one neuron)

net = wᵀx,   y = f(net)

[Figure: a single neuron with inputs x, weights w, learning rate α, and weight update Δw]
The error between the network output and the desired output is

E = ½ (d − y)²

The derivative with respect to the weights is

dE/dw_i = (dE/dy)(dy/dw_i) = −(d − y) f′(net) x_i

Update the weights using the delta rule:

w_i = w_i − α dE/dw_i

In vector format: W = W − α ∇E
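As a sanity check on the derivative above, the analytic gradient can be compared against finite differences. A minimal Python sketch, assuming the bipolar-sigmoid activation used in the example that follows:

```python
import math

def f(net):
    # bipolar sigmoid used later in the lecture: f(net) = (1 - e^-net)/(1 + e^-net)
    return (1 - math.exp(-net)) / (1 + math.exp(-net))

def fprime(y):
    # its derivative expressed via the output: f'(net) = 0.5 (1 - y^2)
    return 0.5 * (1 - y * y)

def error(w, x, d):
    net = sum(wi * xi for wi, xi in zip(w, x))
    return 0.5 * (d - f(net)) ** 2

# analytic gradient: dE/dw_i = -(d - y) f'(net) x_i
w = [1.0, -1.0, 0.0, 0.5]
x = [1.0, -2.0, 0.0, -1.0]
d = -1.0
net = sum(wi * xi for wi, xi in zip(w, x))
y = f(net)
grad = [-(d - y) * fprime(y) * xi for xi in x]

# numerical gradient by central differences
h = 1e-6
num = []
for i in range(len(w)):
    wp = list(w); wp[i] += h
    wm = list(w); wm[i] -= h
    num.append((error(wp, x, d) - error(wm, x, d)) / (2 * h))

print(max(abs(a - b) for a, b in zip(grad, num)))  # very small: the formulas agree
```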
Example

Activation function (bipolar sigmoid):

y = f(net) = (1 − e^(−net)) / (1 + e^(−net))

with α = 0.1
Example: Iteration One, Pattern One

x1 = [1, −2, 0, −1]ᵀ,  d1 = −1,  w = [1, −1, 0, 0.5]ᵀ

net = wᵀx = (1)(1) + (−1)(−2) + (0)(0) + (0.5)(−1) = 2.5

y = (1 − e^(−net)) / (1 + e^(−net)) = 0.848

y′ = 0.5 (1 − y²) = 0.140

∇E = −α (d − y) y′ x = −0.1 (−1 − 0.848)(0.140) [1, −2, 0, −1]ᵀ
   = [0.0259, −0.0518, 0, −0.0259]ᵀ

w = w − ∇E = [1, −1, 0, 0.5]ᵀ − [0.0259, −0.0518, 0, −0.0259]ᵀ
  = [0.9741, −0.9482, 0, 0.5259]ᵀ
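The arithmetic above can be replayed in a few lines of Python (numbers rounded as on the slide):

```python
import math

w = [1.0, -1.0, 0.0, 0.5]
x = [1.0, -2.0, 0.0, -1.0]
d, alpha = -1.0, 0.1

net = sum(wi * xi for wi, xi in zip(w, x))        # 2.5
y = (1 - math.exp(-net)) / (1 + math.exp(-net))   # 0.848
yp = 0.5 * (1 - y ** 2)                           # 0.140
grad = [-alpha * (d - y) * yp * xi for xi in x]   # [0.0259, -0.0518, 0, -0.0259]
w_new = [wi - gi for wi, gi in zip(w, grad)]      # [0.9741, -0.9482, 0, 0.5259]
print(net, round(y, 3), [round(v, 4) for v in w_new])
```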
Example: Two Loops

Training uses two nested loops: an outer Iteration Loop over epochs and an inner Pattern Loop over the training patterns.

Iteration One:   Pattern One, Pattern Two, Pattern Three
Iteration Two:   Pattern One, Pattern Two, Pattern Three
Iteration Three: Pattern One, Pattern Two, Pattern Three
…

After each iteration, calculate the error.
MATLAB CODE

% Initialization
w = [1 -1 0 0.5]';
x1 = [1 -2 0 -1]'; x2 = [0 1.5 -0.5 -1]'; x3 = [-1 1 0.5 -1]';
d1 = -1; d2 = -1; d3 = 1;
a = 0.1;                                   % learning rate

for iter = 1:100
    % Pattern 1
    net = w'*x1;
    y1 = ( 1 - exp(-net) ) / ( 1 + exp(-net) );
    yp = 0.5 * ( 1 - y1^2 );
    dE = -a * (d1 - y1)*yp*x1;
    w = w - dE;
    % Pattern 2
    net = w'*x2;
    y2 = ( 1 - exp(-net) ) / ( 1 + exp(-net) );
    yp = 0.5 * ( 1 - y2^2 );
    dE = -a * (d2 - y2)*yp*x2;
    w = w - dE;
    % Pattern 3
    net = w'*x3;
    y3 = ( 1 - exp(-net) ) / ( 1 + exp(-net) );
    yp = 0.5 * ( 1 - y3^2 );
    dE = -a * (d3 - y3)*yp*x3;
    w = w - dE;
end
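For readers without MATLAB, the same loop can be sketched in pure Python; it reproduces the α = 0.1 results reported in the MATLAB Results section:

```python
import math

def f(net):
    # bipolar sigmoid activation
    return (1 - math.exp(-net)) / (1 + math.exp(-net))

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

w = [1.0, -1.0, 0.0, 0.5]
patterns = [([1, -2, 0, -1], -1),
            ([0, 1.5, -0.5, -1], -1),
            ([-1, 1, 0.5, -1], 1)]
alpha = 0.1

for _ in range(100):                       # iteration loop
    for x, d in patterns:                  # pattern loop (online updates)
        y = f(dot(w, x))
        yp = 0.5 * (1 - y ** 2)
        dE = [-alpha * (d - y) * yp * xi for xi in x]
        w = [wi - gi for wi, gi in zip(w, dE)]

outputs = [round(f(dot(w, x)), 4) for x, _ in patterns]
print(outputs)   # close to the slide's results: y1 ≈ -0.8897, y2 ≈ -0.7191, y3 ≈ 0.7319
```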
MATLAB Results

[Figure: error vs. iteration (0–100) for α = 0.1 and α = 0.9; the error falls from about 7 toward 0, faster for the larger learning rate]

α = 0.1:  y1 = −0.8897,  y2 = −0.7191,  y3 = 0.7319
α = 0.9:  y1 = −0.9669,  y2 = −0.9240,  y3 = 0.9278
Multi-Neurons

y_j = f(net_j)

dE/dw_ij = −(d_j − y_j) f′(net_j) x_i

w_ij = w_ij − α dE/dw_ij
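With several neurons sharing the same inputs, the same rule applies per neuron. A sketch with two output neurons (the weights and data below are made-up illustration values):

```python
import math

def f(net):
    # bipolar sigmoid, as in the single-neuron example
    return (1 - math.exp(-net)) / (1 + math.exp(-net))

# hypothetical data: 3 inputs, 2 output neurons; W[i][j] connects input i to neuron j
x = [1.0, 0.5, -1.0]
d = [1.0, -1.0]
W = [[ 0.2, -0.1],
     [ 0.4,  0.3],
     [-0.5,  0.1]]
alpha = 0.1

def total_error(W):
    nets = [sum(W[i][j] * x[i] for i in range(3)) for j in range(2)]
    return 0.5 * sum((dj - f(nj)) ** 2 for dj, nj in zip(d, nets))

E_before = total_error(W)

# forward pass, then one delta-rule step per neuron:
# w_ij <- w_ij - alpha * ( -(d_j - y_j) f'(net_j) x_i )
nets = [sum(W[i][j] * x[i] for i in range(3)) for j in range(2)]
y = [f(n) for n in nets]
for j in range(2):
    yp = 0.5 * (1 - y[j] ** 2)
    for i in range(3):
        W[i][j] -= alpha * (-(d[j] - y[j]) * yp * x[i])

E_after = total_error(W)
print(E_before, E_after)   # the error decreases after the step
```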
Multilayer Feedforward Network
Backpropagation

• Backpropagation is an algorithm for supervised learning of artificial neural networks using gradient descent.
Three-layer Network

[Figure: inputs x1 … xq, hidden layer y1 … ym, outputs z1 … zn; hidden weights v_ij with biases v_0j, output weights w_jk with biases w_0k]
Backpropagation Theory

Error for each pattern:

E = ½ Σ_{k=1}^{n} (z_k − d_k)²

Step One: Output Weights

∂E/∂w_jk = δ_k y_j   where   δ_k = (z_k − d_k) z′_k
Backpropagation Theory

Forward pass through the network:

Y = f(VᵀX + V₀)
Z = f(WᵀY + W₀)
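The forward pass can be sketched directly from these two equations (the 2-3-1 shapes and weight values below are hypothetical):

```python
import math

def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))

def forward(X, V, V0, W, W0):
    """Y = f(V^T X + V0), Z = f(W^T Y + W0) for a three-layer network.
    V is q x m (input-to-hidden), W is m x n (hidden-to-output)."""
    m, n = len(V[0]), len(W[0])
    Y = [sigmoid(sum(V[i][j] * X[i] for i in range(len(X))) + V0[j]) for j in range(m)]
    Z = [sigmoid(sum(W[j][k] * Y[j] for j in range(m)) + W0[k]) for k in range(n)]
    return Y, Z

# hypothetical 2-3-1 network
V  = [[0.1, -0.2,  0.3],
      [0.4,  0.5, -0.6]]
V0 = [0.0, 0.1, -0.1]
W  = [[0.7], [-0.8], [0.9]]
W0 = [0.05]

Y, Z = forward([1.0, -1.0], V, V0, W, W0)
print(Y, Z)
```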
Step Two: Hidden Weights

∂E/∂v_ij = Σ_{k=1}^{n} (∂E/∂z_k)(∂z_k/∂y_j)(∂y_j/∂v_ij)

∂E/∂v_ij = δ_j x_i   where   δ_j = y′_j Σ_{k=1}^{n} δ_k w_jk
Function Derivatives: Review

Sigmoidal:
y = f(net) = 1 / (1 + e^(−net))
y′ = y (1 − y)

Hyperbolic Tangent:
y = f(net) = (1 − e^(−net)) / (1 + e^(−net))
y′ = 0.5 (1 − y²)
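Both identities are easy to confirm numerically with central differences:

```python
import math

def logistic(net):
    return 1.0 / (1.0 + math.exp(-net))

def bipolar(net):
    return (1.0 - math.exp(-net)) / (1.0 + math.exp(-net))

h = 1e-6
for net in [-2.0, -0.5, 0.0, 1.0, 3.0]:
    # sigmoidal: y' = y (1 - y)
    y = logistic(net)
    num = (logistic(net + h) - logistic(net - h)) / (2 * h)
    assert abs(num - y * (1 - y)) < 1e-8
    # hyperbolic-tangent form: y' = 0.5 (1 - y^2)
    y = bipolar(net)
    num = (bipolar(net + h) - bipolar(net - h)) / (2 * h)
    assert abs(num - 0.5 * (1 - y * y)) < 1e-8
print("derivative identities hold")
```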
Backpropagation Algorithm

[Flowchart: Start → Initialize Weights → Enter Pattern (X and D) → Calculate Y → Calculate Z → Calculate δ_k → Calculate δ_j → Update Weights → Last Pattern? → Calculate the Pattern Error → E < ε? → Stop or next iteration]
Initialize Weights

Weights are initialized randomly using a normal distribution:

V =
  -0.4677  -0.8608  -0.2339  -0.0867
  -0.1249   0.7847  -1.0570  -1.4694
   1.4790   0.3086  -0.2841   0.1922

W =
  -0.8223  -0.2883
  -0.0942   0.3501
   0.3362  -1.8359
  -0.9047   1.0360
Enter Patterns

iteration = 1, pattern = 1

Training data (inputs x3 x2 x1, outputs d2 d1):

x3 x2 x1 | d2 d1
 0  0  0 |  0  1
 0  0  1 |  1  0
 0  1  0 |  1  0
 0  1  1 |  1  0
 1  0  0 |  1  0
 1  0  1 |  1  0
 1  1  0 |  1  0
 1  1  1 |  0  1

First pattern:
X = [0, 0, 0]ᵀ,   D = [0, 1]ᵀ
Calculate Y and Z

Y = f(VᵀX):

Y = f( Vᵀ [0, 0, 0]ᵀ ) = [0.5000, 0.5000, 0.5000, 0.5000]ᵀ

where f(net) = 1 / (1 + e^(−net))

Z = WᵀY = [−0.7425, −0.3690]ᵀ
Calculate δ_k and δ_j

δ_Z = Z − D = [−0.7425, −0.3690]ᵀ − [0, 1]ᵀ = [−0.7425, −1.3690]ᵀ

δ_Y = (1 − Y) .* Y .* (W δ_Z) = [0.2513, −0.1023, 0.5659, −0.1866]ᵀ
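These δ values follow directly from the slide's Y, Z, D, and the initial W; a Python sketch of the element-wise computation:

```python
Y = [0.5, 0.5, 0.5, 0.5]
Z = [-0.7425, -0.3690]
D = [0.0, 1.0]
W = [[-0.8223, -0.2883],
     [-0.0942,  0.3501],
     [ 0.3362, -1.8359],
     [-0.9047,  1.0360]]

# output deltas: dZ = Z - D
dZ = [z - d for z, d in zip(Z, D)]

# hidden deltas: dY_j = y_j (1 - y_j) * sum_k W[j][k] dZ[k]
dY = [Y[j] * (1 - Y[j]) * sum(W[j][k] * dZ[k] for k in range(2))
      for j in range(4)]

print([round(v, 4) for v in dZ])  # [-0.7425, -1.369]
print([round(v, 4) for v in dY])  # [0.2513, -0.1023, 0.5659, -0.1866]
```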
Update Weights

Scalar equation:  w_jk = w_jk − α δ_k y_j
Matrix equation:  W = W − α Y (δ_Z)ᵀ

W =
  -0.7852  -0.2198
  -0.0571   0.4185
   0.3733  -1.7674
  -0.8675   1.1044
Update Weights

Scalar equation:  v_ij = v_ij − α δ_j x_i
Matrix equation:  V = V − α X (δ_Y)ᵀ

V = V − 0.1 [0, 0, 0]ᵀ [0.2513, −0.1023, 0.5659, −0.1866]

Since X = 0 for this pattern, V is unchanged:

V =
  -0.4677  -0.8608  -0.2339  -0.0867
  -0.1249   0.7847  -1.0570  -1.4694
   1.4790   0.3086  -0.2841   0.1922
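The W update shown above can be checked with the outer-product form W = W − α Y (δ_Z)ᵀ (Python sketch; α = 0.1):

```python
alpha = 0.1
Y  = [0.5, 0.5, 0.5, 0.5]
dZ = [-0.7425, -1.3690]
W  = [[-0.8223, -0.2883],
      [-0.0942,  0.3501],
      [ 0.3362, -1.8359],
      [-0.9047,  1.0360]]

# W <- W - alpha * Y (dZ)^T : subtract the outer product of Y and dZ, scaled by alpha
W_new = [[W[j][k] - alpha * Y[j] * dZ[k] for k in range(2)] for j in range(4)]
for row in W_new:
    print([round(v, 4) for v in row])   # matches the slide's updated W to ~4 decimals
```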
Calculate the Pattern Error

E₁ = (1 / (2×2)) [ (−0.7425 − 0)² + (−0.3690 − 1)² ] = 0.7787
Pattern 2

Enter Pattern:  X = [0, 0, 1]ᵀ,   D = [1, 0]ᵀ

Calculate Output:
Y = [0.8144, 0.5765, 0.4294, 0.5479]ᵀ
Z = [−0.9874, −0.0916]ᵀ
Pattern 2

Calculate the gradient components and update the weights:

W =
  -0.6233  -0.2123
   0.0575   0.4238
   0.4587  -1.7635
  -0.7586   1.1094

V =
  -0.4677  -0.8608  -0.2339  -0.0867
  -0.1249   0.7847  -1.0570  -1.4694
   1.4551   0.3068  -0.2699   0.1520
Next patterns
• Pattern 3
• Enter pattern; calculate output; calculate gradient; update weights
• Pattern 4
• Enter pattern; calculate output; calculate gradient; update weights
• Pattern 5
• Enter pattern; calculate output; calculate gradient; update weights
• Pattern 6
• Enter pattern; calculate output; calculate gradient; update weights
• Pattern 7
• Enter pattern; calculate output; calculate gradient; update weights
Pattern 8

Enter Pattern:  X = [1, 1, 1]ᵀ,   D = [0, 1]ᵀ

Calculate Output:
Y = [0.6829, 0.5704, 0.1858, 0.1798]ᵀ
Z = [0.0393, −0.0055]ᵀ
Pattern 8

Calculate the gradient components and update the weights W and V as before.
Backpropagation Algorithm: Error Check

After the last pattern, calculate the pattern error:

E_p = (1/2n) Σ_{k=1}^{n} (z_k − d_k)²

If E < ε, stop; otherwise continue with the next iteration.
Is E = 0.095 < 0.001?  No → go to the next iteration.
Next Iteration

Iteration = 2, Pattern = 1

X = [0, 0, 0]ᵀ,   D = [0, 1]ᵀ
Y = [0.5000, 0.5000, 0.5000, 0.5000]ᵀ
Z = [0.0821, −0.1085]ᵀ
Next Iteration

δ_Zᵀ = [0.0821, −1.1085]

W =
  -0.2896  -0.0707
   0.3702   0.5514
   0.6391  -1.6747
  -0.5718   1.1988

V is unchanged (X = 0 for this pattern).
Next patterns
• Pattern 2
• Enter pattern; calculate output; calculate gradient; update weights
• Pattern 3
• Enter pattern; calculate output; calculate gradient; update weights
• Pattern 4
• Enter pattern; calculate output; calculate gradient; update weights
• Pattern 5
• Enter pattern; calculate output; calculate gradient; update weights
• Pattern 6
• Enter pattern; calculate output; calculate gradient; update weights
• Pattern 7
• Enter pattern; calculate output; calculate gradient; update weights
• Pattern 8
• Enter pattern; calculate output; calculate gradient; update weights
Is E < 0.001?  No → go to the next iteration.
Final Results

After 1500 iterations, the network outputs Z closely match the desired outputs:

D =
  0 1 1 1 1 1 1 0
  1 0 0 0 0 0 0 1

[Figure: error vs. iteration; the error falls from about 0.14 to near 0 over 1500 iterations]
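Putting it all together: a compact Python sketch of the whole algorithm on the lecture's truth table. The random initial weights, α = 0.3, bias-free layers, and sigmoid output units are assumptions for illustration (the slides use their own normal draws and layer details), so the final numbers will not match the slides exactly; the point is the same qualitative behavior, a falling error curve.

```python
import math, random

def sigmoid(n):
    return 1.0 / (1.0 + math.exp(-n))

# the lecture's truth table: inputs (x3, x2, x1) -> targets (d2, d1)
X = [(0,0,0), (0,0,1), (0,1,0), (0,1,1), (1,0,0), (1,0,1), (1,1,0), (1,1,1)]
D = [(0,1), (1,0), (1,0), (1,0), (1,0), (1,0), (1,0), (0,1)]

# assumption: random initial weights, sigmoid on both layers, no biases
random.seed(0)
V = [[random.gauss(0, 1) for _ in range(4)] for _ in range(3)]   # 3 inputs -> 4 hidden
W = [[random.gauss(0, 1) for _ in range(2)] for _ in range(4)]   # 4 hidden -> 2 outputs
alpha = 0.3

def predict(x):
    y = [sigmoid(sum(V[i][j] * x[i] for i in range(3))) for j in range(4)]
    z = [sigmoid(sum(W[j][k] * y[j] for j in range(4))) for k in range(2)]
    return y, z

def mean_error():
    return sum(0.5 * sum((zk - dk) ** 2 for zk, dk in zip(predict(x)[1], d))
               for x, d in zip(X, D)) / len(X)

E_start = mean_error()
for _ in range(1500):                      # iteration loop
    for x, d in zip(X, D):                 # pattern loop
        y, z = predict(x)
        dZ = [(z[k] - d[k]) * z[k] * (1 - z[k]) for k in range(2)]
        dY = [y[j] * (1 - y[j]) * sum(W[j][k] * dZ[k] for k in range(2))
              for j in range(4)]
        for j in range(4):
            for k in range(2):
                W[j][k] -= alpha * dZ[k] * y[j]     # output-layer update
        for i in range(3):
            for j in range(4):
                V[i][j] -= alpha * dY[j] * x[i]     # hidden-layer update
E_end = mean_error()
print(E_start, E_end)   # the error decreases substantially over training
```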