Backpropagation
Introduction
The backpropagation algorithm is used to train artificial neural networks by adjusting the weights based on the gradient of the error. It consists of four steps:
• Forward Propagation: computing the outputs layer by layer.
• Error Computation: finding the difference between the actual and predicted outputs.
• Backward Propagation: propagating the error back through the network to obtain the error signal at each layer.
• Weight Update: adjusting each weight in proportion to its error signal.
Figure 1: A fully connected two-layer network. Inputs $x_1, \dots, x_P$ feed $N$ hidden neurons with activations $g_1(\cdot), \dots, g_N(\cdot)$, which in turn feed $M$ output neurons with activations $\bar{g}_1(\cdot), \dots, \bar{g}_M(\cdot)$. Weight $w_{ji}$ connects input $i$ to hidden neuron $j$; weight $\bar{w}_{kj}$ connects hidden neuron $j$ to output neuron $k$.
Notation:
Weight Matrices:

Weight Matrix from Input to Hidden Layer
\[
W_1 =
\begin{bmatrix}
w_{11} & \dots & w_{1i} & \dots & w_{1P} \\
\vdots & \ddots & \vdots & \ddots & \vdots \\
w_{j1} & \dots & w_{ji} & \dots & w_{jP} \\
\vdots & \ddots & \vdots & \ddots & \vdots \\
w_{N1} & \dots & w_{Ni} & \dots & w_{NP}
\end{bmatrix},
\quad W_1 \in \mathbb{R}^{N \times P}
\tag{1}
\]
where $w_{ji}$ represents the weight connecting the $i$-th input to the $j$-th hidden neuron.

Weight Matrix from Hidden to Output Layer
\[
W_2 =
\begin{bmatrix}
\bar{w}_{11} & \dots & \bar{w}_{1j} & \dots & \bar{w}_{1N} \\
\vdots & \ddots & \vdots & \ddots & \vdots \\
\bar{w}_{k1} & \dots & \bar{w}_{kj} & \dots & \bar{w}_{kN} \\
\vdots & \ddots & \vdots & \ddots & \vdots \\
\bar{w}_{M1} & \dots & \bar{w}_{Mj} & \dots & \bar{w}_{MN}
\end{bmatrix},
\quad W_2 \in \mathbb{R}^{M \times N}
\tag{2}
\]
where $\bar{w}_{kj}$ represents the weight connecting the $j$-th hidden neuron to the $k$-th output neuron.
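As a quick sanity check on this indexing convention, here is a minimal NumPy sketch; the sizes P, N, M are illustrative placeholders, not values from the text:

```python
import numpy as np

P, N, M = 3, 4, 2            # inputs, hidden neurons, outputs (illustrative sizes)
W1 = np.random.randn(N, P)   # W1[j, i] = w_ji : input i -> hidden neuron j
W2 = np.random.randn(M, N)   # W2[k, j] = w-bar_kj : hidden neuron j -> output neuron k

# Row k of W2 holds all the weights feeding output neuron k,
# so W2 @ phi directly produces the M x 1 output-layer input.
```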
Input Vector
\[
X = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_P \end{bmatrix},
\quad X \in \mathbb{R}^{P \times 1}
\tag{3}
\]
Actual Output Vector
\[
Y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_M \end{bmatrix},
\quad Y \in \mathbb{R}^{M \times 1}
\tag{4}
\]
Predicted Output Vector
\[
\hat{Y} = \begin{bmatrix} \hat{y}_1 \\ \hat{y}_2 \\ \vdots \\ \hat{y}_M \end{bmatrix},
\quad \hat{Y} \in \mathbb{R}^{M \times 1}
\tag{5}
\]
Forward Propagation:
\[
H_1 = W_1 X, \qquad H_1 \in \mathbb{R}^{N \times 1}
\tag{6}
\]
\[
\phi = g(H_1), \qquad \phi \in \mathbb{R}^{N \times 1}
\tag{7}
\]
\[
H_2 = W_2 \phi, \qquad H_2 \in \mathbb{R}^{M \times 1}
\tag{8}
\]
\[
\hat{Y} = \bar{g}(H_2), \qquad \hat{Y} \in \mathbb{R}^{M \times 1}
\tag{9}
\]
where $g(\cdot)$ and $\bar{g}(\cdot)$ are applied element-wise.
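A minimal NumPy sketch of equations (6)–(9), assuming sigmoid activations in both layers as in Example 1 below; other choices of $g$ and $\bar{g}$ only change the `sigmoid` calls:

```python
import numpy as np

def sigmoid(z):
    """Element-wise logistic sigmoid 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(W1, W2, X):
    """Forward propagation, eqs. (6)-(9); returns the intermediates needed later."""
    H1 = W1 @ X          # (6) hidden-layer input,  N x 1
    phi = sigmoid(H1)    # (7) hidden-layer output, N x 1
    H2 = W2 @ phi        # (8) output-layer input,  M x 1
    Y_hat = sigmoid(H2)  # (9) network output,      M x 1
    return H1, phi, H2, Y_hat
```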
Backward Propagation:
\[
E = Y - \hat{Y}, \qquad \delta_2 = E \odot \bar{g}'(H_2), \qquad \delta_2 \in \mathbb{R}^{M \times 1}
\tag{10}
\]
\[
\delta_1 = (W_2^T \delta_2) \odot g'(H_1), \qquad \delta_1 \in \mathbb{R}^{N \times 1}
\tag{11}
\]
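Continuing the sketch above, the error signals of equations (10)–(11). For the sigmoid, $g'(H) = g(H)(1 - g(H))$, so the derivatives can be formed directly from the already-computed activations:

```python
def backward(W2, phi, Y_hat, Y):
    """Backward propagation, eqs. (10)-(11), assuming sigmoid activations."""
    E = Y - Y_hat                                 # (10) output error, M x 1
    delta2 = E * Y_hat * (1.0 - Y_hat)            # (10) E ⊙ g'(H2), using g' = g(1-g)
    delta1 = (W2.T @ delta2) * phi * (1.0 - phi)  # (11) error signal at hidden layer
    return delta1, delta2
```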
Weight Updates:
\[
W_2^{\text{new}} = W_2 + \eta\, \delta_2 \phi^T
\tag{12}
\]
\[
W_1^{\text{new}} = W_1 + \eta\, \delta_1 X^T
\tag{13}
\]
where $\eta$ is the learning rate.
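Putting the pieces together, a single training step combining the `forward` and `backward` helpers from the sketches above with updates (12)–(13):

```python
def train_step(W1, W2, X, Y, eta=0.1):
    """One forward/backward pass followed by the weight updates (12)-(13)."""
    H1, phi, H2, Y_hat = forward(W1, W2, X)
    delta1, delta2 = backward(W2, phi, Y_hat, Y)
    W2 = W2 + eta * delta2 @ phi.T   # (12) M x N update
    W1 = W1 + eta * delta1 @ X.T     # (13) N x P update
    return W1, W2, Y_hat

# Illustrative usage: repeat the step until the error is small enough.
# for epoch in range(1000):
#     W1, W2, Y_hat = train_step(W1, W2, X, Y, eta=0.1)
```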
Example 1: A neural network is shown in Fig. 2, where $g(\cdot)$ is the sigmoid activation function. Perform the following tasks:
1. Compute the network output $\hat{Y} = [\hat{y}_1 \;\; \hat{y}_2]^T$ for the given input in Fig. 2.
2. Compute the error using the Mean Squared Error (MSE) loss if the actual output is $Y = [0.5 \;\; 0.7]^T$.
3. Update the weights shown in Fig. 2 using the gradient descent approach with a learning rate of $\eta = 0.1$.
Figure 2: A 2-2-2 network with inputs $x_1 = 0.6$ and $x_2 = 0.9$, sigmoid activations $g(\cdot)$ in both layers, and outputs $\hat{y}_1$, $\hat{y}_2$. The edge weights are the entries of $W_1$ and $W_2$ listed under Given Data below.
Solution:
Given Data
• Input Vector:
\[
X = \begin{bmatrix} 0.6 \\ 0.9 \end{bmatrix}
\]
• Weight Matrix from Input to Hidden Layer:
\[
W_1 = \begin{bmatrix} 0.2 & 0.5 \\ 0.3 & 0.7 \end{bmatrix}
\]
• Weight Matrix from Hidden to Output Layer:
\[
W_2 = \begin{bmatrix} 0.6 & 0.9 \\ 0.4 & 0.2 \end{bmatrix}
\]
Step 1: Compute Hidden Layer Input
\[
H_1 = W_1 X = \begin{bmatrix} 0.2 & 0.5 \\ 0.3 & 0.7 \end{bmatrix} \begin{bmatrix} 0.6 \\ 0.9 \end{bmatrix} = \begin{bmatrix} 0.57 \\ 0.81 \end{bmatrix}
\]
Step 2: Compute Hidden Layer Output
\[
\phi = g(H_1) = \frac{1}{1 + e^{-H_1}} = \begin{bmatrix} 0.638 \\ 0.692 \end{bmatrix}
\]
Step 3: Compute Output Layer Input
\[
H_2 = W_2 \phi = \begin{bmatrix} 0.6 & 0.9 \\ 0.4 & 0.2 \end{bmatrix} \begin{bmatrix} 0.638 \\ 0.692 \end{bmatrix} = \begin{bmatrix} 1.006 \\ 0.394 \end{bmatrix}
\]
Step 4: Compute Network Output
\[
\hat{Y} = \bar{g}(H_2) = \frac{1}{1 + e^{-H_2}} = \begin{bmatrix} 0.733 \\ 0.597 \end{bmatrix}
\]
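Steps 1–4 can be reproduced numerically with the `forward` sketch from earlier (values agree with the text up to rounding of the intermediates):

```python
X = np.array([[0.6], [0.9]])
W1 = np.array([[0.2, 0.5], [0.3, 0.7]])
W2 = np.array([[0.6, 0.9], [0.4, 0.2]])

H1, phi, H2, Y_hat = forward(W1, W2, X)
print(H1.ravel())     # [0.57 0.81]                 (Step 1)
print(phi.ravel())    # ≈ [0.638 0.692]             (Step 2)
print(Y_hat.ravel())  # ≈ [0.733 0.597], rounded    (Steps 3-4)
```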
Step 5: Compute Error
\[
E = Y - \hat{Y} = \begin{bmatrix} 0.5 - 0.733 \\ 0.7 - 0.597 \end{bmatrix} = \begin{bmatrix} -0.233 \\ 0.103 \end{bmatrix},
\qquad \text{MSE} = \frac{(-0.233)^2 + (0.103)^2}{2} \approx 0.0324
\]
Step 6: Compute Error Signals
Since $g$ is the sigmoid, $g'(H) = g(H)\,(1 - g(H))$, so
\[
\delta_2 = E \odot \hat{Y} \odot (1 - \hat{Y}) = \begin{bmatrix} -0.0457 \\ 0.0248 \end{bmatrix},
\qquad
\delta_1 = (W_2^T \delta_2) \odot \phi \odot (1 - \phi) = \begin{bmatrix} -0.0040 \\ -0.0070 \end{bmatrix}
\]
Step 7: Update the Weights
1. Update $W_2$:
\[
W_2^{\text{new}} = W_2 + \eta\, \delta_2 \phi^T = \begin{bmatrix} 0.6 & 0.9 \\ 0.4 & 0.2 \end{bmatrix} + 0.1 \begin{bmatrix} -0.0457 \\ 0.0248 \end{bmatrix} \begin{bmatrix} 0.638 & 0.692 \end{bmatrix}
\]
\[
W_2^{\text{new}} = \begin{bmatrix} 0.597 & 0.897 \\ 0.402 & 0.203 \end{bmatrix}
\]
2. Update $W_1$:
\[
W_1^{\text{new}} = W_1 + \eta\, \delta_1 X^T = \begin{bmatrix} 0.2 & 0.5 \\ 0.3 & 0.7 \end{bmatrix} + 0.1 \begin{bmatrix} -0.0040 \\ -0.0070 \end{bmatrix} \begin{bmatrix} 0.6 & 0.9 \end{bmatrix}
\]
\[
W_1^{\text{new}} = \begin{bmatrix} 0.1998 & 0.4996 \\ 0.2996 & 0.6993 \end{bmatrix}
\]
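The remaining steps can likewise be checked with the earlier `backward` sketch; the last printed digits may differ slightly from the text because the worked example rounds its intermediate values:

```python
Y = np.array([[0.5], [0.7]])

mse = np.mean((Y - Y_hat) ** 2)               # ≈ 0.032  (Step 5)
delta1, delta2 = backward(W2, phi, Y_hat, Y)
print(delta2.ravel())                          # ≈ [-0.046  0.025]  (Step 6)

W2_new = W2 + 0.1 * delta2 @ phi.T            # update (12)
W1_new = W1 + 0.1 * delta1 @ X.T              # update (13)
print(W2_new)  # ≈ [[0.597 0.897] [0.402 0.202]]
print(W1_new)  # ≈ [[0.1998 0.4996] [0.2995 0.6993]]
```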
Final Updated Weights
• Updated $W_1$:
\[
W_1^{\text{new}} = \begin{bmatrix} 0.1998 & 0.4996 \\ 0.2996 & 0.6993 \end{bmatrix}
\]
• Updated $W_2$:
\[
W_2^{\text{new}} = \begin{bmatrix} 0.597 & 0.897 \\ 0.402 & 0.203 \end{bmatrix}
\]