Backpropagation

The document provides instructions for implementing the Backpropagation Algorithm in a two-layer fully connected feedforward neural network, detailing the processes of forward propagation, error computation, and backward propagation. It includes mathematical formulations for weight matrices, input vectors, and activation functions, along with a step-by-step example demonstrating the computation of network output, error, and weight updates. The course is part of EE-241 (Neural Networks and Fuzzy Systems) taught by Pankaj K. Mishra at NIT Hamirpur, India.


Instruction for implementation of the Backpropagation Algorithm

Course: EE-241 (Neural Networks and Fuzzy Systems)


Instructor: Pankaj K. Mishra, NIT Hamirpur, India

Introduction
The Backpropagation Algorithm is used to train artificial neural networks by adjusting weights based on the gradient
of the error. It consists of:
• Forward Propagation: Computing outputs layer by layer.
• Error Computation: Finding the difference between the actual and predicted outputs.
• Backward Propagation: Propagating the error backward to compute the error signal at each layer.
• Weight Update: Adjusting the weights by gradient descent to minimize the error.
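For reference, the four steps above can be written as a single training iteration. The NumPy sketch below is illustrative only: the function name train_step, the use of sigmoid activations in both layers, and the column-vector shapes are assumptions, not part of the handout.

import numpy as np

def sigmoid(z):
    # element-wise logistic function
    return 1.0 / (1.0 + np.exp(-z))

def train_step(W1, W2, X, Y, eta):
    # Forward propagation: compute outputs layer by layer
    H1 = W1 @ X                      # hidden-layer input (N x 1)
    phi = sigmoid(H1)                # hidden-layer output
    H2 = W2 @ phi                    # output-layer input (M x 1)
    Y_hat = sigmoid(H2)              # predicted output
    # Error computation
    E = Y - Y_hat
    # Backward propagation: error signals for each layer
    delta2 = E * Y_hat * (1.0 - Y_hat)
    delta1 = (W2.T @ delta2) * phi * (1.0 - phi)
    # Weight update by gradient descent
    W2 = W2 + eta * (delta2 @ phi.T)
    W1 = W1 + eta * (delta1 @ X.T)
    return W1, W2, Y_hat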

[Figure omitted: the diagram shows an input layer of P neurons $x_1, \dots, x_P$, a hidden layer of N neurons with activations $g_1(\cdot), \dots, g_N(\cdot)$, and an output layer of M neurons with activations $\bar{g}_1(\cdot), \dots, \bar{g}_M(\cdot)$, fully connected by weights $w_{ji}$ (input to hidden) and $\bar{w}_{kj}$ (hidden to output).]

Figure 1: Two-Layer Fully Connected Feedforward Neural Network

Consider a two-layer fully connected feedforward neural network with:

• Input Layer: P neurons


• Hidden Layer: N neurons
• Output Layer: M neurons

Notation:

Weight Matrices:

Weight Matrix from Input to Hidden Layer


\[
W_1 = \begin{bmatrix}
w_{11} & \cdots & w_{1i} & \cdots & w_{1P} \\
\vdots & \ddots & \vdots & \ddots & \vdots \\
w_{j1} & \cdots & w_{ji} & \cdots & w_{jP} \\
\vdots & \ddots & \vdots & \ddots & \vdots \\
w_{N1} & \cdots & w_{Ni} & \cdots & w_{NP}
\end{bmatrix}, \quad W_1 \in \mathbb{R}^{N \times P} \tag{1}
\]

where $w_{ji}$ represents the weight connecting the $i$-th input neuron to the $j$-th hidden neuron.

Weight Matrix from Hidden to Output Layer
\[
W_2 = \begin{bmatrix}
\bar{w}_{11} & \cdots & \bar{w}_{1j} & \cdots & \bar{w}_{1N} \\
\vdots & \ddots & \vdots & \ddots & \vdots \\
\bar{w}_{k1} & \cdots & \bar{w}_{kj} & \cdots & \bar{w}_{kN} \\
\vdots & \ddots & \vdots & \ddots & \vdots \\
\bar{w}_{M1} & \cdots & \bar{w}_{Mj} & \cdots & \bar{w}_{MN}
\end{bmatrix}, \quad W_2 \in \mathbb{R}^{M \times N} \tag{2}
\]

where $\bar{w}_{kj}$ represents the weight connecting the $j$-th hidden neuron to the $k$-th output neuron.

Input Vector
\[
X = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_P \end{bmatrix}, \quad X \in \mathbb{R}^{P \times 1} \tag{3}
\]

Actual Output Vector
\[
Y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_M \end{bmatrix}, \quad Y \in \mathbb{R}^{M \times 1} \tag{4}
\]

Predicted Output Vector
\[
\hat{Y} = \begin{bmatrix} \hat{y}_1 \\ \hat{y}_2 \\ \vdots \\ \hat{y}_M \end{bmatrix}, \quad \hat{Y} \in \mathbb{R}^{M \times 1} \tag{5}
\]
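As a quick implementation note, these quantities can be allocated directly in NumPy with the shapes defined above. The sketch below is an assumption-laden example: the sizes P = 3, N = 4, M = 2 and the random initialization are arbitrary choices for illustration.

import numpy as np

P, N, M = 3, 4, 2                    # arbitrary example sizes
rng = np.random.default_rng(0)

W1 = rng.standard_normal((N, P))     # input-to-hidden weights, W1 in R^(N x P)
W2 = rng.standard_normal((M, N))     # hidden-to-output weights, W2 in R^(M x N)
X  = rng.standard_normal((P, 1))     # input vector, X in R^(P x 1)
Y  = rng.standard_normal((M, 1))     # actual (target) output vector, Y in R^(M x 1)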

Forward Propagation:

Compute Hidden Layer Input


\[
H_1 = W_1 X = \begin{bmatrix} h_1 \\ \vdots \\ h_j \\ \vdots \\ h_N \end{bmatrix}, \quad H_1 \in \mathbb{R}^{N \times 1} \tag{6}
\]
Apply Activation Function at Hidden Layer
\[
\phi = \begin{bmatrix} g_1(h_1) \\ \vdots \\ g_j(h_j) \\ \vdots \\ g_N(h_N) \end{bmatrix} = g(H_1), \quad \phi \in \mathbb{R}^{N \times 1} \tag{7}
\]
where common activation functions are:
• Sigmoid: $g(H_1) = \dfrac{1}{1 + e^{-H_1}}$, with $g'(H_1) = g(H_1) \odot (1 - g(H_1))$
• Tanh: $g(H_1) = \tanh(H_1)$, with $g'(H_1) = 1 - g(H_1) \odot g(H_1)$
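A possible NumPy realization of these activation functions and their derivatives is sketched below; the function names are assumptions, and NumPy's exp and tanh already act element-wise, which matches the ⊙ (element-wise product) notation.

import numpy as np

def sigmoid(H):
    return 1.0 / (1.0 + np.exp(-H))

def sigmoid_prime(H):
    g = sigmoid(H)
    return g * (1.0 - g)        # g(H) ⊙ (1 - g(H))

def tanh_prime(H):
    g = np.tanh(H)
    return 1.0 - g * g          # 1 - g(H) ⊙ g(H)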
Compute Output Layer Input
\[
H_2 = W_2 \phi = \begin{bmatrix} \bar{h}_1 \\ \vdots \\ \bar{h}_k \\ \vdots \\ \bar{h}_M \end{bmatrix}, \quad H_2 \in \mathbb{R}^{M \times 1} \tag{8}
\]
Apply Activation Function at Output Layer
\[
\hat{Y} = \begin{bmatrix} \bar{g}_1(\bar{h}_1) \\ \vdots \\ \bar{g}_k(\bar{h}_k) \\ \vdots \\ \bar{g}_M(\bar{h}_M) \end{bmatrix} = \bar{g}(H_2), \quad \hat{Y} \in \mathbb{R}^{M \times 1} \tag{9}
\]
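Equations (6)-(9) correspond to one forward pass. A minimal NumPy sketch is given below, assuming sigmoid activations at both layers; the function name forward and the returned tuple are illustrative choices.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(W1, W2, X):
    H1 = W1 @ X           # Eq. (6): hidden-layer input
    phi = sigmoid(H1)     # Eq. (7): hidden-layer activation
    H2 = W2 @ phi         # Eq. (8): output-layer input
    Y_hat = sigmoid(H2)   # Eq. (9): predicted output
    return H1, phi, H2, Y_hat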

Backward Propagation:

Compute Error Signal at Output Layer

\[
E = Y - \hat{Y}, \qquad \delta_2 = E \odot \bar{g}'(H_2), \qquad \delta_2 \in \mathbb{R}^{M \times 1} \tag{10}
\]

Compute Hidden Layer Error


\[
E_1 = W_2^T \delta_2, \qquad \delta_1 = E_1 \odot g'(H_1) \tag{11}
\]
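A sketch of Eqs. (10)-(11) in NumPy, assuming sigmoid activations so that ḡ'(H2) = Ŷ ⊙ (1 − Ŷ) and g'(H1) = φ ⊙ (1 − φ); the helper name error_signals is an assumption.

import numpy as np

def error_signals(Y, Y_hat, phi, W2):
    E = Y - Y_hat                        # output error
    delta2 = E * Y_hat * (1.0 - Y_hat)   # Eq. (10): output-layer error signal
    E1 = W2.T @ delta2                   # error propagated back to the hidden layer
    delta1 = E1 * phi * (1.0 - phi)      # Eq. (11): hidden-layer error signal
    return delta1, delta2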

Weight Updates:

Update Weights for Hidden to Output Layer

\[
W_2^{\text{new}} = W_2^{\text{old}} + \eta\, \delta_2 \phi^T \tag{12}
\]

Update Weights for Input to Hidden Layer

\[
W_1^{\text{new}} = W_1^{\text{old}} + \eta\, \delta_1 X^T \tag{13}
\]
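A minimal sketch of the update rules (12)-(13); it assumes delta2, phi, delta1, and X are column vectors so that delta2 @ phi.T and delta1 @ X.T are the outer products appearing in the equations.

import numpy as np

def update_weights(W1, W2, delta1, delta2, phi, X, eta):
    W2_new = W2 + eta * (delta2 @ phi.T)   # Eq. (12)
    W1_new = W1 + eta * (delta1 @ X.T)     # Eq. (13)
    return W1_new, W2_new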

Example 1: A neural network is shown in Fig. 2, where $g(\cdot)$ is the sigmoid activation function. Perform the following tasks:

1. Compute the network output $\hat{Y} = [\hat{y}_1 \;\; \hat{y}_2]^T$ for the given input in Fig. 2.
2. Compute the error using the Mean Squared Error (MSE) loss if the actual output is $Y = [0.5 \;\; 0.7]^T$.
3. Update the weights shown in Fig. 2 using the gradient descent approach with a learning rate of $\eta = 0.1$.

[Figure omitted: a 2-2-2 network with inputs $x_1 = 0.6$ and $x_2 = 0.9$, sigmoid activations $g(\cdot)$ at every hidden and output neuron, outputs $\hat{y}_1$ and $\hat{y}_2$, and the connection weights listed in $W_1$ and $W_2$ below.]

Figure 2: Network Architecture with Weights

Solution:

Given Data

• Input Vector:
\[
X = \begin{bmatrix} 0.6 \\ 0.9 \end{bmatrix}
\]
• Weight Matrix from Input to Hidden Layer:
\[
W_1 = \begin{bmatrix} 0.2 & 0.5 \\ 0.3 & 0.7 \end{bmatrix}
\]
• Weight Matrix from Hidden to Output Layer:
\[
W_2 = \begin{bmatrix} 0.6 & 0.9 \\ 0.4 & 0.2 \end{bmatrix}
\]
• Target Output Vector:
\[
Y = \begin{bmatrix} 0.5 \\ 0.7 \end{bmatrix}
\]
• Activation Function: Sigmoid, $g(z) = \dfrac{1}{1 + e^{-z}}$
• Learning Rate: $\eta = 0.1$

Step 1: Compute Hidden Layer Input
\[
H_1 = W_1 X = \begin{bmatrix} 0.2 & 0.5 \\ 0.3 & 0.7 \end{bmatrix} \begin{bmatrix} 0.6 \\ 0.9 \end{bmatrix} = \begin{bmatrix} 0.57 \\ 0.81 \end{bmatrix}
\]
Step 2: Apply Activation Function at Hidden Layer
\[
\phi = g(H_1) = \frac{1}{1 + e^{-H_1}} = \begin{bmatrix} 0.638 \\ 0.692 \end{bmatrix}
\]
Step 3: Compute Output Layer Input
\[
H_2 = W_2 \phi = \begin{bmatrix} 0.6 & 0.9 \\ 0.4 & 0.2 \end{bmatrix} \begin{bmatrix} 0.638 \\ 0.692 \end{bmatrix} = \begin{bmatrix} 1.006 \\ 0.393 \end{bmatrix}
\]
Step 4: Apply Activation Function at Output Layer
\[
\hat{Y} = \bar{g}(H_2) = \begin{bmatrix} 0.733 \\ 0.597 \end{bmatrix}
\]
Step 5: Compute Error
\[
E = Y - \hat{Y} = \begin{bmatrix} 0.5 \\ 0.7 \end{bmatrix} - \begin{bmatrix} 0.733 \\ 0.597 \end{bmatrix} = \begin{bmatrix} -0.233 \\ 0.103 \end{bmatrix}
\]
The MSE loss requested in task 2 is
\[
\text{MSE} = \frac{1}{M} \sum_{k=1}^{M} e_k^2 = \frac{(-0.233)^2 + (0.103)^2}{2} \approx 0.032
\]
Step 6: Compute Output Layer Error Signal
\[
\delta_2 = E \odot \bar{g}'(H_2) = E \odot \hat{Y} \odot (1 - \hat{Y}) = \begin{bmatrix} -0.233 \\ 0.103 \end{bmatrix} \odot \begin{bmatrix} 0.733 \\ 0.597 \end{bmatrix} \odot \begin{bmatrix} 0.267 \\ 0.403 \end{bmatrix} = \begin{bmatrix} -0.0457 \\ 0.0248 \end{bmatrix}
\]
Step 7: Compute Hidden Layer Error Signal
\[
E_1 = W_2^T \delta_2 = \begin{bmatrix} 0.6 & 0.4 \\ 0.9 & 0.2 \end{bmatrix} \begin{bmatrix} -0.0457 \\ 0.0248 \end{bmatrix} = \begin{bmatrix} -0.0175 \\ -0.0362 \end{bmatrix}
\]
\[
\delta_1 = E_1 \odot g'(H_1) = E_1 \odot \phi \odot (1 - \phi) = \begin{bmatrix} -0.0175 \\ -0.0362 \end{bmatrix} \odot \begin{bmatrix} 0.638 \\ 0.692 \end{bmatrix} \odot \begin{bmatrix} 0.362 \\ 0.308 \end{bmatrix} = \begin{bmatrix} -0.0040 \\ -0.0077 \end{bmatrix}
\]
Step 8: Update Weights

1. Update $W_2$:
\[
W_2^{\text{new}} = W_2 + \eta\, \delta_2 \phi^T = \begin{bmatrix} 0.6 & 0.9 \\ 0.4 & 0.2 \end{bmatrix} + 0.1 \begin{bmatrix} -0.0457 \\ 0.0248 \end{bmatrix} \begin{bmatrix} 0.638 & 0.692 \end{bmatrix} = \begin{bmatrix} 0.5971 & 0.8968 \\ 0.4016 & 0.2017 \end{bmatrix}
\]
2. Update $W_1$:
\[
W_1^{\text{new}} = W_1 + \eta\, \delta_1 X^T = \begin{bmatrix} 0.2 & 0.5 \\ 0.3 & 0.7 \end{bmatrix} + 0.1 \begin{bmatrix} -0.0040 \\ -0.0077 \end{bmatrix} \begin{bmatrix} 0.6 & 0.9 \end{bmatrix} = \begin{bmatrix} 0.1998 & 0.4996 \\ 0.2995 & 0.6993 \end{bmatrix}
\]
Final Updated Weights

• Updated $W_1$:
\[
W_1^{\text{new}} = \begin{bmatrix} 0.1998 & 0.4996 \\ 0.2995 & 0.6993 \end{bmatrix}
\]
• Updated $W_2$:
\[
W_2^{\text{new}} = \begin{bmatrix} 0.5971 & 0.8968 \\ 0.4016 & 0.2017 \end{bmatrix}
\]
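As a check, Example 1 can be reproduced with the short NumPy script below (a sketch, not part of the original handout); the printed values should match the hand computation above up to rounding.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X  = np.array([[0.6], [0.9]])
Y  = np.array([[0.5], [0.7]])
W1 = np.array([[0.2, 0.5], [0.3, 0.7]])
W2 = np.array([[0.6, 0.9], [0.4, 0.2]])
eta = 0.1

# Forward propagation (Steps 1-4)
H1 = W1 @ X
phi = sigmoid(H1)
H2 = W2 @ phi
Y_hat = sigmoid(H2)

# Error, MSE, and error signals (Steps 5-7)
E = Y - Y_hat
mse = float(np.mean(E ** 2))
delta2 = E * Y_hat * (1 - Y_hat)
delta1 = (W2.T @ delta2) * phi * (1 - phi)

# Weight updates (Step 8)
W2_new = W2 + eta * (delta2 @ phi.T)
W1_new = W1 + eta * (delta1 @ X.T)

print("Y_hat =", Y_hat.ravel())
print("MSE =", mse)
print("W1_new =", W1_new, sep="\n")
print("W2_new =", W2_new, sep="\n")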
