Neural Network Learning Rules
Presented By: Suman, Asstt. Prof., S.S.I.E.T., Dera Bassi
Neural Network
[Figure: model of a neuron with inputs $x_1, \ldots, x_n$, weights $w_1, \ldots, w_n$, and output $y = f(x, w)$]
Neural Network Learning Rules
A neuron is considered to be an adaptive element.
Its weights are modifiable depending on the input signal it receives, its output value, and the associated teacher response.
In some cases the teacher signal is not available and no error information can be used; the neuron then modifies its weights based only on the input and/or output. This is the case of unsupervised learning.
Types of Learning Rules
Hebbian
Perceptron
Delta
Widrow-Hoff
Correlation
Winner-take-all
Outstar
General Learning Rule
The weight vector $w_i = [w_{i1}, w_{i2}, \ldots, w_{in}]^T$ increases in proportion to the product of the input $x$ and the learning signal $r$.
The learning signal $r$ is in general a function of $w_i$, $x$, and sometimes of the teacher's signal $d_i$:
$r = r(w_i, x, d_i)$
The increment of the weight vector $w_i$ produced by the learning step at time $t$ follows the general rule
$\Delta w_i = c \, r(w_i, x, d_i) \, x$
where $c$ is the learning constant.
Hebbian Learning Rule
For the Hebbian rule the learning signal is the neuron's output, $r = f(w_i^T x)$, so the weight increment is
$\Delta w_i = c \, f(w_i^T x) \, x$
The single weight is adapted using the increment
$\Delta w_{ij} = c \, f(w_i^T x) \, x_j$
This can be written as
$\Delta w_{ij} = c \, o_i \, x_j$
The learning rule requires weight initialization at small values around 0 prior to learning.
The rule states that if the cross product $o_i x_j$ of output and input (the correlation term) is positive, this results in an increase of the weight $w_{ij}$; otherwise the weight decreases.
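As a minimal sketch of this update (the function name, default learning constant, and NumPy usage are illustrative assumptions, not from the slides), one Hebbian step can be written as:

```python
import numpy as np

def hebbian_update(w, x, c=1.0, f=np.sign):
    # One Hebbian step: delta_w = c * f(w^T x) * x.
    # f is the activation; np.sign gives the discrete (bipolar binary) case.
    net = w @ x
    return w + c * f(net) * x
```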
Example
Assume the network shown in the figure, with $c = 1$, $f(net) = \mathrm{sgn}(net)$, initial weight vector
$w^1 = [1, -1, 0, 0.5]^T$
and input vectors
$x^1 = [1, -2, 1.5, 0]^T$, $x^2 = [1, -0.5, -2, -1.5]^T$, $x^3 = [0, 1, -1, 1.5]^T$
Step 1
$net^1 = w^{1T} x^1 = [1 \ {-1} \ 0 \ 0.5] \, [1, -2, 1.5, 0]^T = 3$
The updated weights are
$w^2 = w^1 + \mathrm{sgn}(net^1) \, x^1 = w^1 + x^1$
$w^2 = [1, -1, 0, 0.5]^T + [1, -2, 1.5, 0]^T = [2, -3, 1.5, 0.5]^T$
Step 2
The learning step with input $x^2$:
$net^2 = w^{2T} x^2 = [2 \ {-3} \ 1.5 \ 0.5] \, [1, -0.5, -2, -1.5]^T = -0.25$
The updated weights are
$w^3 = w^2 + \mathrm{sgn}(net^2) \, x^2 = w^2 - x^2$
$w^3 = [2, -3, 1.5, 0.5]^T - [1, -0.5, -2, -1.5]^T = [1, -2.5, 3.5, 2]^T$
Step 3
For input $x^3$ we obtain
$net^3 = w^{3T} x^3 = [1 \ {-2.5} \ 3.5 \ 2] \, [0, 1, -1, 1.5]^T = -3$
The updated weights are
$w^4 = w^3 + \mathrm{sgn}(net^3) \, x^3 = w^3 - x^3$
$w^4 = [1, -2.5, 3.5, 2]^T - [0, 1, -1, 1.5]^T = [1, -3.5, 4.5, 0.5]^T$
It can be seen that learning with discrete $f(net)$ and $c = 1$ results in adding or subtracting the entire input pattern vectors to and from the weight vector.
In the case of a continuous $f(net)$, the weight incrementing/decrementing vector is scaled down to a fractional value of the input pattern vector.
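The three steps above can be reproduced numerically; the sketch below (variable names are illustrative) uses the example's initial weights and inputs:

```python
import numpy as np

w = np.array([1.0, -1.0, 0.0, 0.5])          # initial weight vector w^1
xs = [np.array([1.0, -2.0, 1.5, 0.0]),       # x^1
      np.array([1.0, -0.5, -2.0, -1.5]),     # x^2
      np.array([0.0, 1.0, -1.0, 1.5])]       # x^3

for step, x in enumerate(xs, start=1):
    net = w @ x
    w = w + np.sign(net) * x                 # c = 1, f(net) = sgn(net)
    print(f"step {step}: net = {net:+.2f}, w = {w}")
# Prints net = +3.00, -0.25, -3.00 and the weight
# vectors w^2, w^3, w^4 computed above.
```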
Example with Bipolar Continuous Activation
Solving the previous example with the bipolar continuous activation function, inputs $x^i$, and initial weight $w^1$:
$f(net) = \dfrac{2}{1 + e^{-net}} - 1$
Step 1
$f(net^1) = f(3) = 0.905$
$w^2 = w^1 + f(net^1) \, x^1 = [1.905, -2.81, 1.357, 0.5]^T$
Step 2
$f(net^2) = -0.077$
$w^3 = w^2 + f(net^2) \, x^2 = [1.828, -2.772, 1.512, 0.616]^T$
Step 3
$net^3 = w^{3T} x^3 = -3.36$
$f(net^3) = -0.933$
$w^4 = w^3 + f(net^3) \, x^3 = [1.828, -3.704, 2.445, -0.783]^T$
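Under the same assumptions as the previous sketch, the continuous case only swaps the activation function:

```python
import numpy as np

def f_bipolar(net):
    # Bipolar continuous activation: f(net) = 2 / (1 + e^(-net)) - 1
    return 2.0 / (1.0 + np.exp(-net)) - 1.0

w = np.array([1.0, -1.0, 0.0, 0.5])
xs = [np.array([1.0, -2.0, 1.5, 0.0]),
      np.array([1.0, -0.5, -2.0, -1.5]),
      np.array([0.0, 1.0, -1.0, 1.5])]

for x in xs:
    w = w + f_bipolar(w @ x) * x   # c = 1; increments are now fractional
print(np.round(w, 3))              # [ 1.828 -3.704  2.445 -0.783]
```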
Perceptron Learning Rule
For the perceptron learning rule, the learning signal is the difference
$r = d_i - o_i$
where $o_i = \mathrm{sgn}(w_i^T x)$ and $d_i$ is the desired response.
The weight adjustments in this method are obtained as follows:
$\Delta w_i = c \, [d_i - \mathrm{sgn}(w_i^T x)] \, x$
$\Delta w_{ij} = c \, [d_i - \mathrm{sgn}(w_i^T x)] \, x_j$ for $j = 1, 2, \ldots, n$
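A minimal sketch of this update (the function name and default learning constant are illustrative assumptions):

```python
import numpy as np

def perceptron_update(w, x, d, c=0.1):
    # One perceptron step: delta_w = c * (d - sgn(w^T x)) * x.
    # Weights change only when the output disagrees with d.
    o = np.sign(w @ x)
    return w + c * (d - o) * x
```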
Example
Step 1
With learning constant $c = 0.1$, input $x^1 = [1, -2, 0, -1]^T$, desired response $d_1 = -1$, and actual output $o_1 = +1$:
$w^2 = w^1 + 0.1 \, (-1 - 1) \, x^1$
Therefore
$w^2 = [1, -1, 0, 0.5]^T - 0.2 \, [1, -2, 0, -1]^T = [0.8, -0.6, 0, 0.7]^T$
Step 2
Input is $x^2 = [0, 1.5, -0.5, -1]^T$, desired output is $d_2 = -1$. For the present vector $w^2$ the activation value is
$net^2 = w^{2T} x^2 = [0.8 \ {-0.6} \ 0 \ 0.7] \, [0, 1.5, -0.5, -1]^T = -1.6$
Since $o_2 = \mathrm{sgn}(-1.6) = -1 = d_2$, no correction is needed and $w^3 = w^2$.
Step 3
Input is $x^3 = [-1, 1, 0.5, -1]^T$, desired output is $d_3 = 1$. Here
$o_3 = \mathrm{sgn}(net^3) = \mathrm{sgn}(-2.1) = -1 \neq d_3$
The updated weight values are
$w^4 = w^3 + 0.1 \, (1 - (-1)) \, x^3$
$w^4 = [0.8, -0.6, 0, 0.7]^T + 0.2 \, [-1, 1, 0.5, -1]^T = [0.6, -0.4, 0.1, 0.5]^T$
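The whole example can be replayed with the sketch below; the desired responses $(-1, -1, +1)$ are inferred from the worked numbers, not stated explicitly on the slides:

```python
import numpy as np

w = np.array([1.0, -1.0, 0.0, 0.5])
xs = [np.array([1.0, -2.0, 0.0, -1.0]),
      np.array([0.0, 1.5, -0.5, -1.0]),
      np.array([-1.0, 1.0, 0.5, -1.0])]
ds = [-1, -1, 1]                       # inferred desired responses

for step, (x, d) in enumerate(zip(xs, ds), start=1):
    o = np.sign(w @ x)
    w = w + 0.1 * (d - o) * x          # c = 0.1
    print(f"step {step}: o = {o:+.0f}, w = {w}")
# Step 2 leaves w unchanged (o = d = -1);
# the final weights are [0.6, -0.4, 0.1, 0.5].
```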
Delta Learning Rule
The delta learning rule is valid only for continuous activation functions, in the supervised training mode.
The learning signal for this rule is called delta and is defined as follows:
$r = [d_i - f(w_i^T x)] \, f'(w_i^T x)$
The term $f'(w_i^T x)$ is the derivative of the activation function $f(net)$ computed for $net = w_i^T x$.
Delta Learning Rule
The learning rule can be readily derived from the condition of least squared error between $o_i$ and $d_i$.
Calculating the gradient vector, with respect to $w_i$, of the squared error defined as
$E = \frac{1}{2} (d_i - o_i)^2$
that is,
$E = \frac{1}{2} [d_i - f(w_i^T x)]^2$
we obtain the error gradient vector
$\nabla E = -(d_i - o_i) \, f'(w_i^T x) \, x$
Since minimization of the error requires the weight change in the negative gradient direction, the adjustment is
$\Delta w_i = c \, (d_i - o_i) \, f'(net_i) \, x$
For the single weight the adjustment becomes
$\Delta w_{ij} = c \, (d_i - o_i) \, f'(net_i) \, x_j$ for $j = 1, 2, \ldots, n$
The weight adjustment is thus computed based on minimization of the squared error.
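A minimal sketch of one delta-rule step, assuming the bipolar continuous activation used in the examples (the function names and default $c$ are illustrative):

```python
import numpy as np

def f_bipolar(net):
    # Bipolar continuous activation: f(net) = 2 / (1 + e^(-net)) - 1
    return 2.0 / (1.0 + np.exp(-net)) - 1.0

def delta_update(w, x, d, c=0.1):
    # One delta-rule step: delta_w = c * (d - o) * f'(net) * x,
    # where f'(net) = 0.5 * (1 - o**2) for the bipolar sigmoid.
    net = w @ x
    o = f_bipolar(net)
    f_prime = 0.5 * (1.0 - o**2)
    return w + c * (d - o) * f_prime * x
```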
Example
The input vectors are
$x^1 = [1, -2, 0, -1]^T$, $x^2 = [0, 1.5, -0.5, -1]^T$, $x^3 = [-1, 1, 0.5, -1]^T$
and the initial weight vector is
$w^1 = [1, -1, 0, 0.5]^T$
Step 1
With $c = 0.1$ and $d_1 = -1$: $net^1 = w^{1T} x^1 = 2.5$, $o_1 = f(net^1) = 0.848$, and $f'(net^1) = \frac{1}{2} [1 - (o_1)^2] = 0.140$, so
$w^2 = c \, (d_1 - o_1) \, f'(net^1) \, x^1 + w^1 = [0.974, -0.948, 0, 0.526]^T$
Step 2
Input is vector $x^2$; the weight vector is $w^2$:
$net^2 = w^{2T} x^2 = -1.948$
$o_2 = f(net^2) = -0.75$
$f'(net^2) = \frac{1}{2} [1 - (o_2)^2] = 0.218$
$w^3 = c \, (d_2 - o_2) \, f'(net^2) \, x^2 + w^2$
Step 3
Input is vector $x^3$; the weight vector is $w^3$:
$net^3 = w^{3T} x^3 = -2.46$
$o_3 = f(net^3) = -0.842$
$f'(net^3) = \frac{1}{2} [1 - (o_3)^2] = 0.145$
$w^4 = c \, (d_3 - o_3) \, f'(net^3) \, x^3 + w^3$
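The three steps can be checked numerically; the desired responses $(-1, -1, +1)$ and $c = 0.1$ are inferred from the worked numbers:

```python
import numpy as np

w = np.array([1.0, -1.0, 0.0, 0.5])
xs = [np.array([1.0, -2.0, 0.0, -1.0]),
      np.array([0.0, 1.5, -0.5, -1.0]),
      np.array([-1.0, 1.0, 0.5, -1.0])]
ds = [-1, -1, 1]                       # inferred desired responses

for step, (x, d) in enumerate(zip(xs, ds), start=1):
    net = w @ x
    o = 2.0 / (1.0 + np.exp(-net)) - 1.0
    w = w + 0.1 * (d - o) * 0.5 * (1.0 - o**2) * x
    print(f"step {step}: net = {net:.3f}, o = {o:.3f}, w = {np.round(w, 3)}")
# step 1: net = 2.500,  o = 0.848,  w = [0.974 -0.948  0.     0.526]
# step 2: net = -1.948, o = -0.750
# step 3: net = -2.460
```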
Widrow-Hoff Learning Rule
For the Widrow-Hoff learning rule the weight adjustment is
$\Delta w_i = c \, (d_i - w_i^T x) \, x$
or, for the single weight, the adjustment is
$\Delta w_{ij} = c \, (d_i - w_i^T x) \, x_j$
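A minimal sketch of this rule (it coincides with the delta rule when the activation is the identity, $f(net) = net$, so $f'(net) = 1$; the function name and default $c$ are illustrative):

```python
import numpy as np

def widrow_hoff_update(w, x, d, c=0.1):
    # One Widrow-Hoff (LMS) step: delta_w = c * (d - w^T x) * x.
    # Equivalent to the delta rule with f(net) = net.
    return w + c * (d - w @ x) * x
```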