
Next Assignment

Train a counter-propagation network to compute the 7-segment coder and its inverse.

You may use the code in /cs/cs152/book:

counter.c
readme
ART1 Demo

Increasing the vigilance causes the network to be more selective, introducing a new prototype when the fit is not good.

Try different patterns.
Hebbian Learning

Hebb's Postulate

"When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased."

D. O. Hebb, 1949

In other words: when a weight contributes to firing a neuron, the weight is increased. (If the neuron doesn't fire, then it is not.)
Colloquial Corollaries

"Use it or lose it."

Colloquial Corollaries?

"Absence makes the heart grow fonder."
Generalized Hebb Rule

When a weight contributes to firing a neuron, the weight is increased.

When a weight acts to inhibit the firing of a neuron, the weight is decreased.
Flavors of Hebbian Learning

Unsupervised: weights are strengthened by the actual response to a stimulus.

Supervised: weights are strengthened by the desired response.
Unsupervised Hebbian Learning
(aka Associative Learning)
Simple Associative Network

$$a = \mathrm{hardlim}(wp + b) = \mathrm{hardlim}(wp - 0.5)$$

Input:  $p = 1$ (stimulus), $p = 0$ (no stimulus)
Output: $a = 1$ (response), $a = 0$ (no response)
Banana Associator

Didn't Pavlov anticipate this?

Unconditioned stimulus (shape): $p^0 = 1$ (shape detected), $p^0 = 0$ (shape not detected)
Conditioned stimulus (smell):   $p = 1$ (smell detected), $p = 0$ (smell not detected)
Banana Associator Demo

[Demo: the shape and smell stimuli can be toggled on and off.]
Unsupervised Hebb Rule

$$w_{ij}(q) = w_{ij}(q-1) + \alpha\, a_i(q)\, p_j(q)$$

where $p_j(q)$ is the input and $a_i(q)$ is the actual response.

Vector form:

$$\mathbf{W}(q) = \mathbf{W}(q-1) + \alpha\, \mathbf{a}(q)\, \mathbf{p}^T(q)$$

Training sequence: $\mathbf{p}(1), \mathbf{p}(2), \ldots, \mathbf{p}(Q)$
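As a quick illustration, here is a minimal NumPy sketch of this update (the function and variable names are my own):

```python
import numpy as np

def hebb_update(W, p, a, alpha=1.0):
    """Unsupervised Hebb rule: W(q) = W(q-1) + alpha * a(q) p(q)^T."""
    return W + alpha * np.outer(a, p)
```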
Learning Banana Smell

Initial weights: $w^0 = 1$ (unconditioned, shape), $w(0) = 0$ (conditioned, smell)

Training sequence: $\{p^0(1) = 0,\ p(1) = 1\},\ \{p^0(2) = 1,\ p(2) = 1\},\ \ldots$

With $\alpha = 1$:

$$w(q) = w(q-1) + a(q)\, p(q)$$

First iteration (sight fails, smell present):

$$a(1) = \mathrm{hardlim}(w^0 p^0(1) + w(0)\, p(1) - 0.5) = \mathrm{hardlim}(1 \cdot 0 + 0 \cdot 1 - 0.5) = 0 \quad \text{(no banana)}$$

$$w(1) = w(0) + a(1)\, p(1) = 0 + 0 \cdot 1 = 0$$

Example

Second iteration (sight works, smell present):

$$a(2) = \mathrm{hardlim}(w^0 p^0(2) + w(1)\, p(2) - 0.5) = \mathrm{hardlim}(1 \cdot 1 + 0 \cdot 1 - 0.5) = 1 \quad \text{(banana)}$$

$$w(2) = w(1) + a(2)\, p(2) = 0 + 1 \cdot 1 = 1$$

Third iteration (sight fails, smell present):

$$a(3) = \mathrm{hardlim}(w^0 p^0(3) + w(2)\, p(3) - 0.5) = \mathrm{hardlim}(1 \cdot 0 + 1 \cdot 1 - 0.5) = 1 \quad \text{(banana)}$$

$$w(3) = w(2) + a(3)\, p(3) = 1 + 1 \cdot 1 = 2$$

Banana will now be detected if either sensor works.
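The three iterations above can be reproduced with a short script (a sketch; the names are illustrative):

```python
def hardlim(n):
    return 1 if n >= 0 else 0

w0, w, b = 1.0, 0.0, -0.5        # fixed shape weight, adaptive smell weight, bias
shape = [0, 1, 0]                # p0(q): sight fails, works, fails
smell = [1, 1, 1]                # p(q): smell present every time

for q in range(3):
    a = hardlim(w0 * shape[q] + w * smell[q] + b)
    w = w + a * smell[q]         # unsupervised Hebb update with alpha = 1
    print(f"q={q+1}: a={a}, w={w}")   # w goes 0, 1, 2
```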


Problems with the Hebb Rule

Weights can become arbitrarily large.

There is no mechanism for weights to decrease.
Hebb Rule with Decay

$$\mathbf{W}(q) = \mathbf{W}(q-1) + \alpha\, \mathbf{a}(q)\, \mathbf{p}^T(q) - \gamma\, \mathbf{W}(q-1)$$

$$\mathbf{W}(q) = (1 - \gamma)\, \mathbf{W}(q-1) + \alpha\, \mathbf{a}(q)\, \mathbf{p}^T(q)$$

This keeps the weight matrix from growing without bound. The maximum weight can be found by setting both $a_i$ and $p_j$ to 1:

$$w_{ij}^{max} = (1 - \gamma)\, w_{ij}^{max} + \alpha\, a_i\, p_j$$

$$w_{ij}^{max} = (1 - \gamma)\, w_{ij}^{max} + \alpha$$

$$w_{ij}^{max} = \frac{\alpha}{\gamma}$$
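A quick numeric check of this bound (a sketch; the constants are just examples):

```python
alpha, gamma = 1.0, 0.1
w = 0.0
for _ in range(200):             # a_i = p_j = 1 at every iteration
    w = (1 - gamma) * w + alpha
print(w)                         # converges to alpha / gamma = 10.0
```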

Banana Associator with Decay

Example: $\alpha = 1$, $\gamma = 0.1$

First iteration (sight fails, smell present):

$$a(1) = \mathrm{hardlim}(w^0 p^0(1) + w(0)\, p(1) - 0.5) = \mathrm{hardlim}(1 \cdot 0 + 0 \cdot 1 - 0.5) = 0 \quad \text{(no banana)}$$

$$w(1) = w(0) + a(1)\, p(1) - 0.1\, w(0) = 0 + 0 \cdot 1 - 0.1 \cdot 0 = 0$$

Second iteration (sight works, smell present):

$$a(2) = \mathrm{hardlim}(w^0 p^0(2) + w(1)\, p(2) - 0.5) = \mathrm{hardlim}(1 \cdot 1 + 0 \cdot 1 - 0.5) = 1 \quad \text{(banana)}$$

$$w(2) = w(1) + a(2)\, p(2) - 0.1\, w(1) = 0 + 1 \cdot 1 - 0.1 \cdot 0 = 1$$
Example

Third iteration (sight fails, smell present):

$$a(3) = \mathrm{hardlim}(w^0 p^0(3) + w(2)\, p(3) - 0.5) = \mathrm{hardlim}(1 \cdot 0 + 1 \cdot 1 - 0.5) = 1 \quad \text{(banana)}$$

$$w(3) = w(2) + a(3)\, p(3) - 0.1\, w(2) = 1 + 1 \cdot 1 - 0.1 \cdot 1 = 1.9$$
General Decay Demo

[Demo: with no decay the weight grows without bound; with larger decay it saturates at $w_{ij}^{max} = \alpha / \gamma$.]
Problem of Hebb with Decay

Associations will be lost if stimuli are not occasionally presented.

If $a_i = 0$, then

$$w_{ij}(q) = (1 - \gamma)\, w_{ij}(q-1)$$

If $\gamma = 0.1$, this becomes

$$w_{ij}(q) = 0.9\, w_{ij}(q-1)$$

Therefore the weight decays by 10% at each iteration where there is no stimulus.

[Plot: an unreinforced weight decaying from 3 toward 0 over 30 iterations.]
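A two-line check of how fast an unreinforced association dies (a sketch, starting from the plot's initial value of 3):

```python
w = 3.0
for q in range(30):
    w = 0.9 * w              # gamma = 0.1 and a_i = 0 at every step
print(w)                     # ~0.127: the association has essentially vanished
```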
Solution to the Hebb-with-Decay Problem

Don't decay weights when there is no stimulus.

We have seen rules like this before (the instar rule).
Instar (Recognition Network)

Instar Operation

$$a = \mathrm{hardlim}(\mathbf{W}\mathbf{p} + b) = \mathrm{hardlim}({}_1\mathbf{w}^T \mathbf{p} + b)$$

The instar will be active when

$${}_1\mathbf{w}^T \mathbf{p} \ge -b$$

or

$${}_1\mathbf{w}^T \mathbf{p} = \|{}_1\mathbf{w}\|\, \|\mathbf{p}\| \cos\theta \ge -b$$

For normalized vectors, the largest inner product occurs when the angle between the weight vector and the input vector is zero -- the input vector is equal to the weight vector.

The rows of a weight matrix represent patterns to be recognized.
Vector Recognition

If we set

$$b = -\|{}_1\mathbf{w}\|\, \|\mathbf{p}\|$$

the instar will only be active when $\theta = 0$.

If we set

$$b > -\|{}_1\mathbf{w}\|\, \|\mathbf{p}\|$$

the instar will be active for a range of angles.

As $b$ is increased, more patterns (over a wider range of $\theta$) will activate the instar.
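A small sketch of this effect, assuming normalized vectors (the numbers are illustrative):

```python
import numpy as np

w = np.array([1.0, 0.0])                    # normalized weight vector
for deg in (0, 30, 60, 90):
    t = np.radians(deg)
    p = np.array([np.cos(t), np.sin(t)])    # normalized input at angle theta
    for b in (-1.0, -0.5):
        active = int(w @ p + b >= 0)        # hardlim(w^T p + b)
        print(deg, b, active)
# b = -1 accepts only theta = 0; raising b to -0.5 accepts angles out to about 60 degrees
```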
Instar Rule

Hebb with decay:

$$w_{ij}(q) = w_{ij}(q-1) + \alpha\, a_i(q)\, p_j(q) - \gamma\, w_{ij}(q-1)$$

Modify so that learning and forgetting only occur when the neuron is active - the instar rule:

$$w_{ij}(q) = w_{ij}(q-1) + \alpha\, a_i(q)\, p_j(q) - \gamma\, a_i(q)\, w_{ij}(q-1)$$

or, with $\gamma = \alpha$:

$$w_{ij}(q) = w_{ij}(q-1) + \alpha\, a_i(q)\, \big(p_j(q) - w_{ij}(q-1)\big)$$

Vector form:

$${}_i\mathbf{w}(q) = {}_i\mathbf{w}(q-1) + \alpha\, a_i(q)\, \big(\mathbf{p}(q) - {}_i\mathbf{w}(q-1)\big)$$
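A minimal sketch of the instar update and its convergence to the input pattern (names are my own):

```python
import numpy as np

def instar_update(w, p, a_i, alpha=0.5):
    """Instar rule: w moves toward p, but only when the neuron is active."""
    return w + alpha * a_i * (p - w)

w = np.zeros(3)
p = np.array([1.0, -1.0, 1.0])
for _ in range(10):
    w = instar_update(w, p, a_i=1)   # active: w converges toward p
print(w)                             # close to [1, -1, 1]
```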
Graphical Representation

For the case where the instar is active ($a_i = 1$):

$${}_i\mathbf{w}(q) = {}_i\mathbf{w}(q-1) + \alpha\, \big(\mathbf{p}(q) - {}_i\mathbf{w}(q-1)\big) = (1 - \alpha)\, {}_i\mathbf{w}(q-1) + \alpha\, \mathbf{p}(q)$$

The weight vector moves a fraction $\alpha$ of the way toward the input vector.

For the case where the instar is inactive ($a_i = 0$):

$${}_i\mathbf{w}(q) = {}_i\mathbf{w}(q-1)$$
Instar Demo

[Demo: the weight vector moves by $\Delta\mathbf{w}$ toward the input vector.]
Outstar (Recall Network)

Outstar Operation

Suppose we want the outstar to recall a certain pattern $\mathbf{a}^*$ whenever the input $p = 1$ is presented to the network. Let

$$\mathbf{W} = \mathbf{a}^*$$

Then, when $p = 1$,

$$\mathbf{a} = \mathrm{satlins}(\mathbf{W} p) = \mathrm{satlins}(\mathbf{a}^* \cdot 1) = \mathbf{a}^*$$

and the pattern is correctly recalled.

The columns of a weight matrix represent patterns to be recalled.
Outstar Rule

For the instar rule, we made the weight decay term of the Hebb rule proportional to the output of the network. For the outstar rule, we make the weight decay term proportional to the input of the network:

$$w_{ij}(q) = w_{ij}(q-1) + \alpha\, a_i(q)\, p_j(q) - \gamma\, p_j(q)\, w_{ij}(q-1)$$

If we make the decay rate $\gamma$ equal to the learning rate $\alpha$:

$$w_{ij}(q) = w_{ij}(q-1) + \alpha\, \big(a_i(q) - w_{ij}(q-1)\big)\, p_j(q)$$

Vector form:

$$\mathbf{w}_j(q) = \mathbf{w}_j(q-1) + \alpha\, \big(\mathbf{a}(q) - \mathbf{w}_j(q-1)\big)\, p_j(q)$$
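A minimal sketch of this update, mirroring the instar sketch above (names are my own):

```python
import numpy as np

def outstar_update(w_col, a, p_j, alpha=1.0):
    """Outstar rule: column w_j moves toward a, but only when input p_j is present."""
    return w_col + alpha * (a - w_col) * p_j
```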
Example - Pineapple Recall

Definitions:

$$\mathbf{a} = \mathrm{satlins}(\mathbf{W}^0 \mathbf{p}^0 + \mathbf{W} \mathbf{p})$$

$$\mathbf{W}^0 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

$$\mathbf{p}^0 = \begin{bmatrix} \text{shape} \\ \text{texture} \\ \text{weight} \end{bmatrix}, \qquad \mathbf{p}^0_{\text{pineapple}} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$

$$p = \begin{cases} 1 & \text{if a pineapple can be seen} \\ 0 & \text{otherwise} \end{cases}$$
Outstar Demo

Iteration 1

Training sequence ($\alpha = 1$):

$$\left\{ \mathbf{p}^0(1) = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},\ p(1) = 1 \right\}, \quad \left\{ \mathbf{p}^0(2) = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix},\ p(2) = 1 \right\}, \ldots$$

$$\mathbf{a}(1) = \mathrm{satlins}\left( \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \cdot 1 \right) = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \quad \text{(no response)}$$

$$\mathbf{w}_1(1) = \mathbf{w}_1(0) + \big(\mathbf{a}(1) - \mathbf{w}_1(0)\big)\, p(1) = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} + \left( \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} - \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \right) 1 = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$
Convergence

$$\mathbf{a}(2) = \mathrm{satlins}\left( \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \cdot 1 \right) = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \quad \text{(measurements given)}$$

$$\mathbf{w}_1(2) = \mathbf{w}_1(1) + \big(\mathbf{a}(2) - \mathbf{w}_1(1)\big)\, p(2) = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} + \left( \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} - \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \right) 1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$

$$\mathbf{a}(3) = \mathrm{satlins}\left( \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \cdot 1 \right) = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \quad \text{(measurements recalled)}$$

$$\mathbf{w}_1(3) = \mathbf{w}_1(2) + \big(\mathbf{a}(3) - \mathbf{w}_1(2)\big)\, p(3) = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + \left( \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} - \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \right) 1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$
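The whole sequence can be checked with a short script (a sketch using the values above):

```python
import numpy as np

satlins = lambda n: np.clip(n, -1, 1)     # symmetric saturating linear transfer function

W0 = np.eye(3)                            # fixed measurement weights
w1 = np.zeros(3)                          # trainable recall column
seq = [(np.zeros(3), 1), (np.ones(3), 1), (np.zeros(3), 1)]

for q, (p0, p) in enumerate(seq, 1):
    a = satlins(W0 @ p0 + w1 * p)
    w1 = w1 + (a - w1) * p                # outstar rule, alpha = 1
    print(f"q={q}: a={a}, w1={w1}")       # iteration 3 recalls the measurements
```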
Supervised Hebbian Learning

Linear Associator

$$\mathbf{a} = \mathbf{W}\mathbf{p}, \qquad a_i = \sum_{j=1}^{R} w_{ij}\, p_j$$

Training set: $\{\mathbf{p}_1, \mathbf{t}_1\}, \{\mathbf{p}_2, \mathbf{t}_2\}, \ldots, \{\mathbf{p}_Q, \mathbf{t}_Q\}$
Hebb Rule

$$w_{ij}^{new} = w_{ij}^{old} + \alpha\, f_i(a_{iq})\, g_j(p_{jq})$$

(postsynaptic signal times presynaptic signal)

Simplified form, using the actual output:

$$w_{ij}^{new} = w_{ij}^{old} + a_{iq}\, p_{jq}$$

Supervised form, using the desired output:

$$w_{ij}^{new} = w_{ij}^{old} + t_{iq}\, p_{jq}$$

Matrix form:

$$\mathbf{W}^{new} = \mathbf{W}^{old} + \mathbf{t}_q\, \mathbf{p}_q^T$$
Batch Operation

$$\mathbf{W} = \mathbf{t}_1 \mathbf{p}_1^T + \mathbf{t}_2 \mathbf{p}_2^T + \cdots + \mathbf{t}_Q \mathbf{p}_Q^T = \sum_{q=1}^{Q} \mathbf{t}_q \mathbf{p}_q^T \quad \text{(zero initial weights)}$$

Matrix form:

$$\mathbf{W} = \begin{bmatrix} \mathbf{t}_1 & \mathbf{t}_2 & \cdots & \mathbf{t}_Q \end{bmatrix} \begin{bmatrix} \mathbf{p}_1^T \\ \mathbf{p}_2^T \\ \vdots \\ \mathbf{p}_Q^T \end{bmatrix} = \mathbf{T}\mathbf{P}^T$$

where $\mathbf{T} = [\mathbf{t}_1\ \mathbf{t}_2\ \cdots\ \mathbf{t}_Q]$ and $\mathbf{P} = [\mathbf{p}_1\ \mathbf{p}_2\ \cdots\ \mathbf{p}_Q]$.
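In code the batch rule is a single matrix product (a sketch; the columns of P and T hold the patterns and targets):

```python
import numpy as np

def hebb_batch(P, T):
    """Batch supervised Hebb rule: W = T P^T, starting from zero weights."""
    return T @ P.T
```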
Performance Analysis

$$\mathbf{a} = \mathbf{W}\mathbf{p}_k = \left( \sum_{q=1}^{Q} \mathbf{t}_q \mathbf{p}_q^T \right) \mathbf{p}_k = \sum_{q=1}^{Q} \mathbf{t}_q\, (\mathbf{p}_q^T \mathbf{p}_k)$$

Case I: the input patterns are orthonormal.

$$\mathbf{p}_q^T \mathbf{p}_k = 1 \quad (q = k), \qquad \mathbf{p}_q^T \mathbf{p}_k = 0 \quad (q \ne k)$$

Therefore the network output equals the target:

$$\mathbf{a} = \mathbf{W}\mathbf{p}_k = \mathbf{t}_k$$

Case II: the input patterns are normalized but not orthogonal.

$$\mathbf{a} = \mathbf{W}\mathbf{p}_k = \mathbf{t}_k + \sum_{q \ne k} \mathbf{t}_q\, (\mathbf{p}_q^T \mathbf{p}_k)$$

The second term is the error.
Example

Banana and apple prototype patterns (normalized):

$$\mathbf{p}_1 = \begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix} \rightarrow \begin{bmatrix} 0.5774 \\ -0.5774 \\ -0.5774 \end{bmatrix},\ t_1 = -1 \qquad \mathbf{p}_2 = \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} \rightarrow \begin{bmatrix} 0.5774 \\ 0.5774 \\ -0.5774 \end{bmatrix},\ t_2 = 1$$

Weight matrix (Hebb rule):

$$\mathbf{W} = \mathbf{T}\mathbf{P}^T = \begin{bmatrix} -1 & 1 \end{bmatrix} \begin{bmatrix} 0.5774 & -0.5774 & -0.5774 \\ 0.5774 & 0.5774 & -0.5774 \end{bmatrix} = \begin{bmatrix} 0 & 1.1548 & 0 \end{bmatrix}$$

Tests:

Banana: $\mathbf{W}\mathbf{p}_1 = \begin{bmatrix} 0 & 1.1548 & 0 \end{bmatrix} \begin{bmatrix} 0.5774 \\ -0.5774 \\ -0.5774 \end{bmatrix} = -0.6668$

Apple: $\mathbf{W}\mathbf{p}_2 = \begin{bmatrix} 0 & 1.1548 & 0 \end{bmatrix} \begin{bmatrix} 0.5774 \\ 0.5774 \\ -0.5774 \end{bmatrix} = 0.6668$

The patterns are normalized but not orthogonal, so the outputs only approximate the targets $-1$ and $1$.
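The numbers can be verified with NumPy (a sketch; the slide's 0.6668 comes from rounding 0.5774):

```python
import numpy as np

# Verifying the banana/apple Hebb example.
p1 = np.array([1., -1., -1.]); p1 /= np.linalg.norm(p1)
p2 = np.array([1.,  1., -1.]); p2 /= np.linalg.norm(p2)
P = np.column_stack([p1, p2])
T = np.array([[-1., 1.]])

W = T @ P.T                 # Hebb rule
print(W)                    # ~[[0, 1.1547, 0]]
print(W @ p1, W @ p2)       # ~-0.6667 and ~0.6667, near the targets -1 and 1
```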
Pseudoinverse Rule - (1)

Performance index: we want $\mathbf{W}\mathbf{p}_q = \mathbf{t}_q$ for $q = 1, 2, \ldots, Q$, so we minimize the sum of squared errors

$$F(\mathbf{W}) = \sum_{q=1}^{Q} \|\mathbf{t}_q - \mathbf{W}\mathbf{p}_q\|^2$$

Matrix form: $\mathbf{W}\mathbf{P} = \mathbf{T}$, where

$$\mathbf{T} = [\mathbf{t}_1\ \mathbf{t}_2\ \cdots\ \mathbf{t}_Q], \qquad \mathbf{P} = [\mathbf{p}_1\ \mathbf{p}_2\ \cdots\ \mathbf{p}_Q]$$

$$F(\mathbf{W}) = \|\mathbf{T} - \mathbf{W}\mathbf{P}\|^2 = \|\mathbf{E}\|^2, \qquad \|\mathbf{E}\|^2 = \sum_i \sum_j e_{ij}^2$$
Pseudoinverse Rule - (2)

Minimize:

$$F(\mathbf{W}) = \|\mathbf{T} - \mathbf{W}\mathbf{P}\|^2 = \|\mathbf{E}\|^2$$

If an inverse exists for $\mathbf{P}$, $F(\mathbf{W})$ can be made zero:

$$\mathbf{W} = \mathbf{T}\mathbf{P}^{-1}$$

When an inverse does not exist, $F(\mathbf{W})$ can be minimized using the pseudoinverse:

$$\mathbf{W} = \mathbf{T}\mathbf{P}^+, \qquad \mathbf{P}^+ = (\mathbf{P}^T\mathbf{P})^{-1}\mathbf{P}^T$$
Relationship to the Hebb Rule

Hebb rule:

$$\mathbf{W} = \mathbf{T}\mathbf{P}^T$$

Pseudoinverse rule:

$$\mathbf{W} = \mathbf{T}\mathbf{P}^+, \qquad \mathbf{P}^+ = (\mathbf{P}^T\mathbf{P})^{-1}\mathbf{P}^T$$

If the prototype patterns are orthonormal:

$$\mathbf{P}^T\mathbf{P} = \mathbf{I} \quad \Rightarrow \quad \mathbf{P}^+ = (\mathbf{P}^T\mathbf{P})^{-1}\mathbf{P}^T = \mathbf{P}^T$$

and the two rules coincide.
Example

$$\mathbf{p}_1 = \begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix},\ t_1 = -1 \qquad \mathbf{p}_2 = \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix},\ t_2 = 1$$

$$\mathbf{W} = \mathbf{T}\mathbf{P}^+ = \begin{bmatrix} -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ -1 & 1 \\ -1 & -1 \end{bmatrix}^+$$

$$\mathbf{P}^T\mathbf{P} = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix}$$

$$\mathbf{P}^+ = (\mathbf{P}^T\mathbf{P})^{-1}\mathbf{P}^T = \begin{bmatrix} 0.25 & -0.5 & -0.25 \\ 0.25 & 0.5 & -0.25 \end{bmatrix}$$

$$\mathbf{W} = \mathbf{T}\mathbf{P}^+ = \begin{bmatrix} -1 & 1 \end{bmatrix} \begin{bmatrix} 0.25 & -0.5 & -0.25 \\ 0.25 & 0.5 & -0.25 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \end{bmatrix}$$

Tests (the targets are reproduced exactly):

$$\mathbf{W}\mathbf{p}_1 = \begin{bmatrix} 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix} = -1, \qquad \mathbf{W}\mathbf{p}_2 = \begin{bmatrix} 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} = 1$$
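The same result falls out of NumPy's pseudoinverse (a sketch of the computation above):

```python
import numpy as np

P = np.array([[ 1.,  1.],
              [-1.,  1.],
              [-1., -1.]])        # columns are p1 and p2
T = np.array([[-1., 1.]])

W = T @ np.linalg.pinv(P)         # pseudoinverse rule: W = T P^+
print(W)                          # [[0. 1. 0.]]
print(W @ P)                      # [[-1. 1.]]: the targets are matched exactly
```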
Autoassociative Memory

In an autoassociative memory the target equals the input; the prototypes are pixel patterns with elements $\pm 1$, e.g.

$$\mathbf{p}_1 = [1\ 1\ 1\ 1\ 1\ 1\ 1\ 1\ 1\ 1\ 1\ 1\ 1\ 1\ 1\ 1]^T$$

so the Hebb rule gives

$$\mathbf{W} = \mathbf{p}_1\mathbf{p}_1^T + \mathbf{p}_2\mathbf{p}_2^T + \mathbf{p}_3\mathbf{p}_3^T$$
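A sketch of storage and recall, using made-up bipolar patterns and a hard-limit readout (all names and sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
patterns = rng.choice([-1, 1], size=(3, 30))   # three random +-1 patterns

W = sum(np.outer(p, p) for p in patterns)      # W = sum of p_q p_q^T

probe = patterns[0].copy()
probe[:15] = 0                                 # occlude half of the pattern
recalled = np.where(W @ probe >= 0, 1, -1)     # threshold the linear output
print((recalled == patterns[0]).all())         # usually True for few stored patterns
```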
Tests

50% occluded patterns
67% occluded patterns
Noisy patterns (7 pixels)

Supervised Hebbian Demo
Spectrum of Hebbian Learning

Basic supervised rule:

$$\mathbf{W}^{new} = \mathbf{W}^{old} + \mathbf{t}_q \mathbf{p}_q^T$$

Supervised with learning rate:

$$\mathbf{W}^{new} = \mathbf{W}^{old} + \alpha\, \mathbf{t}_q \mathbf{p}_q^T$$

Smoothing (decay toward zero):

$$\mathbf{W}^{new} = \mathbf{W}^{old} + \alpha\, \mathbf{t}_q \mathbf{p}_q^T - \gamma\, \mathbf{W}^{old} = (1 - \gamma)\, \mathbf{W}^{old} + \alpha\, \mathbf{t}_q \mathbf{p}_q^T$$

Delta rule (target minus actual):

$$\mathbf{W}^{new} = \mathbf{W}^{old} + \alpha\, (\mathbf{t}_q - \mathbf{a}_q)\, \mathbf{p}_q^T$$

Unsupervised:

$$\mathbf{W}^{new} = \mathbf{W}^{old} + \alpha\, \mathbf{a}_q \mathbf{p}_q^T$$