AM NETWORKS
PATTERN ASSOCIATION
Associating patterns which are
• similar,
• contrary,
• in close proximity (spatial),
• in close succession (temporal).
Associative recall
$$\Delta W(p) = s^T(p)\,t(p) = \begin{bmatrix} s_1 \\ \vdots \\ s_n \end{bmatrix} \begin{bmatrix} t_1 & \cdots & t_m \end{bmatrix} = \begin{bmatrix} s_1 t_1 & \cdots & s_1 t_m \\ \vdots & & \vdots \\ s_n t_1 & \cdots & s_n t_m \end{bmatrix} = \begin{bmatrix} \Delta w_{11} & \cdots & \Delta w_{1m} \\ \vdots & & \vdots \\ \Delta w_{n1} & \cdots & \Delta w_{nm} \end{bmatrix}$$
and

$$W = \sum_{p=1}^{P} s^T(p)\,t(p)$$
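As a minimal sketch of this weight construction (assuming NumPy; hebbian_weights is an illustrative name, not from the original):

```python
import numpy as np

def hebbian_weights(S, T):
    """W = sum over p of s^T(p) t(p), the sum of outer products.

    S: (P, n) array of input patterns; T: (P, m) array of target patterns.
    """
    # 'pi,pj->ij' accumulates s_i(p) * t_j(p) over all pairs p
    return np.einsum('pi,pj->ij', S, T)
```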
HETERO-ASSOCIATIVE MEMORY NETWORK
Binary pattern pairs s:t with |s| = 4 and |t| = 2.
Total weighted input to output units:

$$y\_in_j = \sum_i x_i w_{ij}$$

Activation function (threshold):

$$y_j = \begin{cases} 1 & \text{if } y\_in_j > 0 \\ 0 & \text{if } y\_in_j \le 0 \end{cases}$$
Weights are computed by the Hebbian rule (sum of outer products of all training pairs):

$$W = \sum_{p=1}^{P} s^T(p)\,t(p), \quad \text{i.e. } w_{ij} = \sum_{p=1}^{P} s_i(p)\,t_j(p)$$

Training samples:
s(p) t(p)
p=1 (1 0 0 0) (1, 0)
p=2 (1 1 0 0) (1, 0)
p=3 (0 0 0 1) (0, 1)
p=4 (0 0 1 1) (0, 1)
COMPUTING THE WEIGHTS
$$s^T(1)\,t(1) = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} \begin{bmatrix} 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} \qquad s^T(2)\,t(2) = \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix} \begin{bmatrix} 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}$$

$$s^T(3)\,t(3) = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix} \begin{bmatrix} 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix} \qquad s^T(4)\,t(4) = \begin{bmatrix} 0 \\ 0 \\ 1 \\ 1 \end{bmatrix} \begin{bmatrix} 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 1 \\ 0 & 1 \end{bmatrix}$$
$$W = \begin{bmatrix} 2 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 2 \end{bmatrix}$$
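A quick numerical check of this result, reusing the hebbian_weights sketch above:

```python
import numpy as np

S = np.array([[1, 0, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 1]])   # the four input patterns s(p)
T = np.array([[1, 0],
              [1, 0],
              [0, 1],
              [0, 1]])         # the four target patterns t(p)

W = hebbian_weights(S, T)      # -> [[2, 0], [1, 0], [0, 1], [0, 2]]
```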
TEST/RECALL THE NETWORK

$$x = (1\ 0\ 0\ 0): \quad \begin{bmatrix} 1 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 2 \end{bmatrix} = (2\ 0) \;\Rightarrow\; y_1 = 1,\ y_2 = 0$$

$$x = (0\ 1\ 1\ 0): \quad \begin{bmatrix} 0 & 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 2 \end{bmatrix} = (1\ 1) \;\Rightarrow\; y_1 = 1,\ y_2 = 1$$

The second input overlaps both stored classes, so both output units fire.
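Continuing the sketch above (W as computed there), recall is a weighted sum followed by the threshold; the recall name is illustrative:

```python
def recall(W, x):
    """Hetero-associative recall: weighted sum, then the 0/1 threshold."""
    y_in = x @ W                      # y_in_j = sum_i x_i w_ij
    return (y_in > 0).astype(int)     # y_j = 1 if y_in_j > 0, else 0

recall(W, np.array([1, 0, 0, 0]))     # -> [1, 0]
recall(W, np.array([0, 1, 1, 0]))     # -> [1, 1]
```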
AUTO-ASSOCIATIVE MEMORY NETWORK
Storing the single bipolar pattern s = (1 1 1 −1) gives W = s^T s:

$$W = \begin{bmatrix} 1 & 1 & 1 & -1 \\ 1 & 1 & 1 & -1 \\ 1 & 1 & 1 & -1 \\ -1 & -1 & -1 & 1 \end{bmatrix}$$

training pattern: (1 1 1 −1)·W = (4 4 4 −4) → (1 1 1 −1)
noisy pattern: (−1 1 1 −1)·W = (2 2 2 −2) → (1 1 1 −1)
missing info: (0 0 1 −1)·W = (2 2 2 −2) → (1 1 1 −1)
more noise: (−1 −1 1 −1)·W = (0 0 0 0): not recognized
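The same experiment as a sketch (bipolar threshold approximated with np.sign, whose zero output marks the "not recognized" case):

```python
import numpy as np

s = np.array([1, 1, 1, -1])
W = np.outer(s, s)                          # W = s^T s

def auto_recall(W, x):
    return np.sign(x @ W)                   # bipolar threshold

auto_recall(W, np.array([ 1,  1, 1, -1]))   # -> [ 1  1  1 -1]
auto_recall(W, np.array([-1,  1, 1, -1]))   # -> [ 1  1  1 -1]
auto_recall(W, np.array([ 0,  0, 1, -1]))   # -> [ 1  1  1 -1]
auto_recall(W, np.array([-1, -1, 1, -1]))   # -> [ 0  0  0  0], not recognized
```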
AUTO-ASSOCIATIVE MEMORY NETWORK – DIAGONAL ELEMENTS
Diagonal elements (each equal to the number of stored patterns P) will dominate the computation when multiple patterns are stored. When P is large, W approaches P times the identity matrix; the output then simply reproduces the input, which may not be any stored pattern, and the pattern-correction power is lost.
Remedy: replace the diagonal elements by zero.
$$W_0 = \begin{bmatrix} 0 & 1 & 1 & -1 \\ 1 & 0 & 1 & -1 \\ 1 & 1 & 0 & -1 \\ -1 & -1 & -1 & 0 \end{bmatrix}$$
(1 1 1 −1)·W_0 = (3 3 3 −3) → (1 1 1 −1)
(−1 1 1 −1)·W_0 = (3 1 1 −1) → (1 1 1 −1)
(0 0 1 −1)·W_0 = (2 2 1 −1) → (1 1 1 −1)
(−1 −1 1 −1)·W_0 = (1 1 −1 1) → wrong
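Continuing the auto-associative sketch above, zeroing the diagonal is one line in NumPy:

```python
W0 = np.outer(s, s)
np.fill_diagonal(W0, 0)                      # remove self-connections w_ii

auto_recall(W0, np.array([ 0,  0, 1, -1]))   # -> [ 1  1  1 -1]
auto_recall(W0, np.array([-1, -1, 1, -1]))   # -> [ 1  1 -1  1], wrong
```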
STORAGE CAPACITY
Number of patterns that can be correctly stored and recalled by a network.
More patterns can be stored if they are not similar to each other (e.g., orthogonal).
Non-orthogonal: storing the two patterns (1 −1 −1 1) and (1 1 −1 1) gives

$$W_0 = \begin{bmatrix} 0 & 0 & -2 & 2 \\ 0 & 0 & 0 & 0 \\ -2 & 0 & 0 & -2 \\ 2 & 0 & -2 & 0 \end{bmatrix}$$

(1 −1 −1 1)·W_0 = (4 0 −4 4) → (1 0 −1 1): the first pattern is not stored correctly.
Orthogonal: storing the three mutually orthogonal patterns (1 1 −1 −1), (−1 1 1 −1), (−1 1 −1 1) gives

$$W_0 = \begin{bmatrix} 0 & -1 & -1 & -1 \\ -1 & 0 & -1 & -1 \\ -1 & -1 & 0 & -1 \\ -1 & -1 & -1 & 0 \end{bmatrix}$$

All three patterns can be correctly recalled.
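A small sketch for checking candidate pattern sets (pairwise_dots is an illustrative helper): the Gram matrix has zero off-diagonal entries exactly when the patterns are mutually orthogonal.

```python
import numpy as np

def pairwise_dots(patterns):
    """Gram matrix of the patterns; zero off-diagonals = mutually orthogonal."""
    P = np.array(patterns)
    return P @ P.T

pairwise_dots([(1, -1, -1, 1), (1, 1, -1, 1)])
# off-diagonal entries are 2: not orthogonal
pairwise_dots([(1, 1, -1, -1), (-1, 1, 1, -1), (-1, 1, -1, 1)])
# off-diagonal entries are 0: orthogonal
```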
BIDIRECTIONAL ASSOCIATIVE MEMORY (BAM) NETWORK
Architecture:
• Two layers of non-linear units: X-layer, Y-layer.
• Units: discrete threshold or continuous sigmoid (can be either binary or bipolar).
Weights:
$$W_{n \times m} = \sum_{p=1}^{P} s^T(p)\,t(p) \quad \text{(Hebbian / outer product)}$$
Symmetric: $w_{ij} = w_{ji}$
Convert binary patterns to bipolar when constructing W.
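A minimal sketch of BAM construction and one bidirectional update (names illustrative; np.sign(0) = 0 stands in for "keep the previous state", which a full implementation would handle explicitly):

```python
import numpy as np

def bam_weights(S, T):
    """W = sum_p s^T(p) t(p) over bipolar pairs (convert binary pairs first)."""
    return np.einsum('pi,pj->ij', S, T)

def bam_step(W, x):
    """One bidirectional pass: X drives Y through W, Y drives X through W^T."""
    y = np.sign(x @ W)
    x = np.sign(y @ W.T)
    return x, y
```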
RECALL OF BAM NETWORK
Recall is bidirectional: start either from X (to recall Y) or from Y (to recall X).
Recurrent:

$$y(t) = [\,f(y\_in_1(t)), \ldots, f(y\_in_m(t))\,], \quad \text{where } y\_in_j(t) = \sum_{i=1}^{n} w_{ij}\,x_i(t-1)$$
HOPFIELD NETWORK
Weights are symmetric, with no self-connections:

$$w_{ij} = w_{ji}, \qquad w_{ii} = 0$$

Hopfield's observation: $P \approx 0.15\,n$, i.e. $P/n \approx 0.15$.
Theoretical analysis: $P \approx \dfrac{n}{2 \log_2 n}$, i.e. $P/n \approx \dfrac{1}{2 \log_2 n}$.
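The two estimates can be compared numerically, e.g. (a sketch):

```python
import math

def capacity_estimates(n):
    """Hopfield's empirical estimate vs. the theoretical bound."""
    return 0.15 * n, n / (2 * math.log2(n))

capacity_estimates(100)   # -> (15.0, ~7.53): the theoretical bound is tighter
```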
• Internal activation $u_i$:

$$\frac{du_i(t)}{dt} = \sum_{j=1}^{n} w_{ij}\,x_j(t) + \theta_i = net_i(t)$$
Computation: all units change their outputs (states) at the same time, based on the states of all other units.
• Compute net:

$$net_i(t) = \sum_{j=1}^{n} w_{ij}\,x_j(t) + \theta_i$$
Example: $x = (1, 1, 1, -1)$,

$$W = \begin{bmatrix} 0 & 1 & 1 & -1 \\ 1 & 0 & 1 & -1 \\ 1 & 1 & 0 & -1 \\ -1 & -1 & -1 & 0 \end{bmatrix}$$

Output units are threshold units.
Iteration will eventually stop because the total number of distinct states is finite ($3^n$) if threshold units are used. If patterns are continuous, the system may continue to evolve forever (chaos) if no such K exists.
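A sketch of the resulting discrete-time recall loop (synchronous threshold updates, stopping at a fixed point; hopfield_run is an illustrative name):

```python
import numpy as np

def hopfield_run(W, x, max_iters=100):
    """Iterate synchronous threshold updates until the state stops changing."""
    x = np.asarray(x, dtype=float)
    for _ in range(max_iters):
        x_new = np.sign(W @ x)
        x_new[x_new == 0] = x[x_new == 0]   # keep the previous state on ties
        if np.array_equal(x_new, x):
            return x_new                     # fixed point reached
        x = x_new
    return x                                 # no fixed point within max_iters

W0 = np.array([[ 0,  1,  1, -1],
               [ 1,  0,  1, -1],
               [ 1,  1,  0, -1],
               [-1, -1, -1,  0]], dtype=float)
hopfield_run(W0, [1, 1, 1, -1])              # -> array([ 1.,  1.,  1., -1.])
```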