Dynamics of Self-Organization in
Complex Adaptive Networks
D. d'Humières^{1,2} and B. A. Huberman^{1}
1. INTRODUCTION
The problem of handling large amounts of redundant data and extracting relevant information from them lies at the heart of both pattern recognition automata and models of the higher brain functions, such as learning and associative memory. As such, they have been the focus of intense efforts aimed at designing algorithms and architectures. Among the many avenues being explored, a promising one resorts to local and parallel computation by arrays of processors with delocalized memories.(1-3)
In spite of all this work, little is known about the dynamics of complex networks and their behavior under general circumstances. Issues such as
Fig. 1. A network with two layers and four cells per layer. The dashed lines show the
feedback path during the adaptive phase.
at time $t + \tau$ to its input and its state at time $t$, i.e., $O_l(t+\tau) = F(S_l(t), I_l(t))$. Each cell $i$ is a device with $n$ afferent inputs stemming from the $n$ outputs of the preceding layer, i.e., $I_l(t) = O_{l-1}(t)$, which in turn produces an output $O_{l,i}(t)$ whose value depends on its internal state $S_{l,i}(t)$, the input values, and the propagation rules given by $F$. The inputs to the first layer are given externally, and the information is transmitted from layer to layer, down to the last layer, which we will call the output layer.
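As an illustration of this layer-to-layer propagation, the following sketch (not from the original paper) pushes an external input through a stack of layers whose cells apply a generic rule F to their state and afferent inputs; the particular form of F used by the authors is specified later (Fig. 2 and the Appendix), so here it is only a placeholder.

```python
import numpy as np

def propagate(layer_states, external_input, F):
    """Synchronous feed-forward pass: the input of layer l is the output
    vector of layer l-1; the first layer receives the external input."""
    I = np.asarray(external_input, dtype=float)   # I_1, given externally
    outputs = []
    for S_l in layer_states:                      # S_l[i]: internal state of cell i of layer l
        O_l = np.array([F(S_i, I) for S_i in S_l])
        outputs.append(O_l)
        I = O_l                                   # O_l becomes the input I_{l+1}
    return outputs                                # outputs[-1] is the output layer

# Placeholder rule F(S_i, I): any function of a cell's state and its n inputs.
# The paper's actual rule (two filters and a rectifying amplifier) comes later.
F = lambda S_i, I: max(0.0, float(np.dot(S_i, I)))
```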
As long as the states of all the cells are fixed there are no fundamental
differences between asynchronous or synchronous networks. This is no
longer true however, if the states are allowed to change in time through an
adaptive process. Since synchronous networks cover a wide range of
applications and are easier to model and simulate than asynchronous ones,
we will restrict our study to them. Moreover, we will assume that the
external inputs of the first layer are stable for long enough to ensure that both the state variables of the network and the output of the last layer are stable. We will also change the external inputs of the first layer at a fixed rate (defined by an external clock with a period $T \gg \tau$). Under these conditions the network can then be sampled at the same clock rate in time intervals $k \cdot T$, which are multiples of $T$. Therefore transients can be ignored and $I_l(t)$, $O_l(t)$, and $S_l(t)$ can then be replaced by $I_l^{(k)}$, $O_l^{(k)}$, and $S_l^{(k)}$, respectively, with
$$I_l^{(k)} = I_l(kT), \qquad O_l^{(k)} = O_l(kT), \qquad S_l^{(k)} = S_l(kT), \qquad k = 1, 2, \ldots \tag{2.1}$$
We should point out that the general network being considered here
has a material connectivity (or wiring) independent of the adaptive process.
The latter only changes the relative strength of the couplings as the input is
changed, thereby producing an effective change in connectivity. This is to
be contrasted with other automata where the rules are such that the wiring
itself is allowed to change with the adaptive process.
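To make the distinction concrete, the wiring can be thought of as a fixed 0/1 mask and the adaptive process as acting only on the coupling strengths; the following is a minimal sketch in my own notation, not the paper's.

```python
import numpy as np

n = 4
rng = np.random.default_rng(0)
wiring = rng.integers(0, 2, size=(n, n)).astype(float)  # material connectivity, fixed once and for all
strengths = np.ones((n, n))                              # coupling strengths, modified by adaptation

def effective_couplings():
    # The adaptive process only rescales the strengths; a strength driven to
    # zero removes a link "effectively" even though the wire is still there.
    return wiring * strengths

strengths[0, 1] = 0.0            # adaptation weakens one coupling
print(effective_couplings())
```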
3 Thus we will consider as irrelevant any modification of the intermediate outputs or of the internal states which does not produce any change in the last output.
(Schematic labels in Fig. 2: inputs I_{1,1}, ..., I_{1,4}; filter 1 with coefficients S_{1,l,h,j}; filter 2 with coefficients S_{2,l,h,j}; nonlinear differential amplifier; output O_{1,1}.)
Fig. 2. Schematic representation of a cell of the network.
outputs of all the cells are fed back, as shown by the dashed lines of Fig. 2.
The outputs from the two filters are then compared by a nonlinear
differential amplifier acting as a rectifier (a threshold device with a fixed
threshold equal to zero). In addition, the output of the second filter added
to 1 sets the inverse of the amplifier gain.
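One plausible reading of this arrangement (an assumption on my part; the exact propagation rule is Eq. (A1) of the Appendix, which is not reproduced in full in this excerpt) is that the cell output is the rectified difference of the two filter outputs, scaled by a gain whose inverse is one plus the second filter's output:

```python
import numpy as np

def cell_output(S1_i, S2_i, I):
    """Sketch of the cell of Fig. 2 under the stated assumption.
    S1_i, S2_i: coefficients of filters 1 and 2 for cell i; I: input vector."""
    U = float(np.dot(S1_i, I))      # output of filter 1 (template match)
    V = float(np.dot(S2_i, I))      # output of filter 2 ("linear power" of the input)
    gain = 1.0 / (1.0 + V)          # the second filter's output plus 1 sets 1/gain
    return gain * max(0.0, U - V)   # rectifier with a fixed threshold at zero
```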
The effect of these filters is determined by the actual adaptive process.
In a pattern recognition automaton, the goal is to increase the distance
between the outputs produced by the different training patterns and to
broaden the equivalence classes associated with them. To achieve this, the
adaptive process is implemented as follows. The output of each cell is
locally compared to outputs from other given cells in the same layer. The
states of the cells producing local maxima are then changed by adding a
part of the input vector to the coefficients of the first filter, thus producing
filters better matched to this input. In this way the first set of filters acts as
a template comparator and is referred to as the excitatory set of connections. At the same time, a constant term related to the average input value
is added to the coefficients of the second filter, leading to a measure of the
"linear power" of all positive inputs, along with a normalization of the
output vectors. This is done in order to make them independent of the
number of times the first filter has been modified by the same input. This
second set of coefficients will be referred to as the inhibitory set of
connections. Finally, the gain of the differential amplifier is set by a
Fig. 3. Actual implementation of the network. All the cells inside the squares act as inputs for the cell at the top of the cones. The insert shows the random set of connections for the cells as indexed by a random permutation. The shaded diamond shows the range of comparison between outputs of the same layer during the adaptive process.
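Before turning to the experiments, the adaptive step just described (local comparison of outputs, reinforcement of the excitatory filter by the input, and of the inhibitory filter by a term tied to the average input value) can be caricatured as follows. The learning constant q and all names are assumptions made for illustration only; the authors' exact rule is given by Eqs. (A3)-(A5) of the Appendix.

```python
import numpy as np

def adapt_layer(S1, S2, I, O, neighborhoods, q=0.1):
    """One illustrative adaptive step for one layer (sketch only).
    S1, S2        : (n_cells, n_inputs) excitatory / inhibitory coefficients.
    I, O          : input and output vectors of the layer.
    neighborhoods : for each cell i, the indices (the sets E_{l,i}) of the
                    cells whose outputs are compared locally with O[i].
    q             : assumed learning constant."""
    for i, E_i in enumerate(neighborhoods):
        if O[i] >= O[E_i].max():       # cell i produces a local maximum
            S1[i] += q * I             # move filter 1 towards the current input
            S2[i] += q * I.mean()      # constant term tied to the average input value
    return S1, S2
```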
4. RESULTS OF ADAPTIVE EXPERIMENTS
With the specifications given above, we studied the dynamics of the network for two training sets of input patterns as a function of the ratio of excitation to inhibition Q_0. The first training set consisted of all the horizontal and vertical full lines (12 dots long) which could be arranged into the square input matrix, whereas the second set was composed of the 26 capital letters A to Z plus the ten digits 0 to 9, arranged in a 9 × 11 matrix within the 12 × 12 array.
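For concreteness, the first training set can be generated as follows (a sketch following the description above: 12 horizontal plus 12 vertical full lines on a 12 × 12 input matrix).

```python
import numpy as np

def line_patterns(size=12):
    """All horizontal and vertical full lines (size dots long) on a size x size matrix."""
    patterns = []
    for r in range(size):                       # horizontal lines
        p = np.zeros((size, size)); p[r, :] = 1.0
        patterns.append(p)
    for c in range(size):                       # vertical lines
        p = np.zeros((size, size)); p[:, c] = 1.0
        patterns.append(p)
    return patterns

assert len(line_patterns()) == 24               # 12 horizontal + 12 vertical
```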
Typically, the maximum and the minimum of the quantities defined by
Eq. (2.6) were measured over many periods (from 60 to 600) of the input
pattern sequence and their decimal logarithm was plotted as a function of
time (in units of pattern sequences). This is shown in Figs. 4-6. The lower
curves correspond to the minimum distance measured for some patterns
and the upper one denotes the maximum distance.
For both sets of patterns, the best convergence properties for the network, as measured by these curves, were found for Q_0 ≈ 2. As expected, the time to reach a fixed point was longer for the more complicated set of input patterns. As Q_0 was decreased or increased away from that value, we found that the convergence of the adaptive process was altered, as shown in Figs. 4-6.
In particular, with the input set composed of lines, we discovered (Fig. 5d) a periodic behavior in a very narrow range of parameter values, i.e., 1.48 < Q_0 < 1.51. It was characterized by rapid oscillations of the distance
Fig. 4. The distance (V, W) as a function of time for a training set of lines. Data obtained for values of Q_0 = 1.2, 1.8, 2.4, and 3.0. The time unit for this figure, and the following two, is defined as the time to process a complete set of input patterns.
Fig. 5. The distance (V, W) as a function of time for a training set of lines. Data obtained for Q_0 = 1.45, 1.47, 1.48, 1.49, and 1.55.
Fig. 6. The distance (V, W) as a function of time for the alphabet training set. Data obtained for Q_0 = 1.2, 2.0, and 2.2.
between zero and small values for several periods of the input sequence.
We should also point out that these periodic and chaotic behaviors were
entangled with regimes for which the network flowed toward a fixed point,
the ranges of existence for each of them being very narrow.
Another interesting phenomenon is illustrated in Fig. 4d. For long
times the network shows monotonic convergence towards a self-organized
state with a simple fixed point, only to start unraveling itself at later times.
Although all these phenomena exist for different input sequences, the exact numerical values of Q_0 associated with them depend on the actual input sequence; see, for example, the adaptive behavior for an alphabetic training set depicted in Fig. 6a.
These results are to be contrasted with dynamical systems with few degrees of freedom, in which the sequences of attractors one observes are both simpler and immune to external perturbations.(11) The reason for this difference seems to be the presence of a few patterns which, during the adaptive process, start producing weaker outputs. As this process continues, a particular pattern ends up producing a zero output vector, thus triggering a bootstrapping procedure to recover from this situation. This in turn produces a change in the distributed memory of the network in such a way as to take it away from its fixed point. One can conclude from this observation that minor perturbations can eventually drive a complex network away from its fixed points.
5. EXPERIMENTS ON THE SELF-ORGANIZED NETWORK
In what follows we will describe experiments which were performed on the network after the adaptive process took place. These experiments used the network in a pattern recognition mode as a probe of its final state, thus studying the filtering properties of the network and how they were related to the training set of patterns.
With the state of the network encoded in its final state arrays S_l, we computed, and stored as templates, all the output patterns O^{(k)} produced by the different input patterns I^{(k)} of the training set. Typical output patterns are shown in Fig. 7; as can be seen, they range from having only one nonzero component (Figs. 7a and 7b) to having several cells with positive values (Figs. 7c and 7d). We should point out that most of the output patterns obtained for all values of Q_0 showed this latter behavior.
For each couple of training input patterns {I^{(k)}, I^{(k')}} we then measured both their mutual distance, as defined by Eq. (2.5), i.e.,
Fig. 7. Input and output patterns for a character and a line. The stars (*) represent the position of the strongest output, the circles (o) the positions of outputs greater than average, and the dots (.) the positions of outputs less than average.
5 The equivalence classes are arbitrarily defined in such a way that the distance between the last pattern in a class and the first one excluded is a maximum.
Table I. The input and output distances between the pattern S and other letters of the alphabet for different values of Q_0. The distances equal to 1 (orthogonal patterns) are omitted for clarity. The numbers with a star correspond to distances between patterns lower at the output than at the input (worse discrimination after processing). The underlined values denote patterns belonging to the same equivalence class.

Pattern  Input   Output at Q_0 = 1.2  1.4  1.6  1.8  2.0  2.2
8 0.067 0.700 0.560 0.065* 0.037* 0.002* 0.001*
9 0.101 0.559 0.623 0.106 0.092* 0.012* 0.010*
6 0.101 0.672 0.344 0.095* 0.051* 0.140 0.016*
B 0.230 0.952 0.828 0.140* 0.080*
C 0.238 0.786 0.760 0.103* 0.267 0.081*
3 0.245 0.811 0.718 0.130* 0.080*
G 0.262 0.781 0.747 0.039* 0.278 0.105*
O 0.273 0.814 0.792 0.153* 0.296 0.108*
5 0.274 0.885 0.771 0.386
Q 0.320 0.800 0.987 0.856 0.282* 0.338 0.183*
D 0.388 0.970 0.794 0.222* 0.068*
2 0.396 0.701 0.808 0.275* 0.201* 0.198*
E 0.397 0.935 0.812 0.480
P 0.444 0.936 0.830
R 0.456 0.993 0.809
F 0.500 0.991
U 0.500 0.967 0.819 0.829 0.800
Z 0.593 0.544
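Eq. (2.5) itself falls outside this excerpt; the distances in Table I, however, lie between 0 and 1 with 1 for mutually orthogonal patterns, which is consistent with a cosine-type measure. The following is therefore only an assumed illustration of how such input and output distances might be computed, not the paper's actual definition.

```python
import numpy as np

def distance(V, W):
    """Assumed cosine-type distance: 0 for parallel vectors, 1 for orthogonal ones.
    (Illustration only; the paper's actual measure is its Eq. (2.5).)"""
    V = np.asarray(V, dtype=float).ravel()
    W = np.asarray(W, dtype=float).ravel()
    return 1.0 - float(np.dot(V, W)) / (np.linalg.norm(V) * np.linalg.norm(W))
```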
or noisy inputs as well. Quite generally, we concluded that the higher the ratio between excitatory and inhibitory connections, the larger the equivalence class and therefore the easier it is for a given input to produce an output correlated with the learnt patterns.
A possible explanation of these results, along with those of the previous section, can be derived from the fact that Q_0 measures the differential gain of the cell and thus the amount of information flowing through the network. For low values of Q_0, only a few elements of the total input information are propagated from layer to layer. This leads to a high network selectivity and a consequent destruction of important elements of the input patterns. On the other hand, for large values of Q_0 most of the information is propagated through the network, leading to a buildup of
6. CONCLUSION
Complex automata are structures situated between the dynamics of systems with few degrees of freedom and the simplifying disorder encountered in many-body systems like gases. As such they pose a special challenge when trying to understand their dynamical properties as a function of given parameters and inputs.
In this paper we have shown that it is indeed possible to study in a
quantitative fashion the dynamics of their self-organization. Through the
introduction of a general methodology, we were able to obtain crisp
information on issues that are central to the understanding of data process-
ing by machines and brains. Also, by performing experiments on a particu-
lar nonlinear adaptive network, we uncovered a rich variety of behaviors
and quantified them as a function of excitation, inhibition, and connectiv-
ity. In that fashion we discovered that in addition to regimes where
asymptotic learning can take place, there exist scenarios characterized by
periodic oscillations and chaos. Moreover, experiments on the recognition
properties of the automaton led to an understanding of the dependence of
equivalence classes on excitation and inhibition. Generally speaking, we
concluded that the higher the ratio of excitation to inhibition, the broader
the equivalence class into which patterns are lumped together. This finding
might be of relevance to both pattern recognition machines and neuro-
biology.
Last but not least, we should mention the issue of universality, i.e., to what extent our results depend on either the particular set of training patterns or the automaton being simulated. Whereas our results indicate that the behavior encountered in this study does not depend on a particular pattern sequence or type, we have only tentative conclusions concerning independence of network architecture. Although we believe that our findings are likely to be found in any layered automaton obeying local computational rules, more experiments will be necessary to test this hypothesis.
ACKNOWLEDGMENTS
We have benefited from useful conversations with T. Hogg and
M. Kerszberg. D. d'Humières would like to thank the Xerox Palo Alto Research Center for its hospitality during his stay. This work was partially
supported by O.N.R. contract N00014-82-0699.
APPENDIX
and
$$S_{2,l}^{(k+1)} = S_{2,l}^{(k)} + \Xi_l\bigl(I_l^{(k)}, O_l^{(k)}\bigr) \tag{A3b}$$
where $\Psi_l$ and $\Xi_l$ are two transformations of $R^n \times R^n$ into $R^{n \times n}$ such that the elements $M_{i,j}$ of $M = \Psi_l(U, V)$, and $N_{i,j}$ of $N = \Xi_l(U, V)$, are related to two given parameters $q_1$ and $q_2$ ($q_1 > q_2$), to the components $U_j$ of $U$, $V_i$ of $V$, and to the elements $B_{l,i,j}$ of given $n \times n$ matrices $B_l$, through the relations
tive layers when needed, and where the terms $E_{l,i}$ are given sets of integers between 1 and $n$ which determine the outputs from other cells in the same layer $l$ to be locally compared to the output of cell $i$.
If the inputs always remain positive, the coefficients of the state array grow indefinitely in time as the training sequence is repeated. At the same time, the modifications induced by each pattern become progressively less important. Within this context, $q_1$ determines the relative amount of change. Simultaneously, $V_i$ becomes large compared to 1 in Eq. (A2), and $Q_0 = 1/q_2$ then provides a measure of the differential amplifier gain, or of the amount of information flowing through the network.
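For instance, $q_2 = 0.5$ corresponds to $Q_0 = 1/q_2 = 2$, close to the value for which the best convergence was observed in Section 4, whereas $q_2 \approx 0.83$ gives $Q_0 \approx 1.2$, in the more selective regime illustrated in Figs. 4a and 6a.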
Lastly, we should mention that the exact implementation of the algorithm is slightly more complicated than that described above. This stems from the need to deal with the problem of bootstrapping the network out of a state characterized by a null output for any input. We thus replaced Eqs. (A4) and (A5) by
$$M_{i,j} = q \cdot \delta_i \cdot B_{l,i,j} \cdot U_j \tag{A8}$$
$$N_{i,j} = q \cdot \delta_i \cdot (B_{l,i} \times U) \tag{A9}$$
whenever $V_i$ was zero and still a maximum. In this case the cells in the neighborhood of a given one produce a zero output for nonzero input, and it becomes necessary to build up the connections with equal weights for the two filters. This particular process is continued with the same input pattern until the network produces a nonzero output.
Within this scheme, Eq. (A1) is also replaced by
$$F(S, V) = \Phi\bigl(\Phi(S_1 \times V, S_2 \times V),\; f \times \Phi(S_1 \times V, S_2 \times V)\bigr) \tag{A1'}$$
where $f$ is a given $n \times n$ matrix, usually labeled lateral inhibition in brain modeling. This is equivalent to splitting each layer into two parts, the first one remaining an adaptive layer updated by the output of a second one having the same structure but given filters. The first filter selects the input with the same index as the cell (identity transfer matrix), and the coefficients of the second filter are given by $f$. This second step provides an edge enhancement of the intermediate output vector. For example, if $f$ is a bidiagonal matrix filled with 1/2, then $V - f \times V$ is the discrete first-order approximation to the derivative of $V$ along its components ($V$ is essentially the sampling of a continuous function). The same process can construct higher-order derivatives or many similar output transformations.
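To make the last remark concrete, the sketch below builds one such bidiagonal f (taken here, as an assumption, to mean the main diagonal plus the adjacent one, each filled with 1/2) and applies V − f × V to a sampled smooth function; the result is, up to a factor, a first-order difference of V.

```python
import numpy as np

def lateral_inhibition_matrix(n):
    """Bidiagonal matrix filled with 1/2: main diagonal plus the adjacent
    subdiagonal (one assumed reading of "bidiagonal")."""
    return 0.5 * np.eye(n) + 0.5 * np.eye(n, k=-1)

n = 8
V = np.linspace(0.0, 1.0, n) ** 2        # sampling of a smooth function
f = lateral_inhibition_matrix(n)
edge_enhanced = V - f @ V                # ~ (V_i - V_{i-1}) / 2: a first-order difference of V
```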
REFERENCES
1. J. A. Feldman and D. H. Ballard, Cognitive Science 6:205 (1982), and references therein.
2. T. Kohonen, Associative Memory (Springer, New York, 1978); see also Parallel Models of