ART Notes

Uploaded by

tarunbandari4504

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF or read online on Scribd

0% found this document useful (0 votes)

25 views

ART Notes

Uploaded by

tarunbandari4504

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF or read online on Scribd

You are on page 1/ 12

8 t Adaptive Resonance Theory / fo The human‘brain Performs the formidable task of sorting a continuous flood of sensory information received from the environment. From a deluge of trivia, it must extract vital information, act upon it, and Perhaps file it away in long-term memory, Understanding human memorization presents Serious problems; new memories . Conventional artificial neural networks have fuiled 1 solve the stability—plasticit dilemma. Too often, learning a new j cites oF aS, Previous training. In some ¢ this is unim- portant, If there is only a fixed set of trining vectors, the network can be cycled through these repeatedly and n Hy eventually learn them all. Ina backpropagation network, for example, the training vectors are applied sequentially until the network has learned the entire set. If, however, a fully trained network Must learn a new training vector, it May disrupt the weights so bad! hat complete rtraining is required. 7 Taree ‘world case, the network will be exposed to &constantly changing environment; it may never see the same training vecto, . twice. der such circumstances, a backpropagation network will often learn‘nothing: ie will continuously Modify its weights to no avail, never arriving at satisfactory settings.7 Adaptive Resonance Theory 129 Ca ave shown exam- which ofa number of stored patte: sit most resembles. Its classifi- Ples of a network in which only it . Presented feoe decision is indicated by the single recognition layer that * Mill cause network weights to change continuously, res (Se Figure 8: [| Trae Taput vector hretors ak ’tBing. This temporal instability is ve or the main “Baller, a nen- category is creat led Grossberg and his associates to explore radically the Input vector. Once a stored patiecn matches th cena urations. Adaptive resonance theory. oF ART one [gout vector within a specified tolerance (the vigilancey that pat- ‘arch into this problem (Carpenter and Grossberg fern adjusted (trained) to make K Still more liké the input vector. : Grossberg 1987). Jo Rarer PANETT is S ver modified if it does nor, Sur: ART neiwocks and algorithms maintain the plasticity required to feRt input paueen within the vi tolerance. In this way, the {caro _new patierns, while pre venting the modificatiow sr patterns stabilityplasticity dilenmmia is ‘new patterns from the en- that have heen leartied previo sly. This capability has stimulated Wommenom-cremte Additional classification categories Buta WEI deal ot wircres pee MANY people have found the theory | ew input pattern-eamice ‘use an existing memory tobe changed i care to Utalerstand, The mathematics belinda recompli- | Unless the two match elosely cd. bur the fundamental ideas and implementations are not. We mare maine on IRE actual operation of AI: those where : 7 7 more mathematically inclined will find an abundance of theory in 7A Simplified ART Architecture the references, Our objective is to Provide enough conerete infor- EE eeeeeEEEeoccce 4 “ikorithmic form so that the reader can understand the ture 8-1 shows a simplified ART network configuration ba gr Bro and, pechaps, write computer simulations ve explore five functional modules. tt Consists of two layers of neurdn: = (he characteristics of this important network, : Peled “comparison” and “recognition.” Gain T, Gain 2a provide control Tanetions'm : divided into ovo paradigms, eact bythe form of the Taput data and its processing. ARTI i ©. accept only binary input vectors, whereas ART eg opment thac generilizes ART, can classify both binary tf Gontinudus inpitg “Only ARI is presented ia this Woe, he reader ineerested in ARTE2 is referred to Carpenter sid ene Peo CORD) for a complete treatment of this significant develop. f far tevuy, ARIEL is eeteered to ay AIT in the parageaphs that Jan ov the ART network isa Hel Lisstttes 1 tte an | Miguce 8-1. Simplified Adaptive Resonance Theory Network . eview of ART pis an Input_vector ies depending upon1 30 Neural Computing: Theory and Pracilce ‘sary to understand the it internal operation of the modules; the discussion that follows describes each of them. te Comparison Layer i comparison. layer receives the binary input vector X and ini- asses it through unchanged to become the vector C. In a ater phase, binary vector R iF Rs produced f serps ry ss S produ from the eecognition layer, ely neuron In the comparison layer (sce Figure 8-2) receives a binary inputs (zero or one): (YE component x, from the Soom x Qythe feedback signal Py the weighted'sum of the recognition layer outputs; and @Y an“Zaput from the gun signal aint ( he same signal goes to all neurons in this layer), og Suu one, at east wo ofa newron’s three inputs must be Otherwise, {ts Output Is zero. This implements the “two- thirds rule," described by Carpenter and Grossberg (1987b). loi: Figure 8-2. Simplified Comparison Layer ese wes ‘Adaptive Resonance Theory ist tially, gain signal Gain 1 is set to one, providing one of the needed Tnputs, and.all compone: is of the yerwr Kare set to 4e1 vector C starts out identical to the binary input vector X: Recopnlion Laver “The cecognition layer serves to classify the input vector, Fach re¢~ ‘Ognition layer neuron has an associated wei sit vector B, Only the ‘fieuron_with a weight vector best matching the input "fires"; all others are Inhibited. As illustrated in Figure 8-3, a neuron in the cecugnition responds maximally when the vector € from the comparison I ina SeCOT welghts; hence, these weights constitute a stored pattern or exemplar, an idealized example, for a category of input Vectors, These, weights are real numbers, not bindy valued, A bin. ry version of the same pattern is also stored in a correspond “aT weights in-the Comparison layer (sce Figure 8-2); this set con- Figure 8-3. Simplified Recognition LayerTCE this feedbiek tends to rein 12 Neucal Computing: Theary and Practice Sists of those we Aeuron, one wei ights that connect to a specific recognition-layer ; ight per comparison-layer neuron, I operation, cach reeognition-lave ecognition-layer neuron computes a dot Product betwe and Tie icone ee oe ron thatch v : a : ¢ the Vector C will have the largest Winhitig the competition while inhibiting al other Output, thereby wi yer Reuron; oe ae 8-4, the neurons in the recognition layer are TIE ony an ee ateeal ison NetWork. Tn the simplest case iene ne considered in this volume), this ensures that only neiailin me ; atime (L.e., only the neuron with the highest ann othe iput_a One; all others will be zero). This negative wegen Se akeall response is achieved by connecting @ ‘kitive weight /, from cach neuron’s output F, fo the input of the Other neurons. This, Wa neuron ha TTP nhibies all OEE NEUrONS TH Ne layer. Also, each HCurOn Nas 4 postive Weig Output to its own input, Tra neurOn’s Output is aca One farce and sustain it, G2. the output of Gain 2. is one if input vector X has any component that is ane, Mare precisely. G2 is the logical “or” of the components of X = ~ Figure 8-4, Lateral Inhibition-Recognition Layer ‘Adaptive Resonance Theory 133 Gain Like G2, the output of Gain 1 is one if any component of the binary input vector X is one; however, any component of Rls One, OT Is Forced to zer0. The (able that Tollows shows this rela- tionship: “or ofX "Or" of R Components Components G3 Reset j The reset module measures the similarity between vectors X and ffer by more than the vigilance parameter, a reset signal ‘sent to disable the firing ncuron in the recognition laver. Ze Tn operation, the reset module calculates similarity as the ratio ‘of the number of ones in the vectar C to the number of ones in the below the vigilance parameter level, the vector X. If this ratio i ART Classification Operation The ART ogaition, comparison, and search. assification process consists of thre major phases: eee- hases: rec The Recognition Phase {nitially, no input vector is applied; hence, all components of inpit vector X are zero. This sets G2 to zero, thereby disabling all recog nitlon-layer neucons and causing thelr outputs to be zero. Because all recognition-layer neurons start out in the same state, all have an equal chance to win the subsequent competition.sree Neural Computing: Theory and Practice The vero wo Be latte, X snow apie. or Rane Sanponenis at sree Mite malo GU ad 2 Seer vince al oT ire comparison layer neurons, providing one of the two Inputs required by the two-thirds rule, thereby allowing a neuron to fire if the corresponding component ‘of the X Input vector Is one. Thus, during this phase, vector € Is an exact duplicate of X. Next, for each neuron in the recognition layer a dot product is formedibetween its associated weight vector B, and the vector C Gee Elgure 8-4), The neuron with the largest dot product has weights that best match the input vector. It wins the competition and fires, inhibiting all other outputs from this layer. This makes 2 single component r) of vector R (see Figure 8-1) equal to one, and all other components equal to zero. ‘To summarize, the ART network stores a sct of patterns in the weights associated with the recognition-layer neurons, one for weiahe aetncaon category. The recognition ayer neuron with fwelghts that best match the applied vector fires, its output becomes one, and all other outputs from this layer are forced to zero. “The Compieison Phase “The single heuron firing in the recognition layer passes 2 one back to the comparison layer on its output signal r,. This single.one may “ “be visualized as fanning dut, going through a separate binary weight fq {0 each neuron in the comparison layer, providing each with a signal p, which is equal to the value of fy (one or 2er0) (see Figure 8 The initialization and training algorithms ensure that each weight vector 7, consists of binary valued weights; also, each weight vector B, constitutes a scaled version of the corresponding weight vector 7. This means that all components of P, the comparison-layer excitation vector, are also binary valued. Since the vector R is no longer all zeros, Gain 1 is inhibited and its output set to zero, Thus, in accordance with the two-thirds rule, thonly comparison-layer neurons that will fire are those that receive simultaneous ones from thé input vector X and the “Yector P. uw Sian Layer Senn on i Fanuc B <= COMPARISON LAYER Xn G Figure 8-5. Signal Path for a Single-Firing Recugnition-Layer Neuron ‘in other words, the top-down feedback from the recognition layer acts to force components of C to zero in cases in which the input does not match the stored pattern, that is, when X and P do not have coincident ones. If there is a substantial mismatch between the X and P (few coincident ones), few neurons in the comparison layer will Fire and C will contain many zeros, while X contains ones. This ind: ‘cates chat the pattern P being fed back is not the one sought and the neuron firing in the recognition layer should be inhibited. This inhibition is-performed by the reset block in Figure 8-1, which compares the Input vector X to the C yector and causes the reset signal to occur if their degree of similarity is less than the vigilance level. The effect of the reset is to force the ourpuc of the firing neuron in the recognition layer to zero, disabling 1 for the du tion of the current classification. ‘ ‘The Search Phase eee If there is no reset signal generated, the match is adequate and the36 Neural Computing: Theory and Practice ification is finished. Otherwise, other stored patterns must be searched to seek a better match. In the latter case, the inhibition of the hiring neuron in the recognition layer causes all components of the vector R 10 return to zero, G1 goes to one, and input pattern X once again appears at C. As a result, a different neuron wins ia the recognition layer and a different stored pattern P is fed back to the comparison layer. IPP does not match X, that firing recognition layer neuron iy also inhibited. This process repeats, ncuron by Acuron, until one Of two events occurs: 1A stored p: tera is found that matches X above the level of the vigilance parameter, that is, $>p. If this occurs, the network enters a training cycle that modifies the weights in both T, and B,, the weight vectors associated with the firing recog: nition layer neuron. . All stored patterns have been tried, found to mismatch the input vector, and all recognition-layer neurons are inhibited. If this is the case, a previously unallocated neuron in the recognition layer 1s assigned to this pattern and its weight vectors B, and T, are set to match the input pattern. The network described must perform a sequential search through all of its stored patterns. In an analog implementation, this will occur very rapidly, however, it can be a time-consuming process in a simutition ‘computer. If, however. the ART network is implemented with parallel processors, all dot products in the recognition layer can be performed simultancous: Te Inthe ease the seaech will be very capt Hane vequired for the lateral-inhibition network 1 digital computer, For tateral tabi tion te seleet “winner” all ncurons in the layer are Involved [0 canultaieans computation and communication This can require & rqence oecurn, & Sy alse be Lengthy in a ser substantial ann feedtonwand ater tron can substantially reduce this time (see Chats {compu infiibition necwork as used In the acocogni- 10). ‘Adaptive Resonance Theory 137 ” yr IMPLEMENTATION Overview ART, as it is generally found in the literature. is something more than a philosophy, but much less concrete than a computer program. This has allowed a wide range of implementations that adhere to the spirit of ART, while they differ greatly in detail, The implementation that follows is based on: Lippman (1987). with certain aspects changed for compatibility Carpenter and Grossberg (19878) and the conventions of tiis volume, This treatment fs typical, but other successful implementations differ greatly. fa ART Operation “ “ Considered in more detail, the operation of an ART system fonsists of five phases: initialization, recognition, comparison, seatch, and training, = Initialization Before starting the network tri nd T, as well as the vigil values. The weights of the bottom-up vectors B, are all initialized to the me low value. According w Carpenter and Grossberg (198° this should be ing process, all weight vectors B, nce parameter p MUSE bE set 0 ji by StL 14m) for alg 1) where t= the number of components in the Input vector Le aconstant> | (typically, £ = 2) This value Is critical; IF it Is tow ts ‘ognition-t re the network can allocate all neurons to at slagle Input vector, The weights of the top-down vectors T, are all initialized to 1, so138 ‘Neural Computiag: Theory and Practice twat forall), t (6-2) This value is also critical; Carpenter and Grossberg (1987a) prove that top-down weights that are too small will result in no matches at the comparison layer and no training. ‘The vigilance parameter pis set in the range from 0 to 1, depend: ing upon the.degree of mismatch that is to be accepted between the stored pattern and the input vector. At 2 high value of p, the network makes fine distinctions. On the other hand, a low value causes’ the grouping of input patterns that may be only slightly similaz. It may be desirable to change the vigilance during the training process, making only coarse distinctions at the start, and thén gradually increasing the vigilance to produce accurate catego- tization at the end, Recognition Application of an input vector X initiates the secognition phase. Because initfally there is no output from the recognition layer, G1 is set to 1 by the “or” of X, providing all comparison-layer ncurons with ne of the two inputs needed for it to fire (as required by the (wo-tirds rule). As a result, any component of X that Is one provides the second input, thereby causing its associated compari- ‘sotielayer ‘neuron to fire and output a one. Thus, at this time, the ‘yector C will be Identical to X. "AS discussed previously, recognition is performed as a dot prod- Suge for each neuron in the recognition layer, and Is expressed as fd Ny follows: NET,=(B,C) 3) where B,= the weight vector associated with recognition-layer neu ronj uron; at this C= the output vector of the comparison-layer ne time, C is equal to X the excitation of neuron j in the recognition layer NET, Fis the threshold function that follows: Adaptive Resonance Hea y 19 OUT, = Lif NET,>7 (4) 0 otherwise where Tis a threshold, Lateral inhibition is assumed 10 exist but is plify these equations. It ensures that only the reeves euron with the highest value for NET will have anv outpur oF O88 all others will output zero. It is quite possible to de ems in which more than one recognition-layer neuron fires 4 this is beyond the scope of this volume. gnored here to sin iow-laye! Comparison: At this polis ‘the feedback signal from the recogni G1‘t0 go to zero; the two-thirds rule permits only son-layer neuroris to fire that have corresponding compon the vectors P’and X both equal to one. ‘The reset block compares the vector producing a reset output whenever thei Sis below the vigilance threshold. Computing this similarity is simpl ied by the fact that both vectors are binary (all elements are either one oF zero). The procedure that follows computes the required measure of similarity. those compati nents of to the input vector X 1, Call’D the number of 1s in the X vector. 2." Call N the number of Is in the C vector. ‘Then compute the similarity Sas follows: S=NID For example, suppose that X=1011101 thenD=5 C#OO11101 thenN=4 Sa NID=08 $ will vary from 1 (perfect match) to 0 (worst mismatch). Note that the two-thirds rule makes C the logical “and” of the140 PUL vector X with the vector P weet gate X with the vector B But P is equal to T,, the weight Ore inning neuron. Thus, D may be found as the number of Is in the logical “and"* Of T, with x. rh IC the similarity: § of the Winning neuron is greater than the vigis Previously satis Feduired. If, however, the network has been Cal C0 any seen tepePleation of an input vector that Is not idente Below tne Ore May fire a recognition-layer neuron witha ce level. Due to the training algorithm, itis 4 different recognition-layer neuron will provide 2 Ten ceding the vigilance level, even though the dot Pree PetwCeN its weight vector and the Input ween may be lower. situation is shown below. jdeahe s § below te vigilance level, the stored patterns MUSE DE 5 relied, seeking one that matches the Input vector more ing th Will then be trained tarily’ dis cloxely, oF £ + terminating on an uncommitted neuron that iate the search, the reset signal tempo- bes the firing neuron in the recognition layer for the cll, GI goes to one, and a different recogaition- tion, Its pattern is then tested for On ent the Process repeats until either a recogaition-layer ane Ciné the competition with similarity greater than the vigilance (1 successful search), or all committed recognition-layer now: fons have bees tried and disabled (unsuccessful search). AN unsuccessful search will automatically terminate on an uncommitted ncuron, as its top-down weights are all ones, their ini. ties Thus, the two-thirds cule will make the vector € Ident tS the ouutowy Swill be one, and the vigitace will be tistied ‘Training Hatning ty the process in which a set of Input vectors are presented Seqquentially wy the input af the network, and the network welghts are so adjusted that similar vectors activate the same recogaltion- Is wosupervlied tealningy there In ne fercher and no target vector w mdicate the desired response. Carpenter and Greohers (1984) distinguish two kinds of train: layer ne ‘Adaptive Resonance Theory a ing: slow and fast. In slow training, an input vector may be applied So busy that network weights do not have enough time to reach theic Asymptotic values during 2 single presentation. Thus, the Nwcjghts will be determined by the statistics of the Input vectors rather than by the characteristies of any one. The differential equa- ons of slow training describe the network dynamics during the training process. Fast training is a special case of slow training that applies If the input vectors are applied for a long enough period of tine to allow the welghts to approach their final values. In this case, the training formulas involve only algebrale equations. Also, top-down welghis assume only binary values rather than the continuous range rc- Guired in fast training. Only fast training is described in this vol- time; the interested reader can find an excellent treatment of the more general, slow-training case in Carpenter and Grossberg (19874), a “ide training algorithm that follows is applied in both successful and unsuccessful searches. 4 Set the vector of Bottom-up weights B, (associated with the firing recognition-layer neuron /) to the normalized values of the vector C. Carpenter and Grossberg (1987a) calculate these weights follow: byah cai(e- 1+) (6) where €:= the “th component of the comparison-layer output vector J= the number of the winning recognition-layer neuron, ‘y= the bottom-up weight in B, connecting neuron in the eom- parixon layer to neuron In the recogaltion layer a constant> 1 (typically 2) L Weights in the vector T, that are associated with the new stored Pattern are adjusted so that they equal the corresponding biniey values in the vector C: tee, for ale at) Where 4, Is the weight from the winning neuron J in the recogni tion layer neuron Cin the comparison layer,ues UR Ge OU «a a2 ‘Neural Computing: Theory and Practice wa ART TRAINING EXAMPLE. In outline, the network is trained bj 5 y adjusting the top-down and bonoat-up. ‘weights so that the application of an input pattern bo ises the network to activate the recognition-layer neuron asso- clated with a similar stored pattern. Furthermore, training Is ac- complished in afashion that does not destroy patterns that were learned previously, thereby preventing temporal instability. This sk Is'Controlled by the level of the vigilance parameter. A novel :lapuepattera (one that the network has not seen before) will fall to stored patterns within the tolerance imposéd by the vigl- lance level, thereby causing a new stored pattern to be formed. An put pattern sufficiently like a stored pattern will not form a new exemplar; ic will simply modify one that it resembles. Thus, with a suitable setting of the vigilance level, new input patterns already learned and temporal instability are avoided. Figure 8-6 shows 2 typical ART training session. Letters are -shown'as patterns of small squares on an 8-by-8 grid. Each square — joa Be + Figure 8-6. ART Training Session Adaptive Resonance Theory. Ma on the left represents a component of the X vector with a value of one; all squares nut shown ure components with values of zero. Letters on the right represent the stored patterns; each is the set of the values of the components of a vector First, the letter C is Input to the newly ec! system, Be- cause there Is no stored pattern that matches it within the vigilance limit, the search phase fails; a new neuron is assigned in the recog: ition layer, and the weights T, are set to equal the corresponding components of the Input vector, with weights B, becoming a scaled version. 3; Next, the letter B is presented. This also fails in the search phase and another new neuron is assigned. This is repeated for the letter E, Then, a slighitly corrupted version of the leer E is presented to the netwark. Its close enough to the stored F to pass Me vigilance test, $0 It ig used to teain the network. The missing pixel in the ower leg of the E produces 2 zero in the corresponding position of the vector C, causing the training algorithm to sex that weight of the stored pattern to zero, thereby reproducing the break in the not corrupt the stored pattern, The extra isolated square do stored pattern, as there is no corresponding on ‘The fourth character is an E, with two different errors. This fails to match a stored pattern ($ is less than the p), so the search fails and a new neuron Is assigned ‘This example illustrates the importance Of setting-the vigilance paramejer correctly. If the vigilance is too high, most patterns will fail to match those in storage and the network will create a new neuron for: each of them: This results in poor generalization, as minor variations of the same pattern become separute categorics. ‘These categories proliferate, all available recognition-layer neurons ure assigned, and the system's ability to incorporate new data halts. Conversely, if the vigilance Is too low, totally different letters will be grouped together, distorting the stored pattern until it bears little resemblance to any of them. Unfortunately, there is no theory to guide the setting of the vigilance parameter; one must first decide what degree of differ ence between patterns will constitute a different category. The boundaries between categories are often “fuzzy"" and @ priori de- cisions on a large set of input examples may be prohibitively diffi- cult.44 Neural Computing: Theory and Practice + ag utPenter and Grossherg (19872) propose a feedback process to Zdlust the vigilance, wherchy incorrect eategorlzation vesults In Penishment’’ from an outside agency that acts to raise the vig lance. Such a system requires.a standaed to determine if the classi cation was incorrect ht ACTERISTICS OF ART Dx¢ AMT system has a number of important characteristics that are ase The formulas and algorithms may seem arbitrary, wih se an fact. they have heen carefully chosen to satisfy theo: OF tne ettding system performance. This section discusses some rese implications of the ART algorithms, thereby showing the 8 behind the design of the initialization and training for. mulas, Top-Down Weight Initialization rom the earlier training example it fay be seen that the two: thirds rule makes vector € the “and” between the input vector X, and the winning stored vector T,. That is, only If corresponding components of cach are one will that component of © be one After training, thse components of T, remain one; all others are forced to zero. ‘This explains why the top-down weights must be initialized to ones. If they were initialized to zeros, all components of veetor C would be zero regardless of the input vector components, and.the (alning algorithos would prevent the weights from being anything but zero. ‘Training may be viewed as a process of “pruning” components Of the stored vectors that do nat mateh the input vectors, This Process to tneversible, thitt ts, once a top-down welght has been set Wo zero, the training algorithm can never cestore Ittoaone, ‘This characteristic important implications for the learning can. Suppane th; nroup af elonely related vectora should be assified Into the same category, indicated by thelr fleing the same Fecognition Laver sewn I they are presented sequentially 1 the © Adaptive Resonance Theory us network, the first will be assigned a recognition-layer neuron; Its weights‘will be trained to match the input vector. Training with the rest of tire vectors will set the welghts of the stored vector to zero in all positions where they coincide with zeros from any of these input vectors. Thus, the stored vector comes to represent the logical intersection of all of the training vectors and may be thought of as encoding the essential features of a category of input vectors. A new vector consisting only of these essential features will be assigned to this category; thus, the network correctly recognizes a pattern it has never seen before, an ability reminiscent of human abstraction.» Bottom-Up Weight Adjustments st eis repeated Were for 4 The weight adjustment formula (Equatidn 8-6, reference) Is central to the operation of the ART system, for ten(e-1+ a) a) ‘The summation in the denominator represents the number of ‘ones in the output of the comparison layer. As such, this number may be thought of as the “size of this vector. With this interpreta- tion, large C vectors produce smaller weight values for by than do small C vectors. This “‘selfscaling™ property makes it possible to separate two vectors when one Is a subset of another: that Is, its ones are in some but not all of the positions of the other. To demonstrate the problem that resulty if the scaling shown in ation 8-6 is not used, suppose that the network has been tetlned 1 the (wo Input vectors that follow, wlth a recognitions layer neuron assigned to each. Xe10000 Xeri1080 Note that X, is a subset of X,. Without the scaling property, bot- tomaup welghts would be trained to the same values for each pate tern, If this value were chosen to be 1.0, the welght patterns that follow would result,146 ‘Neural Computing: Theory and Practlee T,=B,=10000 T,=B,=11100 IfX, is applied once more, bot , both recognition-layer neurons receive the’same activation; hence, likel; _the's 5 i as not, neuron 2, th . ‘will'win the competition. means wrongone In’addition to making an incorrect Fos, 1B classification, training can be sesmoyed. Becalise T, feeds down 11100, only the first 1 Is ae nc ed by the Input vector, C becomes 10000, vigilance is satisfied, and training sets the second and third 1s of T; and B, to ‘destroying the trained pattern. : Scaling: the bottom-up welghts according to Equation 8-6 pre- ». Wen ‘this undesirable behavior. Suppose, for example, that Equa- Z thon. os is-used with L=2, thereby producing the formula that st bya emt + Da) Bottom-up weights will now triin-to the values { B10 0 00 B,='/2 v2 200 x : c Bt “Applying X, produces an excitation of 1.0 on recognition-layer neuron, 1, but only 1/2, for neuron 2; thus, neuroA I (correctly) ‘ving’ the competition. Similarly, applying X, produces excitation "levels of 1.0 for neuron 1, but 3/2 for neuron 2, again sélecting the correct winner. Bottom-Up Weight Initialization Initializing the bottom-up weights to low values is essential (0 the correct functioning of the ART system. If they are too high, input Vectors that have already been learned will activate an uncommit- rrr by< Lei +m) forall, (1) Setting these weights to low values ensures that an uncommitted neuron will not “overpower" 2 trained recognition-layer neuron. Using our previous example with L= 2 and 11 = 5, by<1/3, 50 we arbitrarily set by = 1/6. Wich these weights, applying a vector for which the network has been trained will cause the correctly tralned recognition-layer neuron to win over an uncommitted neu ron..For.example, on an uncommitted neuron, X, would produce an excitation of 1/6, while X, would produce 1/2; both are below the excitation produced on the neuron for which they were trained: : Searching It may appear that direct access obviates the need for a search except When an uncommitted recognition-layer neuron is to be as signed. This is not the case; application of an input vector that is similar, but not identical, to one of the stored patterns may not on tHe first trial select a recognition-layer neuron such that the similarity $ exceeds the vigilance p, even though another neuron will ‘As in the preceding example, assume that the network has been trained on the two vectors that follow xX,=10000 X,=11100 with bottom-up weight vectors trained as follows By=1 0 0 00 Bys'2 2200 Now apply an input vector X,=1 10 0 0. In this case, the excitation to recognition-layer neuron | will be 1.0, while that of neuron 2 will be only 2/3. Neuron 1 will win (even though it is not the best match), C will be set to 1 0 0 0 0, and the similarity $ will be 1/2, 1f 29¢ted*recognition-layer neuron rather than the one that has been é:ce;previously trained. The formula for:bottom-up weight assign: ments, Equation 8-1, is repeated here for reference: the vigilance is set at 3/4, neuron 1 will be disabled, and neuron 2 will now win the competition. C will now become 110.00, S will be 1, the vigilance will be satisfied, and the search will stop.148 Neural Computing: Theory and Practice * Theorems of ART : In Carpenter and Grossberg (19872), several theorems are proven that show powerful characteristics to Be inherent to the system. ‘The four results that follow are among tle most Important: 1. After training has stabilized, application of one of the training vectors (or one with the essential features of the category) ‘ill activate the correct recognition-layer neuron without searching, This “direct-access” characteristic implics rapid access to previously learned patterns 2. ‘The search process is stable. After the winning recognition- neuron is chosen, the system will not switch from one neuron to another as result of the top-down vector's modification of €, the output of the comparison I: ver; only reset can cause this ng will not cause & switch from one recognition-layer neuron to another. 4. The training process terminates. Any sequence of arbitrary input vectors will produce a stable set of weights after, a finite numb-r of learning trials; no repetitive sequence of training vec~ tors will cause ART’s weights to cycle endlessly. DISCUSSION” ART is an interesting and important paradi wer im. It solves the stabill- \d performs well in other regards. The ART architecture was designed to be blologically plausible; that ts, is mechanisms are intended to be consistent with those of the brain (ay we understand them). It may fail, however, to simulate the distributed storage of internal representations, which many sce as an tity (character tate af the cerebral function, ART's excm: plars represent “grandmother cells”; loss of one cell destroys an entire memory, In contrast, memories in the bral seem to be dls- iributed over substantial regions; a recollection can often survive considerable physical damage without belng lost entirely, Ie scems logical to study architectures that do not violate our understanding of the brain's organization and function, The hue [Adaptive Resonance Theory 49 man brain constitutes an existence proof that a solution to the patiern-recognition problem is possible. It seems sensible 10 ems Tate this working system if we ‘wish to duplicate its performance. However 2 counterargument recounts the history of powered fight; man failed to get off the ground until he stopped trying to imitate the moving wings and feathers of the birds. References x na i Carpenter, G2 and Grossherg, 8. 1986, Neural dynamics Of eateROry, EN ‘ng and recognition: Attention, memory consolidation, and amnesia 5 In Brain Structure, Learning and Memory (AAAS Sympoalies SoHit ties), eds. J. Davis, R, Newburgh, and E, Wegman. ok _SSYoa7a. A massively parallel architectyte for a seltneganiaing mew Tal pattern recognition machine, Camputer Viston, Grapplest cand Tinae Processing 37:54-115. . TONTH, ART 2: SelCorgantzation of stable category recogaltlon Toules for analog input patterns. Applled Optics 26(23 ionp 80 ee Grossberg, ° 1957. Competitive learning: From interactive activation (9 ndaptive resonance. Cognitive Science U:23-63. Lippman, 1K. % 1987. An introduction to computing with neuFal nets, REE Transactions on Acoustles, Speech end Signal Process Aptil. pp. 4-22,

Introduction To Neural Networks Using MATLAB
100% (1)
Introduction To Neural Networks Using MATLAB
548 pages
Adaptive Resonance Theory - Tutorialspoint
No ratings yet
Adaptive Resonance Theory - Tutorialspoint
3 pages
Artnaveen 121201082626 Phpapp01
No ratings yet
Artnaveen 121201082626 Phpapp01
23 pages
Image Classification With Feed-Forward Neural Networks: Meller, Matula and Chłąd
No ratings yet
Image Classification With Feed-Forward Neural Networks: Meller, Matula and Chłąd
7 pages
05 Adaptive Resonance Theory
No ratings yet
05 Adaptive Resonance Theory
0 pages
Adaptive Resonance Theory (Art) Network
No ratings yet
Adaptive Resonance Theory (Art) Network
22 pages
Unit-III
No ratings yet
Unit-III
27 pages
Adaptive Resonance Architectures
No ratings yet
Adaptive Resonance Architectures
4 pages
Adaptive Resonance Theory PDF
No ratings yet
Adaptive Resonance Theory PDF
12 pages
WINSEM2022-23 CSI3006 ETH VL2022230503257 ReferenceMaterialI FriFeb1000 00 00IST2023 AdaptiveResonanceTheory
No ratings yet
WINSEM2022-23 CSI3006 ETH VL2022230503257 ReferenceMaterialI FriFeb1000 00 00IST2023 AdaptiveResonanceTheory
5 pages
05-Adaptive Resonance Theory
No ratings yet
05-Adaptive Resonance Theory
0 pages
Neural Networks - Lecture 4
No ratings yet
Neural Networks - Lecture 4
82 pages
BMM 2018 - Deep Learning Tutorial
No ratings yet
BMM 2018 - Deep Learning Tutorial
47 pages
A New Sort of Computer: What Are (Everyday) Computer Systems Good At... and Not So Good At?
No ratings yet
A New Sort of Computer: What Are (Everyday) Computer Systems Good At... and Not So Good At?
30 pages
FRJ Paper
No ratings yet
FRJ Paper
9 pages
Grossberg1987 Nonlinear Neural Networks Principles Mechanisms and Arhitectures
No ratings yet
Grossberg1987 Nonlinear Neural Networks Principles Mechanisms and Arhitectures
45 pages
NNDL
No ratings yet
NNDL
64 pages
Lecture25 Spring2018
No ratings yet
Lecture25 Spring2018
54 pages
Research and Prospect of Image Recognition Based o
No ratings yet
Research and Prospect of Image Recognition Based o
7 pages
WINSEM2022-23 CSI3006 ETH VL2022230503257 ReferenceMaterialI ThuFeb0900 00 00IST2023 Module4
No ratings yet
WINSEM2022-23 CSI3006 ETH VL2022230503257 ReferenceMaterialI ThuFeb0900 00 00IST2023 Module4
12 pages
Introduction To Artificial Neural Networks
No ratings yet
Introduction To Artificial Neural Networks
73 pages
Deep Learning
No ratings yet
Deep Learning
90 pages
DL Concepts 1 Overview
No ratings yet
DL Concepts 1 Overview
80 pages
chapter 3 soft computing
No ratings yet
chapter 3 soft computing
1 page
Neural Network Using Matlab
63% (30)
Neural Network Using Matlab
548 pages
132618915-Neural-Network-Using-Matlab Sumathi and Sivanandam PDF
67% (3)
132618915-Neural-Network-Using-Matlab Sumathi and Sivanandam PDF
548 pages
NNDL_Unit.
No ratings yet
NNDL_Unit.
18 pages
Research Paper On ART
No ratings yet
Research Paper On ART
3 pages
Learning Law in Neural Networks
100% (2)
Learning Law in Neural Networks
19 pages
Artificial Neural Networks: Introduction To Computational Neuroscience
No ratings yet
Artificial Neural Networks: Introduction To Computational Neuroscience
42 pages
Chap 2
No ratings yet
Chap 2
105 pages
ECSE484 Intro v2
No ratings yet
ECSE484 Intro v2
67 pages
Unit Iv DM
No ratings yet
Unit Iv DM
58 pages
Introduction To Deep Learning: TA: Drew Hudson May 8, 2020
No ratings yet
Introduction To Deep Learning: TA: Drew Hudson May 8, 2020
33 pages
Kiet School of Engineering & Technology: Department of Computer Appication
No ratings yet
Kiet School of Engineering & Technology: Department of Computer Appication
30 pages
30 Years of Adaptive Neural Networks (JNL Article) - B. Widrow, M. Lehr (1990) WW
No ratings yet
30 Years of Adaptive Neural Networks (JNL Article) - B. Widrow, M. Lehr (1990) WW
28 pages
Widrow Lehr
No ratings yet
Widrow Lehr
28 pages
Introduction To Deep Convolutional Neural Networks: March 2016
No ratings yet
Introduction To Deep Convolutional Neural Networks: March 2016
51 pages
NN Learning
No ratings yet
NN Learning
69 pages
CCS355 NNDL Unit1
No ratings yet
CCS355 NNDL Unit1
30 pages
An Introduction To Neural Networks: Instituto Tecgraf PUC-Rio Nome: Fernanda Duarte Orientador: Marcelo Gattass
No ratings yet
An Introduction To Neural Networks: Instituto Tecgraf PUC-Rio Nome: Fernanda Duarte Orientador: Marcelo Gattass
45 pages
Adaptive Resonance Theory
No ratings yet
Adaptive Resonance Theory
5 pages
ANN PG Module1
No ratings yet
ANN PG Module1
75 pages
6S191 MIT DeepLearning L1
No ratings yet
6S191 MIT DeepLearning L1
108 pages
Artificial Neural Networks: A Tutorial: Neginyousefpour
No ratings yet
Artificial Neural Networks: A Tutorial: Neginyousefpour
27 pages
Neural Networks
No ratings yet
Neural Networks
40 pages
Artificial Neural Networks: System That Can Acquire, Store, and Utilize Experiential Knowledge
100% (1)
Artificial Neural Networks: System That Can Acquire, Store, and Utilize Experiential Knowledge
40 pages
Ann I
No ratings yet
Ann I
41 pages
Baum 1987 - On The Capabilities of Multilayer Perceptrons
No ratings yet
Baum 1987 - On The Capabilities of Multilayer Perceptrons
23 pages
Lecture 2a An Overview of The Main Types of Neural Network Architecture
No ratings yet
Lecture 2a An Overview of The Main Types of Neural Network Architecture
32 pages
Neural Network Fundamentals With Graphs
No ratings yet
Neural Network Fundamentals With Graphs
6 pages
Mitja Perus - Neural Networks As A Basis For Quantum Associative Networks
No ratings yet
Mitja Perus - Neural Networks As A Basis For Quantum Associative Networks
12 pages
ECE/CS 559 - Neural Networks Lecture Notes #8: Associative Memory and Hopfield Networks
No ratings yet
ECE/CS 559 - Neural Networks Lecture Notes #8: Associative Memory and Hopfield Networks
9 pages
Neural Networks: Some Material Adopted From Notes by
No ratings yet
Neural Networks: Some Material Adopted From Notes by
35 pages
11_chapter 5
No ratings yet
11_chapter 5
14 pages

ART Notes

Uploaded by

ART Notes

Uploaded by

You might also like