Lecture 5 reading

This document reviews the role of learning and neural plasticity in visual object recognition, highlighting how the adult primate visual system adapts to recognize objects in complex environments. It discusses the mechanisms through which the brain builds selective and tolerant neuronal representations for object recognition, emphasizing the importance of visual experience and learning in shaping these representations. The authors propose that experience-based plasticity across various stages of visual processing enhances the ability to integrate local features into coherent objects, ultimately facilitating recognition in cluttered scenes.


Learning and neural plasticity in visual object recognition

Zoe Kourtzi¹ and James J DiCarlo²

The capability of the adult primate visual system for rapid and accurate recognition of targets in cluttered, natural scenes far surpasses the abilities of state-of-the-art artificial vision systems. Understanding this capability remains a fundamental challenge in visual neuroscience. Recent experimental evidence suggests that adaptive coding strategies facilitated by underlying neural plasticity enable the adult brain to learn from visual experience and shape its ability to integrate and recognize coherent visual objects.

Addresses
¹ School of Psychology, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK
² McGovern Institute for Brain Research and Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, 02139, USA

Corresponding author: Kourtzi, Zoe ([email protected])

Current Opinion in Neurobiology 2006, 16:152–158

This review comes from a themed issue on
Cognitive neuroscience
Edited by Paul W Glimcher and Nancy Kanwisher

Available online 24th March 2006

0959-4388/$ – see front matter
© 2006 Elsevier Ltd. All rights reserved.

DOI 10.1016/j.conb.2006.03.012

Introduction
Detecting and recognizing meaningful objects in complex environments is a crucial skill that underlies a range of behaviours, from identifying predators and prey and recognizing edible and poisonous foods, to diagnosing tumours on medical images and finding familiar faces in a crowd. In humans and other visual primates, these processes operate quickly, automatically and effortlessly, and are thus easily taken for granted. However, the computational challenges of visual recognition are far from trivial. In particular, the recognition of coherent meaningful objects entails integration at different levels of visual complexity, from local contours to complex objects, and representations that are highly tolerant of identity-preserving image changes (e.g. changes in position, size, pose or background clutter).

A wide range of methods provide converging evidence that neuronal processes supporting object recognition are coarsely localized in the ventral visual stream [1], which has a rough hierarchy of cortical processing stages (V1 [primary visual cortex] → V2 → V4 → PIT [posterior inferior temporal cortex] → AIT [anterior inferior temporal cortex]). The highest stages of this stream (i.e. anterior inferior temporal cortex [AIT] in the monkey, and lateral occipital complex [LOC] in the human) are thought to convey neuronal signals that are well suited to support object recognition directly. In particular, unlike earlier visual areas, patterns of neuronal activity in these regions explicitly convey object identity, in that object identity can be directly extracted from those populations, even in the face of identity-preserving image changes [2–5]. But how are such useful neuronal representations constructed in the brain? How do neuronal connections become wired up and modified so that neurons respond to complex combinations of simple image features and are sensitive to subtle changes in object identity, yet are relatively insensitive to large, identity-preserving image changes? How does the brain even know what an ‘object’ is in the first place? Do these neuronal representations code for all possible objects, or do they just represent objects that are behaviourally relevant or often encountered in the environment? At the core of these issues are fundamental questions about the role of visual experience and learning in the establishment and maintenance of the neuronal representations that support complex object recognition (see [6] for an earlier review).

At a theoretical level, there is growing appreciation of the potentially powerful role of learning in establishing robust representations crucial for object recognition [7–12]. Experimentally, the role of learning in object representation can be approached by studying developing visual systems [13], visual systems that have been deprived of experience during early life [14,15] and the role of visual experience in adults [16–18]. All of these approaches have been used to gain new insight into the role of learning in feature and object representation. For the purposes of this review, we focus on experience-based plasticity in the adult visual system. In particular, many psychophysical studies in adults have shown learning-dependent changes in discrimination and recognition using stimuli ranging from simple features, such as oriented lines and gratings [19], to complex objects [20]. Recent neurophysiological [21–25,26•,27–30,31••,32–34] and functional magnetic resonance imaging (fMRI) [35–39,40•,41•] studies have focused on elucidating the loci of brain plasticity and changes in neuronal responses that underlie this visual learning. Here, we briefly review these advances and propose that experience-based plasticity across multiple stages of visual analysis bolsters selective, robust representations of visual objects, and thus directly underlies the perceptual integration of local features into coherent meaningful objects and their recognition in the complex environments we inhabit.


Learning to put an object together
The ability to build neuronal representations that are highly selective for visual pixel combinations (‘features’) and feature combinations (‘objects’) in addition to being highly tolerant to identity-preserving image changes is the computational crux of object recognition [42–44], and the hallmark of primate vision. At the most basic level, object recognition requires the visual system to discriminate among different patterns of visual input (e.g. the letter ‘A’ among all other possible letters and objects). As this discrimination cannot be solved simply by monitoring the output of any one photoreceptor (or even one lateral geniculate nucleus [LGN] or V1 neuron), a solution to this problem requires some ‘binding’ of responses from neurons at early visual stages. Standard computational approaches propose that ‘binding’ is achieved by synaptically combining inputs from neurons at early visual stages (e.g. through a thresholded weighted sum of such inputs). As a result, neurons at higher stages of the visual system are tuned to patterns of increasing complexity until the required pattern discrimination can be supported [45]. Because pattern discrimination must be performed in the real world, this selectivity for combinations of visual features should have robust tolerance for image changes that produce profound transformations in the visual input without modifying object identity (e.g. position, size, pose or background clutter) [42,44]. Computational models [11,43,46] have shown that such explicit object representations can be built using neuronal connections that group together similar features regardless of image changes. In this view, the process of constructing neurons that are both tuned for complex configurations of simpler visual features (e.g. pixels, edges) and relatively insensitive to some types of image changes is equivalent to the process of defining what an ‘object’ is in the first place. The experimental challenge is to understand how this is done in the brain.

Learning selectivity
How does the brain figure out how to appropriately connect neurons together and weight their inputs so that selective neuronal representations are ultimately obtained? Plasticity of neuronal connections along with appropriate learning ‘rules’ is one obvious potential mechanism. For example, learning to respond selectively to an object might amount to learning which simpler image features tend to co-occur often in the world [47,48] (Figure 1a). At a mechanistic level, this could result if inputs conveying simpler image features are brought together at downstream neurons that respond non-linearly to those inputs (e.g. respond only if feature A and feature B are both present). An iteration of this strategy at each level of the visual system would result in progressively more complex preferred stimulus configurations. Competitive mechanisms could ensure efficient coverage of the subspace of possible images that is spanned by natural images [49]. Recent neurophysiological studies provide evidence for such learning. In particular, neurons in monkey IT cortex show enhanced selectivity after training for novel objects [23,25,32], holistic multiple-part configurations [29] and even physically unrelated pairs of shapes [24,50•]. The time-course of changes in some of these neurons parallels that of learning [51], suggesting a strong link between underlying neuronal plasticity and behavioural improvement. Furthermore, learning can shape the assignment of novel objects into classes [52] by modulating the selectivity of neurons in the inferior temporal and frontal cortex for
OAI: Some neurones receive input from neurones which respond to features that always remain the same - even when the object presentation changes
- Simple example: in order to recognise a car when it’s green or blue, you might have neurones which encode the wheels - a constant feature across all the presentations
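
The note above can be made concrete with a toy unit. This is a minimal sketch and not a model taken from the review: it assumes four hypothetical binary inputs (wheels, body outline, green, blue), hand-set weights and a hand-set threshold, and the names `threshold_unit` and `car_unit` are purely illustrative. The unit computes a thresholded weighted sum, responds only when both shape features are present, and ignores colour because the colour inputs carry zero weight.

```python
def threshold_unit(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


# Hypothetical input order: [wheels, body_outline, is_green, is_blue]
WEIGHTS = [1.0, 1.0, 0.0, 0.0]  # assumption: colour inputs carry zero weight
THRESHOLD = 2.0                 # assumption: both shape features are required


def car_unit(features):
    """Toy 'car' unit: fires only for the conjunction wheels AND body outline."""
    return threshold_unit(features, WEIGHTS, THRESHOLD)


green_car = [1, 1, 1, 0]
blue_car = [1, 1, 0, 1]
green_blob = [0, 0, 1, 0]
wheels_only = [1, 0, 0, 1]

for name, stimulus in [("green car", green_car), ("blue car", blue_car),
                       ("green blob", green_blob), ("wheels only", wheels_only)]:
    print(name, car_unit(stimulus))
```

In this toy setting the response is the same for the green and the blue car, while a single feature alone stays below threshold. How such weights might be learned rather than set by hand is the question the surrounding text addresses.
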
Figure 1

Key neuronal response properties required for robust object recognition might ultimately be acquired and maintained through visual experience.
(a) Image arrangements that are well represented by the visual system are those that are commonly encountered in the world (compare the upper
with the lower images in the left panel). (b) Similarly, image transformations that the visual system is able to tolerate robustly are those that are
often experienced across short time intervals. For example, the upper image sequences show translation and pose change respectively without
changes in object identity. The lower image sequences show changes in object identity across short time intervals, a situation that is rarely
encountered, but can alter the tolerance of object recognition [54,58•]. (c) Items in a scene that are not part of an object can make object
recognition more difficult (clutter), but if those items are often seen with the object (e.g. hat, shirt and jacket are often seen with a face), the
items themselves might be sufficient for object representation and perception [68].
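
The idea in panel (b), that tolerance could be learned from what is seen across short time intervals, can be illustrated with a toy 'trace' rule. This is a simplified sketch in the spirit of the invariance-learning models in the reference list (e.g. [9]), not a reproduction of any of them: weight normalization and decay are omitted, the stimulus is a hypothetical one-hot code for four retinal positions, and the function name and the parameters `eta` and `delta` are assumptions. The unit keeps a decaying trace of its recent activity, and each input is strengthened in proportion to that trace, so positions visited shortly after an effective one become linked to the same unit.

```python
def train_on_sequence(w, sequence, eta=0.5, delta=0.5):
    """Update weights over one temporally ordered sequence of input vectors.

    trace(t) = (1 - delta) * trace(t - 1) + delta * y(t), with y(t) = w . x(t)
    w        = w + eta * trace(t) * x(t)
    Setting delta = 1.0 removes the memory, leaving a plain Hebbian update.
    """
    w = list(w)  # work on a copy
    trace = 0.0
    for x in sequence:
        y = sum(wi * xi for wi, xi in zip(w, x))
        trace = (1.0 - delta) * trace + delta * y
        w = [wi + eta * trace * xi for wi, xi in zip(w, x)]
    return w


# Hypothetical stimulus: one object sweeping across four retinal positions.
positions = [[1, 0, 0, 0],
             [0, 1, 0, 0],
             [0, 0, 1, 0],
             [0, 0, 0, 1]]
w0 = [1.0, 0.0, 0.0, 0.0]  # the unit initially responds at position 0 only

w_trace = train_on_sequence(w0, positions, delta=0.5)  # with temporal trace
w_hebb = train_on_sequence(w0, positions, delta=1.0)   # no memory of the past
print("trace rule:   ", w_trace)
print("plain Hebbian:", w_hebb)
```

With the trace, the weights for positions 1 to 3 become positive after a single sweep, whereas without it only the already effective position is strengthened. Under the same rule, a sequence in which object identity changed between time steps would link different objects to one unit, which is one possible reading of the lower sequences in panel (b).
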

features crucial for these categorization processes [27,30,53•].

Learning tolerance
To date, most studies of visual learning have focused on changes in neuronal or behavioural selectivity. However, as described above, simply learning selectivity is not enough to create useful object representations; that is, selective object representations must be tolerant to image changes (e.g. object position, size and pose). But how does the visual system know which neurons to connect (or, equivalently, how to weight those connections) to enable this tolerance? Again, learning from the statistical regularities of the natural world has been proposed as a potential solution [8–11,54]. One central idea is that features and objects in the world do not tend to jump in and out of existence, but they have temporal continuity, in that an object seen at one instant in time will probably be seen in the next instant, but perhaps in a different position, size or pose (Figure 1b). Another related idea is that once semi-tolerant representations are established, the later appearance of a feature or object that is similar to that representation, but differs slightly in (e.g.) position, scale, or view, can re-activate the initial representation and enable tolerance learning without temporal continuity [55]. Although ‘tolerance learning’ is of great computational importance, and there is some behavioural [19] and circumstantial neuronal evidence [32,53•,56] suggesting that tolerance is not automatic, its neural basis remains largely unknown.

Recently, however, psychophysical studies have directly demonstrated that targeted disruptions of the temporal continuity of an object result in disruptions in object perception consistent with tolerance learning [54,57,58•]. For example, Cox et al. [58•] recently found that even the most fundamental type of recognition tolerance (the ability to recognize an object despite its position on the retina) can be predictably modified by visual experience. In particular, changes to object identity during normal eye movements that bring the ‘object’ from one retinal position to another over short time intervals disrupt later recognition of the object across those same retinal positions. But there is also evidence that view tolerance can be learned without temporal continuity [59]. An important goal of ongoing and future work is to elucidate neuronal changes in the ventral stream in the context of tolerance learning.

Learning objects in clutter
Beyond learning selectivity and tolerance for object identity, the visual system must learn to detect objects in the real world, in which they are seen in clutter and context [60,61]. During a course of training, observers can learn distinctive target features by using information (image regularities) crucial for target detection more efficiently and by suppressing background noise [62–65,66•]. In particular, learning has been suggested to enhance correlations among neurons responding to the features of target patterns while de-correlating neural responses to target and background patterns. As a result, input (stimulus) redundancy is reduced and target salience is enhanced [33], supporting the efficient detection and identification of objects in cluttered scenes [67].

It should be noted, however, that the background ‘stuff’ in which an object is embedded in a scene should not always be viewed as ‘clutter’ that must be ignored. Indeed, that ‘clutter’ can include features that, although not perfectly correlated with the object, are more often than not seen with the object (i.e. context), and thus might aid the detection and recognition of the object (Figure 1c). Thus, the visual system might learn to incorporate these clues in its object representations [68,69]. In addition, the ‘clutter’ will typically include other objects that might also need to be detected and recognized. Understanding how the visual system represents multiple objects simultaneously [70,71] is crucial for unravelling the mechanisms that mediate successful interactions in complex, dynamic environments.

Neuronal plasticity underlying visual object learning
Studies demonstrating experience-dependent changes in the selectivity and tolerance of high-level neuronal representations (e.g. IT) raise the question of the locus and nature of these changes, as improvement at higher stages of visual analysis might be inherited from changes at one or more earlier stages.

It has often been suggested [19,72,73] that the key plasticity locus in simple feature ‘perceptual learning’ is likely to be in early visual stages, as this learning is somewhat confined to the trained retinal position. That is, changes in the receptive field tuning properties of neurons in V1 might account for the specificity of learning effects to the trained visual field position and trained stimulus attribute. Indeed, recent imaging studies [37,40•,74] provide evidence for the involvement of V1 in object feature learning. However, neurophysiological evidence for the contribution of V1 in behavioural improvement after training on visual discrimination remains controversial [21,22]. There is some evidence for sharpening of orientation tuning after training [22], but no evidence for changes in the size of the cortical representation or the receptive field properties of neurons in V1 [21,75]. One possibility is that V1 learning effects can be detected in the average response of large numbers of neurons, as measured by fMRI, but are very small at the level of the individual neurons. Another possibility is that they reflect task-dependent changes in intra- and inter-area connectivity.

Recently, two studies combining psychophysics and fMRI [41•,76•] examined the relationship between shape
learning and experience-dependent reorganization across stages of visual processing. The results of Sigman et al. [76•] suggest that shape representation might shift from higher to early visual areas, which support rapid and automatic search and detection in cluttered visual scenes independently of attentional control. These findings are consistent with the suggestion that learning moulds object representations not only by enhancing the processing of feature detectors with increasing complexity in a bottom-up manner but also in a top-down manner taking into account the relevant task dimensions and demands. In particular, it has been suggested that learning begins at higher visual areas for easy tasks and proceeds to early retinotopic areas that have higher resolution for finer and more difficult discriminations [77••]. Kourtzi et al. [41•] provide evidence that these distributed plasticity mechanisms are adaptable to natural image regularities that determine the salience of targets in cluttered scenes. In particular, their results suggest that opportunistic learning [63] of salient targets in natural scenes is mediated by sparser feature coding at higher stages of visual analysis, whereas learning of camouflaged targets is implemented by mechanisms that enhance the segmentation and recognition of ambiguous targets in both early and higher visual areas.

One of the main advantages of fMRI is that it provides global brain coverage and, thus, it is a highly suitable method for studying learning-dependent changes across stages of analysis in the visual system. However, experience-dependent activation changes in fMRI studies could be the result of changes in the numbers or the gain of neurons recruited for processing of a stimulus in the context of a task. As imaging studies measure activation at the large scale of neural populations rather than at the scale of single neurons, they cannot discern these different neural plasticity mechanisms. Recent neurophysiological studies [26•,28] have shed light on cortical reorganization mechanisms at the level of the single neuron when monkeys learn to discriminate images of natural scenes presented in noise. These studies show that learning enhances the selective processing of crucial features for the detection of object targets in early occipito-temporal areas. By contrast, learning appears to facilitate efficient object processing independent of background noise in the prefrontal cortex. These findings suggest that learning in different cortical areas bolsters functions that are important for different tasks, ranging from the bottom-up detection and integration of target features in cluttered scenes across visual occipitotemporal areas to the top-down selection of familiar objects in the prefrontal cortex. Consistent with top-down approaches to visual processing, recent neuroimaging studies suggest that learning might enhance the functional interactions between occipitotemporal areas that encode physical stimulus experiences and parieto-frontal circuits that represent our perceptual interpretations of the world [38,78,79]. Future studies combining fMRI and simultaneous chronic recordings from these areas will provide novel insights for understanding both bottom-up and top-down mechanisms for experience-dependent reorganization at the level of inter- and intra-area networks.

In summary, the current experimental evidence suggests that there is no single locus of brain plasticity underlying visual learning. These findings are consistent with computational approaches proposing that associations between features that mediate the recognition of familiar objects might occur across stages of visual analysis, from orientation detectors in the primary visual cortex to occipitotemporal neurons tuned to object parts and views [7,8,43]. At the neuronal level, learning could be implemented by changes in core feedforward neuronal processing, especially at higher visual stages [31••], or by changes in the interactions between object analysis centres in temporal and frontal cortical areas and local connections in the primary visual cortex based on top-down feedback mechanisms. For example, learning has been suggested to modulate neuronal sensitivity in the early visual areas by modulating networks of lateral interactions and through feedback connections from higher visual areas [17,74,75,80–82]. Such changes in the connectivity of visual analysis circuits might be adaptive and efficient compared with changes in core feedforward visual processing (e.g. receptive fields) that might have deleterious consequences for the visual processing of the trained stimuli in another context or task. Current research directions focus on further understanding the effects of stimulus and task demands on learning across stages of visual analysis, the relative time courses of learning-dependent changes, and the underlying neuronal responses and network interactions that change to enable learning to occur while not disrupting general visual processing.

Conclusions
Visual object perception and recognition in cluttered, natural scenes poses a series of computational challenges to the adult visual system, from the detection of image regularities to binding contours, parts and features into coherent objects, recognizing them independent of image changes (e.g. position, scale, pose, clutter) and assigning them to abstract categories. This review highlights the potentially fundamental role of learning in solving some of these challenges. What general conclusions are we to take from the experimental evidence available so far? First, the adult visual system is clearly plastic, in terms of both behavioural improvements and changes in neuronal responses. Second, there is no single locus of plasticity in the visual system that is the exclusive site underlying object learning. On the contrary, in most cases learning modifies visual representations for features and objects by modulating processing across multiple cortical levels. Third, learning does not always result in simple, static
changes to core feedforward visual processing; instead, changes can be dynamic and task dependent. Thus, understanding object learning cannot be divorced from the context of the computational problems faced by the visual system in complex environments; that is, learning robust object representations depends on the stimulus conditions and the task demands. An important goal for future work is to understand the effects of these factors on optimal computations and the neuronal correlates of visual learning. Finally, the relationship between the neural mechanisms that mediate adult, experience-dependent plasticity and developmental plasticity is intriguing and remains largely unknown. Although the adult visual system is remarkably powerful at representing and distinguishing among objects even the very first time they are seen, this cannot rule out a potentially crucial role of visual experience in the establishment of such representations. On the contrary, it is surprising that small amounts of adult visual experience, as reviewed here, produce measurable changes in both behaviour and neuronal representations, even when superimposed on a lifetime of natural experience. To the extent that adult visual learning shares computational and perhaps even some mechanistic commonalities with the developing visual system, understanding experience-based plasticity could reveal the key principles that underlie our remarkable ability for robust object recognition.

Acknowledgements
We would like to thank S Engel, N Kanwisher, H Op de Beeck and M Riesenhuber for comments and discussion on the manuscript. Supported by funding from The Biotechnology and Biological Sciences Research Council (BBSRC) BB/D52199X/1 to Z Kourtzi and The National Eye Institute R01-EY014970 to JJ DiCarlo.

References and recommended reading
Papers of particular interest, published within the annual period of review, have been highlighted as:

• of special interest
•• of outstanding interest

1. Ungerleider LG, Mishkin M: Two cortical visual systems. In Analysis of Visual Behavior. Edited by Ingle DJ, Goodale MA, Mansfield RJW. M.I.T. Press; 1982:549-585.

2. Rolls ET: Functions of the primate temporal lobe cortical visual areas in invariant visual object and face recognition. Neuron 2000, 27:205-218.

3. Hung CP, Kreiman G, Poggio T, DiCarlo JJ: Fast readout of object identity from macaque inferior temporal cortex. Science 2005, 310:863-866.

4. Quiroga RQ, Reddy L, Kreiman G, Koch C, Fried I: Invariant visual representation by single neurons in the human brain. Nature 2005, 435:1102-1107.

5. Grill-Spector K, Malach R: The human visual cortex. Annu Rev Neurosci 2004, 27:649-677.

6. Wallis G, Bulthoff H: Learning to recognize objects. Trends Cogn Sci 1999, 3:22-31.

7. Poggio T: A theory of how the brain might work. Cold Spring Harb Symp Quant Biol 1990, 55:899-910.

8. Wallis G, Rolls ET: Invariant face and object recognition in the visual system. Prog Neurobiol 1997, 51:167-194.

9. Foldiak P: Learning invariance from transformation sequences. Neural Comput 1991, 3:194-200.

10. Wiskott L, Sejnowski TJ: Slow feature analysis: unsupervised learning of invariances. Neural Comput 2002, 14:715-770.

11. Ullman S, Soloviev S: Computation of pattern invariance in brain-like structures. Neural Netw 1999, 12:1021-1036.

12. Karklin Y, Lewicki MS: A hierarchical Bayesian model for learning nonlinear statistical regularities in nonstationary natural signals. Neural Comput 2005, 17:397-423.

13. Rodman HR: Development of inferior temporal cortex in the monkey. Cereb Cortex 1994, 4:484-498.

14. Blakemore C, Van Sluyters RC: Innate and environmental factors in the development of the kitten’s visual cortex. J Physiol 1975, 248:663-716.

15. Sinha P, Bouvrie JVaS: Object concept learning: observations in congenitally blind children and a computational model. Neurocomputing 2006. in press.

16. Goldstone RL: Perceptual learning. Annu Rev Psychol 1998, 49:585-612.

17. Gilbert CD, Sigman M, Crist RE: The neural basis of perceptual learning. Neuron 2001, 31:681-697.

18. Schyns PG, Goldstone RL, Thibaut JP: The development of features in object concepts. Behav Brain Sci 1998, 21:1-17, discussion 17-54.

19. Fahle M: Perceptual learning: a case for early selection. J Vis 2004, 4:879-890.

20. Fine I, Jacobs RA: Comparing perceptual learning tasks: a review. J Vis 2002, 2:190-203.

21. Ghose GM, Yang T, Maunsell JH: Physiological correlates of perceptual learning in monkey V1 and V2. J Neurophysiol 2002, 87:1867-1888.

22. Schoups A, Vogels R, Qian N, Orban G: Practising orientation identification improves orientation coding in V1 neurons. Nature 2001, 412:549-553.

23. Kobatake E, Wang G, Tanaka K: Effects of shape-discrimination training on the selectivity of inferotemporal cells in adult monkeys. J Neurophysiol 1998, 80:324-330.

24. Miyashita Y: Cognitive memory: cellular and network machineries and their top-down control. Science 2004, 306:435-440.

25. Rolls ET: Learning mechanisms in the temporal lobe visual cortex. Behav Brain Res 1995, 66:177-185.

26. • Rainer G, Lee H, Logothetis NK: The effect of learning on the function of monkey extrastriate visual cortex. PLoS Biol 2004, 2:E44.
This study, in combination with previous findings (Rainer et al., [28]), suggests that different neural plasticity mechanisms in early visual areas (V4) and the prefrontal cortex mediate enhanced behavioural performance in discriminating degraded images after training. In particular, stimulus degradation by increasing noise resulted in decreased responses of V4 neurons to novel images, whereas learning enhanced their responses to degraded stimuli. By contrast, the number of prefrontal neurons that responded to undegraded images decreased as these images became familiar with training, whereas their object selectivity increased (narrowed tuning) and generalized across degradation levels.

27. Sigala N, Logothetis NK: Visual categorization shapes feature selectivity in the primate temporal cortex. Nature 2002, 415:318-320.

28. Rainer G, Miller EK: Effects of visual experience on the representation of objects in the prefrontal cortex. Neuron 2000, 27:179-189.

29. Baker CI, Behrmann M, Olson CR: Impact of learning on representation of parts and wholes in monkey inferotemporal cortex. Nat Neurosci 2002, 5:1210-1216.

30. Freedman DJ, Riesenhuber M, Poggio T, Miller EK: A comparison of primate prefrontal and inferior temporal cortices during visual categorization. J Neurosci 2003, 23:5235-5246.

Current Opinion in Neurobiology 2006, 16:152–158 www.sciencedirect.com


Learning and neural plasticity in visual object recognition Kourtzi and DiCarlo 157

31. !! Yang T, Maunsell JH: The effect of perceptual learning on neuronal responses in monkey visual area V4. J Neurosci 2004, 24:1617-1626.
This paper is the latest in a series of impressively thorough monkey neurophysiological studies of perceptual learning effects along the ventral stream (V1, V2 and V4) [21]. In these studies the authors systematically examine a large range of potential single neuron effects of orientation discrimination training. The overall conclusion is that, whereas all effects in V1 and V2 are small (especially when compared with results in other primary sensory areas), V4 neurons near the trained visual field location and stimulus orientation show moderately sharper orientation selectivity and better population discrimination performance.

32. Logothetis NK, Pauls J, Poggio T: Shape representation in the inferior temporal cortex of monkeys. Curr Biol 1995, 5:552-563.

33. Jagadeesh B, Chelazzi L, Mishkin M, Desimone R: Learning increases stimulus salience in anterior inferior temporal cortex of the macaque. J Neurophysiol 2001, 86:290-303.

34. Lee TS, Yang CF, Romero RD, Mumford D: Neural activity in early visual cortex reflects behavioral experience and higher-order perceptual saliency. Nat Neurosci 2002, 5:589-597.

35. van Turennout M, Ellmore T, Martin A: Long-lasting cortical plasticity in the object naming system. Nat Neurosci 2000, 3:1329-1334.

36. Grill-Spector K, Kushnir T, Hendler T, Malach R: The dynamics of object-selective activation correlate with recognition performance in humans. Nat Neurosci 2000, 3:837-843.

37. Schiltz C, Bodart JM, Dubois S, Dejardin S, Michel C, Roucoux A, Crommelinck M, Orban GA: Neuronal mechanisms of perceptual learning: changes in human brain activity with training in orientation discrimination. Neuroimage 1999, 9:46-62.

38. Dolan RJ, Fink GR, Rolls E, Booth M, Holmes A, Frackowiak RS, Friston KJ: How the brain learns to see objects and faces in an impoverished context. Nature 1997, 389:596-599.

39. Gauthier I, Tarr MJ, Anderson AW, Skudlarski P, Gore JC: Activation of the middle fusiform ‘face area’ increases with expertise in recognizing novel objects. Nat Neurosci 1999, 2:568-573.

40. ! Furmanski CS, Schluppeck D, Engel SA: Learning strengthens the response of primary visual cortex to simple patterns. Curr Biol 2004, 14:573-578.
This study provides an elegant example of the link between experience-dependent behavioral improvement and neural plasticity in the primary visual cortex. In particular, after training with oblique orientation patterns, behavioral performance in the discrimination of these patterns improved and fMRI responses in V1 increased at magnitudes similar to those for cardinal orientation patterns before training.

41. ! Kourtzi Z, Betts LR, Sarkheil P, Welchman AE: Distributed neural plasticity for shape learning in the human visual cortex. PLoS Biol 2005, 3:e204.
The authors show that the human brain learns novel objects in complex scenes by reorganizing shape processing across visual areas, while taking advantage of natural image correlations and optimising neural coding for the task context.

42. Edelman S: Representation and Recognition in Vision. MIT Press; 1999.

43. Riesenhuber M, Poggio T: Hierarchical models of object recognition in cortex. Nat Neurosci 1999, 2:1019-1025.

44. Ullman S: High Level Vision. MIT Press; 1996.

45. Barlow H: The neuron doctrine in perception. In The Cognitive Neurosciences. Edited by Gazzaniga. MIT Press; 1995:415-435.

46. Fukushima K: Neocognitron: a self organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol Cybern 1980, 36:193-202.

47. Fiser J, Aslin RN: Unsupervised statistical learning of higher-order spatial structures from visual scenes. Psychol Sci 2001, 12:499-504.

48. Chun MM: Contextual cueing of visual attention. Trends Cogn Sci 2000, 4:170-178.

49. Simoncelli EP: Vision and the statistics of the visual environment. Curr Opin Neurobiol 2003, 13:144-149.

50. ! Messinger A, Squire LR, Zola SM, Albright TD: Neural correlates of knowledge: stable representation of stimulus associations across variations in behavioral performance. Neuron 2005, 48:359-371.
This study provides evidence for a novel class of inferotemporal neurons that represent arbitrary associations between pairs of stimuli independent of whether the animal chooses the correct remembered visual stimulus or not. These findings suggest that knowledge in the context of memorized associations is represented in the inferior temporal cortex regardless of its role and weight in behavioral decisions.

51. Messinger A, Squire LR, Zola SM, Albright TD: Neuronal representations of stimulus associations develop in the temporal lobe during learning. Proc Natl Acad Sci USA 2001, 98:12239-12244.

52. Rosenthal O, Fusi S, Hochstein S: Forming classes by stimulus frequency: behavior and theory. Proc Natl Acad Sci USA 2001, 98:4265-4270.

53. ! Freedman DJ, Riesenhuber M, Poggio T, Miller EK: Experience-dependent sharpening of visual shape selectivity in inferior temporal cortex. Cereb Cortex 2005. In press.
This neurophysiological study provides evidence for enhanced performance in object learning in the context of a categorization task linked to improved neural selectivity in IT. Interestingly, these learning effects were specific to the trained object orientation and were observed not only for objects with which the monkeys were explicitly trained but also for stimuli to which the monkeys were passively exposed. These findings suggest that similar neural plasticity mechanisms that result in the sharpening of neural responses might underlie learning through supervised training or passive experience.

54. Wallis G, Bulthoff HH: Effects of temporal association on recognition memory. Proc Natl Acad Sci USA 2001, 98:4800-4804.

55. Stringer SM, Perry G, Rolls ET, Proske JH: Learning invariant object recognition in the visual system with continuous transformations. Biol Cybern 2006, 94:128-142.

56. DiCarlo JJ, Maunsell JHR: Anterior inferotemporal neurons of monkeys engaged in object recognition can be highly sensitive to object retinal position. J Neurophysiol 2003, 89:3264-3278.

57. Kourtzi Z, Shiffrar M: One-shot view invariance in a moving world. Psychol Sci 1997, 8:461-466.

58. ! Cox DD, Meier P, Oertelt N, DiCarlo JJ: ‘Breaking’ position-invariant object recognition. Nat Neurosci 2005, 8:1145-1147.
In normal, real-world viewing, the same object is seen at different positions across short time intervals as the eyes rapidly explore the visual world. To test the idea that this experience might underlie the position tolerance of recognition, the authors introduced undetected changes in object identity during eye movements. This manipulation disrupted the position tolerance of object recognition in healthy adult observers. This shows that position tolerance is plastic in adults, and suggests that even this bedrock property of primate object recognition might be acquired or maintained by natural visual experience.

59. Wang G, Obama S, Yamashita W, Sugihara T, Tanaka K: Prior experience of rotation is not required for recognizing objects seen from different angles. Nat Neurosci 2005, 8:1768-1775.

60. Rolls ET, Aggelopoulos NC, Zheng F: The receptive fields of inferior temporal cortex neurons in natural scenes. J Neurosci 2003, 23:339-348.

61. Sheinberg DL, Logothetis NK: Noticing familiar objects in real world scenes: the role of temporal cortical neurons in natural vision. J Neurosci 2001, 21:1340-1350.

62. Gold J, Bennett PJ, Sekuler AB: Signal but not noise changes with perceptual learning. Nature 1999, 402:176-178.

63. Brady MJ, Kersten D: Bootstrapped learning of novel objects. J Vis 2003, 3:413-422.

64. Dosher BA, Lu ZL: Perceptual learning reflects external noise filtering and internal noise reduction through channel reweighting. Proc Natl Acad Sci USA 1998, 95:13988-13993.




65. Li RW, Levi DM, Klein SA: Perceptual learning improves efficiency by re-tuning the decision ‘template’ for position discrimination. Nat Neurosci 2004, 7:178-183.

66. ! Eckstein MP, Abbey CK, Pham BT, Shimozaki SS: Perceptual learning through optimization of attentional weighting: human versus optimal Bayesian learner. J Vis 2004, 4:1006-1019.
This paper introduces a new experimental paradigm into the study of perceptual learning by comparing human and optimal Bayesian learners. The results suggest that humans learn to localize targets with uncertainty about orientation and polarity within a brief sequence of trials, but they rely more on previous decisions than on feedback, resulting in slower and more incomplete learning than that of an ideal observer.

67. Barlow H: Conditions for versatile learning, Helmholtz’s unconscious inference, and the task of perception. Vision Res 1990, 30:1561-1571.

68. Cox D, Meyers E, Sinha P: Contextually evoked object-specific responses in human visual cortex. Science 2004, 304:115-117.

69. Bar M, Aminoff E: Cortical analysis of visual context. Neuron 2003, 38:347-358.

70. Zoccolan D, Cox DD, DiCarlo JJ: Multiple object response normalization in monkey inferotemporal cortex. J Neurosci 2005, 25:8150-8164.

71. Aggelopoulos NC, Rolls ET: Scene perception: inferior temporal cortex neurons encode the positions of different objects in the scene. Eur J Neurosci 2005, 22:2903-2916.

72. Crist RE, Li W, Gilbert CD: Learning to see: experience and attention in primary visual cortex. Nat Neurosci 2001, 4:519-525.

73. Schoups AA, Vogels R, Orban GA: Human perceptual learning in identifying the oblique orientation: retinotopy, orientation specificity and monocularity. J Physiol 1995, 483:797-810.

74. Schwartz S, Maquet P, Frith C: Neural correlates of perceptual learning: a functional MRI study of visual texture discrimination. Proc Natl Acad Sci USA 2002, 99:17137-17142.

75. Crist RE, Li W, Gilbert CD: Learning to see: experience and attention in primary visual cortex. Nat Neurosci 2001, 4:519-525.

76. ! Sigman M, Pan H, Yang Y, Stern E, Silbersweig D, Gilbert CD: Top-down reorganization of activity in the visual pathway after learning a shape identification task. Neuron 2005, 46:823-835.
The authors report large-scale reorganization across cortical networks for task-dependent learning. Learning to detect a target among distractors in the context of a visual search task resulted in increased activations in retinotopic areas. By contrast, decreased activations were observed both in higher occipitotemporal areas involved in shape analysis and in frontal and parietal areas involved in attentional processing.

77. !! Ahissar M, Hochstein S: The reverse hierarchy theory of visual perceptual learning. Trends Cogn Sci 2004, 8:457-464.
This review highlights psychophysical and physiological evidence that learning begins at higher stages of visual analysis and proceeds in a top-down manner to earlier stages when finer analysis of the stimulus is necessary.

78. Buchel C, Coull JT, Friston KJ: The predictive value of changes in effective connectivity for human learning. Science 1999, 283:1538-1541.

79. McIntosh AR, Rajah MN, Lobaugh NJ: Interactions of prefrontal cortex in relation to awareness in sensory learning. Science 1999, 284:1531-1533.

80. Li W, Piech V, Gilbert CD: Perceptual learning and top-down influences in primary visual cortex. Nat Neurosci 2004, 7:651-657.

81. Sagi D, Tanne D: Perceptual learning: learning to see. Curr Opin Neurobiol 1994, 4:195-199.

82. Sigman M, Gilbert CD: Learning to find a shape. Nat Neurosci 2000, 3:264-269.

