Shao and Shen - 2023 - How can artificial neural networks approximate the
COPYRIGHT
© 2023 Shao and Shen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

Introduction

Studies in ANNs have been the focus of contemporary society since the ImageNet competition in visual object recognition was won by a deep neural network (Kietzmann et al., 2019; Kriegeskorte and Golan, 2019). Engineers dream of pursuing a class of brain-inspired machines. The situation of robots superseding humans in many functions may soon be a reality. Meanwhile, neurobiologists and psychologists wonder about progress in neural networks due to the great differences between ANNs and biological brains (Crick, 1989; Carlson et al., 2018).
This article reviews the contemporary theories and technical advances in three related fields: ANNs, neuroscience and psychology. ANNs were born mainly from more than two thousand years of mathematical theories and algorithms; over the past two hundred years, neuroscience has revealed more truths about the mysteries of the brain; and psychology has just passed its 143rd anniversary of accumulating conceptions about conscious and unconscious cognitive processes. Cognitive sciences, encapsulating the disciplines of both natural and artificial intelligence, were founded in the mid-1970s to inquire into the mystery of intelligent substitutes, including humans, animals and machines. Although progress in these related fields is promising, a complete brain-inspired intelligent machine remains far from the human brain. How can we reduce the distance between engineers' dreams and reality? Some suggestions for brain simulation will be offered in this paper.
Three generations of ANNs

Artificial neural networks simulate brain intelligence by mathematical equations, software or electronic circuits. As the first constituent sub-discipline of cognitive sciences, ANNs already have a 79-year history since the conception of the "neural unit" and "Hebbian synapse" as well as the first model of the perceptron neural network in the 1940s–1950s (McCulloch and Pitts, 1943; Hebb, 1949; Rosenblatt, 1958). We review the developmental course of the discipline and divide its history into three stages that represent the view of the human brain and intelligence from the perspective of mathematics. The first generation of ANNs is linear logic networks; the second generation is connectionist networks, including parallel distributed processing (PDP) and deep neural networks (DNNs); and the third generation is spiking neural networks (SNNs).

The first generation of ANNs: Linear logic networks

The first sentence in the first ANN paper was "Because of the 'all-or-none' character of nervous activity, neural events and the relations among them can be treated by means of propositional logic" (McCulloch and Pitts, 1943). The authors then described ten theorems and emphasized the calculus principle and academic significance in the following sentence in the last part of the paper: "Specification for any one time of afferent stimulation and of the activity of all constituent neurons, each an 'all-or-none' affair, determines the state." It is worth paying attention to the keywords in the cited sentence, all-or-none and determines, which mean that perception is associated with the statistical separation of two-state affairs in a deterministic system. The second classical ANN paper stated, "When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased" (Hebb, 1949).

The first ANN perceptron model was a linear network model that comprised A units and R units. Eleven equations were used to analyze the model's predominant phase through six parameters clearly defining physical variables that were independent of the behavioral and perceptual phenomena. As a result, the model was able to predict learning curves from neurological variables and likewise to predict neurological variables from learning curves. The author assertively wrote, "By the study of systems such as the perceptron, it is hoped that those fundamental laws of organization which are common to all information handling systems, machines and men included, may eventually be understood" (Rosenblatt, 1958). The classical writer was clearly too optimistic about his linear network model: in fact, the linear model could not solve mathematical puzzles such as the XOR partition. Therefore, a monograph (Minsky and Papert, 1969) claimed assertively that the investigation of linear networks must be only a game on paper. As a result, much foundation grant money was withdrawn, no longer supporting the projects.

The second generation of ANNs: Connectionist networks

2-G ANNs began to be developed in the mid-1980s and can be divided into two periods: parallel distributed processing (PDP) and deep learning. In addition to the demand for ANN development, the demand for debugging in artificial intelligence (AI) also promoted the renaissance of ANNs in the 1980s.

A few dedicated scientists persisted in their neural network programs after 1969, and their achievements provided a prelude to an ANN renaissance. For example, two neural network models, the discrete two-state model (Hopfield, 1982) and the continuous-response model (Hopfield, 1984), were published. A two-volume book about PDP (Rumelhart and McClelland, 1986), written by 16 experts as coauthors, spread quickly over the world. Many neural models and learning algorithms, such as the massively parallel network with hidden units (the Boltzmann machine; Ackley et al., 1985), error backpropagation (Rumelhart et al., 1986), competitive learning, neural Darwinism, adaptive systems, and self-organized learning, were created quickly during the 1980s–1990s. Meanwhile, connectionist modern supercomputers were developed, and very large-scale analog integrated circuits, analog neural computers, and applied electronic chips, such as electronic retina chips and see-hear chips, emerged.

It is worth discussing the backpropagation (BP) algorithm due to its broad applications in the field of ANNs. Rumelhart et al. (1986) described the learning procedure and abstracted a universal δ-learning rule. The results of implementing the learning algorithm demonstrate error gradient descent in weight-space, so that the network's output reaches its target value little by little, while the error derivatives (δ) between the output and the desired result propagate back along the gradient descent. In fact, the efficiency of the learning procedure is usually very low. For example, more than several hundred or even a thousand rounds of training are necessary for a very simple network with a hidden unit to solve the XOR function partition task. BP is not a nonlinear model; error backpropagation is just an implicit nonlinear mapping without a real feedback loop. The number of hidden nodes has to be determined by experience or testing, without a theoretical guideline. The authors wrote at the end of their paper that "the learning procedure, in its current form, is not a plausible model of learning in brains. However, …it is worth looking for more biologically plausible ways of doing gradient descent in neural networks."

The studies in DNNs were initiated by heuristic ideas: parallel distributed processing (PDP) and stochastic computing, for example, the Boltzmann machine. Although the initial steps were taken early in the 1980s, deep learning did not make a splash in the scientific community until AlexNet won the image competition in 2012. AlexNet was composed of 650,000 artificial neural units and consisted of five
convolutional layers, some of which were followed by max-pooling layers, and three fully connected layers with a final 1,000-way output. It successfully recognized and classified 1.2 million high-resolution images with 60 million parameters in the ILSVRC-2012 competition (Krizhevsky et al., 2012). Recently, a deep CNN-based Kaggle network won the contest to recognize real-time facial expressions at the 5th ICEEICT (Sultana et al., 2021).

Besides the multiple hidden layers of the network, a number of new algorithms have given DNNs strong support; for example, the convolutional algorithm, wavelets, and principal component analysis (PCA) have increased the accuracy of feature extraction and pattern classification. The role of traditional shallow models in pattern recognition has changed in recent years because of deep CNNs with strong learning ability, that is, deep learning-based object detection. The situation is based on some premises: the database of large-scale networks had been accumulated during the two ImageNet competitions, ILSVRC-2010 and ILSVRC-2012 (Deng et al., 2009); CNNs for speech recognition had been reported (Hinton et al., 2012); and a design of regions with CNN features had been successfully applied (Girshick et al., 2014). In short, CNNs can learn object representations without the need to design features manually and can straightforwardly process complex visual or auditory objects without dividing objects or extracting components. The deep learning-based object detection models and algorithms, covering different application domains, have been reviewed in detail (LeCun et al., 2015; Zhao et al., 2019; Naveen and Sparvathi, 2022). It is possible for the progress in studies of machine deep learning to promote ANNs serving social life and economic development. But many mysteries remain with regard to how CNNs learn this ability. One answer is that the BP learning procedure trains CNNs to approximate the desired output by the mechanism of error backpropagation (Lillicrap et al., 2020). But that article did not establish that the same mechanism also exists in the biological brain. Its authors cited a number of references belonging to studies in either neuroanatomy or neurophysiology, missing the data in studies of neurobiology or oscillatory encoding. For example, the theory of oscillatory encoding claims that information communication between brain areas might be implemented by the dynamic synchronization of different rhythmic activities among neural networks. GABA interneurons, by their inhibition, lock pyramidal neurons into coupling with a network oscillation, and neurogliaform cells (NGFCs) dynamically decouple neuronal synchrony between brain areas (Sakalar et al., 2022). It is interesting that action potentials of NGFCs decoupled pyramidal cell activity from cortical gamma oscillations but neither reduced their firing nor affected local oscillations. Thus, NGFCs regulate information transfer by temporarily disengaging the synchrony without decreasing the activity of the communicating networks. Such an attribute of NGFCs in the biological brain seems akin to backpropagation in ANNs or CNNs: it regulates the error gradient in weight-space in an implicit feedback fashion but does not disturb the feedforward information transfer of the network. The result of comparing the functional fashion of cortical NGFCs and backpropagation in ANNs supports the idea that a mechanism of implicit backpropagation exists in both the biological brain and ANNs.

Despite the great progress in ANNs, DNNs, and CNNs, as well as their rapid spread throughout the world, their weaknesses sometimes spoiled their reputation by spontaneously generating adverse effects, leading to some strange results. Fortunately, this phenomenon has already been solved. In addition, the many neural units and large numbers of parameters required by a deep neural network created a somewhat complicated problem, because they not only wasted energy but also prevented the application of DNNs to pragmatic problems. Spiking neural networks have an advantage in decreasing both the unit numbers and the energy needed (Figure 1).

The third generation of ANNs: Spiking neural networks

1,2-G networks are composed of a deterministic system in which information transmission is determined by presynaptic and postsynaptic factors. Neurobiological achievements in three research fields, namely studies on the ion channel conductance of the neuronal membrane (Williams, 2004), studies on the probability of transmitter release from the presynaptic membrane (Tsodyks and Markram, 1997; Branco and Staras, 2009; Borst, 2010), and studies on spike timing-dependent plasticity (STDP; Song et al., 2000; Nimchinsky et al., 2002; Zucker and Regehr, 2002; Caporale and Dan, 2008; Losonczy et al., 2008; Spruston, 2008; Jia et al., 2010; Mark et al., 2017), produced a new conception of synaptic transmission and have attracted many experts to explore temporal coincidence (Montague and Sejnowski, 1994; Sejnowski, 1995; Markram et al., 1997; Roelfsema et al., 1997), noise sources (Maass, 1997; Roberts, 1999; Hopfield and Brody, 2004; Buesing and Maass, 2010; McDonnell and Ward, 2011), and spiking neural networks (SNNs) and stochastic computing (Silver, 2010; Hamilton et al., 2014; Maass, 2014; Maass, 2015; Tavanaei and Maida, 2017) since the 1990s.

As shown in Figure 2, there are three categories of neuronal encoding: rate encoding, paired-pulse ratio (PPR) encoding and spike-time encoding. Rate encoding has been adopted to represent a neuron's level of excitation since the 1930s; PPR encoding and spike-time encoding have been discussed in the field of neurophysiology since the 1990s. PPR encoding has usually been used to classify neurons, whereas spike-time encoding employs the lengths of inter-spike intervals (Δt) to encode and transmit information (Sejnowski, 1995; Song et al., 2000; Nimchinsky et al., 2002; Zucker and Regehr, 2002; Caporale and Dan, 2008; Losonczy et al., 2008; Spruston, 2008; Jia et al., 2010; Mark et al., 2017). SNNs comprise neuromorphic devices in which information transmission is constrained by more than three factors. Dendrites, as the third factor, are added to pre- and post-synaptic components, and any synaptic state is constrained by the locally surrounding patch of postsynaptic
FIGURE 1
The linear division of the OR function (left) and the nonlinear partition of the XOR function (right). The XOR function is a propositional logic corresponding to the OR function. As shown in the left plot, the results of the OR function can be divided into two parts by a linear dividing line: the result is false (−, blue point) on the left side of the line, so long as the two variables x1 and x2 are both false (0,0); the results under the other three conditions (0,1; 1,0; 1,1) are true (+, red points) on the upper side of the line. The right plot shows the nonlinear partition surface of the XOR function: the results of the XOR function lie inside a surface bounded by a closed curve (red points) when the condition variables are, respectively, true (+,+) or false (−,−); the other two results (blue points) lie outside the partition surface when the condition variables are, respectively, −1, 1 or 1, −1.
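The separability contrast in Figure 1 can be checked in a few lines: a single linear threshold unit trained by the classic perceptron rule finds a dividing line for OR but can never realize XOR, whereas one hidden layer trained by the δ-rule (as discussed for second-generation networks) drives the XOR error down over many rounds. The network sizes, learning rates, and epoch counts below are illustrative assumptions, not values from the cited papers.

```python
import math
import random

INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
OR_T = [0, 1, 1, 1]
XOR_T = [0, 1, 1, 0]

# --- First generation: a single linear threshold unit ---
def predict(w, x):
    # w1*x1 + w2*x2 + bias > 0 -> 1, else 0 ("all-or-none").
    return 1 if w[0] * x[0] + w[1] * x[1] + w[2] > 0 else 0

def train_perceptron(targets, epochs=25, lr=0.1):
    """Classic perceptron learning rule; returns weights [w1, w2, bias]."""
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for x, t in zip(INPUTS, targets):
            err = t - predict(w, x)
            w[0] += lr * err * x[0]
            w[1] += lr * err * x[1]
            w[2] += lr * err
    return w

or_learned = [predict(train_perceptron(OR_T), x) for x in INPUTS] == OR_T
xor_learned = [predict(train_perceptron(XOR_T), x) for x in INPUTS] == XOR_T
print(or_learned, xor_learned)  # OR has a dividing line; XOR never does

# --- Second generation: a 2-2-1 network trained by the delta rule ---
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

random.seed(0)  # deterministic illustration
w_h = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
w_o = [random.uniform(-1, 1) for _ in range(3)]

def forward(x):
    h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_h]
    return h, sigmoid(w_o[0] * h[0] + w_o[1] * h[1] + w_o[2])

def sq_error():
    return sum((forward(x)[1] - t) ** 2 for x, t in zip(INPUTS, XOR_T))

error_before = sq_error()
lr = 0.5
for _ in range(5000):  # many rounds of training, as noted in the text
    for x, t in zip(INPUTS, XOR_T):
        h, y = forward(x)
        delta_o = (y - t) * y * (1 - y)  # error derivative at the output
        # delta propagated back to the hidden units along the gradient
        delta_h = [delta_o * w_o[i] * h[i] * (1 - h[i]) for i in range(2)]
        for i in range(2):
            w_o[i] -= lr * delta_o * h[i]
            w_h[i][0] -= lr * delta_h[i] * x[0]
            w_h[i][1] -= lr * delta_h[i] * x[1]
            w_h[i][2] -= lr * delta_h[i]
        w_o[2] -= lr * delta_o
error_after = sq_error()
print(error_before > error_after)  # gradient descent reduced the XOR error
```

Even on this four-pattern task, the squared error only falls after thousands of passes, matching the low training efficiency noted in the text.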
FIGURE 2
Three categories of neuronal encoding. (A) Rate encoding by the firing frequency of a neuron, demonstrating approximately 150, 300, and 450 Hz induced by different stimulus strengths (modified from Eccles, 1953). (B) Paired-pulse ratio (PPR) encoding and spike-time encoding, implemented, respectively, by the ratio of amplitude B to A and by the Δt of spike timing (modified from Zucker and Regehr, 2002, by permission).
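The three encoding categories in Figure 2 can be sketched as simple read-outs of a spike train; the spike times, amplitudes, and observation window below are illustrative values, not data from the figure.

```python
# A spike train represented as a list of spike times in milliseconds.
spikes = [5.0, 12.0, 18.0, 21.0, 40.0]

# Rate encoding: firing frequency over an observation window.
def rate_hz(spike_times, window_ms):
    return len(spike_times) / (window_ms / 1000.0)

# Paired-pulse ratio (PPR) encoding: amplitude of the second response
# divided by the first (facilitation if > 1, depression if < 1).
def ppr(amp_a, amp_b):
    return amp_b / amp_a

# Spike-time encoding: the inter-spike intervals (delta-t) carry the message.
def inter_spike_intervals(spike_times):
    return [b - a for a, b in zip(spike_times, spike_times[1:])]

print(rate_hz(spikes, 100.0))         # 50.0 spikes/s
print(ppr(1.0, 1.6))                  # 1.6 -> paired-pulse facilitation
print(inter_spike_intervals(spikes))  # [7.0, 6.0, 3.0, 19.0]
```

A rate code discards the intervals that a spike-time code keeps, which is why the three read-outs of the same train can carry different information.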
membrane, which contains hundreds of synapses. The pulses with the shortest Δt are identified as qualifying to play a role in information transmission.

Spiking neural networks are closer to the biological brain than 1,2-G networks are. SNNs do not need many units in the network architecture and save energy in contrast to DNNs. However, training SNNs remains a challenge because spike trains consist of discrete signals without differentiability, so the backpropagation algorithm cannot be used directly, as it is in DNNs. Recently, deep learning in SNNs has been reviewed in detail (Tavanaei and Maida, 2017), and a hybrid platform has been implemented by integrating DNNs with SNNs (Pei et al., 2019).

Theoretical weakness: From the energy function to the cost function

For brain-inspired ANNs, what is the dynamic source that drives state changes? Hopfield already answered this question with the energy function of the spin glass model, a solid-state physics model (Hopfield, 1982; Hopfield, 1984). Every unit of the system is a uniform attractor, and each unit's orientation and position are statistically determined by the attractor's surface energy and the interaction competition among attractors in the system state space. The energy is the dynamic source of the system and will trend toward global or local energy minimization. The system will reach the stationary state or convergent point when the energy function reaches its minimum. The spin-glass model is, however, too far from the energy metabolism of the biological brain. Brain neurons are polymorphic, and their morphologic appearance and location in the brain are determined by phylogenetic evolution and ontogenetic processes rather than being uniform and stochastically allocated. Brain energy metabolism usually occurs at the basic metabolic level during an organism's calm state, but it changes to global imbalance and the highest local level with cognitive activity. Therefore, the theoretical attractiveness of the energy function has decayed since the 1990s.

The concept of the cost function or loss function has recently been regarded as the dynamic basis of ANNs (Marblestone et al., 2016; Scholte et al., 2018; Richards and Lillicrap, 2019), but the definition differs among references, for example: "The idea of cost functions means that neurons in a brain area can somehow change their properties, e.g., the properties of their synapses, so that they get better at doing whatever the cost function defines as their role"; "A cost function maps a set of observable variables to a real value representing the 'loss' of the system"; "A cost function is defined as the composition of mathematical operations, e.g., mean squared error"; "A loss function provides a metric for the performance of an agent on some learning task. In a neural circuit, loss functions are functions of synaptic strength"; and "Cost is the partial derivative of the error with respect to the weight."

All of these definitions and terms are from mathematics or ANNs, with very few neuroscientific or psychological implications. How can the cost function be understood by neuroscientists and psychologists? The brain-inspired ANN is an interdisciplinary field, and the main theoretical conceptions should include neuroscientific and psychological implications, at least in terms of brain physiology. Cost, error, weight, partial derivative, mathematical operations, the credit assignment problem, etc.: how many of these come from systems neuroscience, integrative neurophysiology or cognitive psychology? As long as a primitive and basic conception of ANNs is far from the implications of the biological or human brain, it is difficult to establish an interdisciplinary field. Let us review the corresponding conceptions in neuroscience and psychology.

There are different modes of energy supply for different brain structures. The large numbers of parallel neural fibers in the cerebellum do not put on a myelin coat, so their energy supply comes only from their cell bodies; the neural fibers of the cerebrum, in contrast, can get an additional energy supply from myelin (neuroglial cells). BOLD signals represent hemoglobin-responsive changes in local brain blood microcircuits or neuroglial assemblies, because hemoglobin cannot directly reach any neuron. Each neuron in the neocortex has 3.72 glial cells, and each Purkinje cell (PC) in the cerebellum has only 0.23 glial cells (Azevedo et al., 2009). In fact, energy supply in the brain is implemented, respectively, by area, lamina, and column. We suggest that the energy supply in ANNs needs to be improved so that energy consumption can be saved.

Information processing in the brain

Although the conception of information processing did not appear until the 1940s, brain anatomy and neurophysiology had already become important by the end of the 19th century. Through study of the anatomy of the nervous system, scientists had mastered the knowledge of sensory-motor pathways and the visceral autonomic nervous system as well as brain functional localization. The classical neurophysiological theory, in which the brain was regarded as a reflex organ, was founded at the beginning of the 20th century. By the mid-20th century, the brain was regarded as an information-processing organ, since electrophysiological techniques provided scientific evidence of neuron firing and postsynaptic potentials (PSPs) in the 1930s–1960s. The brain is now considered an organ in which both neural and genetic information are processed; that is, the brain's long-term memory is the result of a dialog between synapse and gene (Kandel, 2001). In recent years, transcriptomic expression has been used to classify the cell types of neuron distribution in the neocortex (Kim et al., 2017; Boldog et al., 2018). The concept of neural information processing is understood from four perspectives: the principle of the simultaneous existence of digital encoding and analog encoding, the principle of the simultaneous existence of multiple processing processes and multiple information streams, the principle of circular permutation and coupling between electrical transmission and biochemical transmission in the processing of neural information, and the principle of relevance between neural and genetic information (Shen, 2017). Therefore, neural information processing is more complicated than any communication or industrial information. Of course, the brain-simulated parameters must be simplified to build an ANN model, but the primitive unit, the network architecture, and the operational dynamics should approximate the biological brain.
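The energy-function dynamics described above (Hopfield, 1982) can be sketched with a minimal discrete two-state network; the stored pattern, Hebbian-style weights, and network size are illustrative assumptions. Each asynchronous update either keeps or lowers the energy, so the state settles into an attractor:

```python
# Discrete two-state Hopfield network (units s_i in {-1, +1}).
# Energy: E = -1/2 * sum_ij w_ij * s_i * s_j, with symmetric w, zero diagonal.
def energy(w, s):
    n = len(s)
    return -0.5 * sum(w[i][j] * s[i] * s[j] for i in range(n) for j in range(n))

def update(w, s, i):
    """Asynchronous update of unit i; never increases the energy."""
    field = sum(w[i][j] * s[j] for j in range(len(s)))
    s[i] = 1 if field >= 0 else -1

# A 4-unit network storing the pattern (+1, -1, +1, -1) via Hebbian weights.
p = [1, -1, 1, -1]
n = len(p)
w = [[0 if i == j else p[i] * p[j] for j in range(n)] for i in range(n)]

s = [1, 1, 1, -1]  # a corrupted version of the stored pattern
energies = [energy(w, s)]
for _ in range(3):  # a few sweeps of asynchronous updates
    for i in range(n):
        update(w, s, i)
        energies.append(energy(w, s))

print(s)         # settles back to the stored pattern
print(energies)  # a monotonically non-increasing sequence
```

Running more sweeps cannot raise the energy again, which is the sense in which the energy function is the dynamic source driving state changes toward a stationary, convergent point.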
Two types of primitive unit vs. a uniform excitatory unit

A series of discoveries in studies of the spinal reflex at the beginning of the 20th century, such as spatial and temporal summation, convergence, divergence, fractionation, synaptic retardation, the final common path, and reciprocal innervation, accounted for the interaction between excitatory and inhibitory processes (Sherrington, 1906). In particular, the conditioned reflex theory emphasized internal inhibition as a function of the cerebral hemispheres (Pavlov, 1927). Inhibitory neurons were not found until electrophysiological techniques and electron microscopy were used. A model of the soma and large dendritic fields of a cat's spinal motor neurons was published, showing that any nerve cell is encrusted with hundreds of thousands of synapses with a mean diameter of approximately 1.0 μm (Haggar and Barr, 1950). Even when six different synaptic appearances were demonstrated, synaptic features could not credibly be used to discriminate whether a synapse is excitatory or inhibitory (Whittaker and Gray, 1962). The features of inhibitory synapses were judged by inhibitory postsynaptic potentials (IPSPs), and many more feedforward inhibitions, such as presynaptic inhibition and collateral inhibition, were found in brain networks, including the cerebral cortex, cerebellar cortex, and hippocampus (Eccles et al., 1954; Eccles, 1964).

Studies in brain chemical pathways in the 1960s–1970s found several categories of inhibitory neurons on the basis of their released transmitter molecules, such as gamma-aminobutyric acid (GABA), glycine, and serotonin, and found that the inhibitory neurons in the neocortex are composed mainly of GABAergic neurons. The neurons in the neocortex can be divided into two categories: interneurons, which make local connections, and projection neurons, which extend axons to distant intra-cortical, sub-cortical and sub-cerebral targets. Projection neurons are excitatory, synthesizing glutamatergic transmitters, have a typical pyramidal morphology, and transmit information between different regions of the neocortex and to other regions of the brain (Paredes et al., 2016; Cadwell et al., 2019).

Molecular neurobiology using transcriptomic techniques investigates the stereotyped distributions of mainly inhibitory cells in the brain and classifies three classes of inhibitory neurons: parvalbumin-positive interneurons (PV+ neurons), somatostatin-positive interneurons (SST+ neurons) and vasoactive intestinal polypeptide-positive interneurons (VIP+ neurons; Kim et al., 2017). Recently, biological markers for GABAergic neurons in immunocytochemistry and molecular neurobiology have enabled direct classification into three categories: excitatory, inhibitory and non-neuronal cells, such as glial cells. By this method, a group of human interneurons with anatomical features in neocortical layer 1, with large 'rosehip'-like axonal boutons and compact arborization (Boldog et al., 2018), was discriminated that had never been described in rodents.

Studies in GABAergic neurons found that the proportion of GABAergic neurons is approximately 15% of the population in all cortical areas in rats, whereas in primates the proportion reaches 20% in the visual cortex and 25% in the other cortex. The numbers of inhibitory interneurons have increased during phylogenetic evolution, along with the appearance of unique interneuron subtypes in humans (Wonders and Anderson, 2006; Petanjek et al., 2008). In contrast to the prenatal development of excitatory neurons in the human cortex, interneuron production, at least in the frontal lobe, extends through 7 months after birth. GABA concentrations in the occipital cortex of adult subjects, as measured by magnetic resonance spectroscopy, are relatively stable over periods as long as 7 months (Near et al., 2014). In evolutionary history, the cortical projection neurons derive from the dorsal (pallial) telencephalon and migrate radially into the cortical mantle zone; the interneurons generated in the sub-pallium migrate tangentially across areal boundaries of the developing cerebral cortex into the overlying cortex (Mrzljak et al., 1992; Petanjek et al., 2009). Inhibition shapes cortical activity, and inhibitory rather than excitatory connectivity maintains brain functions (Isaacson and Scanziani, 2011; Mongillo et al., 2018; Fossati et al., 2019). There are large numbers of interneuron types with different morphological, electrophysiological, and transcriptomic properties in the human neocortex (Wonders and Anderson, 2006; Petanjek et al., 2008; Kim et al., 2017; Boldog et al., 2018).

Artificial neural networks have oversimplified the inhibitory unit as a supplementary variant of the excitatory process or a dependent variant of the activation function. ANN experts claim that "inhibition need not be relayed through a separate set of inhibitory units" (Kietzmann et al., 2019). Actually, "in the absence of inhibition, any external input, weak or strong, would generate more or less the same one-way pattern, an avalanche of excitation involving the whole population" (Buzsaki, 2006). ANNs might instead be composed of different categories of units whose connection strengths with their target units shift over different weight ranges, 0 to 1 or 0 to −1. As a result, there would be two types of primitive unit: excitatory and inhibitory units. Each category of units could be further divided into different types by the threshold activated in different encodings, for example, sensitivity to spike timing: lower sensitivity (Δt < 10 ms) or higher sensitivity (Δt < 6 ms). The unit types could also be classified by the pulse frequency to be evoked or by the PPR standard.

Hierarchical small-world architecture with a circular vs. a uniform feedforward network

The structural basis of the brain and spinal reflex is the reflex arc, which comprises afferent, efferent and neural centers. The dual neural processes, the excitatory and inhibitory processes, run in two directions in the reflex arc: along a centripetal pathway from the sensory organs, reaching the corresponding brain sensory center, and along a centrifugal pathway returning from the brain to the peripheral effectors. The small-world network comprises a few components, and
each component contains a few constituent parts. The biological reflex arc comprises three components, and each component contains 3–5 neurons, for example, in a monkey's food-fetch reflex pathway (Thorpe and Fabre-Thorpe, 2001). The centripetal pathway is an input or afferent component that comprises three neurons from the retina and lateral geniculate body to the primary visual cortex (V1), producing a visual sensation of the stimuli; the bottom-up stream is a perceptual mechanism from V1 to the anterior inferior temporal gyrus, producing visual perception by the ventral visual pathway; the prefrontal cortex to the premotor area of the cortex forms the decision-making mechanism; and the centrifugal pathway is the efferent or output component from the motor cortex (MC) to the spinal cord and finger muscles. In short, the food-fetch reflex is not only implemented by four hierarchical networks with small-world architecture, but there are also many inhibitory modulating mechanisms: a self-feedback loop, the extrapyramidal system, the descending nonspecific reticular system and the cerebellum. The classical specific brain pathways are small-world architectures that have the attributes of both random rewiring and regular local organization; their synaptic path lengths are both the shortest for random networks and the longest for local networks (Watts and Strogatz, 1998; Buzsaki, 2006). Therefore, the efficiency of brain networks is excellent, with both high-speed information processing and saving of network resources.

On the contrary, DNNs are one-way feedforward networks and usually comprise an input layer, hidden layers, and an output layer, in which the highest layer does not need to be discriminated and all units are uniform at the initial state or before the training period. The information stream moves in a feedforward or downward manner from the input layer to the output layer. Sometimes, the convolutional algorithm is implemented over all input sets, the hidden layers comprise more than tens of layers, and full connections among all units are implemented between layers.

Therefore, we suggest that the network architecture in ANNs needs to be improved. As a commenter on an early version of this manuscript noted: "Current ANNs largely ignore anatomical organizations: topographic mapping, precise wiring between brain areas, layers, and cell types. This specific wiring bestows huge computational power with a minimal request of energy consumption. For example, the wiring of the olfactory system is very different from that of the visual system. Similarly, the wiring of the sensory system differs drastically from that of the motor system."

Biofeedback everywhere vs. only error backpropagation

In the biological brain, biofeedback can be found in reflex arcs, synapses, dendrites, axons, presynaptic membranes, biochemical transmission, etc. In the centrifugal pathway, recurrent inhibition in spinal networks is usually mediated by another inhibitory neuron, for example, Renshaw cells or their gamma loop between alpha and gamma motor neurons in the spinal reflex arc (Eccles, 1973, 1989). In the sensory-perceptual system, neural information is transformed along the centripetal pathway to the primary area of the sensory cortex, and the bottom-up information stream reaches the highest area of the perceptual cortex. In addition, there are many feedback or top-down streams from the highest perceptual or memory areas, and there are concurrent thalamo-cortical connections among the thalamus, cortical layer IV, and cortical layer VI back to the thalamus (Kamigaki, 2019; Egger et al., 2020). This phenomenon is not consistent with the cognitive model rule that a unit can be connected to a unit in a higher layer or to a unit in the same layer but not to a unit in a lower layer (Feldman and Ballard, 1982; Cummins and Cummins, 2000). Recently, some new facts have been reported, such as dual whole-cell recordings from pyramidal cells in layer V of the rat somatosensory cortex, revealing an important mechanism for controlling cortical inhibition and mediating slow recurrent inhibition by SST+ neurons (Deng et al., 2020). In sensory experience and perceptual learning, input from the higher-order thalamus increases the activity of PV+ neurons and VIP+ neurons and decreases the activity of SST+ neurons, resulting in the disinhibition of pyramidal cells in the sensory cortex and the LTP effect. Contextual feedback from the higher-order thalamus is helpful in processing first-order sensory information (Williams and Holtmaat, 2019).

In the mechanism of biochemical transmission of neural information, the re-uptake and auto-receptor proteins in the presynaptic membrane receive or reabsorb the released transmitter. Reverse messengers such as NO and CO molecules released by postsynaptic neurons can quickly suppress the transmitter release of presynaptic neurons.

To sum up, we hope that feedback or recurrent inhibition will be set everywhere throughout the whole network, instead of only error backpropagation.

The multiple learning mechanisms vs. a uniform weight switch

The fixed reflex arc in the biological brain was established by phylogenetic evolution and provides a basis for the unconditioned reflex; temporary connections between neural centers are the basis of the conditioned reflex. How can the temporary connection be established by training or experience? The unconditioned stimulus (US), for example, food, water or a sexual partner,
induces a stronger excitatory process in the corresponding brain
The nervous system is a typical servo system with many center; the conditioning stimulus (CS) at first is an unrelated
biofeedback mechanisms that exist everywhere in brain networks neutral stimulus that usually induces a weaker excitatory process
but are not like error backpropagation in ANNs. Biofeedback can in the corresponding brain center. Temporary connection is the
result of the attraction of a stronger center to a weaker center Uniquely executive control mechanism
(Pavlov, 1927). Therefore, the CS must be presented first, and the vs. programing action command
US must follow in an interval of less than a couple of seconds. The
conditioned reflex has been called classical conditioning as the So fa, there are 3–4 theories on the executive control
physiological basis of unsupervised learning, in which increased mechanism in psychology: the unity and diversity model (Miyake
connectionist strength takes place mainly among the neurons in et al., 2000), functional network of prefrontal cortex (Miller and
the cerebral cortex. For supervised learning, a goal or a standard Cohen, 2001) and the distributed executive control (van den
as the supervisor must be set in advance. The response should Heuvell and Sporns, 2013). The unity and diversity model is also
be aimed at an exact goal with a quick reaction time or with the called “three factor model of update, inhibition and shifting or
refined skill that demands the inhibited actions. The neural cognitive flexibility,” because the executive control is defined as the
connection between the cerebrum and cerebellum is the neural cognitive behavior containing updating of working memory,
basis of supervised learning. The short−/long-term plasticity of inhibition of unnecessary actions and frequently changing
the synapse in the sensory-perceptual neocortex is the basis of situating demand.
perceptual learning. The executive control behavior is required to have mid-grade
The causal role of learning is the coincidence in the tempo- of inter-correlation among three experimental indices, so that it
spatial space between stimuli, inducing changes in synaptic presents both the independent and common property of the
plasticity. The temporal-difference learning algorithm has been behavior (Miyake et al., 2000). According to updating of working
tested, and an essential process in learning and memory is the memory, subject is demanded to say the picture name which is
transformation between spatiotemporal information in the brain demonstrated in the last one of a sequence of 60 pictures, but the
(Fitzsimonds and Poo, 1998; Bil and Poo, 2001). sequence will be stopped randomly from time to time. So, the
Two kinds of synapses were identified as the cellular subject has to update his or her working memory without stopping
mechanism of animal learning in the 1980s, while the biochemical during the process of the sequence running. Inhibition infers to
mechanism of learning has been recognized as the molecular eliminate the interfere for the performance running or inhibits
configuration of protein: NMDA-receptor protein and adenylate undesired responses. Stroop color words interfere paradigm is
cyclase (Abrams and Kandel, 1988). Two kinds of synapses infer usually adopted. A “blue” word written by red color and a “red”
an activity-dependent synapse and a reinforcement synapse word written by blue color are inconsistent interfere items, the
between pre- and post-synaptic components. The activity- subjects are demanded to press a button according semantic blue,
dependent synapse takes place between two presynaptic inhibiting the response to blue color. Shifting or cognitive
components, and the facilitated component then acts on the third flexibility confers that the experimental task will be randomly
postsynaptic neuron, providing the conditioned response (CR). changed. For example, 2 digits are successively displayed on a
The reinforcement/rewarding synapse brings about the excitatory screen, then the computational demand is shown either plus
process in postsynaptic neuron, while the CS and US signals are or minus.
transferred on the postsynaptic neuron that is facilitated by the The experimental models may not completely present human
two presynaptic components. social intelligence. For example, Chinese proverb “show
Artificial neural networks does not change their resourcefulness in an emergency” or “turned on by danger” means
methodological or simulating strategy for machine learning, that working memory can reproduce a creative idea, during
although a part of researchers in ANNs field has accepted the dangerous situation.
hypothesis on three types of learning (Doya, 2000). There is The small-world architecture may not be available for the
difficult for them to processing the differences among different social intelligence of the prefrontal cortex, even though it is
types of learning. We suggest to regulate, respectively, the important for brain network of basically cognitive process. A
mode of energy supply for different learning network. The key single cell axon tree of pyramidal neuron has been shown with
neurons of supervised learning are Purkinje cells in cerebellar, thick interlaces and even with 60 thousand branches (Buzsaki,
each cell with a rich dendritic tree in a plane, and receive 2006; Boldog et al., 2018). The prefrontal cortex is the core
information from a lot of parallel nerve fibers uncovered structure to collect different information from other brain areas
myelin coat. The nerve fibers do not require additional energy and efficiently processing the complicate information as well as
supply from neuroglia cell (myelin coat). The electric source controlling the related behavior. A lot of evidence from monkey
should be a stable lower power for supervised leaning circuit; and patients with brain injury demonstrates that prefrontal cortex
on the contrary to the ANN circuit for the unsupervised contains different switches to control proactive, retroactive
learning. The electric source of the circuit should be a little behavior (Hikosaka and Isoda, 2010), many local microcircuits to
higher power and can be quickly changed in a range, because play a role in cognitive flexibility or shifting, and salience network
pyramidal neuron in neocortex with a long axon and axonal to regulate interchanging between default mode network and
tree in 3-diminssion space. Each pyramidal neuron has about central executive network (Tang et al., 2012; Goulden et al., 2014;
4 glial cells who work as both a myelin and additional energy Satpute and Lindquist, 2019). The network hubs in the human
supplier during the process of transmitting neural pulses. brain has been reported by the distributed executive control
theory(van den Heuvell and Sporns, 2013), the network hubs are Population sparse encoding
different from the other brain structures, there are the compact The human brain comprises 1011 cells (approximately 160
white matter in macroscopy and finely structured nerve fiber with billion), and the cerebral cortex comprises 82% of the total brain
high density spinous process in microscopy. It is the structure mass; however, only 19% of all neurons are located in the cerebral
with the qualify to implement integrating information and cortex, which is mainly responsible for intelligence (Azevedo
controlling behavior. To date, the axon trees and their et al., 2009). Each brain function is implemented only by a part of
microcircuits as well as long-range output connectionist pathways the neurons; therefore, the human brain implements most
in the prefrontal cortex has been believed to be the prerequisite intelligent activities by population sparse encoding.
of executive control ability. Especially, medial prefrontal cortex
(mPFC) contains much more GABA-inhibitory neuron that Oscillatory encoding
collect a lot of information from the other neurons, such as Ach Spontaneous oscillatory encoding is a prerequisite of normal
neuron in anterior brain, 5-serotonin neuron in thalamus, brain function because it provides optimal arousal or vigilance
sensory cortex, and limbic-hippocampus system as well as under the mechanism of the nonspecific reticular system.
memory system (Jayachandran et al., 2019; Nakajima et al., 2019;
Ren et al., 2019; Sun et al., 2019; Weitz et al., 2019; Onorato et al., Infra-slow oscillation
2020). There are uniquely two kinds of neuron in PFC and mPFC: The spontaneous fluctuation in brain BOLD signals with
GABA-ergic chandelier cells (ChaCs) and von Economo infra-slow oscillation <0.1 Hz concerns not only the default mode
neurons(VENs). ChCs are the only interneuron subtype that networks(DMN) but also the cerebellum and salience network
selectively innervates the axon initial segment of pyramidal (Raichle et al., 2001; Hikosaka and Isoda, 2010; Raichle, 2010,
neurons in frontal cortex (Tai et al., 2019); there are the higher 2015), which remain to be investigated further. ISA is a distinct
density of VENs in the elderly above 80 year old with higher neurophysiological process that reflects BOLD signals in fMRI,
intelligence (DeFilepe et al., 2002; Evrard, 2018; Gefen and its spatiotemporal trajectories are state dependent (wake
et al., 2018). versus anesthesia) and distinct from trajectories in delta (1–4 Hz)
The programing behavior sequences written by program activity with the electrical property (Tang et al., 2012; Goulden
editor cooperate robots, even though a guiding robot can et al., 2014; Satpute and Lindquist, 2019).
performances the act of etiquette and answers question from
customer. There are not creative idea or contingency approaches.
That an auto drive bicycle runs on the street and avoids obstacles Stochastic computation
presents only human’s sensory-perception-motor ability but not
general intelligence. Spike time encoding
Even though the principal neurons in the classical specific
nervous system have many dendritic branches and an obvious
Intelligence is a laminar distribution, their axon has only a few buttons, usually
neuro-computational system fewer than 15–20 (Branco and Staras, 2009; Borst, 2010; Silver,
2010). The neurons seem to meet the spike time encoding by their
Intelligence was regarded as the ability to compute discrete dendrites as the third factor in synaptic transmission (Borst, 2010;
physical symbols when artificial intelligence was founded in 1956; Scholte et al., 2018).
PDP was regarded as neural computation when the second
generation of ANNs was propagated in 1986. Here, intelligence Stochastic encoding
might be a compound of multiple computing processes rather than Apart from dendritic algorism as the third factor of synaptic
any single computing process. Intelligence is a neuro-computational transmission, there are more factors that make up pyramidal neurons
system, containing the following computations: deterministic as a completely stochastic computational component: dendritic tree,
computation, stochastic computation, chaotic computation, and axonal tree, laminar distributing effect, interlaced microcircuit
the other computation can be implemented by the human brain. interaction, stochastic noise and short-term synaptic plasticity. The
multiple factors form hundreds of micrometers of interneuron space
to assemble microcircuits (Bil and Poo, 2001; Fossati et al., 2019).
Deterministic computation
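The contrast between deterministic rate encoding and stochastic encoding can be made concrete with a minimal, purely illustrative sketch (our own toy example, not a model from the literature reviewed here): a Poisson spike generator in which the same input rate yields a different spike train on every trial, while the trial-averaged firing rate remains stable.

```python
import random

def poisson_spike_train(rate_hz, duration_s, dt=0.001, seed=0):
    """Toy stochastic (Poisson) spike encoder: in each time bin of
    width dt, a spike occurs with probability rate_hz * dt.
    Identical inputs therefore yield different spike trains across
    trials, unlike a deterministic rate code."""
    rng = random.Random(seed)
    n_bins = int(round(duration_s / dt))
    return [round(i * dt, 3) for i in range(n_bins)
            if rng.random() < rate_hz * dt]

# Two trials driven by the same 20 Hz input produce different spike
# times, while each trial's mean rate stays close to 20 Hz.
trial_a = poisson_spike_train(20.0, 5.0, seed=1)
trial_b = poisson_spike_train(20.0, 5.0, seed=2)
mean_rate_a = len(trial_a) / 5.0
```

The point of the sketch is only that trial-to-trial variability and a stable average rate can coexist, which is the property the stochastic-encoding view attributes to cortical pyramidal neurons.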
Chaotic computation

Chaotic systems show seemingly disordered behavior, yet obey some laws that completely describe their motion. There are three computational indices (Faure and Korn, 2001; Tsuda, 2015; Chao et al., 2018; Sathiyadevi et al., 2019): the Lyapunov exponent (λ), Kolmogorov-Sinai entropy (K), and strange attractors. λ is positive if the motion is chaotic; λ equals zero if the two trajectories remain separate; λ is a constant amount if the trajectories are periodic. K equals zero if the system is periodic; it increases without interruption if the system is stochastic; and it increases to a constant value if the system is chaotic. Strange attractors are fractal objects, and their geometry is invariant against changes in scale or size.

Adaptive computation

Adaptive computation is a category of biological computation formed in biological evolution. An individual changes him- or herself to meet context-dependent multitask demands.

Adaptive encoding

The prefrontal cortex coordinates cognition, emotion, interpersonal communication, coping with situations, and other complicated mental activities. Adaptive encoding is an algorithmic principle that meets multiple demands for making decisions, planning, monitoring, error finding and revising in a wide range of tasks and successfully implements goal-directed behavior. The encoding feature is expressed mainly in context-dependent computation with chronometric and dynamic adaptation. Its algorithm is multivariate pattern analysis and multi-voxel pattern analysis via principal component analysis.

The emergent computation: A unique basis of human intelligence

Emergent computation is not completely new but has new implications. "Emergent collective computational abilities" appeared in the title of Hopfield (1982). In the present article, emergent computation refers to a functional model of creative intelligence in the human brain that has higher energy efficiency, or uses less energy to obtain a stronger effect in nervous information processing.

Although dynamic coding (Stokes et al., 2013), abstract quantitative rules (Eiselt and Nieder, 2013), context-dependent computation (Mante et al., 2013), mixed selectivity (Rigotti et al., 2013), content-specific codes (Uluç et al., 2018), reconfiguration of responses (Jackson and Woolgar, 2018), situational processing (Lieberman et al., 2019), and so on have been reported, adaptive encoding cannot cover the crucial property of human social intelligence: creativity. Human beings not only adapt passively to situations but also actively create human society and themselves (Eccles, 1989; MacLean et al., 2014; MacLean, 2016). Therefore, a new mathematical computation that represents actively unique creativity in natural and social environments has existed in the human brain for more than ten thousand years. It might represent the emergent computation.

The efficiency with which brain energy cost is exchanged for an amount of information processing, that is, the partial derivative of the amount of information with respect to energy cost, might be a basic parameter of brain evolution and brain function. A new concept of emergent computing based on the efficiency of brain information processing is suggested as an approach to an interdisciplinary theory.

The first principle of emergent computation is the active or initial desire that drives a human to create. The second is higher energy efficiency, meaning the trade of more information for a lower energy cost. The third principle implicates the premise of creative intelligence, and the fourth principle encompasses the chronometric effect and the spatiotemporal conversion effect. Because energy metabolism inside the human brain is a slow process (second scale), while nervous information changes or external environmental changes are usually rapid processes (millisecond scale), chronometric mechanisms and spatiotemporal conversion are necessary. The fifth principle is recruiting or reusing neurons or physical resources. It not only contains multiple encodings but also may enable more advanced computation that has not yet been identified by the best mathematical authorities. The algorithm of emergent computation might be based on the answers to the 10 questions listed below.

Suggestion and the questions to be investigated further

As mentioned above, ANNs, DNNs and CNNs have already made much progress, accompanied by some bugs. Besides big energy consumption and a larger number of hardware parts, it is very difficult for DNNs to simulate the social intelligence of the human brain, even by increasing the number of hidden layers or designing complicated input sets. The bug might lie in the basic theoretical conceptions about the primitive unit, network architecture and dynamic principle, as well as the basic attributes of the human brain.

We have given suggestions to ANNs in the article text; the list of suggestions is as follows:

1. The primitive neural node should be substituted by two types of unit, excitatory and inhibitory, with different connectionist weights and thresholds; for example, sensitivity to spike timing: lower sensitivity (Δt < 10 ms) or higher sensitivity (Δt < 6 ms).
2. Feedback or recurrent inhibition should be built in everywhere throughout the whole network, instead of only error backpropagation.
3. Topographic mapping, precise wiring between brain areas, layers, and cell types should be respected; this specific wiring bestows huge computational power with a minimal request of energy consumption. The small-world network principle should be absorbed into ANN architecture.
4. Energy supply in the brain is implemented, respectively, by area, lamina and column. The energy supply in ANNs needs to be improved so that energy consumption is saved. Energy supply modes differ among learning types, for example between supervised learning, with lower energy consumption mainly in the cerebellum, and unsupervised learning, with higher energy consumption in the cerebral cortex.
5. The emergent computation as a basis of cognitive science substitutes for "PDP-neuro-computation"; PDP is just one kind of neuro-computation.

Besides ANNs, studies in brain simulation along the biomedical line have already made much progress in recent decades. For example, the Blue Brain project simulated large-scale networks of the neocortical microcircuit based on biological brain data, including morphological and electrophysiological properties as well as cell types of bioactive molecule expression (Markram et al., 2015). A reconstructed virtual brain slice was made up. The meso-circuit was 230 μm thick, containing a total of 139,834 neurons. The virtual slice reproduced oscillatory bursts (1 Hz) and displayed a spectrum of activity states. In addition, the project provided a neocortical model of 0.29 ± 0.01 mm³ containing 31,000 neurons and 55 layer-specific morphological neuron types. But the results are unsatisfactory, because there is only 1 Hz bioelectric activity in the simulated brain slice showing its alive state. The problem might be how so many neurons organize together. Apart from laminar and column-like architecture, it is necessary to consider the principle of small-world organization (Watts and Strogatz, 1998), which means that a few neurons compose a microcircuit with a definite basic cognitive function, such as sensory-motor circuits. The following questions are worth further inquiry:

1. Besides the three neuronal encodings (rate, PPR, and spike time encoding) described in Figure 2, a specific dendritic action potential (dAP) has been reported (Gidon et al., 2020). The pyramidal neuron in layer II/III of the neocortex has the ability to separate XOR because it can reproduce dendritic action potentials (dAPs). The dAP mechanism is worth investigating further. How does such a neuron implement the computational ability to separate the XOR function, a task that would otherwise have to be finished by a deep neural network? It might be a new neural encoding principle.
2. Neurogliaform cells (NGFCs) dynamically decouple neuronal synchrony between brain areas (Sakalar et al., 2022) while neither decreasing their own firing nor affecting local oscillations. Is their function similar to the backpropagation mechanism in BP learning? In other words, do NGFCs take part in implicit information propagation in the biological brain?
3. What is the relationship between brain information and brain energy during intelligent activity?
4. What is the relation of energy to computational power? Is there any difference in computational principle between the neocortex and the cerebellum?
5. How do neuroglial cells perform in cognitive function and neural energy supply?
6. What is the meaning of mental capacity-limited or affordable processes for conscious cognitive function from the viewpoint of brain energy metabolism?
7. Is there any unique computation in the human brain beyond modern computational mathematics or postmodern mathematics?
8. Is there any other computing principle of the prefrontal cortex beyond adaptive encoding?
9. Is there any other core computing resource of the human brain for social intelligence in addition to the axon tree with long-range networks and local microcircuit-to-microcircuit communication in the prefrontal cortex?
10. Is it possible to build a common theoretical basis of cognitive science to promote the interdisciplinary development of brain simulation?

Author contributions

ZS designed the study. FS performed the literature search. ZS and FS drafted, revised, and wrote the paper. All authors contributed to the article and approved the submitted version.

Acknowledgments

We wish to thank five professors for reviewing and commenting on an early version of our manuscript. They are Lou Minmin at the National Institute of Biological Science, Beijing, China; Wei Chaocun at Shanghai Jiao Tong University; and Fang Fang, Jiong Jiong Yang and Yan Bao at Peking University, Beijing, China.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References

Abrams, T. W., and Kandel, E. R. (1988). Is contiguity detection in classical conditioning a system or a cellular property? Learning in Aplysia suggests a possible molecular site. Trends Neurosci. 11, 128–135. doi: 10.1016/0166-2236(88)90137-3

Ackley, D., Hinton, G., and Sejnowski, T. (1985). A learning algorithm for Boltzmann machines. Cogn. Sci. 9, 147–169. doi: 10.1207/s15516709cog0901_7

Azevedo, F. A., Carvalho, L. R., Grinberg, L. T., Farfel, J. M., Ferretti, R. E., Leite, R. E., et al. (2009). Equal numbers of neuronal and nonneuronal cells make the human brain an isometrically scaled-up primate brain. J. Comp. Neurol. 513, 532–541. doi: 10.1002/cne.21974

Bil, G., and Poo, M. (2001). Synaptic modification by correlated activity: Hebb's postulate revisited. Annu. Rev. Neurosci. 24, 139–166.

Boldog, E., Bakken, T. E., Hodge, R. D., Novotny, M., Aevermann, B. D., Baka, J., et al. (2018). Transcriptomic and morphophysiological evidence for a specialized human cortical GABAergic cell type. Nat. Neurosci. 21, 1185–1195. doi: 10.1038/s41593-018-0205-2

Borst, J. G. (2010). The low synaptic release probability in vivo. Trends Neurosci. 33, 259–266. doi: 10.1016/j.tins.2010.03.003

Branco, T., and Staras, K. (2009). The probability of neurotransmitter release: variability and feedback control at single synapses. Nat. Rev. Neurosci. 10, 373–383. doi: 10.1038/nrn2634

Buesing, L., and Maass, W. (2010). A spiking neuron as information bottleneck. Neural Comput. 22, 1961–1992. doi: 10.1162/neco.2010.08-09-1084

Buzsaki, G. (2006). Rhythms of the Brain. pp. 61, 287. New York, NY: Oxford University Press.

Cadwell, C. R., Bhaduri, A., Mostajo-Radji, M. A., Keefe, M. G., and Nowakowski, T. J. (2019). Development and arealization of the cerebral cortex. Neuron 103, 980–1004. doi: 10.1016/j.neuron.2019.07.009

Caporale, N., and Dan, Y. (2008). Spike timing-dependent plasticity: a Hebbian learning rule. Annu. Rev. Neurosci. 31, 25–46. doi: 10.1146/annurev.neuro.31.060407.125639

Carlson, T., Goddard, E., Kaplan, D. M., Klein, C., and Ritchie, J. B. (2018). Ghosts in machine learning for cognitive neuroscience: moving from data to theory. NeuroImage 180, 88–100. doi: 10.1016/j.neuroimage.2017.08.019

Chao, Z. C., Takaura, K., Wang, L., Fujii, N., and Dehaene, S. (2018). Large-scale cortical networks for hierarchical prediction and prediction error in the primate brain. Neuron 100, 1252–1266.e3. doi: 10.1016/j.neuron.2018.10.004

Crick, F. (1989). The recent excitement about neural networks. Nature 337, 129–132. doi: 10.1038/337129a0

Cummins, R., and Cummins, D. (2000). Minds, Brains, and Computers: An Historical Introduction to the Foundations of Cognitive Science. Oxford: Blackwell Publishers Ltd.

DeFilepe, J., Alonso-Nanclares, L., and Arellano, J. I. (2002). Microstructure of the neocortex: comparative aspects. J. Neurocytol. 31, 299–316. doi: 10.1023/A:1024130211265

Deng, J., Dong, W., Socher, R., Li, L. J., Li, K., and Li, F. F. (2009). ImageNet: a large-scale hierarchical image database. Proc. CVPR, 248–255. doi: 10.1109/CVPR.2009.5206848

Deng, S., Li, J., He, Q., Zhang, X., Zhu, J., Li, L., et al. (2020). Regulation of recurrent inhibition by asynchronous glutamate release in neocortex. Neuron 105, 522–533.e4. doi: 10.1016/j.neuron.2019.10.038

Doya, K. (2000). Complementary roles of basal ganglia and cerebellum in learning and motor control. Curr. Opin. Neurobiol. 10, 732–739. doi: 10.1016/S0959-4388(00)00153-7

Eccles, J. C. (1953). The Neurophysiological Basis of Mind. London: Oxford University Press.

Eccles, J. C. (1964). The Physiology of Synapses. Berlin: Springer-Verlag. 201–215.

Eccles, J. C. (1973). The Understanding of the Brain. New York, NY: McGraw-Hill Book Company. 88–92.

Eccles, J. C. (1989). Evolution of the Brain: Creation of the Self. London; New York, NY: Routledge. 217–236.

Eccles, J. C., Fatt, P., and Koketsu, K. (1954). Cholinergic and inhibitory synapses in a pathway from motor-axon collaterals to motoneurones. J. Physiol. 126, 524–562. doi: 10.1113/jphysiol.1954.sp005226

Egger, R., Narayanan, R. T., Guest, J. M., Bast, A., Udvary, D., Messore, L. F., et al. (2020). Cortical output is gated by horizontally projecting neurons in the deep layers. Neuron 105, 122–137.e8. doi: 10.1016/j.neuron.2019.10.011

Eiselt, A. K., and Nieder, A. (2013). Representation of abstract quantitative rules applied to spatial and numerical magnitudes in primate prefrontal cortex. J. Neurosci. 33, 7526–7534. doi: 10.1523/JNEUROSCI.5827-12.2013

Evrard, H. C. (2018). Von Economo and fork neurons in the monkey insula, implications for evolution of cognition. Curr. Opin. Behav. Sci. 21, 182–190. doi: 10.1016/j.cobeha.2018.05.006

Faure, P., and Korn, H. (2001). Is there chaos in the brain? I. Concepts of nonlinear dynamics and methods of investigation. C. R. Acad. Sci. III 324, 773–793. doi: 10.1016/S0764-4469(01)01377-4

Feldman, J., and Ballard, D. (1982). Connectionist models and their properties. Cogn. Sci. 6, 205–254. doi: 10.1207/s15516709cog0603_1

Fitzsimonds, R. M., and Poo, M. M. (1998). Retrograde signaling in the development and modification of synapses. Physiol. Rev. 78, 143–170. doi: 10.1152/physrev.1998.78.1.143

Fossati, M., Assendorp, N., Gemin, O., Colasse, S., Dingli, F., Arras, G., et al. (2019). Trans-synaptic signaling through the glutamate receptor delta-1 mediates inhibitory synapse formation in cortical pyramidal neurons. Neuron 104, 1081–1094.e7. doi: 10.1016/j.neuron.2019.09.027

Gefen, T., Papastefan, S. T., Rezvanian, A., Bigio, E. H., Weintraub, S., Rogalski, E., et al. (2018). Von Economo neurons of the anterior cingulate across the lifespan and in Alzheimer's disease. Cortex 99, 69–77. doi: 10.1016/j.cortex.2017.10.015

Gidon, A., Zolnik, T. A., Fidzinski, P., Bolduan, F., Papoutsi, A., Poirazi, P., et al. (2020). Dendritic action potentials and computation in human layer 2/3 cortical neurons. Science 367, 83–87. doi: 10.1126/science.aax6239

Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014). Rich feature hierarchies for accurate object detection and semantic segmentation. IEEE Conf. Comput. Vis. Pattern Recognit. 2014, 580–587. doi: 10.1109/CVPR.2014.81

Goulden, N., Khusnulina, A., Davis, N. J., Bracewell, R. M., Bokde, A. L., McNulty, J. P., et al. (2014). The salience network is responsible for switching between the default mode network and the central executive network: replication from DCM. NeuroImage 99, 180–190. doi: 10.1016/j.neuroimage.2014.05.052

Haggar, R. A., and Barr, M. L. (1950). Quantitative data on the size of synaptic end-bulbs in the cat's spinal cord. J. Comp. Neurol. 93, 17–35. doi: 10.1002/cne.900930103

Hamilton, T., Afshar, S., van Schaik, A., and Tapson, J. (2014). Stochastic electronics: a neuro-inspired design paradigm for integrated circuits. Proc. IEEE 102, 843–859. doi: 10.1109/JPROC.2014.2310713

Hebb, D. O. (1949). The Organization of Behavior. New York, NY: John Wiley.

Hikosaka, O., and Isoda, M. (2010). Switching from automatic to controlled behavior: cortico-basal ganglia mechanisms. Trends Cogn. Sci. 14, 154–161. doi: 10.1016/j.tics.2010.01.006

Hinton, G., Deng, L., Yu, D., Dahl, G. E., Mohamed, A., Jaitly, N., et al. (2012). Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal Process. Mag. 29, 82–97. doi: 10.1109/MSP.2012.2205597

Hopfield, J. J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proc. Natl. Acad. Sci. U. S. A. 79, 2554–2558. doi: 10.1073/pnas.79.8.2554

Hopfield, J. J. (1984). Neurons with graded response have collective computational properties like those of two-state neurons. Proc. Natl. Acad. Sci. U. S. A. 81, 3088–3092. doi: 10.1073/pnas.81.10.3088

Hopfield, J. J., and Brody, C. D. (2004). Learning rules and network repair in spike-timing-based computation networks. Proc. Natl. Acad. Sci. U. S. A. 101, 337–342. doi: 10.1073/pnas.2536316100

Isaacson, J. S., and Scanziani, M. (2011). How inhibition shapes cortical activity. Neuron 72, 231–243. doi: 10.1016/j.neuron.2011.09.027

Jackson, J. B., and Woolgar, A. (2018). Adaptive coding in the human brain: distinct object features are encoded by overlapping voxels in frontoparietal cortex. Cortex 108, 25–34. doi: 10.1016/j.cortex.2018.07.006

Jayachandran, M., Linley, S. B., Schlecht, M., Mahler, S. V., Vertes, R. P., and Allen, T. A. (2019). Prefrontal pathways provide top-down control of memory for sequences of events. Cell Rep. 28, 640–654.e6. doi: 10.1016/j.celrep.2019.06.053

Jia, H., Rochefort, N. L., Chen, X., and Konnerth, A. (2010). Dendritic organization of sensory input to cortical neurons in vivo. Nature 464, 1307–1312. doi: 10.1038/nature08947

Kamigaki, T. (2019). Prefrontal circuit organization for executive control. Neurosci. Res. 140, 23–36. doi: 10.1016/j.neures.2018.08.017

Kandel, E. R. (2001). The molecular biology of memory storage: a dialogue between genes and synapses. Science 294, 1030–1038. doi: 10.1126/science.1067020

Kietzmann, T., McClure, P., and Kriegeskorte, N. (2019). Deep neural networks in computational neuroscience. bioRxiv [Preprint]. doi: 10.1101/133504

Kim, Y., Yang, G. R., Pradhan, K., Venkataraju, K. U., Bota, M., García Del Molino, L. C., et al. (2017). Brain-wide maps reveal stereotyped cell-type-based cortical architecture and subcortical sexual dimorphism. Cells 171, 456–469.e22. doi: 10.1016/j.cell.2017.09.020

Kriegeskorte, N., and Golan, T. (2019). Neural network models and deep learning. Curr. Biol. 29, R231–R236. doi: 10.1016/j.cub.2019.02.034

Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). ImageNet classification

…the Second International Conference on Artificial Intelligence and Smart Energy (ICAIS-2022), IEEE Xplore Part Number: CFP22OAB-ART, pp. 115–123.

Near, J., Ho, Y. C., Sandberg, K., Kumaragamage, C., and Blicher, J. U. (2014). Long-term reproducibility of GABA magnetic resonance spectroscopy. NeuroImage 99, 191–196. doi: 10.1016/j.neuroimage.2014.05.059

Nimchinsky, E. A., Sabatini, B. L., and Svoboda, K. (2002). Structure and function of dendritic spines. Annu. Rev. Physiol. 64, 313–353. doi: 10.1146/annurev.
with deep convolutional neural networks. physiol.64.081501.160008
LeCun, Y., Bengio, Y., and Hinton, G. (2015). Deep learning. Nature 521, 436–444. Onorato, I., Neuenschwander, S., Hoy, J., Lima, B., Rocha, K.-S., Broggini, A. C.,
doi: 10.1038/nature14539 et al. (2020). A distinct class of bursting neurons with strong gamma synchronization
Lieberman, M. D., Straccia, M. A., Meyer, M. L., du, M., and Tan, K. M. (2019). and stimulus selectivity in monkey V1. Neuron 105, 180–197.e5. doi: 10.1016/j.
Social, self, (situational), and affective processes in medial prefrontal cortex neuron.2019.09.039
(MPFC): causal, multivariate, and reverse inference evidence. Neurosci. Biobehav. Paredes, M. F., James, D., Gil-Perotin, S., Kim, H., Cotter, J. A., Ng, C., et al. (2016).
Rev. 99, 311–328. doi: 10.1016/j.neubiorev.2018.12.021 Extensive migration of young neurons into the infant human frontal lobe. Science
Lillicrap, T. P., Santoro, A., Marris, L., Akerman, C. J., and Hinton, G. (2020). 354, 70–73. doi: 10.1126/science.aaf7073
Backpropagation and the brain. Nat. Rev. Neurosci. 21, 335–346. doi: 10.1038/ Pavlov, I. (1927) Conditioned Reflexes: An Investigation of the Physiological Activity
s41583-020-0277-3 of the Cerebral Cortex. trans. ed. G. V. Anrep. Oxford: Oxford University Press.
Losonczy, A., Makara, J. K., and Magee, J. C. (2008). Compartmentalized dendritic Pei, J., Deng, L., Song, S., Zhao, M., Zhang, Y., Wu, S., et al. (2019). Towards
plasticity and input feature storage in neurons. Nature 452, 436–441. doi: 10.1038/ artificial general intelligence with hybrid Tianjic chip architecture. Nature 572,
nature06725 106–111. doi: 10.1038/s41586-019-1424-8
Maass, W. (1997). Networks of spiking neurons: the third generation of neural Petanjek, Z., Berger, B., and Esclapez, M. (2009). Origins of cortical GABAergic
network models. Neural Netw. 10, 1659–1671. doi: 10.1016/S0893-6080(97)00011-7 neurons in the cynomolgus monkey. Cereb. Cortex 19, 249–262. doi: 10.1093/cercor/
Maass, W. (2014). Noise as a resource for computation and learning in networks bhn078
of spiking neurons. Proc. IEEE 102, 860–880. doi: 10.1109/JPROC.2014.2310593 Petanjek, Z., Judas, M., Kostović, I., and Uylings, H. B. (2008). Lifespan alterations
Maass, W. (2015). To spike or not to spike: that is the question. Proc. IEEE 103, of basal dendritic trees of pyramidal neurons in the human prefrontal cortex: a
2219–2224. doi: 10.1109/JPROC.2015.2496679 layer-specific pattern. Cereb. Cortex 18, 915–929. doi: 10.1093/cercor/bhm124
MacLean, E. L. (2016). Unraveling the evolution of uniquely human cognition. Raichle, M. E. (2010). The brain's dark energy. Sci. Am. 302, 44–49. doi: 10.1038/
Proc. Natl. Acad. Sci. U. S. A. 113, 6348–6354. doi: 10.1073/pnas.1521270113 scientificamerican0310-44
MacLean, E. L., Hare, B., Nunn, C. L., Addessi, E., Amici, F., Anderson, R. C., et al. Raichle, M. E. (2015). The brain's default mode network. Annu. Rev. Neurosci. 38,
(2014). The evolution of self-control. Proc. Natl. Acad. Sci. U. S. A. 111, E2140– 433–447. doi: 10.1146/annurev-neuro-071013-014030
E2148. doi: 10.1073/pnas.1323533111 Raichle, M. E., MacLeod, A. M., Snyder, A. Z., Powers, W. J., Gusnard, D. A., and
Mante, V., Sussillo, D., Shenoy, K. V., and Newsome, W. T. (2013). Context- Shulman, G. L. (2001). A default mode of brain function. Proc. Natl. Acad. Sci. U. S.
dependent computation by recurrent dynamics in prefrontal cortex. Nature 503, A. 98, 676–682. doi: 10.1073/pnas.98.2.676
78–84. doi: 10.1038/nature12742 Ren, S. Q., and Li, Z. Z.,Lin, S., Bergami, M., and Shi, S. H. (2019). “Precise
Marblestone, A. H., Wayne, G., and Kording, K. P. (2016). Toward an integration long-range microcircuit-to-microcircuit communication connects the frontal and
of deep learning and neuroscience. Front. Comput. Neurosci. 10:94. sensory cortices in the mammalian brain.” Neuron, 104: 385–401, doi: 10.1016/j.
neuron.2019.06.028
Mark, S., Romani, S., Jezek, K., and Tsodyks, M. (2017). Theta-paced flickering
between place-cell maps in the hippocampus: a model based on short-term synaptic Richards, B. A., and Lillicrap, T. P. (2019). Dendritic solutions to the credit
plasticity. Hippocampus 27, 959–970. doi: 10.1002/hipo.22743 assignment problem. Curr. Opin. Neurobiol. 54, 28–36. doi: 10.1016/j.conb.2018.08.003
Markram, H., Lübke, J., Frotscher, M., and Sakmann, B. (1997). Regulation of Rigotti, M., Barak, O., Warden, M. R., Wang, X. J., Daw, N. D., Miller, E. K., et al.
synaptic efficacy by coincidence of postsynaptic APs and EPSPs. Science 275, (2013). The importance of mixed selectivity in complex cognitive tasks. Nature 497,
213–215. doi: 10.1126/science.275.5297.213 585–590. doi: 10.1038/nature12160
Markram, H., Muller, E., Ramaswamy, S., Reimann, M. W., Abdellah, M., Roberts, P. D. (1999). Computational consequences of temporally asymmetric
Sanchez, C. A., et al. (2015). Reconstruction and simulation of neocortical learning rules: I. differential hebbian learning. J. Comput. Neurosci. 7, 235–246. doi:
microcircuitry. Cells 163, 456–492. doi: 10.1016/j.cell.2015.09.029 10.1023/A:1008910918445
Mcculloch, W. S., and Pitts, W. (1943). A logical calculus of the ideas immanent Roelfsema, P. R., Engel, A. K., König, P., and Singer, W. (1997). Visuomotor
in nervous activity. Bull. Math. Biophys. 5, 115–133. doi: 10.1007/BF02478259 integration is associated with zero time-lag synchronization among cortical areas.
Nature 385, 157–161. doi: 10.1038/385157a0
McDonnell, M. D., and Ward, L. M. (2011). The benefits of noise in neural
systems: bridging theory and experiment. Nat. Rev. Neurosci. 12, 415–425. doi: Rosenblatt, F. (1958). The perceptron: a probabilistic model for information
10.1038/nrn3061 storage and organization in the brain. Psychol. Rev. 65, 386–408. doi: 10.1037/
h0042519
Miller, E. K., and Cohen, J. D. (2001). An integrative theory of prefrontal cortex
function. Annu. Rev. Neurosci. 24, 167–202. doi: 10.1146/annurev.neuro.24.1.1 67 Rumelhart, D., Hinton, G., and Williams, R. (1986). Learning representations by
Back propagating errors. Nature 323, 533–536. doi: 10.1038/323533a0
Minsky, M., and Papert, S. (1969). "Perceptrons." Cambridge, MA: MIT Press.
Rumelhart, D., and McClelland, J. (1986). Parallel Distributed Processing: Explorations
Miyake, A., Friedman, N. P., Emerson, J. M., Witzki, A. H., Howerter, A., and in the Microstructure of Cognition: Foundations. Cambridge, MA: MIT Press.
Wager, T. D. (2000). The unity and diversity of executive functions and their
contributions to complex “frontal lobe” tasks: a latent variable analysis. Cogn. Sakalar, E., Klausberger, T., and Lasztóczi, B. (2022). Neurogliaform cells
Psychol. 41, 49–100. doi: 10.1006/cogp.1999.0734 dynamically decouple neuronal synchrony between brain areas. Science,
377–324–328. doi: 10.1126/science.abo3355
Mongillo, G., Rumpel, S., and Loewenstein, Y. (2018). Inhibitory connectivity
defines the realm of excitatory plasticity. Nat. Neurosci. 21, 1463–1470. doi: 10.1038/ Sathiyadevi, K., Karthiga, S., Chandrasekar, V. K., Senthilkumar, D. V., and
s41593-018-0226-x Lakshmanan, M. (2019). Frustration induced transient chaos, fractal and riddled
basins in coupled limit cycle oscillators. Commun. Nonlinear Sci. Numer. Simul. 72,
Montague, P. R., and Sejnowski, T. J. (1994). The predictive brain: temporal 586–599. doi: 10.1016/j.cnsns.2019.01.024
coincidence and temporal order in synaptic learning mechanisms. Learn. Mem. 1,
1–33. doi: 10.1101/lm.1.1.1 Satpute, A. B., and Lindquist, K. A. (2019). The default mode Network's role in
discrete emotion. Trends Cogn. Sci. 23, 851–864. doi: 10.1016/j.tics.2019.07.003
Mrzljak, L., Uylings, H. B., Kostovic, I., and van Eden, C. G. (1992). Prenatal
development of neurons in the human prefrontal cortex. II. A quantitative Golgi Scholte, H. S., Losch, M. M., Ramakrishnan, K., de Haan, E. H. F., and Bohte, S. M.
study. J. Comp. Neurol. 316, 485–496. doi: 10.1002/cne.903160408 (2018). Visual pathways from the perspective of cost functions and multi-task deep
neural networks. Cortex 98, 249–261. doi: 10.1016/j.cortex.2017.09.019
Nakajima, M., Schmitt, L. I., and Halassa, M. M. (2019). Prefrontal cortex
regulates sensory filtering through a basal ganglia to thalamus pathway. Neuron 103, Sejnowski, T. J. (1995). Pattern recognition. Time for a new neural code? Nature
445–458.e10. doi: 10.1016/j.neuron.2019.05.026 376, 21–22. doi: 10.1038/376021a0
Naveen, K., and Sparvathi, R. M. (2022). “Analysis of medical image by using Shen, Z. (2017). Development of brain function theory and the frontier brain
machine learning applications of convolutional neural networks” In Proceedings of projects. Chin. Sci. Bull. 62, 3429–3439. doi: 10.1360/N972017-00426
Sherrington, C. (1906). "The Integrative Action of the Nervous System." New York, Tsodyks, M. V., and Markram, H. (1997). The neural code between neocortical
NY: Scribners. pyramidal neurons depends on neurotransmitter release probability. Proc. Natl.
Acad. Sci. U. S. A. 94, 719–723. doi: 10.1073/pnas.94.2.719
Silver, R. A. (2010). Neuronal arithmetic. Nat. Rev. Neurosci. 11, 474–489. doi:
10.1038/nrn2864 Tsuda, I. (2015). Chaotic itinerancy and its roles in cognitive neurodynamics.
Curr. Opin. Neurobiol. 31, 67–71. doi: 10.1016/j.conb.2014.08.011
Song, S., Miller, K. D., and Abbott, L. F. (2000). Competitive Hebbian learning
through spike-timing-dependent synaptic plasticity. Nat. Neurosci. 3, 919–926. doi: Uluç, I., Schmidt, T. T., Wu, Y. H., and Blankenburg, F. (2018). Content-specific
10.1038/78829 codes of parametric auditory working memory in humans. NeuroImage 183,
254–262. doi: 10.1016/j.neuroimage.2018.08.024
Spruston, N. (2008). Pyramidal neurons: dendritic structure and synaptic
integration. Nat. Rev. Neurosci. 9, 206–221. doi: 10.1038/nrn2286 van den Heuvell, M. P., and Sporns, O. (2013). Network hubs in the human brain.
TICS 17, 683–696. doi: 10.1016/j.tics.2013.09.012
Stokes, M. G., Kusunoki, M., Sigala, N., Nili, H., Gaffan, D., and Duncan, J. (2013).
Dynamic coding for cognitive control in prefrontal cortex. Neuron 78, 364–375. doi: Watts, D. J., and Strogatz, S. H. (1998). Collective dynamics of 'small-world'
10.1016/j.neuron.2013.01.039 networks. Nature 393, 440–442. doi: 10.1038/30918
Sultana, A., Dey, S., and Rahma, M. A. (2021). “A deep CNN based Kaggle contest Weitz, A. J., Lee, H. J., ManK, C., and Lee, J. H. (2019). Thalamic input to
winning model to recognize real-time facial exppression.” In 2021 5th International orbitofrontal cortex drives brain-wide, frequency-dependent inhibition mediated
Conference on Electrical Engineering and Information and Communication Technology by GABA and zona incerta. Neuron 104, 1153–1167.e4. doi: 10.1016/j.
(ICEEICT) Military Institute of Science and Technology (MIST), Dhaka-1216, neuron.2019.09.023
Bangladesh. Whittaker, V. P., and Gray, E. G. (1962). The synapse: biology and morphology. Br.
Sun, Q. X., Li, M., Ren, M., Zhao, Q., Zhong, Y., Ren, P., et al. (2019). A Med. Bull. 18, 223–228. doi: 10.1093/oxfordjournals.bmb.a069983
whole-brain map of long-range inputs to GABAergic interneurons in the Williams, S. R. (2004). Spatial compartmentalization and functional impact of
mouse medial prefrontal cortex. Nat. Neurosci. 22, 1357–1370. doi: 10.1038/ conductance in pyramidal neurons. Nat. Neurosci. 7, 961–967. doi: 10.1038/
s41593-019-0429-9 nn1305
Tai, Y., Gallo, N. B., Wang, M., Yu, J.-R., and Aelst, L. V. (2019). Axo-axonic innervation Williams, L. E., and Holtmaat, A. (2019). Higher-order Thalamocortical inputs
of neocortical pyramidal neurons by GABAergic chandelier cells requires AnkyrinG- gate synaptic long-term potentiation via disinhibition. Neuron 101, 91–102.e4. doi:
associated L1CAM. Neuron 102, 358–372.e9. doi: 10.1016/j.neuron.2019.02.009 10.1016/j.neuron.2018.10.049
Tang, Y. Y., Rothbart, M. K., and Posner, M. I. (2012). Neural correlates of Wonders, C. P., and Anderson, S. A. (2006). The origin and specification
establishing, maintaining, and switching brain states. Trends Cogn. Sci. 16, 330–337. of cortical interneurons. Nat. Rev. Neurosci. 7, 687–696. doi: 10.1038/
doi: 10.1016/j.tics.2012.05.001 nrn1954
Tavanaei, A., and Maida, A. (2017). "Bio-inspired spiking convolutional neural Zhao, Q.-Z., Zheng, P., Xu, S. T., and Wu, X. (2019). Object detection with deep
network using layer-wise sparse coding and STDP learning." doi: 10.48550/ learning: A review. IEEE Trans. Neural Netw. Learn. Syst. 30, 3212–3232. doi:
arXiv.1611.03000. [Epub ahead of preprint]. 10.1109/TNNLS.2018.2876865
Thorpe, S. J., and Fabre-Thorpe, M. (2001). Seeking categories in the brain. Science Zucker, R. S., and Regehr, W. G. (2002). Short-term synaptic plasticity. Annu. Rev.
291, 260–263. doi: 10.1126/science.1058249 Physiol. 64, 355–405. doi: 10.1146/annurev.physiol.64.092501.114547