Physical Reservoir Computing With Emerging Electronics
Over the past 50 years, the downscaling of transistor sizes—and the resulting exponential growth in the number of transistors in an integrated circuit—has driven improvements in computer performance and led to applications such as artificial intelligence (AI)1. However, in recent years, such scaling has begun to approach its physical limits, facing challenges related to quantum effects and reliability issues2,3. At the same time, the physical separation of memory and computing units in conventional hardware limits energy efficiency—an issue known as the von Neumann bottleneck—creating particular problems for data-intensive applications such as AI4.

Neuromorphic computing aims to create energy-efficient computing systems by emulating the working mechanisms of the brain5–7. Unlike conventional computers, the brain uses an analogue computing approach with complex recurrent dynamics. Although the nervous system is extremely complex and yet to be fully understood, simplified structures can be recreated in artificial systems.

Recently, it has been shown that recorded neuronal activities in the brain can be decoded into behaviours via a linear readout8. For example, electrode arrays were implanted into a mouse's brain to trace its electrophysiology during a foraging task. By calculating the weighted sum of multi-channel neural signals from an electroencephalogram measured in the frontal cortex, the decision variables that affect the mouse's behaviours could be calculated (Fig. 1a). This biological working principle is mirrored in a type of recurrent neural network (RNN) algorithm known as reservoir computing (RC)8.

RC, together with other RNNs, has frequently been used to study brain functionalities because of the similarities in their complex recurrent dynamics9–12. The origins of RC can be traced back to 2001, when Herbert Jaeger invented the echo state network (ESN) as an efficient implementation of an RNN13. That work used the echo states in a complex RNN layer, followed by a linear readout layer trained by linear regression. In 2002, Wolfgang Maass proposed the liquid-state machine (LSM), which operates in a continuous and spiking fashion14,15. The liquid state is an important concept that can be visualized by considering what happens when stones are continuously thrown into a lake. The multiple ripples created interact and result in a complex ripple pattern—the liquid states. Information related to the stone-throwing activity over the past few seconds can be retrieved by applying a linear layer to the liquid states. Both works were subsequently experimentally unified as RC on the basis of their similar principles (Fig. 1b)16: an input layer, high-dimensional mapping in a fixed reservoir layer with complex time-dependent dynamics, and a linear readout layer.

In RC with N neurons, a d-dimensional input and a k-dimensional output, the network topology comprises three layers: an input layer Win (N × d), a reservoir layer Wres (N × N) and an output layer Wout (k × N).
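In software, this three-layer topology reduces to a few lines of linear algebra. The sketch below is a minimal NumPy illustration of an ESN-style reservoir; the leak rate, spectral-radius scaling, toy memory task and ridge-regression readout are common ESN conventions chosen here for illustration, not taken from this Review:

```python
import numpy as np

rng = np.random.default_rng(0)
N, d, k = 100, 1, 1                          # neurons, input dim, output dim

W_in = rng.uniform(-0.5, 0.5, (N, d))        # fixed input layer, N x d
W_res = rng.normal(0.0, 1.0, (N, N))         # fixed reservoir layer, N x N
W_res *= 0.9 / np.abs(np.linalg.eigvals(W_res)).max()  # spectral radius 0.9

def run_reservoir(u_seq, leak=0.3):
    """Drive the fixed reservoir with a sequence and collect its states (T x N)."""
    x = np.zeros(N)
    states = np.empty((len(u_seq), N))
    for t, u in enumerate(u_seq):
        # leaky nonlinear update: new state mixes the old state with tanh drive
        x = (1 - leak) * x + leak * np.tanh(W_in @ u + W_res @ x)
        states[t] = x
    return states

# Toy memory task: reproduce the input from 3 steps earlier.
T, delay, washout = 500, 3, 50
u_seq = rng.uniform(-1, 1, (T, d))
y_target = np.roll(u_seq, delay, axis=0)

X = run_reservoir(u_seq)[washout:]           # discard the initial transient
Y = y_target[washout:]

# Train only the readout W_out (k x N) by ridge regression.
W_out = Y.T @ X @ np.linalg.inv(X.T @ X + 1e-6 * np.eye(N))
y_pred = X @ W_out.T
```

Only W_out is trained; W_in and W_res stay fixed, which is what allows a physical system to stand in for the reservoir layer.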
1School of Integrated Circuits, Beijing Advanced Innovation Center for Integrated Circuits, Tsinghua University, Beijing, China. 2Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing, China. 3Institute of Functional Nano & Soft Materials (FUNSOM), Jiangsu Key Laboratory for Carbon-Based Functional Materials & Devices, Soochow University, Suzhou, Jiangsu, P. R. China. e-mail: [email protected]
Fig. 2 | eRC architecture taxonomy. a–d, Four architectures: delay-coupled RC (a), dynamic devices RC (b), in materia RC (c) and rotating neurons RC (d). Also shown are their building blocks, key features and implementation examples. [Panel annotations recovered from the figure: b, a parallel dynamic devices array (for example, dynamic memristors) generating states in parallel. Pros: simple implementation; device-to-device variation enhances state richness. Cons: relatively weak MC. c, a conductive material (for example, interconnecting nanowires) with input and state electrodes carrying the in materia signal. Pros: multi-channel state generation in one material. Cons: relatively weak MC; uncontrollable signal flows. d, a counter, pre- and post-neuron rotors and dynamic neurons. Pros: equivalent to the cyclic reservoir algorithm; explainable hardware design. Cons: complex connection layout in 2D.]
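Of the four architectures, delay-coupled RC (Fig. 2a) is the simplest to caricature in software: one nonlinear node, a fixed binary mask for time-multiplexing and a feedback line one delay loop long. The sketch below is a hedged software analogue; the leaky-tanh node, mask length and feedback gain are illustrative choices, not a model of any device in this Review:

```python
import numpy as np

rng = np.random.default_rng(1)
M = 10                                   # mask length = virtual nodes per sample
mask = rng.choice([-1.0, 1.0], size=M)   # fixed random binary mask

def delay_coupled_reservoir(u_seq, feedback=0.5, leak=0.6):
    """One physical node, time-multiplexed into M virtual nodes per input sample.

    The node output from one delay loop earlier (the same virtual-node slot of
    the previous sample) is fed back and added to the masked input, so the
    single node behaves like a recurrent network.
    """
    prev_loop = np.zeros(M)              # node outputs of the previous delay loop
    x = 0.0                              # instantaneous node state
    states = np.empty((len(u_seq), M))
    for t, u in enumerate(u_seq):
        for i in range(M):
            drive = mask[i] * u + feedback * prev_loop[i]
            x = (1 - leak) * x + leak * np.tanh(drive)  # sluggish nonlinear node
            states[t, i] = x             # sampled output = virtual node i
        prev_loop = states[t]
    return states                        # T x M state matrix for a linear readout

states = delay_coupled_reservoir(rng.uniform(-1, 1, 200))
```

Each row of `states` plays the role of the virtual-node vector that is read out in series from the single physical node.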
node—is obtained by sampling the output of the physical node in series. The output of the physical node is delayed and added to the masked input signal via a feedback line31,65,66. By linking the current input with the previous node states, the delay line preserves the historical information of the network and forms an RNN31,67. Multiple delay lines and a longer delay time can enhance the MC to retain a longer historical signal68–70. A non-resonant delay cycle can also boost the MC and the corresponding performance71. In typical eRC implementations (Fig. 2a), feedback and time-multiplexing are realized in the digital domain, where analogue and digital conversions are necessary. Here, the physical node can be a circuit or an electronic device, because a nonlinear region exists in most electronic behaviours28.

An advantage of delay-coupled RC is that only one physical node is needed to construct a reservoir layer, which dramatically reduces the implementation complexity compared with interconnected neurons in conventional RC28. On the other hand, a key consideration when designing delay-coupled RC is the implementation of the delay line. In early works, the delay is realized digitally (Fig. 2a), and the conversion between analogue and digital signals can be challenging28. To minimize cost, an analogue-to-digital converter (ADC) and a digital-to-analogue converter (DAC) with low precision (usually 4–10 bits) can be employed, but the noise implications need to be studied carefully67. Recent implementations of delay-coupled RC are integrated with mature Si complementary metal–oxide–semiconductor (CMOS) circuits, such as 180-nm (a spike-based delay-coupled RC)72 and 65-nm technology73. In fact, the delay-coupled RC architecture is of particular interest for optical RC systems, in which a long optical fibre is employed to delay the signal37,74, thus enabling ultrahigh-speed signal processing39. One disadvantage of delay-coupled RC is its serial operation due to the time-multiplexing of the physical node. Parallel computing is preferred in analogue neuromorphic computing75. Nevertheless, delay-coupled RC provides useful insights for PRC based on dynamical hardware and has inspired the development of PRC with different physical systems, as well as other eRC architectures.

Dynamic devices RC
In recent years, an eRC architecture based on dynamic devices has emerged. In this architecture, the physical node is a single dynamic device, and the entire reservoir layer consists of multiple dynamic devices that receive input signals in parallel (Fig. 2b). This technique is also known as spatial multiplexing76 (analogous to time-multiplexing in delay-coupled RC), where the effective number of physical nodes is increased to enhance the system performance. The key idea is to harness the intrinsic memory property and nonlinearity of dynamic devices to encode temporal input signals into internal device states, such as the conductance of a volatile memristor, which are then read as the reservoir state vector. The dynamic device under this architecture thus usually possesses short-term memory and nonlinearity—for example, a dynamic memristor23,24,33,77–81, spintronic oscillator82, ferroelectric device83–86, ionic transistor87,88 or quantum system76. In 2017, a memristive RC based on multiple independent dynamic memristors was proposed23 in which the different spike-based input sequences
were encoded into the memristor conductance states via short-term memory. The state richness can be improved to a certain degree through device-to-device variation of the temporal dynamics23,24,86. A group of devices with low device-to-device variation might cause issues like overfitting in the readout layer. Time-multiplexing of a spintronic oscillator has also been introduced to substantially increase the length of the state vector82. However, the absence of a delayed feedback line requires a single device to support the MC of hundreds of virtual nodes, and the effective MC declines as the mask length increases. Inspired by the above studies, different mask vectors with relatively short lengths (around five) were applied to parallel dynamic memristors to avoid the vanishing of MC77. Subsequently, efforts have been made to generate high-quality state vectors using various dynamic devices. For example, the number of states was increased by reading the outputs of ferroelectric tunnelling junctions (FTJs) at multiple time steps84; applying different gate voltages to an α-In2Se3 transistor array resulted in different relaxation times for each device and thus increased the reservoir's state richness89; ionic transistors were configured with different gate lengths to induce different relaxation times and nonlinearities to enhance state richness88; and integrating a configurable memristor, capacitor and resistor in a single cell—the temporal kernel90—provided more flexibility for the reservoir's dynamics, allowing the system to adapt to a wide range of signal-processing tasks, from ultrasound (time constant of 10⁻⁷ s) to electrocardiogram (time constant of 1 s).

Compared with other eRC architectures, dynamic devices RC fully explores the internal device behaviour, and several techniques can be used to improve the quantity and quality of the reservoir states. This architecture simplifies connectivity and facilitates hardware implementation, minimizing system complexity and hardware cost while ensuring sufficient state richness. It is commonly used in in-sensor computing25,91,92 and self-powered computing93, because the devices can directly receive environmental stimulation (such as light when using optoelectrical devices25) and can perform in situ computing simultaneously, opening a novel edge-computing paradigm. However, the challenge of dynamic devices RC is that the internal states are less likely to encode and distinguish information over a long timescale. This is because the MC depends only on the capacity of a single device, and the MCs of multiple parallel devices mostly overlap rather than accumulating to a higher value. This limits the MC to a relatively low level, but could be mitigated by storing intermediate states in a buffer, at the cost of external memory23–25,89.

In materia RC
The in materia RC architecture is a fast-growing approach in recent years to implement designless PRC94. It was first implemented using a neuromorphic atomic switch network95. It operates under the premise that the internal dynamics of an aeolotropic material—a material whose properties depend on the direction of measurement—are sufficiently intricate to allow high-dimensional mapping94,96,97. A conductive material suitable for the architecture is analogous to the complex spatially dependent ripple patterns observable on a lake surface after stone throws (as in the case of the LSM example discussed earlier). Injecting signals at a particular location of the conductive material results in distinct responses at different state-vector collection locations. This requires the material to exhibit heterogeneous conductivity between every in–out channel, allowing the in materia signal to undergo different nonlinear transformations. Here, the physical node is the material itself, which determines the MC. To enhance the MC, memristive materials are preferable (Fig. 2c). For example, an in materia RC was developed by drop-casting interconnected Ag nanowires94. The memristive property resulted from Ag+ ion migration in the polyvinylpyrrolidone (PVP) at the cross-point junctions (acting like neurons) between nanowires, forming volatile conductive bridges. The connectivity and junction density of the Ag–PVP–Ag nanowires are relatively controllable in this system, which is an advantage over other materials98–101. The signal input and state output can be accessed via electrodes such as Au pads for the pre- and post-processing. A number of materials and substrates have been investigated, including Ag/AgI nanowires102, carbon nanotubes97,103, organic electrochemical network devices104,105, magnetic skyrmion memristors106 and ferroelectric devices107–109. Other examples include wave propagation and transmission through materials110–113, such as spin waves travelling through a magnetic material114,115 and metamaterials116.

In materia RC fully explores the internal dynamics of a conductive material itself. Therefore, the hardware design outside the material can be minimized or even omitted, highlighting its unique advantage of extremely low cost and simple hardware implementation. Moreover, a single material can produce multiple state channels in parallel. However, its main disadvantage is its limited MC, which mainly depends on the material properties. An exception was found in a spin wave-based RC simulation that yielded an MC of 38 (ref. 115), but further studies are needed to reveal the underlying mechanism. Furthermore, in-material signal flows are hardly accessible or controllable, as they are usually sealed inside the material body, unlike other architectures that have independent circuits or devices. Although the physical reservoir layer, in principle, could be designed in a highly random manner, it is still desirable to monitor the signal flows of each neuron to understand how the inner structure affects the state vector and the corresponding dynamics, especially during the development and optimization stages. In existing studies, the signal flows and inner dynamics are mainly analysed through materials modelling94,99,100,117, which may not fully reveal the actual picture.

Rotating neurons RC
In a recently proposed architecture32, rotating neurons RC, an equivalent pair of a physically rotating object and a cyclic RC algorithm was discovered. Cyclic RC is a simplified version of classical RC64, in which the reservoir layer has a ring topology and can be designed in a deterministic manner with fewer parameters than randomly generated connections. It has been mathematically proven that if the input and output connections of a first-order dynamical node array are periodically shifted, its physical behaviour is equivalent to the input/output of a cyclic reservoir, which we name 'rotating neurons RC'. So far, this is the only network-level equivalent pair of an RC hardware architecture and algorithm; it enables the mapping of cyclic RC onto rotational hardware and makes the hardware design interpretable by the algorithm. Pre- and post-neuron rotors can physically implement the weight matrix of the reservoir layer (that is, a shifted identity matrix). This explainable hardware design brings unique benefits. It enables a straightforward and elegant implementation of the reservoir layer, so that assistive peripheral circuits and interfaces between modules, such as the ADC, buffer and memory, are not required, which substantially reduces the system complexity and power. Furthermore, rotating neurons RC could offer a much higher MC than non-rotating systems (similar to the parallel dynamic devices RC)32, because the MC can be enhanced as the number of physical nodes increases, as has been analytically studied with the cyclic RC algorithm64. A follow-up study has shown an over threefold MC improvement by adding rotation to the PRC system118.

An example implementation based on electrical circuits is shown in Fig. 2d, where a multiplexer array subject to a counter is used to realize the pre- and post-neuron rotors, and the first-order dynamic neuron can be properly approximated by an integration–ReLU–leakage circuit. When a driving pulse is applied to the counter, every multiplexer in the pre-neuron rotor periodically and sequentially links its input channel to each neuron circuit, thus implementing the rotation. The main challenge associated with rotating neurons RC is the layout of the great number of connections on a two-dimensional (2D) substrate, as a rotating connection is naturally a 3D motion.

Fig. 3 | Physical nodes. a, A general model of the physical node. The two basic characteristics of physical nodes are nonlinearity and dynamical property. f and d represent a nonlinear function and a decay factor, respectively, and Z−1 denotes a delay operation over discrete time steps. b, A Mackey–Glass nonlinear circuit with its characteristic input voltage versus output voltage curve. It exhibits a Mackey–Glass-type nonlinear transformation and dynamical property provided by the resistor–capacitor network. c, An integration–ReLU–leakage circuit. The diode offers a nonlinearity similar to the ReLU function, and the resistor–capacitor circuit offers integration and dynamical property. d, A spin-torque nano-oscillator. A magnetic tunnel junction subjected to current input and an external magnetic field can output an oscillating signal to generate states. e, A dynamic or volatile memristor, whose conductance can be dynamically modulated by external stimulation and gradually relaxes to its initial value in the absence of stimulation, thus providing short-term memory and nonlinearity. A 3D memristor array can provide high-density memristors as physical nodes for generating node states. The input signal can be injected at corresponding word lines. f, A ferroelectric field-effect transistor (FeFET), whose gate (G) receives an input signal, induces distinct outputs at the drain (D), source (S) and substrate (silicon), resulting in three state channels in a single device. g, A nanowire network as a physical node can produce different nonlinearities and dynamical behaviours at the output electrodes in response to a common input signal at the input electrode. h, Optoelectronic physical nodes are naturally compatible with in- and near-sensor computing, where optical stimulation from the environment (for example, light bouncing off a PRC sign) can be directly presented to the node array to induce reservoir state generation.

As the number of
neurons increases, the wiring between rotors and neurons will grow exponentially on the 2D substrate. This issue could be mitigated by exploring 3D integration.

Physical node design
The physical node is another key element of PRC (Fig. 1c). It usually refers to the interconnected dynamical components acting like neurons in RC algorithms. The connectivity and interaction between physical nodes are governed by the chosen PRC architecture. In delay-coupled RC, a single physical node receives the time-multiplexed signal and generates virtual nodes in series. In dynamic devices RC, multiple physical nodes independently receive stimulation and generate output in parallel, without interaction. In in materia RC, physical nodes could be the junctions of signal flow between each in/out channel, whose interactions are defined by the random connections inside the material. In rotating neurons RC, physical nodes interact with each other via the pre- and post-neuron rotors.

Generally, physical nodes possess two main characteristics (Fig. 3a): nonlinearity and dynamics. The physical node, acting as a kernel function, should offer a nonlinear function between the input–output signals. Otherwise, the PRC can only solve linearly separable problems, because the rest of the network is usually linear27,28. Unlike feed-forward DNNs, PRC is a dynamical system exhibiting short-term memory, which mainly stems from the physical node. Other implementation-specific properties of physical nodes in eRCs include oscillation82, integration32, leaky integrate-and-fire72 and device-to-device variation24. Figure 3 illustrates several examples of physical nodes and their main characteristics. Conventional electronics can provide dynamics with specific nonlinearity and time constants to implement a physical node; typical circuits realizing the Mackey–Glass nonlinearity and the ReLU function are shown in Fig. 3b,c (refs. 31,32,67). The magnetic tunnel junction in dynamic devices RC can serve as a nonlinear oscillator for state generation (Fig. 3d)82,119. Memristive devices have also been widely studied to provide dynamics for eRC in a different way: historical stimulation changes the conductance in a volatile manner, thus connecting the current node state and historical node states23–25,33,63,77,93,120. Memristive devices and materials are usually preferred in dynamic devices RC and in materia RC to
compensate for the weak MC of these. Recently, a 3D memristor array During the early stage of PRC development, VMMs were mostly
was proposed to generate more states80,120 (Fig. 3e). A ferroelectric implemented using a computer. Recently, hardware-based VMM has
field-effect transistor can produce three channels of node states in been intensively investigated in the field of neuromorphic comput-
parallel (Fig. 3f). Its ferroelectricity offers a history-dependent polari- ing because it consumes the majority of computational resources in
zation state and nonlinear polarization dynamics107,108. In materia RC is artificial neural network applications5,125. The solutions to the output
commonly based on junctions created by interconnected nanowires layer are similar across eRC research—a memristor crossbar array is
(Fig. 3g), where multiple output pads respond nonlinearly and differ- used that has been proven extremely energy-efficient for VMM opera-
ently to the input100,102,104. Recent developments in physical nodes have tion compared with digital computers125. Different from the reservoir
focused on incorporating more and more versatility. For example, layer, the memristor126 used in the output layer must be non-volatile
optoelectronic physical nodes can simultaneously receive optical to store the trained output weights (Fig. 4d). Ideally, the voltage sig-
stimulation and produce a state output, thus enabling in-sensor com- nals flow through each memristor, resulting in a current according to
puting25,92 or multimode computing87,89 (Fig. 3h). Ohm’s law, which is then integrated at the output to obtain the weighted
sum results of state vectors according to Kirchhoff’s law. However, the
Pre-processing techniques conductance variation in the memristors is a big challenge, and the
Beyond the extensively studied reservoir layer, the pre-processing output weights should be noise-tolerant to avoid substantial accu-
(input layer) and output layers must also be taken into consideration racy loss. In the memristor array, each unit cell commonly consists of
to form a complete PRC hardware. The pre-processing methods vary a one-transistor–one-resistor (1T1R) structure to avoid the sneak path
for different RC architectures, physical nodes and tasks. problem. Memristor array-based output layers have been implemented
A commonly used technique to increase state richness is time- for rotating neurons RC32, parallel dynamic devices RC33,83,84,92,93 and in
multiplexing31,33. Figure 4a illustrates a time-multiplexing operation materia RC94. In 2022, a real-time fully analogue RC system was dem-
with a mask length of 10. The duration of each piece of discrete data onstrated by integrating volatile memristors in the reservoir layer and
(black line) is equally divided into ten units, followed by allocating the non-volatile memristors in the output layer, forming a dynamic devices
data value randomly multiplied by (or added to) −1 or 1 to each unit. RC architecture33. That work overcame the difficulties in interfacing
The generated mask vector remains constant for the duration of every different modules and handling the real-time end-to-end signal flows
data point. Bias and scaling may also be applied after time-multiplexing in an all-analogue fashion, while yielding orders-of-magnitude lower
to find an optimal input range68. To induce complex dynamics (blue power consumption than digital RC.
line), the empirical duration of each unit (θ in Fig. 4a) is approximately
a quarter of the time constant of the physical node31,82. The resulting Optimization strategies
node response of each time-multiplexing unit is the state vector that The development of eRC always involves hyperparameter optimization,
can be collected in series, namely the virtual node. Here, the unit dura- where the tunable parameters in the hardware are more sophisticated
tion θ and mask length need to be carefully chosen. A duration that is than software algorithms28. This section discusses the key factors that
too long would cause saturation before the next value comes and lead can be optimized. First, the reservoir size is the prime feature that deter-
to an inadequate state richness due to bistable state values in the node mines the degree of freedom of the dynamics in the reservoir layer
response31,33. At the same time, if the duration is too short, the physical that affects the MC and approximation capacity16,27,64,127. The reservoir
node may work in a small linear range with a low amplitude. Therefore, size (also the length of the state vector) in conventional RC indicates
the time constant needs to be chosen carefully to adapt to the intervals the number of neurons. However, the concept of virtual nodes in eRC
of the source signal and the desired number of virtual nodes. The mask suggests that the reservoir size could be increased without adding
vector is usually randomly generated. Studies have also revealed that physical nodes31. In general, the minimum reservoir size depends on the
the mask generated by the chaotic system121 and maximum length task complexity and target accuracy and error. A larger reservoir size
sequences122 could perform better. This technique is essential for requires more hardware resources to implement. The computational
delay-coupled RC to produce sufficient virtual node states, as only one capacity cannot be endlessly improved by expanding the reservoir size27
physical node exists31. Meanwhile, it is optional for other architectures and hence should be optimized for a specific application.
to increase the number of states33,77,83,85,120. The RC performance is highly sensitive to its input range. Unlike
Another widely used pre-processing technique is spike encod- gradient-based neural networks, the RC training process cannot auto-
ing, which is particularly useful for 2D inputs such as images. Here we matically adapt to achieve the optimal range of each neuron for feature
use the volatile memristor-based dynamic devices RC as an example, extraction by tuning every pre-neuron weight. Therefore, the input
although similar methods work for other architectures. For a sim- range, including the scaling and bias for the input signal, has to be
ple 2D character (such as ‘E’ in Fig. 4b), the map can be directly con- fine-tuned to maximize performance. A straightforward approach is
verted to a multi-channel spike sequence as the input to the reservoir to scan and find the best parameters. For example, the input scaling
layer23,25,80,85,94,101,120. The spike sequence adds a temporal dynamic to of delay-coupled RC31 and rotating neurons RC32, together with the
the physical nodes. The state vector can be measured once the last feedback strength or the time constant of the physical node, need to
spike is injected. In principle, the last value of the physical node output be scanned to find the optimal setting for nonlinear system approxi-
contains the information of the whole spike sequence due to short-term mation tasks.
memory. The intermediate values within a spike sequence can also be Another crucial factor is the time constant of the physical node (τ),
collected to enhance the state vector and MC24,25,84,108,109, but collecting which determines the operating timescale for the overall system. Basi-
and storing those intermediate values requires additional serial opera- cally, the value mainly depends on the desired computing speed and
tions and memory units. Furthermore, for a more informative image whether time-multiplexing is used. Before optimization, the purpose
input, such as the handwritten-digit dataset123, chopping, merging and of the RC system needs to be known. To accelerate signal processing,
rotating the image before spike encoding (Fig. 4c) can increase the the physical node could be designed with a short τ, such as picosec-
number of parallel spike sequences at the input channels and shorten ond level in delay-coupled RC38, as long as the architecture and other
the length of every sequence to avoid MC shortage33,92,94,124. components allow this. On the other hand, the physical nodes used in
neuromorphic computing such as in-sensor or near-sensor applica-
Output layer tions could exhibit biologically realistic timescales (larger than 1 ms) to
The RC output layer multiplies the state vector by the output weights, interact with the environment and natural signals7. A short τ would be
which is a typical vector–matrix multiplication (VMM) operation. unable to couple with the real-time signal and fail to generate effective
a
Time-multiplexing
Raw signal θ
Mask
[1 -1 1 1 …]
0
Masked signal
Raw signal
After time-multiplexing Duration of one input
Node response Mask length = 10
b c
Spike encoding Image input
112 × 7
28 × 28 28 × 7
Input channels
Spike
Chop Merge encoding
Rotate
eRC
Spike encoding
State channels
eRC Spike
Chop Merge encoding
State collection
State vector
d
Output layer V1 V2 V3 V4
V1
I1
V2 G11 G12 G13 G14
I2 I1 = ∑G1jVj
G21 G22 G23 G24
V3
I3 I2 = ∑G2jVj
G31 G32 G33 G34
V4 I3 = ∑G3jVj
Memristor crossbar
Fig. 4 | Input layer and output layer. a, Time-multiplexing operation for the input signal with a mask length of 10. During every input interval, the data point is multiplied by a randomly generated mask vector (top). By coupling the time constant of the physical node (τ) and the mask duration (θ), the time-multiplexing signal can induce a complex context-dependent response in the physical nodes, which can be collected in series as the state vector (bottom) at different time steps t1, t2, …. In the case of mismatch between θ and τ, for example when θ is larger than τ (middle), the node will fail to generate an effective node state because the node rapidly saturates before the next input comes. b, Two-dimensional data can be encoded in spike sequences as the input of eRC. For example, the pattern representing the letter 'E' is encoded into five spike sequences. The bottom right panel is an illustration of the output of physical nodes (such as a volatile memristor in dynamic devices RC) in response to such spike sequences. Applying a state collection scheme on the state channels can obtain the state vector. c, For informative image inputs like handwritten digits from the MNIST dataset, the pre-processing usually involves chopping the image to shorter length sequences before converting them into spike sequences. To increase the length of the state vector (reservoir size), the image can also be rotated by an angle before chopping and merging, which creates a different input spike map for a common data source. d, The output layer of RC is usually a fully connected linear network, which can calculate the weighted sum of the state vector. This can be physically implemented by a memristor crossbar array, in which VMM operations can be realized in parallel, in situ, by taking advantage of computing-in-memory. The output currents indicate the VMM results. V1 to V4 are the voltage values corresponding to the state vector, G11 to G34 are the conductance values of the memristor crossbar array, and I1 to I3 denote the current values representing the VMM results.
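The masking step in Fig. 4a can be sketched in code. Below, a random ±1 mask of length 10 is applied to each input point and the response of a single simulated node is sampled after every mask step to form virtual node states; the leaky tanh node and its leak rate are illustrative stand-ins for a physical device (such as a volatile memristor), not a device model.

```python
import numpy as np

rng = np.random.default_rng(0)

def time_multiplex(signal, mask, leak=0.6):
    """Drive one simulated node with a masked input and collect a
    virtual-node state after each mask step (the leaky tanh node is
    an illustrative assumption, not a physical device model)."""
    x = 0.0
    states = []
    for u in signal:                   # one data point per input interval
        step_states = []
        for m in mask:                 # mask length = number of virtual nodes
            x = (1 - leak) * x + leak * np.tanh(m * u)
            step_states.append(x)      # sample the node after this mask step
        states.append(step_states)
    return np.asarray(states)          # shape: (input steps, mask length)

mask = rng.choice([-1.0, 1.0], size=10)    # random binary mask, length 10
u = np.sin(np.linspace(0, 4 * np.pi, 50))  # toy input sequence
X = time_multiplex(u, mask)
print(X.shape)  # (50, 10): one 10-element state vector per input step
```

Here the leak rate roughly plays the role of the θ/τ ratio in the caption: with leak close to 1 (θ much larger than τ) the node settles within a single mask step, the virtual states lose their mutual coupling, and the context-dependent response is lost, mirroring the mismatch case in the middle panel.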
of the handwriting trace contains the information of the past roughly 0.8 s thanks to the effect of MC. The storage of intermediate states is unnecessary in this case32. In contrast, a shorter τ with a faster rotating speed leads to rapid forgetting of the past handwriting trace, thus degrading the eRC system performance.
In addition, there are various configurations provided by the physical node and architecture that can be optimized in eRC development, such as the mask length of time-multiplexing77, the delay line (number and duration) in delay-coupled RC70,128 and the rotation speed of rotating neurons RC32. Most eRC optimization currently relies on trial and error assisted by simulation models (Supplementary Note 1), and in-depth analysis is still lacking. Advanced strategies in future development should be more physics-aware and heuristic, such as hardware-in-the-loop optimization subjected to genetic algorithms.

Performance benchmark
From the viewpoint of trainable dynamic systems, eRCs exhibit several task-independent network characteristics that are important for performance evaluation and comparison32,69. A widely recognized characteristic is MC, indicating the network's capacity to retain historical information13,18,64. MC can be quantified by a random binary sequence-recalling experiment, in which the input is a randomly generated i.i.d. sequence and the expected output (after training) at any time step is the value i steps behind the current point in the sequence. The sum of the squared correlations between the expected and actual sequences for i = 1, 2, …, ∞ is the value of MC18. This task tests the volume of historical information preserved in the network, which is a standard measurement for both hardware- and software-based RC. Previous work has also revealed that an RC with linear nodes normally yields much higher MC than one with nonlinear nodes, as the historical information is distorted by nonlinearity129. MC is particularly important in physical implementations, because maximizing MC can free the eRC system from using assistive memory and even allow all-analogue computing32, thus reducing the cost of system implementations and operations. To achieve this, implementing delayed feedback and pre- and post-neuron rotation are effective methods. In addition, coupling the timescales between the physical nodes and input signals can yield a higher MC. Supplementary Table 2 summarizes recent eRC implementations whose MCs were clearly analysed, and an extended discussion on MC is provided in Supplementary Note 2.
As a nonlinear extension of MC, information processing capacity (IPC) has been proposed to quantify the computational capacity of dynamical systems with a fading memory condition130. Given random i.i.d. inputs, the approximation target in IPC is a product of multivariate orthogonal functions of the inputs; in most cases, Legendre polynomials have been used130–133. The total IPC is the sum of the approximation results with different degrees of nonlinearity. IPC can simultaneously measure the capabilities of eRCs in terms of short-term memory and nonlinear approximation, which is more scientifically meaningful and hence recommended for cross-comparison.
Kernel quality (KQ) is also used to assess an eRC's ability in terms of high-dimensional nonlinear mapping. Similar to a kernel function, an ideal eRC would produce node states that are linearly separable in a higher-dimensional space. KQ can be measured by calculating the rank of a matrix composed of the state vectors produced at the end of distinct random sequence inputs134. For similar purposes, the richness of the state vector can be measured by the number of linearly uncoupled dynamics, for example, the number of principal components that can explain more than 90% of the state variables19.
Furthermore, generalization rank (GR) indicates how fast the reservoir responds to the recent input sequence while minimizing the impact of initial states32,69,134. Note that a noise-tolerant scheme should be applied when calculating the ranks of KQ and GR by considering the noise of reservoir states. Recently, the CHARC framework was proposed to evaluate and predict eRC performance by combining MC, KQ, GR and a high-level approximation algorithm/network135.
Apart from these task-independent properties, eRCs have been comprehensively evaluated using benchmark tasks. To demonstrate temporal signal processing, eRCs have been used to forecast or approximate a number of classical chaotic signals and time series, including Mackey–Glass chaotic systems24,31,32,94,116, Hénon map time series77 and Santa Fe laser intensity69,122,136. Also, nonlinear autoregressive moving average (NARMA) systems137 are a widely accepted benchmark to test the capabilities of both nonlinear system approximation and memory32,88,106,138. For example, an interesting simulation69 attempted to correlate a delay-coupled RC's KQ and GR with its performance in approximating a tenth-order NARMA system.
A wide range of machine-learning tasks have been used to evaluate the performance of eRCs in practical scenarios, including simple

[Figure 5 graphic. Panels a–d sketch digital, mixed-signal, all-analogue and M3D-integrated RC systems with their ADC interfaces. Panel e timeline: 2001 — ESN (ref. 13), LSM (refs. 14,15), 'the liquid brain' (ref. 30); 2011 — mixed-signal RC: delay-coupled RC (refs. 31,39,67–73), dynamic devices RC (refs. 23,24,33,76–93), in materia RC (refs. 94–117); 2021 — analogue RC: rotating neurons RC (ref. 32), all-analogue RC system (ref. 33); 2024 — M3D integrated analogue RC; future directions spanning architecture, algorithm, physical node, output layer, implementation and application.]
Fig. 5 | Evolution of RC implementations. a, RC was initially proposed as a machine-learning algorithm, in which the implementation was in the digital domain and its interactions with the environment were interfaced through ADC modules. b, Since 2011, the reservoir layer has been approximated by physical systems, and the output layer has remained in the digital domain, forming mixed-signal RC systems. c, Recently, eRC has been integrated with a memristor array-based output layer to create all-analogue RC, whose sensing, computing and interaction with the environment can all occur in the analogue domain. d, Future monolithic three-dimensional (M3D) integration of RC. The reservoir layer based on emerging devices is situated at the top of the system to interact with the environment, and the memristor-based output layer in the middle serves as a VMM accelerator. The bottom layer, composed of silicon CMOS circuits, provides the control units for the overall system. Data can be shuffled across these functional layers through high-density interlayer vias with ultrahigh bandwidth, which helps to further boost the system performance. e, Timeline of RC development, summarizing the milestones, the corresponding references, and the future perspectives.
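The MC-recalling experiment described under 'Performance benchmark' can be sketched in a few lines. The reservoir below is a small software echo state network standing in for a physical one; the node count, weight scalings and the delay cutoff (30 rather than ∞) are illustrative assumptions, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(1)

def memory_capacity(n_nodes=50, n_steps=3000, max_delay=30, washout=100):
    """Estimate MC by training one linear readout per delay i to recall
    u(t - i) and summing the squared correlations (a software stand-in
    for a physical reservoir; all sizes here are illustrative)."""
    W_in = rng.uniform(-0.1, 0.1, n_nodes)            # input weights
    W = rng.normal(size=(n_nodes, n_nodes))           # recurrent weights
    W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # spectral radius 0.9

    u = rng.uniform(-0.5, 0.5, n_steps)               # i.i.d. input sequence
    x = np.zeros(n_nodes)
    X = np.empty((n_steps, n_nodes))
    for t in range(n_steps):                          # collect reservoir states
        x = np.tanh(W @ x + W_in * u[t])
        X[t] = x

    mc = 0.0
    for i in range(1, max_delay + 1):
        Xi, yi = X[washout:], np.roll(u, i)[washout:] # target: input delayed by i
        w, *_ = np.linalg.lstsq(Xi, yi, rcond=None)   # least-squares readout
        r = np.corrcoef(Xi @ w, yi)[0, 1]
        mc += r ** 2                                  # MC_i = squared correlation
    return mc

print(round(memory_capacity(), 2))                    # MC is bounded by n_nodes
```

Replacing np.tanh with the identity should push the estimate towards the node count, consistent with the observation that linear nodes yield a much higher MC than nonlinear ones.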
computing, which are particularly appealing for resource-limited applications such as edge computing and the Internet of Things (IoT).

Outlook
Despite the recent success of eRC, various challenges still need to be addressed for the technology to deliver practical applications. Here we discuss potential future developments in five areas: architecture and connectivity, algorithm, physical node, hardware implementation and application.

Architecture and connectivity
Architecture determines the connectivity between physical nodes and how physical dynamics are used to create state vectors. A key consideration in architecture design is the enhancement of the richness of reservoir states without needlessly increasing the reservoir size. From the viewpoint of inter-node connectivity, existing architectures provide limited flexibility in defining the inter-node connections in the reservoir layer: delay-coupled RC and rotating neurons RC use a ring structure and have sparse inter-node connections (cyclic reservoir), while dynamic devices RC lacks interaction between nodes. In in materia RC, state changes in a junction are only allowed when the electrodes to which the signal is applied are adjacent; the connectivity therefore mainly depends on the signal transmission path, which is complicated and not readily accessible.
To improve the theoretical foundations, a systematic study of the correlation between performance metrics and different architectures is needed. Proposing a new architecture could start from the basic RC principle, keeping the feasibility of hardware implementation in mind—a PRC that costs more resources than digital RC is meaningless. Furthermore, eRC architectures can take inspiration from RC algorithms and also biological systems (Fig. 1a), exploring more bio-plausible neural activities, synaptic plasticity and even structural plasticity.

Algorithm
RC algorithms determine the computational potential of eRC. As the reservoir layer becomes deeper and wider, the in–out range of each neuron is usually not optimized, as the connections are randomly generated and fixed, unlike in gradient descent-based DNNs, where the working condition of every neuron can be fine-tuned through the pre-neuron weights and bias. This leads to saturation of RC performance at a relatively small network scale (usually fewer than 2,000 neurons). It is thus challenging for RC to obtain state-of-the-art results in machine-learning benchmarks, indicating its relatively weaker computational capabilities compared with gradient descent-based DNNs20.
Novel RC algorithms—such as the recently proposed hierarchical RC144, deep RC145,146, echo state graph neural networks147 and next-generation RC148—could inspire innovation in eRC architectures and physical nodes. New RC algorithms could also pose different requirements for physical implementations. For example, investigations in cyclic RC have enabled the development of delay-coupled RC31 and rotating neurons RC32.
Theoretical insight into RC algorithms could facilitate hardware implementations. For example, investigations into the input/output layer, MC and benchmarks can be directly used to optimize eRC. In addition, the algorithm work is not limited to RC itself. As a physical implementation of an RNN, eRC can act as a trainable arithmetic unit in a larger physical network or system to handle more complicated scenarios, which is an open question for the future application of eRC.

Physical node
Discovering physical nodes for eRC is not demanding, as most electronics possess nonlinearity and controllable time constants. Therefore, simply replacing the physical node with another type of electronics may offer limited innovation. Instead, physical nodes that are specifically designed for eRC are preferred. In particular, the underlying mechanism determining their characteristics, and how their device-to-device variations affect the eRC system performance and implementation cost, is of interest. A well-designed physical node array should be able to efficiently extract uncorrelated or complementary information from the temporal input, without needlessly expanding the reservoir size and while minimizing the power consumption and implementation costs.

Hardware implementation
Demonstrating eRC performance by measuring standalone physical nodes—while implementing the rest of the system through software simulation—can only validate the principle of an approach, and it is a long way from a technically sound, fully engineered eRC system. Future works should therefore systematically consider all three layers, the end-to-end signal flows, power consumption, real-time demonstration and the interfaces between modules.
Currently, there are three levels of eRC system integration: measurement of discrete devices77, circuit-level integration on printed circuit boards (PCBs)32,33 and chip-level integration. Recently, chip designs for CMOS-implemented eRC have been reported for delay-coupled RC72,73, and more efforts towards chip-level integration are expected in the near future. Furthermore, monolithic 3D integration of analogue RC (Fig. 5d) could maximize the node density and interlayer connectivity149,150, an appealing perspective for the hardware implementation of RC.

Application
Generally, we should maximize the advantages of eRC in lightweight analogue computing, low-power computing and temporal signal processing, while avoiding its shortcomings in computing capability. The applications of eRC can also be considered from a continuous dynamical system viewpoint, rather than a machine-learning classifier viewpoint.
In particular, eRC is naturally well suited to computing at the edge, where analogue sensors are integrated for signal pre-processing or information extraction (that is, near- and in-sensor computing). eRC is also free from the memory wall limitations of digital computers. Thus, a complete all-analogue eRC system could provide ultrahigh-speed nonlinear computing, which has already been demonstrated with optical RC38 (albeit in a bulky and expensive implementation). Finally, eRC could provide a solution for trainable nonlinear controllers, in which combination with feedback sensors is also possible.

References
1. Moore, G. E. Cramming more components onto integrated circuits (reprinted from Electronics, 114–117, 19 April 1965). Proc. IEEE 86, 82–85 (1998).
2. Frank, D. J. et al. Device scaling limits of Si MOSFETs and their application dependencies. Proc. IEEE 89, 259–288 (2001).
3. Schaller, R. R. Moore's Law: past, present and future. IEEE Spectr. 34, 52–59 (1997).
4. Backus, J. Can programming be liberated from the von Neumann style? Commun. ACM 21, 613–641 (1978).
5. Zhang, W. Q. et al. Neuro-inspired computing chips. Nat. Electron. 3, 371–382 (2020).
6. Ielmini, D. & Wong, H. S. P. In-memory computing with resistive switching devices. Nat. Electron. 1, 333–343 (2018).
7. Indiveri, G. & Liu, S. C. Memory and information processing in neuromorphic systems. Proc. IEEE 103, 1379–1397 (2015).
8. Cazettes, F. et al. A reservoir of foraging decision variables in the mouse brain. Nat. Neurosci. 26, 840–849 (2023).
9. Mante, V., Sussillo, D., Shenoy, K. V. & Newsome, W. T. Context-dependent computation by recurrent dynamics in prefrontal cortex. Nature 503, 78–84 (2013).
10. Sussillo, D. & Abbott, L. F. Generating coherent patterns of activity from chaotic neural networks. Neuron 63, 544–557 (2009).
11. Enel, P., Procyk, E., Quilodran, R. & Dominey, P. F. Reservoir computing properties of neural dynamics in prefrontal cortex. PLoS Comput. Biol. 12, e1004967 (2016).
12. Suárez, L. E., Richards, B. A., Lajoie, G. & Misic, B. Learning function from structure in neuromorphic networks. Nat. Mach. Intell. 3, 771–786 (2021).
13. Jaeger, H. The 'Echo State' Approach to Analysing and Training Recurrent Neural Networks—with an Erratum Note. Technical Report 148, 13 (German National Research Center for Information Technology, 2001).
This paper proposed the concept of the ESN.
14. Natschläger, T., Maass, W. & Markram, H. The 'liquid computer': a novel strategy for real-time computing on time series. Spec. Issue Found. Inf. Process. TELEMATIK 8, 39–43 (2002).
15. Maass, W., Natschläger, T. & Markram, H. Real-time computing without stable states: a new framework for neural computation based on perturbations. Neural Comput. 14, 2531–2560 (2002).
This paper proposed the LSM.
16. Verstraeten, D., Schrauwen, B., D'Haene, M. & Stroobandt, D. An experimental unification of reservoir computing methods. Neural Netw. 20, 391–403 (2007).
This paper unified the ESN and LSM as reservoir computing.
17. Lukoševičius, M. A practical guide to applying echo state networks. Lect. Notes Comput. Sci. 7700, 659–686 (2012).
18. Jaeger, H. Short Term Memory in Echo State Networks (GMD Forschungszentrum Informationstechnik, 2002).
19. Gallicchio, C. & Micheli, A. Richness of deep echo state network dynamics. In 2019 Advances in Computational Intelligence: 15th International Work-Conference on Artificial Neural Networks (IWANN) Part I, 480–491 (Springer, 2019).
20. Sun, C. et al. A systematic review of echo state networks from design to application. IEEE Trans. Artif. Intell. 5, 23–37 (2022).
21. Lukoševičius, M. & Jaeger, H. Reservoir computing approaches to recurrent neural network training. Comput. Sci. Rev. 3, 127–149 (2009).
22. Chang, H. T. & Futagami, K. Reinforcement learning with convolutional reservoir computing. Appl. Intell. 50, 2400–2410 (2020).
23. Du, C. et al. Reservoir computing using dynamic memristors for temporal information processing. Nat. Commun. 8, 2204 (2017).
This paper proposed dynamic devices RC.
24. Moon, J. et al. Temporal data classification and forecasting using a memristor-based reservoir computing system. Nat. Electron. 2, 480–487 (2019).
25. Sun, L. et al. In-sensor reservoir computing for language learning via two-dimensional memristors. Sci. Adv. 7, eabg1455 (2021).
26. Nakajima, M. et al. Physical deep learning with biologically inspired training method: gradient-free approach for physical hardware. Nat. Commun. 13, 7847 (2022).
27. Cucchi, M., Abreu, S., Ciccone, G., Brunner, D. & Kleemann, H. Hands-on reservoir computing: a tutorial for practical implementation. Neuromorphic Comput. Eng. 2, 032002 (2022).
28. Tanaka, G. et al. Recent advances in physical reservoir computing: a review. Neural Netw. 115, 100–123 (2019).
This paper reviews PRC more generally.
29. Nakajima, K. Physical reservoir computing—an introductory perspective. Jpn J. Appl. Phys. 59, 060501 (2020).
30. Fernando, C. & Sojakka, S. in Advances in Artificial Life (eds Banzhaf, W. et al.) 588–597 (Springer, 2003).
This paper realized the idea of the LSM in the physical domain.
31. Appeltant, L. et al. Information processing using a single dynamical node as complex system. Nat. Commun. 2, 468 (2011).
This paper proposed delay-coupled RC.
32. Liang, X. et al. Rotating neurons for all-analog implementation of cyclic reservoir computing. Nat. Commun. 13, 1549 (2022).
This paper proposed rotating neurons RC.
33. Zhong, Y. N. et al. A memristor-based analogue reservoir computing system for real-time and power-efficient signal processing. Nat. Electron. 5, 672–681 (2022).
This paper reports the design of an all-analogue dynamic devices RC system.
34. Vandoorne, K. et al. Toward optical signal processing using photonic reservoir computing. Opt. Express 16, 11182–11192 (2008).
35. Duport, F., Schneider, B., Smerieri, A., Haelterman, M. & Massar, S. All-optical reservoir computing. Opt. Express 20, 22783–22795 (2012).
36. Larger, L. et al. Photonic information processing beyond Turing: an optoelectronic implementation of reservoir computing. Opt. Express 20, 3241–3249 (2012).
37. Paquot, Y. et al. Optoelectronic reservoir computing. Sci. Rep. 2, 287 (2012).
38. Brunner, D., Soriano, M. C., Mirasso, C. R. & Fischer, I. Parallel photonic information processing at gigabyte per second data rates using transient states. Nat. Commun. 4, 1364 (2013).
39. Larger, L. et al. High-speed photonic reservoir computing using a time-delay-based architecture: million words per second classification. Phys. Rev. X 7, 011015 (2017).
40. Brunner, D., Soriano, M. C. & Van der Sande, G. Photonic Reservoir Computing: Optical Recurrent Neural Networks (Walter de Gruyter, 2019).
41. Rafayelyan, M., Dong, J., Tan, Y. Q., Krzakala, F. & Gigan, S. Large-scale optical reservoir computing for spatiotemporal chaotic systems prediction. Phys. Rev. X 10, 041037 (2020).
42. Dambre, J. et al. in Reservoir Computing: Theory, Physical Implementations and Applications (eds Nakajima, K. & Fischer, I.) 397–419 (Springer, 2021).
43. Nakajima, M., Tanaka, K. & Hashimoto, T. Scalable reservoir computing on coherent linear photonic processor. Commun. Phys. 4, 20 (2021).
44. Nakajima, K. et al. A soft body as a reservoir: case studies in a dynamic model of octopus-inspired soft robotic arm. Front. Comput. Neurosci. 7, 91 (2013).
45. Caluwaerts, K. et al. Design and control of compliant tensegrity robots through simulation and hardware validation. J. R. Soc. Interface 11, 20140520 (2014).
46. Nakajima, K., Li, T., Hauser, H. & Pfeifer, R. Exploiting short-term memory in soft body dynamics as a computational resource. J. R. Soc. Interface 11, 20140437 (2014).
47. Nakajima, K., Hauser, H., Li, T. & Pfeifer, R. Information processing via physical soft body. Sci. Rep. 5, 10487 (2015).
48. Bhovad, P. & Li, S. Physical reservoir computing with origami and its application to robotic crawling. Sci. Rep. 11, 13002 (2021).
49. Tanaka, K. et al. Flapping-wing dynamics as a natural detector of wind direction. Adv. Intell. Syst. 3, 2000174 (2021).
50. Sakurai, R., Nishida, M., Jo, T., Wakao, Y. & Nakajima, K. Durable pneumatic artificial muscles with electric conductivity for reliable physical reservoir computing. J. Robot. Mechatron. 34, 240–248 (2022).
51. Tanaka, K. et al. Self-organization of remote reservoirs: transferring computation to spatially distant locations. Adv. Intell. Syst. 4, 2100166 (2021).
52. Hauser, H. in Reservoir Computing, Natural Computing Series (eds Nakajima, K. & Fischer, I.) Ch. 8 (Springer, 2021).
53. Fujii, K. & Nakajima, K. Harnessing disordered-ensemble quantum dynamics for machine learning. Phys. Rev. Appl. 8, 024030 (2017).
54. Ghosh, S., Paterek, T. & Liew, T. C. H. Quantum neuromorphic platform for quantum state preparation. Phys. Rev. Lett. 123, 260404 (2019).
55. Chen, J. Y., Nurdin, H. I. & Yamamoto, N. Temporal information processing on noisy quantum computers. Phys. Rev. Appl. 14, 024065 (2020).
56. Martinez-Pena, R., Giorgi, G. L., Nokkala, J., Soriano, M. C. & Zambrini, R. Dynamical phase transitions in quantum reservoir computing. Phys. Rev. Lett. 127, 100502 (2021).
57. Kubota, T. et al. Temporal information processing induced by quantum noise. Phys. Rev. Res. 5, 023057 (2023).
58. Tran, Q. H. & Nakajima, K. Learning temporal quantum tomography. Phys. Rev. Lett. 127, 260401 (2021).
59. Mujal, P. et al. Opportunities in quantum reservoir computing and extreme learning machines. Adv. Quantum Technol. 4, 2100027 (2021).
60. Fujii, K. & Nakajima, K. in Reservoir Computing, Natural Computing Series (eds Nakajima, K. & Fischer, I.) Ch. 18 (Springer, 2021).
61. Ghosh, S., Nakajima, K., Krisnanda, T., Fujii, K. & Liew, T. C. H. Quantum neuromorphic computing with reservoir computing networks. Adv. Quantum Technol. 4, 2100053 (2021).
62. Nakajima, K. & Fischer, I. Reservoir Computing (Springer, 2021).
63. Cao, J. et al. Emerging dynamic memristors for neuromorphic reservoir computing. Nanoscale 14, 289–298 (2022).
64. Rodan, A. & Tino, P. Minimum complexity echo state network. IEEE Trans. Neural Netw. 22, 131–144 (2011).
65. Ortin, S. et al. A unified framework for reservoir computing and extreme learning machines based on a single time-delayed neuron. Sci. Rep. 5, 14945 (2015).
66. Soriano, M. C., Brunner, D., Escalona-Moran, M., Mirasso, C. R. & Fischer, I. Minimal approach to neuro-inspired information processing. Front. Comput. Neurosci. 9, 68 (2015).
67. Soriano, M. C. et al. Delay-based reservoir computing: noise effects in a combined analog and digital implementation. IEEE Trans. Neural Netw. Learn. Syst. 26, 388–393 (2015).
68. Liang, X., Li, H., Vuckovic, A., Mercer, J. & Heidari, H. A neuromorphic model with delay-based reservoir for continuous ventricular heartbeat detection. IEEE Trans. Biomed. Eng. 69, 1837–1849 (2022).
69. Appeltant, L. Reservoir Computing based on Delay-Dynamical Systems. PhD thesis, Univ. Illes Balears (2012).
70. Ortín, S. & Pesquera, L. Tackling the trade-off between information processing capacity and rate in delay-based reservoir computers. Front. Phys. 7, 210 (2019).
71. Stelzer, F., Rohm, A., Ludge, K. & Yanchuk, S. Performance boost of time-delay reservoir computing by non-resonant clock cycle. Neural Netw. 124, 158–169 (2020).
72. Bai, K. J., Liu, L. J. & Yi, Y. Spatial-temporal hybrid neural network with computing-in-memory architecture. IEEE Trans. Circuits Syst. I Regul. Pap. 68, 2850–2862 (2021).
73. Chandrasekaran, S. T., Bhanushali, S. P., Banerjee, I. & Sanyal, A. Toward real-time, at-home patient health monitoring using reservoir computing CMOS IC. IEEE J. Emerg. Sel. Top. Circuits Syst. 11, 829–839 (2021).
74. Van der Sande, G., Brunner, D. & Soriano, M. C. Advances in photonic reservoir computing. Nanophotonics 6, 561–576 (2017).
75. Kendall, J. D. & Kumar, S. The building blocks of a brain-inspired computer. Appl. Phys. Rev. 7, 011305 (2020).
76. Nakajima, K., Fujii, K., Negoro, M., Mitarai, K. & Kitagawa, M. Boosting computational power through spatial multiplexing in quantum reservoir computing. Phys. Rev. Appl. 11, 034021 (2019).
77. Zhong, Y. et al. Dynamic memristor-based reservoir computing for high-efficiency temporal signal processing. Nat. Commun. 12, 408 (2021).
78. Yang, J. et al. Tunable synaptic characteristics of a Ti/TiO2/Si memory device for reservoir computing. ACS Appl. Mater. Interfaces 13, 33244–33252 (2021).
79. Wang, T., Huang, H. M., Wang, X. X. & Guo, X. An artificial olfactory inference system based on memristive devices. Infomat 3, 804–813 (2021).
80. Jaafar, A. H. et al. 3D-structured mesoporous silica memristors for neuromorphic switching and reservoir computing. Nanoscale 14, 17170–17181 (2022).
81. Zhu, X., Wang, Q. & Lu, W. D. Memristor networks for real-time neural activity analysis. Nat. Commun. 11, 2439 (2020).
82. Torrejon, J. et al. Neuromorphic computing with nanoscale spintronic oscillators. Nature 547, 428–431 (2017).
83. Tang, M. F. et al. A compact fully ferroelectric-FETs reservoir computing network with sub-100-ns operating speed. IEEE Electron Device Lett. 43, 1555–1558 (2022).
84. Yu, J. et al. Energy efficient and robust reservoir computing system using ultrathin (3.5 nm) ferroelectric tunneling junctions for temporal data learning. In 2021 IEEE Symposium on VLSI Technology 1–2 (IEEE, 2021).
85. Duong, N. T. et al. Dynamic ferroelectric transistor-based reservoir computing for spatiotemporal information processing. Adv. Intell. Syst. 5, 2300009 (2023).
86. Chen, Z. et al. All-ferroelectric implementation of reservoir computing. Nat. Commun. 14, 3585 (2023).
87. Liang, X. C., Luo, Y. Y., Pei, Y. L., Wang, M. Y. & Liu, C. Multimode transistors and neural networks based on ion-dynamic capacitance. Nat. Electron. 5, 859–869 (2022).
88. Nishioka, D. et al. Edge-of-chaos learning achieved by ion-electron-coupled dynamics in an ion-gating reservoir. Sci. Adv. 8, eade1156 (2022).
89. Liu, K. et al. An optoelectronic synapse based on α-In2Se3 with controllable temporal dynamics for multimode and multiscale reservoir computing. Nat. Electron. 5, 761–773 (2022).
90. Jang, Y. H. et al. Time-varying data processing with nonvolatile memristor-based temporal kernel. Nat. Commun. 12, 5727 (2021).
91. Du, W. et al. An optoelectronic reservoir computing for temporal information processing. IEEE Electron Device Lett. 43, 406–409 (2022).
92. Zhang, Z. et al. In-sensor reservoir computing system for latent fingerprint recognition with deep ultraviolet photo-synapses and memristor array. Nat. Commun. 13, 6590 (2022).
93. Lao, J. et al. Ultralow-power machine vision with self-powered sensor reservoir. Adv. Sci. 9, e2106092 (2022).
94. Milano, G. et al. In materia reservoir computing with a fully memristive architecture based on self-organizing nanowire networks. Nat. Mater. 21, 195–202 (2022).
95. Sillin, H. O. et al. A theoretical and experimental study of neuromorphic atomic switch networks for reservoir computing. Nanotechnology 24, 384004 (2013).
This paper proposed in materia RC.
96. Kan, S. H. et al. Simple reservoir computing capitalizing on the nonlinear response of materials: theory and physical implementations. Phys. Rev. Appl. 15, 024030 (2021).
97. Tanaka, H. et al. In-materio computing in random networks of carbon nanotubes complexed with chemically dynamic molecules: a review. Neuromorphic Comput. Eng. 2, 022002 (2022).
98. Diaz-Alvarez, A. et al. Emergent dynamics of neuromorphic nanowire networks. Sci. Rep. 9, 14920 (2019).
99. Daniels, R. K. et al. Reservoir computing with 3D nanowire networks. Neural Netw. 154, 122–130 (2022).
100. Hochstetter, J. et al. Avalanches and edge-of-chaos learning in neuromorphic nanowire networks. Nat. Commun. 12, 4008 (2021).
101. Milano, G., Montano, K. & Ricciardi, C. In materia implementation strategies of physical reservoir computing with memristive nanonetworks. J. Phys. D 56, 084005 (2023).
102. Lilak, S. et al. Spoken digit classification by in-materio reservoir computing with neuromorphic atomic switch networks. Front. Nanotechnol. https://doi.org/10.3389/fnano.2021.675792 (2021).
103. Tanaka, H. et al. A molecular neuromorphic network device consisting of single-walled carbon nanotubes complexed with polyoxometalate. Nat. Commun. 9, 2693 (2018).
104. Usami, Y. et al. In-materio reservoir computing in a sulfonated polyaniline network. Adv. Mater. 33, e2102688 (2021).
105. Cucchi, M. et al. Reservoir computing with biocompatible organic electrochemical networks for brain-inspired biosignal classification. Sci. Adv. 7, eabh0693 (2021).
106. Jiang, W. C. et al. Physical reservoir computing using magnetic skyrmion memristor and spin torque nano-oscillator. Appl. Phys. Lett. 115, 192403 (2019).
107. Nako, E., Toprasertpong, K., Nakane, R., Takenaka, M. & Takagi, S. Experimental demonstration of novel scheme of HZO/Si FeFET reservoir computing with parallel data processing for speech recognition. In 2022 IEEE Symposium on VLSI Technology and Circuits 220–221 (IEEE, 2022).
108. Toprasertpong, K. et al. Reservoir computing on a silicon platform with a ferroelectric field-effect transistor. Commun. Eng. 1, 21 (2022).
109. Liu, K. et al. Multilayer reservoir computing based on ferroelectric α-In2Se3 for hierarchical information processing. Adv. Mater. 34, e2108826 (2022).
110. Momeni, A. & Fleury, R. Electromagnetic wave-based extreme deep learning with nonlinear time-Floquet entanglement. Nat. Commun. 13, 2651 (2022).
111. Marcucci, G., Pierangeli, D. & Conti, C. Theory of neuromorphic computing by waves: machine learning by rogue waves, dispersive shocks and solitons. Phys. Rev. Lett. 125, 093901 (2020).
112. Silva, N. A., Ferreira, T. D. & Guerreiro, A. Reservoir computing with solitons. New J. Phys. 23, 023013 (2021).
113. Maksymov, I. S. & Pototsky, A. Reservoir computing based on solitary-like waves dynamics of liquid film flows: a proof of concept. Europhys. Lett. 142, 43001 (2023).
114. Nakane, R., Hirose, A. & Tanaka, G. Spin waves propagating through a stripe magnetic domain structure and their applications to reservoir computing. Phys. Rev. Res. 3, 1–15 (2021).
115. Nakane, R., Hirose, A. & Tanaka, G. Performance enhancement of a spin-wave-based reservoir computing system utilizing different physical conditions. Phys. Rev. Appl. 19, 034047 (2023).
116. Gartside, J. C. et al. Reconfigurable training and reservoir
124. Yu, J. et al. Energy efficient and robust reservoir computing system using ultrathin (3.5 nm) ferroelectric tunneling junctions for temporal data learning. Proc. 2021 Symp. VLSI Technol. 2, 16–14 (2021).
125. Yao, P. et al. Fully hardware-implemented memristor convolutional neural network. Nature 577, 641–646 (2020).
126. Strukov, D. B., Snider, G. S., Stewart, D. R. & Williams, R. S. The missing memristor found. Nature 453, 80–83 (2008).
127. Jaeger, H. & Haas, H. Harnessing nonlinearity: predicting chaotic systems and saving energy in wireless communication. Science 304, 78–80 (2004).
128. Stelzer, F., Rohm, A., Vicente, R., Fischer, I. & Yanchuk, S. Deep neural networks using a single neuron: folded-in-time architecture using feedback-modulated delay loops. Nat. Commun. 12, 5164 (2021).
129. Inubushi, M. & Yoshimura, K. Reservoir computing beyond memory-nonlinearity trade-off. Sci. Rep. 7, 10199 (2017).
130. Dambre, J., Verstraeten, D., Schrauwen, B. & Massar, S. Information processing capacity of dynamical systems. Sci. Rep. 2, 514 (2012).
131. Kubota, T., Takahashi, H. & Nakajima, K. Unifying framework for information processing in stochastically driven dynamical systems. Phys. Rev. Res. 3, 043135 (2021).
132. Vettelschoss, B., Rohm, A. & Soriano, M. C. Information processing capacity of a single-node reservoir computer: an experimental evaluation. IEEE Trans. Neural Netw. Learn. Syst. 33, 2714–2725 (2022).
133. Akashi, N. et al. Input-driven bifurcations and information processing capacity in spintronics reservoirs. Phys. Rev. Res. 2, 043303 (2020).
134. Legenstein, R. & Maass, W. Edge of chaos and prediction of computational performance for neural circuit models. Neural Netw. 20, 323–334 (2007).
135. Dale, M., Miller, J. F., Stepney, S. & Trefzer, M. A. A substrate-independent framework to characterize reservoir computers. Proc. Math. Phys. Eng. Sci. 475, 20180723 (2019).
136. Soriano, M. C. et al. Optoelectronic reservoir computing: tackling noise-induced performance degradation. Opt. Express 21, 12–20 (2013).
137. Atiya, A. F. & Parlos, A. G. New results on recurrent network training: unifying the algorithms and accelerating convergence. IEEE Trans. Neural Netw. 11, 697–709 (2000).
138. Bai, K. J. & Yi, Y. DFR: an energy-efficient analog delay feedback reservoir computing system for brain-inspired computing. ACM J.
computing in an artificial spin–vortex ice via spin–wave Emerg. Technol. Comput. Syst. 14, 1–22 (2018).
fingerprinting. Nat. Nanotechnol. 17, 460–469 (2022). 139. Sun, J. et al. Novel nondelay-based reservoir computing with a
117. Zhu, R. et al. Information dynamics in neuromorphic nanowire single micromechanical nonlinear resonator for high-efficiency
networks. Sci. Rep. 11, 13047 (2021). information processing. Microsyst. Nanoeng. 7, 83 (2021).
118. Vidamour, I. T. et al. Reconfigurable reservoir computing in a 140. Dion, G., Mejaouri, S. & Sylvestre, J. Reservoir computing with a
magnetic metamaterial. Commun. Phys. 6, 230 (2023). single delay-coupled non-linear mechanical oscillator. J. Appl.
119. Marković, D. et al. Reservoir computing with the frequency, phase Phys. 124, 152132 (2018).
and amplitude of spin–torque nano-oscillators. Appl. Phys. Lett. 141. Donati, E. et al. Processing EMG signals using reservoir computing
114, 12409 (2019). on an event-based neuromorphic system. In 2018 IEEE Biomedical
120. Sun, W. et al. 3D reservoir computing with high area efficiency Circuits and Systems Conference (BioCAS) 1–4 (IEEE, 2018).
(5.12 TOPS/mm2) implemented by 3D dynamic memristor array 142. Kan, S., Nakajima, K., Asai, T. & Akai-Kasaya, M. Physical
for temporal signal processing. In 2022 IEEE Symposium on VLSI implementation of reservoir computing through electrochemical
Technology and Circuits (VLSI Technology and Circuits) 222–223 reaction. Adv. Sci. 9, e2104076 (2022).
(IEEE, 2022). 143. Akashi, N. et al. A coupled spintronics neuromorphic approach
121. Kuriki, Y., Nakayama, J., Takano, K. & Uchida, A. Impact of input for high-performance reservoir computing. Adv. Intell. Syst. 4,
mask signals on delay-based photonic reservoir computing with 2200123 (2022).
semiconductor lasers. Opt. Express 26, 5777–5788 (2018). 144. Moon, J., Wu, Y. & Lu, W. D. Hierarchical architectures in reservoir
122. Appeltant, L., Van der Sande, G., Danckaert, J. & Fischer, I. computing systems. Neuromorphic Comput. Eng. 1, 014006 (2021).
Constructing optimized binary masks for reservoir computing 145. Gallicchio, C. & Micheli, A. in Reservoir Computing Natural Com
with delay systems. Sci. Rep. 4, 3629 (2014). puting Series (eds Nakajima, K. & Fischer, I.) Ch. 4 (Springer, 2021).
123. LeCun, Y. The MNIST Database of Handwritten Digits (1998); 146. Gallicchio, C., Micheli, A. & Pedrelli, L. Deep reservoir computing: a
http://yann.lecun.com/exdb/mnist/ critical experimental analysis. Neurocomputing 268, 87–99 (2017).
147. Wang, S. et al. Echo state graph neural networks with analogue random resistive memory arrays. Nat. Mach. Intell. 5, 104–113 (2023).
148. Gauthier, D. J., Bollt, E., Griffith, A. & Barbosa, W. A. S. Next generation reservoir computing. Nat. Commun. 12, 5564 (2021).
149. Li, Y. et al. Monolithic 3D integration of logic, memory and computing-in-memory for one-shot learning. In 2021 IEEE International Electron Devices Meeting (IEDM) 21.25.21–21.25.24 (IEEE, 2021).
150. An, R. et al. A hybrid computing-in-memory architecture by monolithic 3D integration of BEOL CNT/IGZO-based CFET logic and analog RRAM. In 2022 International Electron Devices Meeting (IEDM) 18.11.11–18.11.14 (IEEE, 2022).

Acknowledgements
This work was in part supported by STI 2030-Major Projects 2022ZD0210200, the National Natural Science Foundation of China (92264201, 62025111 and 62104126) and the XPLORER Prize. X.L. is supported by the Shuimu Tsinghua Scholar Program of Tsinghua University.

Author contributions
X.L. and J.T. conceived the idea and wrote the paper. All authors discussed and commented on the paper.

Competing interests
The authors declare no competing interests.

Additional information
Supplementary information The online version contains supplementary material available at https://doi.org/10.1038/s41928-024-01133-z.

Correspondence and requests for materials should be addressed to Jianshi Tang.

Peer review information Nature Electronics thanks Cheol Seong Hwang, Hans Kleemann, Wei Lu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Reprints and permissions information is available at www.nature.com/reprints.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

© Springer Nature Limited 2024