
Communication

Memristor Arrays

Memristor-Based Analog Computation and Neural Network Classification with a Dot Product Engine
Miao Hu, Catherine E. Graves, Can Li, Yunning Li, Ning Ge, Eric Montgomery,
Noraica Davila, Hao Jiang, R. Stanley Williams, J. Joshua Yang,* Qiangfei Xia,*
and John Paul Strachan*

Prof. M. Hu, Dr. C. E. Graves, E. Montgomery, Dr. N. Davila, Dr. R. S. Williams, Dr. J. P. Strachan
Hewlett Packard Labs, Hewlett Packard Enterprise, Palo Alto, CA 94304, USA
E-mail: [email protected]

C. Li, Y. Li, H. Jiang, Prof. J. J. Yang, Prof. Q. Xia
Department of Electrical and Computer Engineering, University of Massachusetts, Amherst, MA 01003, USA
E-mail: [email protected]; [email protected]

Dr. N. Ge
HP Labs, HP Inc., Palo Alto, CA 94304, USA

The ORCID identification number(s) for the author(s) of this article can be found under https://doi.org/10.1002/adma.201705914.

DOI: 10.1002/adma.201705914

Using memristor crossbar arrays to accelerate computations is a promising approach to efficiently implement algorithms in deep neural networks. Early demonstrations, however, are limited to simulations or small-scale problems, primarily due to materials and device challenges that limit the size of the memristor crossbar arrays that can be reliably programmed to stable and analog values, which is the focus of the current work. High-precision analog tuning and control of memristor cells across a 128 × 64 array is demonstrated, and the resulting vector matrix multiplication (VMM) computing precision is evaluated. Single-layer neural network inference is performed in these arrays, and the performance compared to a digital approach is assessed. The memristor computing system used here reaches a VMM accuracy equivalent of 6 bits, and an 89.9% recognition accuracy is achieved for the 10k MNIST handwritten digit test set. Forecasts show that with integrated (on chip) and scaled memristors, a computational efficiency greater than 100 trillion operations per second per Watt is possible.

The potential to "compute by physics" in resistive networks through Ohm's law and Kirchhoff's current law has been recognized since the early 1950s, with networks designed for wide-ranging applications such as solving partial differential equations,[1] image filtering,[2] motion computing,[3] and neural network algorithms.[4] This approach was easily overshadowed by the rapid development of complementary metal-oxide semiconductor (CMOS) transistor-based digital computation. However, general-purpose digital systems for computing are now hitting a wall in energy efficiency with the approaching end of Moore's law and the growing challenge of the von Neumann communication bottleneck as dataset sizes have exploded. There is also increasing demand for both higher performance in specific applications, such as machine learning and artificial neural networks, and for drastically lower energy consumption in paradigms such as the Internet of Things and computing at the Edge.[5] To accelerate neural network computations, a number of special-purpose and highly efficient architectures have been proposed and developed, both in the digital domain[6–9] and with mixed analog implementations utilizing emerging nonvolatile technology such as memristors.[10–12] The latter are of particular interest in the broader field of neuromorphic computing[13–15] since memristors offer a highly scalable, high speed, and low power realization of functionality found within biological nervous systems.[16] This represents a development of memristors for applications beyond high-performance memory[17] to computational acceleration.

A core computational operation of interest with memristors is vector matrix multiplication (VMM), which can be naturally implemented in dense crossbar geometries in a single analog computational step utilizing Ohm's law and Kirchhoff's current law for summation. This dense memristor crossbar approach for VMM is one example of the "computing by physics" paradigm,[18] among other recently proposed and developed analog systems.[19–24] Accelerating VMM computations at lower power through memristor crossbars has the potential to impact important, matrix-heavy applications in speech recognition, image and video classification, neuromorphic engineering, and signal processing. Even more impact comes from utilizing the nonvolatility of memristors to significantly reduce the data fetching and communication costs that fundamentally limit nearly all system performance and constitute the well-known von Neumann bottleneck.[25] This bottleneck is particularly pronounced in the training and inference of deep neural networks, where the fetching and updating of tens of millions of synaptic weights leads to weeks of computation time and many kilowatts of power consumption, limiting the future ability to scale such networks to sizes found in human brains (>10^15 synapses).


Crossbars built out of memristors offer a scalable nanotechnology[26] with broad conductance tunability,[27] and various bioneurological qualities such as stochasticity and transient plasticity.[28,29] In order to be implemented practically, demonstrations of sufficient memristor crossbar yield, repeatability, and controllability are required for these targeted applications.

Many recent works have explored using memristor crossbar arrays for computation, particularly in neural network applications.[30–43] These works considered various memristor technologies, but predominantly chalcogenide phase change memories (PCM) and transition metal-oxide memristors. However, the majority of these works relied heavily on simulations to forecast performance and accuracy for computations within crossbars, without demonstrating actual computations within memristor crossbars. In many cases, the coupled nonlinearities, sneak currents, and other circuit issues were ignored, which are critical to consider when assessing the promise of memristors for computation. The few previous experimental works in full memristor arrays have been limited to small sizes (<1024 memristors), binary device states, or exhibited limitations on reconfigurability. Large neural networks (165 000 synapses) were demonstrated with PCM arrays,[42] but this work was limited by a sequential interface and could not carry out VMM computations in a single step with access to all word-lines and bit-lines simultaneously, as would be required for an actual computational accelerator. Face recognition with a 128 × 8 1T1R array was demonstrated with online learning,[41] and sparse encoding was demonstrated within small memristor arrays.[40] However, neither work demonstrated or reported computations within the array, nor, crucially, how the VMM computational accuracy relates to the final performance of the application. The current lack of understanding of how materials, device, and circuit issues relate to a memristor array's computing capability is a major obstacle for designers attempting to simulate and develop complex architectures utilizing memristor crossbar arrays for real computational applications.

In this work, we present a platform for reprogrammable analog computations with integrated transistor-memristor arrays that we call the Dot Product Engine (DPE). Our platform enables exploration of dense VMM computations with arbitrary target matrices converted directly from digital software algorithms. Further, our material stack design for the 1T1M memristor cells enables the precise analog tuning of the memristor conductance over a wide range to implement these arbitrary target matrices faithfully. Our approach differs from online trained implementations[34,41] in which the derived matrix values are trained specifically for that crossbar array's circuit properties. Instead, our approach enables general acceleration of any matrix operations, taking digital input vectors and matrices, converting them into the analog domain for low power, high speed computation, and then providing digital outputs. We demonstrate VMM within memristor arrays up to 128 × 64 in size at 10 MHz, yielding over 16 000 multiplications and additions in a single clock step, and leading to a forecasted performance of about 115 trillion operations per second per Watt. The resulting VMM outputs have 6 bits of precision, and we demonstrate multiple reprogramming iterations of our memristor arrays for different applications. Finally, we implement a single-layer neural network to classify the MNIST database of handwritten characters, showing 89.9% recognition accuracy.

The DPE system was designed to perform precise programming of individual memristor cells, reprogramming of the memristor conductance matrix, and, critically, VMM computation in a single step within the memristor crossbar. The VMM computation is performed by applying an input vector of voltages to the rows as brief simultaneous pulses (Ohm's law) and collecting the resulting summed currents (Kirchhoff's law) along the columns (Figure 1a). These currents are converted to a voltage (via a transimpedance amplifier, or TIA) and measured after a short delay from the input voltage pulses. The memristor array is programmed so that the individual conductances of the memristor cells comprise the desired computational kernel, and as memristors are nonvolatile, the memristor array maintains the programmed computational kernel. As many applications such as neural network inference do not require frequent reprogramming, the memristor crossbars can be programmed once and then ignored. However, a new application can easily be implemented by simply reprogramming the memristor conductance matrix values and supplying the appropriate inputs to the rows, as we will demonstrate below.
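As a point of reference, the ideal crossbar operation described above reduces to a single matrix-vector product: each column current is the conductance-weighted sum of the row voltages. The short numerical sketch below illustrates this; the array size and conductance window mirror values quoted in this work, while the input voltages and the TIA feedback resistance are arbitrary illustrative choices, not reported parameters.

```python
import numpy as np

# Minimal numerical sketch of the ideal crossbar dot product: each column
# current is the sum over rows of conductance times applied row voltage
# (Ohm's law for each cell, Kirchhoff's current law for the summation).
rng = np.random.default_rng(0)

rows, cols = 128, 64
G = rng.uniform(100e-6, 900e-6, size=(rows, cols))   # conductances (S), illustrative range
v_in = rng.uniform(0.0, 0.2, size=rows)               # row voltage pulse amplitudes (V)

i_col = v_in @ G                                      # column currents (A): I_j = sum_i V_i * G_ij

# A transimpedance amplifier converts each column current to a voltage;
# R_tia here is an arbitrary illustrative value, not a reported parameter.
R_tia = 1e3
v_out = i_col * R_tia

print(v_out[:5])
```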
Within the memristor array, we utilize one transistor-one memristor (1T1M) cells in which the transistor limits and controls current during programming to facilitate high accuracy programming but is left fully open in computation mode. The CMOS–memristor integration is carried out in a foundry-compatible back-end-of-the-line (BEOL) process (Figure 1b). A transition metal-oxide memristor layer is deposited and patterned atop CMOS access transistors and wiring with 2.0 µm technology. While such 1T1M integration can increase the area compared to purely passive crossbar arrays,[34] even 1T1M-based architectures can reduce silicon area compared to purely digital approaches.[11] Passive or 1S1R arrays, typically utilized to provide a degree of isolation during programming through device nonlinearity, are not suitable for many VMM applications, as the nonlinearity directly invalidates the utilization of Ohm's law and reduces the programming accuracy of analog levels, substantially compromising VMM computation accuracy. Our memristor device stack consists of Ta/HfO2/Pd (cross-section shown in Figure 1b) and has been developed to provide multilevel conductances implemented by continuous tuning of the chemical composition of a Ta-rich conduction channel within the switching matrix.[44,45] To achieve constant resistance under different voltages (i.e., a linear I-V relationship), we selected a resistance range of 1.1–10.0 kΩ (or conductance of 100–900 µS) for accurate analog VMM operation.[46] With other memristor material systems, we previously demonstrated experimentally that 64 reprogrammable conductance levels (6 bits) can be achieved in individual 1T1M cells.[27] Here, we extend that work from individual cells to full arrays constituting thousands of cells by programming a 128 × 64 array to display the Hewlett Packard Enterprise logo and mapping the greyscale logo image to ≈180 distinct memristor conductance levels. All programming and computational signals are generated from peripheral printed circuit boards (PCBs) connected to the memristor arrays through probe-card connections (Figure S1, Supporting Information), and the design allows for an expandable set of row and column boards that can simultaneously drive (or read) up to 128 row, 64 column, and 64 selector (transistor gate) lines.


The system is designed for fast parallel analog voltage
generation and current sensing, with each row independently
configurable (Figure 1a) for either voltage driving, floating, or
ground, while the columns have the additional capability of
current sensing in computation mode. The voltage driving cir-
cuit is specifically designed to provide adjustable voltage range
(−10 to +10 V) and short, but variable, pulse widths (≈100 ns).
The capability of the DPE system to drive and read from mul-
tiple columns and rows simultaneously distinguishes the pre-
sent system from any traditional memory application and ena-
bles the present exploration of analog computing applications.
This capability also allows for the implementation of array-level
conductance programming in large arrays. Table S1 (Sup-
porting Information) gives performance specifications for the
entire hardware control system.
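To make the row and column capabilities just described concrete, the snippet below sketches one way a single DPE operation could be described in software. All names, fields, and defaults are hypothetical illustrations of the stated capabilities (per-row drive/float/ground, per-column current sensing, adjustable pulse amplitude and width); they are not the actual control-software interface.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List

# Hypothetical operation descriptor mirroring the stated DPE capabilities;
# none of these names or defaults come from the paper.
class RowMode(Enum):
    DRIVE = "drive"      # apply a voltage pulse (adjustable from -10 V to +10 V)
    FLOAT = "float"
    GROUND = "ground"

@dataclass
class RowConfig:
    mode: RowMode
    voltage: float = 0.0            # volts, only meaningful for DRIVE
    pulse_width_s: float = 100e-9   # short, variable pulse width in seconds

@dataclass
class DpeOperation:
    rows: List[RowConfig]           # up to 128 row settings
    sense_columns: List[int]        # column indices read through the TIA/S&H/ADC path
    gate_voltage: float = 5.0       # selector (transistor gate) bias during reads

# Example: drive all 128 rows with 0.2 V read-level pulses and sense all 64 columns.
op = DpeOperation(
    rows=[RowConfig(RowMode.DRIVE, voltage=0.2) for _ in range(128)],
    sense_columns=list(range(64)),
)
print(len(op.rows), len(op.sense_columns))
```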
Utilizing the DPE platform, we are able to individually access
and precisely tune memristor cell conductances (Figure 2a)
in an open-loop mode with single pulses (less precise and
accurate) or with feedback (more precise and accurate).[27] The
programming algorithm (Figure 2b) is applied across every
cell in the entire array with high efficiency. Memristor cells
were fabricated with chemically inert bottom row electrodes
and reactive top column electrodes, so a positive voltage from
row to column induces a RESET operation on the memristor
cell (Figure 1b), while a positive voltage from column to row
drives a SET operation. A selector line corresponding to the
gates of the transistors in a selected column is biased at dif-
ferent voltages to enable different current compliances for
the access transistors. All other unselected rows and columns
remain floating during individual programming. In Figure 2a,
the typical memristor switching curves are obtained by
applying a sequence of rising voltage amplitude pulses, with
lower voltage (0.2 V) read pulses applied in between to deter-
mine the resulting cell conductance (read gate voltage is fixed
to 5 V, putting transistors in the linear region). By varying
the gate voltage applied during the programming pulses, the
maximal current through the device is changed, controlling the
final conductance state during SET operations. More details on
programming and reading operations are provided in the Sup-
porting Information.
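The polarity and compliance conventions just described can be summarized in a toy behavioral model. The update rule and every constant below are invented purely for illustration; only the conventions come from the text (row-to-column pulses RESET the cell, column-to-row pulses SET it, the gate voltage acts as a current compliance that bounds the SET level, and reads use a low 0.2 V pulse with the gate fully open).

```python
# Toy behavioral illustration of the programming conventions described above.
def apply_pulse(g_now, v_memristor, v_gate, polarity, strength=0.25):
    """Return an assumed new conductance (S) after one programming pulse."""
    g_min, g_max_device = 100e-6, 900e-6
    if polarity == "SET":                        # column-to-row pulse raises conductance
        # gate voltage acts as a compliance: a higher gate bias allows a higher ceiling
        g_ceiling = min(g_max_device, 150e-6 * v_gate)
        return g_now + strength * v_memristor * max(g_ceiling - g_now, 0.0)
    if polarity == "RESET":                      # row-to-column pulse lowers conductance
        return max(g_min, g_now - strength * v_memristor * (g_now - g_min))
    raise ValueError("polarity must be 'SET' or 'RESET'")

def read_conductance(g_true, v_read=0.2):
    """Low-voltage, nondisturbing read (gate fully open): measure I, return I / V."""
    i_measured = g_true * v_read                 # Ohm's law for the cell
    return i_measured / v_read

g = 150e-6
for _ in range(5):
    g = apply_pulse(g, v_memristor=1.0, v_gate=4.0, polarity="SET")
print(read_conductance(g))                       # approaches the gate-limited ceiling
```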

Figure 1. DPE VMM scheme and processing integration of memristor device layers. a) Schematic of DPE circuit operations in a memristor matrix. Desired matrix values are written by tuning each vertex's memristor conductance. The desired vector is mapped to applied voltages (purple) along the rows, and the current collected along each column yields the VMM of the memristor conductance matrix G_ij and voltage vector V_j. The current from each column is collected by the TIA and converted to a voltage signal, which is held by the sample-and-hold (S/H) circuit and sensed by an ADC. Each column can be configured to measure current through the TIA, S/H, and ADC path, or apply voltages along the digital-to-analog converter (DAC) path. b) Memristor device fabrication and processing integration for the 1T1M memristor array. Front-end-of-the-line (FEOL) CMOS transistors and wiring made with 2 µm technology provide the base structure to precisely access individual memristor cells during programming. A foundry-compatible BEOL process integrates the Ta/HfOx/Pd memristor layer on top of the CMOS. In a 1T1M array, rows share bottom electrode (BE) lines while columns share top electrode (TE) lines and transistor gate lines, with isolation provided by the interlayer dielectric (ILD). c) Demonstration of accurate programming of the 1T1M memristor array with ≈180 conductance levels. The grayscale indicates the memristor cell conductance as given by the scale on the right.


Figure 2.  Memristor cell 1T1M properties and tuning capability. a) Memristor switching sweeps with a range of achievable memristor conductance
values. Plot shows the conductance (at 0.2 V) following the programming pulse amplitude given by the x axis. By adjusting transistor gate voltage Vgate,
applied memristor voltage Vmemristor, and pulse width, the memristor cell can be tuned to different conductance values, shown here for the example
of increasing SET gate voltage. b) Example feedback programming algorithm for obtaining a desired memristor conductance (indicated by two black
dashed lines in the top panel). For each programming cycle, the algorithm decides whether to apply a SET or RESET operation. The programming
algorithm starts by increasing Vmemristor. If the memristor conductance does not reach the desired target, then Vgate is iteratively increased as well.
Finally, the pulse width can additionally be increased as necessary. This algorithm will correct for overshooting, or the occasional drift or disturb that
can occur while programming other cells in the array. c) Histogram of the programming error (target − final programmed value) for the single-layer
neural network weight matrix shown in Figure 3a. d) Histogram of the standard deviation for the set of devices targeted for a particular programming
level for the pattern shown in Figure 1c. This shows the spread of values for individual conductance levels in the desired conductance matrix. The
inset shows the histogram of programming error (Programmed − Targeted conductance) in µS, similar to that shown in (c). Note that the inset plot
is zoomed into the main error peak, while outlier points far off-scale are also present, similar to the inset of (c).

The individual conductance state tuning shown in Figure 2a is systematically applied to all memristor cells sequentially within an array using a feedback algorithm to program the large number of memristors to match the desired conductance matrix G_target for a given application. Parallel programming schemes in which a vector of row and column voltages is applied in a single step have been proposed by other groups,[47] particularly for neural network training, but are not explored here. For a given target conductance and tolerance range, the programming feedback algorithm can vary the applied voltage pulse amplitude and sign, gate voltage, and pulse width (Figure 2b). The top panel of Figure 2b shows the conductance trajectory over 200 programming cycles, the middle panel shows the changing applied voltage pulse amplitude and sign, and the bottom panel shows the changing gate voltage magnitude. After applying this programming algorithm to every cell in an array, the result is a target pattern[48] such as those in Figure 1c, Figure 3a, and Figure 4a, demonstrating different applications for images, signal processing, and neural network inference, respectively.
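A hedged sketch of this write-and-verify loop is given below. The read and pulse callables are placeholders for the DPE's device access, and the specific starting values, step sizes, and limits are illustrative; only the overall escalation order (pulse amplitude, then gate voltage, then pulse width) and the idea of a fixed programming-cycle budget follow the description above.

```python
def program_cell(read_g, pulse, g_target, tol=5e-6, max_cycles=50):
    """Write-and-verify loop in the spirit of Figure 2b (illustrative values only)."""
    v_mem, v_gate, width = 0.8, 1.0, 100e-9       # starting pulse conditions (assumed)
    for _ in range(max_cycles):
        g = read_g()                              # verify with a low-voltage read
        if abs(g - g_target) <= tol:
            return True, g                        # conductance inside the tolerance band
        if g < g_target:
            pulse("SET", v_mem, v_gate, width)    # column-to-row pulse raises conductance
        else:
            pulse("RESET", v_mem, v_gate, width)  # row-to-column pulse lowers conductance
        # escalate gently: amplitude first, then gate voltage, then pulse width
        if v_mem < 2.5:
            v_mem += 0.1
        elif v_gate < 5.0:
            v_gate += 0.2
        else:
            width = min(width * 2, 4e-6)
    return False, read_g()                        # give up after the programming budget
```

Because the loop can switch between SET and RESET on every cycle, it also corrects the overshoot, drift, or disturb events mentioned in the Figure 2b caption.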


Figure 3. Operation and accuracy of DPE VMM operations for different use cases. a) Two VMM applications programmed and implemented on the same DPE array. First, on the left, a signal processing application was tested using the discrete cosine transform (DCT), which converts a time-based signal into its frequency components. The same 1T1M memristor array was then reprogrammed for the second application on the right. This second matrix implements a neural network application using a single-layer softmax neural network for recognition of handwritten digits. The DPE VMM output yields the digit classification. b) Histogram of experimentally measured DPE VMM error from the raw mathematical VMM for the DCT application, and comparison results from a circuit simulation (in gray) that takes into account wire resistances and sneak path currents. As shown, the circuit simulation reproduces the experimental results quite well, indicating that the main source of error of the DPE VMM relative to a raw mathematical VMM is simply circuit parasitics, which is easily corrected with a linear scaling factor per column output. c) Same as in (b) but for the neural network application. Inset plots experimentally measured VMM data and circuit simulations versus raw mathematical VMM for both applications. d) Visualized circuit simulation showing circuit nonidealities in a 16 × 16 array with 5 "stuck on" defects as red vertices. This array is operated by applying voltages on all rows from the left and grounding all columns on top. Therefore, the input signal along the rows degrades from left to right (red to white color), and the grounding effects degrade similarly along the columns. The color of the middle pillar at each vertex indicates device conductance. e) Histogram of VMM error following implementation of a linear scaling factor for each column for the DCT and MNIST applications. Inset shows excellent agreement of raw VMM and experimental VMM following a simple linear column scaling.

There is a trade-off between the number of attempted programming cycles and the final programming error. In Figure 2, 50 programming cycles were used and the associated errors (G_actual − G_target) are plotted as histograms in Figure 2c and the inset of Figure 2d. As seen, the majority of cells are centered close to zero error, although the inset of (d) yields a standard deviation of 72.7 µS due to outliers beyond the plotted x axis. The distribution is not Gaussian, and some cells can remain stuck at either high or low conductances (identified as defects) while others follow a log-normal tail.[49] However, the spread of individual conductance levels is fairly reasonable, as shown in the main panel of Figure 2d. Here, the histogram of standard deviations is plotted for the different targeted programming levels in the pattern shown in Figure 1c. This effectively gives the spread in conductance for individual target conductance levels in the desired conductance matrix and shows that most levels have a spread <10 µS. More details on the array-level feedback tuning algorithms are found in the Supporting Information.

A critical parameter for analog computing is the equivalent digital precision of the computation results. We have quantified these results in our DPE system for different use cases in Figure 3. VMM operations were performed for both a signal processing application of the discrete cosine transform (DCT) as well as neural network inference for the MNIST database (Figure 3a). The DCT application converts a time-based signal into its frequency components and demonstrates full bipolar inputs and matrix values, using two cells per DCT matrix value in order to represent signed numbers.[46] The neural network application uses only positive inputs and matrix values, implementing a softmax layer for recognition of handwritten digits. Figure 3b shows a histogram of the measured VMM errors compared to ideal expected results. A circuit simulation was also performed (in gray) that takes into account wire resistances and sneak path currents, showing a fairly good match to the experimental results and hence an understanding of the error source, which is primarily circuit parasitics. These can be corrected with a linear scaling factor for each column (see below). Figure 3c shows the VMM errors for the neural network application along with the circuit simulations, again showing a close match.

To understand the circuit simulations and the key sources of VMM errors, a visualization of the analog signal deterioration due to parasitics is shown in Figure 3d for a 16 × 16 array. The color of the middle pillar at each vertex indicates the conductance of the device, and five "stuck on" defects (red dots) are shown. This array is operated by applying voltages on all rows from the left and grounding all columns on top, as in the experimental setup. Thus, the signal degrades from left to right (red to white color) and from top to bottom.

To improve the accuracy of our analog VMM computations, each column output is linearly calibrated to account for the above circuit parasitics. The calibration parameters are determined by first running one hundred test input patterns and finding the linear scaling that best matches the expected results. This is a one-time linear tuning of the TIA circuit in each column, without any performance overhead. When necessary, an additional calibration process can also include compensation to reduce the impact of very high or low conductance defects, as discussed in Note S4 in the Supporting Information. A simple linear scaling reduces errors significantly, as shown in Figure 3e. Here, it is shown that the final error for the DPE system is typically well below 50 µS and corresponds to ≈6 bits of VMM output accuracy. This is expected to improve further with state-of-the-art fabrication that has higher device yield (fewer defects) and lower wire resistances, and by operating memristors at lower conductance states.
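The per-column calibration described above amounts to fitting one gain and one offset per column from a set of test patterns and then inverting that fit. The sketch below shows the idea with synthetic numbers; the array dimensions and the use of one hundred test patterns follow the text, while the error model and all values are placeholders rather than measured data.

```python
import numpy as np

# Hedged sketch of per-column linear calibration: fit measured = gain*ideal + offset
# for each column by least squares over test patterns, then undo the fit.
rng = np.random.default_rng(1)

rows, cols, n_test = 128, 64, 100
G = rng.uniform(100e-6, 900e-6, size=(rows, cols))
V_test = rng.uniform(0.0, 0.2, size=(n_test, rows))

ideal = V_test @ G                                   # what a perfect crossbar would return
# stand-in for measured outputs: a per-column gain/offset error plus noise (assumed model)
true_gain = rng.normal(0.92, 0.02, size=cols)
true_offset = rng.normal(0.0, 1e-5, size=cols)
measured = ideal * true_gain + true_offset + rng.normal(0.0, 2e-6, size=ideal.shape)

gain = np.empty(cols)
offset = np.empty(cols)
for j in range(cols):
    A = np.column_stack([ideal[:, j], np.ones(n_test)])
    (gain[j], offset[j]), *_ = np.linalg.lstsq(A, measured[:, j], rcond=None)

corrected = (measured - offset) / gain
print(np.max(np.abs(corrected - ideal)))             # residual error after calibration
```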
Here, we experimentally demonstrate a single-layer neural network inference application with the DPE for handwriting recognition. Neural networks are a key application of interest for acceleration,[6,11,20,34,38] and this application is characterized by frequent reuse of matrix convolution kernels (and thus an advantage of using memristors by reducing data-fetching), tolerance for defects, and reduced precision requirements.[50–52] We program a software-trained single-layer network in a 96 × 40 portion of a 128 × 64 memristor crossbar for handwritten digit recognition on the full 10 000 MNIST dataset.[53] To our knowledge, this is the largest demonstration utilizing memristor crossbars in hardware to date. Implementing this single-layer neural network in the DPE platform requires reshaping and partitioning of the neural network weight matrix and input images. A software single-layer network for MNIST classifies 784 pixel inputs (28 × 28) into 10 possible classes (0 to 9), yielding a network size of 7840 weights. To fit this network into a smaller array capable of supporting 4096 elements, we resize the MNIST images to 19 × 20 and retrain the weight matrix in software. Ideally (without reshaping), this implementation would use a 380 × 10 array, but we reshape this into an equivalent 96 × 40 array. The overall recognition accuracy of the neural network in software did not degrade due to the image resizing, remaining at 92.4%. To perform the digit classification using the DPE, input images are processed by unwrapping the image to a single 1 × 380 vector (Figure 4a) and then partitioning it into four sets of 96 values (four zeroes are added to the end of the input vector to get 384 numbers). This yields four sets of pixel values that are converted to voltage signals and applied to the rows of the memristor array, which is programmed to the trained weight matrix (right side of Figure 4a). Only ten output columns are needed for each of the four sets, performing the partial synaptic weight matrix multiplication. A full-image classification is concluded after four such computations, resulting in four 1 × 10 vectors which are summed, and the resulting digit with the maximal value yields the predicted classification of the input image.
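The reshaping just described can be summarized compactly: pad the 380-element input to 384, split it into four groups of 96, run each group through the 96 × 40 array, and sum the four 10-element partial outputs before taking the argmax. The sketch below assumes each of the four passes reads its own block of ten columns, which is consistent with the 96 × 40 layout described above; the weights are random placeholders rather than the trained network.

```python
import numpy as np

# Hedged sketch of the four-pass partitioned inference described in the text.
rng = np.random.default_rng(2)

W = rng.uniform(100e-6, 700e-6, size=(96, 40))    # programmed conductance matrix (96 x 40)
image = rng.uniform(0.0, 1.0, size=(19, 20))      # stand-in for a resized MNIST digit

x = np.concatenate([image.ravel(), np.zeros(4)])  # 380 pixels + 4 zeros = 384 values
parts = x.reshape(4, 96)                          # four input sets of 96 values each

scores = np.zeros(10)
for k, part in enumerate(parts):
    out = part @ W                                # one crossbar pass: 1 x 40 output
    scores += out[k * 10:(k + 1) * 10]            # assumed: each pass uses its own 10 columns

predicted_digit = int(np.argmax(scores))          # digit with the maximal summed output
print(predicted_digit)
```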



Figure 4. Experimental demonstration of a single-layer neural network for MNIST handwritten digit classification. a) Illustration of the computing procedure. A single-layer softmax neural network is trained offline, converted to target conductance values, and programmed into a 96 × 40 1T1M crossbar array. For a given input image, the 19 × 20 pixel image is unwrapped to a 380 input vector, converted to voltages, and applied to the memristor matrix in four sets. The summed result of the four outputs yields the DPE classification. This is illustrated here for the digit "9." b) Some example classification results of the softmax single-layer neural network on the inset digits "4," "9," and "5" with ideal software results (blue) compared to the experimental results (red). c) Total recognition accuracy for each digit for 10 000 images from the MNIST database. Results are shown for the single-layer trained software result compared to the experimental system (same color legend as in (b)). The overall recognition accuracy is 89.9%.


The weight matrix was linearly mapped to the conductance range of the memristor cells from 100 to 700 µS. The experimental output can show nonideal values caused by a combination of small cell-level programming errors as well as stuck ON defects, as can be seen in the output current per column for a few example input patterns (digits 4, 9, 5) in the 10k MNIST set (Figure 4b). These defects are expected to be significantly reduced in state-of-the-art fabrication facilities.

Our experimental demonstration of inference on the 10k test patterns from MNIST shows the promise of our hardware acceleration. The full recognition accuracy for all 10k testing patterns as a function of the input digit shows that our hardware implementation can even outperform the software-trained network (for digits 3 and 0, see Figure 4c). With linear correction, the accuracy of the hardware DPE neural network implementation reaches 89.9%, only a 2.5% reduction compared to the ideal software accuracy. This remaining accuracy loss is due to device programming inaccuracies and cumulative finite wire resistances across the array. It is expected that more faithful array programming, fewer memristor cell defects, and especially the use of multilayer neural network implementations will help close the accuracy gap to software levels (see multilayer simulations[54]). Additionally, using nonlinearity in the activation function would be expected to increase performance further. Our results directly computed the forward inference using existing trained neural networks, showing that these may be utilized directly without redesigning or retraining for the particular hardware. This is well-matched to the paradigms of IoT and Edge computing, in which the goal is to deploy and use trained networks in mobile applications where low power and low cost (small chip area) are key, but speed and performance cannot be compromised. Additionally, in such applications, a reduced precision can often be tolerated. Consequently, a key figure of merit is the computational power efficiency of the system. The present design, including power consumption in the peripheral circuitry, is estimated to yield 115 TOPS W−1 (Tera operations per second per Watt, see Table S2 in the Supporting Information).

In this work, we have experimentally validated the potential for analog computing in nonvolatile memristor crossbar arrays. We demonstrated VMM computations in arrays up to 128 × 64, supporting the assumptions in architectures showing significant acceleration of neural network and signal processing computations compared to digital implementations.[11,54,55] We further showed that multiple, stable conductance levels can be realized in 1T1M cells composed of Ta/HfO2 memristors with 6 bits of precision. Given the low-precision requirements in many applications,[50–52] this exceeds the demands considerably. We showed that large memristor crossbars can carry out single-step VMM operations in-memory, leading to high throughput and low energy consumption. The computing accuracy was shown to be acceptable for machine learning applications including image recognition, and we also demonstrated reprogrammability for multiple applications. Direct implementation of the forward inference of a neural network for MNIST image recognition was shown, yielding 89.9% accuracy for a single layer. Reducing stuck-ON cells and using multilayer neural networks are expected to increase the accuracy to state of the art.[56]

A key question is how the power efficiency of such an analog DPE system compares to the state of the art, both in performing VMM computations and more broadly in convolutional neural network applications. This requires forecasting the DPE when implemented in an integrated chip and using scaled technology nodes. Although the present system operates at 10 MHz, this is due to the larger parasitics of 2 µm technology and probecard interfaces. By assuming <100 nm technology and integrated control electronics, an operating frequency of 150 MHz is easily achievable, along with analog-digital converters (ADCs) operating at <133 µW[57] per column channel. Given the 16 320 multiplications and additions performed in a 128 × 64 array, this leads to a computational efficiency of 115 TOPS W−1 (Tera Operations per Second per Watt). In comparison, digital technology performing the same VMM operations at only 4-bit accuracy using 40 nm CMOS is estimated to operate at 7 TOPS W−1.[40] In addition, the high power and area cost of ADCs could be alleviated by only applying ADCs to the final outputs of multilayer neural networks. For a large family of neural network algorithms, ADCs could also be replaced with simple circuits such as threshold gates, comparators, or amplifiers, even further reducing DPE power and area.

The above shows the power efficiency of replacing digital VMM blocks with analog, memristor-based DPE circuits. An important question is whether such computational efficiency is maintained at the application level, for example in full convolutional neural network (CNN) inference where a streaming input of images is classified. In an earlier architectural study and performance estimation,[11] we evaluated CNN applications computed by a DPE-based system composed of many 128 × 128 arrays, each with only 2 bits per cell and a 10 MHz operating frequency. All ADC elements, data routing, and buffering are taken into account, along with the costs of breaking down the larger images across multiple arrays. The results show a 14.8x, 5.5x, and 7.5x improvement in throughput (inferences per second), energy, and computational density (inferences per second per chip area) over the leading digital ASIC implementation[6] for the same task. Thus, the present work experimentally validates the assumptions of that architectural study, providing a baseline for future analog computing systems and the potential to accelerate and significantly lower energy consumption for important applications.
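A quick back-of-envelope check of the forecast above: the operation count and the assumed clock rate come directly from the text, while the implied total power budget is our own inference from the quoted efficiency and is not a figure reported by the authors.

```python
# Back-of-envelope check of the forecast quoted above.
ops_per_step = 128 * 64 + 64 * 127      # 8192 multiplications + 8128 additions = 16 320
clock_hz = 150e6                        # assumed integrated-chip operating frequency
ops_per_second = ops_per_step * clock_hz             # about 2.45e12 operations per second

reported_efficiency = 115e12                          # operations per second per watt
implied_power_w = ops_per_second / reported_efficiency  # inferred, not reported: ~0.021 W
adc_power_w = 64 * 133e-6                             # ~8.5 mW for 64 column ADCs at the quoted figure
print(ops_per_step, ops_per_second, implied_power_w, adc_power_w)
```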
Experimental Section

Transistor and Memristor Integration: An array of NMOS transistors was fabricated in a commercial fab with low wire resistance. The transistors had a feature size of 2 µm. The memristor arrays were fabricated in a university clean room, aligned to the underlying transistors, following an argon plasma treatment to remove protective metal-oxide layers. Photolithography patterning was used, along with thin-film deposition and liftoff. Sputter deposition of 5 nm silver (Ag) and 200 nm palladium (Pd) was used for the metal vias. After lift-off in warm acetone, the sample was annealed at 300 °C for 30 min in nitrogen with a flow of 20 sccm. A 60 nm Pd layer with a 5 nm tantalum (Ta) adhesive layer was sputtered to serve as the bottom electrode. A 5 nm HfO2 switching layer was deposited by atomic layer deposition using water and tetrakis(dimethylamido)hafnium as precursors at 250 °C, to ensure high film quality and step coverage. The patterning of the switching layer was done by photolithography and reactive ion etching using CHF3/O2 chemistry.


Sputter-deposited Ta of 50 nm thickness followed by 10 nm Pd was used in a lift-off process to serve as the top electrode.

Control Electronics and Chip Interface: A dedicated system of PCBs was designed and manufactured to accomplish all memristor state programming (reading and writing) as well as computational operations. The novel operating mode for computations involved simultaneous driving of all rows in the array along with simultaneous reading of the currents in all columns. The designed PCB system was able to perform these operations with driving pulses from 164 ns to greater than 4 µs. Table S1 (Supporting Information) gives detailed specifications. Interfacing of the control PCB system to the memristor arrays was accomplished on a probe station with a specially designed cantilever probecard system for high-speed signal transmission and the flexibility to test and operate new memristor chips without the need for wire-bonding. Images of the setup are available in the Supporting Information. All system configurations and final result readout were performed by a control computer communicating with a microcontroller in the PCB system. The microcontroller supported a small DPE instruction set, receiving configuration instructions to control the column and row boards for different write or read operations at arbitrary positions in the crossbar array. Communication with the computer limited all operation times, and to achieve higher speed and efficiency, instructions for full array-level read and write operations were also implemented in the firmware. In this way, for example, one array-level read command by the firmware would read all devices in an array (more than 8000 devices) in less than 2 s. A graphical user interface, data visualization scripts, programming scripts, and computing scripts were developed to support all operations described in this work.
Single-Layer Neural Network for MNIST Classification: The single-layer neural network was trained in MATLAB using softmax regression. Softmax is a simple and well-suited model for MNIST pattern recognition since there were only ten possible classifications. 60k images from the MNIST database were used for training, employing gradient descent to find the target synaptic weights related to the cost function defined in softmax regression. When the trained network was used for classification, outputs were exponentiated and normalized to generate a probability distribution. This last step was not implemented in the DPE since the present memristor crossbars would not directly compute the nonlinear function.
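For readers who prefer code, the sketch below mirrors the training procedure just described (softmax regression fitted by gradient descent, with exponentiation and normalization needed only when probabilities are required). It is a generic Python illustration with random placeholder data, not the authors' MATLAB implementation, so the printed accuracy is meaningless here.

```python
import numpy as np

# Hedged sketch of single-layer softmax regression: gradient descent on the
# cross-entropy cost, then inference via argmax of the linear scores.
rng = np.random.default_rng(3)

n_pixels, n_classes, n_train = 380, 10, 1000            # 19 x 20 resized images
X = rng.uniform(0.0, 1.0, size=(n_train, n_pixels))      # stand-in for training images
y = rng.integers(0, n_classes, size=n_train)             # stand-in for digit labels
Y = np.eye(n_classes)[y]                                  # one-hot targets

W = np.zeros((n_pixels, n_classes))
lr = 0.1
for _ in range(200):                                      # plain batch gradient descent
    logits = X @ W
    logits -= logits.max(axis=1, keepdims=True)           # numerical stability
    P = np.exp(logits)
    P /= P.sum(axis=1, keepdims=True)                      # softmax probabilities
    grad = X.T @ (P - Y) / n_train                         # gradient of the cross-entropy cost
    W -= lr * grad

# Inference: the DPE only needs the linear scores; argmax is unchanged by softmax.
pred = np.argmax(X @ W, axis=1)
print((pred == y).mean())
```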
Supporting Information

Supporting Information is available from the Wiley Online Library or from the author.

Acknowledgements

The authors acknowledge fruitful discussions with Mark McLean, David Mountain, and Richart Slusher. This research was based upon work supported by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via contract number 2014-14080800008. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the ODNI, IARPA, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon.

Conflict of Interest

The authors declare no conflict of interest.

Keywords

crossbar arrays, memristor, metal oxide, neuromorphic computing

Received: October 11, 2017
Published online: January 10, 2018

[1] G. Liebmann, Br. J. Appl. Phys. 1950, 1, 92.
[2] H. Kobayashi, J. L. White, A. A. Abidi, IEEE J. Solid-State Circuits 1991, 26, 738.
[3] J. Hutchinson, C. Koch, J. Luo, C. Mead, Computer 1988, 21, 52.
[4] H. P. Graf, L. D. Jackel, IEEE Circuits Devices Mag. 1989, 5, 44.
[5] R. S. Williams, Comput. Sci. Eng. 2017, 19, 7.
[6] Y. Chen, T. Luo, S. Liu, S. Zhang, L. He, J. Wang, L. Li, T. Chen, Z. Xu, N. Sun, O. Temam, in Proc. 47th Annual IEEE/ACM Int. Symp. on Microarchitecture, IEEE, Piscataway, NJ, USA 2014, pp. 609–622.
[7] Y.-H. Chen, J. Emer, V. Sze, in Proc. 43rd Int. Symp. on Computer Architecture (ISCA), IEEE, Piscataway, NJ, USA 2016, pp. 367–379.
[8] B. Reagen, P. Whatmough, R. Adolf, S. Rama, H. Lee, S. K. Lee, J. M. Hernández-Lobato, G.-Y. Wei, D. Brooks, in Proc. 43rd Int. Symp. on Computer Architecture (ISCA), IEEE, Piscataway, NJ, USA 2016, pp. 267–278.
[9] S. Liu, Z. Du, J. Tao, D. Han, T. Luo, Y. Xie, Y. Chen, T. Chen, in Proc. 43rd Int. Symp. on Computer Architecture (ISCA), IEEE, Piscataway, NJ, USA 2016, pp. 393–405.
[10] P. Chi, S. Li, C. Xu, T. Zhang, J. Zhao, Y. Liu, Y. Wang, Y. Xie, in Proc. 43rd Int. Symp. on Computer Architecture, IEEE Press, Piscataway, NJ, USA 2016, pp. 27–39.
[11] A. Shafiee, A. Nag, N. Muralimanohar, R. Balasubramonian, J. P. Strachan, M. Hu, R. S. Williams, V. Srikumar, in Proc. 43rd Int. Symp. on Computer Architecture (ISCA), IEEE, Piscataway, NJ, USA 2016, pp. 14–26.
[12] T. Gokmen, Y. Vlasov, Front. Neurosci. 2016, 10.
[13] C. Mead, Proc. IEEE 1990, 78, 1629.
[14] G. Indiveri, B. Linares-Barranco, T. J. Hamilton, A. Van Schaik, R. Etienne-Cummings, T. Delbruck, S.-C. Liu, P. Dudek, P. Häfliger, S. Renaud, J. Schemmel, G. Cauwenberghs, J. Arthur, K. Hynna, F. Folowosele, S. Saighi, T. Serrano-Gotarredona, J. Wijekoon, Y. Wang, K. Boahen, Front. Neurosci. 2011, 5, 73.
[15] T. Hasegawa, K. Terabe, T. Tsuruoka, M. Aono, Adv. Mater. 2012, 24, 252.
[16] M. P. Sah, H. Kim, L. O. Chua, IEEE Circuits Syst. Mag. 2014, 14, 12.
[17] J. Y. Seok, S. J. Song, J. H. Yoon, K. J. Yoon, T. H. Park, D. E. Kwon, H. Lim, G. H. Kim, D. S. Jeong, C. S. Hwang, Adv. Funct. Mater. 2014, 24, 5316.
[18] K. Steinbuch, Biol. Cybern. 1961, 1, 36.
[19] R. Chawla, A. Bandyopadhyay, V. Srinivasan, P. Hasler, in Proc. IEEE 2004 Custom Integrated Circuits Conf., IEEE, Piscataway, NJ, USA 2004, pp. 651–654.
[20] J. Hasler, H. B. Marr, Front. Neurosci. 2013, 7, 118.
[21] K. K. Likharev, Sci. Adv. Mater. 2011, 3, 322.
[22] D. B. Strukov, K. K. Likharev, Nanotechnology 2005, 16, 888.
[23] N. Ge, J. H. Yoon, M. Hu, E. J. Merced-Grafals, N. Davila, J. P. Strachan, Z. Li, H. Holder, Q. Xia, R. S. Williams, X. Zhou, J. J. Yang, Sci. Rep. 2017, 7, 40135.
[24] S. Kumar, J. P. Strachan, R. S. Williams, Nature 2017, 548, 318.
[25] J. Backus, Commun. ACM 1978, 21, 613.
[26] S. Pi, P. Lin, Q. Xia, J. Vac. Sci. Technol., B: Nanotechnol. Microelectron.: Mater., Process., Meas., Phenom. 2013, 31, 06FA02.
[27] E. J. Merced-Grafals, N. Dávila, N. Ge, R. S. Williams, J. P. Strachan, Nanotechnology 2016, 27, 365202.
[28] C. Du, W. Ma, T. Chang, P. Sheridan, W. D. Lu, Adv. Funct. Mater. 2015, 25, 4290.


[29] Z. Wang, S. Joshi, S. E. Savel'ev, H. Jiang, R. Midya, P. Lin, M. Hu, N. Ge, J. P. Strachan, Z. Li, Q. Wu, M. Barnell, G.-L. Li, H. L. Xin, R. S. Williams, Q. Xia, J. J. Yang, Nat. Mater. 2017, 16, 101.
[30] M. Suri, O. Bichler, D. Querlioz, O. Cueto, L. Perniola, V. Sousa, D. Vuillaume, C. Gamrat, B. DeSalvo, in Proc. 2011 IEEE Int. Electron Devices Meeting (IEDM), IEEE, Piscataway, NJ, USA 2011, p. 4.
[31] S. Yu, B. Gao, Z. Fang, H. Yu, J. Kang, H.-S. P. Wong, in Proc. 2012 IEEE Int. Electron Devices Meeting (IEDM), IEEE, Piscataway, NJ, USA 2012, pp. 10–14.
[32] G. W. Burr, P. Narayanan, R. M. Shelby, S. Sidler, I. Boybat, C. di Nolfo, Y. Leblebici, in Proc. 2015 IEEE Int. Electron Devices Meeting (IEDM), IEEE, Piscataway, NJ, USA 2015, p. 4.
[33] G. W. Burr, R. M. Shelby, A. Sebastian, S. Kim, S. Kim, S. Sidler, K. Virwani, M. Ishii, P. Narayanan, A. Fumarola, L. L. Sanches, I. Boybat, M. Le Gallo, K. Moon, J. Woo, H. Hwang, Y. Leblebici, Adv. Phys.: X 2017, 2, 89.
[34] M. Prezioso, F. Merrikh-Bayat, B. D. Hoskins, G. C. Adam, K. K. Likharev, D. B. Strukov, Nature 2015, 521, 61.
[35] P. M. Sheridan, C. Du, W. D. Lu, IEEE Trans. Neural Netw. Learn. Syst. 2016, 27, 2327.
[36] S. Ambrogio, S. Balatti, V. Milo, R. Carboni, Z.-Q. Wang, A. Calderoni, N. Ramaswamy, D. Ielmini, IEEE Trans. Electron Devices 2016, 63, 1508.
[37] S. Yu, Z. Li, P.-Y. Chen, H. Wu, B. Gao, D. Wang, W. Wu, H. Qian, in Proc. 2016 IEEE Int. Electron Devices Meeting (IEDM), IEEE, Piscataway, NJ, USA 2016, pp. 12–16.
[38] S. Agarwal, S. J. Plimpton, D. R. Hughart, A. H. Hsia, I. Richter, J. A. Cox, C. D. James, M. J. Marinella, in Proc. 2016 Int. Joint Conf. on Neural Networks (IJCNN), IEEE, Piscataway, NJ, USA 2016, pp. 929–938.
[39] G. Indiveri, E. Linn, S. Ambrogio, in Resistive Switching: From Fundamentals of Nanoionic Redox Processes to Memristive Device Applications (Eds: D. Ielmini, R. Waser), Wiley-VCH, Weinheim, Germany 2016, pp. 715–736.
[40] P. M. Sheridan, F. Cai, C. Du, W. Ma, Z. Zhang, W. D. Lu, Nat. Nanotechnol. 2017, 12, 784.
[41] P. Yao, H. Wu, B. Gao, S. B. Eryilmaz, X. Huang, W. Zhang, Q. Zhang, N. Deng, L. Shi, H.-S. P. Wong, H. Qian, Nat. Commun. 2017, 8, 15199.
[42] G. W. Burr, R. M. Shelby, S. Sidler, C. di Nolfo, J. Jang, I. Boybat, R. S. Shenoy, P. Narayanan, K. Virwani, E. U. Giacometti, B. N. Kurdi, H. Hwang, IEEE Trans. Electron Devices 2015, 62, 3498.
[43] S. Yu, P. Y. Chen, Y. Cao, L. Xia, Y. Wang, H. Wu, in Proc. 2015 IEEE Int. Electron Devices Meeting (IEDM), IEEE, Piscataway, NJ, USA 2015, pp. 17.3.1–17.3.4.
[44] A. Wedig, M. Luebben, D.-Y. Cho, M. Moors, K. Skaja, V. Rana, T. Hasegawa, K. K. Adepalli, B. Yildiz, R. Waser, I. Valov, Nat. Nanotechnol. 2015, 11, 67.
[45] H. Jiang, L. Han, P. Lin, Z. Wang, M. H. Jang, Q. Wu, M. Barnell, J. J. Yang, H. L. Xin, Q. Xia, Sci. Rep. 2016, 6, 28525.
[46] C. Li, M. Hu, Y. Li, H. Jiang, N. Ge, E. Montgomery, J. Zhang, W. Song, N. Davila, C. E. Graves, Z. Li, J. P. Strachan, P. Lin, Z. Wang, M. Barnell, Q. Wu, R. S. Williams, J. J. Yang, Q. Xia, Nat. Electron. 2017, https://doi.org/10.1038/s41928-017-0002-z.
[47] S. Agarwal, T.-T. Quach, O. Parekh, A. H. Hsia, E. P. DeBenedictis, C. D. James, M. J. Marinella, J. B. Aimone, Front. Neurosci. 2016, 9, 484.
[48] K.-H. Kim, S. Gaba, D. Wheeler, J. M. Cruz-Albrecht, T. Hussain, N. Srinivasa, W. Lu, Nano Lett. 2012, 12, 389.
[49] G. Medeiros-Ribeiro, F. Perner, R. Carter, H. Abdalla, M. D. Pickett, R. S. Williams, Nanotechnology 2011, 22, 95702.
[50] M. Courbariaux, Y. Bengio, J.-P. David, presented at the International Conference on Learning Representations (ICLR), San Diego, CA, USA, May 2015.
[51] M. Rastegari, V. Ordonez, J. Redmon, A. Farhadi, in Proc. European Conf. on Computer Vision, Springer, Cham, Switzerland 2016, pp. 525–542.
[52] S. Zhou, Y. Wu, Z. Ni, X. Zhou, H. Wen, Y. Zou, arXiv preprint arXiv:1606.06160, 2016.
[53] Y. LeCun, C. Cortes, C. J. C. Burges, The MNIST Database of Handwritten Digits, National Institute of Standards and Technology, Gaithersburg, MD, USA 1998.
[54] M. Hu, J. P. Strachan, Z. Li, E. M. Grafals, N. Davila, C. Graves, S. Lam, N. Ge, J. J. Yang, R. S. Williams, in Proc. 2016 53rd ACM/EDAC/IEEE Design Automation Conf. (DAC), Association for Computing Machinery, New York 2016, pp. 1–6.
[55] M. Hu, J. P. Strachan, in Proc. IEEE Int. Conf. on Rebooting Computing (ICRC), IEEE, Piscataway, NJ, USA 2016, pp. 1–5.
[56] C. Liu, M. Hu, J. P. Strachan, H. Li, in Proc. 2017 54th ACM/EDAC/IEEE Design Automation Conf. (DAC), Association for Computing Machinery, New York 2017, pp. 1–6.
[57] G. Van Der Plas, B. Verbruggen, in Proc. 2008 IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Pap., IEEE, Piscataway, NJ, USA 2008, pp. 242–610.

