AI and ML Accelerator Survey and Trends
power, both in embedded applications and in data centers. This paper is an update to IEEE-HPEC papers from the past three years [9]–[11]. As in past years, this paper continues with last year's focus on accelerators and processors that are geared toward deep neural networks (DNNs) and convolutional neural networks (CNNs), as they are quite computationally intense [12]. This survey focuses on accelerators and processors for inference for a variety of reasons, including that defense and national security AI/ML edge applications rely heavily on inference. We consider all of the numerical precision types that an accelerator supports, but for most of them, their best inference performance is in int8 or fp16/bf16 (IEEE 16-bit floating point or Google's 16-bit brain float).

There are many surveys [13]–[24] and other papers that cover various aspects of AI accelerators. For instance, the first paper in this multi-year survey included the peak performance of FPGAs for certain AI models; however, several of the aforementioned surveys cover FPGAs in depth, so they are no longer included in this survey. This multi-year survey effort and this paper focus on gathering a comprehensive list of AI accelerators with their computational capability, power efficiency, and ultimately the computational effectiveness of utilizing accelerators in embedded and data center applications. Along with this focus, this paper mainly compares neural network accelerators that are useful for government and industrial sensor and data processing applications. A few accelerators and processors that were included in previous years' papers have been left out of this year's survey. They have been dropped because they have been surpassed by new accelerators from the same company, because they are no longer offered, or because they are no longer relevant to the topic.

II. SURVEY OF PROCESSORS

Many recent advances in AI can be at least partly credited to advances in computing hardware [6], [7], [25], [26], enabling computationally heavy machine-learning algorithms and in particular DNNs. This survey gathers performance and power information from publicly available materials, including research papers, technical trade press, company benchmarks, etc. While there are ways to access information from companies and startups (including those in their silent period), this information is intentionally left out of this survey; such data will be included in this survey when it becomes publicly available. The key metrics of this public data are plotted in Figure 2, which graphs recent processor capabilities (as of July 2022), mapping peak performance vs. power consumption. The dash-dotted box depicts the very dense region that is zoomed in and plotted in Figure 3.

The x-axis indicates peak power, and the y-axis indicates peak giga-operations per second (GOps/s), both on a logarithmic scale. The computational precision of the processing capability is depicted by the geometric shape used; the computational precision spans from analog and single-bit int1 to four-byte int32, and from two-byte fp16 to eight-byte fp64. The precisions that show two types denote the precision of the multiplication operations on the left and the precision of the accumulate/addition operations on the right (for example, fp16.32 corresponds to fp16 for multiplication and fp32 for accumulate/add). The form factor is depicted by color, which shows the package for which peak power is reported: blue corresponds to a single chip; orange corresponds to a card; and green corresponds to entire systems (single-node desktop and server systems). This survey is limited to single-motherboard, single-memory-space systems. Finally, the hollow geometric objects are peak performance for inference-only accelerators, while the solid geometric figures are performance for accelerators that are designed to perform both training and inference.

The survey begins with the same scatter plot that we have compiled for the past three years. As we did last year, to save space, we have summarized some of the important metadata of the accelerators, cards, and systems in Table I, including the label used in Figure 2 for each of the points on the graph; many of the points were brought forward from last year's plot, and some details of those entries are in [9]. There are several additions, which we will cover below. In Table I, most of the columns and entries are self-explanatory. However, there are two Technology entries that may not be: dataflow and PIM. Dataflow processors are custom-designed processors for neural network inference and training. Since neural network training and inference computations can be laid out entirely deterministically, they are amenable to dataflow processing, in which computations, memory accesses, and inter-ALU communication actions are explicitly/statically programmed or "placed-and-routed" onto the computational hardware. Processor-in-memory (PIM) accelerators integrate processing elements with memory technology. Among such PIM accelerators are those based on an analog computing technology that augments flash memory circuits with in-place analog multiply-add capabilities. Please refer to the references for the Mythic and Gyrfalcon accelerators for more details on this innovative technology.

Finally, a reasonable categorization of accelerators follows their intended application, and the five categories are shown as ellipses on the graph, which roughly correspond to performance and power consumption: Very Low Power for speech processing, very small sensors, etc.; Embedded for cameras, small UAVs and robots, etc.; Autonomous for driver assist services, autonomous driving, and autonomous robots; Data Center Chips and Cards; and Data Center Systems.

For most of the accelerators, their descriptions and commentaries have not changed since last year, so please refer to the last two years' papers for those descriptions and commentaries. There are, however, several new releases that were not covered by past papers that are covered here.

• Axelera, a Dutch embedded systems startup, reported the results of an embedded test chip that they have produced [37]. They claim both digital and analog design capabilities, and this test chip was made to test the extent of the digital design capabilities. They expect to add analog (probably flash) design elements in upcoming efforts.

• Maxim Integrated has released a system-on-chip (SoC) for ultra-low-power applications called the MAX78000 [84]–[86], which includes an ARM CPU core, a RISC-V CPU core, and an AI accelerator.

978-1-6654-9786-2/22/$31.00 ©2022 IEEE
[Figure 2 appears here: "Neural Network Processing Performance" scatter plot of Peak Performance (GOps/sec) vs. Peak Power (W), both on log scales, with legend for computation precision (analog, int1 through int32, fp16 through fp64), form factor (chip, card, system), and computation type (inference, training).]
Fig. 2: Peak performance vs. power scatter plot of publicly announced AI accelerators and processors.
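As an aside, the two-precision legend convention (e.g., fp16.32: fp16 multiplies with fp32 accumulation) can be emulated in a few lines of NumPy. This is our illustrative sketch, not code from the survey; the vector length and random data are arbitrary.

```python
import numpy as np

# Emulate the fp16.32 convention from Figure 2's legend: multiplications
# in fp16, accumulation in fp32. (Illustrative sketch only.)
rng = np.random.default_rng(0)
a = rng.random(4096, dtype=np.float32).astype(np.float16)
b = rng.random(4096, dtype=np.float32).astype(np.float16)

products = a * b  # fp16 x fp16 -> products rounded to fp16
acc_fp16_32 = products.astype(np.float32).sum(dtype=np.float32)  # fp32 accumulate
acc_fp16_only = products.sum(dtype=np.float16)  # naive all-fp16 accumulation

# fp64 reference dot product over the same (already fp16-rounded) inputs
reference = np.dot(a.astype(np.float64), b.astype(np.float64))
```

Carrying the accumulator in fp32 keeps the roughly 4096-term sum close to the fp64 reference, while an all-fp16 accumulator loses low-order bits once the running sum grows large; this is why inference datapaths commonly pair narrow multipliers with wider accumulators.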
[Figure 3 appears here: zoomed view of the dense region of Figure 2.]
Fig. 3: Zoomed region of peak performance vs. power scatter plot.

The ARM core is for quick prototyping and code reuse, while the RISC-V core is included to enable optimizing for the lowest power utilization. The AI accelerator has 64 parallel processors and supports 1-bit, 2-bit, 4-bit, and 8-bit integer operations. The SoC operates at a maximum of 30 mW and is intended for low-latency, battery-powered applications.

• Tachyum came out of startup stealth mode in 2017, and they just recently announced the release of an evaluation board for their Prodigy all-in-one processor [128]. They

• NVIDIA announced its next-generation GPU, Hopper (H100), in March 2022 [98]. It features even more symmetric multiprocessors (SIMD and Tensor cores), 50% higher memory bandwidth, and a 700W power budget for the SXM mezzanine card instance (the PCIe card power budget is 450W).

• Over the past couple of years, NVIDIA has also announced and released several system platforms for automotive, robotic, and other embedded applications that deploy the Ampere-generation GPU architecture. Specifically for automotive applications, the DRIVE AGX platform added two new systems: the DRIVE AGX L2, which enables Level 2 autonomous driving within a 45W power envelope, and the DRIVE AGX L5, which is intended to enable Level 5 autonomous driving within an 800W power envelope [103]. Similarly, the Jetson AGX Orin and Jetson NX Orin also use an Ampere-generation GPU and are intended for robotics, factory automation, etc. [100], [101]; they consume a maximum of 60W and 25W peak power, respectively.

• Graphcore shared rough peak performance numbers for their second-generation accelerator chip, the CG200 [59], [129], [130]. Since it is deployed on a PCIe card, we can assume that the peak power is around 300W. In the past year, Graphcore also announced its Bow accelerator, which is the first wafer-on-wafer processor, designed in cooperation with TSMC. The accelerator itself is the same CG200 as mentioned above, but it is mated
with a second wafer that greatly improves power and clock distribution throughout the CG200 chip [60]. This translates into 40% better performance and 16% better performance-per-Watt.

• Almost a year after Google announced details of their fourth-generation inference-only TPU4i accelerator in June 2021 [54], Google shared details about their fourth-generation training accelerator, TPUv4. Very few details were announced, but they did share peak power and performance numbers [55]. As with previous TPU variants, TPU4 is available through the Google Compute Cloud and for internal operations.

TABLE I: List of accelerator labels for plots.

Company | Product | Label | Technology | Form Factor | References
Achronix | VectorPath S7t-VG6 | Achronix | dataflow | Card | [27]
Aimotive | aiWare3 | Aimotive | dataflow | Chip | [28]
AIStorm | AIStorm | AIStorm | dataflow | Chip | [29]
Alibaba | Alibaba | Alibaba | dataflow | Card | [30]
AlphaIC | RAP-E | AlphaIC | dataflow | Chip | [31]
Amazon | Inferentia | AWS | dataflow | Card | [32], [33]
ARM | Ethos N77 | Ethos | dataflow | Chip | [36]
Axelera | Axelera Test Core | Axelera | dataflow | Chip | [37]
Baidu | Baidu Kunlun 818-300 | Baidu | dataflow | Chip | [38]–[40]
Bitmain | BM1880 | Bitmain | dataflow | Chip | [41]
Blaize | El Cano | Blaize | dataflow | Card | [42]
Canaan | Kendryte K210 | Kendryte | CPU | Chip | [45]
Cerebras | CS-1 | CS-1 | dataflow | System | [46]
Cerebras | CS-2 | CS-2 | dataflow | System | [47]
Cornami | Cornami | Cornami | dataflow | Chip | [48]
Enflame | Cloudblazer T10 | Enflame | CPU | Card | [49]
Google | TPU Edge | TPUedge | dataflow | System | [51]
Google | TPU1 | TPU1 | dataflow | Chip | [52], [53]
Google | TPU2 | TPU2 | dataflow | Chip | [52], [53]
Google | TPU3 | TPU3 | dataflow | Chip | [52]–[54]
Google | TPU4i | TPU4i | dataflow | Chip | [54]
Google | TPU4 | TPU4 | dataflow | Chip | [55]
GraphCore | C2 | GraphCore | dataflow | Card | [56], [57]
GraphCore | C2 | GraphCoreNode | dataflow | System | [58]
GraphCore | Colossus Mk2 | GraphCore2 | dataflow | Card | [59]
GraphCore | Bow-2000 | GraphCoreBow | dataflow | Card | [60]
GreenWaves | GAP8 | GAP8 | dataflow | Chip | [61], [62]
GreenWaves | GAP9 | GAP9 | dataflow | Chip | [61], [62]
Groq | Groq Node | GroqNode | dataflow | System | [63]
Groq | Tensor Streaming Processor | Groq | dataflow | Card | [56], [64]
Gyrfalcon | Gyrfalcon | Gyrfalcon | PIM | Chip | [65]
Gyrfalcon | Gyrfalcon | GyrfalconServer | PIM | System | [66]
Habana | Gaudi | Gaudi | dataflow | Card | [67], [68]
Habana | Goya HL-1000 | Goya | dataflow | Card | [68], [69]
Hailo | Hailo | Hailo-8 | dataflow | Chip | [70]
Horizon Robotics | Journey2 | Journey2 | dataflow | Chip | [71]
Huawei HiSilicon | Ascend 310 | Ascend-310 | dataflow | Chip | [72]
Huawei HiSilicon | Ascend 910 | Ascend-910 | dataflow | Chip | [73]
Intel | Arria 10 1150 | Arria | FPGA | Chip | [74], [75]
Intel Mobileye | EyeQ5 | EyeQ5 | dataflow | Chip | [42]
Kalray | Coolidge | Kalray | manycore | Chip | [80], [81]
Kneron | KL720 | KL720 | dataflow | Chip | [83]
Maxim | Max 78000 | Maxim | dataflow | Chip | [84]–[86]
Mythic | M1076 | Mythic76 | PIM | Chip | [88]–[90]
Mythic | M1108 | Mythic108 | PIM | Chip | [88]–[90]
NovuMind | NovuTensor | NovuMind | dataflow | Chip | [91], [92]
NVIDIA | Ampere A10 | A10 | GPU | Card | [93]
NVIDIA | Ampere A100 | A100 | GPU | Card | [94]
NVIDIA | Ampere A30 | A30 | GPU | Card | [93]
NVIDIA | Ampere A40 | A40 | GPU | Card | [93]
NVIDIA | DGX Station | DGX-Station | GPU | System | [95]
NVIDIA | DGX-1 | DGX-1 | GPU | System | [95], [96]
NVIDIA | DGX-2 | DGX-2 | GPU | System | [96]
NVIDIA | DGX-A100 | DGX-A100 | GPU | System | [97]
NVIDIA | H100 | H100 | GPU | Card | [98]
NVIDIA | Jetson AGX Xavier | XavierAGX | GPU | System | [99]
NVIDIA | Jetson NX Orin | OrinNX | GPU | System | [100], [101]
NVIDIA | Jetson AGX Orin | OrinAGX | GPU | System | [100], [101]
NVIDIA | Jetson TX1 | Jetson1 | GPU | System | [102]
NVIDIA | Jetson TX2 | Jetson2 | GPU | System | [102]
NVIDIA | Jetson Xavier NX | XavierNX | GPU | System | [99]
NVIDIA | DRIVE AGX L2 | AGX-L2 | GPU | System | [103]
NVIDIA | DRIVE AGX L5 | AGX-L5 | GPU | System | [103]
NVIDIA | Pascal P100 | P100 | GPU | Card | [104], [105]
NVIDIA | T4 | T4 | GPU | Card | [106]
NVIDIA | Volta V100 | V100 | GPU | Card | [105], [107]
Perceive | Ergo | Perceive | dataflow | Chip | [108]
Preferred Networks | MN-3 | Preferred-MN-3 | multicore | Card | [110], [111]
Quadric | q1-64 | Quadric | dataflow | Chip | [112]
Qualcomm | Cloud AI 100 | Qcomm | dataflow | Card | [113], [114]
Rockchip | RK3399Pro | RK3399Pro | dataflow | Chip | [115]
SiMa.ai | SiMa.ai | SiMa.ai | dataflow | Chip | [116]
Syntiant | NDP101 | Syntiant | PIM | Chip | [117], [118]
Tachyum | Prodigy | Tachyum | CPU | Chip | [119]
Tenstorrent | Tenstorrent | Tenstorrent | multicore | Card | [120]
Tesla | Tesla Full Self-Driving Computer | Tesla | dataflow | System | [121], [122]
Texas Instruments | TDA4VM | TexInst | dataflow | Chip | [123]–[125]
Toshiba | 2015 | Toshiba | multicore | System | [126]
Untether | TsunAImi | TsunAImi | PIM | Card | [127]

Next, we must mention accelerators that do not appear on Figure 2 yet. Each has been released with some benchmark results but either no peak performance numbers or no peak power numbers.

• After last year releasing some impressive benchmark results for their reconfigurable AI accelerator technology [131], and this year publishing two deeper technology reveals [132], [133] and an applications paper with Argonne National Laboratory [134], SambaNova still has not provided any details from which we can estimate the peak performance or power consumption of their solutions.

• In May 2022, Intel's Habana Labs announced the second generations of the Goya inference accelerator and the Gaudi training accelerator, named Greco and Gaudi2, respectively [135], [136]. Both promised multiple times better performance than their predecessors. Greco will be a single-width PCIe card drawing 75W, while the Gaudi2 will continue to be a double-width PCIe card drawing 650W (likely on a PCIe 5.0 slot). Habana released some benchmarking comparisons to NVIDIA A100 GPUs for the Gaudi2, but peak performance numbers were not disclosed for either of these accelerators.

• Esperanto has produced a few demo chips for evaluation by Samsung and other partners [137]. The chip is reported to be a 1,000-core RISC-V processor, with each core having an AI tensor accelerator. Esperanto has published a few relative performance metrics [138], but they have not disclosed any peak power or peak performance values.

• During the Tesla AI Day event, Tesla gave some details of their custom-built Dojo accelerator and system. They did provide a peak performance of 22.6 TF fp32 per chip, but they did not report peak power draw per chip. Perhaps these details will come later [139].

Finally, there is one departure to report this year. Last year, Centaur Technology announced an x86 CPU with an integrated AI accelerator, which was realized as a 4,096-byte-wide SIMD unit. The performance estimates were competitive, but VIA Technologies, the parent company of Centaur, sold off the USA-based engineering team of the processor to Intel Corp. and seems to have ended the development of the CNS processor [140].

III. OBSERVATIONS AND TRENDS

There are several observations and comments for us to appreciate in Figure 2.

• Int8 continues to be the default numerical precision for embedded, autonomous, and data center inference applications. This precision is adequate for most AI/ML applications with a reasonable number of classes. However, some accelerators also use fp16 and/or bf16 for inference. For training, floating-point formats rather than integer representations have become the norm.

• Among the very low power chips, what is not captured is the other features beyond the machine learning accelerator on the chip. It is very common in this category and the Embedded category to release system-on-chip (SoC) solutions, which often include low-power CPU cores, audio and video analog-to-digital converters (ADCs),
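To make the int8 observation above concrete, here is a minimal symmetric int8 quantization round trip in NumPy. This is our illustrative sketch, not a scheme taken from any surveyed accelerator; production flows typically use calibrated, often per-channel, scales.

```python
import numpy as np

# Symmetric int8 quantization sketch: map the largest weight magnitude
# to 127, round to integers, and rescale on the way back. (Illustrative only.)
def quantize_int8(x):
    scale = float(np.max(np.abs(x))) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(1)
w = rng.normal(0.0, 0.5, size=1000).astype(np.float32)  # toy "weights"
q, s = quantize_int8(w)
w_hat = dequantize(q, s)

# Round-trip error is bounded by half a quantization step (s / 2).
max_err = float(np.max(np.abs(w - w_hat)))
```

For weight distributions with a reasonable dynamic range, this half-step error bound is why int8 is adequate for most inference workloads, while training's small gradient updates push toward the wider floating-point formats discussed below.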
[Figure 4 appears here: (a) neural network peak performance over the past decade, compared across precisions, vs. release date (2012–2022); (b) peak performance and fabrication technology vs. release date.]
Fig. 4: Trends with respect to release date for a subset of publicly announced AI accelerators and processors.

A. Broader Trends

We also collected release dates, fabrication technology, and peak performance for multiple precisions for a smaller subset of the accelerators listed in Table I. We were curious about the trends of peak performance over the past ten years and how numerical precision and fabrication technology influenced them. These data are plotted in Figure 4. Figure 4a plots the release date of a number of accelerators versus their peak performance for one or more precision formats. There are marked gains in peak performance for each of the precision formats, but within each format the maximum gain is 1.5 orders of magnitude over the 10-year period. In Figure 4b, we plot the release date versus the fabrication technology used for the accelerator. The default precision for the peak performance values is int8; however, there are a number of accelerators (e.g., NVIDIA K20, K80 and AMD MI8) which did not have int8 support. For these accelerators, the peak performance is reported for the lowest precision that the accelerator supported. This plot shows that much performance has been gained over the past ten years by supporting lower precision formats; it is particularly interesting to observe how support for lower precision formats was included in these accelerators as research and industry explore the effectiveness of lower floating point and integer formats in CNN/DNN inference and training.

We have several more observations and trends that are not yet captured in graphs. First, the exploration for the best numerical formats for inference and training continues. For inference, some discussion continues as to whether int4 will be acceptable for embedded inference, and the Maxim MAX78000 SoC solution supports 1-bit, 2-bit, 4-bit, and 8-bit integer weights [85]. On the training side, it has been announced that NVIDIA Hopper, Intel Gaudi2, and a future GraphCore accelerator will support the lower-precision FP8 numerical format [142]. GraphCore posted an analysis paper on FP8 [143], including trade-off analyses of scaled integer versus floating point representations, different 8-bit floating point representations, and mixed-representation DNN model performance.

Another trend that has caught our attention is that mathematical kernels other than DNN/CNN models have been implemented on several dataflow accelerators. These dataflow accelerators generally handle each data item independently (i.e., there are no cache lines), and data movement and computational operations are explicitly/statically programmed or
"placed-and-routed" onto the computational hardware (as mentioned previously). Hence, they are amenable to implementing other mathematical kernels for digital signal processing, physical simulation like computational fluid dynamics and weather simulation, and massive graph processing. Cerebras demonstrated the mapping of fast stencil code onto their wafer-scale processor [144], while researchers from the University of Bristol demonstrated stencil codes and image processing using a GraphCore IPU [145]. A team from Citadel Enterprise America also reported on a series of HPC microbenchmarks that they ran on GraphCore IPUs [146]. Google Research has been very busy demonstrating their TPUs on a variety of parallel HPC applications including flood prediction [147], large-scale distributed linear algebra [148], molecular dynamics simulation [149], fast Fourier transforms [150], [151], MRI reconstruction [152], financial Monte Carlo simulations [153], and Monte Carlo simulation of the Ising model [154]. We see this as a foreshadowing of more interesting research and development in using these high-performance accelerators.

B. Other Technologies

The word neuromorphic has become a nebulous term. In industry, it seems to have settled on any computational circuit that in some way mimics some aspects of how the synapses in brains work. When this is applied most broadly, it encompasses many if not all of the accelerators that this series of papers surveys. In academia and the broader research world, neuromorphic computing is the research, design, and development of computational hardware that models functionality and processes in brains, including chemical processes and electrical processes [155], [156]. These brain process simulation efforts have spanned the past four decades, but there is only a modest overlap with the accelerators that are captured in these surveys.

One clear overlap is circuitry based on spiking neural networks, which is what we will focus on here. Intel probably has the most extensive research program for evaluating the commercial viability of spiking neural network accelerators with their Loihi technology [157], [158] and the Intel Neuromorphic Development Community [159]. Among the applications that have been explored with Loihi are target classification in synthetic aperture radar and optical imagery [160], automotive scene analysis [161], and a spectrogram encoder [158]. Further, one company, Innatera, has announced a commercial spiking neural network processor [162]. They have shared an example inference benchmark demonstration [163], but they have not released peak performance or power numbers. In a related vein, some memristor technology is showing its effectiveness in simulating variable neuron-synapse functionality. However, the use of memristors in AI/ML accelerators is still very much in the research phase. A company called Knowm is working towards commercialization of a memristor-based accelerator [164], but that is probably a few years away. They do sell memristors and an evaluation kit on their website.

Progress continues to be made in building and commercializing silicon photonics for AI/ML accelerators, including an extensive survey paper [24]. Several optical/photonic startups have announced photonic inference processors, including LightMatter [165], Lightelligence [166], LightOn [167], and Optalysys [168], [169], and several of these companies have suggested that they will publish performance and power measurements later this year [170], [171]. The LightMatter, Lightelligence, and LightOn accelerators implement multiply-accumulate computations directly with Mach-Zehnder interferometers, while Optalysys uses a 2-dimensional FFT technique also based on Mach-Zehnder interferometers.

IV. SUMMARY

This paper updated the survey of deep neural network accelerators that span from extremely low power through embedded and autonomous applications to data center class accelerators for inference and training. We focused on inference accelerators and discussed some new additions for the year. The rate of announcements and releases has continued to be consistent and modest.

V. DATA AVAILABILITY

The data spreadsheets and references that have been collected for this study and its papers will be posted at https://github.com/areuther/ai-accelerators after they have cleared the release review process.

ACKNOWLEDGEMENT

We express our gratitude to Masahiro Arakawa, Bill Arcand, Bill Bergeron, David Bestor, Bob Bond, Chansup Byun, Nathan Frey, Vitaliy Gleyzer, Jeff Gottschalk, Michael Houle, Matthew Hubbell, Hayden Jananthan, Anna Klein, David Martinez, Joseph McDonald, Lauren Milechin, Sanjeev Mohindra, Paul Monticciolo, Julie Mullen, Andrew Prout, Stephan Rejto, Antonio Rosa, Matthew Weiss, Charles Yee, and Marc Zissman for their support of this work.

REFERENCES

[1] V. Gadepally, J. Goodwin, J. Kepner, A. Reuther, H. Reynolds, S. Samsi, J. Su, and D. Martinez, "AI Enabling Technologies," MIT Lincoln Laboratory, Lexington, MA, Tech. Rep., May 2019. [Online]. Available: https://arxiv.org/abs/1905.03592
[2] T. N. Theis and H.-S. P. Wong, "The End of Moore's Law: A New Beginning for Information Technology," Computing in Science Engineering, vol. 19, no. 2, pp. 41–50, Mar. 2017. [Online]. Available: https://doi.org/10.1109/MCSE.2017.29
[3] M. Horowitz, "Computing's Energy Problem (and What We Can Do About It)," in 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC). IEEE, Feb. 2014, pp. 10–14. [Online]. Available: http://ieeexplore.ieee.org/document/6757323/
[4] C. E. Leiserson, N. C. Thompson, J. S. Emer, B. C. Kuszmaul, B. W. Lampson, D. Sanchez, and T. B. Schardl, "There's Plenty of Room at the Top: What Will Drive Computer Performance after Moore's Law?" Science, vol. 368, no. 6495, Jun. 2020. [Online]. Available: https://science.sciencemag.org/content/368/6495/eaam9744
[5] N. C. Thompson and S. Spanuth, "The Decline of Computers as a General Purpose Technology," Communications of the ACM, vol. 64, no. 3, pp. 64–72, Mar. 2021.
[6] J. L. Hennessy and D. A. Patterson, "A New Golden Age for Computer Architecture," Communications of the ACM, vol. 62, no. 2, pp. 48–60, Jan. 2019. [Online]. Available: http://dl.acm.org/citation.cfm?doid=3310134.3282307
[7] W. J. Dally, Y. Turakhia, and S. Han, "Domain-Specific Hardware Accelerators," Communications of the ACM, vol. 63, no. 7, pp. 48–57, Jun. 2020. [Online]. Available: https://dl.acm.org/doi/10.1145/3361682
[8] Y. LeCun, "Deep Learning Hardware: Past, Present, and Future," in 2019 IEEE International Solid-State Circuits Conference (ISSCC), Feb. 2019, pp. 12–19.
[9] A. Reuther, P. Michaleas, M. Jones, V. Gadepally, S. Samsi, and J. Kepner, "AI Accelerator Survey and Trends," in 2021 IEEE High Performance Extreme Computing Conference (HPEC), Sep. 2021, pp. 1–9.
[10] ——, "Survey of Machine Learning Accelerators," in 2020 IEEE High Performance Extreme Computing Conference (HPEC), 2020, pp. 1–12.
[11] ——, "Survey and Benchmarking of Machine Learning Accelerators," in 2019 IEEE High Performance Extreme Computing Conference, HPEC 2019. Institute of Electrical and Electronics Engineers Inc., Sep. 2019. [Online]. Available: https://doi.org/10.1109/HPEC.2019.8916327
[12] A. Canziani, A. Paszke, and E. Culurciello, "An Analysis of Deep Neural Network Models for Practical Applications," arXiv preprint arXiv:1605.07678, 2016. [Online]. Available: http://arxiv.org/abs/1605.07678
[13] C. S. Lindsey and T. Lindblad, "Survey of Neural Network Hardware," in SPIE 2492, Applications and Science of Artificial Neural Networks, S. K. Rogers and D. W. Ruck, Eds., vol. 2492. International Society for Optics and Photonics, Apr. 1995, pp. 1194–1205. [Online]. Available: http://proceedings.spiedigitallibrary.org/proceeding.aspx?articleid=1001095
[14] Y. Liao, "Neural Networks in Hardware: A Survey," Department of Computer Science, University of California, Tech. Rep., 2001. [Online]. Available: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.460.3235
[15] J. Misra and I. Saha, "Artificial Neural Networks in Hardware: A Survey of Two Decades of Progress," Neurocomputing, vol. 74, no. 1-3, pp. 239–255, Dec. 2010. [Online]. Available: https://doi.org/10.1016/j.neucom.2010.03.021
[16] V. Sze, Y. Chen, T. Yang, and J. S. Emer, "Efficient Processing of Deep Neural Networks: A Tutorial and Survey," Proceedings of the IEEE, vol. 105, no. 12, pp. 2295–2329, Dec. 2017. [Online]. Available: https://doi.org/10.1109/JPROC.2017.2761740
[17] V. Sze, Y.-H. Chen, T.-J. Yang, and J. S. Emer, Efficient Processing of Deep Neural Networks. Morgan and Claypool Publishers, 2020. [Online]. Available: https://doi.org/10.2200/S01004ED1V01Y202004CAC050
[18] H. F. Langroudi, T. Pandit, M. Indovina, and D. Kudithipudi, "Digital Neuromorphic Chips for Deep Learning Inference: A Comprehensive Study," in Applications of Machine Learning, M. E. Zelinski, T. M. Taha, J. Howe, A. A. Awwal, and K. M. Iftekharuddin, Eds. SPIE, Sep. 2019, p. 9. [Online]. Available: https://doi.org/10.1117/12.2529407
[19] Y. Chen, Y. Xie, L. Song, F. Chen, and T. Tang, "A Survey of Accelerator Architectures for Deep Neural Networks," Engineering, vol. 6, no. 3, pp. 264–274, Mar. 2020. [Online]. Available: https://doi.org/10.1016/j.eng.2020.01.007
[27] G. Roos, "FPGA Acceleration Card Delivers on Bandwidth, Speed, and Flexibility," Nov. 2019. [Online]. Available: https://www.eetimes.com/fpga-acceleration-card-delivers-on-bandwidth-speed-and-flexibility/
[28] "aiWare3 Hardware IP Helps Drive Autonomous Vehicles To Production," Oct. 2018. [Online]. Available: https://aimotive.com/news/content/1223
[29] R. Merritt, "Startup Accelerates AI at the Sensor," Feb. 2019. [Online]. Available: https://www.eetimes.com/startup-accelerates-ai-at-the-sensor/
[30] T. Peng, "Alibaba's New AI Chip Can Process Nearly 80K Images Per Second," 2019. [Online]. Available: https://medium.com/syncedreview/alibabas-new-ai-chip-can-process-nearly-80k-images-per-second-63412dec22a3
[31] P. Clarke, "Indo-US Startup Preps Agent-based AI Processor," Aug. 2018. [Online]. Available: https://www.eenewsanalog.com/news/indo-us-startup-preps-agent-based-ai-processor/page/0/1
[32] J. Hamilton, "AWS Inferentia Machine Learning Processor," Nov. 2018. [Online]. Available: https://perspectives.mvdirona.com/2018/11/aws-inferentia-machine-learning-processor/
[33] C. Evangelist, "Deep Dive into Amazon Inferentia: A Custom-Built Chip to Enhance ML and AI," Jan. 2020. [Online]. Available: https://www.cloudmanagementinsider.com/amazon-inferentia-for-machine-learning-and-artificial-intelligence/
[34] ExxactCorp, "Taking a Deeper Look at AMD Radeon Instinct GPUs for Deep Learning," Dec. 2017. [Online]. Available: https://blog.exxactcorp.com/taking-deeper-look-amd-radeon-instinct-gpus-deep-learning/
[35] R. Smith, "AMD Announces Radeon Instinct MI60 & MI50 Accelerators Powered By 7nm Vega," Nov. 2018. [Online]. Available: https://www.anandtech.com/show/13562/amd-announces-radeon-instinct-mi60-mi50-accelerators-powered-by-7nm-vega
[36] D. Schor, "Arm Ethos is for Ubiquitous AI At the Edge," Feb. 2020. [Online]. Available: https://fuse.wikichip.org/news/3282/arm-ethos-is-for-ubiquitous-ai-at-the-edge/
[37] S. Ward-Foxton, "Axelera Demos AI Test Chip After Taping Out in Four Months," May 2022. [Online]. Available: https://www.eetimes.com/axelera-demos-ai-test-chip-after-taping-out-in-four-months/
[38] J. Ouyang, X. Du, Y. Ma, and J. Liu, "Kunlun: A 14nm High-Performance AI Processor for Diversified Workloads," in 2021 IEEE International Solid-State Circuits Conference (ISSCC), vol. 64, Feb. 2021, pp. 50–51.
[39] R. Merritt, "Baidu Accelerator Rises in AI," Jul. 2018. [Online]. Available: https://www.eetimes.com/baidu-accelerator-rises-in-ai/
[40] C. Duckett, "Baidu Creates Kunlun Silicon for AI," Jul. 2018. [Online]. Available: https://www.zdnet.com/article/baidu-creates-kunlun-silicon-for-ai/
[41] B. Wheeler, "Bitmain SoC Brings AI to the Edge," Feb. 2019. [Online]. Available: https://www.linleygroup.com/newsletters/newsletter_detail.
[20] E. Wang, J. J. Davis, R. Zhao, H.-C. C. Ng, X. Niu, W. Luk, php%3Fnum=5975%26year=2019%26tag=3
P. Y. K. Cheung, and G. A. Constantinides, “Deep Neural [42] M. Demler, “Blaize Ignites Edge-AI Performance,” The Linley Group,
Network Approximation for Custom Hardware,” ACM Computing Tech. Rep., sep 2020. [Online]. Available: https://www.blaize.com/
Surveys, vol. 52, no. 2, pp. 1–39, may 2019. [Online]. Available: wp-content/uploads/2020/09/Blaize-Ignites-Edge-AI-Performance.pdf
https://dl.acm.org/doi/10.1145/3309551 [43] Y. Wu, “Chinese AI Chip Maker Cambricon Unveils
[21] S. Khan and A. Mann, “AI Chips: What They Are and Why They New Cloud-Based Smart Chip,” may 2018. [Online].
Matter,” Georgetown Center for Security and Emerging Technology, Available: https://www.chinamoneynetwork.com/2018/05/04/chinese-
Tech. Rep., apr 2020. [Online]. Available: https://cset.georgetown.edu/ ai-chip-maker-cambricon-unveils-new-cloud-based-smart-chip
research/ai-chips-what-they-are-and-why-they-matter/ [44] I. Cutress, “Cambricon, Maker of Hauwei’s Kirin NPU IP,
[22] U. Rueckert, “Digital Neural Network Accelerators,” in NANO-CHIPS Build a Big AI Chip and PCIe Card,” may 2018. [On-
2030: On-Chip AI for an Efficient Data-Driven World, B. Murmann line]. Available: https://www.anandtech.com/show/12815/cambricon-
and B. Hoefflinger, Eds. Springer, Cham, 2020, ch. 12, pp. 181–202. makers-of-huaweis-kirin-npu-ip-build-a-big-ai-chip-and-pcie-card
[Online]. Available: https://link.springer.com/chapter/10.1007%2F978- [45] L. Gwennap, “Kendryte Embeds AI for Surveillance,” mar
3-030-18338-7 12 2019. [Online]. Available: https://www.linleygroup.com/newsletters/
[23] T. Rogers and M. Khairy, “An Academic’s Attempt to Clear the Fog newsletter detail.php?num=5992
of the Machine Learning Accelerator War — SIGARCH,” aug 2021. [46] A. Hock, “Introducing the Cerebras CS-1, the Industry’s
[Online]. Available: https://www.sigarch.org/an-academics-attempt-to- Fastest Artificial Intelligence Computer,” nov 2019. [Online].
clear-the-fog-of-the-machine-learning-accelerator-war/ Available: https://www.cerebras.net/introducing-the-cerebras-cs-1-the-
[24] F. P. Sunny, E. Taheri, M. Nikdast, and S. Pasricha, “A Survey on industrys-fastest-artificial-intelligence-computer/
Silicon Photonics for Deep Learning,” ACM Journal on Emerging [47] T. Trader, “Cerebras Doubles AI Performance with Second-
Technologies in Computing Systems, vol. 17, no. 4, oct 2021. [Online]. Gen 7nm Wafer Scale Engine,” apr 2021. [Online].
Available: https://dl.acm.org/doi/10.1145/3459009 Available: https://www.hpcwire.com/2021/04/20/cerebras-doubles-ai-
[25] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet Classifica- performance-with-second-gen-7nm-wafer-scale-engine/
tion with Deep Convolutional Neural Networks,” Neural Information [48] “Cornami Achieves Unprecedented Performance at Lowest Power
Processing Systems, vol. 25, 2012. Dissipation for Deep Neural Networks,” oct 2019. [Online]. Available:
[26] N. P. Jouppi, C. Young, N. Patil, and D. Patterson, “A Domain-Specific https://cornami.com/1416-2/
Architecture for Deep Neural Networks,” Communications of the [49] P. Clarke, “GlobalFoundries Aids Launch of Chinese AI Startup,”
ACM, vol. 61, no. 9, pp. 50–59, aug 2018. [Online]. Available: dec 2019. [Online]. Available: https://www.eenewsanalog.com/news/
http://doi.acm.org/10.1145/3154484 globalfoundries-aids-launch-chinese-ai-startup
[50] V. Mehta, "Performance Estimation and Benchmarks for Real-World Edge Inference Applications," in Linley Spring Processor Conference. Linley Group, 2020.
[51] "Edge TPU," 2019. [Online]. Available: https://cloud.google.com/edge-tpu/
[52] N. P. Jouppi, D. H. Yoon, G. Kurian, S. Li, N. Patil, J. Laudon, C. Young, and D. Patterson, "A Domain-Specific Supercomputer for Training Deep Neural Networks," Commun. ACM, vol. 63, no. 7, pp. 67–78, jun 2020. [Online]. Available: https://doi.org/10.1145/3360307
[53] P. Teich, "Tearing Apart Google's TPU 3.0 AI Coprocessor," may 2018. [Online]. Available: https://www.nextplatform.com/2018/05/10/tearing-apart-googles-tpu-3-0-ai-coprocessor/
[54] N. P. Jouppi, D. H. Yoon, M. Ashcraft, M. Gottscho, T. B. Jablin, G. Kurian, J. Laudon, S. Li, P. Ma, X. Ma, T. Norrie, N. Patil, S. Prasad, C. Young, Z. Zhou, and D. Patterson, "Ten Lessons From Three Generations Shaped Google's TPUv4i," in Proc. of 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA). IEEE Computer Society, jun 2021, pp. 1–14.
[55] O. Peckham, "Google Cloud's New TPU v4 ML Hub Packs 9 Exaflops of AI," may 2022. [Online]. Available: https://www.hpcwire.com/2022/05/16/google-clouds-new-tpu-v4-ml-hub-packs-9-exaflops-of-ai/
[56] L. Gwennap, "Groq Rocks Neural Networks," Microprocessor Report, Tech. Rep., jan 2020. [Online]. Available: http://groq.com/wp-content/uploads/2020/04/Groq-Rocks-NNs-Linley-Group-MPR-2020Jan06.pdf
[57] D. Lacey, "Preliminary IPU Benchmarks," oct 2017. [Online]. Available: https://www.graphcore.ai/posts/preliminary-ipu-benchmarks-providing-previously-unseen-performance-for-a-range-of-machine-learning-applications
[58] "Dell DSS8440 Graphcore IPU Server," Graphcore, Tech. Rep., feb 2020. [Online]. Available: https://www.graphcore.ai/hubfs/Leadgenassets/DSS8440IPUServerWhitePaper_2020.pdf
[59] S. Ward-Foxton, "Graphcore Takes on Nvidia with Second-Gen AI Accelerator," jul 2020. [Online]. Available: https://www.eetimes.com/graphcore-takes-on-nvidia-with-second-gen-ai-accelerator/
[60] M. Tyson, "Graphcore Bow IPU Introduces TSMC 3D Wafer-on-Wafer Processor," mar 2022. [Online]. Available: https://www.tomshardware.com/news/graphcore-tsmc-bow-ipu-3d-wafer-on-wafer-processor
[61] "GAP Application Processors," 2020. [Online]. Available: https://greenwaves-technologies.com/gap8_gap9/
[62] J. Turley, "GAP9 for ML at the Edge," EE Journal, jun 2020. [Online]. Available: https://www.eejournal.com/article/gap9-for-ml-at-the-edge/
[63] N. Hemsoth, "Groq Shares Recipe for TSP Nodes, Systems," sep 2020. [Online]. Available: https://www.nextplatform.com/2020/09/29/groq-shares-recipe-for-tsp-nodes-systems/
[64] D. Abts, J. Ross, J. Sparling, M. Wong-VanHaren, M. Baker, T. Hawkins, A. Bell, J. Thompson, T. Kahsai, G. Kimmell, J. Hwang, R. Leslie-Hurd, M. Bye, E. R. Creswick, M. Boyd, M. Venigalla, E. Laforge, J. Purdy, P. Kamath, D. Maheshwari, M. Beidler, G. Rosseel, O. Ahmad, G. Gagarin, R. Czekalski, A. Rane, S. Parmar, J. Werner, J. Sproch, A. Macias, and B. Kurtz, "Think Fast: A Tensor Streaming Processor (TSP) for Accelerating Deep Learning Workloads," in 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA), may 2020, pp. 145–158. [Online]. Available: https://doi.org/10.1109/ISCA45697.2020.00023
[65] S. Ward-Foxton, "Gyrfalcon Unveils Fourth AI Accelerator Chip," nov 2019. [Online]. Available: https://www.eetimes.com/gyrfalcon-unveils-fourth-ai-accelerator-chip/
[66] "SolidRun, Gyrfalcon Develop Arm-based Edge Optimized AI Inference Server," feb 2020. [Online]. Available: https://www.hpcwire.com/off-the-wire/solidrun-gyrfalcon-develop-edge-optimized-ai-inference-server/
[67] L. Gwennap, "Habana Offers Gaudi for AI Training," Microprocessor Report, Tech. Rep., jun 2019. [Online]. Available: https://habana.ai/wp-content/uploads/2019/06/Habana-Offers-Gaudi-for-AI-Training.pdf
[68] E. Medina and E. Dagan, "Habana Labs Purpose-Built AI Inference and Training Processor Architectures: Scaling AI Training Systems Using Standard Ethernet With Gaudi Processor," IEEE Micro, vol. 40, no. 2, pp. 17–24, mar 2020. [Online]. Available: https://doi.org/10.1109/MM.2020.2975185
[69] L. Gwennap, "Habana Wins Cigar for AI Inference," feb 2019. [Online]. Available: https://www.linleygroup.com/mpr/article.php?id=12103
[70] S. Ward-Foxton, "Details of Hailo AI Edge Accelerator Emerge," aug 2019. [Online]. Available: https://www.eetimes.com/details-of-hailo-ai-edge-accelerator-emerge/
[71] "Horizon Robotics Journey2 Automotive AI Processor Series," 2020. [Online]. Available: https://en.horizon.ai/product/journey
[72] Huawei, "Ascend 310 AI Processor," 2020. [Online]. Available: https://e.huawei.com/us/products/cloud-computing-dc/atlas/ascend-310
[73] ——, "Ascend 910 AI Processor," 2020. [Online]. Available: https://e.huawei.com/us/products/cloud-computing-dc/atlas/ascend-910
[74] M. S. Abdelfattah, D. Han, A. Bitar, R. DiCecco, S. O'Connell, N. Shanker, J. Chu, I. Prins, J. Fender, A. C. Ling, and G. R. Chiu, "DLA: Compiler and FPGA Overlay for Neural Network Inference Acceleration," in 2018 28th International Conference on Field Programmable Logic and Applications (FPL), aug 2018, pp. 411–4117. [Online]. Available: https://doi.org/10.1109/FPL.2018.00077
[75] N. Hemsoth, "Intel FPGA Architecture Focuses on Deep Learning Inference," jul 2018. [Online]. Available: https://www.nextplatform.com/2018/07/31/intel-fpga-architecture-focuses-on-deep-learning-inference/
[76] J. Hruska, "New Movidius Myriad X VPU Packs a Custom Neural Compute Engine," aug 2017. [Online]. Available: https://www.extremetech.com/computing/254772-new-movidius-myriad-x-vpu-packs-custom-neural-compute-engine
[77] J. De Gelas, "Intel's Xeon Cascade Lake vs. NVIDIA Turing: An Analysis in AI," jul 2019. [Online]. Available: https://www.anandtech.com/show/14466/intel-xeon-cascade-lake-vs-nvidia-turing
[78] "Intel Xeon Platinum 8180," 2020. [Online]. Available: http://www.cpu-world.com/CPUs/Xeon/Intel-Xeon8180.html
[79] "Intel Xeon Platinum 8280," 2020. [Online]. Available: http://www.cpu-world.com/CPUs/Xeon/Intel-Xeon8280.html
[80] B. Dupont de Dinechin, "Kalray's MPPA® Manycore Processor: At the Heart of Intelligent Systems," in 17th IEEE International New Circuits and Systems Conference (NEWCAS). Munich: IEEE, jun 2019. [Online]. Available: https://www.european-processor-initiative.eu/dissemination-material/1259/
[81] P. Clarke, "NXP, Kalray Demo Coolidge Parallel Processor in 'BlueBox'," jan 2020. [Online]. Available: https://www.eenewsanalog.com/news/nxp-kalray-demo-coolidge-parallel-processor-bluebox
[82] S. Ward-Foxton, "Kneron's Next-Gen Edge AI Chip Gets $40m Boost," jan 2020. [Online]. Available: https://www.eetasia.com/knerons-next-gen-edge-ai-chip-gets-40m-boost/
[83] ——, "Kneron Attracts Strategic Investors," jan 2021. [Online]. Available: https://www.eetimes.com/kneron-attracts-strategic-investors/
[84] ——, "Maxim Debuts Homegrown AI Accelerator in Latest ULP SoC," nov 2020. [Online]. Available: https://www.eetimes.com/maxim-debuts-homegrown-ai-accelerator-in-latest-ulp-soc/
[85] A. Jani, "Maxim Showcases Efficient Custom AI," feb 2021. [Online]. Available: https://www.linleygroup.com/newsletters/newsletter_detail.php?num=6274&year=2021&tag=3
[86] M. Clay, C. Grecos, M. Shirvaikar, and B. Richey, "Benchmarking the MAX78000 Artificial Intelligence Microcontroller for Deep Learning Applications," in Real-Time Image Processing and Deep Learning 2022, N. Kehtarnavaz and M. F. Carlsohn, Eds., vol. 12102, International Society for Optics and Photonics. SPIE, 2022, pp. 47–52. [Online]. Available: https://doi.org/10.1117/12.2622390
[87] T. P. Morgan, "Drilling Into Microsoft's BrainWave Soft Deep Learning Chip," aug 2017. [Online]. Available: https://www.nextplatform.com/2017/08/24/drilling-microsofts-brainwave-soft-deep-leaning-chip/
[88] S. Ward-Foxton, "Mythic Resizes its Analog AI Chip," jun 2021. [Online]. Available: https://www.eetimes.com/mythic-resizes-its-analog-ai-chip/
[89] N. Hemsoth, "A Mythic Approach to Deep Learning Inference," aug 2018. [Online]. Available: https://www.nextplatform.com/2018/08/23/a-mythic-approach-to-deep-learning-inference/
[90] D. Fick, "Mythic @ Hot Chips 2018," aug 2018. [Online]. Available: https://medium.com/mythic-ai/mythic-hot-chips-2018-637dfb9e38b7
[91] K. Freund, "NovuMind: An Early Entrant in AI Silicon," Moor Insights & Strategy, Tech. Rep., may 2019. [Online]. Available: https://moorinsightsstrategy.com/wp-content/uploads/2019/05/NovuMind-An-Early-Entrant-in-AI-Silicon-By-Moor-Insights-And-Strategy.pdf
[92] J. Yoshida, "NovuMind's AI Chip Sparks Controversy," oct 2018. [Online]. Available: https://www.eetimes.com/novuminds-ai-chip-sparks-controversy/
[93] T. P. Morgan, "Nvidia Rounds Out 'Ampere' Lineup With Two New Accelerators," apr 2021. [Online]. Available: https://www.nextplatform.com/2021/04/15/nvidia-rounds-out-ampere-lineup-with-two-new-accelerators/
[94] R. Krashinsky, O. Giroux, S. Jones, N. Stam, and S. Ramaswamy, "NVIDIA Ampere Architecture In-Depth," may 2020. [Online]. Available: https://devblogs.nvidia.com/nvidia-ampere-architecture-in-depth/
[95] P. Alcorn, "Nvidia Infuses DGX-1 with Volta, Eight V100s in a Single Chassis," may 2017. [Online]. Available: https://www.tomshardware.com/news/nvidia-volta-v100-dgx-1-hgx-1,34380.html
[96] I. Cutress, "NVIDIA's DGX-2: Sixteen Tesla V100s, 30TB of NVMe, Only $400K," mar 2018. [Online]. Available: https://www.anandtech.com/show/12587/nvidias-dgx2-sixteen-v100-gpus-30-tb-of-nvme-only-400k
[97] C. Campa, C. Kawalek, H. Vo, and J. Bessoudo, "Defining AI Innovation with NVIDIA DGX A100," may 2020. [Online]. Available: https://devblogs.nvidia.com/defining-ai-innovation-with-dgx-a100/
[98] R. Smith, "NVIDIA Hopper GPU Architecture and H100 Accelerator Announced: Working Smarter and Harder," mar 2022. [Online]. Available: https://www.anandtech.com/show/17327/nvidia-hopper-gpu-architecture-and-h100-accelerator-announced
[99] ——, "NVIDIA Gives Jetson AGX Xavier a Trim, Announces Nano-Sized Jetson Xavier NX," nov 2019. [Online]. Available: https://www.anandtech.com/show/15070/nvidia-gives-jetson-xavier-a-trim-announces-nanosized-jetson-xavier-nx
[100] B. Funk, "NVIDIA Jetson AGX Orin: The Next-Gen Platform That Will Power Our AI Robot Overlords Unveiled," mar 2022. [Online]. Available: https://hothardware.com/news/nvidia-jetson-agx-orin
[101] "Jetson AGX Orin for Next-Gen Robotics," 2022. [Online]. Available: https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-orin/
[102] D. Franklin, "NVIDIA Jetson TX2 Delivers Twice the Intelligence to the Edge," mar 2017. [Online]. Available: https://developer.nvidia.com/blog/jetson-tx2-delivers-twice-intelligence-edge/
[103] B. Hill, "NVIDIA Unveils Ampere-Infused DRIVE AGX For Autonomous Cars, Isaac Robotics Platform With BMW Partnership," may 2022. [Online]. Available: https://hothardware.com/news/nvidia-drive-agx-pegasus-orin-ampere-next-gen-autonomous-cars
[104] "NVIDIA Tesla P100." [Online]. Available: https://www.nvidia.com/en-us/data-center/tesla-p100/
[105] R. Smith, "16GB NVIDIA Tesla V100 Gets Reprieve; Remains in Production," may 2018. [Online]. Available: https://www.anandtech.com/show/12809/16gb-nvidia-tesla-v100-gets-reprieve-remains-in-production
[106] E. Kilgariff, H. Moreton, N. Stam, and B. Bell, "NVIDIA Turing Architecture In-Depth," sep 2018. [Online]. Available: https://developer.nvidia.com/blog/nvidia-turing-architecture-in-depth/
[107] "NVIDIA Tesla V100 Tensor Core GPU," 2019. [Online]. Available: https://www.nvidia.com/en-us/data-center/tesla-v100/
[108] J. McGregor, "Perceive Exits Stealth With Super Efficient Machine Learning Chip For Smarter Devices," apr 2020. [Online]. Available: https://www.forbes.com/sites/tiriasresearch/2020/04/06/perceive-exits-stealth-with-super-efficient-machine-learning-chip-for-smarter-devices/#1b25ab646d9c
[109] D. Schor, "The 2,048-core PEZY-SC2 Sets a Green500 Record," nov 2017. [Online]. Available: https://fuse.wikichip.org/news/191/the-2048-core-pezy-sc2-sets-a-green500-record/
[110] "MN-Core," 2020. [Online]. Available: https://projects.preferred.jp/mn-core/en/
[111] I. Cutress, "Preferred Networks: A 500 W Custom PCIe Card using 3000 mm2 Silicon," dec 2019. [Online]. Available: https://www.anandtech.com/show/15177/preferred-networks-a-500-w-custom-pcie-card-using-3000-mm2-silicon
[112] D. Firu, "Quadric Edge Supercomputer," Quadric, Tech. Rep., apr 2019. [Online]. Available: https://quadric.io/supercomputing.pdf
[113] S. Ward-Foxton, "Qualcomm Cloud AI 100 Promises Impressive Performance per Watt for Near-Edge AI," sep 2020. [Online]. Available: https://www.eetimes.com/qualcomm-cloud-ai-100-promises-impressive-performance-per-watt-for-near-edge-ai/
[114] D. McGrath, "Qualcomm Targets AI Inferencing in the Cloud," apr 2019. [Online]. Available: https://www.eetimes.com/qualcomm-targets-ai-inferencing-in-the-cloud/
[115] "Rockchip Released Its First AI Processor RK3399Pro NPU Performance Up to 2.4TOPs," jan 2018. [Online]. Available: https://www.rock-chips.com/a/en/News/Press_Releases/2018/0108/869.html
[116] L. Gwennap, "Machine Learning Moves to the Edge," Microprocessor Report, Tech. Rep., apr 2020. [Online]. Available: https://www.linleygroup.com/uploads/sima-machine-learning-moves-to-the-edge-wp.pdf
[117] D. McGrath, "Tech Heavyweights Back AI Chip Startup," oct 2018. [Online]. Available: https://www.eetimes.com/tech-heavyweights-back-ai-chip-startup/
[118] R. Merritt, "Startup Rolls AI Chips for Audio," feb 2018. [Online]. Available: https://www.eetimes.com/startup-rolls-ai-chips-for-audio/
[119] A. Shilov, "Tachyum Teases 128-Core CPU: 5.7 GHz, 950W, 16 DDR5 Channels," jun 2022. [Online]. Available: https://www.tomshardware.com/news/tachyum-teases-128-core-cpu-57-ghz-950w-16-ddr5-channels
[120] L. Gwennap, "Tenstorrent Scales AI Performance: Architecture Leads in Data-Center Power Efficiency," Microprocessor Report, Tech. Rep., apr 2020. [Online]. Available: https://www.tenstorrent.com/wp-content/uploads/2020/04/Tenstorrent-Scales-AI-Performance.pdf
[121] E. Talpes, D. D. Sarma, G. Venkataramanan, P. Bannon, B. McGee, B. Floering, A. Jalote, C. Hsiong, S. Arora, A. Gorti, and G. S. Sachdev, "Compute Solution for Tesla's Full Self-Driving Computer," IEEE Micro, vol. 40, no. 2, pp. 25–35, mar 2020. [Online]. Available: https://doi.org/10.1109/MM.2020.2975764
[122] "FSD Chip - Tesla," 2020. [Online]. Available: https://en.wikichip.org/wiki/tesla_(car_company)/fsd_chip
[123] S. Ward-Foxton, "TI's First Automotive SoC with an AI Accelerator Launches," feb 2021. [Online]. Available: https://www.eetimes.com/tis-first-automotive-soc-with-an-ai-accelerator-launches/
[124] "TDA4VM Jacinto Processors for ADAS and Autonomous Vehicles," Texas Instruments, Tech. Rep., mar 2021. [Online]. Available: https://www.ti.com/lit/gpn/tda4vm
[125] M. Demler, "TI Jacinto Accelerates Level 3 ADAS," mar 2020. [Online]. Available: https://www.linleygroup.com/newsletters/newsletter_detail.php?num=6130&year=2020&tag=3
[126] R. Merritt, "Samsung, Toshiba Detail AI Chips," feb 2019. [Online]. Available: https://www.eetimes.com/samsung-toshiba-detail-ai-chips/
[127] L. Gwennap, "Untether Delivers At-Memory AI," Linley Group, Tech. Rep., nov 2020. [Online]. Available: https://www.linleygroup.com/newsletters/newsletter_detail.php?num=6230
[128] G. Hilson, "Startup Tachyum Offers Universal Processor for Evaluation," jun 2022. [Online]. Available: https://www.eetimes.com/startup-tachyum-offers-universal-processor-for-evaluation/
[129] N. Toon, "Introducing 2nd Generation IPU Systems for AI at Scale," jul 2020. [Online]. Available: https://www.graphcore.ai/posts/introducing-second-generation-ipu-systems-for-ai-at-scale
[130] I. Lunden, "Graphcore Unveils New GC200 Chip and the Expandable M2000 IPU Machine That Runs on Them," jul 2020. [Online]. Available: https://techcrunch.com/2020/07/15/graphcore-second-generation-chip/
[131] S. Ward-Foxton, "SambaNova Emerges From Stealth With Record-Breaking AI System," dec 2020. [Online]. Available: https://www.eetimes.com/sambanova-emerges-from-stealth-with-record-breaking-ai-system/
[132] R. Prabhakar, S. Jairath, and J. L. Shin, "SambaNova SN10 RDU: A 7nm Dataflow Architecture to Accelerate Software 2.0," in 2022 IEEE International Solid-State Circuits Conference (ISSCC), vol. 65, 2022, pp. 350–352.
[133] R. Prabhakar and S. Jairath, "SambaNova SN10 RDU: Accelerating Software 2.0 with Dataflow," in 2021 IEEE Hot Chips 33 Symposium (HCS), aug 2021, pp. 1–37.
[134] M. Emani, V. Vishwanath, C. Adams, M. E. Papka, R. Stevens, L. Florescu, S. Jairath, W. Liu, T. Nama, and A. Sujeeth, "Accelerating Scientific Applications With SambaNova Reconfigurable Dataflow Architecture," Computing in Science & Engineering, vol. 23, no. 2, pp. 114–119, 2021.
[135] O. Peckham, "Intel's Habana Labs Unveils Gaudi2, Greco AI Processors," may 2022. [Online]. Available: https://www.hpcwire.com/2022/05/10/intels-habana-labs-unveils-gaudi2-greco-ai-processors/
[136] T. P. Morgan, "Intel Pits New Gaudi2 AI Training Engine Against Nvidia GPUs," may 2022. [Online]. Available: https://www.nextplatform.com/2022/05/10/intel-pits-new-gaudi2-ai-training-engine-against-nvidia-gpus/
[137] D. Martin, "Samsung, Others Test Esperanto's 1,000-Core RISC-V AI Chip," apr 2022. [Online]. Available: https://www.theregister.com/2022/04/22/samsung_esperanto_riscv/
[138] K. Freund, "Esperanto Launches AI Accelerator with over 1000 RISC-V Cores," aug 2021.
[139] O. Peckham, "Enter Dojo: Tesla Reveals Design for Modular Supercomputer & D1 Chip," aug 2021. [Online]. Available: https://www.hpcwire.com/2021/08/20/enter-dojo-tesla-reveals-design-for-modular-supercomputer-d1-chip/
[140] A. Shilov, "Via Shutters Centaur Technology Site, Sells Off Equipment," dec 2021. [Online]. Available: https://www.tomshardware.com/news/via-sells-off-equipment-from-centaur-preps-to-shut-down-site
[141] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, "Attention Is All
You Need," CoRR, vol. abs/1706.03762, 2017. [Online]. Available: http://arxiv.org/abs/1706.03762
[142] J. Burt, "Chip Makers Press For Standardized FP8 Format For AI," jul 2022. [Online]. Available: https://www.nextplatform.com/2022/07/07/chip-makers-press-for-standardized-fp8-format-for-ai/
[143] B. Noune, P. Jones, D. Justus, D. Masters, and C. Luschi, "8-bit Numerical Formats for Deep Neural Networks," arXiv preprint, jun 2022. [Online]. Available: http://arxiv.org/abs/2206.02915
[144] K. Rocki, D. van Essendelft, I. Sharapov, R. Schreiber, M. Morrison, V. Kibardin, A. Portnoy, J. F. Dietiker, M. Syamlal, and M. James, "Fast Stencil-Code Computation on a Wafer-Scale Processor," arXiv preprint, 2020.
[145] T. Louw and S. McIntosh-Smith, "Using the Graphcore IPU for Traditional HPC Applications," in 3rd Workshop on Accelerated Machine Learning (AccML), jan 2021.
[146] Z. Jia, B. Tillman, M. Maggioni, and D. P. Scarpazza, "Dissecting the Graphcore IPU Architecture via Microbenchmarking," Citadel, Chicago, Tech. Rep., dec 2019. [Online]. Available: https://arxiv.org/abs/1912.03413v1
[147] R. L. Hu, D. Pierce, Y. Shafi, A. Boral, V. Anisimov, S. Nevo, and Y.-F. Chen, "Accelerating physics simulations with tensor processing units: An inundation modeling example," The International Journal of High Performance Computing Applications, vol. 36, no. 4, pp. 510–523, 2022. [Online]. Available: https://doi.org/10.1177/10943420221102873
[148] A. G. M. Lewis, J. Beall, M. Ganahl, M. Hauru, S. B. Mallick, and G. Vidal, "Large Scale Distributed Linear Algebra With Tensor Processing Units," arXiv preprint, dec 2021. [Online]. Available: https://arxiv.org/abs/2112.09017v1
[149] P. Sharma and V. Jadhao, "Molecular Dynamics Simulations on Cloud Computing and Machine Learning Platforms," arXiv preprint, 2021.
[150] T. Lu, T. Marin, Y. Zhuo, Y.-F. Chen, and C. Ma, "Nonuniform Fast Fourier Transform on TPUs," in 2021 IEEE 18th International Symposium on Biomedical Imaging (ISBI), 2021, pp. 783–787.
[151] T. Lu, Y. F. Chen, B. Hechtman, T. Wang, and J. Anderson, "Large-Scale Discrete Fourier Transform on TPUs," IEEE Access, vol. 9, pp. 93422–93432, feb 2020. [Online]. Available: https://arxiv.org/abs/2002.03260v3
[152] T. Lu, T. Marin, Y. Zhuo, Y. F. Chen, and C. Ma, "Accelerating MRI Reconstruction on TPUs," 2020 IEEE High Performance Extreme Computing Conference (HPEC 2020), sep 2020. [Online]. Available: https://arxiv.org/abs/2006.14080v1
[153] F. Belletti, D. King, K. Yang, R. Nelet, Y. Shafi, Y.-F. Chen, and J. Anderson, "Tensor Processing Units for Financial Monte Carlo," in Proceedings of the 2020 SIAM Conference on Parallel Processing for Scientific Computing. Society for Industrial and Applied Mathematics, jun 2019, pp. 12–23. [Online]. Available: https://arxiv.org/abs/1906.02818v5
[154] K. Yang, Y. F. Chen, G. Roumpos, C. Colby, and J. Anderson, "High performance Monte Carlo simulation of Ising model on TPU clusters," in International Conference for High Performance Computing, Networking, Storage and Analysis, SC. IEEE Computer Society, nov 2019. [Online]. Available: https://arxiv.org/abs/1903.11714v4
[155] C. D. Schuman, T. E. Potok, R. M. Patton, J. D. Birdwell, M. E. Dean, G. S. Rose, and J. S. Plank, "A Survey of Neuromorphic Computing and Neural Networks in Hardware," arXiv preprint arXiv:1705.06963, may 2017. [Online]. Available: http://arxiv.org/abs/1705.06963
[156] C. D. James, J. B. Aimone, N. E. Miner, C. M. Vineyard, F. H. Rothganger, K. D. Carlson, S. A. Mulder, T. J. Draelos, A. Faust, M. J. Marinella, J. H. Naegle, and S. J. Plimpton, "A Historical Survey of Algorithms and Hardware Architectures for Neural-inspired and Neuromorphic Computing Applications," Biologically Inspired Cognitive Architectures, vol. 19, pp. 49–64, jan 2017. [Online]. Available: https://www.sciencedirect.com/science/article/abs/pii/S2212683X16300561
[157] R. F. Service, "Microchips That Mimic the Human Brain Could Make AI Far More Energy Efficient," may 2022. [Online]. Available: https://www.science.org/content/article/microchips-mimic-human-brain-could-make-ai-far-more-energy-efficient
[158] G. Orchard, E. P. Frady, D. B. D. Rubin, S. Sanborn, S. B. Shrestha, F. T. Sommer, and M. Davies, "Efficient Neuromorphic Signal Processing with Loihi 2," in 2021 IEEE Workshop on Signal Processing Systems (SiPS), oct 2021, pp. 254–259.
[160] M. Barnell, C. Raymond, M. Wilson, D. Isereau, and C. Cicotta, "Target Classification in Synthetic Aperture Radar and Optical Imagery Using Loihi Neuromorphic Hardware," in 2020 IEEE High Performance Extreme Computing Conference (HPEC), 2020, pp. 1–6.
[161] A. Viale, A. Marchisio, M. Martina, G. Masera, and M. Shafique, "CarSNN: An Efficient Spiking Neural Network for Event-Based Autonomous Cars on the Loihi Neuromorphic Research Processor," in 2021 International Joint Conference on Neural Networks (IJCNN), jul 2021, pp. 1–10.
[162] S. Ward-Foxton, "Innatera Unveils Neuromorphic AI Chip to Accelerate Spiking Networks," jul 2021. [Online]. Available: https://www.eetimes.com/innatera-unveils-neuromorphic-ai-chip-to-accelerate-spiking-networks/
[163] M. Levy, "Innatera's Spiking Neural Processor," apr 2021. [Online]. Available: https://www.linleygroup.com/newsletters/newsletter_detail.php?num=6302&year=2021&tag=3
[164] V. Ostrovskii, P. Fedoseev, Y. Bobrova, and D. Butusov, "Structural and Parametric Identification of Knowm Memristors," Nanomaterials, vol. 12, no. 1, jan 2022. [Online]. Available: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8746671/
[165] S. Ward-Foxton, "Optical Compute Promises Game-Changing AI Performance," aug 2020. [Online]. Available: https://www.eetimes.com/optical-compute-promises-game-changing-ai-performance/
[166] ——, "Optical Chip Solves Hardest Math Problems Faster than GPUs," dec 2021. [Online]. Available: https://www.eetimes.com/optical-computing-chip-runs-hardest-math-problems-100x-faster-than-gpus/
[167] J. Launay, I. Poli, K. Müller, I. Carron, L. Daudet, F. Krzakala, and S. Gigan, "Light-in-the-Loop: Using a Photonics Co-Processor for Scalable Training of Neural Networks," arXiv preprint, jun 2020. [Online]. Available: https://arxiv.org/abs/2006.01475v2
[168] E. Cottle, F. Michel, J. Wilson, N. New, and I. Kundu, "Optical Convolutional Neural Networks – Combining Silicon Photonics and Fourier Optics for Computer Vision," arXiv preprint, dec 2020. [Online]. Available: https://arxiv.org/abs/2103.09044v1
[169] J. Wilson, "The Multiply and Fourier Transform Unit: A Micro-Scale Optical Processor," Optalysys, Tech. Rep., dec 2020. [Online]. Available: https://optalysys.com/s/Multiply_and_Fourier_Transform_white_paper_12_12_20.pdf
[170] D. Schneider, "A Neural-Net Based on Light Could Best Digital Computers," jun 2019. [Online]. Available: https://spectrum.ieee.org/a-neural-net-based-on-light-could-best-digital-computers
[171] C. Q. Choi, "Photonic Chip Performs Image Recognition at the Speed of Light," jun 2022. [Online]. Available: https://spectrum.ieee.org/photonic-neural-network
[159] M. Davies, A. Wild, G. Orchard, Y. Sandamirskaya, G. A. F. Guerra, P. Joshi, P. Plank, and S. R. Risbud, "Advancing Neuromorphic
Computing With Loihi: A Survey of Results and Outlook,” Proceedings
of the IEEE, vol. 109, no. 5, pp. 911–934, may 2021.