2012.09265
M. Cerezo,1, 2, 3 Andrew Arrasmith,1 Ryan Babbush,4 Simon C. Benjamin,5 Suguru Endo,6 Keisuke Fujii,7, 8, 9
Jarrod R. McClean,4 Kosuke Mitarai,7, 10, 11 Xiao Yuan,12, 13 Lukasz Cincio,1, 3 and Patrick J. Coles1, 3
1 Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
2 Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM, USA
3 Quantum Science Center, Oak Ridge, TN 37931, USA
4 Google Quantum AI Team, Venice, CA 90291, United States of America
5 Department of Materials, University of Oxford, Parks Road, Oxford OX1 3PH, United Kingdom
6 NTT Secure Platform Laboratories, NTT Corporation, Musashino, Tokyo 180-8585, Japan
7 Graduate School of Engineering Science, Osaka University, Osaka 560-8531, Japan
8 Center for Quantum Information and Quantum Biology, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Osaka 560-8531, Japan
9 Center for Emergent Matter Science, RIKEN, Saitama 351-0198, Japan
10 Center for Quantum Information and Quantum Biology, Institute for Open and Transdisciplinary Research Initiatives, Osaka 560-8531, Japan
11 JST, PRESTO, Saitama 332-0012, Japan
12 Center on Frontiers of Computing Studies, Department of Computer Science, Peking University, Beijing 100871, China
13 Stanford Institute for Theoretical Physics, Stanford University, Stanford, California 94305, USA
arXiv:2012.09265v1 [quant-ph] 16 Dec 2020
Applications such as simulating large quantum systems or solving large-scale linear algebra problems are immensely challenging for classical computers due to their extremely high computational cost. Quantum computers promise to unlock these applications, although fault-tolerant quantum computers will likely not be available for several years. Currently available quantum devices have serious constraints, including limited qubit numbers and noise processes that limit circuit depth. Variational Quantum Algorithms (VQAs), which employ a classical optimizer to train a parametrized quantum circuit, have emerged as a leading strategy to address these constraints. VQAs have now been proposed for essentially all applications that researchers have envisioned for quantum computers, and they appear to be the best hope for obtaining quantum advantage. Nevertheless, challenges remain, including the trainability, accuracy, and efficiency of VQAs. In this review article we present an overview of the field of VQAs. Furthermore, we discuss strategies to overcome their challenges as well as the exciting prospects for using them as a means to obtain quantum advantage.
FIG. 1. Schematic diagram of a Variational Quantum Algorithm (VQA). The inputs to a VQA are: a cost function
C(θ) which encodes the solution to the problem, an ansatz whose parameters are trained to minimize the cost, and (possibly) a
set of training data used during the optimization. At each iteration of the loop one employs a quantum computer to efficiently
estimate the cost (or its gradients). This information is fed into a classical computer that leverages the power of optimizers
to navigate the cost landscape and solve the optimization problem in Eq. (1). Once a termination condition is met, the VQA
outputs an estimate of the solution to the problem. The form of the output depends on the precise task at hand. In the figure
are indicated some of the most common types of output.
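The hybrid loop described in the caption of Fig. 1 can be illustrated with a small classical simulation. The sketch below is only a toy under stated assumptions: a hypothetical one-qubit ansatz R_y(θ)|0⟩, the cost C(θ) = ⟨ψ(θ)|Z|ψ(θ)⟩, NumPy standing in for the quantum computer, and plain gradient descent as the classical optimizer. The names `ansatz_state`, `cost`, and `run_vqa` are illustrative and do not come from the review.

```python
import numpy as np

Z = np.array([[1.0, 0.0], [0.0, -1.0]])

def ansatz_state(theta):
    """Toy ansatz (assumption): R_y(theta)|0> = (cos(theta/2), sin(theta/2))."""
    return np.array([np.cos(theta / 2.0), np.sin(theta / 2.0)])

def cost(theta):
    """C(theta) = <psi|Z|psi>. On hardware this would be estimated from measurements."""
    psi = ansatz_state(theta)
    return float(psi @ Z @ psi)

def run_vqa(theta0=0.3, learning_rate=0.4, tol=1e-10, max_iters=1000):
    """Hybrid loop: evaluate the cost, update the parameter classically, repeat."""
    theta = theta0
    for _ in range(max_iters):
        # Gradient from two shifted cost evaluations (exact for this one-parameter ansatz).
        grad = 0.5 * (cost(theta + np.pi / 2) - cost(theta - np.pi / 2))
        if abs(grad) < tol:  # termination condition
            break
        theta -= learning_rate * grad  # classical optimizer step
    return theta, cost(theta)

theta_opt, c_opt = run_vqa()
print(f"theta* = {theta_opt:.6f}, C(theta*) = {c_opt:.6f}")
```

In this toy problem the cost is cos θ, so the loop should settle near θ = π, where the state is |1⟩ and the cost reaches its minimum. A real VQA would replace `cost` with measurement statistics from a device and typically use a more robust optimizer.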
VQAs are arguably the natural quantum analog of the highly successful machine-learning methods employed in classical computing, such as neural networks. Moreover, VQAs leverage the toolbox of classical optimization, as VQAs employ parametrized quantum circuits to be run on the quantum computer, and then outsource the parameter optimization to a classical optimizer. This has the added advantage of keeping the quantum circuit depth shallow and hence mitigating noise, in contrast to quantum algorithms developed for the fault-tolerant era.

VQAs have now been proposed for a plethora of applications, covering essentially all of the applications that researchers have envisioned for quantum computers. While they may be the key to obtaining near-term quantum advantage, VQAs still face important challenges, including their trainability, accuracy, and efficiency. In this review, we discuss the exciting prospects for VQAs, and we highlight the challenges that must be overcome to obtain the ultimate goal of quantum advantage.

The structure of this review is as follows. In Section II we introduce the basic framework behind VQAs. Therein we present a general description of cost functions, and we discuss some of the most widely used ansatze and optimizers. Then, in Section III we present different applications for VQAs ranging from finding ground and excited states, quantum simulation, optimization, and machine learning, among others. Section IV contains a discussion of the main challenges (and their potential solutions) for VQAs. These encompass trainability problems such as barren plateaus, the effect of hardware noise, and error mitigation techniques. In Section V we discuss opportunities where VQAs could be employed for obtaining quantum advantage in the near-term. Such applications include using VQAs to solve problems in quantum chemistry, nuclear and particle physics, and for optimization and machine learning. Finally, Section VI contains our final discussions and outlooks.

II. BASIC CONCEPTS AND TOOLS

One of the main advantages of Variational Quantum Algorithms (VQAs) is that they provide a general framework that can be used to solve a wide array of problems. While this versatility translates into different algorithmic structures with varying levels of complexity, there are basic elements that most (if not all) VQAs share in common. In this section we review the building blocks of VQAs in the hope that these can be used as blueprints for the development of novel algorithms.

Let us start by considering a task one wishes to solve. This implies having access to a description of the problem, and also possibly to a set of training data. As schematically shown in Fig. 1, the first step to developing a VQA is to define a cost (or loss) function $C$ which encodes the solution to the problem. One then proposes an ansatz, i.e., a quantum operation depending on a set of continuous or discrete parameters $\theta$ that can be optimized (see below for a more in-depth discussion of ansatze). This ansatz is then trained (with data from the training set) in a hybrid quantum-classical loop to solve the optimization task

$$\theta^{*} = \arg\min_{\theta} C(\theta) . \tag{1}$$

The trademark of VQAs is that they employ a quantum computer to estimate the cost function $C(\theta)$ (or its gradient) while leveraging the power of classical optimizers to train the parameters $\theta$. In what follows, we provide
A. Cost function
metric matrix) to a parametrized quantum state $|\psi(\theta)\rangle$ with the goal of obtaining a more accurate result by optimizing together $J$ and $\theta$ [56].

Ansatz for mixed states. Since mixed states play an important role in many applications, e.g. systems at finite temperature, several ansatze have been developed to construct a mixed state $\rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|$ of $n$ qubits. A first approach (which comes at the cost of requiring up to $2n$ qubits) is based on preparing a pure state that has $\rho$ as a reduced state in some subsystem of qubits. Refs. [52, 57] have proposed a method to variationally obtain a purification $|\psi\rangle = \sum_i \sqrt{p_i}\,|\psi_i\rangle|\phi_i\rangle$ of $\rho$, while [58] introduced a method to construct a state $|\rho\rangle = \frac{1}{c}\sum_i p_i |\psi_i\rangle|\psi_i\rangle$ with normalization $c = \sum_i p_i^2$. Alternatively, one can also train a probability distribution $\{p_i(\phi)\}$ and a set of states $\{|\psi_i(\theta_i)\rangle\}$ to construct $\rho$ as the statistical ensemble $\rho(\phi, \{\theta_i\}) = \sum_i p_i(\phi)|\psi_i(\theta_i)\rangle\langle\psi_i(\theta_i)|$. Ref. [57] proposed to use a simple product distribution based on physical insights, while a more general proposal for energy-based models was introduced in [59]. More recently, there has been a proposal to generate mixed states which uses the autoregressive model [60].

Ansatz expressibility. Given the wide range of ansatze one can employ, a relevant question is whether a given architecture can prepare a target state by optimizing its parameters. In this sense, there are different ways to judge the quality of an ansatz [61] by considering two different notions: the expressibility and the entangling capability of an ansatz. An ansatz is expressible if the circuit can be used to uniformly explore the entire space of quantum states. Thus one way to quantify the expressibility of an ansatz $U(\theta)$ is to compare the distribution of states obtained from $U(\theta)$ to the maximally expressive uniform (Haar) distribution of states $U_{\rm Haar}$. Motivated by this line of thought, the expressibility of a circuit is measured by [61] $\|A^{(t)}\|_2$, where

$$A^{(t)}(U) := \int dU_{\rm Haar}\, U_{\rm Haar}^{\otimes t}\,|0\rangle\langle 0|\,(U_{\rm Haar}^{\dagger})^{\otimes t} - \int dU\, U^{\otimes t}\,|0\rangle\langle 0|\,(U^{\dagger})^{\otimes t} . \tag{5}$$

well [61], and the expressibility of different ansatze is investigated further in [62]. We remark that Ref. [61] also introduced a measure of entangling capability for ansatze, which quantifies the average entanglement of states produced from randomly sampling the circuit parameters $\theta$.

C. Gradients

Once the cost function and ansatz have been defined, the next step is to train the parameters $\theta$ and solve the optimization problem of Eq. (1). It is known that for many optimization tasks using information in the cost function gradient (or in higher-order derivatives) can help in speeding up and guaranteeing the convergence of the optimizer. One of the main advantages of many VQAs is that, as discussed below, one can analytically evaluate the cost function gradient.

Parameter-shift rule. Let us consider for simplicity a cost function of the form (2) with $f_k(x) = x$, and let $\theta_l$ be the $l$-th element in $\theta$ which parametrizes a unitary $e^{i\theta_l\sigma_l}$ in the ansatz. Here, $\sigma_l$ is a Pauli operator. Surprisingly, there is a hardware-friendly protocol to evaluate the partial derivative of $C(\theta)$ with respect to $\theta_l$, often referred to as the parameter-shift rule [63–66]. Explicitly, the parameter-shift rule states that the equality

$$\frac{\partial C}{\partial\theta_l} = \sum_k \frac{1}{2\sin\alpha}\Big( \mathrm{Tr}\big[O_k U^{\dagger}(\theta_{+})\rho_k U(\theta_{+})\big] - \mathrm{Tr}\big[O_k U^{\dagger}(\theta_{-})\rho_k U(\theta_{-})\big] \Big) , \tag{6}$$

with $\theta_{\pm} = \theta \pm \alpha e_l$, holds for any real number $\alpha$. Here $e_l$ is a vector having 1 as its $l$-th element and 0 otherwise. Equation (6) shows that one can evaluate the gradient by shifting the $l$-th parameter by some amount $\alpha$. Note that the accuracy of the evaluation depends on the coefficient $1/(2\sin\alpha)$ since each of the $\pm\alpha$ terms is evaluated by sampling $O_k$. This accuracy is maximized at $\alpha = \pi/4$, since $1/\sin\alpha$ is minimized at this point. While the parameter-shift rule might resemble a naive finite difference, it evaluates the analytic gradient of the parameter by virtue of the coefficient $1/\sin\alpha$. A detailed comparison between the parameter-shift rule and the finite difference can be found in Ref. [67]. Finally, we remark that the gradient for more general $f_k(x)$ can be obtained from (6) by using the chain rule.

Other derivatives. Higher-order derivatives of the cost function can be evaluated by straightforward extensions of the parameter-shift rule. For example, the second derivative for the previous example can be written as

$$\frac{\partial^2 C}{\partial\theta_l^2} = \sum_k \frac{1}{4\sin^2\alpha}\Big( \mathrm{Tr}\big[O_k U^{\dagger}(\theta + 2\alpha e_l)\rho_k U(\theta + 2\alpha e_l)\big] + \mathrm{Tr}\big[O_k U^{\dagger}(\theta - 2\alpha e_l)\rho_k U(\theta - 2\alpha e_l)\big] - 2\,\mathrm{Tr}\big[O_k U^{\dagger}(\theta)\rho_k U(\theta)\big] \Big) ,$$

order ones such as $\frac{\partial^2 C}{\partial\theta_l\,\partial\theta_{l'}}$ or $\frac{\partial^3 C}{\partial\theta_l^3}$ can be obtained similarly. Explicit formulas can be found in [67, 68]. These observations relate to the fact that the cost function can be expanded into a trigonometric series that admits a classically efficient, analytical approximation around any reference point. One can thus infer a classical model of the cost function, and minimise it, to offload more work from the quantum processor to the classical supervising system [69, 70].

Other types of derivatives of the parametrized quantum state not directly related to the cost function, such as a metric tensor of a state $\frac{\partial\langle\psi(\theta)|}{\partial\theta_l}\frac{\partial|\psi(\theta)\rangle}{\partial\theta_{l'}}$ (with $|\psi(\theta)\rangle = U(\theta)|\psi_0\rangle$ for some initial state $|\psi_0\rangle$), are sometimes used in sophisticated optimization algorithms [71–73] (see Section II D) and variational quantum simulation [74–76]
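The parameter-shift rule discussed in this section can be checked numerically on a small example. The sketch below is an illustration under explicit assumptions: each parametrized gate is taken to be a rotation of the form exp(-iθσ/2), the normalization for which the prefactor 1/(2 sin α) applies (other normalizations of the generator rescale the shift and the prefactor), the circuit is a hypothetical one-qubit ansatz R_x(θ_1) R_y(θ_0)|0⟩, and the observable Z + 0.5 X is arbitrary. All function names are illustrative.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def rot(pauli, theta):
    """exp(-i*theta*pauli/2) for a Pauli matrix (assumed gate normalization)."""
    return np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * pauli

def cost(thetas, observable=Z + 0.5 * X):
    """C(theta) = <0|U^dag O U|0> with U = R_x(theta_1) R_y(theta_0)."""
    psi = rot(X, thetas[1]) @ rot(Y, thetas[0]) @ np.array([1, 0], dtype=complex)
    return float(np.real(np.vdot(psi, observable @ psi)))

def parameter_shift_grad(thetas, l, alpha):
    """[C(theta + alpha e_l) - C(theta - alpha e_l)] / (2 sin alpha)."""
    shift = np.zeros_like(thetas)
    shift[l] = alpha
    return (cost(thetas + shift) - cost(thetas - shift)) / (2 * np.sin(alpha))

def finite_difference_grad(thetas, l, h=1e-6):
    """Central finite difference, used here only as an independent reference."""
    shift = np.zeros_like(thetas)
    shift[l] = h
    return (cost(thetas + shift) - cost(thetas - shift)) / (2 * h)

thetas = np.array([0.7, -1.3])
for l in (0, 1):
    for alpha in (0.3, np.pi / 4, np.pi / 2):
        ps = parameter_shift_grad(thetas, l, alpha)
        fd = finite_difference_grad(thetas, l)
        print(f"l={l}, alpha={alpha:.3f}: shift={ps:+.8f}, finite diff={fd:+.8f}")
```

Under these assumptions the two estimates should agree for every nonzero shift tried, which is the sense in which the rule is analytic rather than an approximation. On hardware, larger shifts are attractive because the two cost values differ by more than the sampling noise.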
weights are in decreasing order, each state $U(\theta^{*})|\varphi_i\rangle$ corresponds to an eigenstate of the (non-degenerate) Hamiltonian with increasing energies. On the other hand, the non-weighted subspace VQE makes use of the cost function $C_1(\theta) = \sum_{i=0}^{m}\langle\varphi_i|U^{\dagger}(\theta)HU(\theta)|\varphi_i\rangle$. Minimizing $C_1$ again gives the subspace of lowest eigenstates. Since each state $U(\theta^{*})|\varphi_i\rangle$ is in a superposition of the eigenstates, one needs to further optimize a second cost $C_2(\theta^{*},\phi) = \langle\varphi_i|V^{\dagger}(\phi)U^{\dagger}(\theta^{*})HU(\theta^{*})V(\phi)|\varphi_i\rangle$ over parameters $\phi$ to rotate each state $U(\theta^{*})V(\phi)|\varphi_i\rangle$ to an eigenstate.

Multistate contracted VQE. The multistate contracted VQE [45] can be regarded as a midway point between subspace expansion and subspace VQE. It first obtains the lowest energy subspace $\{U(\theta^{*})|\varphi_i\rangle\}_{i=0}^{m}$ by optimizing $C_1(\theta)$ as in the non-weighted subspace VQE. Instead of optimizing an additional unitary, the multistate contracted VQE approximates each eigenstate as $|E\rangle = \sum_i \alpha_i U(\theta^{*})|\varphi_i\rangle$ with coefficients $\alpha_i$ which are obtained by solving a generalised eigenvalue problem similar to subspace expansion with $S = I$.

Adiabatically assisted VQE. Quantum adiabatic optimization seeks to find a solution to an optimization problem by slowly transforming the ground state of a simple problem to that of a complex problem. These methods have a close connection with classical homotopy schemes that are used to find the solutions of classical problems in optimization [95]. In light of this connection, the adiabatically assisted VQE [96] uses a cost function $C(\theta) = \langle\psi(\theta)|H(s)|\psi(\theta)\rangle$ where $H(s) = (1-s)H_0 + sH_P$, and where we recall that $|\psi(\theta)\rangle = U(\theta)|\psi_0\rangle$. Here $H_P$ is the problem Hamiltonian of interest and $H_0$ is a simple Hamiltonian whose known ground state is taken as the initial state $|\psi_0\rangle$. During the parameter optimization, one slowly changes $s$ from 0 to 1. The idea of Hamiltonian transformation has been used as a type of ansatz to obtain solutions near the more challenging endpoint [97].

Accelerated VQE. As previously mentioned, while Quantum Phase Estimation (QPE) provides a means to estimate eigenenergies in the fault-tolerant era, it is not implementable in the near-term. However, one of the positive features of this algorithm is that a precision $\epsilon$ can be obtained with a number of measurements which scales as $O(\log(1/\epsilon))$. This is in contrast with VQE, which requires $O(1/\epsilon^2)$ measurements for the same precision. This motivated the Accelerated VQE algorithm, which interpolates between the VQE and QPE algorithms [98]. This involves taking the VQE algorithm and replacing the measurement process with a tunable version of QPE called $\alpha$-QPE. This allows the measurement cost to interpolate between that of VQE and QPE.

B. Dynamical quantum simulation

Apart from static eigenstate problems, VQAs can also be applied to simulate the dynamical evolution of a quantum system. Conventional quantum Hamiltonian simulation algorithms, such as the Trotter-Suzuki product formula, generally discretize time into small time steps and simulate each time evolution with a quantum circuit. Therefore, the circuit depth generally increases polynomially with the system size and simulated time. Given the noise inherent in NISQ devices, the accumulated hardware errors for such deep quantum circuits can prove prohibitive. On the other hand, VQAs for dynamical quantum simulation only use a shallow depth circuit, significantly reducing the impact of hardware noise.

Iterative approach. Instead of directly implementing the unitary evolution described by the Schrödinger equation $\frac{d|\psi(t)\rangle}{dt} = -iH|\psi(t)\rangle$, iterative variational algorithms [74, 75] consider trial states $|\psi(\theta)\rangle$ and map the evolution of the state to the evolution of the parameters $\theta$. By iteratively updating the parameters, the quantum state is effectively updated and hence evolved. Specifically, by using variational principles, e.g., McLachlan's principle to solve the minimization $\min_{\dot\theta}\,\delta\big\|\big(\frac{d}{dt} + iH\big)|\psi(\theta)\rangle\big\|$, one obtains a linear equation for the parameters as $M\cdot\dot\theta = V$. Here $\||\psi\rangle\| = \sqrt{\langle\psi|\psi\rangle}$, $\dot\theta = \frac{d\theta}{dt}$, $M_{i,j} = \mathrm{Re}\big(\partial_i\langle\psi(\theta)|\,\partial_j|\psi(\theta)\rangle\big)$, $V_i = \mathrm{Im}\big(\langle\psi(\theta)|H\,\partial_i|\psi(\theta)\rangle\big)$, and $\partial_i|\psi(\theta)\rangle = \frac{\partial|\psi(\theta)\rangle}{\partial\theta_i}$. Each element of $M$ and $V$ can be efficiently measured with a modified Hadamard test circuit. By solving the linear equation, one can iteratively update the parameters from $\theta$ to $\theta + \dot\theta\,\Delta t$ with a small time step $\Delta t$. Similar variational algorithms could be applied for simulating the Wick-rotated Schrödinger equation of imaginary time evolution [71] and general first-order derivative equations with non-Hermitian Hamiltonians [76]. A systematic comparison between different variational principles for different problems can be found in Ref. [75]. Recent works also extend the algorithms to use adaptive ansatz to reduce the circuit depth [99, 100].

Subspace approach. The weighted subspace VQE [94] provides an alternative way for simulating dynamics in the subspace of the low energy eigenstates [101]. Here one uses the weighted subspace VQE unitary operator $U(\theta^{*})$ that maps computational basis states $\{|\varphi_j\rangle\}$ to the low energy eigenstates $\{|E_j\rangle\}$ as $U(\theta^{*})|\varphi_j\rangle \approx e^{i\delta_j}|E_j\rangle$, with $\delta_j$ an unknown phase. Considering the low energy subspace, the time evolution operator can be approximated as $\exp(-iHt) \approx U(\theta^{*})T(t)U^{\dagger}(\theta^{*})$ with $T(t) = \sum_j \exp(-iE_j t)|\psi_j\rangle\langle\psi_j|$. The procedure could intuitively be understood as (1) rotating the state to the computational basis with $U^{\dagger}(\theta^{*})$, (2) evolving the state with $T(t)$, and (3) rotating the basis back with $U(\theta^{*})$. Therefore, for any state $|\psi(0)\rangle = \sum_j \alpha_j|E_j\rangle$ that is a superposition of the low energy eigenstates, its time evolution can be simulated as $|\psi(t)\rangle = U(\theta^{*})T(t)U^{\dagger}(\theta^{*})|\psi(0)\rangle$. Since the time evolution is directly implemented via $T(t)$, it does not involve iterative parameter update and the circuit depth is independent of the simulation time.

Variational fast forwarding. Similar to the subspace approach, variational fast forwarding [102] simulates the time evolution operation $\exp(-iHt)$ as $U(\theta^{*})T(E,t)U^{\dagger}(\theta^{*})$ with
$T(E,t) = \sum_j \exp(-iE_j t)|\psi_j\rangle\langle\psi_j|$ a trainable diagonal matrix and $U(\theta^{*})$ a trainable unitary that maps between the eigenstates of $H$ and the computational basis. While the subspace approach obtains $T(E,t)$ and $U(\theta^{*})$ via weighted subspace VQE, variational fast forwarding optimises a cost given by the fidelity between $e^{-iH\delta t}$ and $U(\theta^{*})T(E,\delta t)U^{\dagger}(\theta^{*})$ for a small time step $\delta t$ via the so-called local Hilbert-Schmidt test [103]. Then, according to the Trotter-Suzuki product formula, one has $e^{-iHT} = (e^{-iH\delta t})^{M} \approx U(\theta^{*})\,(T(E,\delta t))^{M}\,U^{\dagger}(\theta^{*})$ for a total time $T = M\delta t$. Again, since the time evolution is implemented in $T(E,t)$, one can simulate the evolution for arbitrary time $t$ with the same circuit structure. As shown in [104], the ensuing Trotter error of this approach can be removed by diagonalizing instead the Hamiltonian $H$ that generates the evolution.

Simulating open systems. The VQA framework can also be extended to simulate dynamical evolution of open quantum systems. Suppose that the dynamics of the system is described by $\frac{d\rho}{dt} = \mathcal{L}(\rho)$, where $\mathcal{L}$ denotes a super-operator for a dissipative process. Similarly to the iterative approach for pure states [74], one maps the evolution of the mixed state to one of the variational parameters via McLachlan's principle, which solves the minimization $\min_{\dot\theta}\big\|\big(\frac{d}{dt} - \mathcal{L}\big)\rho(\theta)\big\|$. The solution determines the evolution of the parameters $M\cdot\dot\theta = V$ with $M_{i,j} = \mathrm{Tr}\big[\partial_i\rho(\theta)^{\dagger}\,\partial_j\rho(\theta)\big]$, $V_i = \mathrm{Tr}\big[\partial_i\rho(\theta)^{\dagger}\,\mathcal{L}(\rho)\big]$ and $\partial_i\rho(\theta) = \frac{\partial\rho(\theta)}{\partial\theta_i}$. Each term of $M$ and $V$ can be computed by applying the SWAP test circuit on two copies of the purified states [75]. Here, to simulate an open system of $n$ qubits, one needs to apply operations on $4n + 1$ qubits. An alternative approach [76] which reduces this overhead is to simulate the stochastic Schrödinger equation, which unravels the evolution of the density matrix into trajectories of pure states. Each pure state trajectory experiences continuous damping effect and jump processes due to the noise operators, both of which can be efficiently simulated. Since this method only controls a single copy of the pure state, it requires only $n + 1$ qubits.

FIG. 4. Quantum Approximate Optimization Algorithm (QAOA). a) Schematic representation of the Trotterized adiabatic transformation in the ansatz. The algorithm only loosely follows the evolution of the ground state of $H(t) = (1-t)H_M + tH_P$ for every $t \in [0,1]$, as one is interested in making the final state close to the ground state of $H_P$. b) Problem Hamiltonian $H_P$ and graph $\langle jk\rangle$ for a Max-Cut task. Each node in the graph (circle) represents a spin. Edges connecting two nodes indicate an interaction $\sigma_j^z\sigma_k^z$ in $H_P$, with $\sigma_k^z$ the Pauli $z$ operator on spin $k$. The solution is encoded in the ground state of $H_P$ where some spins are pointing up (green) while others point down (blue).

mizing a given classical objective function $L(s)$. QAOA encodes $L(s)$ in a quantum Hamiltonian $H_P$ by promoting each classical variable $s_j$ to a Pauli spin-1/2 operator $\sigma_j^z$, so that the goal is to prepare the ground state of $H_P$. Motivated by the quantum adiabatic algorithm, QAOA replaces adiabatic evolution with $p$ rounds of alternating time propagation between the problem Hamiltonian $H_P$ and appropriately chosen mixer Hamiltonian $H_M$, see Fig. 4. As discussed in the quantum alternating operator ansatz of Section II B, the evolution time intervals are treated as variational parameters and are optimized classically. Hence, defining $\theta = \{\gamma, \beta\}$, the cost function is $C(\gamma,\beta) = \langle\psi_p(\gamma,\beta)|H_P|\psi_p(\gamma,\beta)\rangle$ with

$$|\psi_p(\gamma,\beta)\rangle = e^{-i\beta_p H_M} e^{-i\gamma_p H_P} \cdots e^{-i\beta_1 H_M} e^{-i\gamma_1 H_P}|\psi_0\rangle , \tag{7}$$
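The alternating structure of Eq. (7) can be made concrete with a small state-vector simulation. The sketch below is a toy under stated assumptions: a hypothetical 4-node ring graph, the Max-Cut Hamiltonian H_P = Σ_⟨jk⟩ σ_j^z σ_k^z, the standard transverse-field mixer H_M = Σ_j σ_j^x, the initial state |+⟩^⊗n, a single layer (p = 1), and a coarse grid search in place of a real classical optimizer. Names such as `qaoa_state` and `qaoa_cost` are illustrative.

```python
import itertools
from functools import reduce
import numpy as np

edges = [(0, 1), (1, 2), (2, 3), (3, 0)]  # hypothetical 4-node ring graph
n = 4

# Diagonal of H_P = sum_<jk> Z_j Z_k in the computational basis (bit 0 -> Z = +1).
spins = np.array([[1 - 2 * b for b in bits] for bits in itertools.product([0, 1], repeat=n)])
hp_diag = np.array([sum(s[j] * s[k] for j, k in edges) for s in spins], dtype=float)

def mixer_unitary(beta):
    """exp(-i*beta*H_M) with H_M = sum_j X_j, built as a tensor product of one-qubit gates."""
    rx = np.array([[np.cos(beta), -1j * np.sin(beta)],
                   [-1j * np.sin(beta), np.cos(beta)]])
    return reduce(np.kron, [rx] * n)

def qaoa_state(gammas, betas):
    """Apply exp(-i*gamma*H_P) then exp(-i*beta*H_M) for each layer, starting from |+>^n."""
    psi = np.full(2 ** n, 2 ** (-n / 2), dtype=complex)
    for gamma, beta in zip(gammas, betas):
        psi = np.exp(-1j * gamma * hp_diag) * psi  # H_P is diagonal, so this is a phase
        psi = mixer_unitary(beta) @ psi
    return psi

def qaoa_cost(gammas, betas):
    """C(gamma, beta) = <psi_p|H_P|psi_p>."""
    psi = qaoa_state(gammas, betas)
    return float(np.real(np.vdot(psi, hp_diag * psi)))

# Coarse grid search over the two p = 1 parameters (stand-in for a classical optimizer).
grid = np.linspace(0.0, np.pi, 41)
best_cost, best_gamma, best_beta = min((qaoa_cost([g], [b]), g, b) for g in grid for b in grid)
ground_energy = hp_diag.min()
print(f"best p=1 cost: {best_cost:.4f} (ground energy {ground_energy:.1f})")
```

A single layer lowers the energy below that of the initial state but does not in general reach the ground energy. Passing longer `gammas` and `betas` lists adds layers, at the price of a larger parameter search.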
there exist quantum algorithms for the fault-tolerant era aimed at these tasks, the goal of VQAs is to have heuristic scalings comparable to the provable scaling of these non-near-term algorithms while keeping the algorithm requirements compatible with the NISQ era.

Linear systems. Solving systems of linear equations has wide-ranging applications in science and engineering. Quantum computers offer the possibility of exponential speedup for this task. Specifically, for an $N \times N$ linear system $Ax = b$, one considers the Quantum Linear Systems Problem (QLSP) where the task is to prepare a normalized state $|x\rangle$ such that $A|x\rangle \propto |b\rangle$, where $|b\rangle = b/\|b\|$ is also a normalized state. The classical algorithmic complexity for this task scales polynomially in the dimension $N$, whereas the now-famous HHL quantum algorithm [4] has a complexity that scales logarithmically in $N$, with some scaling improvements having been proposed [113–116]. These pioneering quantum algorithms, however, will be difficult to implement in the near term due to the enormous circuit depth requirements [117].

This has motivated researchers to propose VQAs for the QLSP [118–120]. A common feature in these algorithms is the assumption that $A = \sum_k c_k A_k$ is given as a linear combination of unitaries $A_k$ that can be efficiently implemented. One can then construct a Hamiltonian whose ground state is the solution to the QLSP and apply a variational approach to minimize the cost $C(\theta) = \langle\psi(\theta)|H_G|\psi(\theta)\rangle$. Refs. [118–120] considered the Hamiltonian $H_G = A^{\dagger}(1 - |b\rangle\langle b|)A$ (which was also considered outside of the variational setting [121]). It is worth noting that the aforementioned cost can have gradients that vanish exponentially in $n$ (i.e., a so-called barren plateau in the cost landscape, see Section IV). This problem can be mitigated by considering a local Hamiltonian with the same ground state [118] or by using a hybrid ansatz strategy [120] where $|\psi(\theta)\rangle = \sum_i \alpha_i|\psi_i(\theta_1)\rangle$ with $\alpha_i$ being variational parameters. Numerical heuristics for random (sparse) linear systems showed efficient scaling in $N$, $\kappa$, and $\epsilon$ [118]. In particular, the heuristic logarithmic scaling in $N$ suggests that VQAs could potentially give an exponential speedup, analogous to HHL, for the QLSP.

Matrix-vector multiplication. Another related problem is to realize matrix-vector multiplication, i.e., to prepare a normalized state $|x\rangle$ such that $|x\rangle \propto A|b\rangle$ with normalized vector $|b\rangle$. When $A = 1 - iH\delta t$, the problem becomes the task of Hamiltonian simulation. Similar to solving the QLSP, one constructs the Hamiltonian $H_M = 1 - A|b\rangle\langle b|A^{\dagger}/\|A|b\rangle\|^2$, whose ground state is $|x\rangle$ with zero energy [119]. Here $\|A|b\rangle\| = \sqrt{\langle b|A^{\dagger}A|b\rangle}$ is the Euclidean norm. Given an approximate solution $|\psi(\theta^{*})\rangle$, one can lower bound the fidelity to the exact solution as $|\langle\psi(\theta^{*})|x\rangle|^2 \geq 1 - \langle\psi(\theta^{*})|H_M|\psi(\theta^{*})\rangle$, thus verifying the solution's correctness whenever the cost function is small.

Non-linear equations. Non-linear equations are important to various fields, especially in the form of non-linear partial differential equations. However, mapping such equations onto quantum computers requires careful thought since the underlying mathematics of quantum mechanics is linear. To address this, a VQA for such non-linear problems was proposed in [122]. The approach was illustrated for the time-independent non-linear Schrödinger equation, where the cost function is the total energy (sum of potential, kinetic, and interaction energies), and where the space was discretized into a finite grid. By employing multiple copies of variational quantum states in the cost-evaluation circuit, this VQA can compute non-linear functions.

Factoring. While Shor's algorithm for factoring is very well known, the time horizon for its large-scale implementation is long. Hence, a VQA for factoring as a potential near-term alternative was introduced in [123]. This proposal relies on the fact that factoring can be formulated as an optimization problem, and in particular, as a ground state problem for a classical Ising model. The authors employed the QAOA to variationally search for the ground state. Their numerical heuristics suggest that a linear number of layers in the ansatz ($p \in O(n)$) leads to a large overlap with the ground state.

Principal Component Analysis. An important primitive in data science is reducing the dimensionality of one's data with Principal Component Analysis (PCA). This involves diagonalizing the covariance matrix for a data set and selecting the eigenvectors with the largest eigenvalues as the key features of the data. Because the covariance matrix is positive semi-definite, one can store it in a density matrix, i.e., in a quantum state, and then any diagonalization method for quantum states can be used for PCA. This idea was exploited in [124] to propose a quantum algorithm for PCA. However, quantum phase estimation and density matrix exponentiation were subroutines in this algorithm, making it non-implementable in the NISQ era. To potentially make this application more near term, Ref. [125] proposed a variational quantum state diagonalization algorithm, where the cost function $C(\theta)$ quantifies the Hilbert-Schmidt distance between the state $\tilde\rho(\theta) = U(\theta)\rho U(\theta)^{\dagger}$ and $\mathcal{Z}(\tilde\rho(\theta))$, and where $\mathcal{Z}$ is the dephasing channel. While this VQA outputs estimates of all the eigenvalues and eigenvectors of $\rho$, it comes at the cost of requiring $2n$ qubits for an $n$ qubit state. This qubit requirement can be reduced with the VQA of [97], which requires only $n$ qubits. Here one exploits the connection between diagonalization and majorization to define a cost function of the form $C(\theta) = \mathrm{Tr}[\tilde\rho(\theta)H]$ where $H$ is a non-degenerate Hamiltonian. Due to Schur concavity, this cost function is minimized when $\tilde\rho(\theta)$ is diagonalized.

E. Compilation and Unsampling

A very natural task that NISQ devices can potentially accelerate is the compiling of quantum programs. In quantum compiling, the goal is to transform a given unitary $V$ into a native gate sequence $U(\theta)$ with an optimally short circuit depth. Quantum compiling plays
a major role in error mitigation, as errors increase with operations with r ancillary qubits. By sequentially ap-
circuit depth. Quantum compiling is a challenging prob- plying encoding, recovery, and decoding on the input
lem for classical computers to perform optimally, due state, one obtains an output ρout = W (θ 1 )V (θ 1 )(ψ ⊗
to the exponential complexity of classically simulating |0ih0|⊗n−k+r )V (θ 1 )† W (θ 1 )† . Projecting the n − k an-
quantum dynamics. Hence, several VQAs have been in- cillary qubits back to |0ih0| and discarding the last
troduced that can potentially be used to accelerate this r ancillary qubits, one finds a quantum channel ρ =
task [103, 126–129]. These algorithms can be catego- E(θ 1 , θ 2 )(ψ) on the input state ψ.R The target of QVEC-
rized as either Full Unitary Matrix Compiling (FUMC) TOR is to maximize the fidelity ψ dψF (ψ, E(θ 1 , θ 2 )(ψ))
or Fixed Input State Compiling (FISC), which respec- between the output ρ and the input ψ averaged overall all
tively aim to compile the target unitary V over all input ψ or any US that forms a unitary 2-design. The solution
states or for a particular input state. In [103] a VQA will give the quantum circuit that maximally protects the
for FUMC was presented, which employs cost functions input state. Numerical simulations showed that QVEC-
closely related to entanglement fidelities to quantify the TOR can find quantum codes that outperform existing
distance between V and U (θ). The proposal in [126] ones [130].
also treats the FUMC case, but with an alternative ap- Instead of discovering new device-tailored QEC codes,
proach to quantifying the cost using the average gate fi- Ref. [131] considered how to compile conventional QEC
delity, averaged over many input and output states. The codes into a given quantum hardware with specific noise.
FISC case was treated in [127], where the problem was Suppose one aims to implement the logical state |ψiL =
reformulated as a ground state energy task, hence mak- α|0iL + β|1iL with logical state basis {|0iL , |1iL }. Note
ing the connection with VQE. The connection with VQE that |ψiL is the ground state of the stabilizers Gk as
was also generalized to FUMC [128], showing that varia- well as the logical operator P = |ψiL hψ|L − |ψ ⊥ iL hψ ⊥ |L
tional quantum compiling, in general, is a special kind of with orthogonal state |ψ ⊥ iL . Then one can Pconstruct
VQE problem. Ref. [129] introduced and experimentally a frustration-free Hamiltonian H = −a0 P − k≥1 ak Gk
implemented a compiling scheme which can be thought
of as FISC, although the architecture here is focused on P with |ψiL the ground
with positive coefficients a0 , ak , and
state with energy EG = −(a0 + k≥1 ak ). One then em-
the application of unpreparing a quantum state. Finally, ploys a VQA to discover the circuit that implements |ψiL
it is worth noting that both FUMC and FISC exhibit with a given hardware structure. Since the eigenstate en-
resilience to hardware noise, in that the global minimum ergies are know, the fidelity of the discovered state can be
of the cost landscape is unaffected by various types of bounded by F ≥ 1 − (E − EG )/a with the discovered en-
noise [128]. This noise resilience feature is crucial for the ergy E and a = min{a0 , ak }. Numerical studies showed
utility of variational quantum compiling for error mitiga- the encoding circuits for the five- and seven-qubit codes
tion, and we discuss this in more detail in Sec. IV C. with different noisy hardware [131].
Quantum Error Correction (QEC) protects qubits from hardware noise. Due to the large qubit
requirements of QEC schemes, their implementation is beyond NISQ device capabilities. Nevertheless,
QEC could still benefit NISQ hardware by suppressing the error to a certain extent and by combining
it with other error mitigation methods. Specifically, conventional universal approaches for
implementing QEC codes generally involve an unnecessarily long circuit that does not take into
account the hardware structure or the type of noise. Hence, two VQAs have been introduced to solve
these problems by automatically discovering or compiling a small quantum error-correcting code for
any quantum hardware and any noise.
The Variational Quantum Error Corrector (QVECTOR) was first proposed to discover a device-tailored
quantum error-correcting code for a quantum memory [130]. For any $k$-qubit input state
$|\psi\rangle = U_S|0\rangle$, prepared by a unitary $U_S$ acting on a reference state $|0\rangle$,
QVECTOR considered two parametrized circuits $V(\theta_1)$ (on $n \geq k$ qubits) and
$W(\theta_2)$ (on $n + r$ qubits), which respectively encode the input logical state into $n$
qubits with $n - k$ ancillary qubits and realize recovery
Quantum machine learning (QML) generally refers to the tasks of learning patterns with the goal of
making accurate predictions on unknown and unseen data. While providing an in-depth discussion on
the field of QML is beyond the scope of this review, we here present several QML applications for
which the VQA framework can be readily implemented. Specifically, here one learns a parametrized
quantum circuit to solve a given task [64, 132]. This connection between VQAs and (typical) QML
applications shows that the lessons learned in one field can be of great use in the other, hence
providing a close connection between these two fields.
Classifiers. The classification of data is a ubiquitous task in machine learning. Given training
data of the form $\{x^{(i)}, y^{(i)}\}$, where $x^{(i)}$ are inputs and $y^{(i)}$ labels, the goal
is to train a classifier to accurately predict the label of each input. Since a key aspect for the
success of classical neural networks is their non-linearity, one can expect this property to also
arise in a quantum classifier. As shown in [133], parametrized quantum circuits can support linear
transformations and non-linearity can be exploited from the tensor product structure of a quantum
system. More precisely, given an input-data-dependent unitary $V(x)$, the tensor product
$V(x) \otimes V'(x)$ or the product $V(x)V'(x)$ results in a non-linear function of the input data
$x$. In this sense, the unitary $V(x)$ can be used as a quantum non-linear feature map, where the
Hilbert space can be exploited as a feature space [134, 135]. Interestingly, the tensor network
structure of quantum mechanics has even inspired classical machine learning methods [136].
Here, after embedding the input data $x$ into the quantum state, a linear transformation is
performed using a parametrized quantum circuit, $U(\theta)V(x)|\psi_0\rangle$. The cost function is
then defined as the error between the true label and the expectation value of an easily measurable
observable $A$, i.e.,
$C(\theta) = \sum_i [y^{(i)} - \langle\psi_0|V^\dagger(x^{(i)})U^\dagger(\theta) A U(\theta)V(x^{(i)})|\psi_0\rangle]^2$.
This approach has been used in generalization and in classification tasks [64, 133], as well as in
an experimental demonstration of variational classification [135]. Moreover, as shown in
Refs. [134, 135], instead of using a parametrized unitary $U(\theta)$ one can estimate inner
products of quantum feature vectors $\langle\psi_0|V^\dagger(x')V(x)|\psi_0\rangle$ to perform a
kernel method. Finally, the quantum kernel trick, which means that the dimensions of the
quantum-enhanced feature space are larger than the number of data sets, has been demonstrated
experimentally by using an ensemble nuclear spin system [137].
Autoencoders. The autoencoder for data compression is an important primitive in machine learning.
Here the idea is to force information through a bottleneck while still maintaining the
recoverability of the data. As a quantum analog, Ref. [138] introduced a VQA for quantum
autoencoding, with the goal of compressing quantum data. (See [139, 140] for alternative approaches
to quantum autoencoders.) The input to the algorithm is an ensemble of pure quantum states
$\{p_\mu, |\psi_\mu\rangle\}$ on a bipartite system $AB$. The goal is then to train an ansatz
$U(\theta)$ to compress this ensemble into the $A$ subsystem, such that one can recover each state
$|\psi_\mu\rangle$ with high fidelity from subsystem $A$. The $B$ subsystem is discarded and hence
can be thought of as the “trash”. Given the close connection between data compression and
decoupling [138], the cost function is based on the overlap between the output state on $B$ and a
fixed pure state. Recently, a local version of this cost function was also proposed and was shown
to train well for large-scale problems [141]. Quantum autoencoders have seen experimental
implementation on quantum hardware [142], and will likely be an important primitive in quantum
machine learning.
Generative models. The idea of training a parametrized quantum circuit for a QML implementation can
also be applied for a generative model [143, 144], which is an unsupervised statistical learning
task with the goal of learning a probability distribution that generates a given data set. Let
$\{x^{(i)}\}_{i=1}^{D}$ be a data set sampled from a probability distribution $q(x)$. Here one
learns $q(x)$ as the parametrized probability distribution
$p_\theta(x) = |\langle x|U(\theta)|\psi_0\rangle|^2$ obtained by applying $U(\theta)$ to an input
state and measuring in the computational basis, i.e., this then corresponds to a quantum circuit
Born machine [145]. In principle one wishes to minimize the difference between the two
distributions. However, since $q(x)$ is not available, the cost function is defined by the negative
log-likelihood $C(\theta) = -\frac{1}{D}\sum_i \log(p_\theta(x^{(i)}))$. In Ref. [146], cat and
coherent thermal states were generated experimentally in this way from specific classical data
sets. In [145], an implicit generative model has been constructed by comparing the distance in the
Gaussian kernel feature space. The representation power of the generative model has been
investigated in [143]. Finally, it has been shown that quantum circuit Born machines can simulate
the restricted Boltzmann machine and perform a sampling task that is hard for a classical
computer [147].
Variational Quantum Generators. Generative Adversarial Networks (GANs) play an important role in
classical machine learning for applications such as image synthesis and molecular discovery.
Ref. [148] proposed a VQA for learning continuous distributions which is meant to be a quantum
version of GANs. Here one still considers classical data, but encoded into a quantum circuit. This
encoding is followed by a variational quantum circuit that generates quantum states, which are then
measured to produce a fake sample. This fake sample then enters either a classical discriminator or
a quantum discriminator, and the cost function is optimized to minimize the discrimination
probability with respect to real samples. The target application is to accelerate classical GANs
using quantum computers.
Quantum Neural Network architectures. Several Quantum Neural Network (QNN) architectures have been
proposed, which can be used for some of the aforementioned tasks. For instance,
Refs. [132, 149, 150] proposed perceptron-based QNNs. In these architectures each node in the
neural network represents a qubit, and their connections are given by parametrized unitaries of the
form (3) acting on the input states. On the other hand, Ref. [151] introduced a Quantum
Convolutional Neural Network (QCNN). QCNNs have been employed for error correction, image
recognition [152], and to discriminate quantum states belonging to different topological
phases [151]. Moreover, it has been shown that QCNNs and QNNs with tree tensor network
architectures do not exhibit barren plateaus [153, 154] (see Section IV A 1 for more details on
barren plateaus), potentially making them a generically trainable architecture for large-scale
implementations.
H. New frontiers
In this section we discuss an exciting application of the VQA framework where one draws on the fact
that VQAs deal with systems that are quantum mechanical. That is, many VQAs have been proposed to
understand and exploit the mathematical and physical structure of
quantum states, and quantum theory in general.
Quantum foundations. NISQ computers will likely play an important role in understanding the
foundations of quantum mechanics. In a sense, these devices offer experimental platforms to test
foundational ideas ranging from quantum gravity to quantum Darwinism. For example, the emergence of
classicality in quantum systems will soon be a computationally tractable field of study due to the
increasing size of NISQ computers. Along these lines, Ref. [155] proposed the Variational
Consistent Histories (VCH) algorithm. Consistent Histories is a formal approach to quantum
mechanics that has proven to be useful in studying the quantum-to-classical transition as well as
quantum cosmology; however, it involves computing the decoherence functional between all pairs of
histories. Since the number of histories grows exponentially in both the system size and the number
of dynamical times considered, classical numerics are intractable for this formalism. The VCH
algorithm aims to remedy this by storing the decoherence functional in a quantum density matrix and
then using a quantum device to efficiently compute a cost function that quantifies how close this
matrix is to being diagonal. Fully diagonal corresponds to the consistency condition, and hence VCH
actively searches for consistent families of histories. VCH has the potential to elucidate the
quantum-to-classical transition in quantum dynamics that otherwise could not be efficiently
studied.
Quantum information theory. Another field that will likely see renewed interest due to NISQ
computers is quantum information theory or quantum Shannon theory [156]. For example, in [138] it
was remarked that the quantum autoencoder algorithm could potentially be used to learn encodings
and achievable rates for quantum channel transmission. Another area of research is using NISQ
computers to compute key quantities in quantum information theory, such as the von Neumann entropy
or distinguishability measures like the trace distance. While it is known that these problems are
hard for general quantum states [157], Ref. [158] introduced a VQA to estimate the quantum fidelity
between an arbitrary state $\sigma$ and a low-rank state $\rho$. Moreover, in [59] a VQA was
introduced to learn modular Hamiltonians, which provides an upper bound on the von Neumann entropy
of a quantum state. Here one attempts to variationally decorrelate a quantum state by minimizing
the relative entropy to a product distribution, and hence this method is suited for states that can
be easily decorrelated.
Entanglement Spectroscopy. Characterizing entanglement is crucial for understanding condensed
matter systems, and the entanglement spectrum has proven to be useful in studying topological
order. Several VQAs have been introduced to extract the entanglement spectrum of a quantum
state [97, 125, 159]. Since the entanglement spectrum can be viewed as the principal components of
a reduced density matrix, algorithms for PCA can be used for this purpose. This includes the VQAs
discussed in Sec. III G. In addition, one can also employ the variational algorithm for quantum
singular value decomposition introduced in [159]. One can then envision using these algorithms to
characterize the entanglement (and, for example, topological order) in a ground state that was
prepared by VQE, and hence different VQAs can be used together in a complementary manner.
Quantum metrology. Quantum metrology is a research field where one seeks the optimal setup for
probing a parameter of interest with minimal shot noise. The probed parameter is typically a
magnetic field. In the absence of noise during the probing process, an analytical solution for the
optimal probe state is known; however, when general physical noise is present, an analytical
treatment is difficult. Variational-state quantum metrology variationally searches for the optimal
probe state [160–163]. For state preparation, variational quantum circuits are used in
Refs. [160, 162, 163] while optical tweezer arrays are considered in [161]. More concretely, one
prepares a probe state with variational parameters, probes the magnetic field in the presence of
physical noise, measures the quantum Fisher information (QFI) as a cost function, and updates the
parameters to maximize it. Note that since the QFI cannot be efficiently computed, an approximation
of the QFI can be heuristically found by optimizing the measurement basis, or by computing upper
and lower bounds on the QFI [163].
IV. CHALLENGES AND POTENTIAL SOLUTIONS
Despite the tremendous developments in the field of VQAs, there are still many challenges that need
to be addressed to maintain the hope of achieving quantum speedups when scaling up these near-term
architectures. Understanding the limitations of VQAs is crucial to developing strategies that can
be used to construct better algorithms, prove certain guarantees on their performance, and even to
build better quantum hardware. In this section we review some of the known challenges of VQAs and
how they can be potentially addressed. The results are presented in three main areas pertaining to
the trainability of the cost function, the efficiency of estimating the cost, and the effect of
noise in VQAs.
A. Trainability
1. Barren plateaus
The so-called Barren Plateau (BP) phenomenon in the cost function landscape has recently received
considerable attention as one of the main bottlenecks for VQAs. When a given cost function
$C(\theta)$ exhibits a BP, the magnitude of its partial derivatives will be, on average,
exponentially vanishing with the system size [164]. As shown in Fig. 5 this has the effect of the
landscape being essentially flat. Hence, in a BP one needs an exponentially
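The vanishing of partial derivatives described above can be probed numerically on small systems. The sketch below estimates the variance of one partial derivative of a cost $C(\theta) = \langle Z_0 Z_1 \rangle$ over randomly initialized circuits, using the parameter-shift rule on a plain NumPy statevector simulation. The circuit layout (random single-qubit Pauli rotations followed by a ladder of CZ gates), the choice of observable, the depth of five layers per qubit, and all function names are assumptions made for illustration and are not the construction analyzed in Ref. [164]. One would typically expect the printed variance to shrink as the number of qubits grows, but no particular values are claimed here.

```python
import numpy as np

PAULIS = [
    np.array([[0, 1], [1, 0]], dtype=complex),     # X
    np.array([[0, -1j], [1j, 0]], dtype=complex),  # Y
    np.array([[1, 0], [0, -1]], dtype=complex),    # Z
]

def rot(axis, theta):
    """Single-qubit rotation exp(-i * theta * P / 2) about Pauli axis 0, 1 or 2."""
    return np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * PAULIS[axis]

def apply_1q(state, gate, q, n):
    """Apply a 2x2 gate to qubit q of an n-qubit statevector."""
    psi = state.reshape([2] * n)
    psi = np.tensordot(gate, psi, axes=([1], [q]))
    psi = np.moveaxis(psi, 0, q)
    return psi.reshape(-1)

def apply_cz(state, q1, q2, n):
    """Apply a controlled-Z gate between qubits q1 and q2."""
    psi = state.reshape([2] * n).copy()
    idx = [slice(None)] * n
    idx[q1] = 1
    idx[q2] = 1
    psi[tuple(idx)] *= -1
    return psi.reshape(-1)

def cost(thetas, axes, n, layers):
    """C(theta) = <Z_0 Z_1> for the assumed layered circuit."""
    state = np.zeros(2 ** n, dtype=complex)
    state[0] = 1.0
    for q in range(n):
        state = apply_1q(state, rot(1, np.pi / 4), q, n)
    k = 0
    for _ in range(layers):
        for q in range(n):
            state = apply_1q(state, rot(axes[k], thetas[k]), q, n)
            k += 1
        for q in range(n - 1):
            state = apply_cz(state, q, q + 1, n)
    probs = (np.abs(state) ** 2).reshape(2, 2, -1).sum(axis=2)
    return float(probs[0, 0] - probs[0, 1] - probs[1, 0] + probs[1, 1])

def grad_first(thetas, axes, n, layers):
    """Parameter-shift derivative of the cost with respect to the first angle."""
    shift = np.zeros_like(thetas)
    shift[0] = np.pi / 2
    plus = cost(thetas + shift, axes, n, layers)
    minus = cost(thetas - shift, axes, n, layers)
    return 0.5 * (plus - minus)

def gradient_variance(n, layers, samples, rng):
    """Sample variance of the derivative over random axes and angles."""
    grads = []
    for _ in range(samples):
        axes = rng.integers(0, 3, size=n * layers)
        thetas = rng.uniform(0, 2 * np.pi, size=n * layers)
        grads.append(grad_first(thetas, axes, n, layers))
    return float(np.var(grads))

rng = np.random.default_rng(1)
variances = {n: gradient_variance(n, layers=5 * n, samples=50, rng=rng) for n in (2, 4, 6)}
for n, v in variances.items():
    print(f"n = {n}: Var[dC/dtheta_0] ~ {v:.3e}")
```

With only 50 samples per system size the estimates are noisy; increasing `samples` or the range of `n` gives a clearer picture at the price of a longer run time.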
tity. The main idea behind this method is to reduce the randomness and depth of the circuit to
break the assumption that the circuit approximates a 2-design, a condition necessary for BPs to
arise in deep ansatze. Similar to the previous method, other schemes have been introduced to
prevent BPs by restricting the randomization of the ansatz. For instance, the proposal of [177]
showed that correlating the parameters in the ansatz effectively reduces the dimension of the
hyperparameter space and can lead to large cost function gradients. In addition, Ref. [178]
introduced a method where one employs layer-by-layer training. Here, one initially trains shallow
circuits and progressively adds components to the circuit. While the latter guarantees that the
number of parameters and randomness remains small for the first steps of the training, it has been
shown [179] that this method can lead to an abrupt transition in the ability of quantum circuits to
be trained. Finally, a method was introduced in [180] where one pre-trains the parameters in the
quantum circuits by employing classical neural networks.
Ansatz strategies. Another strategy for preventing BPs is using structured ansatze which are
problem-inspired. The goal here is to restrict the space explored by the ansatz during the
optimization. As discussed in Section II B, the UCC ansatz for VQE or the quantum alternating
operator ansatz [22, 23] for optimization are problem-inspired ansatze which are usually trainable
even when randomly initialized. Other ansatz strategies include the proposals in [180] for learning
a mixed state, where one leverages knowledge of the target Hamiltonian to create a Hamiltonian
variational ansatz. In addition, Refs. [49, 50] presented an approach where the ansatz for the
solution is $|\psi(\{c_\mu\})\rangle = \sum_\mu c_\mu |\psi_\mu\rangle$, for a fixed set of states
$\{|\psi_\mu\rangle\}$ determined by the problem at hand. Here the optimization over the
coefficients $\{c_\mu\}$ can be solved using a quadratically constrained quadratic program.
Finally, we remark that along with ansatz strategies there are other ways of potentially addressing
BPs. These include optimizers tailored to mitigate the effect of BPs [181], using local cost
functions [97], or employing an architecture such as the Quantum Convolutional Neural Network,
which has been shown to avoid BPs [153].
B. Efficiency
Another requirement that must be met for VQAs to provide a quantum advantage is having an efficient
way to estimate expectation values (and thus more general cost functions). The existence of BPs can
exponentially increase the precision requirements needed for the optimization portion of VQAs, as
discussed above in IV A 1, but even in the absence of such BPs these expectation value estimations
are not guaranteed to be efficient. Indeed, early estimations of resource requirements suggested
that the number of measurements that would be required for interesting quantum chemistry VQE
problems would be astronomical, making addressing this issue essential for quantum advantage [26].
More reasonable resource estimates can be reached for restricted problems such as the Hubbard
model [182, 183]. This need to efficiently estimate the expected values of operators from digital
quantum circuits has inspired a variety of approaches [184]. While in principle one could always
take projective measurements onto the eigenbasis of the operator in question, in general both the
computational complexity of finding the required unitary, as well as the depth required to
implement that transformation, may be intractable. However, given that arbitrary Pauli operators
are diagonalizable with one layer of single-qubit rotations, it is common for the operators of
interest (such as quantum chemistry Hamiltonians) to be expressed by their decomposition into such
Pauli operators. That is, $H = \sum_i c_i \sigma_i$, where $\{c_i\}$ are real coefficients and
$\sigma_i$ Pauli operators. The drawback of this approach is that for many interesting Hamiltonians
this decomposition contains many terms. For example, for chemical Hamiltonians the number of
distinct Pauli strings scales as $n^4$ where $n$ is the number of orbitals (and thus qubits) for
large molecules. In what follows we discuss several methods whose goal is to obtain measurement
frugality in estimating the cost function.
Commuting sets of operators. In the interests of reducing the number of measurements required to
estimate an operator expectation value, a number of methods have been proposed for partitioning
sets of Pauli strings into commuting (i.e. simultaneously measurable) subsets. The choice of the
subsets is also of course non-unique and has been mapped onto the combinatorial problems of graph
coloring [185, 186], finding the minimum clique cover [187–190], or finding the maximal flow in
network flow graphs [191], which makes it possible to import the heuristics and formal results from
those problems.
Perhaps the simplest approach to such a partitioning is to look for subsets that are qubit-wise
commuting (QWC), which is to say that the Pauli operators on each qubit commute. Indeed, this was
the first method introduced [192]. However, while the QWC methods help reduce the number of
operators, they do not change the asymptotic scaling for quantum chemistry applications, motivating
more general commutative groupings to be considered. To this end, it has been shown that by
considering general commutations (and increasing the number of gates of the circuit quadratically
with $n$) the scaling of the number of measurements can be reduced to
$n^3$ [185, 186, 188–191].
When using VQE on fermionic systems, this scaling can actually be brought down to either quadratic
or, for simpler cases, even linear in $n$ [193]. This significant improvement is found by
considering factorizations of the two-electron integral tensors, rather than working at the
operator level. The success of this approach suggests that using background information on the
problem may significantly improve the measurement efficiency of estimating an expectation value.
Optimized sampling. In addition to reducing the num-
ber of individual operators that need to be measured, measurement efficiency can also be improved
by carefully allocating the number of shots among the Pauli operators. Since operators with smaller
coefficients will tend to contribute less to the overall variance, assigning the same number of
shots to each operator is usually inefficient. Instead, the optimal approach is to give each Pauli
operator a number of shots proportional to $|c_i|\sqrt{\mathrm{Var}(\sigma_i)}$, where
$\mathrm{Var}(\sigma_i)$ is the variance of $\langle\sigma_i\rangle$ [194]. During an optimization
where low precision steps may be allowed early on, this allocation can instead be performed
randomly with probabilities proportional to $|c_i|\sqrt{\mathrm{Var}(\sigma_i)}$. Making the
allocation randomly in this way allows for unbiased estimates with as little as one shot,
potentially significantly increasing the efficiency of the optimization [195]. Optimizing the
sampling of the metric tensor (Section II C) has also been explored, with the conclusion that these
costs need not be dominant in metric-aware VQAs [196].
Classical shadows. Another promising approach to efficient measurements is the construction of
classical shadows [197], also known as shadow tomography. In this approach, an approximate
classical representation of the state (the classical shadow) is constructed by summing over the
collection of states that a sequence of different measurements projects onto. These measurements
are taken in the basis of randomly chosen strings so that a partial tomography of the state is
completed. Combining the measurements in this way, each shot contributes to the estimation of each
Pauli operator expectation value, resulting in a number of measurements that scales logarithmically
with $n$. As with direct measurement approaches discussed above, this approach can also be further
optimized by tuning the probability distribution for the Pauli operators that define the
measurements to match the properties of the operator and state [198].
Neural network tomography. A different approach using partial tomography is to train an approximate
restricted Boltzmann machine (RBM) representation of the desired quantum state [199]. This RBM is
fitted using measurements of the Pauli operators that are needed to directly estimate a given
operator’s expectation value, and so does not inherently reduce the number of operators to measure.
However, by computing the expectation value on an approximate RBM instead of directly from
measurements, the sampling variance for a given number of shots is substantially reduced at the
cost of introducing a small, positive bias [199].
C. Accuracy
One of the main goals for VQAs is to provide a practical use for near-term noisy devices. In this
sense, VQAs provide a strategy to deal with hardware noise as they can potentially minimize quantum
circuit depth. However, one can still ask what the impact of hardware noise will be on the accuracy
of a VQA. There are multiple aspects of the impact: hardware noise could potentially slow down the
training process, it could bias the landscape so that the noisy global optimum no longer
corresponds to the noise-free global optimum, and it could affect the final value of the optimal
cost.
1. Impact of hardware noise
Effect of noise on training. The question of whether noise can help with the training process was
posed in [200]. In practice, it is typical to observe that noise slows down the training. For
example, it was heuristically observed that the noise-free cost achieves lower values with
noise-free training than with noisy training [79, 195, 201]. As discussed in Section IV A 1, the
intuition behind this slowing down is that the cost landscape is flattened, and hence gradient
magnitudes are reduced, by the presence of incoherent noise [173, 202, 203]. Moreover, gradients
decay exponentially with the algorithm’s depth, meaning that the deeper the circuit, the more it
will be affected. This can be further understood from the fact that cost functions are typically
extremized by pure states, and since incoherent noise reduces state purity, one expects this noise
to erode the extremal points of the landscape [174]. The presence of noise-induced barren plateaus
and their effect on the trainability is one of the leading challenges for VQAs, with potential
solutions being the development of better quantum hardware or shorter-depth algorithms. It is worth
remarking that the results discussed here do not account for the use of error mitigation techniques
(see below), and the extent to which these could help is still an open question.
Effect of noise on cost evaluation. In Refs. [173, 174] it was also shown that in the presence of
local Pauli noise, the cost landscape concentrates exponentially with the depth of the ansatz
around the value of the cost associated with the maximally mixed state. While the proof of this
exponential concentration of the cost was for general VQAs, some previous works had also observed
this effect for the special case of the QAOA [202, 203]. The exponential concentration of the cost
is of course important beyond the issue of trainability. Even if one is able to train, the final
cost value will be corrupted by noise. There are certain VQAs where this is not an important issue
(e.g., in QAOA where one can classically compute the cost after sampling). However, for VQE
problems, this is important, since one is ultimately interested in an accurate estimation of the
energy. This emphasizes the importance of understanding to what degree error mitigation methods
(discussed below) can correct for this issue.
2. Noise resilience
Inherent resilience to coherent noise. One reason for interest in variational algorithms is their
ability to nat-
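The exponential concentration of the cost around its maximally-mixed-state value, discussed under "Effect of noise on cost evaluation" above, can be illustrated with a small density-matrix simulation. The sketch below is a toy model built on assumptions that are not taken from Refs. [173, 174]: each ansatz layer is replaced by a random global unitary, the noise is single-qubit depolarizing noise with probability `p` on every qubit after every layer, and the observable is $Z$ on the first qubit. For this particular channel every non-identity Pauli component of the state is damped by at least a factor $q = 1 - 4p/3$ per layer, which gives the elementary bound $|\mathrm{Tr}[\rho_L Z_0]| \leq \sqrt{2^n - 1}\, q^L$ that the code compares against.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def on_qubit(op, q, n):
    """Embed a single-qubit operator acting on qubit q into n qubits."""
    out = np.array([[1.0 + 0j]])
    for k in range(n):
        out = np.kron(out, op if k == q else I2)
    return out

def local_depolarizing(rho, p, n):
    """Apply single-qubit depolarizing noise with probability p to every qubit."""
    for q in range(n):
        paulis = [on_qubit(P, q, n) for P in (X, Y, Z)]
        rho = (1 - p) * rho + (p / 3) * sum(P @ rho @ P for P in paulis)
    return rho

def random_unitary(dim, rng):
    """Random unitary from the QR decomposition of a complex Gaussian matrix."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    qmat, r = np.linalg.qr(g)
    phases = np.diag(r) / np.abs(np.diag(r))
    return qmat * phases

# Illustrative parameters (assumptions): 3 qubits, 5% noise, 20 layers.
n, p, depth = 3, 0.05, 20
dim = 2 ** n
rng = np.random.default_rng(0)

rho = np.zeros((dim, dim), dtype=complex)
rho[0, 0] = 1.0                      # start in |0...0><0...0|
obs = on_qubit(Z, 0, n)              # traceless, so the maximally mixed value is 0
q_factor = 1 - 4 * p / 3

deviations, bounds = [], []
for layer in range(1, depth + 1):
    U = random_unitary(dim, rng)     # stand-in for one ansatz layer
    rho = U @ rho @ U.conj().T
    rho = local_depolarizing(rho, p, n)
    deviations.append(abs(np.trace(rho @ obs).real))
    bounds.append(np.sqrt(dim - 1) * q_factor ** layer)

for layer in (1, 5, 10, 20):
    print(f"depth {layer:2d}: |<Z_0>| = {deviations[layer - 1]:.2e}, bound = {bounds[layer - 1]:.2e}")
```

Because the observable is traceless, its value on the maximally mixed state is zero, so the printed deviation is exactly the distance of the noisy cost from the concentrated value.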
ware [211–213]. Taking this approach can allow one to implement a probabilistic error mitigation
protocol without needing to construct a full error model for an experiment [211]. Alternatively,
one can perform a simple regression with this Clifford data to estimate how the observables have
been affected and invert this regression to estimate desired noise-free expectation values [212].
Finally, zero-noise extrapolation can be merged with this regression to have an extrapolation to
zero-noise whose form is tuned via the Clifford data, reducing the risk of blind
extrapolations [213].
Several additional QEM methods have been proposed. Symmetry verification is especially useful for
ansatze that preserve symmetries such as particle and spin number [12, 214, 215]. Since physical
errors break the symmetry, by measuring and ignoring the undesired case (similarly to error
detection), one can mitigate physical noise. Unlike other QEM methods, symmetry verification can
recover the quantum state itself. One can also take a post-processing approach using the
information of the symmetry with a larger sample number [215]. The use of symmetry verification to
augment error extrapolation and probabilistic error cancellation was taken still further in [207].
In an alternative and complementary approach, the subspace expansion method was also shown to be
useful for QEM in [216]. Here, using subspace expansion one can mitigate physical noise for
eigenstates of the Hamiltonian as well as evaluate excited states because the state is expanded in
a larger subspace. Note that this method works better for coherent noise than for stochastic noise.
A distinct approach was introduced in [217, 218] which comes at the cost of increasing the number
of qubits. Here, by entangling and measuring $M$ copies of a noisy state $\rho$, one can compute
expectation values with respect to the state $\rho^M / \mathrm{Tr}[\rho^M]$. Under the assumption
that the principal eigenvector of $\rho$ is the desired state, this method can exponentially
suppress errors with $M$. Finally, we remark that Ref. [219] introduced a method to mitigate
expectation values against correlated measurement errors, while [220] implemented an error
mitigation technique to suppress the effects of photon loss for a Gaussian Boson sampling device.
V. OPPORTUNITIES FOR NEAR TERM QUANTUM ADVANTAGE
VQAs are largely regarded as the best candidate for providing quantum advantage for practical
applications. That is, it is expected that one will be able to employ a VQA to solve a problem more
efficiently than any classical state-of-the-art method. As discussed in the previous sections,
tremendous effort has been recently put forward towards this goal with the development of efficient
ansatz strategies, quantum-aware optimization methods, novel VQAs, and error mitigation techniques.
While many challenges still remain to be addressed, like the need for larger and better quantum
devices, one can nevertheless pose the question as to what specific applications will provide the
first quantum advantage for a practical scenario. In this section we discuss some of the most
exciting possibilities where quantum advantage could arise.
A. Chemistry and material sciences
The ability to simulate and understand the static and dynamical properties of molecules and
strongly correlated electronic systems is a fundamental task in many areas of science. For
instance, this task is relevant in biology to understand protein folding dynamics, while in
pharmaceutical sciences one could analyze drug–receptor interactions to improve drug discovery
capabilities [221–223]. Similarly, analyzing the electronic structure of complex correlated
materials is highly important for studying high-temperature superconductivity or to analyze
transition metal materials near a Mott transition.
Molecular structure. In the past few decades there have been great developments in the classical
treatment of the structure of molecular systems. These include approximate methods such as
Hartree-Fock or density functional theory, or methods closely connected to quantum information,
like the density matrix renormalization group approach that utilizes matrix product states as an
ansatz [224, 225]. However, even for these sophisticated approaches, systems of interest like
FeMoCo are beyond the reach of an accurate description due to the entanglement structure of the
electrons and orbitals. The electronic space in which one needs to treat correlations accurately
for these systems is relatively modest, and for that reason, these may be good targets for
near-term quantum computers to play a role. As discussed in Section III A, the VQE algorithm [16]
(and associated architectures) has shown promising advances towards the goal of performing
molecular quantum chemistry on quantum computers [226], with large-scale implementations already
being executed [227].
Molecular dynamics. As for the dynamics of chemical and other quantum systems, there have been a
number of strides in evaluating or compressing these evolutions using variational
approaches [74, 75]. Much like variational principles connected to the ground state, there are a
number of time-dependent variational principles that can be used to approximate time-dynamics. Here
there are two timescales of interest. The first is the electronic timescale over which electrons
rearrange upon excitation. The second, much slower than the first, is the rearrangement of nuclei
that is induced by forces derived from the electrons in their respective configurations, excited or
not. Generally speaking, treating the detailed dynamics of the electrons accurately has been
extremely challenging for classical approaches despite its relevance in phenomena related to
photovoltaics and light-emitting diodes [228, 229]. The separation between the two timescales has
motivated the development of meth-
ods that treat them separately, often using a classical classically simulable. While large scale, fault-tolerant
or semi-classical representation for the nuclei and quan- quantum computers will eventually be able to handle this
tum representation for the electrons [230]. Variational difficulty, there is also the potential for achieving a sig-
methods can be applied incrementally in these cases, by nificant quantum advantage in this area with VQAs in
stepping the electronic wavefunction forward with time- the NISQ era [242]. Recent advances in this direction
dependent variational principles [74, 75] and sampling the include work on VQAs for LGT simulation [14] as well
forces [231] to move the nuclei classically, resulting in a as variational determinations of mass gaps, Green’s func-
Born-Oppenheimer type molecular dynamics. Early test tions, and running coupling constants [243–245]. In ad-
systems for quantum molecular dynamics often include dition, an approach using a VQA to determine interpola-
photo-dissociation reactions and conical interactions of tion operators to accelerate classical LGT computations
small molecular systems [232]. Ultimately, these methods has been proposed [246]. Finally, the impacts of decoher-
may help unlock proton-coupled electron transfer mech- ence by hardware noise on LGT calculations have been
anisms [233] in proteins and help with the design of novel studied, finding that gauge violations caused by decoher-
organic photovoltaics [228] and related systems. ence only grow linearly at short times, suggesting that
Materials science. Classical methods for materials sim- short depth approaches may be possible [247]. Taken to-
ulations usually employ density-functional theory cou- gether, these results show that studying LGTs is a viable
pled with approximation methods, such as the local den- candidate for NISQ quantum advantage.
sity approximation [234] to tackle weakly correlated ma-
terials. However, many effects arising from strongly cor-
related systems are beyond the reach of such classical C. Optimization and machine learning
methods. Since long-term algorithms for material sim-
ulation require phase estimation [235–237], these lie be- While it is natural to consider that VQAs can bring an
yond the scope of near-term devices. In contrast, near- advantage on task which are inherently quantum in na-
term VQAs for analyzing strong correlation problem are ture, the prospect of employing quantum algorithms to
aimed at reducing the circuit depth by using smart ini- solve classical problems is also an exciting one. Gener-
tializations [238], or by optimizing the circuit structure ally, one here aims to employ the large dimension of the
itself [29, 30]. Hilbert space to encode large problems or large amounts
of data, with the premise that the quantum nature of the
algorithm (such as coherence or entanglement between
B. Nuclear and particle physics qubits [248]) helps in speeding up a given task.
Optimization. Many optimization problems can be en-
Nuclear physics. Similar to the chemistry applications coded in relatively simple mathematical models such as
discussed above, VQAs have the potential to convey a the Max-Cut or the Max-Sat problems. These include
quantum advantage in studying nuclear structure and dy- tasks like electronic circuitry layout design, state prob-
namics. The most studied potential contribution is the lems in statistical physics [249], and even automotive con-
utility of the VQE method to find nuclear ground states. figuration [250]. Applying QAOA to classical optimiza-
This was first demonstrated for computing the deuteron tion problems is widely considered to be one of the lead-
(2 H) binding energy [239], and has been extended to ing candidates for achieving quantum advantage on NISQ
other light nuclei such as the triton (3 H), 3 He, and an devices [110]. There are several reasons for this optimism.
alpha particle (4 He) [240]. Additionally, using VQE to QAOA has provable performance guarantees [22, 251] for
prepare the ground state of a triton has been an initial p = 1. In general, even p = 1 QAOA ansatz cannot
step as a demonstration of simulating neutrino-nucleon be efficiently simulated on any classical device [252]. At
scattering [241]. Considering these low-energy applica- the same time, QAOA performance can only improve by
tions along with the general progress towards studying increasing p. It was also shown that “bang-bang” evo-
higher energy nuclear interactions (i.e., quantum chro- lution that motivates QAOA ansatz is the optimal ap-
modynamics) via VQA lattice gauge theory approaches proach given fixed quantum computation time [32]. On
(discussed below) shows that VQAs have the potential the other hand, there are problems for which a shallow
to provide a significant advantage over classical methods QAOA ansatz does not perform well [253, 254] suggesting
for nuclear physics. that p may have to grow with the problem size. Larger p
Particle physics. In particle physics many analytical requires improvements in the parametrization and opti-
tools have been developed to describe and study theo- mization [175]. Similarly to quantum chemistry, large
ries, but there are many areas that remain intractable. scale experiments of QAOA have already been imple-
In particular, the study of important gauge theories like mented [255].
quantum chromodynamics is often handled by mapping Machine Learning. In the past few decades, the use
the problem onto a lattice to allow for numerical stud- of machine learning has become common in most, if not
ies. One of the major drawbacks of such Lattice Gauge all, areas of science. While the problem of loading clas-
Theories (LGTs) for classical computation is that they sical data on quantum computers is still an active topic
exhibit the sign problem and as a result are usually not of research, there has been significant efforts put forward
to use quantum algorithms for machine learning applications [134, 148, 256, 257]. For instance, it has been shown that quantum neural networks can achieve a significantly better effective dimension than comparable classical neural networks [258]. Moreover, it has also been pointed out that quantum algorithms can outperform classical ones in deep learning problems [259], and more recently a VQA has been proposed for deep reinforcement learning [260]. An exciting prospect for using quantum neural networks is that certain architectures are immune to barren plateaus, and hence are trainable even for large problems [153, 154].

VI. OUTLOOK

In the quest for quantum advantage, analytical and heuristic scaling analysis of VQAs will be increasingly important. Better methods to analyze VQA scalability are anticipated in the future. This will likely include both gradient scaling as well as other scaling aspects, such as the density of local minima and the shape of the cost landscape. These fundamental results will help to guide the search for quantum advantage.

At the same time, the future will also see an improved toolbox for VQAs. Quantum-aware optimizers will exploit knowledge gained about the cost landscape. These improved optimizers will mitigate the impacts of small gradients and avoid local minima to facilitate rapid training of the parameters in VQAs.

Application-specific ansatzes will continue to be developed. Better ansatzes will enhance gradient magnitudes to improve trainability and they may also reduce the impact of noise on VQAs. This will likely include adaptive ansatz strategies, which appear promising.

New error mitigation strategies are anticipated in the future. These will be crucial for obtaining accurate results from VQAs and will improve accuracy by orders of magnitude. Error mitigation will be hard-coded into cloud-based quantum computing platforms, to allow users to obtain accurate results with ease.

The future will also see better quantum hardware become available, both in terms of qubit count and noise levels. VQAs will certainly benefit from such improved hardware. Moreover, VQAs will play a central role in benchmarking the capabilities of these new platforms.

In the near future, VQAs will likely see a shift from the proposal and development phase to the implementation phase. Researchers will aim to implement larger, more realistic problems with VQAs instead of toy problems. These implementations will incorporate multiple state-of-the-art strategies for enhancing VQA performance. Combining strategies for improving the accuracy, trainability, and efficiency of VQAs will test their ultimate capabilities and will push the boundaries of NISQ devices, with the grand vision of obtaining quantum advantage.

In the more distant future, VQAs will find use even when the fault-tolerant era arrives. Transitioning from estimating expectation values via Hamiltonian averaging to phase estimation may be an important component here [98]. QAOA may be a good candidate VQA to find usage in the fault-tolerant era, albeit with caveats about the overhead [261]. Strategies that address challenges in the NISQ era, such as keeping circuit depth shallow and avoiding barren plateaus, could still play a role in the fault-tolerant era. Therefore, current research on VQAs will likely remain useful even when fault-tolerant quantum devices arrive.

VII. REFERENCES

[1] Richard P Feynman, “Simulating physics with computers,” Int. J. Theor. Phys 21 (1982).
[2] Peter W Shor, “Algorithms for quantum computation: discrete logarithms and factoring,” in Proceedings 35th annual symposium on foundations of computer science (IEEE, 1994) pp. 124–134.
[3] Seth Lloyd, “Universal quantum simulators,” Science 273, 1073–1078 (1996).
[4] Aram W Harrow, Avinatan Hassidim, and Seth Lloyd, “Quantum algorithm for linear systems of equations,” Physical Review Letters 103, 150502 (2009).
[5] “IBM Makes Quantum Computing Available on IBM Cloud to Accelerate Innovation,” (2016), press release at https://www-03.ibm.com/press/us/en/pressrelease/49661.wss.
[6] Adetokunbo Adedoyin, John Ambrosiano, Petr Anisimov, Andreas Bärtschi, William Casper, Gopinath Chennupati, Carleton Coffrin, Hristo Djidjev, David Gunter, Satish Karra, et al., “Quantum algorithm implementations for beginners,” arXiv preprint arXiv:1804.03719 (2018).
[7] J. Preskill, “Quantum computing in the NISQ era and beyond,” Quantum 2, 79 (2018).
[8] Frank Arute et al., “Quantum supremacy using a programmable superconducting processor,” Nature 574, 505–510 (2019).
[9] Han-Sen Zhong, Hui Wang, Yu-Hao Deng, Ming-Cheng Chen, Li-Chao Peng, Yi-Han Luo, Jian Qin, Dian Wu, Xing Ding, Yi Hu, et al., “Quantum computational advantage using photons,” Science (2020).
[10] A. Kandala, A. Mezzacapo, K. Temme, M. Takita, M. Brink, J. M. Chow, and J. M. Gambetta, “Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets,” Nature 549, 242 (2017).
[11] Bryan T Gard, Linghua Zhu, George S Barron, Nicholas J Mayhall, Sophia E Economou, and Edwin Barnes, “Efficient symmetry-preserving state preparation circuits for the variational quantum eigensolver algorithm,” npj Quantum Information 6, 1–9 (2020).
[12] Matthew Otten, Cristian L Cortes, and Stephen K Gray, “Noise-resilient quantum dynamics using symmetry-preserving ansatzes,” arXiv preprint arXiv:1910.06284 (2019).
[13] Nikolay V Tkachenko, James Sud, Yu Zhang, Sergei Tretiak, Petr M Anisimov, Andrew T Arrasmith, Patrick J Coles, Lukasz Cincio, and Pavel A
Guzik, “Strategies for quantum computing molecular energies using the unitary coupled cluster ansatz,” Quantum Science and Technology 4, 014008 (2018).
[110] Gavin E Crooks, “Performance of the quantum approximate optimization algorithm on the maximum cut problem,” arXiv preprint arXiv:1811.08419 (2018).
[111] Dave Wecker, Matthew B Hastings, and Matthias Troyer, “Training a quantum optimizer,” Physical Review A 94, 022309 (2016).
[112] Sami Khairy, Ruslan Shaydulin, Lukasz Cincio, Yuri Alexeev, and Prasanna Balaprakash, “Learning to optimize variational quantum circuits to solve combinatorial problems,” Proceedings of the AAAI Conference on Artificial Intelligence 34, 2367–2375 (2020).
[113] A Ambainis, “Variable time amplitude amplification and a faster quantum algorithm for solving systems of linear equations,” in 29th Int. Symp. Theoretical Aspects of Computer Science (STACS 2012), Vol. 14 (2012) pp. 636–47.
[114] Y. Subaşı, R. D. Somma, and D. Orsucci, “Quantum algorithms for systems of linear equations inspired by adiabatic quantum computing,” Phys. Rev. Lett. 122, 060504 (2019).
[115] A. Childs, R. Kothari, and R. Somma, “Quantum algorithm for systems of linear equations with exponentially improved dependence on precision,” SIAM J. Computing 46, 1920–1950 (2017).
[116] Shantanav Chakraborty, András Gilyén, and Stacey Jeffery, “The Power of Block-Encoded Matrix Powers: Improved Regression Techniques via Faster Hamiltonian Simulation,” in 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019), Leibniz International Proceedings in Informatics (LIPIcs), Vol. 132 (2019) pp. 33:1–33:14.
[117] A. Scherer, B. Valiron, S.-C. Mau, S. Alexander, E. Van den Berg, and T. E. Chapuran, “Concrete resource analysis of the quantum linear-system algorithm used to compute the electromagnetic scattering cross section of a 2D target,” Quantum Information Processing 16, 60 (2017).
[118] Carlos Bravo-Prieto, Ryan LaRose, M. Cerezo, Yigit Subasi, Lukasz Cincio, and Patrick J Coles, “Variational quantum linear solver: A hybrid algorithm for linear systems,” arXiv preprint arXiv:1909.05820 (2019).
[119] Xiaosi Xu, Jinzhao Sun, Suguru Endo, Ying Li, Simon C Benjamin, and Xiao Yuan, “Variational algorithms for linear algebra,” arXiv preprint arXiv:1909.03898 (2019).
[120] Hsin-Yuan Huang, Kishor Bharti, and Patrick Rebentrost, “Near-term quantum algorithms for linear systems of equations,” arXiv preprint arXiv:1909.07344 (2019).
[121] Yiğit Subaşı, Rolando D Somma, and Davide Orsucci, “Quantum algorithms for systems of linear equations inspired by adiabatic quantum computing,” Physical Review Letters 122, 060504 (2019).
[122] Michael Lubasch, Jaewoo Joo, Pierre Moinier, Martin Kiffner, and Dieter Jaksch, “Variational quantum algorithms for nonlinear problems,” Physical Review A 101, 010301 (2020).
[123] Eric Anschuetz, Jonathan Olson, Alán Aspuru-Guzik, and Yudong Cao, “Variational quantum factoring,” in International Workshop on Quantum Technology and Optimization Problems (Springer, 2019) pp. 74–85.
[124] Seth Lloyd, Masoud Mohseni, and Patrick Rebentrost, “Quantum principal component analysis,” Nature Physics 10, 631–633 (2014).
[125] Ryan LaRose, Arkin Tikku, Étude O’Neel-Judy, Lukasz Cincio, and Patrick J Coles, “Variational quantum state diagonalization,” npj Quantum Information 5, 1–10 (2019).
[126] Kentaro Heya, Yasunari Suzuki, Yasunobu Nakamura, and Keisuke Fujii, “Variational quantum gate optimization,” arXiv preprint arXiv:1810.12745 (2018).
[127] Tyson Jones and Simon C Benjamin, “Quantum compilation and circuit optimisation via energy dissipation,” arXiv preprint arXiv:1811.03147 (2018).
[128] Kunal Sharma, Sumeet Khatri, M. Cerezo, and Patrick J Coles, “Noise resilience of variational quantum compiling,” New Journal of Physics 22, 043006 (2020).
[129] Jacques Carolan, Masoud Mohseni, Jonathan P Olson, Mihika Prabhu, Changchen Chen, Darius Bunandar, Murphy Yuezhen Niu, Nicholas C Harris, Franco NC Wong, Michael Hochberg, et al., “Variational quantum unsampling on a quantum photonic processor,” Nature Physics, 1–6 (2020).
[130] Peter D Johnson, Jonathan Romero, Jonathan Olson, Yudong Cao, and Alán Aspuru-Guzik, “Qvector: an algorithm for device-tailored quantum error correction,” arXiv preprint arXiv:1711.02249 (2017).
[131] Xiaosi Xu, Simon C Benjamin, and Xiao Yuan, “Variational circuit compiler for quantum error correction,” arXiv preprint arXiv:1911.05759 (2019).
[132] Edward Farhi and Hartmut Neven, “Classification with quantum neural networks on near term processors,” arXiv preprint arXiv:1802.06002 (2018).
[133] Maria Schuld, Alex Bocharov, Krysta M Svore, and Nathan Wiebe, “Circuit-centric quantum classifiers,” Physical Review A 101, 032308 (2020).
[134] Maria Schuld and Nathan Killoran, “Quantum machine learning in feature hilbert spaces,” Physical Review Letters 122, 040504 (2019).
[135] Vojtěch Havlíček, Antonio D Córcoles, Kristan Temme, Aram W Harrow, Abhinav Kandala, Jerry M Chow, and Jay M Gambetta, “Supervised learning with quantum-enhanced feature spaces,” Nature 567, 209–212 (2019).
[136] Edwin Stoudenmire and David J Schwab, “Supervised learning with tensor networks,” in Advances in Neural Information Processing Systems (2016) pp. 4799–4807.
[137] Takeru Kusumoto, Kosuke Mitarai, Keisuke Fujii, Masahiro Kitagawa, and Makoto Negoro, “Experimental quantum kernel machine learning with nuclear spins in a solid,” arXiv preprint arXiv:1911.12021 (2019).
[138] J. Romero, J. P. Olson, and A. Aspuru-Guzik, “Quantum autoencoders for efficient compression of quantum data,” Quantum Science and Technology 2, 045001 (2017).
[139] Kwok Ho Wan, Oscar Dahlsten, Hlér Kristjánsson, Robert Gardner, and MS Kim, “Quantum generalisation of feedforward neural networks,” npj Quantum Information 3, 36 (2017).
[140] Guillaume Verdon, Jason Pye, and Michael Broughton, “A universal training algorithm for quantum deep learning,” arXiv preprint arXiv:1806.09729 (2018).
[141] M. Cerezo, Akira Sone, Tyler Volkoff, Lukasz Cincio, and Patrick J Coles, “Cost-function-dependent barren plateaus in shallow quantum neural networks,” arXiv preprint arXiv:2001.00550 (2020).
[142] Alex Pepper, Nora Tischler, and Geoff J Pryde, “Experimental realization of a quantum autoencoder: The compression of qutrits via machine learning,” Physical Review Letters 122, 060501 (2019).
[143] Yuxuan Du, Min-Hsiu Hsieh, Tongliang Liu, and Dacheng Tao, “Expressive power of parametrized quantum circuits,” Physical Review Research 2, 033125 (2020).
[144] Guillaume Verdon, Michael Broughton, and Jacob Biamonte, “A quantum algorithm to train neural networks using low-depth circuits,” arXiv preprint arXiv:1712.05304 (2017).
[145] Jin-Guo Liu and Lei Wang, “Differentiable learning of quantum circuit born machines,” Physical Review A 98, 062324 (2018).
[146] Marcello Benedetti, Delfina Garcia-Pintos, Oscar Perdomo, Vicente Leyton-Ortega, Yunseong Nam, and Alejandro Perdomo-Ortiz, “A generative modeling approach for benchmarking and training shallow quantum circuits,” npj Quantum Information 5, 1–9 (2019).
[147] Brian Coyle, Daniel Mills, Vincent Danos, and Elham Kashefi, “The born supremacy: Quantum advantage and training of an ising born machine,” npj Quantum Information 6, 1–11 (2020).
[148] Jonathan Romero and Alan Aspuru-Guzik, “Variational quantum generators: Generative adversarial quantum machine learning for continuous distributions,” arXiv preprint arXiv:1901.00848 (2019).
[149] MV Altaisky, “Quantum neural network,” arXiv preprint quant-ph/0107012 (2001).
[150] Kerstin Beer, Dmytro Bondarenko, Terry Farrelly, Tobias J Osborne, Robert Salzmann, Daniel Scheiermann, and Ramona Wolf, “Training deep quantum neural networks,” Nature communications 11, 1–6 (2020).
[151] Iris Cong, Soonwon Choi, and Mikhail D Lukin, “Quantum convolutional neural networks,” Nature Physics 15, 1273–1278 (2019).
[152] Lukas Franken and Bogdan Georgiev, “Explorations in quantum neural networks with intermediate measurements,” in Proceedings of ESANN (2020).
[153] Arthur Pesah, M. Cerezo, Samson Wang, Tyler Volkoff, Andrew T Sornborger, and Patrick J Coles, “Absence of barren plateaus in quantum convolutional neural networks,” arXiv preprint arXiv:2011.02966 (2020).
[154] Kaining Zhang, Min-Hsiu Hsieh, Liu Liu, and Dacheng Tao, “Toward trainability of quantum neural networks,” arXiv preprint arXiv:2011.06258 (2020).
[155] A. Arrasmith, L. Cincio, A. T. Sornborger, W. H. Zurek, and P. J. Coles, “Variational consistent histories as a hybrid algorithm for quantum foundations,” Nature communications 10, 3438 (2019).
[156] Mark M Wilde, Quantum information theory (Cambridge University Press, 2013).
[157] B. Rosgen and J. Watrous, “On the hardness of distinguishing mixed-state quantum computations,” in 20th Annual IEEE Conference on Computational Complexity (CCC’05) (2005) pp. 344–354.
[158] M. Cerezo, Alexander Poremba, Lukasz Cincio, and Patrick J Coles, “Variational quantum fidelity estimation,” Quantum 4, 248 (2020).
[159] Carlos Bravo-Prieto, Diego García-Martín, and José I Latorre, “Quantum singular value decomposer,” Physical Review A 101, 062310 (2020).
[160] Bálint Koczor, Suguru Endo, Tyson Jones, Yuichiro Matsuzaki, and Simon C Benjamin, “Variational-state quantum metrology,” New Journal of Physics 22, 083038 (2020).
[161] Raphael Kaubruegger, Pietro Silvi, Christian Kokail, Rick van Bijnen, Ana Maria Rey, Jun Ye, Adam M Kaufman, and Peter Zoller, “Variational spin-squeezing algorithms on programmable quantum sensors,” Physical Review Letters 123, 260505 (2019).
[162] Ziqi Ma, Pranav Gokhale, Tian-Xing Zheng, Sisi Zhou, Xiaofei Yu, Liang Jiang, Peter Maurer, and Frederic T Chong, “Adaptive circuit learning for quantum metrology,” arXiv preprint arXiv:2010.08702 (2020).
[163] Jacob L Beckey, M. Cerezo, Akira Sone, and Patrick J Coles, “Variational quantum algorithm for estimating the quantum fisher information,” arXiv preprint arXiv:2010.10488 (2020).
[164] Jarrod R McClean, Sergio Boixo, Vadim N Smelyanskiy, Ryan Babbush, and Hartmut Neven, “Barren plateaus in quantum neural network training landscapes,” Nature communications 9, 4812 (2018).
[165] Andrew Arrasmith, M. Cerezo, Piotr Czarnik, Lukasz Cincio, and Patrick J Coles, “Effect of barren plateaus on gradient-free optimization,” arXiv preprint arXiv:2011.12245 (2020).
[166] Aram W Harrow and Richard A Low, “Random quantum circuits are approximate 2-designs,” Communications in Mathematical Physics 291, 257–302 (2009).
[167] Fernando GSL Brandao, Aram W Harrow, and Michał Horodecki, “Local random quantum circuits are approximate polynomial-designs,” Communications in Mathematical Physics 346, 397–434 (2016).
[168] Alexey Uvarov, Jacob D. Biamonte, and Dmitry Yudin, “Variational quantum eigensolver for frustrated quantum systems,” Phys. Rev. B 102, 075104 (2020).
[169] Alexey Uvarov and Jacob Biamonte, “On barren plateaus and cost function locality in variational quantum algorithms,” arXiv preprint arXiv:2011.10530 (2020).
[170] Zoë Holmes, Andrew Arrasmith, Bin Yan, Patrick J Coles, Andreas Albrecht, and Andrew T Sornborger, “Barren plateaus preclude learning scramblers,” arXiv preprint arXiv:2009.14808 (2020).
[171] Kunal Sharma, M. Cerezo, Lukasz Cincio, and Patrick J Coles, “Trainability of dissipative perceptron-based quantum neural networks,” arXiv preprint arXiv:2005.12458 (2020).
[172] Carlos Ortiz Marrero, Mária Kieferová, and Nathan Wiebe, “Entanglement induced barren plateaus,” arXiv preprint arXiv:2010.15968 (2020).
[173] Samson Wang, Enrico Fontana, M. Cerezo, Kunal Sharma, Akira Sone, Lukasz Cincio, and Patrick J Coles, “Noise-induced barren plateaus in variational quantum algorithms,” arXiv preprint arXiv:2007.14384 (2020).
[174] Daniel Stilck Franca and Raul Garcia-Patron, “Limitations of optimization algorithms on noisy quantum devices,” arXiv preprint arXiv:2009.05532 (2020).
[175] Leo Zhou, Sheng-Tao Wang, Soonwon Choi, Hannes Pichler, and Mikhail D Lukin, “Quantum approximate optimization algorithm: Performance, mechanism, and implementation on near-term devices,” Physical Review X 10, 021067 (2020).
[176] Edward Grant, Leonard Wossnig, Mateusz Ostaszewski, and Marcello Benedetti, “An initialization strategy for
[210] Jinzhao Sun, Xiao Yuan, Takahiro Tsunoda, Vlatko Vedral, Simon C Benjamin, and Suguru Endo, “Mitigating realistic noise in practical noisy intermediate-scale quantum devices,” arXiv preprint arXiv:2001.04891 (2020).
[211] Armands Strikis, Dayue Qin, Yanzhu Chen, Simon C Benjamin, and Ying Li, “Learning-based quantum error mitigation,” arXiv preprint arXiv:2005.07601 (2020).
[212] Piotr Czarnik, Andrew Arrasmith, Patrick J Coles, and Lukasz Cincio, “Error mitigation with clifford quantum-circuit data,” arXiv preprint arXiv:2005.10189 (2020).
[213] Angus Lowe, Max Hunter Gordon, Piotr Czarnik, Andrew Arrasmith, Patrick J Coles, and Lukasz Cincio, “Unified approach to data-driven quantum error mitigation,” arXiv preprint arXiv:2011.01157 (2020).
[214] Sam McArdle, Xiao Yuan, and Simon Benjamin, “Error-mitigated digital quantum simulation,” Physical Review Letters 122, 180501 (2019).
[215] Xavi Bonet-Monroig, Ramiro Sagastizabal, M Singh, and TE O’Brien, “Low-cost error mitigation by symmetry verification,” Physical Review A 98, 062339 (2018).
[216] Jarrod R McClean, Zhang Jiang, Nicholas C Rubin, Ryan Babbush, and Hartmut Neven, “Decoding quantum errors with subspace expansions,” Nature Communications 11, 1–9 (2020).
[217] Bálint Koczor, “Exponential error suppression for near-term quantum devices,” arXiv preprint arXiv:2011.05942 (2020).
[218] William J Huggins, Sam McArdle, Thomas E O’Brien, Joonho Lee, Nicholas C Rubin, Sergio Boixo, K Birgitta Whaley, Ryan Babbush, and Jarrod R McClean, “Virtual distillation for quantum error mitigation,” arXiv preprint arXiv:2011.07064 (2020).
[219] Sergey Bravyi, Sarah Sheldon, Abhinav Kandala, David C Mckay, and Jay M Gambetta, “Mitigating measurement errors in multi-qubit experiments,” arXiv preprint arXiv:2006.14044 (2020).
[220] Daiqin Su, Robert Israel, Kunal Sharma, Haoyu Qi, Ish Dhand, and Kamil Brádler, “Error mitigation on a near-term quantum photonic device,” arXiv preprint arXiv:2008.06670 (2020).
[221] Yudong Cao, Jhonathan Romero, and Alán Aspuru-Guzik, “Potential of quantum computing for drug discovery,” IBM Journal of Research and Development 62, 6–1 (2018).
[222] Yudong Cao, Jonathan Romero, Jonathan P Olson, Matthias Degroote, Peter D Johnson, Mária Kieferová, Ian D Kivlichan, Tim Menke, Borja Peropadre, Nicolas PD Sawaya, et al., “Quantum chemistry in the age of quantum computing,” Chemical reviews 119, 10856–10915 (2019).
[223] Carlos Outeiral, Martin Strahm, Jiye Shi, Garrett M Morris, Simon C Benjamin, and Charlotte M Deane, “The prospects of quantum computing in computational molecular biology,” Wiley Interdisciplinary Reviews: Computational Molecular Science, e1481 (2020).
[224] Steven R White, “Density matrix formulation for quantum renormalization groups,” Physical Review Letters 69, 2863 (1992).
[225] Garnet Kin-Lic Chan and Sandeep Sharma, “The density matrix renormalization group in quantum chemistry,” Annual Review of Physical Chemistry 62, 465–481 (2011).
[226] Sam McArdle, Suguru Endo, Alan Aspuru-Guzik, Simon C Benjamin, and Xiao Yuan, “Quantum computational chemistry,” Reviews of Modern Physics 92, 015003 (2020).
[227] Frank Arute, Kunal Arya, Ryan Babbush, Dave Bacon, Joseph C Bardin, Rami Barends, Sergio Boixo, Michael Broughton, Bob B Buckley, David A Buell, et al., “Hartree-fock on a superconducting qubit quantum computer,” arXiv preprint arXiv:2004.04174 (2020).
[228] Artem A Bakulin, Stoichko D Dimitrov, Akshay Rao, Philip CY Chow, Christian B Nielsen, Bob C Schroeder, Iain McCulloch, Huib J Bakker, James R Durrant, and Richard H Friend, “Charge-transfer state dynamics following hole and electron transfer in organic photovoltaic devices,” The Journal of Physical Chemistry Letters 4, 209–215 (2013).
[229] Markus Gross, David C Müller, Heinz-Georg Nothofer, Ulrich Scherf, Dieter Neher, Christoph Bräuchle, and Klaus Meerholz, “Improving the performance of doped π-conjugated polymers for use in organic light-emitting diodes,” Nature 405, 661–665 (2000).
[230] JR Schmidt, Priya V Parandekar, and John C Tully, “Mixed quantum-classical equilibrium: Surface hopping,” The Journal of Chemical Physics 129, 044104 (2008).
[231] Thomas E O’Brien, Bruno Senjean, Ramiro Sagastizabal, Xavier Bonet-Monroig, Alicja Dutkiewicz, Francesco Buda, Leonardo DiCarlo, and Lucas Visscher, “Calculating energy derivatives for quantum chemistry on a quantum computer,” npj Quantum Information 5, 1–12 (2019).
[232] John C Tully and Richard K Preston, “Trajectory surface hopping approach to nonadiabatic molecular collisions: the reaction of h+ with d2,” The Journal of Chemical Physics 55, 562–572 (1971).
[233] David R Weinberg, Christopher J Gagliardi, Jonathan F Hull, Christine Fecenko Murphy, Caleb A Kent, Brittany C Westlake, Amit Paul, Daniel H Ess, Dewey Granville McCafferty, and Thomas J Meyer, “Proton-coupled electron transfer,” Chemical Reviews 112, 4016–4093 (2012).
[234] Walter Kohn and Lu Jeu Sham, “Self-consistent equations including exchange and correlation effects,” Physical review 140, A1133 (1965).
[235] Bela Bauer, Dave Wecker, Andrew J Millis, Matthew B Hastings, and Matthias Troyer, “Hybrid quantum-classical approach to correlated materials,” Physical Review X 6, 031045 (2016).
[236] Ryan Babbush, Craig Gidney, Dominic W Berry, Nathan Wiebe, Jarrod McClean, Alexandru Paler, Austin Fowler, and Hartmut Neven, “Encoding electronic spectra in quantum circuits with linear t complexity,” Physical Review X 8, 041015 (2018).
[237] Dominic W Berry, Mária Kieferová, Artur Scherer, Yuval R Sanders, Guang Hao Low, Nathan Wiebe, Craig Gidney, and Ryan Babbush, “Improved techniques for preparing eigenstates of fermionic hamiltonians,” npj Quantum Information 4, 1–7 (2018).
[238] Pierre-Luc Dallaire-Demers, Jonathan Romero, Libor Veis, Sukin Sim, and Alán Aspuru-Guzik, “Low-depth circuit ansatz for preparing correlated fermionic states on a quantum computer,” arXiv preprint arXiv:1801.01053 (2018).
[239] Eugene F Dumitrescu, Alex J McCaskey, Gaute Hagen, Gustav R Jansen, Titus D Morris, T Papenbrock,
Raphael C Pooser, David Jarvis Dean, and Pavel Lougovski, “Cloud quantum computing of an atomic nucleus,” Physical Review Letters 120, 210501 (2018).
[240] Hsuan-Hao Lu, Natalie Klco, Joseph M Lukens, Titus D Morris, Aaina Bansal, Andreas Ekström, Gaute Hagen, Thomas Papenbrock, Andrew M Weiner, Martin J Savage, et al., “Simulations of subatomic many-body physics on a quantum frequency processor,” Physical Review A 100, 012320 (2019).
[241] Alessandro Roggero, Andy CY Li, Joseph Carlson, Rajan Gupta, and Gabriel N Perdue, “Quantum computing for neutrino-nucleus scattering,” Physical Review D 101, 074038 (2020).
[242] John Preskill, “Simulating quantum field theory with a quantum computer,” arXiv preprint arXiv:1811.10085 (2018).
[243] Suguru Endo, Iori Kurata, and Yuya O Nakagawa, “Calculation of the green’s function on near-term quantum computers,” Physical Review Research 2, 033281 (2020).
[244] Chinmay Mishra, Shane Thompson, Raphael Pooser, and George Siopsis, “Quantum computation of an interacting fermionic model,” Quantum Science and Technology 5, 035010 (2020).
[245] Danny Paulson, Luca Dellantonio, Jan F Haase, Alessio Celi, Angus Kan, Andrew Jena, Christian Kokail, Rick van Bijnen, Karl Jansen, Peter Zoller, et al., “Towards simulating 2d effects in lattice gauge theories on a quantum computer,” arXiv preprint arXiv:2008.09252 (2020).
[246] A Avkhadiev, PE Shanahan, and RD Young, “Accelerating lattice quantum field theory calculations via interpolator optimization using noisy intermediate-scale quantum computing,” Physical Review Letters 124, 080501 (2020).
[247] Jad C Halimeh, Valentin Kasper, and Philipp Hauke, “Fate of lattice gauge theories under decoherence,” arXiv preprint arXiv:2009.07848 (2020).
[248] Kunal Sharma, M. Cerezo, Zoë Holmes, Lukasz Cincio, Andrew Sornborger, and Patrick J Coles, “Reformulation of the no-free-lunch theorem for entangled data sets,” arXiv preprint arXiv:2007.04900 (2020).
[249] Francisco Barahona, Martin Grötschel, Michael Jünger, and Gerhard Reinelt, “An application of combinatorial optimization to statistical physics and circuit layout design,” Operations Research 36, 493–513 (1988).

preprint arXiv:1910.08980 (2019).
[255] Frank Arute, Kunal Arya, Ryan Babbush, Dave Bacon, Joseph C Bardin, Rami Barends, Sergio Boixo, Michael Broughton, Bob B Buckley, David A Buell, et al., “Quantum approximate optimization of non-planar graph problems on a planar superconducting processor,” arXiv preprint arXiv:2004.04197 (2020).
[256] Maria Schuld, Ilya Sinayskiy, and Francesco Petruccione, “An introduction to quantum machine learning,” Contemporary Physics 56, 172–185 (2015).
[257] Jacob Biamonte, Peter Wittek, Nicola Pancotti, Patrick Rebentrost, Nathan Wiebe, and Seth Lloyd, “Quantum machine learning,” Nature 549, 195–202 (2017).
[258] Amira Abbas, David Sutter, Christa Zoufal, Aurélien Lucchi, Alessio Figalli, and Stefan Woerner, “The power of quantum neural networks,” arXiv preprint arXiv:2011.00027 (2020).
[259] Nathan Wiebe, Ashish Kapoor, and Krysta M Svore, “Quantum deep learning,” arXiv preprint arXiv:1412.3489 (2014).
[260] Samuel Yen-Chi Chen, Chao-Han Huck Yang, Jun Qi, Pin-Yu Chen, Xiaoli Ma, and Hsi-Sheng Goan, “Variational quantum circuits for deep reinforcement learning,” IEEE Access 8, 141007–141024 (2020).
[261] Yuval R Sanders, Dominic W Berry, Pedro CS Costa, Louis W Tessler, Nathan Wiebe, Craig Gidney, Hartmut Neven, and Ryan Babbush, “Compilation of fault-tolerant quantum heuristics for combinatorial optimization,” PRX Quantum 1, 020312 (2020).

ACKNOWLEDGEMENTS

MC is thankful to Kunal Sharma for helpful discussions. MC was initially supported by the Laboratory Directed Research and Development (LDRD) program of Los Alamos National Laboratory (LANL) under project number 20180628ECR, and later supported by the Center for Nonlinear Studies at LANL. AA was initially supported by the LDRD program of LANL under project number 20200056DR, and later supported by the U.S. Department of Energy (DOE), Office of Science, Office of High Energy Physics QuantISED pro-

[250] Wolfgang Küchlin and Carsten Sinz, “Proving consis-
gram under under Contract Nos. DE-AC52-06NA25396
tency assertions for automotive product data manage- and KA2401032. SCB acknowledges financial support
ment,” Journal of Automated Reasoning 24, 145–163 from EPSRC Hub grants under the agreement num-
(2000). bers EP/M013243/1 and EP/T001062/1, and from EU
[251] Edward Farhi, Jeffrey Goldstone, and Sam Gutmann, H2020-FETFLAG-03-2018 under the grant agreement
“A Quantum Approximate Optimization Algorithm Ap- No 820495 (AQTION). SE was supported by MEXT
plied to a Bounded Occurrence Constraint Problem,” Quantum Leap Flagship Program (MEXT QLEAP)
arXiv preprint arXiv:1412.6062 (2014). Grant Number JPMXS0120319794, JPMXS0118068682
[252] Edward Farhi and Aram W Harrow, “Quantum and JST ERATO Grant Number JPMJER1601. KF was
supremacy through the quantum approximate opti- supported by JSPS KAKENHI Grant No. 16H02211,
mization algorithm,” arXiv preprint arXiv:1602.07674
JST ERATO JPMJER1601, and JST CREST JP-
(2016).
[253] Matthew B Hastings, “Classical and quantum bounded MJCR1673. KM was supported by JST PRESTO
depth approximation algorithms,” arXiv preprint Grant No. JPMJPR2019 and JSPS KAKENHI Grant
arXiv:1905.07047 (2019). No. 20K22330. KM and KF were also supported
[254] Sergey Bravyi, Alexander Kliesch, Robert Koenig, and by MEXT Quantum Leap Flagship Program (MEXT
Eugene Tang, “Obstacles to state preparation and vari- QLEAP) Grant Number JPMXS0118067394 and JP-
ational optimization from symmetry protection,” arXiv MXS0120319794. XY acknowledges support from the
Simons Foundation. LC was initially supported by the LDRD program of LANL under project
number 20190065DR, and later supported by the U.S. DOE, Office of Science, Office of
Advanced Scientific Computing Research, under the Quantum Computing Application Teams
(QCAT) program. PJC was initially supported by the LANL ASC Beyond Moore’s Law project, and
later supported by the U.S. DOE, Office of Science, Office of Advanced Scientific Computing
Research, under the Accelerated Research in Quantum Computing (ARQC) program. Most
recently, MC, LC, and PJC were supported by the Quantum Science Center (QSC), a National
Quantum Information Science Research Center of the U.S. Department of Energy (DOE).
AUTHOR CONTRIBUTIONS
All authors have read, discussed and contributed to the writing of the manuscript.
COMPETING INTERESTS
The authors declare no competing interests.