Quantum Convolutional Neural Networks

Uploaded by

Sergio Salazar
Copyright
© © All Rights Reserved

Articles

https://doi.org/10.1038/s41567-019-0648-8

Quantum convolutional neural networks


Iris Cong1, Soonwon Choi1,2* and Mikhail D. Lukin1

Neural network-based machine learning has recently proven successful for many complex applications ranging from image
recognition to precision medicine. However, its direct application to problems in quantum physics is challenging due to the
exponential complexity of many-body systems. Motivated by recent advances in realizing quantum information processors, we
introduce and analyse a quantum circuit-based algorithm inspired by convolutional neural networks, a highly effective model
in machine learning. Our quantum convolutional neural network (QCNN) uses only O(log(N)) variational parameters for input
sizes of N qubits, allowing for its efficient training and implementation on realistic, near-term quantum devices. To explicitly
illustrate its capabilities, we show that QCNNs can accurately recognize quantum states associated with a one-dimensional
symmetry-protected topological phase, with performance surpassing existing approaches. We further demonstrate that
QCNNs can be used to devise a quantum error correction scheme optimized for a given, unknown error model that substantially
outperforms known quantum codes of comparable complexity. The potential experimental realizations and generalizations of
QCNNs are also discussed.

The complex nature of quantum many-body systems motivates the use of machine learning techniques to analyse them. Indeed, large-scale neural networks have successfully solved classically difficult problems such as image recognition or optimization of classical error correction1, and their architectures have been related to various physical concepts2,3. As such, a number of recent works have used neural networks to study properties of quantum many-body systems4–10. However, the direct application of these classical algorithms is challenging for intrinsically quantum problems, which take quantum states or processes as inputs. This is because the extremely large many-body Hilbert space hinders the efficient translation of such problems into a classical framework without performing exponentially difficult quantum state or process tomography11,12.

Recent experimental progress towards realizing quantum information processors13–16 has led to proposals for the use of quantum computers to enhance conventional machine learning tasks17–20. Motivated by such developments, we introduce and analyse a machine learning-inspired quantum circuit model, the quantum convolutional neural network (QCNN), and demonstrate its ability to solve important classes of intrinsically quantum many-body problems. The first class of problems we consider is quantum phase recognition (QPR), which asks whether a given input quantum state ρin belongs to a particular quantum phase of matter. In contrast to many existing schemes based on tensor network descriptions21–23, we assume that ρin is prepared in a physical system without direct access to its classical description. The second class, quantum error correction (QEC) optimization, asks for an optimal QEC code for a given, a priori unknown error model such as dephasing or potentially correlated depolarization in realistic experimental settings. We provide both theoretical insight and numerical demonstrations for the successful application of a QCNN to these important problems, and show its feasibility for near-term experimental implementation.

QCNN circuit model
Convolutional neural networks provide a successful machine learning architecture for classification tasks such as image recognition1,24,25. A CNN generally consists of a sequence of different (interleaved) layers of image processing; in each layer, an intermediate two-dimensional array of pixels, called a feature map, is produced from the previous one (Fig. 1a). (More generally, CNN layers connect 'volumes' of multiple feature maps to subsequent volumes; for simplicity, we consider only a single feature map per volume and leave the generalization to future works.) The convolution layers compute new pixel values $x^{(\ell)}_{i,j}$ from a linear combination of nearby ones in the preceding map: $x^{(\ell)}_{i,j} = \sum_{a,b=1}^{w} w_{a,b}\, x^{(\ell-1)}_{i+a,\,j+b}$, where the weights $w_{a,b}$ form a w × w matrix. Pooling layers reduce feature map size, for example by taking the maximum value from a few contiguous pixels, and are often followed by the application of a nonlinear (activation) function. Once the feature map size becomes sufficiently small, the final output is computed from a function that depends on all remaining pixels (the fully connected layer). The weights and fully connected function are optimized by training on large datasets. In contrast, variables such as the number of convolution and pooling layers and the size w of the weight matrices (known as hyperparameters) are fixed for a specific CNN1. The key properties of a CNN are thus translationally invariant convolution and pooling layers, each characterized by a constant number of parameters (independent of system size) and sequential data size reduction (that is, a hierarchical structure).

Motivated by this architecture, we introduce a QCNN circuit model extending these key properties to the quantum domain (Fig. 1b). The circuit's input is an unknown quantum state ρin. A convolution layer applies a single quasilocal unitary (Ui) in a translationally invariant manner for finite depth. For pooling, a fraction of qubits are measured, and their outcomes determine unitary rotations (Vj) applied to nearby qubits. Hence, nonlinearities in QCNNs arise from reducing the number of degrees of freedom. Convolution and pooling layers are performed until the system size is sufficiently small; then, a fully connected layer is applied as a unitary F on the remaining qubits. Finally, the outcome of the circuit is obtained by measuring a fixed number of output qubits. As in the classical case, QCNN hyperparameters such as the number of convolution and pooling layers are fixed, and the unitaries themselves are learned.
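
The classical convolution and pooling steps described above can be made concrete with a short NumPy sketch. It is only an illustration under stated assumptions: the function names `convolve` and `max_pool`, the 5 × 5 test image and the 2 × 2 averaging filter are hypothetical choices, and the offsets a, b run from 0 rather than 1 to match Python indexing.

```python
import numpy as np

def convolve(x, w):
    """Apply x_new[i, j] = sum_{a,b} w[a, b] * x[i + a, j + b] on the valid region."""
    k = w.shape[0]
    rows, cols = x.shape[0] - k + 1, x.shape[1] - k + 1
    out = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            # Linear combination of the k x k neighbourhood starting at (i, j).
            out[i, j] = np.sum(w * x[i:i + k, j:j + k])
    return out

def max_pool(x, size=2):
    """Keep the maximum of each non-overlapping size x size block."""
    rows, cols = x.shape[0] // size, x.shape[1] // size
    blocks = x[:rows * size, :cols * size].reshape(rows, size, cols, size)
    return blocks.max(axis=(1, 3))

image = np.arange(25, dtype=float).reshape(5, 5)  # hypothetical input "image"
weights = np.ones((2, 2)) / 4.0                   # hypothetical 2 x 2 averaging filter
feature_map = convolve(image, weights)            # feature map of shape (4, 4)
pooled = max_pool(feature_map)                    # reduced feature map of shape (2, 2)
```

The same weight matrix is reused at every position, which is why the number of parameters per layer does not depend on the image size, while pooling shrinks the feature map at each step.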

1Department of Physics, Harvard University, Cambridge, MA, USA. 2Department of Physics, University of California, Berkeley, Berkeley, CA, USA.
*e-mail: [email protected]

Nature Physics | VOL 15 | December 2019 | 1273–1278 | www.nature.com/naturephysics 1273


A QCNN to classify N-qubit input states is thus characterized by O(log(N)) parameters. This corresponds to a double exponential reduction compared with a generic quantum circuit-based classifier19 and allows for efficient learning and implementation. For example, given a set of M classified training vectors {(|ψα〉, yα): α = 1, …, M}, where |ψα〉 are input states and yα = 0 or 1 are corresponding binary classification outputs, one could compute the mean squared error

$$\mathrm{MSE} = \frac{1}{2M} \sum_{\alpha=1}^{M} \left( y_\alpha - f_{\{U_i, V_j, F\}}(|\psi_\alpha\rangle) \right)^2 \qquad (1)$$

Here, $f_{\{U_i, V_j, F\}}(|\psi_\alpha\rangle)$ denotes the expected QCNN output value for input |ψα〉. Learning then consists of initializing all unitaries and successively optimizing them until convergence, for example via gradient descent.

To gain physical insight into the mechanism underlying QCNNs and motivate their application to the problems under consideration, we now relate our circuit model to two well-known concepts in quantum information theory: the multiscale entanglement renormalization ansatz26 (MERA) and QEC. The MERA framework provides an efficient tensor network representation of many classes of interesting many-body wavefunctions, including those associated with critical systems26–28. A MERA can be understood as a quantum state generated by a sequence of unitary and isometry layers applied to an input state (for example |00〉). While both types of layers apply quasilocal unitary gates, each isometry layer first introduces a set of new qubits in a predetermined state, such as |0〉 (Fig. 1c). This exponentially growing, hierarchical structure allows for the long-range correlations associated with critical systems. The QCNN circuit has similar structure but runs in the reverse direction. Hence, for any given state |ψ〉 with a MERA representation, there is always a QCNN that recognizes |ψ〉 with deterministic measurement outcomes; one such QCNN is simply the inverse of the MERA circuit.

For input states other than |ψ〉, however, such a QCNN does not generally produce deterministic measurement outcomes. These additional degrees of freedom distinguish a QCNN from a MERA. Specifically, we can identify the measurements as syndrome measurements in QEC29, which determine error correction unitaries Vj to apply to the remaining qubit(s). Thus, a QCNN circuit with multiple pooling layers can be viewed as a combination of a MERA (an important variational ansatz for many-body wavefunctions) and nested QEC (a mechanism to detect and correct local quantum errors without collapsing the wavefunction). This makes QCNNs a powerful architecture for classifying input quantum states or devising new QEC codes. In particular, for QPR, the QCNN can provide a MERA realization of a representative state |ψ0〉 in the target phase. Other input states within the same phase can be viewed as |ψ0〉 with local errors, which are repeatedly corrected by the QCNN in multiple layers. In this sense, the QCNN circuit can mimic renormalization-group flow, a methodology that successfully classifies many families of quantum phases30. For QEC optimization, the QCNN structure allows for simultaneous optimization of efficient encoding and decoding schemes with potentially rich entanglement structure.

Fig. 1 | The concept of QCNNs. a, Simplified illustration of classical CNNs. A sequence of image-processing layers transforms an input image into a series of feature maps (blue rectangles) and finally into an output probability distribution (purple bars). C, convolution; P, pooling; FC, fully connected. b, QCNNs inherit a similar layered structure. Boxes represent unitary gates or measurement with feed-forwarding. c, The QCNN and the MERA share the same circuit structure, but run in reverse directions. Image of cat from https://www.pexels.com/photo/grey-and-white-short-fur-cat-104827/.

Detecting a 1D symmetry-protected topological phase
We first demonstrate the potential of a QCNN by applying it to QPR in a class of 1D many-body systems. Specifically, we consider a Z2 × Z2 symmetry-protected topological (SPT) phase $\mathcal{P}$, a phase containing the S = 1 Haldane chain31, and ground states {|ψG〉} of a family of Hamiltonians on a spin-1/2 chain with open boundary conditions:

$$H = -J \sum_{i=1}^{N-2} Z_i X_{i+1} Z_{i+2} - h_1 \sum_{i=1}^{N} X_i - h_2 \sum_{i=1}^{N-1} X_i X_{i+1} \qquad (2)$$

where Xi, Zi are Pauli operators for the spin at site i, and h1, h2 and J are parameters of the Hamiltonian. The Z2 × Z2 symmetry is generated by $X_{\mathrm{even(odd)}} = \prod_{i \in \mathrm{even(odd)}} X_i$. Figure 2a shows the phase diagram as a function of (h1/J, h2/J). When h2 = 0, the Hamiltonian is exactly solvable via the Jordan–Wigner transformation30, confirming that $\mathcal{P}$ is characterized by non-local order parameters. When h1 = h2 = 0, all terms are mutually commuting, and a ground state is the 1D cluster state. Our goal is to identify whether a given, unknown ground state drawn from the phase diagram belongs to $\mathcal{P}$.

As an example, we first present an exact, analytical QCNN circuit that recognizes $\mathcal{P}$ (Fig. 2b). The convolution layers involve controlled-phase gates as well as Toffoli gates with controls in the X-basis, and pooling layers perform phase-flips on remaining qubits when one adjacent measurement yields X = −1. This convolution–pooling unit is repeated d times, where d is the QCNN depth. The fully connected layer measures Zi−1XiZi+1 on the remaining qubits. Figure 2c shows the QCNN output for a system of N = 135 spins and d = 1, …, 4 along h1 = 0.5J, obtained using matrix product state simulations. As d increases, the measurement outcomes show sharper changes around the critical point, and the output of a d = 2 circuit already reproduces the phase diagram with high accuracy (Fig. 2a). This QCNN can also be used for other Hamiltonian models belonging to the same phase, such as the S = 1 Haldane chain31 (see Methods).
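
As a small numerical check of the Hamiltonian in equation (2), the following NumPy sketch builds it as a dense matrix for a short open chain. This is not the matrix product state method used for the results in the text; it is a brute-force illustration that only works for a handful of spins. The helper names `op_string` and `hamiltonian`, the chain length n = 5 and the field values 0.3 and 0.2 are hypothetical choices.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

def op_string(n, ops):
    """Tensor product over n sites; ops maps a 0-based site index to a 2x2 matrix."""
    return reduce(np.kron, [ops.get(site, I2) for site in range(n)])

def hamiltonian(n, J, h1, h2):
    """Dense matrix of equation (2) for an open chain of n spins."""
    H = np.zeros((2 ** n, 2 ** n))
    for i in range(n - 2):
        H -= J * op_string(n, {i: Z, i + 1: X, i + 2: Z})
    for i in range(n):
        H -= h1 * op_string(n, {i: X})
    for i in range(n - 1):
        H -= h2 * op_string(n, {i: X, i + 1: X})
    return H

n = 5
# Cluster point h1 = h2 = 0: the n - 2 ZXZ terms commute, so E0 = -(n - 2) * J.
ground_energy = np.linalg.eigvalsh(hamiltonian(n, J=1.0, h1=0.0, h2=0.0))[0]

# One generator of the Z2 x Z2 symmetry: X on every 1-based odd site.
X_odd = op_string(n, {site: X for site in range(0, n, 2)})
H = hamiltonian(n, J=1.0, h1=0.3, h2=0.2)
commutator_norm = np.linalg.norm(H @ X_odd - X_odd @ H)
```

At h1 = h2 = 0 the lowest eigenvalue equals −(n − 2)J because all ZXZ terms can be satisfied simultaneously, and the vanishing commutator confirms that the product of X over odd sites is a symmetry for generic h1 and h2.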

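
The mean squared error of equation (1) and the gradient-descent learning loop can be illustrated on a deliberately tiny stand-in for a QCNN. Everything specific here is an assumption made for illustration: the "circuit" is a single rotation Ry(θ) on one qubit, the output f is the probability of measuring |0〉, the labels are invented, and the gradient is estimated by a central finite difference. It is a sketch of the training idea, not the authors' procedure.

```python
import numpy as np

def model_output(theta, phi):
    """Toy stand-in for f: probability of measuring |0> after Ry(theta) acts on
    the single-qubit state cos(phi/2)|0> + sin(phi/2)|1>."""
    return np.cos((phi + theta) / 2.0) ** 2

def mse(theta, phis, labels):
    """Equation (1): (1 / 2M) * sum over alpha of (y_alpha - f(psi_alpha))^2."""
    return np.sum((labels - model_output(theta, phis)) ** 2) / (2 * len(labels))

def train(phis, labels, theta, lr=0.5, eps=1e-4, steps=200):
    """Gradient descent using a central finite-difference gradient estimate."""
    for _ in range(steps):
        grad = (mse(theta + eps, phis, labels) - mse(theta - eps, phis, labels)) / (2 * eps)
        theta -= lr * grad
    return theta

phis = np.linspace(0.0, np.pi, 21)           # hypothetical training inputs
labels = (phis < np.pi / 2).astype(float)    # hypothetical binary labels y_alpha
theta_init = 2.0                             # arbitrary starting parameter
theta_fit = train(phis, labels, theta=theta_init)
loss_before = mse(theta_init, phis, labels)
loss_after = mse(theta_fit, phis, labels)
```

In a real QCNN the single parameter θ is replaced by the parameters of the unitaries Ui, Vj and F, and each evaluation of f requires running the circuit, which is why a small parameter count matters for training cost.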


[Fig. 2 graphic. a, phase diagram with axes h1/J and h2/J showing paramagnetic, SPT and antiferromagnetic regions; b, circuit diagram with convolution (C), pooling (P) and fully connected (FC) layers, the convolution–pooling unit repeated ×d; c, QCNN output 〈X〉 versus h2/J; d, sample complexity versus h2/J, with an inset showing the reduction versus depth d = 1, …, 4.]

Fig. 2 | Application to quantum phase recognition. a, The phase diagram of the Hamiltonian in the main text. The phase boundary points (blue and red
diamonds) are extracted from infinite-size DMRG numerical simulations, while the background shading (colour scale) represents the output from the
exact QCNN circuit for input size N = 45 spins (see Methods). b, Exact QCNN circuit to recognize a Z2 × Z2 SPT phase. Blue line segments represent
controlled-phase gates, blue three-qubit gates are Toffoli gates with the control qubits in the X basis, and orange two-qubit gates flip the target qubit’s
phase when the X measurement yields −1. The fully connected layer applies controlled-phase gates followed by an Xi projection, effectively measuring
Zi−1XiZi+1. c, Exact QCNN output along h1 = 0.5J for N = 135 spins, depths d = 1, …, 4 (from light to dark blue). d, Sample complexity of QCNN at depths d = 1,
…, 4 (from light to dark blue) versus SOPs of length N/2, N/3, N/5 and N/6 (from light to dark red) to detect the SPT/paramagnet phase transition along
h1 = 0.5J for N = 135 spins. The critical point is identified as h2/J = 0.423 using infinite-size DMRG. In the shaded area, the correlation length exceeds the
system size, and finite-size effects can considerably affect our results. Inset: the ratio of SOP sample complexity to QCNN sample complexity is plotted as
a function of d on a logarithmic scale for h2/J = 0.3918. In the numerically accessible regime, this reduction of sample complexity scales exponentially as
$1.73\,e^{0.28d}$ (trendline).

Sample complexity. The performance of a QPR solver can be quantified by sample complexity11: what is the expected number of copies of the input state required to identify its quantum phase? We demonstrate that the sample complexity of our exact QCNN circuit is substantially better than that of conventional methods. In principle, $\mathcal{P}$ can be detected by measuring a non-zero expectation value of string order parameters (SOPs)32,33 $\mathcal{S}$ such as

$$\mathcal{S}_{ab} = Z_a X_{a+1} X_{a+3} \cdots X_{b-3} X_{b-1} Z_b \qquad (3)$$

In practice, however, the expectation values of SOP vanish near the phase boundary due to diverging correlation length33; since quantum projection noise is maximal in this vicinity, many experimental repetitions are required to affirm a non-zero expectation value. In contrast, the QCNN output is much sharper near the phase transition, so fewer repetitions are required.

Quantitatively, given some |ψin〉 and SOP $\mathcal{S}$, a projective measurement of $\mathcal{S}$ can be modelled as a (generalized) Bernoulli random variable, where the outcome is 1 with probability $p = (\langle\psi_{\mathrm{in}}|\mathcal{S}|\psi_{\mathrm{in}}\rangle + 1)/2$ and −1 with probability 1 − p (since $\mathcal{S}^2$ equals the identity operator); after M binary measurements, we estimate p. Here p > p0 = 0.5 signifies $|\psi_{\mathrm{in}}\rangle \in \mathcal{P}$. We define the sample complexity Mmin as the minimum M to test whether p > p0 with 95% confidence using an arcsine variance-stabilizing transformation34:

$$M_{\min} = \frac{1.96^2}{\left(\arcsin\sqrt{p} - \arcsin\sqrt{p_0}\right)^2} \qquad (4)$$

Similarly, the sample complexity for a QCNN can be determined by replacing $\langle\psi_{\mathrm{in}}|\mathcal{S}|\psi_{\mathrm{in}}\rangle$ by the QCNN output expectation value in the expression for p.

Figure 2d shows the sample complexity for the QCNN at various depths and SOPs of different lengths. The QCNN clearly requires substantially fewer input copies throughout the parameter regime, especially near criticality. More importantly, although the SOP sample complexity scales independently of string length, the QCNN sample complexity consistently improves with increasing depth and is limited only by finite-size effects in our simulations. In particular, compared with SOPs, the QCNN reduces sample complexity by a factor that scales exponentially with the depth of the QCNN in numerically accessible regimes (inset). Such scaling arises from the iterative QEC performed at each depth and is not expected from any measurements of simple (potentially nonlocal) observables. We show in the Methods that our QCNN circuit measures a multiscale string order parameter, that is, a sum of products of exponentially many different SOPs that remains sharp up to the phase boundary.

Fig. 3 | MERA and QEC in the QCNN circuit. When the input state is the cluster state with a single-qubit X error (left), the convolution–pooling unit of our circuit identifies and corrects the error, while reducing the system size by a factor of 3 (right). This process resembles the combination of MERA and QEC.

Fig. 4 | Output of a trained QCNN. We numerically optimize our QCNN for a system of N = 15 spins and d = 1 starting from random initial values. The training data points are 40 equally spaced points h1 ∈ [0, 2] along the line h2 = 0 where the Hamiltonian is solvable by the Jordan–Wigner transformation (grey dots). The blue and red diamonds are phase boundary points extracted from infinite-size DMRG numerical simulations, while the background shading (colour scale) represents the expectation value of QCNN output.
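
The sample-complexity estimate of equation (4) is simple enough to evaluate directly. The sketch below uses only the Python standard library; the function names and the two example expectation values (0.1 and 0.8) are hypothetical and chosen only to contrast a weak signal with a sharp one.

```python
import math

def success_probability(expectation):
    """p = (<S> + 1) / 2 for an observable whose outcomes are +1 and -1."""
    return (expectation + 1.0) / 2.0

def min_samples(expectation, p0=0.5, z=1.96):
    """Equation (4): M_min = z^2 / (arcsin(sqrt(p)) - arcsin(sqrt(p0)))^2."""
    p = success_probability(expectation)
    gap = math.asin(math.sqrt(p)) - math.asin(math.sqrt(p0))
    return z ** 2 / gap ** 2

weak_signal = min_samples(0.1)    # small expectation value, as near a phase boundary
strong_signal = min_samples(0.8)  # sharper output, far fewer copies needed
```

Because the denominator shrinks quadratically as p approaches p0, an observable whose expectation value vanishes near the phase boundary needs many more copies of the input state than one whose output stays sharp there.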

MERA and QEC. Further insights into the performance of the QCNN are revealed by interpreting it in terms of MERA and QEC. In particular, our QCNN is specifically designed to contain the MERA representation of the 1D cluster state (|ψ0〉) such that it becomes a stable fixed point. When |ψ0〉 is used as input, each convolution–pooling unit produces the same state |ψ0〉 with reduced system size in the unmeasured qubits, while yielding deterministic outcomes (X = 1) in the measured qubits. The fully connected layer measures the SOP for |ψ0〉. When an input wavefunction is perturbed away from |ψ0〉, our QCNN corrects such 'errors'. For example, if a single X error occurs, the first pooling layer identifies its location, and controlled unitary operations correct the error propagated through the circuit (Fig. 3). Similarly, if an initial state has multiple, sufficiently separated errors (possibly in coherent superpositions), the error density after several iterations of convolution and pooling layers will be substantially smaller35. If the input state converges to the fixed point, our QCNN classifies it into the SPT phase with high fidelity. Clearly, this mechanism resembles the classification of quantum phases based on renormalization-group flow.

Obtaining QCNN from training procedure. Having analytically illustrated the computational power of the QCNN circuit model, we now demonstrate how a QCNN for $\mathcal{P}$ can also be obtained using the learning procedure. Details of the hyperparameters of the QCNN can be found in the Methods and Supplementary Fig. 2. Initially, all unitaries are set to random values. As classically simulating our training procedure requires expensive computational resources, we focus on a relatively small system with N = 15 spins and QCNN depth d = 1; there are a total of 1,309 parameters to be learned (see Methods). Our training data consists of 40 evenly spaced points along the line h2 = 0, where the Hamiltonian is exactly solvable by the Jordan–Wigner transformation. Using gradient descent with the MSE function (1), we iteratively update the unitaries until convergence (see Methods). The classification output of the resulting QCNN for generic h2 is shown in Fig. 4. This QCNN accurately reproduces the two-dimensional phase diagram over the entire parameter regime, despite being trained only on samples from a set of solvable points that do not even cross the lower phase boundary.

This example illustrates how the QCNN structure avoids overfitting to training data with its exponentially reduced number of parameters. While the training dataset for this particular QPR problem consists of solvable points, more generally, such a dataset can be obtained by using traditional methods (such as measuring SOPs) to classify representative states that can be efficiently generated either numerically or experimentally36,37.

Optimizing QEC
As seen in the previous example, the QCNN's architecture enables one to perform effective QEC. We next leverage this feature to design a new QEC code that is itself optimized for a given error model. More specifically, any QCNN circuit (and its inverse) can be viewed as a decoding (encoding) quantum channel between the physical input qubits and the logical output qubit. The encoding scheme introduces sets of new qubits in a predetermined state, for example |0〉, while the decoding scheme performs measurements (Fig. 5a). Given an error channel $\mathcal{N}$, our aim is therefore to maximize the recovery fidelity

$$f = \sum_{|\psi_l\rangle \in \{\pm x, y, z\}} \langle\psi_l|\, \mathcal{M}^{-1}\!\left(\mathcal{N}\!\left(\mathcal{M}(|\psi_l\rangle\langle\psi_l|)\right)\right) |\psi_l\rangle \qquad (5)$$

where $\mathcal{M}$ ($\mathcal{M}^{-1}$) is the encoding (decoding) scheme generated by a QCNN circuit, and |±x, y, z〉 are the ±1 eigenstates of the Pauli matrices. Thus, our method simultaneously optimizes both encoding and decoding schemes, while ensuring their efficient implementation in realistic systems. The variational optimization can be carried out with an unknown $\mathcal{N}$, since f can be evaluated experimentally.

To illustrate the potential of this procedure, we consider a two-layer QCNN with N = 9 physical qubits and 126 variational parameters (Fig. 5a and Methods). This particular architecture includes the nested (classical) repetition codes and the 9-qubit Shor code38; in the following, we compare our performance to the better of the two. We consider three different input error models: (1) independent single-qubit errors on all qubits with equal probabilities pμ for μ = X, Y and Z errors; (2) independent single-qubit errors with anisotropic probabilities px ≠ py = pz; and (3) independent single-qubit anisotropic errors with additional two-qubit correlated errors XiXi+1 with probability pxx.

On initializing all QCNN parameters to random values and numerically optimizing them to maximize f, we find that our model
produces the same logical error rate as known codes in case (1) but can reduce the error rate by a constant factor of up to 50% in case (2), depending on the specific input error probability ratios (see Methods and Supplementary Fig. 4). In case (3), the optimized QEC code performs substantially better than known codes (Fig. 5b). As the Shor code is only guaranteed to correct arbitrary single-qubit errors, it performs more poorly than using no error correction, while the optimized QEC code performs much better. This example demonstrates the power of using QCNNs to obtain and optimize new QEC codes for realistic, a priori unknown error models.

Fig. 5 | QCNN for optimizing quantum error correction. a, Schematic for using QCNNs to optimize QEC. The inverse QCNN encodes a single logical qubit |ψl〉 into nine physical qubits, which undergo noise $\mathcal{N}$. The QCNN then decodes these to obtain the logical state ρ. Our aim is to maximize 〈ψl|ρ|ψl〉. b, Logical error rate of Shor code (blue) versus a learned QEC code (orange) in a correlated error model, plotted against the input error rate. The input error rate is defined as the sum of all probabilities pμ and pxx. The performance of the Shor code is worse than performing no error correction at all (identity, grey line), while the optimized code can still substantially reduce the error rate.

Experimental realizations
Our QCNN architecture can be efficiently implemented on several state-of-the-art experimental platforms. The key ingredients for realizing QCNNs include the efficient preparation of quantum many-body input states, the application of two-qubit gates at various length scales and projective measurements. As in stabilizer-based QEC, the measurements of intermediate qubits and feed-forwarding can be replaced by controlled two-qubit unitary operations so that measurements are performed only at the end of an experimental sequence. These capabilities have already been demonstrated in multiple programmable quantum simulators consisting of N ≥ 50 qubits based on trapped neutral atoms and ions, or superconducting qubits39–42.

As an example, we present a feasible protocol for near-term implementation of our exact cluster model QCNN circuit via neutral Rydberg atoms39,43, where long-range dipolar interactions allow high-fidelity entangling gates44 among distant qubits in a variable geometric arrangement. The qubits can be encoded in the hyperfine ground states, where one of the states can be coupled to the Rydberg level to perform efficient entangling operations via the Rydberg-blockade mechanism44; an explicit implementation scheme for every gate in Fig. 2b is provided in the Methods. Our QCNN at depth d with N input qubits requires at most $\frac{7N}{2}\left(1 - 3^{1-d}\right) + N\,3^{1-d}$ multi-qubit operations and 4d single-qubit rotations. For a realistic effective coupling strength Ω ≈ 2π × 10–100 MHz and single-qubit coherence time τ ≈ 200 μs limited by the Rydberg state lifetime, approximately $\Omega\tau \approx 2\pi \times 10^{3}$–$10^{4}$ multi-qubit operations can be performed, and a d = 4 QCNN on N ≈ 100 qubits is feasible. These estimates are reasonably conservative as we have not considered advanced control techniques such as pulse-shaping45 or potentially parallelizing independent multi-qubit operations.

Outlook
These considerations indicate that QCNNs provide a promising quantum machine learning paradigm. Several interesting generalizations and future directions can be considered. For example, while we have only presented the QCNN circuit structure for recognizing 1D phases, it is straightforward to generalize the model to higher dimensions, where phases with intrinsic topological order such as the toric code are supported46–48. Such studies could potentially identify nonlocal order parameters with low sample complexity for lesser-understood phases such as quantum spin liquids49 or anyonic chains50. To recognize more exotic phases, we could also relax the translation-invariance constraints, resulting in O(Nlog(N)) parameters for system size N, or use ancilla qubits to implement parallel feature maps following traditional CNN architecture. Further extensions could incorporate optimizations for fault-tolerant operations on QEC code spaces. Finally, while we have used a finite-difference scheme to compute gradients in our learning demonstrations, the structural similarity of a QCNN to its classical counterpart suggests the possibility of adopting more efficient schemes such as backpropagation1.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, statements of code and data availability and associated accession codes are available at https://doi.org/10.1038/s41567-019-0648-8.

Received: 9 October 2018; Accepted: 23 July 2019; Published online: 26 August 2019

References
1. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
2. Lin, H. W., Tegmark, M. & Rolnick, D. Why does deep and cheap learning work so well? J. Stat. Phys. 168, 1223–1247 (2017).
3. Mehta, P. & Schwab, D. J. An exact mapping between the variational renormalization group and deep learning. Preprint at https://arxiv.org/abs/1410.3831 (2014).
4. Carleo, G. & Troyer, M. Solving the quantum many-body problem with artificial neural networks. Science 355, 602–606 (2017).
5. van Nieuwenburg, E. P. L., Liu, Y. H. & Huber, S. D. Learning phase transitions by confusion. Nat. Phys. 13, 435–439 (2017).
6. Carrasquilla, J. & Melko, R. G. Machine learning phases of matter. Nat. Phys. 13, 431–434 (2017).
7. Wang, L. Discovering phase transitions with unsupervised learning. Phys. Rev. B 94, 195105 (2016).
8. Levine, Y., Cohen, N. & Shashua, A. Quantum entanglement in deep learning architectures. Phys. Rev. Lett. 122, 065301 (2019).
9. Zhang, Y. & Kim, E.-A. Quantum loop topography for machine learning. Phys. Rev. Lett. 118, 216401 (2017).

10. Maskara, N., Kubica, A. & Jochym-O’Connor, T. Advantages of versatile neural-network decoding for topological codes. Phys. Rev. A 99, 052351 (2019).
11. Haah, J., Harrow, A. W., Ji, Z., Wu, X. & Yu, N. Sample-optimal tomography of quantum states. IEEE Trans. Inform. Theory 63, 5628–5641 (2017).
12. Lee, J. Y. & Landon-Cardinal, O. Practical variational tomography for critical one-dimensional systems. Phys. Rev. A 91, 062128 (2019).
13. Ladd, T. D. et al. Quantum computers. Nature 464, 45–53 (2010).
14. Monroe, C. & Kim, J. Scaling the ion trap quantum processor. Science 339, 1164–1169 (2013).
15. Devoret, M. H. & Schoelkopf, R. J. Superconducting circuits for quantum information: an outlook. Science 339, 1169–1174 (2013).
16. Awschalom, D. D., Bassett, L. C., Dzurak, A. S., Hu, E. L. & Petta, J. R. Quantum spintronics: engineering and manipulating atom-like spins in semiconductors. Science 339, 1174–1179 (2013).
17. Biamonte, J. et al. Quantum machine learning. Nature 549, 195–202 (2017).
18. Dunjko, V., Taylor, J. M. & Briegel, H. J. Quantum-enhanced machine learning. Phys. Rev. Lett. 117, 130501 (2016).
19. Farhi, E. & Neven, H. Classification with quantum neural networks on near term processors. Preprint at https://arxiv.org/abs/1802.06002 (2018).
20. Huggins, W., Patil, P., Mitchell, B., Whaley, K. B. & Stoudenmire, E. M. Towards quantum machine learning with tensor networks. Quantum Sci. Technol. 4, 024001 (2018).
21. Huang, C.-Y., Chen, X. & Lin, F.-L. Symmetry-protected quantum state renormalization. Phys. Rev. B 88, 205124 (2013).
22. Singh, S. & Vidal, G. Symmetry-protected entanglement renormalization. Phys. Rev. B 88, 121108(R) (2013).
23. Kim, I. & Swingle, B. Robust entanglement renormalization on a noisy quantum computer. Preprint at https://arxiv.org/abs/1711.07500 (2017).
24. LeCun, Y. & Bengio, Y. in The Handbook of Brain Theory and Neural Networks 255–258 (MIT Press, 1995).
25. Krizhevsky, A., Sutskever, I. & Hinton, G. E. Imagenet classification with deep convolutional neural networks. In Proc. 25th Int. Conf. Neural Information Processing Systems Vol. 1, 1097–1105 (2012).
26. Vidal, G. Class of many-body states that can be efficiently simulated. Phys.
37. Ge, Y., Molnar, A. & Cirac, J. I. Rapid adiabatic preparation of injective projected entangled pair states and Gibbs states. Phys. Rev. Lett. 116, 080503 (2016).
38. Shor, P. Scheme for reducing decoherence in quantum computer memory. Phys. Rev. A 52, R2493(R) (1995).
39. Bernien, H. et al. Probing many-body dynamics on a 51-atom quantum simulator. Nature 551, 579–584 (2017).
40. Zhang, J. et al. Observation of a many-body dynamical phase transition with a 53-qubit quantum simulator. Nature 551, 601–604 (2017).
41. Brydges, T. et al. Probing Rényi entanglement entropy via randomized measurements. Science 364, 260–263 (2019).
42. Harris, R. et al. Phase transitions in a programmable quantum spin glass simulator. Science 361, 162–165 (2018).
43. Labuhn, H. et al. Tunable two-dimensional arrays of single Rydberg atoms for realizing quantum ising models. Nature 534, 667–670 (2016).
44. Levine, H. et al. High-fidelity control and entanglement of Rydberg atom qubits. Phys. Rev. Lett. 121, 123603 (2018).
45. Freeman, R. Shaped radiofrequency pulses in high resolution NMR. Progr. Nucl. Magn. Reson. Spectrosc. 32, 59–106 (1998).
46. Kitaev, A. Y. Fault-tolerant quantum computation by anyons. Ann. Phys. 303, 2–30 (2003).
47. Levin, M. A. & Wen, X.-G. String-net condensation: a physical mechanism for topological phases. Phys. Rev. B 71, 045110 (2005).
48. Schuch, N., Pérez-García, D. & Cirac, J. I. PEPS as ground states: degeneracy and topology. Ann. Phys. 325, 2153 (2010).
49. Savary, L. & Balents, L. Quantum spin liquids: a review. Rep. Prog. Phys. 80, 016502 (2017).
50. Feiguin, A. et al. Interacting anyons in topological quantum liquids: the golden chain. Phys. Rev. Lett. 98, 160409 (2007).

Acknowledgements
We thank I. Cirac, E. Farhi, W. W. Ho, C. Nayak, H. Pichler, J. Preskill, X. Qi, A. Vishwanath, Z. Wang and X.-G. Wen for insightful discussions. I.C. acknowledges support from the Paul and Daisy Soros Fellowship, the Alfred Spector and Rhonda Kost Fellowship of the Hertz Foundation, and the Department of Defense through
Rev. Lett. 101, 110501 (2008). the National Defense Science and Engineering Graduate Fellowship Program. S.C.
27. Aguado, M. & Vidal, G. Entanglement renormalization and topological order. acknowledges support from the Miller Institute for Basic Research in Science. This
Phys. Rev. Lett. 100, 070404 (2008). work was supported through the National Science Foundation, the Center for Ultracold
28. Pfeifer, R. N. C., Evenbly, G. & Vidal, G. Entanglement renormalization, scale Atoms, the Vannevar Bush Faculty Fellowship and Google Research Award.
invariance, and quantum criticality. Phys. Rev. A 79, 040301 (2009).
29. Preskill, J. Lecture Notes for Physics 229: Quantum Information and
Computation. (California Institute of Technology, 1998).
Author contributions
All authors contributed extensively to the work presented in this paper.
30. Sachdev, S. Quantum Phase Transitions (Cambridge University
Press, 2011).
31. Haldane, F. D. M. Nonlinear field theory of large-spin Heisenberg Competing interests
antiferromagnets: semiclassically quantized solitons of the one-dimensional The authors declare no competing interests.
easy-axis Néel state. Phys. Rev. Lett. 50, 1153 (1983).
32. Haegeman, J., Pérez-García, D., Cirac, I. & Schuch, N. Order parameter for Additional information
symmetry-protected topological phases in one dimension. Phys. Rev. Lett. Supplementary information is available for this paper at https://doi.org/10.1038/
109, 050402 (2012). s41567-019-0648-8.
33. Pollmann, F. & Turner, A. M. Detection of symmetry-protected topological
phases in one dimension. Phys. Rev. B 86, 125441 (2012). Reprints and permissions information is available at www.nature.com/reprints.
34. Brown, L. D., Cai, T. T. & DasGupta, A. Interval estimation for a binomial Correspondence and requests for materials should be addressed to S.C.
proportion. Stat. Sci. 16, 101 (2001). Peer review information: Nature Physics thanks Zohar Ringel and the other, anonymous,
35. Zeng, B. & Zhou, D. L. Topological and error-correcting properties reviewer(s) for their contribution to the peer review of this work.
for symmetry-protected topological order. Eur. Phys. Lett. 113,
56001 (2016). Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in
36. Schwarz, M., Temme, K. & Verstraete, F. Preparing projected entangled pair published maps and institutional affiliations.
states on a quantum computer. Phys. Rev. Lett. 108, 110502 (2012). © The Author(s), under exclusive licence to Springer Nature Limited 2019

1278 Nature Physics | VOL 15 | December 2019 | 1273–1278 | www.nature.com/naturephysics


Nature Physics | Articles
Methods

Phase diagram and QCNN circuit simulations. The phase diagram in Fig. 2a was numerically obtained using the infinite-size density matrix renormalization group (DMRG) algorithm. We generally follow the method outlined in ref. 51 with 150 as the maximum bond dimension. To extract each data point in Fig. 2a, we numerically obtain the ground state energy density as a function of $h_2$ for fixed $h_1$ and compute its second-order derivative. The phase boundary points are identified from sharp peaks.

The simulation of our QCNN in Fig. 2b also uses matrix product state representations. We first obtain the input ground state wavefunction using finite-size DMRG⁵¹ with bond dimension D = 130 for a system of N = 135 qubits. We then perform the circuit operations by sequentially applying swap gates and two-qubit gates on the nearest neighboring qubits⁵². Each three-qubit gate is decomposed into two-qubit unitaries⁵³. We find that increasing the bond dimension to D = 150 does not lead to any visible changes in our main figures, confirming a reasonable convergence of our method. The colour plot in Fig. 2a is similarly generated for a system of N = 45 qubits.

QCNN for the S = 1 Haldane chain. As discussed in the main text, the (spin-1/2) 1D cluster state belongs to an SPT phase protected by $\mathbb{Z}_2 \times \mathbb{Z}_2$ symmetry, a phase that also contains the celebrated S = 1 Haldane chain³¹. It is thus natural to ask whether this circuit can be used to detect the phase transition between the Haldane phase and an S = 1 paramagnetic phase, which we numerically demonstrate here.

The one-parameter family of Hamiltonians we consider for the Haldane phase is defined on a 1D chain of N spin-1 particles with open boundary conditions³¹:

$$H_{\mathrm{Haldane}} = J \sum_{j=1}^{N} \mathbf{S}_j \cdot \mathbf{S}_{j+1} + \omega \sum_{j=1}^{N} (S^z_j)^2 \qquad (6)$$

In this equation, $\mathbf{S}_j$ denotes the vector of S = 1 spin operators at site j, and J, ω are parameters of the Hamiltonian. The system is protected by a $\mathbb{Z}_2 \times \mathbb{Z}_2$ symmetry generated by global π-rotations of every spin around the X and Y axes: $R_x = \prod_j e^{i\pi S^x_j}$, $R_y = \prod_j e^{i\pi S^y_j}$. When ω is zero or small compared to J, the ground state belongs to the SPT phase, but when ω/J is sufficiently large, the ground state becomes paramagnetic³¹.

To apply our QCNN circuit to this Haldane phase, we must first identify a quasilocal isometric map U between the two models, because their representations of the symmetry group are distinct. More specifically, since the cluster model has a $\mathbb{Z}_2 \times \mathbb{Z}_2$ symmetry generated by $X_{\mathrm{even(odd)}} = \prod_{i \in \mathrm{even(odd)}} X_i$, we require $U R_x U^\dagger = X_{\mathrm{odd}}$ and $U R_y U^\dagger = X_{\mathrm{even}}$. Such a map can be constructed following ref. 54. Intuitively, it extends the local Hilbert space of a spin-1 particle by introducing a spin singlet state |s〉 and mapping it to a pair of spin-1/2 particles: $|x\rangle \mapsto |{+}{-}\rangle$, $|y\rangle \mapsto -|{-}{+}\rangle$, $|z\rangle \mapsto -i|{-}{-}\rangle$, $|s\rangle \mapsto |{+}{+}\rangle$. Here, |±〉 denote the |±1〉 eigenstates of the (spin-1/2) Pauli matrix X. |μ〉 denotes a spin-1 state defined by $R_\nu |\mu\rangle = (-1)^{\delta_{\mu,\nu}+1} |\mu\rangle$ (μ, ν ∈ {x, y, z}). The QCNN circuit for the Haldane chain thus consists of applying U followed by the circuit presented in the main text.

Supplementary Fig. 1 shows the QCNN output for an input system of N = 54 spin-1 particles at d = 1, …, 4, obtained using matrix product state simulations with D = 160. For this system size, we numerically identified the critical point as ω/J = 1.035 ± 0.005 by using DMRG to obtain the second derivative of energy density as a function of ω and J. The QCNN provides accurate identification of the phase transition.

Multiscale string order parameters. We examine the final operator measured by our circuit that recognizes the SPT phase in the Heisenberg picture. Although a QCNN performs non-unitary measurements in the pooling layers, similar to QEC circuits²⁹, one can postpone all measurements to the end and replace pooling layers by unitary-controlled gates acting on both measured and unmeasured qubits. In this way, the QCNN is equivalent to measuring a non-local observable

$$O = \left(U^{(d)}_{CP} \cdots U^{(1)}_{CP}\right)^\dagger Z_{i-1} X_i Z_{i+1} \left(U^{(d)}_{CP} \cdots U^{(1)}_{CP}\right) \qquad (7)$$

where i is the index of the measured qubit in the final layer and $U^{(l)}_{CP}$ is the unitary corresponding to the convolution–pooling unit at depth l. A more explicit expression of O can be obtained by commuting $U_{CP}$ with the Pauli operators, which yields recursive relations:

$$U^\dagger_{CP} X_i U_{CP} = X_{\tilde{i}-2} X_{\tilde{i}} X_{\tilde{i}+2} \qquad (8)$$

$$U^\dagger_{CP} Z_i U_{CP} = \frac{1}{2}\left(Z_{\tilde{i}} + Z_{\tilde{i}-2} X_{\tilde{i}-1} + X_{\tilde{i}+1} Z_{\tilde{i}+2} - Z_{\tilde{i}-2} X_{\tilde{i}-1} Z_{\tilde{i}} X_{\tilde{i}+1} Z_{\tilde{i}+2}\right) \qquad (9)$$

Here $\tilde{i}$ enumerates every qubit at depth l − 1, including those measured in the pooling layer. It follows that an SOP of the form ZXX … XZ at depth l transforms into a weighted linear combination of 16 products of SOPs at depth l − 1. Thus, instead of measuring a single SOP, our QCNN circuit measures a sum of products of exponentially many different SOPs:

$$O = \sum_{ab} C^{(1)}_{ab} S_{ab} + \sum_{a_1 b_1 a_2 b_2} C^{(2)}_{a_1 b_1 a_2 b_2} S_{a_1 b_1} S_{a_2 b_2} + \cdots \qquad (10)$$

O can be viewed as a multiscale SOP with coefficients computed recursively in d using equations (8) and (9). This allows the QCNN to produce a sharp classification output even when the correlation length is as long as $3^d$.

Demonstration of learning procedure for QPR. To perform our learning procedure in a QPR problem, we choose the hyperparameters for the QCNN as shown in Supplementary Fig. 2. This hyperparameter structure can be used for generic (1D) phases, and is characterized by a single integer n that determines the reduction of system size in each convolution–pooling layer, L → L/n. (Supplementary Fig. 2 shows the special case where n = 3.) The first convolution layer involves (n + 1)-qubit unitaries starting on every nth qubit. This is followed by n layers of n-qubit unitaries arranged as shown in Supplementary Fig. 2. The pooling layer measures n − 1 out of every contiguous block of n qubits; each of these is associated with a unitary $V_j$ applied to the remaining qubit, depending on the measurement outcome. This set of convolution and pooling layers is repeated d times. Finally, the fully connected layer consists of an arbitrary unitary on the remaining $N/n^d$ qubits, and the classification output is given by the measurement output of the middle qubit (or any fixed choice of one of them). For our example, we choose n = 3 because the Hamiltonian in equation (2) involves three-qubit terms. In our simulations, we consider only N = 15 spins and d = 1, because simulating quantum circuits on classical computers requires a large amount of resources.

We parameterize unitaries as exponentials of generalized a × a Gell-Mann matrices $\{\Lambda_i\}$, where $a = 2^v$, and v is the number of qubits involved in the unitary⁵⁵: $U = \exp\left(-i \sum_j c_j \Lambda_j\right)$. This parameterization is used directly for the unitaries in the convolution layers $C_2$−$C_4$, the pooling layer and the fully connected layer. For the first convolution layer $C_1$, we restrict the choice of $U_1$ to a product of six two-qubit unitaries between each possible pair of qubits: $U_1 = U_{(23)} U_{(24)} U_{(13)} U_{(14)} U_{(12)} U_{(34)}$, where $U_{(\alpha\beta)}$ is a two-qubit unitary acting on qubits indexed by α and β. Such a decomposition is useful when considering experimental implementation.

In the QCNN learning procedure, all parameters $c_\mu$ are set to random values between 0 and 2π for the unitaries $\{U_i, V_j, F\}$. In every iteration of gradient descent, we compute the derivative of the mean-squared error function (equation (1)) to first order with respect to each $c_\mu$ by using the finite-difference method:

$$\frac{\partial\,\mathrm{MSE}}{\partial c_\mu} = \frac{1}{2\epsilon}\left(\mathrm{MSE}(c_\mu + \epsilon) - \mathrm{MSE}(c_\mu - \epsilon)\right) + O(\epsilon^2) \qquad (11)$$

Each coefficient is thus updated as $c_\mu \mapsto c_\mu - \eta\, \partial\,\mathrm{MSE}/\partial c_\mu$, where η is the learning rate for that iteration. We compute the learning rate using the bold driver technique from machine learning, where η is increased by 5% if the error has decreased from the previous iteration and decreased by 50% otherwise⁵⁶. We repeat the gradient descent procedure until the error function changes on the order of $10^{-5}$ between successive iterations. In our simulations, we use $\epsilon = 10^{-4}$ for the gradient computation and begin with an initial learning rate of $\eta_0 = 10$.

Construction of the QCNN circuit. To construct the exact QCNN circuit in Fig. 2b, we follow the guidelines discussed in the main text. Specifically, we design the convolution and pooling layers to satisfy the following two important properties:
1. Fixed-point criterion. If the input is a cluster state |ψ0〉 of L spins, the output of the convolution–pooling layers is a cluster state |ψ0〉 of L/3 spins, with all measurements deterministically yielding |0〉.
2. QEC criterion. If the input is not |ψ0〉 but instead differs from |ψ0〉 at one site by an error that commutes with the global symmetry, the output should still be a cluster state of L/3 spins, but at least one of the measurements will result in the state |1〉.
These two properties are desirable for any quantum circuit implementation of RG flow for performing QPR.

In the specific case of our Hamiltonian, the ground state (1D cluster state) is a graph state, which can be efficiently obtained by applying a sequence of controlled-phase gates to a product state. This simplifies the construction of the MERA representation for the fixed-point criterion. To satisfy the QEC criterion, we treat the ground state manifold of the unperturbed Hamiltonian $H = -J \sum_i Z_i X_{i+1} Z_{i+2}$ as the code space of a stabilizer code with stabilizers $\{Z_i X_{i+1} Z_{i+2}\}$. The remaining degrees of freedom in the QCNN convolution and pooling layers are then specified such that the circuit detects and corrects the error (that is, it measures at least one |1〉 and prevents propagation to the next layer) when a single-qubit X error is present.

QCNN for general QPR problems. Our interpretation of QCNNs in terms of MERA and QEC motivates their application for recognizing more generic quantum phases. For any quantum phase P whose RG fixed-point wavefunction $|\psi_0(P)\rangle$ has a tensor network representation in isometric or G-isometric form⁵⁷ (Supplementary Fig. 3a), one can systematically construct a corresponding QCNN circuit. This family of quantum phases includes all 1D SPT and 2D string-net phases⁴⁷,⁵⁷,⁵⁸. In these cases, one can explicitly construct a commuting parent Hamiltonian for $|\psi_0(P)\rangle$ and a MERA structure in which $|\psi_0(P)\rangle$ is a fixed-point wavefunction (Supplementary Fig. 3a for 1D systems). The diagrammatic proof of this fixed-point property is given in Supplementary Fig. 3b. Furthermore, any 'local error' perturbing an input state away from $|\psi_0(P)\rangle$ can be identified by measuring a fraction of terms in the parent Hamiltonian, similar to syndrome measurements in stabilizer-based QEC²⁹. Then, a QCNN for P simply consists of the MERA for $|\psi_0(P)\rangle$ and a nested QEC scheme in which an input state with error density below the QEC threshold⁵⁹ 'flows' to the RG fixed point. Such a QCNN can be optimized via our learning procedure.

While our generic learning protocol begins with completely random unitaries, as in the classical case¹, this initialization may not be the most efficient for gradient descent. Instead, motivated by deep learning techniques such as pre-training¹, a better initial parameterization would consist of a MERA representation of $|\psi_0(P)\rangle$ and one choice of nested QEC. With such an initialization, the learning procedure serves to optimize the QEC scheme, expanding its threshold to the target phase boundary (Supplementary Fig. 3c).

Experimental resource analysis. To compute the gate depth of the cluster model QCNN circuit in a Rydberg atom implementation, we analyse each gate shown in Fig. 2b. By postponing pooling layer measurements to the end of the circuit, the multi-qubit gates required are

$$C_z Z_{ij} = e^{i\pi(-1+Z_i)(-1+Z_j)/4} \qquad (12)$$

$$C_x Z_{ij} = e^{i\pi(-1+X_i)(-1+Z_j)/4} \qquad (13)$$

$$C_x C_x X_{ijk} = e^{i\pi(-1+X_i)(-1+X_j)(-1+X_k)/8} \qquad (14)$$

By using Rydberg blockade-mediated controlled gates⁶⁰, it is straightforward to implement $C_z Z_{ij}$ and $C_z C_z Z_{ijk} = e^{i\pi(-1+Z_i)(-1+Z_j)(-1+Z_k)/8}$. The desired $C_x Z_{ij}$ and $C_x C_x X_{ijk}$ gates can then be obtained by conjugating $C_z Z_{ij}$ and $C_z C_z Z_{ijk}$ by single-qubit rotations. For an input size of N spins, the kth convolution–pooling unit thus applies $4N/3^{k-1}$ $C_z Z_{ij}$ gates, $N/3^{k-1}$ $C_x C_x X_{ijk}$ gates and $2N/3^{k-1}$ layers of $C_x Z_{ij}$ gates. The depth of single-qubit rotations required is 4d, as these rotations can be implemented in parallel on all N qubits. Finally, the fully connected layer consists of $N 3^{1-d}$ $C_z Z_{ij}$ gates. Thus, the total number of multi-qubit operations required for a QCNN of depth d operating on N spins is $\frac{7N}{2}(1 - 3^{1-d}) + N 3^{1-d}$. Note that we do not need to use SWAP gates since the Rydberg interaction is long-range.

Demonstration of learning procedure for QEC. To obtain the QEC code considered in the main text, we consider a QCNN with N = 9 input physical qubits and simulate the circuit evolution of its $2^N \times 2^N$ density matrix exactly. Strictly speaking, our QCNN has three layers: a three-qubit convolution layer $U_1$, a 3-to-1 pooling layer and a 3-to-1 fully connected layer $U_2$. Without loss of generality, we may ignore the optimization over the pooling layer by absorbing its effect into the first convolution layer, leading to the effective two-layer structure shown in Fig. 5a. The generic three-qubit unitary operations $U_1$ and $U_2$ are parameterized using 63 Gell-Mann coefficients each.

As discussed in the main text, we consider three different error models: (1) independent single-qubit errors on all qubits with equal probabilities $p_\mu$ for μ = X, Y and Z errors, (2) independent single-qubit errors on all qubits, with anisotropic probabilities $p_x \neq p_y = p_z$ and (3) independent single-qubit anisotropic errors with additional two-qubit correlated errors $X_i X_{i+1}$ with probability $p_{xx}$. More specifically, the first two error models are realized by applying a (generally anisotropic) depolarization quantum channel to each of the nine physical qubits:

$$\mathcal{N}_{1,i} : \rho \mapsto \left(1 - \sum_\mu p_\mu\right)\rho + \sum_\mu p_\mu\, \sigma^\mu_i \rho\, \sigma^\mu_i \qquad (15)$$

with Pauli matrices $\sigma^\mu_i$ for i ∈ {1, 2, …, 9} (the qubit indices are defined from bottom to top in Fig. 5a). For the anisotropic case, we trained the QCNN on various different error models with the same total error probability $p_x + p_y + p_z = 0.001$ but different relative ratios; the resulting ratio between the logical error probability of the Shor code and that of the QCNN code is plotted as a function of anisotropy in Supplementary Fig. 4. For strongly anisotropic models, the QCNN outperforms the Shor code, while for nearly isotropic models, the Shor code is optimal and QCNN can achieve the same logical error rate.

For the correlated error model, we additionally apply a quantum channel:

$$\mathcal{N}_{2,i} : \rho \mapsto (1 - p_{xx})\rho + p_{xx}\, X_i X_{i+1} \rho\, X_i X_{i+1} \qquad (16)$$

for pairs of nearby qubits, that is i ∈ {1, 2, 4, 5, 7, 8}. Such a geometrically local correlation is motivated from experimental considerations. In this case, we train our QCNN circuit on a specific error model with parameter choices $p_x = 5.8 \times 10^{-3}$, $p_y = p_z = 2 \times 10^{-3}$, $p_{xx} = 2 \times 10^{-4}$ and evaluate the logical error probabilities for various physical error models with the same relative ratios but different total error per qubit $p_x + p_y + p_z + p_{xx}$. In general, for an anisotropic logical error model with probabilities $p_\mu$ for $\sigma^\mu$ logical errors, the overlap f is $1 - 2\sum_\mu p_\mu / 3$, since $\langle \pm\nu | \sigma^\mu | \pm\nu \rangle = (-1)^{\delta_{\mu,\nu}+1}$. Because of this, we compute the total logical error probability from f as 1.5(1 − f). Hence, our goal is to maximize the logical state overlap f defined in equation (5). If we naively apply the gradient descent method based on f directly to both $U_1$ and $U_2$, we find that the optimization is easily trapped in a local optimum. Instead, we optimize the two unitaries $U_1$ and $U_2$ sequentially, similar to the layer-by-layer optimization in backpropagation for conventional CNN¹.

A few remarks are in order. First, since $U_1$ is optimized prior to $U_2$, one needs to devise an efficient cost function $C_1$ that is independent of $U_2$. In particular, simply maximizing f with an assumption on $U_2$, for example that it equals the identity, may not be ideal, since such a choice does not capture a potential interplay between $U_1$ and $U_2$. Second, because $U_1$ captures arbitrary single-qubit rotations, the definition of $C_1$ should be basis-independent. Finally, we note that the tree structure of our circuit allows one to view the first layer as an independent quantum channel:

$$\mathcal{M}_{U_1} : \rho \mapsto \mathrm{tr}_a\left[U_1\, \mathcal{N}\!\left(U_1^\dagger \left(|0\rangle\langle 0| \otimes \rho \otimes |0\rangle\langle 0|\right) U_1\right) U_1^\dagger\right] \qquad (17)$$

where $\mathrm{tr}_a[\cdot]$ denotes tracing over the ancilla qubits that are measured in the intermediate step. From this perspective, $\mathcal{M}_{U_1}$ describes an effective error model to be corrected by the second layer.

With these considerations, we optimize $U_1$ such that the effective error model $\mathcal{M}_{U_1}$ becomes as classical as possible, that is $\mathcal{M}_{U_1}$ is dominated by a 'flip' error along a certain axis with a strongly suppressed 'phase' error. Only then will the remnant, simpler errors be corrected by the second layer. More specifically, one may represent $\mathcal{M}_{U_1}$ using a map $\mathcal{M}_{U_1} : \mathbf{r} \mapsto M\mathbf{r} + \mathbf{c}$, where $\mathbf{r} \in \mathbb{R}^3$ is the Bloch vector for a qubit state $\rho \equiv \frac{1}{2}(\mathbb{1} + \mathbf{r} \cdot \boldsymbol{\sigma})$, where $\mathbb{1}$ is the identity operator and $\boldsymbol{\sigma} = (X, Y, Z)$ is the vector of Pauli matrices⁵³. The singular values of the real matrix M encode the probabilities $p_1 \geq p_2 \geq p_3$ for three different types of errors. We choose our cost function for the first layer as $C_1 = p_1^2 + p_2 + p_3$, which is relatively more sensitive to $p_2$ and $p_3$ than $p_1$ and ensures that the resultant, optimized channel $\mathcal{M}_{U_1}$ is dominated by one type of error (with probability $p_1$). We note that M can be efficiently evaluated from a quantum device without knowing $\mathcal{N}$, by performing quantum process tomography for a single logical qubit. Once $U_1$ is optimized, we use gradient descent to find an optimal $U_2$ to maximize f. As with QPR, gradients are computed via the finite-difference method, and the learning rate is determined by the bold driver technique¹.

Data availability
The data that support the plots within this paper and other findings of this study are available from the corresponding author on reasonable request.

References
51. McCulloch, I. P. Infinite size density matrix renormalization group, revisited. Preprint at https://arxiv.org/abs/0804.2509 (2008).
52. Vidal, G. Efficient classical simulation of slightly entangled quantum computations. Phys. Rev. Lett. 91, 147902 (2003).
53. Nielsen, M. A. & Chuang, I. Quantum Computation and Quantum Information (Cambridge Univ. Press, 2000).
54. Verresen, R., Moessner, R. & Pollmann, F. One-dimensional symmetry-protected topological phases and their transitions. Phys. Rev. B 96, 165124 (2017).
55. Bertlmann, R. A. & Krammer, P. Bloch vectors for qudits. J. Phys. A 41, 235303 (2008).
56. Hinton, G. Lecture notes for CSC2515: Introduction to machine learning (Univ. Toronto, 2007).
57. Schuch, N., Pérez-García, D. & Cirac, J. I. Classifying quantum phases using matrix product states and projected entangled pair states. Phys. Rev. B 84, 165139 (2011).
58. Chen, X., Gu, Z.-C. & Wen, X.-G. Classification of gapped symmetric phases in one-dimensional spin systems. Phys. Rev. B 83, 035107 (2011).
59. Aharonov, D. & Ben-Or, M. in Proc. 29th Annu. ACM Symp. on the Theory of Computing 176–188 (ACM, 1997).
60. Saffman, M., Walker, T. & Molmer, K. Quantum information with Rydberg atoms. Rev. Mod. Phys. 82, 2313 (2010).
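
Illustrative sketch of the learning loop. The Methods describe gradient descent with central finite-difference gradients (equation (11)) and a bold-driver learning rate. The following Python sketch shows how such a loop could look. It is not the authors' code: the names `toy_error`, `TARGET`, `finite_difference_gradient` and `bold_driver_descent` are hypothetical, and a simple quadratic stands in for the QCNN mean-squared error, which would require a circuit simulator. The numerical settings (ε = 1e-4, η0 = 10, +5% / −50% adjustments, stopping tolerance 1e-5) are the values quoted in the text. It is assumed here that a step is kept even when the error increases, which the text does not specify.

```python
import numpy as np

# Hypothetical optimum of the toy problem (not from the paper).
TARGET = np.array([1.0, 2.0, 3.0])


def toy_error(c):
    """Hypothetical stand-in for the QCNN mean-squared error of equation (1)."""
    return float(np.sum((c - TARGET) ** 2))


def finite_difference_gradient(loss, c, eps=1e-4):
    """Central-difference estimate of d(loss)/d(c_mu), as in equation (11)."""
    grad = np.zeros_like(c)
    for mu in range(c.size):
        shift = np.zeros_like(c)
        shift[mu] = eps
        grad[mu] = (loss(c + shift) - loss(c - shift)) / (2.0 * eps)
    return grad


def bold_driver_descent(loss, c_init, eta0=10.0, eps=1e-4, tol=1e-5,
                        max_iter=10_000):
    """Gradient descent whose learning rate follows the bold-driver rule."""
    c = np.array(c_init, dtype=float)
    eta = eta0
    previous_error = loss(c)
    for iteration in range(1, max_iter + 1):
        # Update every coefficient: c_mu -> c_mu - eta * dMSE/dc_mu.
        c = c - eta * finite_difference_gradient(loss, c, eps)
        error = loss(c)
        change = abs(error - previous_error)
        # Bold driver: +5% if the error decreased, otherwise -50%.
        eta *= 1.05 if error < previous_error else 0.5
        previous_error = error
        # Stop once the error changes by less than tol between iterations.
        if change < tol:
            break
    return c, error, iteration


# Random initial parameters between 0 and 2*pi, as described in the text.
rng = np.random.default_rng(0)
c_init = rng.uniform(0.0, 2.0 * np.pi, size=TARGET.size)
params, final_error, n_iter = bold_driver_descent(toy_error, c_init)
print(f"stopped after {n_iter} iterations, error = {final_error:.2e}")
```

With the large initial rate η0 = 10 the first few steps of this toy problem overshoot and the error grows; the 50% reductions then bring η into a range where the error falls quickly. For a real QCNN, `toy_error` would be replaced by a function that simulates the circuit on the training set and returns the mean-squared error.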

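
Illustrative check of the overlap formula. The Methods state that, for a logical error model with probabilities p_μ, the overlap is f = 1 − 2Σ_μ p_μ/3, so that the total logical error probability can be read off as 1.5(1 − f). The sketch below checks this numerically for a single qubit. It rests on assumptions that are not spelled out in the source: the overlap is taken to be the average of 〈ψ|N(ρ)|ψ〉 over the six eigenstates |±x〉, |±y〉, |±z〉, the channel has the form of equation (15), and the probabilities are arbitrary illustrative numbers. All function and variable names are hypothetical.

```python
import numpy as np

# Single-qubit Pauli matrices.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = {"x": X, "y": Y, "z": Z}


def pauli_channel(rho, probs):
    """Apply rho -> (1 - sum p_mu) rho + sum p_mu s_mu rho s_mu (form of eq. (15))."""
    out = (1.0 - sum(probs.values())) * rho
    for mu, p in probs.items():
        s = PAULIS[mu]
        out = out + p * (s @ rho @ s)
    return out


def axis_eigenstates():
    """Return the six states |+-x>, |+-y>, |+-z> as normalized vectors."""
    states = []
    for s in (X, Y, Z):
        _, vecs = np.linalg.eigh(s)
        states.extend([vecs[:, 0], vecs[:, 1]])
    return states


def average_overlap(probs):
    """Average of <psi| N(|psi><psi|) |psi> over the six axis eigenstates."""
    overlaps = []
    for psi in axis_eigenstates():
        rho = np.outer(psi, psi.conj())
        overlaps.append(np.real(psi.conj() @ pauli_channel(rho, probs) @ psi))
    return float(np.mean(overlaps))


# Arbitrary anisotropic error probabilities, for illustration only.
probs = {"x": 0.03, "y": 0.01, "z": 0.01}
f = average_overlap(probs)
total_error = 1.5 * (1.0 - f)
print(f"f = {f:.6f}, 1.5 * (1 - f) = {total_error:.6f}")
```

Each Pauli error leaves the two eigenstates of its own axis unchanged and flips the other four, so on average a fraction 2/3 of each p_μ is lost from the overlap; this is where the factor 1.5 in the conversion comes from.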
