Efficient Representation of Quantum Many-body States with Deep Neural Networks
Part of the challenge for quantum many-body problems comes from the difficulty of
representing large-scale quantum states, which in general requires an exponentially
large number of parameters. Neural networks provide a powerful tool to represent quantum
many-body states. An important open question is what characterizes the representational
power of deep and shallow neural networks, which is of fundamental interest due to the
popularity of deep learning methods. Here, we give a proof that, assuming a widely believed
computational complexity conjecture, a deep neural network can efficiently represent most
physical states, including the ground states of many-body Hamiltonians and states generated
by quantum dynamics, while a shallow network representation with a restricted Boltzmann
machine cannot efficiently represent some of those states.
1 Center for Quantum Information, IIIS, Tsinghua University, Beijing 100084, China. 2 Department of Physics, University of Michigan, Ann Arbor, MI 48109,
USA. Correspondence and requests for materials should be addressed to X.G. (email: [email protected]) or to L.-M.D. (email: [email protected])
The Hilbert space dimension associated with quantum many-body problems is exponentially large, which poses a big challenge for solving those problems even with the most powerful computers. The variational approach is usually the tool of choice for tackling such difficult problems, with many successful examples ranging from the simple mean-field approximation to more complicated methods such as those based on matrix product states [1], tensor network states [2], string bond states [3, 4], and, more recently, neural network states [5, 6]. The first important step of the variational approach is to find an efficient representation of the relevant quantum many-body states. Here, by efficient we mean that the number of parameters used to characterize those quantum states increases at most polynomially with the number of particles (or degrees of freedom) in the system. With an efficient representation, one can combine it with powerful learning methods to optimize the variational parameters by optimization techniques such as the gradient descent method.

Neural networks are a powerful tool to represent complex correlations in multiple-variable functions or probability distributions and have recently found wide applications in artificial intelligence through the popularity of deep learning methods [7]. An interesting connection has been made recently between the variational approach in quantum many-body problems and learning methods based on neural network representations [5]. Numerical evidence suggests that the restricted Boltzmann machine (RBM), a shallow generative neural network, optimized by the reinforcement learning method, provides a good solution to several many-body models [5]. Given this success, an important open question is what characterizes the representational power and limitations of the RBM for quantum many-body states.

In this paper, we characterize the representational power and limitations of the RBM and its extension to deep neural networks, the deep Boltzmann machine (DBM). We prove that DBMs can efficiently represent most physical states, including the ground states of many-body Hamiltonians and states generated by quantum dynamics, while RBMs cannot efficiently represent some of those states. The result shows there exists an exponential separation in efficiency between using DBMs or RBMs to represent quantum many-body states.

Results
Summary of major results. Our first major result concerns RBMs. We prove that while RBMs can efficiently represent many highly entangled states, there is a fundamental limit for them to efficiently represent general quantum states. For the power of RBMs, we show through explicit construction that RBMs can efficiently and exactly represent arbitrary graph states [8], certain states obeying an entanglement volume law or describing the critical system [9], and topological toric code states [10]. For the limitation of RBMs, we introduce an explicit class of states which can be generated either by a polynomial-size quantum circuit or as ground states of gapped Hamiltonians, and prove that for those states there is no efficient RBM representation unless the polynomial hierarchy, a generalization of the famous P versus NP problem in computer science, collapses, which is widely believed to be unlikely. Note that our result complements the known theory about the representational power of RBMs [11, 12]. It has been proven in ref. 12 that an RBM can approximate any probability distribution with arbitrary accuracy if one does not limit the representation efficiency (the number of parameters in the representation). Here, we strengthen this result by showing that, with consideration of efficiency, there are quantum probability distributions that cannot be efficiently represented by RBMs.

Our second major result concerns the power of DBMs. We prove through explicit construction that DBMs can efficiently represent any states generated by polynomial-size quantum circuits or any ground states of physical Hamiltonians with polynomial-size gaps. Here, polynomial-size gap means that the energy gap of the Hamiltonian approaches zero no faster than 1/poly(n), where poly(n) denotes a polynomial function of the particle number n. Most physical quantum states are generated either by many-body dynamics, which can be efficiently simulated through a polynomial-size quantum circuit [13–15], or as ground states of some physical Hamiltonians, so they can all be efficiently represented by DBMs. This result, combined with the reinforcement learning method (see discussion in Supplementary Note 6), indicates the potential power of the DBM representation as a tool for solving quantum many-body problems.

We note that the existence of an efficient representation by a DBM does not mean we can always use this representation for efficient calculation of physical observables, as the latter involves further complicated index contraction. Efficient representation is a necessary but not sufficient condition for tackling quantum many-body problems. One needs to combine it with an efficient numerical training algorithm to extract physical observables. Finding ground-state energies of general many-body Hamiltonians is known to be computationally hard, requiring in general exponential calculation time [2]. So even if an efficient representation of ground states exists, we may not be able to use it to find the ground-state energy. On the other hand, although we prove that RBMs cannot represent the most general quantum states, this does not restrict the use of RBMs for solving many practical problems. Indeed, RBMs could be very useful to represent and learn a wide class of ground states or physical states arising from time evolution. Apart from numerical simulation of quantum many-body problems, efficient representation by DBMs or RBMs may also find applications in the classification of topological quantum phases [16, 17] or the quantum approach to space–time with holographic properties [18, 19], similar to applications of the tensor network representation in those scenarios.

Neural network quantum states. A many-body quantum state of n qubits can be written as $|\Psi\rangle = \sum_{v} \Psi(v)\,|v\rangle$ in the computational basis with $v \equiv (v_1, \ldots, v_n)$, where the wave function Ψ(v) is a general complex function of n binary variables $v_i \in \{0, 1\}$. In the neural network representation by a Boltzmann machine, the wave function Ψ(v) is expressed as $\Psi(v) = \sum_{h} e^{W(v,h)}$, where the weight W(v, h) is a complex quadratic function of binary variables v and $h \equiv (h_1, \ldots, h_m)$, called visible and hidden neurons, respectively. The number of hidden neurons m is at most poly(n) for an efficient representation. In the graphic representation shown in Fig. 1, the neurons $v_i$ and $h_j$ connected by an edge are correlated with a nonzero $W_{ij}$ in the weight $W(v, h) = \sum_{i,j} W_{ij} v_i h_j$. For the RBM (Fig. 1a), the layer of visible neurons is connected to one layer of hidden neurons (neurons in the same layer are not mutually connected). The DBM is similar to the RBM but with two or more layers of hidden neurons (Fig. 1b). Two hidden layers are actually general enough, as one can see in Fig. 1b that odd and even layers can each be combined into a single layer. A fully connected Boltzmann machine is shown in Fig. 1c. In the methods section, we prove that any fully connected Boltzmann machine can be efficiently represented by a DBM as illustrated in Fig. 1d.

Fig. 1 Illustration of Boltzmann machine neural networks. a Restricted Boltzmann machine (RBM) which has only one hidden layer and no intra-layer connections. b Deep Boltzmann machine (DBM) which has at least two hidden layers and no intra-layer connections. General DBMs are equivalent to DBMs with two hidden layers after rearrangement of odd and even layers. c Fully connected Boltzmann machine which has intra-layer connections. d Reduction of fully connected Boltzmann machine to DBMs with two hidden layers. (Legend: visible neuron, hidden neuron.)

Power and limitations of restricted Boltzmann machines. First, we show that RBMs can represent many highly entangled states, including wave functions of any graph states [8], topological toric codes [10], and states violating the entanglement area law or describing the critical system [9]. As an example to illustrate the method, we give a simple construction for the RBM representation of any graph state and leave the representation of other categories of states to Supplementary Note 1. RBM representations for one-dimensional cluster states (a special case of graph states) and toric codes have been given recently in ref. 6. We give a different construction method which is simpler and more systematic. The wave function of a graph state takes the form $\Psi(v_1, \ldots, v_n) = \prod_{\langle i,j \rangle} (-1)^{v_i v_j}/\sqrt{2}$, where $\langle i, j \rangle$ denotes an edge linking the i-th and j-th qubits represented by visible neurons $v_i$, $v_j$. As shown in Fig. 2, one hidden neuron h and two edges with weight $W_H$ realize the correlation function $(-1)^{v_i v_j}/\sqrt{2}$ between $v_i$ and $v_j$. This requires solving the equation $\sum_{h} e^{W_H(v_i,h) + W_H(v_j,h)} = (-1)^{v_i v_j}/\sqrt{2}$, which has a simple solution

$$W_H(x, h) = \frac{\pi}{8} i - \frac{\ln 2}{2} - \frac{\pi}{2} i x - \frac{\pi}{4} i h + i \pi x h \qquad (1)$$

with $x = v_i$ or $v_j$.

Fig. 2 Representation of graph states by RBMs. a Graph representation of an example graph state. b Representation of the graph state with a restricted Boltzmann machine. One hidden neuron with the Hadamard weight function $W_H$ (explicit form given in Eq. (1) of the text) simulates the correlation in the wave function between each pair of connected qubits in any graph state.

The RBM state has an important property that its wave function Ψ(v) can be calculated efficiently for given input values of the variables $v_i$. Here we prove that this property leads to limitations of the RBM in representing more general quantum states. With a given input value of v, Ψ(v) can be factorized as

$$\prod_{j} \left( \prod_{i:\langle i,j \rangle} e^{W_{ij}(v_i, 0)} + \prod_{i:\langle i,j \rangle} e^{W_{ij}(v_i, 1)} \right), \qquad (2)$$

where i (j) runs from 1 to at most n (m), so the total computational time for Ψ(v) scales as mn for each given input v. This means Ψ(v) can be computed by a circuit $C_n$ with polynomial size poly(n) for a given input $v \in \{0, 1\}^n$. If a quantum state has an RBM representation (even if its explicit form is unknown), computing Ψ(v) is characterized by the computational complexity class P/poly [20], which represents problems that can be solved by a polynomial-size circuit even if the circuit cannot be constructed efficiently in general. The circuit here corresponds to an RBM representation, with the input given by a specific v and the output given by the value of Ψ(v).

We have introduced in ref. 21 a specific quantum many-body state, denoted as $\Psi_{\mathrm{GWD}}$, for which we proved it is #P-hard to calculate its wave function $\Psi_{\mathrm{GWD}}(v)$ in the computational basis v (#P-hard problems form a known computational complexity class that in general requires exponential calculation time). If this state $\Psi_{\mathrm{GWD}}$ has an RBM representation, it means #P ⊂ P/poly, an unlikely result in computational complexity theory as this means the polynomial hierarchy collapses [22]. The state $\Psi_{\mathrm{GWD}}$ (with its explicit form given in Supplementary Note 3) is just a two-dimensional cluster state after a layer of translation-invariant single-qubit unitary operations. This state $\Psi_{\mathrm{GWD}}$ is a special instance of states that can be generated by a constant-depth quantum circuit (which is a special polynomial-size circuit). It also belongs to the projected entangled pair states (PEPS) and the ground states of gapped Hamiltonians. Combining the results above, we arrive at the following theorem:

Theorem 1: There exist states which can be generated by a constant-depth quantum circuit or expressed as PEPS or ground states of gapped Hamiltonians, but cannot be efficiently represented by any RBM unless the polynomial hierarchy collapses in the computational complexity theory.

The above argument holds for the exact representation of Ψ(v) with an RBM. As proved in Supplementary Note 3, under reasonable conjectures about computational complexities, the same result also holds for approximate representations of Ψ(v) with RBMs.

Note that 2D cluster states can be efficiently represented by RBMs. However, after a layer of single-qubit operations, which do not change the quantum phase according to the classification scheme in refs. 16, 23, the output state $\Psi_{\mathrm{GWD}}$ can no longer be efficiently represented by RBMs. So the RBM representation is not closed under unitaries that preserve a quantum phase.

Representational power of deep Boltzmann machines. Now we show that with DBMs, i.e., with one more layer of hidden neurons, most physical states, including all the states in Theorem 1, can be efficiently represented. For this purpose, we first introduce a couple of gadgets that will simplify our construction.

A gadget is a complex function of binary variables after encapsulation of hidden neurons in a DBM network as shown in Fig. 3a, where the input is represented by port neurons (for connection of different gadgets) and the output is the value of the function. We use gadgets as basic elements in a large DBM. As examples, we define the Hadamard gadget and phase gadget as shown in Fig. 3b, which will play the role of elementary gates for construction of DBM representations of quantum circuits. The weight function $W_H$ is given by Eq. (1) and $W_\theta$ is the solution of the equation $\sum_{h} e^{W_\theta(x_1,h) + W_\theta(x_2,h)} = e^{i\theta x_1} \delta_{x_1 x_2}$, which may take the
Fig. 3 Representation of universal quantum computational states by DBMs. a Gadget is a complex function of binary variables represented by port neurons,
a short-hand notation after encapsulation of hidden neurons. b Two elementary gadgets for representation of quantum circuits: the Hadamard gadget with
weight WH given by Eq. (1) and the phase gadget with weight Wθ given by Eq. (3). c Two types of fusion rules for gadgets: rule I and rule II and their neural
network representation. d Fusion with Hadamard or phase gadgets with rule I or rule II simulates application of three elementary quantum gates: the
Hadamard gate, the phase gate, and the controlled phase flip gate, which together make universal quantum computation. The figure illustrates evolution of
the wave function from step t to step t + 1. e Representation of an example quantum circuit with elementary gadgets. To represent circuits of depth T, we
need to apply T steps of fusions with elementary gadgets, and gadget fusions in the same step can be applied in parallel. The identity gadget is a special
phase gadget with θ = 0. After the last step of computation, port neurons become visible neurons to represent the index of physical qubits, and we get a
DBM representation of the output state
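The Hadamard weight function used by the Hadamard gadget can be checked numerically. The sketch below reads Eq. (1) as W_H(x, h) = iπ/8 − (ln 2)/2 − iπx/2 − iπh/4 + iπxh and verifies that summing over one shared hidden neuron reproduces the pair correlation (−1)^{v_i v_j}/√2. The function names and the 3-qubit path graph are hypothetical choices made for illustration only.

```python
import cmath
import itertools
import math


def w_hadamard(x, h):
    """Weight function W_H(x, h) of Eq. (1) for binary x and h."""
    return (1j * math.pi / 8 - math.log(2) / 2
            - 1j * math.pi * x / 2 - 1j * math.pi * h / 4
            + 1j * math.pi * x * h)


def pair_correlation(vi, vj):
    """Sum over the shared hidden neuron h of exp(W_H(vi, h) + W_H(vj, h))."""
    return sum(cmath.exp(w_hadamard(vi, h) + w_hadamard(vj, h)) for h in (0, 1))


def graph_state_amplitude(v, edges):
    """RBM amplitude of a graph state, using one hidden neuron per edge."""
    amplitude = 1 + 0j
    for i, j in edges:
        amplitude *= pair_correlation(v[i], v[j])
    return amplitude


# Hypothetical example graph: a 3-qubit path 0 - 1 - 2.
EDGES = [(0, 1), (1, 2)]

# The RBM amplitude matches the product of (-1)^{v_i v_j} / sqrt(2) over edges.
for v in itertools.product((0, 1), repeat=3):
    expected = math.prod((-1) ** (v[i] * v[j]) / math.sqrt(2) for i, j in EDGES)
    assert cmath.isclose(graph_state_amplitude(v, EDGES), expected, abs_tol=1e-12)
```

Each edge costs one hidden neuron and two weighted connections, so the number of parameters grows linearly with the number of edges in the graph.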
Received: 18 May 2017 Accepted: 18 July 2017

References
1. Schollwöck, U. The density-matrix renormalization group in the age of matrix product states. Ann. Phys. 326, 96–192 (2011).
2. Verstraete, F., Murg, V. & Cirac, J. I. Matrix product states, projected entangled pair states, and variational renormalization group methods for quantum spin systems. Adv. Phys. 57, 143–224 (2008).
3. Schuch, N., Wolf, M. M., Verstraete, F. & Cirac, J. I. Simulation of quantum many-body systems with strings of operators and monte carlo tensor contractions. Phys. Rev. Lett. 100, 040501 (2008).
4. Sfondrini, A., Cerrillo, J., Schuch, N. & Cirac, J. I. Simulating two- and three-dimensional frustrated quantum systems with string-bond states. Phys. Rev. B 81, 214426 (2010).
5. Carleo, G. & Troyer, M. Solving the quantum many-body problem with artificial neural networks. Science 355, 602–606 (2017).
6. Deng, D.-L., Li, X. & Sarma, S. D. Exact machine learning topological states. Preprint at http://arxiv.org/abs/1609.09060 (2016).
7. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
8. Raussendorf, R. & Briegel, H. J. A one-way quantum computer. Phys. Rev. Lett. 86, 5188–5191 (2001).
9. Verstraete, F., Wolf, M. M., Perez-Garcia, D. & Cirac, J. I. Criticality, the area law, and the computational power of projected entangled pair states. Phys. Rev. Lett. 96, 220601 (2006).
10. Kitaev, A. Y. Fault-tolerant quantum computation by anyons. Ann. Phys. 303, 2–30 (2003).
11. Freund, Y. & Haussler, D. Unsupervised Learning of Distributions of Binary Vectors Using Two Layer Networks. Report No. UCSC-CRL-91-20 (University of California, 1994).
12. Le Roux, N. & Bengio, Y. Representational power of restricted boltzmann machines and deep belief networks. Neural Comput. 20, 1631–1649 (2008).
13. Lloyd, S. Universal quantum simulators. Science 273, 1073–1078 (1996).
14. Poulin, D., Qarry, A., Somma, R. & Verstraete, F. Quantum simulation of time-dependent hamiltonians and the convenient illusion of hilbert space. Phys. Rev. Lett. 106, 170501 (2011).
15. Berry, D. W., Childs, A. M., Cleve, R., Kothari, R. & Somma, R. D. Simulating hamiltonian dynamics with a truncated taylor series. Phys. Rev. Lett. 114, 090502 (2015).
16. Chen, X., Gu, Z.-C., Liu, Z.-X. & Wen, X.-G. Symmetry-protected topological orders in interacting bosonic systems. Science 338, 1604–1606 (2012).
17. Schuch, N., Pérez-García, D. & Cirac, I. Classifying quantum phases using matrix product states and projected entangled pair states. Phys. Rev. B 84, 165139 (2011).
18. Swingle, B. Entanglement renormalization and holography. Phys. Rev. D 86, 065007 (2012).
19. Pastawski, F., Yoshida, B., Harlow, D. & Preskill, J. Holographic quantum error-correcting codes: toy models for the bulk/boundary correspondence. J. High Energy Phys. 2015, 149 (2015).
20. Arora, S. & Barak, B. Computational Complexity: A Modern Approach (Cambridge University Press, 2009).
21. Gao, X., Wang, S.-T. & Duan, L.-M. Quantum supremacy for simulating a translation-invariant ising spin model. Phys. Rev. Lett. 118, 040502 (2017).
22. Babai, L., Fortnow, L. & Lund, C. in Proc. 31st Annual Symposium on Foundations of Computer Science 16–25 (IEEE, 1990).
23. Hastings, M. B. & Wen, X.-G. Quasiadiabatic continuation of quantum states: The stability of topological ground-state degeneracy and emergent gauge invariance. Phys. Rev. B 72, 045141 (2005).
24. Barenco, A. et al. Elementary gates for quantum computation. Phys. Rev. A 52, 3457–3467 (1995).
25. Nielsen, M. A. & Chuang, I. L. Quantum Computation and Quantum Information (Cambridge University Press, 2010).
26. Verstraete, F. & Cirac, J. I. Renormalization algorithms for quantum-many body systems in two and higher dimensions. Preprint at http://arxiv.org/abs/cond-mat/0407066 (2004).
27. Vidal, G. Entanglement renormalization. Phys. Rev. Lett. 99, 220405 (2007).
28. Schuch, N., Wolf, M. M., Verstraete, F. & Cirac, J. I. Computational complexity of projected entangled pair states. Phys. Rev. Lett. 98, 140506 (2007).
29. Vidal, G. Efficient simulation of one-dimensional quantum many-body systems. Phys. Rev. Lett. 93, 040502 (2004).

Acknowledgements
We thank Ignacio Cirac, Shengtao Wang, Giuseppe Carleo, and Zhengyu Zhang for helpful discussions. This work was supported by the Ministry of Education and the National key Research and Development Program of China. L.-M.D. acknowledges in addition support from the AFOSR MURI program.

Author contributions
X. G. and L.-M. D. contributed substantially to this work.

Additional information
Supplementary Information accompanies this paper at doi:10.1038/s41467-017-00705-2.

Competing interests: The authors declare no competing financial interests.

Reprints and permission information is available online at http://npg.nature.com/reprintsandpermissions/

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

© The Author(s) 2017