VOLUME 83, NUMBER 15 PHYSICAL REVIEW LETTERS 11 OCTOBER 1999
Quantum Games and Quantum Strategies
Jens Eisert,¹ Martin Wilkens,¹ and Maciej Lewenstein²
¹Institut für Physik, Universität Potsdam, 14469 Potsdam, Germany
²Institut für Theoretische Physik, Universität Hannover, 30167 Hannover, Germany
(Received 26 June 1998)
We investigate the quantization of nonzero sum games. For the particular case of the Prisoners’
Dilemma we show that this game ceases to pose a dilemma if quantum strategies are allowed for. We
also construct a particular quantum strategy which always gives reward if played against any classical
strategy.
PACS numbers: 03.67.-a, 02.50.Le, 03.65.Bz
One might wonder what games and physics could possibly have in common. After all, games such as chess or poker seem to rely heavily on bluffing, guessing, and other activities of unphysical character. Yet, as was shown by von Neumann and Morgenstern [1], conscious choice is not essential for a theory of games. At the most abstract level, game theory is about numbers that entities are efficiently acting to maximize or minimize [2]. For a quantum physicist it is then legitimate to ask what happens if linear superpositions of these actions are allowed for, that is, if games are generalized into the quantum domain.

There are several reasons why quantizing games may be interesting. First, classical game theory is a well-established discipline of applied mathematics [2] which has found numerous applications in economics, psychology, ecology, and biology [2,3]. Since it is based on probability to a large extent, there is a fundamental interest in generalizing this theory to the domain of quantum probabilities. Second, if the “Selfish Genes” [3] are reality, we may speculate that games of survival are being played already on the molecular level, where quantum mechanics dictates the rules. Third, there is an intimate connection between the theory of games and the theory of quantum communication. Indeed, whenever a player passes his decision to the other player or the game’s arbiter, he in fact communicates information, which, as we live in a quantum world, it is legitimate to think of as quantum information. On the other hand, it has recently transpired that eavesdropping in quantum-channel communication [4–6] and optimal cloning [7] can readily be conceived as a strategic game between two or more players, the objective being to obtain as much information as possible in a given setup. Finally, quantum mechanics may well be useful to win some specially designed zero-sum unfair games, such as PQ penny flip, as was recently demonstrated by Meyer [8], and it may ensure fairness in remote gambling [9].

In this Letter we consider nonzero sum games where, in contrast to zero-sum games, the two players no longer appear in strict opposition to each other, but may rather benefit from mutual cooperation. A particular instance of this class of games, which has found widespread applications in many areas of science, is the Prisoners’ Dilemma.

In the Prisoners’ Dilemma, each of the two players, Alice and Bob, must independently decide whether she or he chooses to defect (strategy D) or cooperate (strategy C). Depending on their decision, each player receives a certain payoff (see Table I). The objective of each player is to maximize his or her individual payoff. The catch of the dilemma is that D is the dominant strategy, that is, rational reasoning forces each player to defect and thereby to do substantially worse than if both decided to cooperate [10]. In terms of game theory, mutual defection is also a Nash equilibrium [2]: in contemplating the move DD in retrospect, each of the players comes to the conclusion that he or she could not have done better by unilaterally changing his or her own strategy [11].

In this paper we give a physical model of the Prisoners’ Dilemma, and we show that, in the context of this model, the players escape the dilemma if they both resort to quantum strategies. Moreover, we shall demonstrate that (i) there exists a particular pair of quantum strategies which always gives a reward and is a Nash equilibrium and (ii) there exists a particular quantum strategy which always gives, at least, a reward if played against any classical strategy.

The physical model consists of (i) a source of two bits, one bit for each player, (ii) a set of physical instruments that enables the player to manipulate his or her own bit in a strategic manner, and (iii) a physical measurement device which determines the players’ payoff from the state of the two bits. All three ingredients, the source, the players’ physical instruments, and the physical measurement device that determines the payoff, are assumed to be perfectly known to both players.

TABLE I. Payoff matrix for the Prisoners’ Dilemma. The first entry in each pair of parentheses denotes the payoff of Alice and the second entry the payoff of Bob. The numerical values are chosen as in [3]. Referring to Eq. (2), this choice corresponds to $r = 3$ (“reward”), $p = 1$ (“punishment”), $t = 5$ (“temptation”), and $s = 0$ (“sucker’s payoff”).

              Bob: C    Bob: D
  Alice: C    (3, 3)    (0, 5)
  Alice: D    (5, 0)    (1, 1)
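The statements about dominance and equilibrium in the classical game can be checked by direct enumeration of Table I. The following Python sketch is an illustration only; the names `PAYOFF`, `best_replies`, `is_nash`, `nash_equilibria`, and `is_dominant` are assumptions introduced here and do not appear in the Letter.

```python
# Payoffs from Table I: PAYOFF[(alice_move, bob_move)] = (alice_payoff, bob_payoff).
PAYOFF = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}
MOVES = ("C", "D")


def best_replies(player, opponent_move):
    """Moves of `player` (0 = Alice, 1 = Bob) that maximize that player's payoff."""
    def value(own_move):
        if player == 0:
            profile = (own_move, opponent_move)
        else:
            profile = (opponent_move, own_move)
        return PAYOFF[profile][player]

    top = max(value(move) for move in MOVES)
    return [move for move in MOVES if value(move) == top]


def is_nash(alice_move, bob_move):
    """True if neither player can gain by deviating unilaterally."""
    return (alice_move in best_replies(0, bob_move)
            and bob_move in best_replies(1, alice_move))


def nash_equilibria():
    """All pure-strategy Nash equilibria of the classical game."""
    return [(a, b) for a in MOVES for b in MOVES if is_nash(a, b)]


def is_dominant(player, move):
    """True if `move` is the unique best reply to every move of the opponent."""
    return all(best_replies(player, other) == [move] for other in MOVES)
```

With the values of Table I, D is dominant for both players and (D, D) is the only pure-strategy equilibrium, even though both players receive more at (C, C). This is the dilemma in its classical form.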
0031-9007/99/83(15)/3077(4)$15.00 © 1999 The American Physical Society
The quantum formulation proceeds by assigning the possible outcomes of the classical strategies D and C to two basis vectors $|D\rangle$ and $|C\rangle$ in the Hilbert space of a two-state system, i.e., a qubit. At each instance, the state of the game is described by a vector in the tensor product space which is spanned by the classical game basis $|CC\rangle$, $|CD\rangle$, $|DC\rangle$, and $|DD\rangle$, where the first and second entries refer to Alice’s and Bob’s qubits, respectively.

The board of our quantum game is depicted in Fig. 1; it can in fact be considered a simple quantum network [12] with sources, reversible one-bit and two-bit gates, and sinks. Note that the complexity is minimal in this implementation as the players’ decisions are encoded in dichotomic variables.

FIG. 1. The setup of a two-player quantum game.

We denote the game’s initial state by $|\psi_0\rangle = \hat{J}|CC\rangle$, where $\hat{J}$ is a unitary operator which is known to both players. For fair games, $\hat{J}$ must be symmetric with respect to the interchange of the two players. The strategies are executed on the distributed pair of qubits in the state $|\psi_0\rangle$. Strategic moves of Alice and Bob are associated with unitary operators $\hat{U}_A$ and $\hat{U}_B$, respectively, which are chosen from a strategic space $S$. The independence of the players dictates that $\hat{U}_A$ and $\hat{U}_B$ operate exclusively on the qubits in Alice’s and Bob’s possession, respectively. The strategic space $S$ may therefore be identified with some subset of the group of unitary $2 \times 2$ matrices.

Having executed their moves, which leaves the game in a state $(\hat{U}_A \otimes \hat{U}_B)\hat{J}|CC\rangle$, Alice and Bob forward their qubits for the final measurement which determines their payoff. The measurement device consists of a reversible two-bit gate $\tilde{J}$ which is followed by a pair of Stern-Gerlach-type detectors. The two channels of each detector are labeled by $\sigma = C, D$. With the proviso of subsequent justification, we set $\tilde{J} = \hat{J}^\dagger$, such that the final state $|\psi_f\rangle = |\psi_f(\hat{U}_A, \hat{U}_B)\rangle$ of the game prior to detection is given by

$$|\psi_f\rangle = \hat{J}^\dagger (\hat{U}_A \otimes \hat{U}_B)\, \hat{J}\, |CC\rangle . \tag{1}$$

The subsequent detection yields a particular result, $\sigma\sigma' = CD$ say, and the payoff is returned according to the corresponding entry of the payoff matrix. Yet, since quantum mechanics is a fundamentally probabilistic theory, the only strategic notion of a payoff is the expected payoff. Alice’s expected payoff is given by

$$\$_A = r P_{CC} + p P_{DD} + t P_{DC} + s P_{CD} , \tag{2}$$

where $P_{\sigma\sigma'} = |\langle \sigma\sigma' | \psi_f \rangle|^2$ is the joint probability that the channels $\sigma$ and $\sigma'$ of the Stern-Gerlach-type devices will click. Bob’s expected payoff is obtained by interchanging $t \leftrightarrow s$ in the last two entries (for numerical values of $r$, $p$, $t$, and $s$, see Table I). Note that Alice’s expected payoff $\$_A$ not only depends on her choice of strategy $\hat{U}_A$, but also on Bob’s choice $\hat{U}_B$.

It proves to be sufficient to restrict the strategic space to the 2-parameter set of unitary $2 \times 2$ matrices,

$$\hat{U}(\theta, \phi) = \begin{pmatrix} e^{i\phi}\cos\theta/2 & \sin\theta/2 \\ -\sin\theta/2 & e^{-i\phi}\cos\theta/2 \end{pmatrix} , \tag{3}$$

with $0 \le \theta \le \pi$ and $0 \le \phi \le \pi/2$. To be specific, we associate the strategy “cooperate” with the operator,

$$\hat{C} \equiv \hat{U}(0, 0), \qquad \hat{C} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} , \tag{4}$$

while the strategy “defect” is associated with a spin flip,

$$\hat{D} \equiv \hat{U}(\pi, 0), \qquad \hat{D} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} . \tag{5}$$

In order to guarantee that the ordinary Prisoners’ Dilemma is faithfully represented, we impose the subsidiary conditions

$$[\hat{J}, \hat{D} \otimes \hat{D}] = 0, \qquad [\hat{J}, \hat{D} \otimes \hat{C}] = 0, \qquad [\hat{J}, \hat{C} \otimes \hat{D}] = 0 . \tag{6}$$

These conditions, together with the identification $\tilde{J} = \hat{J}^\dagger$, imply that, for any pair of strategies taken from the subset $S_0 \equiv \{\hat{U}(\theta, 0) \mid \theta \in [0, \pi]\}$, the joint probabilities $P_{\sigma\sigma'}$ factorize, $P_{\sigma\sigma'} = p_A^{(\sigma)} p_B^{(\sigma')}$, where $p^{(C)} = \cos^2(\theta/2)$ and $p^{(D)} = 1 - p^{(C)}$. Identifying $p^{(C)}$ with the individual preference to cooperate, we observe that condition (6) in fact ensures that the quantum Prisoners’ Dilemma entails a faithful representation of the most general classical Prisoners’ Dilemma, where each player uses a biased coin in order to decide whether he or she chooses to cooperate or to defect [13]. Of course, the entire set of quantum strategies is much bigger than $S_0$, and it is the quantum sector $S \setminus S_0$ which offers additional degrees of freedom which can be exploited for strategic purposes. Note that our quantization scheme applies to any two-player binary choice symmetric game and, due to the classical correspondence principle [Eq. (6)], is to a great extent canonical.

Factoring out Abelian subgroups which yield nothing but a reparametrization of the quantum sector of the strategic space $S$, a solution of Eq. (6) is given by $\hat{J} = \exp\{i\gamma\, \hat{D} \otimes \hat{D}/2\}$, where $\gamma \in [0, \pi/2]$ is a real parameter. In fact, $\gamma$ is a measure for the game’s entanglement. For a separable game $\gamma = 0$, and the joint probabilities $P_{\sigma\sigma'}$ factorize for all possible pairs of strategies $\hat{U}_A$, $\hat{U}_B$. Figure 2 shows Alice’s expected payoff for $\gamma = 0$. As can be seen in this figure, for any of Bob’s choices $\hat{U}_B$ Alice’s payoff is maximized if she chooses to play $\hat{D}$. The game being symmetric, the same holds for Bob, and $\hat{D} \otimes \hat{D}$ is the equilibrium in dominant strategies. Indeed, separable games do not display any features which go beyond the classical game.
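Equations (1) to (6) translate directly into a few lines of linear algebra. The sketch below is a minimal illustration in plain Python, not the authors' implementation. The helper names (`U`, `kron`, `matvec`, `dagger`, `entangler`, `probabilities`, `payoffs`, `defect_is_dominant`) are assumptions made here, the basis is assumed to be ordered as $|CC\rangle$, $|CD\rangle$, $|DC\rangle$, $|DD\rangle$, and the gate is evaluated through the closed form $\hat{J} = \cos(\gamma/2)\,\mathbb{1} + i\sin(\gamma/2)\,\hat{D}\otimes\hat{D}$, which follows from $(\hat{D}\otimes\hat{D})^2 = \mathbb{1}$. The last helper only samples a coarse grid of strategies, so it illustrates the dominance of $\hat{D}$ in the separable game rather than proving it.

```python
import cmath
import math

# Payoff parameters of Table I: reward, punishment, temptation, sucker's payoff.
R, P, T, S = 3, 1, 5, 0


def U(theta, phi):
    """Strategy matrix of Eq. (3), written in the basis (|C>, |D>)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[cmath.exp(1j * phi) * c, s],
            [-s, cmath.exp(-1j * phi) * c]]


C_HAT = [[1, 0], [0, 1]]    # cooperate, Eq. (4)
D_HAT = [[0, 1], [-1, 0]]   # defect, Eq. (5)


def kron(a, b):
    """Kronecker product of two 2x2 matrices (basis order CC, CD, DC, DD)."""
    return [[a[i // 2][j // 2] * b[i % 2][j % 2] for j in range(4)]
            for i in range(4)]


def matvec(m, v):
    """Apply a 4x4 matrix to a 4-component state vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]


def dagger(m):
    """Hermitian conjugate of a 4x4 matrix."""
    return [[complex(m[j][i]).conjugate() for j in range(4)] for i in range(4)]


def entangler(gamma):
    """J = exp(i*gamma*(D x D)/2) = cos(gamma/2)*1 + i*sin(gamma/2)*(D x D)."""
    dd = kron(D_HAT, D_HAT)
    return [[math.cos(gamma / 2) * (i == j) + 1j * math.sin(gamma / 2) * dd[i][j]
             for j in range(4)] for i in range(4)]


def probabilities(u_a, u_b, gamma):
    """Joint probabilities [P_CC, P_CD, P_DC, P_DD] of the final state, Eq. (1)."""
    j = entangler(gamma)
    psi = matvec(j, [1, 0, 0, 0])          # initial state J|CC>
    psi = matvec(kron(u_a, u_b), psi)      # local moves of Alice and Bob
    psi = matvec(dagger(j), psi)           # disentangling gate J^dagger
    return [abs(amplitude) ** 2 for amplitude in psi]


def payoffs(u_a, u_b, gamma):
    """Expected payoffs (Alice, Bob) according to Eq. (2)."""
    p_cc, p_cd, p_dc, p_dd = probabilities(u_a, u_b, gamma)
    alice = R * p_cc + P * p_dd + T * p_dc + S * p_cd
    bob = R * p_cc + P * p_dd + S * p_dc + T * p_cd   # t and s interchanged
    return alice, bob


def defect_is_dominant(gamma, steps=6, tol=1e-9):
    """Grid check: is D a best reply for Alice to every sampled move of Bob?"""
    grid = [U(m * math.pi / steps, n * math.pi / (2 * steps))
            for m in range(steps + 1) for n in range(steps + 1)]
    for u_b in grid:
        with_defect = payoffs(D_HAT, u_b, gamma)[0]
        if any(payoffs(u_a, u_b, gamma)[0] > with_defect + tol for u_a in grid):
            return False
    return True
```

Calling `payoffs` with the pairs built from `C_HAT` and `D_HAT` at `gamma = 0.0` reproduces the four entries of Table I, and `defect_is_dominant(0.0)` holds on the sampled grid, in line with the statement that a separable game shows nothing beyond the classical one.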
FIG. 2. Alice’s payoff in a separable game. In this and the following plot we have chosen a certain parametrization such that the strategies $\hat{U}_A$ and $\hat{U}_B$ each depend on a single parameter $t \in [-1, 1]$ only: We set $\hat{U}_A = \hat{U}(t\pi, 0)$ for $t \in [0, 1]$ and $\hat{U}_A = \hat{U}(0, -t\pi/2)$ for $t \in [-1, 0)$ (same for Bob). Defection $\hat{D}$ corresponds to the value $t = 1$, cooperation $\hat{C}$ to $t = 0$, and $\hat{Q}$ is represented by $t = -1$. [Plot omitted: payoff axis from 1 to 5; strategy axes marked D, C, Q.]

The situation is entirely different for a maximally entangled game $\gamma = \pi/2$. Here, pairs of strategies exist which have no counterpart in the classical domain, yet by virtue of Eq. (6) the game behaves completely classically if both players decide to play $\phi = 0$. For example, $P_{CC} = |\cos(\phi_A + \phi_B)\cos(\theta_A/2)\cos(\theta_B/2)|^2$ factorizes on $S_0 \otimes S_0$ (i.e., $\phi_A = \phi_B = 0$ fixed), but exhibits nonlocal correlations otherwise. In Fig. 3 we depict Alice’s payoff in the Prisoners’ Dilemma as a function of the strategies $\hat{U}_A$, $\hat{U}_B$. Assuming Bob chooses $\hat{D}$, Alice’s best reply would be

$$\hat{Q} \equiv \hat{U}(0, \pi/2), \qquad \hat{Q} = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} , \tag{7}$$

while assuming Bob plays $\hat{C}$, Alice’s best strategy would be defection $\hat{D}$. Thus, there is no dominant strategy left for Alice. The game being symmetric, the same holds for Bob, i.e., $\hat{D} \otimes \hat{D}$ is no longer an equilibrium in dominant strategies.

Surprisingly, $\hat{D} \otimes \hat{D}$ even ceases to be a Nash equilibrium as both players can improve by unilaterally deviating from the strategy $\hat{D}$. However, concomitant with the disappearance of the equilibrium $\hat{D} \otimes \hat{D}$ a new Nash equilibrium $\hat{Q} \otimes \hat{Q}$ has emerged with payoff $\$_A(\hat{Q}, \hat{Q}) = \$_B(\hat{Q}, \hat{Q}) = 3$. Indeed, $\$_A[\hat{U}(\theta, \phi), \hat{Q}] = \cos^2(\theta/2)\,(3\sin^2\phi + \cos^2\phi) \le 3$ for all $\theta \in [0, \pi]$ and $\phi \in [0, \pi/2]$, and analogously $\$_B(\hat{Q}, \hat{U}_B) \le \$_B(\hat{Q}, \hat{Q})$ for all $\hat{U}_B \in S$ such that no player can gain from unilaterally deviating from $\hat{Q} \otimes \hat{Q}$. It can be shown that $\hat{Q} \otimes \hat{Q}$ is a unique equilibrium, that is, rational reasoning dictates that both players play $\hat{Q}$ as their optimal strategy.

It is interesting to see that $\hat{Q} \otimes \hat{Q}$ is Pareto optimal [2], that is, by deviating from this pair of strategies it is not possible to increase the payoff of one player without lessening the payoff of the other player. In the classical game, only mutual cooperation is Pareto optimal, but it is not an equilibrium solution. One could say that by allowing for quantum strategies the players escape the dilemma [14].

The alert reader may object that, very much as any quantum mechanical system can be simulated on a classical computer, the quantum game proposed here can be played by purely classical means. For instance, Alice and Bob may each communicate their choice of angles to the judge using ordinary telephone lines. The judge computes the values $P_{\sigma\sigma'}$, tosses a four-sided coin which is biased on these values, and returns the payoff according to the outcome of the experiment. While such an implementation yields the proper payoff in this scenario, four real numbers have to be transmitted. This contrasts most dramatically with our quantum mechanical model which is more economical as far as communication resources are concerned. Moreover, any local hidden variable model of the physical scheme presented here predicts inequalities for $P_{\sigma\sigma'}$, as functions of the four angles $\theta_A$, $\theta_B$, $\phi_A$, and $\phi_B$, which are violated by the above expressions for the expected payoff. We conclude that in an environment with limited resources, it is only quantum mechanics which allows for an implementation of the game presented here.
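The bound on Alice's payoff for deviations from $\hat{Q}$ in the maximally entangled game can be verified in a few lines. The following worked calculation is a sketch added for illustration; the shorthands $c = \cos(\theta/2)$ and $s = \sin(\theta/2)$ are introduced here and do not appear in the Letter. It assumes $\gamma = \pi/2$ and uses $(\hat{D}\otimes\hat{D})^2 = \mathbb{1}$, so that $\hat{J} = (\mathbb{1} + i\,\hat{D}\otimes\hat{D})/\sqrt{2}$.

```latex
% Action of the building blocks on the classical basis, from Eqs. (3), (5), (7):
%   D|C> = -|D>,  D|D> = |C>,  Q|C> = i|C>,  Q|D> = -i|D>,
% hence D(x)D maps |CC> -> |DD>, |DD> -> |CC>, |CD> -> -|DC>, |DC> -> -|CD>.
\begin{align*}
\hat{J}\,|CC\rangle
  &= \tfrac{1}{\sqrt{2}}\bigl(|CC\rangle + i\,|DD\rangle\bigr),\\
\hat{U}(\theta,\phi)\,|C\rangle
  &= e^{i\phi}c\,|C\rangle - s\,|D\rangle, \qquad
\hat{U}(\theta,\phi)\,|D\rangle = s\,|C\rangle + e^{-i\phi}c\,|D\rangle,\\
\bigl(\hat{U}(\theta,\phi)\otimes\hat{Q}\bigr)\hat{J}\,|CC\rangle
  &= \tfrac{1}{\sqrt{2}}\bigl(i e^{i\phi}c\,|CC\rangle + s\,|CD\rangle
     - i s\,|DC\rangle + e^{-i\phi}c\,|DD\rangle\bigr),\\
|\psi_f\rangle
  = \hat{J}^\dagger\bigl(\hat{U}(\theta,\phi)\otimes\hat{Q}\bigr)\hat{J}\,|CC\rangle
  &= -c\sin\phi\,|CC\rangle + s\,|CD\rangle + c\cos\phi\,|DD\rangle,\\
P_{CC} = c^2\sin^2\phi, \quad P_{CD} &= s^2, \quad P_{DC} = 0, \quad
P_{DD} = c^2\cos^2\phi,\\
\$_A\bigl[\hat{U}(\theta,\phi),\hat{Q}\bigr]
  &= 3\,c^2\sin^2\phi + c^2\cos^2\phi
   = \cos^2(\theta/2)\,\bigl(1 + 2\sin^2\phi\bigr) \le 3 .
\end{align*}
```

Equality requires $\theta = 0$ and $\phi = \pi/2$, i.e., $\hat{U} = \hat{Q}$, so within the two-parameter set of Eq. (3) Alice cannot gain by deviating unilaterally from $\hat{Q}$. The result for $P_{CC}$ agrees with the general expression $|\cos(\phi_A + \phi_B)\cos(\theta_A/2)\cos(\theta_B/2)|^2$ evaluated at $\phi_B = \pi/2$, $\theta_B = 0$.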
FIG. 3. Alice’s payoff for a maximally entangled game. The parametrization is chosen as in Fig. 2. [Plot omitted: payoff axis from 1 to 5; strategy axes marked D, C, Q.]

So far we have considered fair games, where both players had access to a common strategic space. What happens when we introduce an unfair situation (Alice may use a quantum strategy, i.e., her strategic space is still $S$, while Bob is restricted to apply only “classical strategies” characterized by $\phi_B = 0$)? In this case, Alice is well advised to play

$$\hat{M} = \hat{U}(\pi/2, \pi/2), \qquad \hat{M} = \frac{1}{\sqrt{2}} \begin{pmatrix} i & 1 \\ -1 & -i \end{pmatrix} , \tag{8}$$

(the “miracle move”), giving her at least reward $r = 3$ as payoff, since $\$_A[\hat{M}, \hat{U}(\theta, 0)] \ge 3$ for any $\theta \in [0, \pi]$, leaving Bob with $\$_B[\hat{M}, \hat{U}(\theta, 0)] \le 1/2$ [see Fig. 4(a)]. Hence, if in an unfair game Alice can be sure that Bob plays $\hat{U}(\theta, 0)$, she may choose “always-$\hat{M}$” as her preferred strategy in an iterated game. This certainly
outperforms tit-for-tat, but one must keep in mind that the assumed asymmetry is essential for this argument.

FIG. 4. Quantum versus classical strategies: (a) Alice’s payoff as a function of $\theta$ when Bob plays $\hat{U}(\theta, 0)$ [$\hat{U}(0, 0) = \hat{C}$ and $\hat{U}(\pi, 0) = \hat{D}$] and Alice chooses $\hat{C}$ (solid line), $\hat{D}$ (dotted line), or $\hat{M}$ (dashed line). (b) The expected payoff that Alice can always attain in an unfair game as a function of the entanglement parameter $\gamma$. [Plots omitted: payoff axis from 1 to 5 in both panels.]

It is moreover interesting to investigate how Alice’s advantage in an unfair game depends on the degree of entanglement of the initial state $|\psi_0\rangle$. The minimal expected payoff $m$ that Alice can always attain by choosing an appropriate strategy $\hat{U}_A$ is given by

$$m = \max_{\hat{U}_A \in S} \; \min_{\hat{U}_B = \hat{U}(\theta, 0)} \$_A(\hat{U}_A, \hat{U}_B) ; \tag{9}$$

Alice will not settle for anything less than this quantity. By considering $m$ a function of the entanglement parameter $\gamma \in [0, \pi/2]$, it is clear that $m(0) = 1$ (since in this case the dominant strategy $\hat{D}$ is the optimal choice) while for maximal entanglement we find $m(\pi/2) = 3$ which is achieved by playing $\hat{M}$. Figure 4(b) shows $m$ as a function of the entanglement parameter $\gamma$. We observe that $m$ is in fact a monotone increasing function of $\gamma$, and the maximal advantage is only accessible for maximal entanglement. Furthermore, Alice should deviate from the strategy $\hat{D}$ if, and only if, the degree of entanglement exceeds a certain threshold value $\gamma_{\rm th} = \arcsin(1/\sqrt{5}) \approx 0.464$. The observed threshold behavior is in fact reminiscent of a first-order phase transition in Alice’s optimal strategy: At the threshold she should discontinuously change her strategy from $\hat{D}$ to $\hat{Q}$.

In summary, we have demonstrated that novel features emerge if classical games such as the Prisoners’ Dilemma are extended into the quantum domain. We have introduced a correspondence principle which guarantees that the performance of a classical game and its quantum extension can be compared in an unbiased manner. Very much as in quantum cryptography and computation, we have found superior performance of the quantum strategies if entanglement is present.

This research was triggered by an inspiring talk by Artur Ekert on quantum computation. We also acknowledge fruitful discussions with S. M. Barnett, C. H. Bennett, R. Dum, T. Felbinger, P. L. Knight, H.-K. Lo, A. Sanpera, and P. Zanardi. This work was supported by the DFG.

[1] J. von Neumann and O. Morgenstern, The Theory of Games and Economic Behaviour (Princeton University Press, Princeton, NJ, 1947).
[2] R. B. Myerson, Game Theory: An Analysis of Conflict (MIT Press, Cambridge, MA, 1991).
[3] R. Axelrod, The Evolution of Cooperation (Basic Books, New York, 1984); R. Dawkins, The Selfish Gene (Oxford University Press, Oxford, 1976).
[4] C. H. Bennett, F. Bessette, G. Brassard, L. Salvail, and J. Smolin, J. Cryptol. 5, 3 (1992).
[5] A. K. Ekert, Phys. Rev. Lett. 67, 661 (1991).
[6] N. Gisin and B. Huttner, Phys. Lett. A 228, 13 (1997).
[7] R. F. Werner, Phys. Rev. A 58, 1827 (1998).
[8] D. A. Meyer, Phys. Rev. Lett. 82, 1052 (1999).
[9] L. Goldenberg, L. Vaidman, and S. Wiesner, Phys. Rev. Lett. 82, 3356 (1999).
[10] Alice’s reasoning goes as follows: “If Bob cooperates, my payoff will be maximal if, and only if, I defect. If, on the other hand, Bob defects, my payoff will again be maximal if, and only if, I defect. Hence I shall defect.”
[11] The Prisoners’ Dilemma must be distinguished from its iterated versions where two players play the simple Prisoners’ Dilemma several times while keeping track of the game’s history. In a computer tournament conducted by Axelrod it was shown that a particular strategy tit-for-tat outperforms all other strategies [3].
[12] D. Deutsch, Proc. R. Soc. London A 425, 73 (1989).
[13] Probabilistic strategies of this type are called mixed strategies in game theory.
[14] In a more general treatment, one should include the possibility that each player can resort to any local operation that quantum mechanics allows for. That is, each player may apply any completely positive mapping represented by operators $A_i$ (Alice) and $B_i$ (Bob), $i = 1, 2, \ldots$, respectively, fulfilling the trace-preserving properties $\sum_i A_i^\dagger A_i = \mathbb{1}$, $\sum_i B_i^\dagger B_i = \mathbb{1}$. It can then be shown that with these strategic options the unique Nash equilibrium of the main text is replaced by a continuous set and one isolated Nash equilibrium. This attracts the players’ attention and will make the players expect and therefore fulfill it (the focal point effect [2]). The focal equilibrium is the one where Alice maps the initial state $J|\psi_0\rangle\langle\psi_0|J$ to $\mathbb{1}/4$, and Bob chooses the same operation. This pair of strategies yields an expected payoff of 2.25 to each player and is therefore again more efficient than the equilibrium in dominant strategies in the classical game.