Decision Making Using Game Theory
Game theory is a key element in most decision-making processes involving two or more people or organisations. This book explains how game theory can predict the outcome of complex decision-making processes, and how it can help you to improve your own negotiation and decision-making skills. It is grounded in well-established theory, yet the wide-ranging international examples used to illustrate its application offer a fresh approach to what is becoming an essential weapon in the armoury of the informed manager. The book is accessibly written, explaining in simple terms the underlying mathematics behind games of skill, before moving on to more sophisticated topics such as zero-sum games, mixed-motive games, and multi-person games, coalitions and power. Clear examples and helpful diagrams are used throughout, and the mathematics is kept to a minimum. It is written for managers, students and decision makers in any field.
Dr Anthony Kelly is a lecturer at the University of Southampton Research & Graduate School
of Education where he teaches game theory and decision making to managers and students.
Anthony Kelly
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo
Cambridge University Press
The Edinburgh Building, Cambridge, United Kingdom
Published in the United States of America by Cambridge University Press, New York
www.cambridge.org
Information on this title: www.cambridge.org/9780521814621
Cambridge University Press 2003
This book is in copyright. Subject to statutory exception and to the provision of
relevant collective licensing agreements, no reproduction of any part may take place
without the written permission of Cambridge University Press.
First published in print format 2003
isbn-13 978-0-511-06494-4 eBook (NetLibrary)
isbn-10 0-511-06494-2 eBook (NetLibrary)
isbn-13 978-0-521-81462-1 hardback
isbn-10 0-521-81462-6 hardback
Contents

Preface
Introduction
   Terminology
   Classifying games
   A brief history of game theory
   Layout
Games of skill
Games of chance
Repeated games
Rationality
   Indeterminacy
   Inconsistency
   Conclusion
Bibliography
Index
Preface
Introduction
Man is a gaming animal. He must always be trying to get the better in something or other.
Charles Lamb (1775–1834), Essays of Elia
vying for business from a common finite catchment area. Each has to decide whether or not to reduce prices, without knowing what the others have decided. Assuming that turnover increases when prices are dropped, various strategic combinations result in gains or losses for some of the retailers, but if one retailer gains customers, another must lose them. So this is a zero-sum non-cooperative game and, unlike cooperative games, players need to conceal their intentions from each other.
A third category of game represents situations where the interests of players are partly opposed and partly coincident. Say, for example, the teachers' union at a school is threatening not to participate in parents' evenings unless management rescinds the redundancy notice of a long-serving colleague. Management refuses. The union now complicates the game by additionally threatening not to cooperate with preparations for government inspection, if their demands are not met. Management has a choice between conceding and refusing, and whichever option it selects, the union has four choices: to resume both normal work practices; to participate in parents' evenings only; to participate in preparations for the inspection only; or not to resume participation in either. Only one of the possible strategic combinations leads to a satisfactory outcome from the management's point of view (management refusing to meet the union's demands notwithstanding the resumption of normal work), although clearly some outcomes are worse than others. Both players (management and union) prefer some outcomes to others. For example, both would rather see a resumption of participation in parents' evenings (since staff live in the community and enrolment depends on it) than not to resume participation in either. So the players' interests are simultaneously opposed and coincident. This is an example of a mixed-motive game.
Game theory aims to find optimal solutions to situations of conflict and cooperation such as those outlined above, under the assumption that players are instrumentally rational and act in their own best interests. In some cases, solutions can be found. In others, although formal attempts at a solution may fail, the analytical synthesis itself can illuminate different facets of the problem. Either way, game theory offers an interesting perspective on the nature of strategic selection in both familiar and unusual circumstances.
The assumption of rationality can be justified on a number of levels.
At its most basic level, it can be argued that players behave rationally by
instinct, although experience suggests that this is not always the case,
since decision makers frequently adopt simplistic algorithms which
lead to sub-optimal solutions.
Secondly, it can be argued that there is a kind of natural selection at
work which inclines a group of decisions towards the rational and
optimal. In business, for example, organisations that select sub-optimal
strategies eventually shut down in the face of competition from optimising organisations. Thus, successive generations of decisions are
increasingly rational, though the extent to which this competitive
evolution transfers to not-for-profit sectors like education and the public services is unclear.
Finally, it has been suggested that the assumption of rationality that
underpins game theory is not an attempt to describe how players
actually make decisions, but merely that they behave as if they were not
irrational (Friedman, 1953). All theories and models are, by definition, simplifications and should not be dismissed simply because they fail to represent all realistic possibilities. A model should only be discarded if its predictions are false or useless, and game theoretic models are neither. Indeed, as with scientific theories, minor departures from full realism can often lead to a greater understanding of the issues (Romp, 1997).
Terminology
Game theory represents an abstract model of decision making, not the
social reality of decision making itself. Therefore, while game theory
ensures that a result follows logically from a model, it cannot ensure
that the result itself represents reality, except in so far as the model is an
accurate one. To describe this model accurately requires practitioners
to share a common language which, to the uninitiated, might seem
excessively technical. This is unavoidable. Since game theory represents
the interface of mathematics and management, it must of necessity
adopt a terminology that is familiar to both.
The basic constituents of any game are its participating, autonomous
decision makers, called players. Players may be individual persons,
organisations or, in some cases, nature itself. When nature is desig-
[Table: the management–union game. For each of management's two choices (concede or refuse), the union has four possible responses, giving the eight strategic combinations described in the text.]
Classifying games
There are three categories of games: games of skill; games of chance; and games of strategy. Games of skill are one-player games whose defining property is the existence of a single player who has complete control over all the outcomes. Sitting an examination is one example. Games of skill should not really be classified as games at all, since the ingredient of interdependence is missing. Nevertheless, they are discussed in the next chapter because they have many applications in management situations.
Games of chance are one-player games against nature. Unlike games of skill, the player does not control the outcomes completely and strategic selections do not lead inexorably to certain outcomes. The outcomes of a game of chance depend partly on the player's choices and partly on nature, who is a second player. Games of chance are further categorised as either involving risk or involving uncertainty. In the former, the player knows the probability of each of nature's responses and therefore knows the probability of success for each of his or her strategies. In games of chance involving uncertainty, probabilities cannot meaningfully be assigned to any of nature's responses (Colman, 1982), so the player's outcomes are uncertain and the probability of success unknown.
Games of strategy are games involving two or more players, not including nature, each of whom has partial control over the outcomes. In a way, since the players cannot assign probabilities to each other's choices, games of strategy are games involving uncertainty. They can be sub-divided into two-player games and multi-player games. Within each of these two sub-divisions, there are three further sub-categories depending on the way in which the pay-off functions are related to one another: whether the players' interests are completely coincident, completely conflicting, or partly coincident and partly conflicting.
Games of strategy, whether two-player or multi-player, in which the players' interests coincide, are called cooperative games of strategy. Games in which the players' interests are conflicting (i.e. strictly competitive games) are known as zero-sum games of strategy, so called because the pay-offs always add up to zero for each outcome of a fair game, or to another constant if the game is biased.
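The zero-sum property is easy to check mechanically: sum the two players' pay-offs for every outcome and see whether a single constant results. A short illustrative sketch in Python (the example matrices and function name are not from the text):

```python
def is_constant_sum(payoffs):
    """Return the common pay-off sum if every outcome's pay-offs add up to
    the same constant (0 for a fair zero-sum game); otherwise return None."""
    sums = {a + b for row in payoffs for (a, b) in row}
    return sums.pop() if len(sums) == 1 else None

# Strictly competitive: one player's gain is exactly the other's loss.
matching_pennies = [[(1, -1), (-1, 1)],
                    [(-1, 1), (1, -1)]]

# Mixed-motive: interests are partly opposed and partly coincident.
mixed_motive = [[(3, 3), (0, 5)],
                [(5, 0), (1, 1)]]
```

Here `is_constant_sum(matching_pennies)` yields the constant 0, while the mixed-motive matrix yields no constant at all, because its outcome sums differ.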
Figure 1.1 A taxonomy of games.
[Tree diagram: game theory divides into games of skill, games of chance (involving risk or involving uncertainty) and games of strategy. Games of strategy split into two-person and multi-person games. Two-person games are purely cooperative (including the minimal social situation), mixed-motive or zero-sum; zero-sum games may be finite or infinite, of perfect or imperfect information, with or without saddle points and optimal equilibrium points, and are approached through dominance and admissibility, mixed strategies or the Shapley–Snow method. Multi-person games are cooperative or non-cooperative, with essential or non-essential coalitions, and are analysed through power indices (Shapley–Shubik, Johnston, Deegan–Packel, Banzhaf), duopoly models and repeated games, including leadership, heroic, exploitation and martyrdom games.]
In 1921, the eminent French academician Émile Borel began publishing on gaming strategies, building on the work of Zermelo and others. Over the course of the next six years, he published five papers on the subject, including the first modern formulation of a mixed-strategy game. He appears to have been unaware of Waldegrave's earlier work. Borel (1924) attempted, but failed, to prove the minimax theorem. He went so far as to suggest that it could never be proved, but as is so often the case with rash predictions, he was promptly proved wrong! The minimax theorem was proved for the general case in December 1926, by the Hungarian mathematician, John von Neumann. The complicated proof, published in 1928, was subsequently modified by von Neumann himself (1937), Jean Ville (1938), Hermann Weyl (1950) and others. Its predictions were later verified by experiment to be accurate to within one per cent and it remains a keystone in game theoretic constructions (O'Neill, 1987).
Borel claimed priority over von Neumann for the discovery of game theory. His claim was rejected, but not without some disagreement. Even as late as 1953, Maurice Fréchet and von Neumann were engaged in a dispute on the relative importance of Borel's early contributions to the new science. Fréchet maintained that due credit had not been paid to his colleague, while von Neumann maintained, somewhat testily, that until his minimax proof, what little had been done was of little significance anyway.
The verdict of history is probably that they did not give each other much credit. Von Neumann, tongue firmly in cheek, wrote that he considered it an honour to have laboured on ground over which Borel had passed (Fréchet, 1953), but the natural competition that can sometimes exist between intellectuals of this stature, allied to some local Franco-German rivalry, seems to have got the better of common sense.
In addition to his prodigious academic achievements, Borel had a
long and prominent career outside mathematics, winning the Croix de
Guerre in the First World War, the Resistance Medal in the Second
World War and serving his country as a member of parliament,
Minister for the Navy and president of the prestigious Institut de
France. He died in 1956.
Von Neumann found greatness too, but by a different route. He was thirty years younger than Borel, born in 1903 to a wealthy Jewish banking family in Hungary. Like Borel, he was a child prodigy.
removed the two-person zero-sum restriction from Zermelo's theorem, by replacing the concept of best individual strategy with that of the Nash equilibrium. He proved that every n-person game of perfect information has an equilibrium in pure strategies and, as part of that proof, introduced the notion of sub-games. This too became an important stepping-stone to later developments, such as Selten's concept of sub-game perfection.
The triad formed by these three works (von Neumann–Morgenstern, Luce–Raiffa and Nash) was hugely influential. It encouraged a community of game theorists to communicate with each other and many important concepts followed as a result: the notion of cooperative games, which Harsanyi (1966) was later to define as ones in which promises and threats were enforceable; the study of repeated games, in which players are allowed to learn from previous interactions (Milnor & Shapley, 1957; Rosenthal, 1979; Rosenthal & Rubinstein, 1984; Shubik, 1959); and bargaining games where, instead of players simply bidding, they are allowed to make offers, counteroffers and side payments (Aumann, 1975; Aumann & Peleg, 1960; Champsaur, 1975; Hart, 1977; Mas-Colell, 1977; Peleg, 1963; Shapley & Shubik, 1969).
The Second World War had highlighted the need for a strategic approach to warfare and effective intelligence-gathering capability. In
the United States, the CIA and other organisations had been set up to
address those very issues, and von Neumann had been in the thick of it,
working on projects such as the one at Los Alamos to develop the
atomic bomb. When the war ended, the military establishment was
naturally reluctant to abandon such a fruitful association so, in 1946,
the US Air Force committed $10 million of research funds to set up the
Rand Corporation. It was initially located at the Douglas Aircraft
Company headquarters, but moved to purpose-built facilities in Santa
Monica, California. Its remit was to consider strategies for intercontinental warfare and to advise the military on related matters. The
atmosphere was surprisingly un-military: participants were well paid,
free of administrative tasks and left to explore their own particular
areas of interest. As befitted the political climate of the time, research
was pursued in an atmosphere of excitement and secrecy, but there was
ample opportunity for dissemination too. Lengthy colloquia were held
in the summer months, some of them specific to game theory, though
security clearance was usually required for attendance (Mirowski,
1991).
It was a period of great activity at Rand from which a new rising star,
Lloyd Shapley, emerged. Shapley, who was a student with Nash at
Princeton and was considered for the same Nobel Prize in 1994, made
numerous important contributions to game theory: with Shubik, he
developed an index of power (Shapley & Shubik, 1954 & 1969); with
Donald Gillies, he invented the concept of the core of a game (Gale &
Shapley, 1962; Gillies, 1959; Scarf, 1967); and in 1964, he defined his value for multi-person games. Sadly, by this time, the Rand Corporation had acquired something of a Dr Strangelove image, reflecting a growing popular cynicism during the Vietnam war. The mad wheelchair-bound strategist in the movie of the same name was even thought by some to be modelled on von Neumann.
The decline of Rand as a military think-tank not only signalled a shift
in the axis of power away from Princeton, but also a transformation of
game theory from the military to the socio-political arena (Rapoport &
Orwant, 1962). Some branches of game theory transferred better than
others to the new paradigm. Two-person zero-sum games, for
example, though of prime importance to military strategy, now had
little application. Conversely, two-person mixed-motive games, hardly the most useful model for military strategy, found numerous applications in political science (Axelrod, 1984; Schelling, 1960). Prime among these was the ubiquitous prisoner's dilemma game, unveiled in a lecture by A.W. Tucker in 1950, which represents a socio-political scenario in which everyone suffers by acting selfishly, though rationally. As the years went by, this particular game was found in a variety of guises, from drama (The Caretaker by Pinter) to music (Tosca by Puccini). It provoked such widespread and heated debate that it was nearly the death of game theory in a political sense (Plon, 1974), until it was experimentally put to bed by Robert Axelrod in 1981.
Another important application of game theory was brought to the socio-political arena with the publication of the Shapley–Shubik (1954) and Banzhaf (1965) indices of power. They provided political scientists with an insight into the non-trivial relationship between influence and weighted voting, and were widely used in courts of law (Mann & Shapley, 1964; Riker & Ordeshook, 1973) until they were found not to agree with each other in certain circumstances (Straffin, 1977).
In 1969, Robin Farquharson used the game theoretic concept of strategic choice to propose that, in reality, voters exercised their franchise not sincerely, according to their true preferences, but tactically, to bring about a preferred outcome. Thus the concept of strategic voting was born. Following publication of a simplified version nine years later (McKelvey & Niemi, 1978), it became an essential part of political theory.
After that, game theory expanded dramatically. Important centres of research were established in many countries and at many universities. It was successfully applied to many new fields, most notably evolutionary biology (Maynard Smith, 1982; Selten, 1980) and computer science, where system failures are modelled as competing players in a destructive game designed to model worst-case scenarios.
Most recently, game theory has also undergone a renaissance as a result of its expansion into management theory, and the increased importance and accessibility of economics in what Alain Touraine (1969) termed the post-industrial era. However, such progress is not without its dangers. Ever more complex applications inspire ever more complex mathematics as a shortcut for those with the skill and knowledge to use it. The consequent threat to game theory is that the fundamentals are lost to all but the most competent and confident theoreticians. This would be a needless sacrifice because game theory, while undeniably mathematical, is essentially capable of being understood and applied by those with no more than secondary school mathematics. In a very modest way, this book attempts to do just that, while offering a glimpse of the mathematical wonderland beyond for those with the inclination to explore it.
Layout
The book basically follows the same pattern as the taxonomy of games
laid out in Figure 1.1. Chapter 2 describes games of skill and the solution of linear programming and optimisation problems using differential calculus and the Lagrange method of partial derivatives. In doing so, it describes the concepts of utility functions, constraint sets, local optima and the use of second derivatives.
Chapter 3 describes games of chance in terms of basic probability theory. Concepts such as those of sample space, random variable and distribution function are developed from first principles and
Games of skill
It is not from the benevolence of the butcher, the brewer, or the baker, that we expect our dinner, but from their regard to their own interest.
Adam Smith (1789), The Wealth of Nations
Games of skill are one-player games. Since they do not involve any other player, and when they involve nature it is under the condition of certainty, they are not really regarded as genuine games. Nature does not constitute a genuine second player, as in the case of games of chance, because nothing nature does affects the outcomes of the player's choices. The solitary player in games of skill knows for certain what the outcome of any choice will be. The player completely controls the outcomes. Solving a crossword puzzle is a game of skill, but playing golf is not, since the choices that the player makes do not lead to outcomes that are perfectly predictable. Golf is a game of chance involving uncertainty, although some would call it a form of moral effort! Nature influences the outcomes to an extent that depends on the player's skill, but with a probability that is not known.
The operation of single-player decision making is discussed in the following sections. The problem of linear programming and optimisation, where a player wishes to optimise some utility function within a set of constraints, is considered with the help of some realistic examples. The application of some basic concepts from calculus, including the Lagrange method of partial derivatives, is also discussed.
Figure 2.1 [Graph: a curve y = f(x) with a stationary point at x = p, where f′(p) = 0; to the left, f′(a) > 0 and to the right, f′(b) < 0, so p is a maximum.]
Figure 2.2 [Graph: a curve y = f(x) with a stationary point at x = p, where f′(p) = 0; here f′(a) < 0 and f′(b) > 0, so p is a minimum.]
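The conditions pictured in Figures 2.1 and 2.2 can be tested numerically by approximating the derivative with a finite difference and inspecting its sign on either side of the stationary point. A minimal Python sketch, using illustrative example functions that are not from the text:

```python
def derivative(f, x, h=1e-6):
    """Central finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def classify_stationary_point(f, p, step=0.1):
    """Classify p from the sign of f' on either side of it."""
    left, right = derivative(f, p - step), derivative(f, p + step)
    if left > 0 > right:
        return "maximum"   # f' passes from positive to negative (Figure 2.1)
    if left < 0 < right:
        return "minimum"   # f' passes from negative to positive (Figure 2.2)
    return "neither"

f = lambda x: -(x - 2) ** 2 + 5   # concave, with a peak at x = 2
g = lambda x: (x - 3) ** 2        # convex, with a trough at x = 3
```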
Figure 2.3 The constraint set for the conversion of a nurses' residence for in-patient and out-patient use. [Graph: the feasible region bounded by two constraint lines meeting at the vertex (40, 20), with intercepts at y = 40 and x = 60.]
Figure 2.4 Pay-off matrix for the conversion of a nurses' residence for in-patient and out-patient use.

    Strategy (patient mix)    Pay-off (profit) tx + ny
    (0, 40)                   40n
    (40, 20)                  40t + 20n
    (60, 0)                   60t
Figure 2.5 Sample numerical profits per month for the conversion of a nurses' residence for in-patient and out-patient use.

    Patient mix    If t = 1600     If t = 1400     If t = 1000
                   and n = 1500    and n = 1700    and n = 2100
    (0, 40)        60 000          68 000          84 000
    (40, 20)       94 000          90 000          82 000
    (60, 0)        96 000          84 000          60 000
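Because the pay-off tx + ny is linear, its maximum over the constraint set always occurs at one of the vertices of Figure 2.3, which is how the entries in Figure 2.5 arise. A brief Python check of those figures (the function names are illustrative):

```python
# Vertices of the constraint set in Figure 2.3: (in-patients x, out-patients y).
vertices = [(0, 40), (40, 20), (60, 0)]

def monthly_profit(t, n, x, y):
    """Pay-off tx + ny: t per in-patient place, n per out-patient place."""
    return t * x + n * y

def best_mix(t, n):
    """Return (profit, vertex) for the most profitable patient mix."""
    return max((monthly_profit(t, n, x, y), (x, y)) for x, y in vertices)
```

For t = 1600 and n = 1500 this reproduces the first column of Figure 2.5, with the optimum 96 000 at (60, 0); the other two columns shift the optimum to (40, 20) and then (0, 40).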
Let x represent the number of full-time staff employed at the call centre.
Let y represent the number of part-time staff.
Clearly,

    x ≥ 30 and y ≥ 0
Figure 2.6 [Graph: R(x) plotted against x, with marked values 2800, 3000 and 3200 and points near x = 2 and x = 4.]
Figure 2.7 [Graph: the constraint set for the call centre, bounded by x ≥ 30, y ≥ 0 and x + 2y/5 ≤ 40, showing the vertices (30, 25) and (30, 10) and x-axis values 34 and 40.]
Figure 2.8 [Cost at each vertex of the constraint set:]

    Strategy    Cost (40x + 14y)
    (34, 0)     1360
    (30, 10)    1340
    (30, 25)    1550
    (40, 0)     1600
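As with the earlier example, the linear cost 40x + 14y is minimised at a corner of the constraint set, so the cheapest staffing strategy can be found by evaluating the vertices. A short Python confirmation (names are illustrative):

```python
# Candidate vertices of the call centre's constraint set.
corners = [(34, 0), (30, 10), (30, 25), (40, 0)]

def weekly_cost(x, y):
    """Cost 40x + 14y for x full-time and y part-time staff."""
    return 40 * x + 14 * y

# The minimum-cost strategy among the candidate vertices.
cheapest = min(corners, key=lambda v: weekly_cost(*v))
```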
    n[(h − 4)^1/2 − h]/a

The company wishes to determine the optimal number of teaching hours for its trainees so as to maximise overall results, subject to a minimum requirement of 4 hours per week imposed by the Institute of Chartered Accountants code of practice.
Figure 2.9 A graphic representation of the solution to the problem of examination success and the time given to direct tutoring. [Graph: r(h) plotted for h between 3.5 and 5.5, with the stationary point (4.25, 3.75) marked.]

below the benchmark when the number of hours per week given to direct tutoring is 4.25.
A graphic representation of the above function and its solution can be found on Figure 2.9, where the constant quotient n/a is normalised to unity for convenience.
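Taking the tutoring function to involve the term (h − 4)^1/2, consistent with the stationary point at h = 4.25 quoted above, the optimum can be recovered numerically by finding where the derivative 1/(2(h − 4)^1/2) − 1 vanishes. A sketch in Python, under that assumption (bisection; the function names are illustrative):

```python
def r_prime(h):
    """Derivative of (h - 4) ** 0.5 - h, with the quotient n/a set to 1."""
    return 1 / (2 * (h - 4) ** 0.5) - 1

def bisect(f, lo, hi, tol=1e-10):
    """Locate a root of f in [lo, hi], assuming f changes sign there."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

h_star = bisect(r_prime, 4.01, 6.0)   # stationary point of r(h)
```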
Let m = the number of materials funding units per week per project.
Let t = the number of time units per week per project.
Let s = the expected number of units of output = 380.
Let c = the cost of the projects per week.
Clearly,

    c = 25m + 60t

and the department wishes to minimise this equation subject to the constraint

    20tm^1/2 = 380

The following three Lagrange equations must be solved:

    ∂c/∂m = λ ∂s/∂m
    ∂c/∂t = λ ∂s/∂t
    20tm^1/2 = 380

The partial derivatives are:

    ∂c/∂m = 25
    ∂c/∂t = 60
    ∂s/∂m = 10t/m^1/2
    ∂s/∂t = 20m^1/2
Figure 2.10 Relationship between modelling output and time units in the Creative Design department. [Graph: output s plotted against time units t.]
Substituting these into the first two Lagrange equations gives

    25 = λ(10t/m^1/2)
    60 = λ(20m^1/2)

From the second Lagrange equation,

    λ = 3/m^1/2

and substituting this into the first gives

    t = 5m/6

and substituting this for t in the third Lagrange equation gives the first solution:

    m = 8.04

Therefore, from the equation above,

    t = 6.7
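The solution can be checked directly: eliminating λ from the Lagrange conditions leaves t = 5m/6, and substituting into the constraint 20tm^1/2 = 380 gives (100/6)m^3/2 = 380, i.e. m^3/2 = 22.8. A brief Python verification:

```python
# From t = 5 * m / 6, the constraint 20 * t * m ** 0.5 == 380 becomes
# (100 / 6) * m ** 1.5 == 380, i.e. m ** 1.5 == 22.8.
m = 22.8 ** (2 / 3)      # approximately 8.04
t = 5 * m / 6            # approximately 6.7
cost = 25 * m + 60 * t   # the weekly cost c being minimised
```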
31
SM
1/2
M
Figure 2.11
Relationship between modelling output and materials units in the Creative Design department.
Games of chance
Chaos umpire sits, And by decision more embroils the fray By which he reigns; next him high arbiter Chance governs all.
John Milton (1608–1674), Paradise Lost
Figure 3.1 [Diagram: each element si of the sample space is assigned a probability P{si} = pi.]
Example 3.1
The six faces of a die have colours rather than the usual numbers. The sample space is:

    S = {red, orange, yellow, green, blue, white}

The die is fair, so the probability of each outcome, P{si} = pi = 1/6.
The random variable, X, assigns to each face of the die a real number. In other words, say:

    X(red) = 1, X(orange) = 2, X(yellow) = 3
    X(green) = 4, X(blue) = 5, X(white) = 6

The distribution function, F, is now the function that turns each random variable into a probability. In this example,

    F(x) = P(X) = Z(si)/6

where Z(si) is the number of integers less than or equal to the integer representing X(si). So the obvious values for this distribution function are:

    F[x(r)] = 1/6, F[x(o)] = 1/3, F[x(y)] = 1/2
    F[x(g)] = 2/3, F[x(b)] = 5/6, F[x(w)] = 1

and its graph can be seen on Figure 3.2.
Figure 3.2 The graph of the distribution function of the random variable in Example 3.1. [Step graph: y = F(x) rising from 1/6 at x = 1 to 1 at x = 6.]
Three important measurements, commonly used in statistics, are associated with random variables and their distribution functions: expected value, variance and standard deviation. In particular, expected value is the basis for solving many games of chance.
The expected value, E(X), of the random variable X is the real number:

    E(X) = Σ pi xi, from i = 1 to n, where xi = X(si)

In Example 3.1 above,

    E(X) = 1/6(1) + 1/6(2) + 1/6(3) + 1/6(4) + 1/6(5) + 1/6(6) = 3.5
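Example 3.1 can be replayed exactly with rational arithmetic: assign each face its value, accumulate the distribution function F and the expected value E(X). A small Python sketch (the data layout is illustrative, not from the text):

```python
from fractions import Fraction

# Sample space and random variable from Example 3.1.
faces = ["red", "orange", "yellow", "green", "blue", "white"]
X = {face: value for value, face in enumerate(faces, start=1)}
p = Fraction(1, 6)   # fair die: P{si} = pi = 1/6

def F(x):
    """Distribution function F(x) = P(X <= x)."""
    return sum(p for face in faces if X[face] <= x)

# Expected value E(X) = sum of pi * xi over the sample space.
E = sum(p * X[face] for face in faces)
```

This reproduces F[x(o)] = 1/3 and E(X) = 3.5 exactly.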
    F(x) = ∫ from −∞ to x of f(x) dx, x ∈ R

    P(a ≤ X ≤ b) = ∫ from a to b of f(x) dx = F(b) − F(a)

Using this new (continuous function) notation, the expected value and variance are given by the formulae:

    E(X) = ∫ from −∞ to +∞ of x f(x) dx

and

    V(X) = ∫ from −∞ to +∞ of [x − E(X)]² f(x) dx
The figures in Table 3.1 offer the solution. The rail company should apply for EU funding, since its expected value is 720m, compared with 690m from the Public Transport Reconstruction Fund.
However, simple and attractive though this technique may be, there are a number of serious objections. In the first place, not every decision can be made on the basis of monetary value alone. If that were the case, no one would ever take part in lotteries or go to casinos since, in the long term, it is almost certain that one would lose. Even the notion of
Table 3.1 Pay-off matrix for a company applying for funding for a new high-speed rail link

    Source of    Amount, A    Past       Probability,     Expected value
    grant        (m)          awards     Q                (E = Q × A) (m)
    EU           4000          3         0.40 × 3/20       240
                 2000          7         0.40 × 7/20       280
                 1000         10         0.40 × 10/20      200
                 (total)      20         0.40              720
    PTRF         2500          5         0.60 × 5/50       150
                 1500         15         0.60 × 15/50      270
                  750         30         0.60 × 30/50      270
                 (total)      50         0.60              690
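The two expected values can be reproduced by weighting each award size by the overall success probability and its share of past awards. A short Python check (the data layout is illustrative, and the PTRF counts of 5, 15 and 30 out of 50 follow the fractions quoted in the table):

```python
# (overall success probability, [(award in millions, number of past awards)])
sources = {
    "EU":   (0.40, [(4000, 3), (2000, 7), (1000, 10)]),
    "PTRF": (0.60, [(2500, 5), (1500, 15), (750, 30)]),
}

def expected_value(p_success, awards):
    """E = sum of amount * p_success * (count / total count)."""
    total = sum(count for _, count in awards)
    return sum(amount * p_success * count / total for amount, count in awards)

ev = {name: expected_value(p, awards) for name, (p, awards) in sources.items()}
```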
the player, rather than their objective value, although of course, simple pay-off value sometimes equates to utility value. Moreover, both principles share some common ground. For example, a smaller amount can never have a greater utility or pay-off value than a larger amount.
The expected utility value of a choice c, U(c), for a continuous distribution function, is defined as:

    U(c) = Σ pi ui, from i = 1 to n

where ui is called the von Neumann–Morgenstern utility function and represents the player's preferences among his or her expected values, E(xi). In probabilistic terms, every decision involving risk is a lottery whose pay-off is the sum of the expected utility values, as the following example demonstrates.
Example 3.3 The viability of computer training courses
The City & Guilds of London Institute, the UK training organisation, offers a range of education classes in autumn and spring for adults wishing to return to work. On average, only one course in six actually runs; the others fail because of insufficient enrolment. The organisation is considering offering a new Membership diploma course (Level 6) in Information Technology, which nets the organisation 300 per capita in government capitation subsidies if it runs. At present, the organisation offers both Licentiateship (Level 4) and Graduateship (Level 5) certificate courses. The former nets it 108 per capita and the latter nets it 180 per capita.
Which course should the organisation offer: the new single diploma course (D) or the established pair of certificate courses (C), if the organisation is: (i) risk-neutral; (ii) averse to risk; (iii) risk-taking?
(i) The organisation is risk-neutral. The expected utility value for the diploma option, U(D), is:

    U(D) = 1/6(300) = 50

whereas the expected utility value for the double certificate option, U(C), is:

    U(C) = 1/6(180) + 1/6(108) = 48

so the diploma option is marginally preferred.
(ii) The organisation is averse to risk. The expected utility value for the diploma option, U(D), is:

    U(D) = 1/6 √300 = 2.887

whereas the expected utility value for the double certificate option, U(C), is:

    U(C) = 1/6 √180 + 1/6 √108 = 3.968

so the certificate option is preferred.
(iii) The organisation is willing to take risks. The expected utility value for the diploma option, U(D), is:

    U(D) = 1/6(300)² = 15 000

whereas the expected utility value for the double certificate option, U(C), is:

    U(C) = 1/6(180)² + 1/6(108)² = 7344

so the diploma option is clearly preferred.
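All three cases follow from one formula, with only the utility function changing: u(x) = x for risk neutrality, a root of x for risk aversion and a power of x for risk taking. A compact Python restatement of the example (names are illustrative):

```python
p = 1 / 6   # each course runs with probability 1/6

def expected_utility(payoffs, u):
    """U = sum of p * u(payoff) over the courses in the option."""
    return sum(p * u(x) for x in payoffs)

neutral = lambda x: x          # risk-neutral
averse  = lambda x: x ** 0.5   # risk-averse (square root)
taking  = lambda x: x ** 2     # risk-taking (square)

diploma, certificates = [300], [180, 108]
```

Running the three utility functions over both options reproduces the preferences above: the diploma narrowly wins under neutrality, loses under aversion, and wins decisively under risk taking.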
Figure 3.3 The von Neumann–Morgenstern utility function and the expected value have a linear relationship. [Graph: u(x) ∝ E(x), a straight line against X, with u″(x) = 0.]
value of 1 and 50 has a utility value of 2, then 500 has a utility value of
3.)
The results of Example 3.3 can be represented graphically and the three categories of risk generalised to definitions, as follows:
If the von Neumann–Morgenstern utility function, which represents a player's preferences among expected values, and the expected value itself have a linear relationship, the player is said to be risk-neutral (see Figure 3.3). In such a linear von Neumann–Morgenstern utility function, the player is essentially ranking the values of the game in the same order as the expected values. This, by definition, is what it means to be risk-neutral. (The values of the game in Example 3.3 are: 300 for the diploma option; and 144 for the certificate option, being the average of 180 and 108.)
Notice that, for a linear function,

    u″(x) = 0

If the von Neumann–Morgenstern utility function is proportional to any root of the expected value, the player is said to be risk-averse (see Figure 3.4). Generally, risk-averse functions are of the form:

    u(xi) ∝ E(xi)^(1/n)

Notice that the derivative of a concave function such as this is clearly decreasing, so:

    u″(x) < 0
Figure 3.4 The von Neumann–Morgenstern utility function is proportional to a root of the expected value. [Graph: a concave utility curve u(x) against X, with u″(x) < 0.]
Figure 3.5 The von Neumann–Morgenstern utility function is proportional to a power of the expected value. [Graph: a convex utility curve u(x) ∝ E(x)^n against X, with u″(x) > 0.]
a utility value based on the same fraction of the most and least preferred values.
Utility values reflect a player's relative preferences and the von Neumann–Morgenstern theory suggests that players will always try to maximise utility value, rather than expected value, although the two may occasionally produce the same result.
Figure 3.6 [Pay-off matrix for the bank: strategies 'insure' and 'do not insure' against nature's outcomes 'no maternity leave' and '1 maternity leave'. The surviving entries are 5000, 5000 and 13 500.]
other. These four outcomes constitute the bank's pay-off matrix, which
can be seen on Figure 3.6.
Of course, the pay-off matrix only shows monetary values and
ignores extraneous factors such as the difficulty of getting long-term
staff cover in relatively isolated rural areas or at certain times of the
year.
Figure 3.7 The bank's pay-off matrix: strategies insure and do not insure, against no maternity leave and 1 maternity leave (pay-offs shown: 5000 and 8500).
A wrong decision isn't forever; it can always be reversed. But the losses from a delayed
decision can never be retrieved.
J.K. Galbraith (1981) A Life in our Times
Figure 4.1
A directed graph.
Figure 4.2 The backward directed graph of the graph in Figure 4.1 (nodes a–f).
nodes and edges are themselves finite. Such nodes are called terminal
nodes, nt, and are recognisable as places where edges go in, but not out.
On Figure 4.1, for example, node e is a terminal node.
Conversely, every set of decisions and therefore every arrow diagram
must have a starting node, called a root, r, which is recognisable as the
node with no arrows going into it. Edges come out, but do not go in.
Node f is the only root on Figure 4.1. Roots are important features of a
special type of directed graph, called a tree, discussed in greater detail
later.
In preparation for tracing decision-making strategies back through
time, from the pay-off to the root, it is worth mentioning that, for every
directed graph, GD = (N, E), there exists another directed graph,
GDB = (N, EB), called the backward directed graph of GD, whose nodes
are the same as those of GD, but whose edges are reversed. So GDB for
the directed graph represented on Figure 4.1, for example, is:
N = {a, b, c, d, e, f}
EB = {(b, a), (d, b), (c, d), (d, c), (a, c), (e, d), (e, c), (e, f)}
and this can be seen on Figure 4.2.
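The reversal is mechanical, and a short sketch makes it concrete. The forward edge set below is recovered by reversing the backward edge set EB listed above:

```python
def backward_graph(edges):
    """Reverse every edge of a directed graph G_D = (N, E)."""
    return [(v, u) for (u, v) in edges]

# Forward edges of the directed graph in Figure 4.1
# (each is the reverse of an edge in the backward set EB).
E = [("a", "b"), ("b", "d"), ("d", "c"), ("c", "d"),
     ("c", "a"), ("d", "e"), ("c", "e"), ("f", "e")]

E_B = backward_graph(E)

# Node e is terminal in E (edges in, none out); f is the root (none in).
assert ("e", "f") in E_B and ("b", "a") in E_B
```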
As was mentioned above, a special type of directed graph exists,
known as a tree, in which there is a root and, for every other node, one
and only one path from the root to that node (see Figure 4.3).
It can be seen that for trees, the root is an ancestor of every node,
every node is a descendant of the root and no node can have more than
one parent. Also, there are no reverse paths: if there exists a path from
n1 to n2, one does not exist from n2 to n1.
Figure 4.3 A tree, with one branch indicated.
Figure 4.4 A decision tree for new accommodation. The options include: staying in prefabs on the existing site; an extension; a new building (with or without sports fields, with indoor sports facilities, with theatre or presentation facilities, with individual studios, or with a networked or stand-alone computer centre); a green-field site; moving en bloc to another site; dispersing staff to other offices nearby; and moving during work.
Alpha stage: All the parents of terminal nodes are selected and to each
of these alpha nodes is assigned the best possible pay-off from their
terminals. If a node has only one terminal node, then that pay-off
must be assigned. All but the best pay-off nodes are deleted to get a
thinned-out tree.
Figure 4.5 A decision-making graph (nodes f, k, r, c and i).
Figure 4.6 A decision tree for staff behaviour. Bad behaviour is either dealt with in-shop by the shift supervisor or referred to the manager, leading to a reprimand, re-doing work without overtime, or being sacked. Good behaviour leads to praise, merit stars, praise in weekly meetings, an increased hourly rate, or promotion to manager.
The best pay-offs are not always the biggest numbers. Sometimes,
in cases of minimising cost, for example, the backward selection of
pay-offs is based on selecting the lowest numbers.
Beta stage: The process is traced back a further step by looking at the
parents of the alpha nodes (the beta nodes) and assigning to each of
them the best pay-offs of their alpha nodes. Then all but the best
pay-off nodes are deleted.
Gamma stage et seq.: The above steps are repeated to get an ever-thinning
tree, until the root is the only parent left. What remains is
the optimal path.
Clearly, after all stages have been carried out, the option which involves
the lowest time commitment from staff is the middle one, sponsoring
the UK soccer event (50 days). The concert option involves the staff in a
minimum of 64 days of preparation and participation, and the literary
prize option in a minimum of 66 days of involvement.
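The alpha, beta and gamma stages collapse naturally into a single recursion: a node's value is the best value among its children, traced back from the terminals. The tiny tree below is an illustrative sketch (the node names loosely echo Figure 4.7, but the structure and numbers are not the full sponsorship example):

```python
# Backward induction on a tree whose terminal nodes carry pay-offs
# (here: days of staff time, to be minimised).
tree = {
    "root": ["concert", "sport"],
    "concert": ["classical", "pop"],
    "sport": ["soccer", "rugby"],
}
payoff = {"classical": 26, "pop": 30, "soccer": 10, "rugby": 48}

def best(node):
    """Alpha/beta/gamma stages in one recursion: a node takes the
    best (lowest) value among its children; terminals keep their own."""
    if node in payoff:                      # terminal node
        return payoff[node]
    return min(best(child) for child in tree[node])

assert best("root") == 10   # root -> sport -> soccer
```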
The method of backward induction is slightly more cumbersome for
decision-making graphs than for trees. The pay-offs must be carried
backwards in a cumulative sense, because the optimal path can only be
determined when all the backward moves are made. This is a consequence
of the fact that nodes may have more than one edge coming in.
The following example illustrates the method.
56
Classical
26
30
40
Pop
Music
32
Drama
Mixed
34
30
48
Concert series
40
10
Soccer
Rugby
48
56
One meeting,
all races
UK
Sporting
event
Europe
N. America
4
Athletics
48
48
Baseball
40
Soccer
56
14
Serie A (Italy)
A European cup
Literary
prize
Confined
to new
writers
48
36
Boxing
Poets
36
Open to all
40
30
Novelists
27
Poets
30
d stage
g stage
Novelists
b stage
a stage
(a)
Figure 4.7
(a ) A backward induction process for organising three sponsored events. The backward induction
process after (b ) the stage; (c ) the and stages; (d ) the , and stages; and (e ) all stages.
Figure 4.7 (cont.)
Some nodes have more than one running total. In this event, the lowest
total is selected, since the agency wishes to minimise the time to
appointment. For example, there are two paths to 'Convene internal
board'. One takes 21 days (15 + 6) and the other takes 18 days
(10 + 8). It can never make sense to select the former, so it is discarded.
For all calculations thereafter, this node is assigned the value 18, and
the backwards process continues. The lower figure is shown in bold
type on Figure 4.8, and the discarded one in light type.
The optimal strategy, as far as time is concerned, is to appoint one
systems manager and two programmers by advertising internally, convening
an internal appointments board and not requiring a medical
examination before appointment.
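Keeping only the cheaper running total at a shared node is exactly what a shortest-path relaxation does. The fragment below reproduces just the 'Convene internal board' junction from Figure 4.8 (15 + 6 = 21 days versus 10 + 8 = 18 days); the intermediate node names ("shortlist A", "shortlist B") are illustrative, not labels from the figure:

```python
import math

edges = {  # node -> list of (successor, days)
    "advertise internally": [("shortlist A", 15), ("shortlist B", 10)],
    "shortlist A": [("convene internal board", 6)],
    "shortlist B": [("convene internal board", 8)],
    "convene internal board": [],
}

def min_days(graph, start, target):
    """Cheapest cumulative time from start to target on a small
    acyclic decision-making graph, by repeated relaxation."""
    best = {start: 0}
    for _ in graph:                      # enough passes for a small DAG
        for node, succs in graph.items():
            if node in best:
                for nxt, d in succs:
                    cand = best[node] + d
                    if cand < best.get(nxt, math.inf):
                        best[nxt] = cand  # keep only the lower total
    return best[target]

assert min_days(edges, "advertise internally", "convene internal board") == 18
```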
Sequential decision making in single-player games involving uncertainty
Figure 4.8 A backward induction process (a) for three possible appointments options, where time is the critical factor; and (b) after discarding from four stages, showing the optimal strategy.
Figure 4.9 Decision tree for a company involved in a research partnership: at each chance node the venture produces benefits (probability p or q) or fails to produce benefits, and the company then continues or discontinues involvement; the pay-offs shown include 9000, 2250, 8000 and 150.
Figure 4.10 Decision tree for a company involved in a research partnership, with a priori probability (r).
will happen (denoted by r on Figure 4.10) and how this gets revised to
an a posteriori probability when the pharmaceutical company gathers
more information from experience of involvement in research or from
other similar companies. The probability of one event occurring once
another event is known to have occurred is usually calculated using
Bayes's formula.
If the a posteriori probability of event A happening given that B has
already happened is denoted by p(A/B); and Ac denotes the complementary
event of A such that:
p(Ac) = 1 − p(A)
then Bayes's formula is:
p(A/B) = p(B/A)p(A) / [p(B/A)p(A) + p(B/Ac)p(Ac)]
where Oc denotes the complementary event of O, i.e. the school fails its
Ofsted inspection.
p(O/C) = p(C/O)p(O) / [p(C/O)p(O) + p(C/Oc)p(Oc)]
       = (0.75 × 0.62) / (0.75 × 0.62 + 0.17 × 0.38)
       = 0.878
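The calculation is easy to check numerically; the sketch below implements Bayes's formula directly and reproduces the 0.878 of the Ofsted example:

```python
def bayes(p_b_given_a, p_a, p_b_given_ac):
    """Bayes's formula: p(A/B) from p(B/A), p(A) and p(B/A^c)."""
    p_ac = 1 - p_a
    return (p_b_given_a * p_a) / (p_b_given_a * p_a + p_b_given_ac * p_ac)

# The Ofsted example: p(C/O) = 0.75, p(O) = 0.62, p(C/Oc) = 0.17
p = bayes(0.75, 0.62, 0.17)
assert abs(p - 0.878) < 0.001
```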
2 for P2;
1 for P3;
7 for P4.
Figure 4.11 A game tree for four players, P1–P4, with terminal pay-off vectors d (3, 1, 2, 0), e (1, 0, 4, 3), f (1, 0, 5, 2), k (4, 2, 1, 7), l (3, 1, 2, 5), h (1, 2, 8, 4), i (2, 3, 4, 1) and j (2, 1, 1, 3).
Figure 4.12 A game tree in which player P2 chooses X or Y at each of the nodes n1, n2 and n3.
In other words, a choice function is the set of all edges coming from a
player's nodes. This is the player's strategy: the choices that the player
must make at each information set. It is a complete plan of what to do
at each information set in case the game, brought by the other players,
arrives at one of the information set nodes. And the set of all such
strategies uniquely determines which terminal node and pay-off is
reached.
As was mentioned already, sequential games can be games of either
perfect or imperfect information. In the former case, each player moves
one at a time and in full knowledge of what moves have been made by
the other players. In the latter, players have to act in ignorance of one
another's moves, anticipating what the others will do, but knowing that
they exist and are influencing the outcomes. The following examples
illustrate the difference.
It is noticeable that every information set has only one element and,
in fact, this is true for all games of perfect information: a game will be
Figure 4.13 A game tree: P1 proposes change or proposes no change (0, 0); if change is proposed, P2 accepts the proposals (2, 1) or rejects them (−2, 0).
Figure 4.14 A game tree in which Nature first accepts or rejects the resignation; P1 then proposes change or no change, and P2 (with P3) accepts or rejects the proposals. The pay-offs shown are (4, 2), (−4, 0), (−1, 0), (2, 1), (−2, 0) and (0, 0).
the board has accepted the retirement, unbeknown to the new operations
manager, and she proposes no change, then the manager loses
some credibility with staff (−1) and there is no alteration in the status
of the board (0). If, under the same circumstances, the operations
manager proposes change and the proposals are accepted by the board,
the manager gains massively (+4) and the board gains to a lesser
extent (+2) for their foresight and flexibility. However, if the board
rejects the proposals for change, notwithstanding the regional manager's
resignation, the operations manager will appear out-of-touch
and unsympathetic (−4), while the board will at least appear sympathetic
in the eyes of staff (+1).
In this game, the operations manager cannot distinguish between
the two nodes in her information set. There is more than one node in
the information set, so the game is one of imperfect information. The
operations manager, at the time of making her choice, does not know
of Nature's outcome, although the board, making its decision after her,
does.
73
The minimal social situation is a class of games of imperfect information, where the players are ignorant of their own pay-oV functions and
the inXuence of other players. Each player only knows what choices
may be made. In some minimal situation games, one player may not
even know of the existence of the others.
Kelley et al. (1962) proposed a principle of rational choice for
minimal social situation games known as the winstay, losechange
principle. This principle states that, if a player makes a choice which
produces a positive pay-oV, the player will repeat that choice. On the
other hand, if a player makes a choice which produces a negative
pay-oV, the player will change strategy. Thus, strategies which produce
positive pay-oVs are reinforced and ones which produce negative
pay-oVs are not.
Again, the distinction must be made between players in a minimal
social situation game making decisions simultaneously and those in
which decisions are made sequentially. The following two examples
illustrate the signiWcant diVerences in outcome that can result from
these subtle diVerences in process.
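The principle itself is mechanical enough to simulate. The sketch below assumes a 'mutual fate control' pay-off structure, in which each player's pay-off is decided entirely by the other player's choice (+1 if the other picks A, −1 otherwise); this structure is an illustrative assumption for the sketch, not the supplier example that follows:

```python
def payoff(other_choice):
    # Mutual fate control (assumed): my pay-off depends only on YOUR choice.
    return 1 if other_choice == "A" else -1

def play(c1, c2, rounds=10):
    """Simultaneous play under the win-stay, lose-change principle."""
    history = [(c1, c2)]
    for _ in range(rounds):
        p1, p2 = payoff(c2), payoff(c1)
        # win-stay (positive pay-off), lose-change (negative pay-off)
        c1 = c1 if p1 > 0 else ("A" if c1 == "B" else "B")
        c2 = c2 if p2 > 0 else ("A" if c2 == "B" else "B")
        history.append((c1, c2))
    return history

assert play("B", "B")[-1] == ("A", "A")  # both rewarded, so both stay put
```

Under this particular pay-off structure simultaneous play settles quickly on the mutually rewarding pair; with other structures, as the following example shows, it can cycle instead.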
                                  Kelley, Drye & Warren
Clifford Chance   Strategy     Supplier A                 Supplier B
LLP               Supplier A   2 content; 1 content       2 content; 1 discontent
                  Supplier B   2 discontent; 1 content    2 discontent; 1 discontent
A – A, etc.
If Clifford Chance chooses B first and Kelley, Drye & Warren follows
by choosing supplier B too, both become discontent. So the former
changes, thus rewarding the latter who naturally sticks with the
original strategy. Still receiving no satisfaction, Clifford Chance
changes again, which makes Kelley, Drye & Warren discontent, and
B – B, etc.
A – B, etc.
B – A, etc.
He either fears his fate too much, Or his deserts are small, That puts it not unto the touch to
win or lose it all.
James Graham, Marquess of Montrose (1612–1650) My Dear and only Love
various methods for solving games with and without saddle points and
show how the issue of security leads inexorably to the notion of mixed
strategy. The chapter concludes with a discussion of interval and
ordinal scales and shows that game theory analyses can recommend
strategies even in cases where specific solutions cannot be found.
Figure 5.1 A game tree for a two-person game: player 1 moves first, then player 2, giving outcomes AA, AB, BA and BB.
Figure 5.2 The game tree of Figure 5.1 with its outcomes AA, AB, BA and BB.
Figure 5.3
                        Player 2
Player 1   Strategy     Strategy A   Strategy B
           Strategy A   AA           AB
           Strategy B   BA           BB
Figure 5.4
A saddle point.
Figure 5.5 A pay-off matrix with two saddle points.
Sometimes saddle points are unique and there is only one for the
game. Other times there is no saddle point or there are multiple ones.
Games without saddle points require fairly complicated methods to
find a solution and are discussed later, but games with multiple saddle
points present no new problems. Figure 5.5 shows a pay-off matrix
with two saddle points. The solution for player 1 is to choose either
strategy A or strategy B. The solution for player 2 is simply to choose
strategy B.
The following examples illustrate the methods by which two-person
zero-sum games may be represented and how they may be solved.
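A saddle point is an entry that is simultaneously the minimum of its row and the maximum of its column, which is simple to check mechanically. The matrix below is illustrative (not Figure 5.5's actual numbers), chosen so that, like Figure 5.5, it has two saddle points in one column:

```python
def saddle_points(m):
    """All (row, column) positions whose entry is both a row minimum
    and a column maximum; pay-offs are to player 1."""
    pts = []
    for i, row in enumerate(m):
        for j, v in enumerate(row):
            col = [r[j] for r in m]
            if v == min(row) and v == max(col):
                pts.append((i, j))
    return pts

m = [[5, 5],
     [6, 5]]
assert saddle_points(m) == [(0, 1), (1, 1)]   # two saddle points, both in column 2
```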
Figure 5.6 A game tree: the surgeon is either incompetent or competent; in each case the hospital board dismisses or retrains, with pay-offs of 20 000 (incompetent, dismiss), 10 000 (incompetent, retrain) and 50 000 (competent, dismiss) shown.
Figure 5.7
                                           Hospital board
Surgeon   Strategy                         Dismiss with retirement package   Retain & retrain
          Accept incompetency assessment   20 000                            10 000
          Reject incompetency assessment   50 000
No. bus drivers   No. journeys per driver   No. off-the-road periods   Driving time (h/m)
20 63-min journeys:
4                 5                         0                          21/0
5                 4                         0                          21/0
6                 3                         2                          18/54
21 60-min journeys:
4                 5                         1                          20/0
5                 4                         1                          20/0
6                 3                         3                          18/0
28 45-min journeys:
4                 7                         0                          21/0
5                 5                         3                          18/45
6                 4                         4                          18/0

Figure 5.8 Weekly timetable
                                Operations manager
Personnel manager   Strategy    20 journeys   21 journeys   28 journeys
                    4 drivers   21            20            21
                    5 drivers   21            20            18/45
                    6 drivers   18/54         18            18
Figure 5.9 A generic pay-off matrix for player 1 (rows 1 and 2) against player 2 (columns 1 and 2).
Figure 5.10 A 7 × 7 pay-off matrix with strategies a to g for each player.
Figure 5.11 The matrix of Figure 5.10 reduced by eliminating dominated strategies.
Figure 5.12 The matrix reduced further by successive elimination (strategies c, f and g).
Figure 5.13 The matrix reduced to a single entry (10).
Figure 5.14
                               Cabin crews
Pilots           Strategy       Change now   Change later
(Cockpit e.V.)   Change now     0            100%
                 Change later   25%          0
Figure 5.14 shows the pay-off matrix for the game. Clearly, the game is
an imperfect one and has no saddle point, since there is no element in
the matrix which is a row minimum and a column maximum. Such
matrices are not unusual and the larger the matrix, the more likely it is
that it will not have a saddle point. However, it is still possible to find a
rational solution.
If the groups try to out-guess each other, they just go round in
circles. If the pilots use the minimax principle, it makes no difference
whether they change now or later: the minimum gain is zero in either
case. The cabin crews will attempt to keep the pilots' gain to a minimum
and will therefore opt to change now. The most that the pilots
can gain thereafter is 25% of the performance bonus. However, the
pilots can reasonably anticipate the cabin crew strategy and will therefore
opt to change later to guarantee themselves 25%. The cabin crews
can reasonably anticipate this anticipation and will consequently opt to
alter their choice to later, thereby keeping the pilots' gain to zero. This
cycle of counter-anticipation can be repeated ad nauseam.
There is simply no combination of pure strategies that is in equilibrium
and so there is no saddle point. All that can be said is that the
value of the game lies somewhere between zero and 25%. One of the
groups will always have cause to regret its choice once the other group's
strategy is revealed. A security problem therefore exists for zero-sum
games with no saddle point. Each group must conceal their intentions
from the other and, curiously, this forms the basis for a solution.
The best way for players to conceal their intentions is to leave their
selections completely to chance, assigning to each choice a predetermined
probability of being chosen. This is called adopting a mixed
strategy.
Figure 5.15
                               Cabin crews
Pilots           Strategy       Change now    Change later   Probability
(Cockpit e.V.)   Change now     0             100%           p = 0.2
                 Change later   25%           0              1 − p = 0.8
                 Probability    q = 0.8       1 − q = 0.2
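The probabilities in Figure 5.15 come from equalising each side's expected pay-off across the opponent's choices. The sketch below solves the 2 × 2 game by that equalisation; the zero entries are inferred from the text's observation that the pilots' minimum gain is zero in either row:

```python
from fractions import Fraction as F

# Pilots' gains: rows = pilots (now, later), columns = crews (now, later).
a, b = F(0), F(100)   # pilots change now
c, d = F(25), F(0)    # pilots change later

# p equalises the pilots' mix against either column:
#   a*p + c*(1-p) = b*p + d*(1-p)
p = (d - c) / (a - b - c + d)
# q equalises the crews' mix against either row:
q = (d - b) / (a - b - c + d)
# value of the game under the two mixed strategies
value = a * p * q + b * p * (1 - q) + c * (1 - p) * q + d * (1 - p) * (1 - q)

assert (p, q, value) == (F(1, 5), F(4, 5), F(20))   # p = 0.2, q = 0.8, 20%
```

The value of the game, 20%, duly lies between the zero and 25% bounds identified above.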
bigger than two-by-two, the player who deviates from the optimal
strategy will actually be disadvantaged, since the value of the game is
assured only if both players adopt optimal mixed strategies.
Not all games with pay-off matrices larger than two-by-two are
problematic. For example, it has been shown (Shapley and Snow, 1950)
that any game which can be represented by a matrix with either two
rows or two columns (i.e. a game in which one of the players has
specifically two strategies) can be solved in the same way as two-by-two
matrices, because one of the two-by-two matrices embedded in the
larger matrix is a solution of the larger matrix. Therefore, to solve the
larger matrix, it is necessary only to solve each two-by-two sub-matrix
and to check each solution.
Example 5.4 Student attendance
A school has a problem with student attendance in the run up to public
examinations, even among students who are not candidates. Teachers must
decide whether to teach on, passively supervise students studying in the (reduced)
class groups or actively revise coursework already done during the year.
Students must decide whether to attend school or study independently at home.
Research has shown that, if teachers teach on, year-on-year results improve by
12% if students attend, but fall by 8% if they do not. If teachers passively
supervise group study, examination results improve by 2% if students study at
home and are unchanged if they attend. If teachers actively revise coursework,
results improve 5% if students attend the revision workshops and 1% if they do
not.
Figure 5.16 shows the pay-off matrix for the game. It has no saddle
point, so the players must adopt a mixed strategy. One of the columns
(the first one, say) can be arbitrarily deleted and the residual two-by-two
matrix solved, as follows.
Let the students' first-row mixed strategy be assigned a probability p
and the second-row mixed strategy a probability 1 − p. The expected
pay-offs yielded by these two strategies are equal no matter what the
teachers do. Therefore:
0(p) + 2(1 − p) = 5(p) + 1(1 − p)
which yields the solutions:
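Rearranging the equalising equation gives p(A − B − C + D) = D − B for coefficients A, B, C, D, which can be solved exactly:

```python
from fractions import Fraction as F

# Coefficients of the equalising equation  A*p + B*(1-p) = C*p + D*(1-p),
# taken from the text: 0(p) + 2(1-p) = 5(p) + 1(1-p).
A, B, C, D = 0, 2, 5, 1

# Rearranged: p*(A - B - C + D) = D - B
p = F(D - B, A - B - C + D)

assert A * p + B * (1 - p) == C * p + D * (1 - p)   # both sides equal 5/3
print(p, 1 - p)   # 1/6 5/6
```

So the students should attend with probability 1/6 and stay at home with probability 5/6.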
Figure 5.16
                                    Teachers
Students   Strategy                 Teach on   Passively supervise group study   Actively give revision workshops
           Attend lessons           12         0                                 5
           Do not attend lessons    −8         2                                 1
Figure 5.17
                                Operations manager
Personnel manager   Strategy    20 journeys   21 journeys   28 journeys
                    4 drivers   2100          2000          2100
                    5 drivers   2100          2000          1875
                    6 drivers   1890          1800          1800
Transformation of the pay-off matrix in Figure 5.8 (each pay-off multiplied by 100).
Figure 5.18
                                    Teachers
Students   Strategy                 Teach on   Passively supervise group study   Actively give revision workshops
           Attend lessons           20         8                                 13
           Do not attend lessons    0          10                                9
Transformation of the pay-off matrix in Figure 5.16 (each pay-off has 8 added).
Figure 5.19 The pay-off matrix of Figure 5.7 reconfigured as an ordinal matrix (entries vb, b, g and vg).
Games with ordinal pay-offs can be solved easily if they have saddle
points. Figure 5.19 is Figure 5.7 reconfigured as an ordinal pay-off
matrix, and vb, b, g and vg represent very bad, bad, good and very
good, respectively.
It is still clear that b is a saddle point, since b is the minimum in its
row and the maximum in its column. In other words,
vb ⩽ b ⩽ g, for all values of vb, b, g ∈ R
So, it is not necessary to have exact quantitative measurements of the
relative desirability of the pay-offs in a zero-sum game in order to find
a solution.
Games with ordinal pay-offs, but without saddle points
Figure 5.20
                       Cabin crews
Pilots   Strategy       Change now   Change later   Assigned probabilities
         Change now     w            e              p
         Change later   b            w              1 − p
         Probability    q            1 − q
As before:
w(p) + b(1 − p) = e(p) + w(1 − p)    (i)
and
w(q) + e(1 − q) = b(q) + w(1 − q)    (ii)
q : 1 − q = (w − e) : (w − b)
and the cabin crews should assign a higher probability to changing
later, which is the inverse strategy and probability of that of the pilots.
Although this ordinal game has not been completely solved, because
the probabilities and the value of the game have not been determined,
nevertheless the analysis has at least given both players some indication
of what their optimal strategies are. This is a recurring feature of game
theory as applied to more intractable problems: it does not always
produce a solution, but it does provide a greater insight into the nature
of the problem and the whereabouts of the solution.
Consider what you think is required and decide accordingly. But never give your reasons; for
your judgement will probably be right, but your reasons will certainly be wrong.
Earl of Mansfield (1705–1793) Advice to a new governor
Figure 6.1
                       Player 2
Player 1   Strategy    c1                        c2
           r1          u1(r1, c1), u2(r1, c1)    u1(r1, c2), u2(r1, c2)
           r2          u1(r2, c1), u2(r2, c1)    u1(r2, c2), u2(r2, c2)
Figure 6.2
                       Player 2
Player 1   Strategy    c1     c2
           r1          2, 2   4, 3
           r2          3, 4   1, 1
Figure 6.3 A Nash equilibrium.
                       Player 2
Player 1   Strategy    c1     c2     c3
           r1          1, 0   0, 3   3, 1
           r2          0, 2   1, 1   4, 0
           r3          0, 2   3, 4   6, 2
Figure 6.4 Pay-off matrix for a two-person mixed-motive game with a single Nash equilibrium point.
                       Player 2
Player 1   Strategy    c1     c2
           r1          4, 4   2, 3
           r2          3, 2   1, 1
Figure 6.5
                                     Nominee 2
Nominee 1   Strategy             Decline nomination   Accept nomination
            Decline nomination   2, 2                 3, 4
            Accept nomination    4, 3                 1, 1
It can be seen from the matrix that there are no dominant or inadmissible
strategies. Neither candidate can select a strategy that will yield the
best pay-off no matter what the other candidate does. The minimax
principle fails too because, according to it, both candidates should
choose their first strategy (decline the nomination) so as to avoid the
worst pay-off (1, 1). Yet, if they do this, both candidates regret it once
the other's choice becomes known. Hence, the minimax strategies are
not in equilibrium and the solution (2, 2) is not an equilibrium point.
It is unstable and both players are tempted to deviate from it, although
it should be pointed out that the worst case scenario (1, 1) arises when
both players deviate from their minimax strategies.
Despite the failure of both the elimination and the minimax approaches,
there are two equilibrium points on the Figure 6.5 matrix. If
nominee 1 chooses to accept the nomination, nominee 2 can do no
better than decline; and if nominee 1 chooses to decline the nomination,
nominee 2 can do no better than accept. So there are two
equilibrium points: those with pay-offs (4, 3) and (3, 4).
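The cell-by-cell test for an equilibrium (neither player gains by deviating unilaterally) is easy to mechanise. The sketch below applies it to the nomination game of Figure 6.5 and recovers exactly the two equilibrium points:

```python
payoffs = {  # (nominee 1 row, nominee 2 column) -> (u1, u2)
    (0, 0): (2, 2), (0, 1): (3, 4),   # row 0: decline nomination
    (1, 0): (4, 3), (1, 1): (1, 1),   # row 1: accept nomination
}

def equilibria(p):
    """Pure-strategy equilibria: cells where no unilateral deviation pays."""
    eq = []
    for (r, c), (u1, u2) in p.items():
        if all(p[(r2, c)][0] <= u1 for r2 in (0, 1)) and \
           all(p[(r, c2)][1] <= u2 for c2 in (0, 1)):
            eq.append((r, c))
    return sorted(eq)

assert equilibria(payoffs) == [(0, 1), (1, 0)]   # pay-offs (3, 4) and (4, 3)
```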
Unlike zero-sum games, the value of the game is not a constant
because the players do not agree about preferability and the two
equilibrium points are therefore asymmetrical. There is no formal
solution beyond this. Informal factors such as explicit negotiation and
cultural prominence must be explored if a more definite outcome
is required. For example, a younger nominee may defer in favour of
an older one in companies where seniority is the prominent basis
for promotion; or the two candidates may negotiate a political
arrangement. Either way, it is in the interests of both players in a
Figure 6.6
                                              College 2
College 1   Strategy                      Submit preferred calendar   Submit unpreferred calendar
            Submit preferred calendar     2, 2                        4, 3
            Submit unpreferred calendar   3, 4                        1, 1
other college reflects its hope of reversing the arrangement next year. The
ordinal pay-offs are displayed on Figure 6.6.
As was the case with the leadership game described in the previous
example, there are no dominant or inadmissible strategies and the
minimax principle, in which both colleges choose their first strategy
(submit preferred calendar) so as to avoid the worst pay-off (1, 1), fails.
Again, as was the case with leadership games, the minimax strategies
are unstable and both players are tempted to deviate from them.
There are, nevertheless, two equilibrium points on the Figure 6.6
matrix. If college 2 chooses to submit its preferred calendar, college 1
can do no better than submit its less preferred calendar; and vice versa.
So there are two equilibrium points with pay-offs (4, 3) and (3, 4).
Again, like leadership games, the value of the game is not a constant
because the players do not agree about preferability.
Games with this type of pay-off matrix are called heroic games
(Rapoport, 1967a) because the player who deviates from the minimax
strategy benefits both self and the other player, but benefits the other
player more, and as such is exhibiting heroic unselfish behaviour.
Like leadership games, there is no formal solution beyond this,
although it is clearly in the interests of both players to communicate
their intentions to one another. Informal considerations suggest that it
is a good strategy to convince the other player of one's own determination!
For example, if college 2 convinces college 1 that it has a school
tour abroad planned which is impossible to cancel, college 1 serves its
own interest best by acting heroically and choosing its less preferred
option (Luce & Raiffa, 1989). It can also be seen from this example that
the commonly held notion of keeping all options open is erroneous,
as many game theoreticians have pointed out. Better to adopt a
scorched earth policy, like Napoleon's advance on Moscow, or at least
convince the other player of one's intention to do so!
Figure 6.7
                                  Ericsson
Nokia   Strategy             Issue shares later   Issue shares now
        Issue shares later   3, 3                 2, 4
        Issue shares now     4, 2                 1, 1
107
result in the worst possible pay-oV (1, 1). The converse is true in the
case where Ericsson chooses to opt out.
Games with this type of pay-oV matrix are called exploitation games
(Rapoport, 1967a) because the player who deviates unilaterally from
the safe minimax strategy beneWts only himself and at the expense of
the other player. In addition, in going after the best possible pay-oV, the
deviant risks disaster for both!
Even more than heroic games, it is imperative in games of exploitation that the player who intends to deviate from the minimax convinces the other that he is resolute in his intent. Put crudely, the most
convincing player always wins exploitation games! In addition, the
more games of this sort a player wins, the more likely the player is to
continue winning, since the players seriousness of intent has been
amply demonstrated and the player has become more conWdent. Reputation the sum of a players historical behaviour in previous trials of
the game is everything. As Colman (1982) puts it, nothing succeeds
like success in the Weld of brinkmanship! The more reckless, selWsh and
irrational players are perceived to be, the greater is their advantage in
games of exploitation, since opposing players know that they risk
disaster for everyone if they try to win. This psychological use of
craziness can be seen in terrorist organisations (Corsi, 1981), political
leaders and among small children, though it should be noted that
although the player is perceived to be irrational, he or she is nevertheless
acting rationally throughout with a view to winning the game. (Schelling, 1960; Howard, 1966; Brams, 1990).
Figure 6.8
                                                   Stockbroker
Lawyer   Strategy                              Refuse to cooperate with investigators   Cooperate with investigators
         Refuse to cooperate with investigators   3, 3                                  1, 4
         Cooperate with investigators             4, 1                                  2, 2
Figure 6.9
                                                        Stockbroker
Lawyer   Strategy                 Refuse to cooperate   Cooperate         Choose same         Choose opposite
                                  regardless of lawyer  regardless of     strategy as         strategy to
                                                        lawyer            lawyer              lawyer
         Refuse to cooperate
         with the investigators   3, 3                  1, 4              3, 3                1, 4
         Cooperate with the
         investigators            4, 1                  2, 2              2, 2                4, 1
metagame theory (Howard, 1966). Metagame theory is the construction
of any number of higher level games based on the original game. A
player is then assumed to choose from a collection of meta-strategies,
each of which depends on what the other player chooses.
Consider the case outlined in Example 6.4 and represented by Figure
6.8. The lawyer has two pure strategies: to refuse to cooperate and to
cooperate. For each of these, the stockbroker has four meta-strategies:
to refuse to cooperate regardless of what the lawyer chooses,
to cooperate regardless of what the lawyer chooses,
to choose the same strategy as the lawyer is expected to choose, and
to choose the opposite strategy to the one the lawyer is expected to
choose.
This two-by-four matrix constitutes the first-level metagame and is
represented on Figure 6.9. There is an equilibrium point at (row 2,
column 2) with pay-off (2, 2), since if the lawyer chooses row 1, the
stockbroker should choose column 2 or column 4. In addition, if the
lawyer chooses row 2, the stockbroker should choose column 2 or
column 3. This corresponds to the same paradox point in the original
game and a higher level metagame must be constructed in order to
eliminate it.
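The construction of the first-level metagame can be sketched directly. Each of the stockbroker's meta-strategies is a function from the lawyer's choice (0 = refuse, 1 = cooperate) to a reply, and each metagame cell simply looks up the corresponding cell of the base game of Figure 6.8:

```python
base = [[(3, 3), (1, 4)],   # lawyer refuses:    broker refuses / cooperates
        [(4, 1), (2, 2)]]   # lawyer cooperates: broker refuses / cooperates

# The broker's four meta-strategies map the lawyer's choice L to a reply:
# always refuse, always cooperate, copy the lawyer, do the opposite.
metas = [lambda L: 0, lambda L: 1, lambda L: L, lambda L: 1 - L]

meta_game = [[base[L][f(L)] for f in metas] for L in (0, 1)]

def is_eq(r, c):
    """Neither player gains by deviating unilaterally in the 2x4 metagame."""
    u1, u2 = meta_game[r][c]
    return all(meta_game[r2][c][0] <= u1 for r2 in (0, 1)) and \
           all(meta_game[r][c2][1] <= u2 for c2 in range(4))

assert is_eq(1, 1)   # (row 2, column 2) with pay-off (2, 2), as in the text
```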
Suppose the lawyer selects a meta-strategy depending on which of
the four meta-strategies the stockbroker chooses. The lawyer can
choose:
Figure 6.10 The second-level metagame: the lawyer's sixteen meta-strategies (for example, 'refuse no matter which column the broker chooses', 'refuse unless the broker chooses 4', 'cooperate unless the broker chooses 2', 'cooperate no matter which column the broker chooses') against the stockbroker's four meta-strategies, with pay-off pairs drawn from (3, 3), (1, 4), (4, 1) and (2, 2).
Heroic games
Exploitation games
The minimax strategies are dominant at the one and only equilibrium
point, but paradoxically this pay-off is worse for both players
than their inadmissible strategies.
The Cournot, von Stackelberg and Bertrand duopolies: an application of mixed-motive games
If c1 is the marginal cost of production per tonne of linerboard for Smurfit-Stone and c2 is the marginal cost of production per tonne for International
Paper, both constants for the year, how many tonnes should each firm produce
in order to maximise profit? The New York Stock Exchange, of which both firms
are conscientious members, stipulates that firms must set their price structures
and production levels independently.
Setting

∂π1/∂R1 = 0 and ∂π2/∂R2 = 0

then

2R1 + R2 = A − c1

and

R1 + 2R2 = A − c2

where A, c1 and c2 are constants. These two equations are the reaction
functions and reveal that the optimal level of production for each firm
is negatively related to the expected level of supply from the other. Note
also that:

∂²π1/∂R1² = −2 and ∂²π2/∂R2² = −2

which indicate local maxima.
The method of simultaneous equations,

4R1 + 2R2 = 2A − 2c1
R1 + 2R2 = A − c2

produces the Nash solutions

RN1 = (A + c2 − 2c1)/3

and

RN2 = (A + c1 − 2c2)/3
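These closed forms can be verified numerically. The sketch below (in Python, with arbitrary illustrative values for A, c1 and c2, not taken from the example) solves the two reaction functions by elimination and checks the result against the Nash solutions just derived:

```python
# Cournot duopoly: solve the reaction functions
#   2*R1 + R2 = A - c1
#   R1 + 2*R2 = A - c2
# for illustrative (not textual) values of A, c1 and c2.
A, c1, c2 = 100.0, 10.0, 20.0

# Elimination: subtract the second equation from twice the first, and vice versa.
R1 = (2 * (A - c1) - (A - c2)) / 3
R2 = (2 * (A - c2) - (A - c1)) / 3

# Closed-form Nash solutions from the text.
RN1 = (A + c2 - 2 * c1) / 3
RN2 = (A + c1 - 2 * c2) / 3

assert abs(R1 - RN1) < 1e-9 and abs(R2 - RN2) < 1e-9
```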
Figure 6.11 The reaction functions for Smurfit-Stone and International Paper in (R1, R2) space, intersecting at the Cournot point C = ((A − 2c1 + c2)/3, (A + c1 − 2c2)/3), with intercepts A − c1 and (A − c2)/2 on the R2 axis and A − c2 and (A − c1)/2 on the R1 axis.
Figure 6.12 The Cournot equilibrium, C, at the intersection of the two reaction functions.
In the von Stackelberg (1934) model, at least one organisation pre-commits to a particular level of production and the other responds to
it. Let us assume, for the purposes of this example, that the pre-committing firm is Smurfit-Stone (which becomes the market leader)
and the responding firm is International Paper. Unlike the Cournot
duopoly model, the von Stackelberg model is a dynamic game, where
International Paper can observe the actions of Smurfit-Stone before
deciding upon its optimal response. Unlike static (or simultaneous)
games, dynamic (or sequential) games carry the tactical possibility of
giving false information and the need to conceal one's true intentions
from the other player. If either firm believes the false production levels
of the other, no matter how unlikely, the game will have multiple Nash
equilibria. For example, International Paper may threaten to flood the
market with linerboard in the hope that Smurfit-Stone will reduce its
production to zero in response (which it will, if it believes the threat),
thereby producing a Nash equilibrium. And yet, such a threat is
illogical, since it would not be in either player's interests to carry it
through. To exclude such idle production threats, the von Stackelberg
model imposes the condition that the predicted outcome of the game
must be sub-game perfect: the predicted solution to the game
must be a Nash equilibrium in every sub-game.

The method of backward induction may be applied to this von
Stackelberg game, starting with International Paper's output response
decision, which is its attempt to maximise its own profit, π2, given by:

π2 = (A − R)R2 − c2R2
= (A − R1 − R2)R2 − c2R2
= AR2 − R2² − R1R2 − c2R2     (1)

Differentiating with respect to R2 and setting equal to zero for a
maximum gives:

∂π2/∂R2 = A − 2R2 − R1 − c2 = 0     (2)

then

R1 + 2R2 = A − c2     (3)
Looking at Equations (2) and (3), it can be seen that, if the marginal
cost of production for both firms is the same (equal to c, say), R2 will be
Figure 6.13 The von Stackelberg model: the reaction functions for Smurfit-Stone and International Paper in (R1, R2) space, with the Cournot–Nash point (CN) at ((A − 2c1 + c2)/3, (A + c1 − 2c2)/3), intercepts A − c1, (A − c2)/2, A − c2 and (A − c1)/2, and the von Stackelberg follower output R2 = (A + 2c1 − 3c2)/4.     (4)
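Backward induction in the von Stackelberg model can likewise be sketched numerically. Assuming illustrative values for A, c1 and c2 (not drawn from the example), the follower's best response R2 = (A − R1 − c2)/2 is folded into the leader's profit, which is then maximised by a simple grid search; the result agrees with the closed forms for this linear model, R1 = (A + c2 − 2c1)/2 (which reduces to (A − c1)/2 when costs are equal) and R2 = (A + 2c1 − 3c2)/4 as annotated on Figure 6.13:

```python
# von Stackelberg duopoly by backward induction (illustrative values).
A, c1, c2 = 100.0, 10.0, 10.0

def follower(R1):
    """International Paper's best response, from R1 + 2*R2 = A - c2."""
    return max((A - R1 - c2) / 2, 0.0)

def leader_profit(R1):
    """Smurfit-Stone's profit, given the follower's anticipated response."""
    R2 = follower(R1)
    return (A - R1 - R2) * R1 - c1 * R1

# Grid search over the leader's output in steps of 0.01.
R1_star = max((i / 100 for i in range(0, 9001)), key=leader_profit)
R2_star = follower(R1_star)

# Closed forms: R1* = (A + c2 - 2*c1)/2,  R2* = (A + 2*c1 - 3*c2)/4.
assert abs(R1_star - (A + c2 - 2 * c1) / 2) < 1e-6
assert abs(R2_star - (A + 2 * c1 - 3 * c2) / 4) < 1e-6
```

Note that the leader produces more, and the follower less, than at the Cournot point, which is the familiar first-mover advantage of the pre-committing firm.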
The strategic variable for firms in both the Cournot and the von
Stackelberg duopolies is the level of production (R). In the Bertrand
duopoly (1883), the strategic variable is the price charged in the
marketplace (P). The firms simultaneously decide their pricing structures and market forces then decide how much product is absorbed.
Like the Cournot duopoly, the Bertrand duopoly is a static game, but
one in which the two firms compete in terms of the price they charge
customers, rather than production levels.
Consider the following example.
Example 6.6 The UK supermarket sector as a Bertrand duopoly
Sainsbury and Tesco dominate the UK food supermarket sector. Competition is
so fierce and profit margins so thin that it has been termed a price war in the
financial press. Suppose Sainsbury decides to sell a quantity R1 of some product
at a price P1 and Tesco decides to sell a quantity R2 of the same or some other
product at a price P2. Let π1 represent Sainsbury's profit and π2 represent that
of Tesco. How would the two supermarket chains set their prices so as to
maximise profits in a stable market?
If the products are identical, customers will only buy from the supermarket
that offers the lowest price. Say, for the purposes of this example, that
Tesco initially offers lower prices and makes higher than normal profits.
It gains a monopoly, although Sainsbury is eventually forced to challenge it by undercutting prices in an attempt to win some market share
for itself. However, if Tesco initially offers lower prices and makes lower
than normal profits or none at all, then it must raise its prices to normal
profit levels or go out of business. So, either way, it is clear that
charging different prices never results in a Nash equilibrium for competing firms in a Bertrand duopoly.
If the food products are identical and both supermarkets charge the
same prices, and if each supermarket is making higher or lower than
normal profits, then each will have an incentive to deviate. One will
slightly undercut the other to increase market share if it is making
higher than normal profits; and it will slightly overcharge the other to
increase profit margins if it is making lower than normal profits.
So the only Nash equilibrium is where both firms charge the same
prices and make normal profits. The situation whereby as few as two firms
produce a competitive outcome, without any collusion to increase profits
above the normal, is known as the Bertrand paradox.
One way to overcome the Bertrand paradox is to have firms sell
distinguishable products. If product lines are distinguishable, Tesco
and Sainsbury face a negatively sloped demand curve and their interdependency
is not as strong as when they sell identical product lines. If Sainsbury
decides to sell at a price P1 > 0 and Tesco decides to sell at a price
P2 > 0, then the Bertrand duopoly model assumes that customers will
demand quantities:

R1 = A − P1 + BP2

and

R2 = A − P2 + BP1

from each of the two supermarkets, respectively; where A is a constant
as in the previous duopoly models and B is a constant that reflects the
extent to which Sainsbury's products are substitutes for Tesco's and
vice versa. These two equations, called the demand functions for the two
firms, are somewhat unrealistic, however, because demand for one
supermarket's product is positive even when it charges an arbitrarily
high price.
Setting

∂π1/∂P1 = 0 and ∂π2/∂P2 = 0

then

P1 = (A + BP2 + c1)/2
P2 = (A + BP1 + c2)/2

And since

∂²π1/∂P1² = −2 and ∂²π2/∂P2² = −2

local maxima are indicated (see Figure 6.14).
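Because each pricing rule refers to the other, the Bertrand–Nash prices can be found by iterating the two rules to a fixed point. A sketch with illustrative values (equal marginal costs c1 = c2 = c are assumed, and B < 2, which guarantees convergence):

```python
# Iterate the Bertrand reaction functions
#   P1 = (A + B*P2 + c1) / 2
#   P2 = (A + B*P1 + c2) / 2
# to their fixed point (illustrative, non-textual values).
A, B, c = 100.0, 0.5, 10.0
c1 = c2 = c

P1 = P2 = 0.0
for _ in range(200):
    # Simultaneous update; each step contracts the error by a factor B/2.
    P1, P2 = (A + B * P2 + c1) / 2, (A + B * P1 + c2) / 2

P_star = (A + c) / (2 - B)   # closed form when c1 = c2 = c
assert abs(P1 - P_star) < 1e-9 and abs(P2 - P_star) < 1e-9
```

The iteration converges to (A + c)/(2 − B), the value marked on Figure 6.14.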
Figure 6.14 The reaction functions for Sainsbury and Tesco in (P1, P2) space (assumes c1 = c2 = c). The Bertrand–Nash equilibrium, BN, lies at P1 = P2 = (A + c)/(2 − B), which reduces to (A + c)/2 when B = 0; the region marked T indicates higher-price combinations.
The reaction curves for the Bertrand duopoly have positive gradients, unlike those for the Cournot and von Stackelberg models. They
are positively related and are said to complement each other strategically. In the case of the Cournot and von Stackelberg duopolies, the
reaction functions have negative gradients. An increase in one firm's output causes
the other firm to decrease its output. In such cases, the reaction
functions are said to substitute for each other strategically.

To maximise their profits and to arrive at the Bertrand–Nash equilibrium, both supermarket firms must be on their reaction function
lines (marked BN on Figure 6.14). As with the previous duopoly
models, the Nash equilibrium point is not pareto-efficient, since both
Sainsbury and Tesco could make higher profits if they set higher prices.
This set of possibilities is shown marked T on Figure 6.14, but since
each firm has an incentive to deviate from these arrangements, they do
not offer a more likely solution than the Bertrand–Nash equilibrium.

Pareto-inefficiency is a feature of all three duopoly models. If the
competitors collude they can increase profits, but since this requires (at
best) an agreement that is difficult to enforce given the incentives to
deviate from it, and (at worst) an illegal cartel, such solutions are not
realisable in practice.
Figure 6.15

                      Player 2
                    c1        c2
Player 1   r1      1, 4      3, 0
           r2      2, 1      1, 2
Figure 6.16

                                     Player 2
            c1                       c2                      . . .   cn
Player 1
  r1   u1(r1, c1), u2(r1, c1)   u1(r1, c2), u2(r1, c2)       . . .   u1(r1, cn), u2(r1, cn)
  r2   u1(r2, c1), u2(r2, c1)   u1(r2, c2), u2(r2, c2)       . . .   u1(r2, cn), u2(r2, cn)
  .
  rm   u1(rm, c1), u2(rm, c1)   u1(rm, c2), u2(rm, c2)       . . .   u1(rm, cn), u2(rm, cn)
the same pay-off matrix represents both players (Example 5.3). The
general case for mixed-motive games must now be considered.

Suppose player 1 has m strategies

S1 = {r1, r2, . . . , rm}

and player 2 has n strategies

S2 = {c1, c2, . . . , cn}

and u1(ri, cj) represents the pay-off to player 1 when player 1 chooses
strategy ri and player 2 chooses strategy cj, then the game can be
represented by the m × n matrix shown on Figure 6.16.

An abbreviated version of the matrix is shown on Figure 6.17, where
Uij is the utility pay-off function for player 1 for strategies ri and cj; and
Vij is the utility pay-off function for player 2 for strategies ri and cj.

A re-definition of the Nash equilibrium can now be made for such a
Figure 6.17

                        Player 2
            c1           c2          . . .    cn
Player 1
  r1     U11, V11     U12, V12       . . .   U1n, V1n
  r2     U21, V21     U22, V22       . . .   U2n, V2n
  .
  rm     Um1, Vm1     Um2, Vm2       . . .   Umn, Vmn
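In this notation, a pure-strategy Nash equilibrium is a cell (ri, cj) at which Uij is maximal in its column and Vij is maximal in its row. A short sketch (in Python, not part of the original text) applies this directly, here to the pay-off matrix of Figure 6.15; notably, that game has no pure-strategy equilibrium:

```python
def pure_nash(U, V):
    """Return all cells (i, j) where U[i][j] is a best reply down column j
    and V[i][j] is a best reply along row i."""
    m, n = len(U), len(U[0])
    return [(i, j) for i in range(m) for j in range(n)
            if U[i][j] == max(U[k][j] for k in range(m))
            and V[i][j] == max(V[i][k] for k in range(n))]

# Figure 6.15: U = player 1's pay-offs, V = player 2's.
U = [[1, 3], [2, 1]]
V = [[4, 0], [1, 2]]

equilibria = pure_nash(U, V)   # empty: no cell is stable for both players
```

The empty result is what drives the move to mixed strategies in games of this kind.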
If both factions offered a community-focused bias, the pro-mutual lobby would do well at the expense of the other,
since community-focus was already the strength of the status quo, although the
pro-change lobby would gain some small measure of credibility (1, 4). If both
factions suggested a criterion-based focus for future business, the pro-mutual
faction would just about prevail (1, 2). However, if the pro-mutual lobby offered
a criterion-based focus, and the pro-change lobby did not, the latter would fare
better, since it would totally undermine the argument for mutuality (3, 0). And
if the pro-mutual group offered a continuation of community-focused service
and the pro-change group offered a change to criterion-based service, the vote
would probably go the way of change (2, 1).

As things turned out, the action failed and Standard Life remains a mutual
society to this day, but what strategy should each side have adopted, if they had
accepted this analysis?
Figure 6.18

                                  Pro-mutual lobby
                        Community-focused   Criterion-based   Assigned
                        business            business          probabilities
Pro-change   Community-
lobby        focused         1, 4               3, 0              p1
             business
             Criterion-
             based           2, 1               1, 2              p2 = 1 − p1
             business
Assigned
probabilities                q1                 q2 = 1 − q1
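For a 2 × 2 game with no equilibrium in pure strategies, each side should mix so as to leave the opponent indifferent between its two options. Applying this standard indifference calculation (a sketch, not part of the original analysis) to the Figure 6.18 pay-offs, with the first coordinate going to the pro-change lobby and the second to the pro-mutual lobby:

```python
# Pay-offs from Figure 6.18: rows = pro-change lobby, columns = pro-mutual lobby.
U = [[1, 3], [2, 1]]   # pro-change (row player)
V = [[4, 0], [1, 2]]   # pro-mutual (column player)

# The row player's p1 makes the column player indifferent between its columns:
#   V[0][0]*p1 + V[1][0]*(1-p1) = V[0][1]*p1 + V[1][1]*(1-p1)
p1 = (V[1][1] - V[1][0]) / (V[0][0] - V[1][0] - V[0][1] + V[1][1])

# The column player's q1 makes the row player indifferent between its rows:
q1 = (U[1][1] - U[0][1]) / (U[0][0] - U[0][1] - U[1][0] + U[1][1])
```

So, on this analysis, the pro-change lobby should have offered community-focused business with probability 1/5, and the pro-mutual lobby with probability 2/3.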
Repeated games
                              BUPA
                  Large subsidy   Small subsidy
                  for NHS         for NHS
GHG   Large
      subsidy        20, 20           40, 10
      for NHS
      Small
      subsidy        10, 40           30, 30
      for NHS
incentive to increase its level of subsidy above the other (10, 40) and
(40, 10).

As with the prisoner's dilemma, if this game is played only once there
is a Nash equilibrium where the minimax strategies intersect, at (20,
20). Neither firm can do better by choosing another strategy once the
other firm's strategy becomes known. However, this dominant solution is worse than the other strategy where both firms do the same
thing, (30, 30), and the problem for competing firms is how to coordinate their strategies on the optimal outcome, (30, 30), without price-fixing. In the one-off game this is not possible, as there is a clear
incentive to increase subsidies. However, if the interaction between
BUPA and GHG is infinitely repeated, it is possible for the two firms to
coordinate their actions on the pareto-efficient outcome.
Two concepts, that of adopting a punishing strategy and that of not
discounting the future too much, help explain how and why this
happens. A punishing strategy is one where a player selects a strategy
based purely on what the other player has done, in order to punish him
if he deviates from the pareto-efficient outcome. The shadow of the
gallows deters players from deviation and the pareto-efficient outcome
can thus be maintained indefinitely. Of course, the punishment and the
punisher must both have credibility, so it must be in the interests of the
punisher to punish the deviant player if and when the need arises.

A punishment strategy will only be effective if it is part of the
sub-game perfect Nash equilibrium for the entire game. In Example
7.1, it could be that each firm starts with small-subsidy strategies and
that this arrangement is allowed to persist as long as no one deviates
from the status quo. If, however, either firm adopts a large-subsidy
strategy, then in that event the opposing firm guarantees to undertake
large-subsidy strategies ever after.

This particular type of punishment strategy is known as a trigger
strategy, where the actions of one player in a game cause the other
player permanently to switch to another course of action. In the case of
Example 7.1, it threatens an infinite punishment period if either player
opts for a large-subsidy strategy. Once one firm increases its level of
subsidy, the other firm guarantees to do the same thereafter, thus
precluding the possibility of ever returning to the pareto-efficient
outcome. The firm that first adopts the large-subsidy strategy will
increase profits from 30m to 40m in the initial period, but will drop
to 20m per annum thereafter. The game will reach equilibrium at (20,
20), a sub-optimal outcome for both parties.

For a trigger strategy to maintain a pareto-efficient outcome (30,
30) in the above example, both the punishment and the agreement to
maintain the pareto-efficient outcome must not be ridiculous. In
Example 7.1, the threat of punishment is perfectly reasonable because if
one firm switches to a large-subsidy strategy, then it is rational for the
other firm to also switch to a large-subsidy strategy, since that move
guarantees to increase the latter's profit from 10m to 20m. The
punishment strategy corresponds to the Nash equilibrium for the
one-off game. This is always credible because, by definition, it is the
optimal response to what is expected of the other player.

The promise to maintain the implicit agreement of small-subsidy
strategies in Example 7.1 is also credible. Organisations in the for-profit sector generally seek to maximise total discounted profit, so the
cooperative outcome at (30, 30) will be maintained indefinitely as long
as the present value of cooperation is greater than the present value of
deviating (Romp, 1997), and as long as firms do not discount the future
too much. Since an infinitely repeated game develops over time, future
pay-offs need to be discounted to some extent. Pay-offs lose value over
time, so a sum of money to be received in the future should be assigned
a lower value today. Conversely, a pay-off received today should be
assigned a higher value in the future since it could gain interest over the
intervening period.
Suppose r is the potential rate of interest, then d = 1/(1 + r) is the
rate of discount. With this rate of discount, the present value, Vnow, of
maintaining a small-subsidy strategy, Vnow(small), is given by the expression:

Vnow(small) = 30 + 30d + 30d² + . . .     (1)

Therefore:

dVnow(small) = 30d + 30d² + 30d³ + . . .

So,

(1 − d)Vnow(small) = 30

or

Vnow(small) = 30/(1 − d)  (since d < 1)     (2)
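This comparison settles when the trigger strategy sustains cooperation. Using the pay-offs given above (30 per period from cooperating; 40 once and then 20 per period after a deviation), deviation is unprofitable whenever 30/(1 − d) ≥ 40 + 20d/(1 − d), that is, whenever d ≥ 1/2. A sketch:

```python
# Present value of cooperating for ever versus deviating once, at discount d.
# Pay-offs taken from the example: cooperate 30 per period; a deviation
# earns 40 once, then 20 per period under the trigger punishment.
def v_cooperate(d):
    return 30 / (1 - d)

def v_deviate(d):
    return 40 + 20 * d / (1 - d)

# Cooperation is sustainable iff v_cooperate(d) >= v_deviate(d):
#   30 >= 40*(1 - d) + 20*d, i.e. d >= 1/2.
critical = min(d / 1000 for d in range(1, 1000)
               if v_cooperate(d / 1000) >= v_deviate(d / 1000))
```

So the firms maintain the pareto-efficient outcome only if they do not discount the future below d = 1/2.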
Figure 7.2

                            BUPA
               Large      Moderate     Small
               subsidy    subsidy      subsidy
GHG  Large
     subsidy   20, 20      40, 10        0, 0
     Moderate
     subsidy   10, 40      30, 30        0, 0
     Small
     subsidy    0, 0        0, 0        25, 25
Figure 7.3 The extended NHS subsidy pay-off matrix for the entire game played over two iterations.

                            BUPA
               Large      Moderate     Small
               subsidy    subsidy      subsidy
GHG  Large
     subsidy   40, 40      60, 30       20, 20
     Moderate
     subsidy   30, 60      55, 55       20, 20
     Small
     subsidy   20, 20      20, 20       45, 45
Suppose now that each firm is allowed three strategies rather than two and that the game is played
twice (see Figure 7.2).

The one-off game has two Nash equilibria, shaded in Figure 7.2, at
(20, 20) and (25, 25). They are both pareto-inefficient because if players
could coordinate on (30, 30), then both players would be better off. For
the second iteration of the game, suppose that the two firms adopt the
following punishment strategy: in the initial iteration, adopt a moderate-subsidy strategy; in the second iteration, adopt a small-subsidy
strategy if the other player has also adopted a moderate-subsidy strategy
in the first iteration; otherwise, adopt a large-subsidy strategy (Romp,
1997).

In terms of the pay-off matrix for the entire game, this punishment
strategy has the effect of increasing each pay-off by 20m, with the
exception of the case where both firms adopt moderate-subsidy
strategies, in which case pay-off is increased by 25m. Figure 7.3 shows
the pay-off matrix for the entire game, assuming no discount of
pay-offs over time.

The game now has three Nash equilibria, shaded above, at (40, 40),
(55, 55) and (45, 45). Adopting moderate-subsidy strategies in the first
iteration and small-subsidy strategies in the second is a sub-game
perfect Nash equilibrium, and players thus avoid the paradox of backward induction.
Figure 7.4 GHG is (a) free of any additional constraints and (b, overleaf) bound by some internal constraints.

(a) GHG moves first, choosing a large or small subsidy; BUPA observes the choice and responds with a large or small subsidy, giving pay-offs (GHG, BUPA) of (20, 20), (40, 10), (10, 40) and (30, 30). In strategic form:

                          BUPA
               Large subsidy   Small subsidy
GHG  Large
     subsidy      20, 20           40, 10
     Small
     subsidy      10, 40           30, 30
Figure 7.4 (cont.)

(b) With GHG bound by internal constraints, the pay-offs (GHG, BUPA) become (0, 20), (5, 10), (10, 20) and (30, 30). In strategic form:

                          BUPA
               Large subsidy   Small subsidy
GHG  Large
     subsidy       0, 20            5, 10
     Small
     subsidy      10, 20           30, 30
This game can be solved using the principle of iterated strict dominance, which produces a unique Nash equilibrium. Row 1 dominates
row 2 when GHG is free of additional constraints; row 4 dominates
row 3 when GHG is bound by additional constraints. Now BUPA
knows that GHG will adopt a large-subsidy strategy with probability p
and a small-subsidy strategy with probability 1 − p. Therefore, BUPA
can calculate its own expected profit conditional on its own subsidy
strategy. If it decides on a large-subsidy strategy, its expected profit
level is:

20p + 20(1 − p)

If it decides on a small-subsidy strategy, its expected profit level is:

10p + 30(1 − p)
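Setting the two expected profit levels equal shows where BUPA's best reply switches, a calculation simple enough to sketch:

```python
# BUPA's expected profits, conditional on GHG playing a large subsidy
# with probability p (expressions from the text).
def e_large(p):
    return 20 * p + 20 * (1 - p)   # always 20

def e_small(p):
    return 10 * p + 30 * (1 - p)   # 30 - 20p

# The strategies are equally profitable where 20 = 30 - 20p, i.e. p = 1/2.
p_star = (30 - 20) / 20
```

BUPA should therefore choose the small subsidy when p < 1/2 and the large subsidy when p > 1/2.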
Figure 7.5

Nature first decides whether GHG is free of constraints (probability p) or bound by constraints (probability 1 − p); GHG then chooses a large or small subsidy, and BUPA responds with a large or small subsidy of its own. The corresponding strategic form, with pay-offs (GHG, BUPA), is:

                                      BUPA
                           Large subsidy   Small subsidy
GHG  Free & large subsidy      20, 20          40, 10
     Free & small subsidy      10, 40          30, 30
     Bound & large subsidy      0, 20           5, 10
     Bound & small subsidy     10, 20          30, 30
The management of the balance of power is a permanent undertaking, not an exertion that
has a foreseeable end.
Henry Kissinger 1979 The White House Years
Multi-person games consist of three or more players and differ theoretically from single- and two-person games because they potentially
involve coalitions. If the interests of the players coincide exactly, so that
coalitions are unnecessary or meaningless, then the games are ones of
pure coordination and reduce to the case of two-person cooperative
games discussed already in Chapter 4. In such cases, the only possible
coalition is the grand coalition, which involves all players acting in
unison, and coordination is effected either by explicit communication
or by informal expectation.

Zero-sum multi-person games, on the other hand, are radically
affected by the possibility of coalition, since they introduce the potential for cooperation into a game that would otherwise not have any.
These non-cooperative multi-person games use an approach which is
an extension of the saddle/equilibrium point approach.

Partially cooperative and mixed-motive games come somewhere
between the two extremes of purely cooperative and zero-sum games.
Partially cooperative and mixed-motive games have more realistic
solutions than those arising from completely non-cooperative games,
although some have approaches which tend towards obscurity (von
Neumann & Morgenstern, 1953).
Following a brief discussion on non-cooperative multi-person
games, this chapter begins by extending some concepts and definitions to mixed-motive and partially cooperative multi-person games.
Theories such as the minimal winning coalition theory and the minimum resource theory are discussed as useful predictors of coalition
forming on committees. The bulk of the chapter is devoted to developing methods for analysing the distribution of power among factions on a committee. Five different indices of power are described
and two in particular are developed from first principles and used in
a detailed examination of power on boards of governance. Power and
pay-offs for both majority and minority factions are considered, as
are voting tactics and the implications for structuring committees
generally.
There are often many non-equivalent and non-interchangeable Nash equilibrium points and there is no easy way of finding them,
never mind sorting them. In fact, the outcome of a multi-person game
may not be a Nash equilibrium point at all.
The three funding amounts, which follow from this unique solution,
are therefore:

u1 = $4344 per capita
u2 = $134 per capita
u3 = $323 424 lump sum for special services

The second derivatives are:

∂²u1/∂a² = −2

∂²u2/∂b² = −9/[4(a − 3b)^(3/2)]

∂²u3/∂c² = −2a

Clearly, all three second derivatives are always less than zero, since a, b
and c are all positive numbers, so the solution is a unique Nash
equilibrium that maximises funding income for the police force.
A faction is described as pivotal if it turns
a losing coalition into a winning one by virtue of its vote; and as critical if its
withdrawal causes that coalition to change from a winning one to a
losing one.
Underlying assumptions: sincerity, completeness and transitivity
The Shapley value (Shapley, 1953) rates each faction according to its a
priori power; in other words, in proportion to the value added to the
coalition by that faction joining it. Suppose a game, G, has n factions
(not players) and some of them vote together to form a coalition C.
Suppose an individual faction of C is denoted by fi and the size of the
coalition C is s, then:

G = {f1, f2, f3, . . . , fn}; C = {f1, f2, . . . , fi, . . . , fs}; C is a subset of G.

Clearly, fi has s − 1 partners, selected from n − 1 players. Therefore,
there are
(n − 1)!/[(s − 1)! ((n − 1) − (s − 1))!]

ways of re-arranging the coalition partners of fi. The reciprocal of this
expression,

(s − 1)! (n − s)!/(n − 1)!     (1)

represents the probability of each such selection.

Assuming that all sizes of coalition are equally likely, a particular size
occurs with a probability of 1/n. Therefore, the probability of any
particular coalition of size s containing the individual faction fi, from n
factions, is given by the expression:

(s − 1)! (n − s)!/n!     (2)
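This probability can be checked directly. The sketch below (in Python) evaluates (s − 1)!(n − s)!/n! for n = 5, the case of the five-faction boards analysed later, and confirms that, summed over all coalitions containing a given faction, the probabilities total 1:

```python
from math import factorial, comb

def p_coalition(n, s):
    """Probability of one particular size-s coalition containing faction fi."""
    return factorial(s - 1) * factorial(n - s) / factorial(n)

n = 5
weights = [p_coalition(n, s) for s in range(1, n + 1)]
# For n = 5 these are 1/5, 1/20, 1/30, 1/20 and 1/5, as used in the tables.

# There are C(n-1, s-1) coalitions of size s containing fi, so the total
# probability over all coalitions containing fi must be 1.
total = sum(comb(n - 1, s - 1) * p_coalition(n, s) for s in range(1, n + 1))
```

Each size class contributes exactly 1/n to the total, which is why the equal-likelihood assumption makes the weights sum out so neatly.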
G = {f1, f2, f3, . . . , fn}; C = {f1, f2, . . . , fi, . . . , fs}; C is a subset of G.

The Shapley–Shubik index, SS(fi), is defined as the number of coalitions in
which fi is pivotal, divided by the total number of possible winning coalitions.
The number of coalitions of each size in which each faction is pivotal:

                 C pivotal  L pivotal  P pivotal  T pivotal  D pivotal  Totals
Single               0          0          0          0          0         0
Two-faction          4          1          1          1          1         8
Three-faction       24          3          3          3          3        36
Four-faction        72         12         12         12         12       120
Grand               72         12         12         12         12       120
Totals             172         28         28         28         28       284
The Shapley value for each faction, using the weights (s − 1)!(n − s)!/n! for each coalition size:

                 Weight     C     L     P     T     D    Total
Single            1/5       0     0     0     0     0      0
Two-faction       1/20      4     1     1     1     1      8
Three-faction     1/30     24     3     3     3     3     36
Four-faction      1/20     72    12    12    12    12    120
Grand             1/5      72    12    12    12    12    120
Shapley values            19.0  3.15  3.15  3.15  3.15
                Number of coalitions   Number of possible    Shapley–Shubik
                where pivotal          winning coalitions    index
C pivotal            172                    284                  0.61
L pivotal             28                    284                  0.099
P pivotal             28                    284                  0.099
T pivotal             28                    284                  0.099
D pivotal             28                    284                  0.099
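These figures can be approximated with the textbook permutation form of the Shapley–Shubik index. The sketch below (in Python) assumes the voluntary maintained board described in the conclusions later in the chapter (church 4 seats, ELB 2, parent, teacher and DE 1 each) and a simple-majority quota of 5 of the 9 votes, which is an assumption; counting how often each faction is pivotal over all orderings gives 0.60 for the church faction and 0.10 for each of the others, close to the coalition-count figures of 0.61 and 0.099 above:

```python
from itertools import permutations

# Voluntary maintained board: seat counts from the conclusions;
# a simple majority of 5 of the 9 votes is assumed to win.
weights = {"C": 4, "ELB": 2, "P": 1, "T": 1, "D": 1}
quota = 5

pivots = {f: 0 for f in weights}
orderings = list(permutations(weights))
for order in orderings:
    tally = 0
    for faction in order:
        tally += weights[faction]
        if tally >= quota:          # this faction tips the coalition over
            pivots[faction] += 1
            break

ss = {f: pivots[f] / len(orderings) for f in weights}
```

On this calculation the church faction holds six times the power of any other faction, despite holding only four of the nine seats.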
Table 8.4

                 C pivotal  L pivotal  P pivotal  T pivotal  Totals
Single               0          0          0          0         0
Two-faction          3          1          1          1         6
Three-faction       12          4          4          4        24
Grand               12          4          4          4        24
Totals              27          9          9          9        54
Table 8.5 The Shapley value for each faction, using the weights (s − 1)!(n − s)!/n! for each coalition size:

                 Weight     C     L     P     T    Total
Single            1/4       0     0     0     0      0
Two-faction       1/12      3     1     1     1      6
Three-faction     1/12     12     4     4     4     24
Grand             1/4      12     4     4     4     24
Shapley values            4.25  1.42  1.42  1.42
Grand coalitions
These cases reduce to the three-faction coalition analysis outlined
above.
Table 8.4 summarises the extent to which each faction is pivotal in
each of the possible coalition sizes.
The two actual power measurements for each of the participating factions can now be calculated.

The Shapley value for each faction
For the Shapley value equation in the case of controlled secondary
boards, n = 4 and s = 1, 2, 3, 4. Again, we assume that the contribution of fi to each coalition in which it is pivotal, namely
v(C) − v(C − fi), is unity and that the contribution of fi to each
unsuccessful coalition is zero. The results are summarised on Table 8.5.

The Shapley–Shubik index for each faction
The results are summarised on Table 8.6.

An analysis of power on out-of-state boards (Model C)
Let the five factions on the board of governors be denoted as follows:
Table 8.6

                Number of coalitions   Number of possible    Shapley–Shubik
                where pivotal          winning coalitions    index
C pivotal            27                     54                  0.50
L pivotal             9                     54                  0.167
P pivotal             9                     54                  0.167
T pivotal             9                     54                  0.167
Table 8.7

                 L pivotal  P pivotal  R pivotal  T pivotal  r pivotal  Totals
Single               0          0          0          0          0         0
Two-faction          1          1          0          0          0         2
Three-faction       15         15          8          8          8        54
Four-faction        36         36         16         16         16       120
Grand               36         36         16         16         16       120
Totals              88         88         40         40         40       296
will be pivotal in these six coalitions. R, T and r (in any order) will vote
first on a further 12 occasions and, half that time, P will be pivotal. In
all other coalitions, the pivotal position will be third in the voting and
P will be in this position on 24 occasions. In total then, P will be pivotal
for 36 coalitions.

Similar analysis reveals that L will also be pivotal for 36 four-faction
coalitions and R, T and r will each be pivotal for 16.

Grand coalitions
Since grand coalitions have 11 votes and the largest faction commands
only three, the last faction voting can never be pivotal. Therefore, these
cases reduce to the four-faction coalition analysis outlined above.

Table 8.7 is a summary table of the extent to which each faction is
pivotal in each of the five possible coalition sizes.
The Shapley value and the Shapley–Shubik index for each of the five
participating factions can now be calculated.

The Shapley value for each faction
For the Shapley value equation in the case of out-of-state boards,
n = 5 and s = 1, 2, 3, 4, 5. We assume that the contribution of fi to
each coalition in which it is pivotal, namely v(C) − v(C − fi), is unity
and that the contribution of fi to each unsuccessful coalition is zero.
The results are summarised on Table 8.8.

The Shapley–Shubik index for each faction
The results are summarised on Table 8.9.
Table 8.8 The Shapley value for each faction, using the weights (s − 1)!(n − s)!/n! for each coalition size:

                 Weight     L     P     R     T     r    Total
Single            1/5       0     0     0     0     0      0
Two-faction       1/20      1     1     0     0     0      2
Three-faction     1/30     15    15     8     8     8     54
Four-faction      1/20     36    36    16    16    16    120
Grand             1/5      36    36    16    16    16    120
Shapley values            9.55  9.55  4.27  4.27  4.27
Table 8.9

                Number of coalitions   Number of possible    Shapley–Shubik
                where pivotal          winning coalitions    index
L pivotal            88                    296                  0.297
P pivotal            88                    296                  0.297
R pivotal            40                    296                  0.135
T pivotal            40                    296                  0.135
r pivotal            40                    296                  0.135
Conclusions

The relative power of major and minor players
On voluntary maintained boards, the church nominees have four
seats on the board. Parent, teacher and Department of Education
(DE) representatives have one each, while the Education and Library
Board (ELB) has two seats. However, analysis reveals that church
nominees have more than six times the power of any of the other
factions!

On controlled secondary boards, the church nominees have four
seats on the board, and parent and ELB representatives have two
each. There is one teacher seat. Analysis from both indices reveals
that church nominees have three times the power of any of the other
factions.

On the sample out-of-state board, the Shapley value and the Shapley–Shubik index both reveal that the power of the ELB faction and
the parent body is approximately 2.2 times that of each of the other
three factions. This is a truer reflection of power than the ratio of
memberships: 3:2 in the case of both teachers and majority religious
Table 8.10 Most pivotal position in the voting sequence for voluntary maintained board factions (percentages of pivotal occurrences by position): C: 37, 35, 28; L, P, T and D: 57, 0, 43 each.
Table 8.11 Most pivotal position in the voting sequence for controlled secondary board factions (percentages of pivotal occurrences by position): C, L, P and T: 56, 44 each.
Table 8.12 Most pivotal position in the voting sequence for out-of-state board factions (percentages of pivotal occurrences by position): L and P: 18, 68, 14 each; R, T and r: 0, 100, 0 each.
Game theory analysis of multi-person coalitions raises additional practical implications for how committees are constituted, whether they are
dissemination forums or statutory decision-making bodies.

The numerical voting strength of a faction on a committee is not a
reflection of its real voting power. This can lead to frustration, but it
can also be a source of stability.

Statutory decision-making committees should be constituted so as
to reflect accurately the desired or entitled proportional representation.
Managers need to be aware of the possibility of disproportionate
voting power, particularly when setting up structures for staff involvement in decision making. Staff committees which appear to
reflect the relative sizes of different groupings within organisations,
for example, may be dangerously skewed.

There are other measurements of power, such as the Johnston index
(Johnston, 1978), which looks at the reciprocal of the number of
critical factions; the Deegan–Packel index (Deegan & Packel, 1978),
which looks at the reciprocal of the number of minimal factions; and
the Banzhaf index (Banzhaf, 1965), which looks at the number of
coalitions for which a faction is both critical and pivotal.
The Johnston index

J(fi) = jp(fi) / Σi jp(fi), for i = 1 to n

Since the Johnston index is normalised, 0 ≤ J(fi) ≤ 1, where 1 represents absolute power.

The Deegan–Packel index

DP(fi) = dp(fi) / Σi dp(fi), for i = 1 to n
The Banzhaf index

The Banzhaf index looks at the number of coalitions for which a faction
is both critical and pivotal. Using the same notation as before, the total
Banzhaf power, b(fi), is defined as the number of winning coalitions in
which fi is a pivotal and critical member. This is normalised to the
Banzhaf index, B(fi), as:

B(fi) = b(fi) / Σi b(fi), for i = 1 to n

Since the Banzhaf index is normalised, 0 ≤ B(fi) ≤ 1, where 1 represents absolute power.
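A raw, critical-count version of Banzhaf power is easy to compute by enumerating coalitions. The sketch below (in Python) applies it to the same voluntary maintained board used earlier (seat counts from the conclusions; a simple-majority quota is assumed). Note that it counts criticality only, whereas the definition above also requires pivotality, so the figures differ slightly from the indices in the tables:

```python
from itertools import combinations

# Voluntary maintained board: seat counts from the conclusions;
# a simple majority of 5 of the 9 votes is assumed to win.
weights = {"C": 4, "ELB": 2, "P": 1, "T": 1, "D": 1}
quota = 5
factions = list(weights)

def wt(coalition):
    return sum(weights[f] for f in coalition)

# b[f] counts winning coalitions that lose if f withdraws (f is critical).
b = {f: 0 for f in factions}
for r in range(1, len(factions) + 1):
    for coalition in combinations(factions, r):
        if wt(coalition) >= quota:
            for f in coalition:
                if wt(coalition) - weights[f] < quota:
                    b[f] += 1

total = sum(b.values())
banzhaf = {f: b[f] / total for f in factions}   # normalised index
```

On this count the church faction is critical in 14 of the 22 swings, again far out of proportion to its four seats.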
Summary

Each of the five power indices has its own characteristics. Three of them,
the Shapley, Shapley–Shubik and Banzhaf indices, depend on the
order in which the winning coalition is formed. The Johnston index,
which looks at coalitions that are winning but not minimal, may
contain factions that are not critical, i.e. their defection does not cause
the coalition to become a losing one. The Deegan–Packel index looks at
the number of factions in minimal winning coalitions and thus regards
all such factions as having equal power.

The two indices used in Example 8.2 are the most straightforward
and popular, although they are limited in a minor way by the axioms
and assumptions already noted. These include the assumption that
factions always vote sincerely and along rational self-interest lines; that
voting is open; that coalitions have not been pre-arranged; that all
coalitions are equally likely to appear; and that there is a reward for
being part of a winning coalition. The appropriateness of these assumptions is, of course, a matter for judgement. Each faction judges
the suitability of a particular solution according to the favourableness
of its outcomes and not by any innate attractiveness. Therefore, power
is ultimately judged by its actual exercise, rather than by its perceived
distribution; and perceptions can be mistaken, as game theoretic
How selfish soever man may be supposed, there are evidently some principles in his nature
which interest him in the fortunes of others, and render their happiness necessary to him,
though he derives nothing from it except the pleasure of seeing it.
Adam Smith (1759) The Theory of Moral Sentiments
Rationality
Game theory is based on a presumption of rationality, which at first
sight appears to be optimistic. At the very least, there is need for more
got locked into a losing strategy for both itself and its opponent
whenever the opponent made random irrational errors – a doomsday
scenario from which another irrational error from the opponent was
needed in order to escape. To investigate further, Axelrod conducted
a third run of the experiment, generating random error for tit-for-tat
and its opponents. This time it was beaten by more tolerant opponents
– ones which waited to see whether aggression was a mistake or a
deliberate strategy.
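Axelrod's noisy re-runs are easy to reproduce in miniature. The sketch below is illustrative only: the pay-offs (T = 5, R = 3, P = 1, S = 0) and the 'tolerant' opponent (tit-for-two-tats, which punishes only two consecutive defections) are assumed for the example, not taken from the text.

```python
import random

# Assumed prisoner's dilemma pay-offs: (row, column) for each move pair.
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def tit_for_tat(history):
    # Cooperate first, then copy the opponent's last move.
    return history[-1] if history else 'C'

def tit_for_two_tats(history):
    # A more tolerant player: defect only after two consecutive defections.
    return 'D' if history[-2:] == ['D', 'D'] else 'C'

def play(strategy_a, strategy_b, rounds=1000, error=0.05, seed=1):
    rng = random.Random(seed)
    hist_a, hist_b = [], []     # each player's record of the opponent's moves
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a)
        move_b = strategy_b(hist_b)
        # A random implementation error flips each move with probability `error`.
        if rng.random() < error:
            move_a = 'D' if move_a == 'C' else 'C'
        if rng.random() < error:
            move_b = 'D' if move_b == 'C' else 'C'
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_b)
        hist_b.append(move_a)
    return score_a, score_b
```

Without noise, two tit-for-tat players cooperate forever; with noise, a single flipped move triggers the doomsday cycle of alternating retaliation, while the tolerant strategy absorbs one-off errors.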
The paradox that sometimes it is rational to act irrationally can only
be resolved by altering the definition of what it means to be rational.
The importance of such a definition is more than mere semantics: the
success or otherwise of game theory as a model for behaviour depends
on it. Rationality may mean different things in different circumstances
to different people, but it undermines or underpins the very
foundations of game theory, whatever it is.
Indeterminacy
The second major criticism of game theoretic constructs is that they
sometimes fail to deliver unique solutions, usually because the game
has more than one equilibrium. In such cases, the optimal strategy
remains undetermined and selections are usually made on the basis of
what players think other players will do. Therefore, strategic selection is
not necessarily rational. It may centre on prominent features of the
game – focal points towards which decision making gravitates
(Schelling, 1960). These salient features act like beacons for those
playing the game, so that the final outcome is in equilibrium. They are
usually experiential or cultural, rather than rational.
The problem of indeterminacy affects, in particular, mixed-strategy
Nash equilibrium solutions because, if one player expects the other to
choose a mixed strategy, then he or she has no reason to prefer a mixed
strategy to a pure one. To overcome this, some writers have suggested
that mixed-strategy probabilities represent what players subjectively
believe other players will do, rather than what they will actually do
(Aumann, 1987). This is akin to the Harsanyi doctrine, which states that
if rational players have the same information, then they must
necessarily share the same beliefs, although it is undermined in turn by the fact
that rational players with the same information do not always make the
same suggestions or reach similar conclusions.
[Figure 9.1: the centipede game tree. At each node the player to move either takes the pot, ending the game, or passes; the pay-offs run from (1, 0) at Take 1, through (0, 2), (3, 0) and (0, 4), to (0, 50) at Take 50, with (0, 0) if the final offer is passed.]
Inconsistency
The third major criticism of game theory, that of inconsistency (Binmore, 1987), concerns the technique of backward induction and the
assumption of common knowledge of rationality in Bayesian sub-game
perfect Nash equilibria. The criticism is best illustrated by way of an
example.
The centipede game, so called because of the appearance of its game
tree, was developed by Rosenthal (1981) from Selten (1978), and has
since been extended to include a number of variations (Megiddo, 1986;
Aumann, 1988; McKelvey & Palfrey, 1992). The original basic version
has two players, A and B, sitting across from each other at a table. A
referee puts £1 on the table. Player A is given the choice of taking it and
ending the game, or not taking it, in which case the referee adds
another £1 and offers player B the same choice – take the £2 and end the
game, or pass it back to the referee, who will add another £1 and offer
the same choice to player A again. The pot of money is allowed to grow
until some pre-arranged limit is reached – £50, say – which is known in
advance to both players. Figure 9.1 shows the decision tree for the
game.
The method of backward induction tells us that, since player B must
surely take the £50 at the final node, a rational player A should accept
the £1 at the very first node.
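The backward-induction argument can be sketched directly. The model below is an assumed simplification of Figure 9.1 (the mover takes the whole pot; passing at the final node leaves both players with nothing), not the text's own notation:

```python
def backward_induction(limit=50):
    # Work backwards from the final node. `value` holds the pair of
    # pay-offs (A, B) that rational play yields from the current node.
    # Player A moves when the pot is odd, player B when it is even.
    # Taking ends the game: the mover takes the whole pot, the other
    # player gets nothing. Passing at the final node leaves (0, 0).
    value = (0, 0)                          # outcome if the last mover passes
    for pot in range(limit, 0, -1):
        mover = 0 if pot % 2 == 1 else 1    # 0 = A, 1 = B
        take = (pot, 0) if mover == 0 else (0, pot)
        # The mover takes whenever taking is at least as good as passing.
        value = take if take[mover] >= value[mover] else value
    return value

print(backward_induction(50))               # (1, 0)
```

The £50 pot unravels node by node, so player A takes the £1 immediately, however large the limit is set.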
Conclusion
Game theory clearly fails to describe the reality of decision making in
some circumstances, although, in its defence, it should be said that it
primarily seeks to provide a prescriptive analysis that better equips
players to make good strategic decisions. It does not make moral or
ethical recommendations. It merely explores what happens when
certain selfish incentives are assumed. Game theory cannot be held
responsible for selfish behaviour, any more than medicine can be held
responsible for sickness.
Game theory is in flux. It is continually being developed and
researched. Not all predictions have been found to be supported by
empirical evidence and this has led to refinement and reconstruction.
So it should be! New and more complex variables have been
introduced, largely as a result of its application to neo-classical and
neo-Keynesian economics, though the extent to which in-game learning
influences both success and rationality has not yet been fully explored.
Fundamental questions – such as whether learning increases pay-off or
determines strategy, whether good learners play better games, and
which type of learning best equips players for which type of games –
have been left unasked and unanswered. Such questions are of
fundamental importance in education, training and organisational
development, of course. The rapidly changing nature of society and its
post-industrial economy brings new challenges almost daily. Information is no longer precious, the property of the privileged few. It is
immediate, available in real time and irrespective of individual status.
Organisational intelligence has thus become the shared faculty of the
many, and the worth of collectives has become rooted in notions of
social and intellectual capital.
If surviving and thriving in the face of change is the name of the
game, then everyone involved in it is a player. Individuals and organisations need to learn generic concepts of strategic networking and
problem resolution, as a cultural expectation and over a lifetime.
Decision making needs to be informed and sure-footed. The pay-off
for keeping apace is effectiveness; the price for failing to do so is
degeneration; and the strategy for avoiding failure lies at the interface
of game theory and learning. It is an interaction that can only grow
stronger.
Preamble
Suppose that the pay-off matrix for a two-person zero-sum game has m
rows and n columns, and that player 1 and player 2 choose their
strategies, represented by row ri and column cj, simultaneously and
with pay-off wij. Both players randomise their selections using mixed
strategies in which a probability is assigned to each available option
(p for player 1 and q for player 2, say). Of course, each player's
mixed strategy sums to unity and can be written:
p = (p1, p2, . . . , pi, . . . , pm), with Σ pi = 1, for i = 1 to m
and
q = (q1, q2, . . . , qj, . . . , qn), with Σ qj = 1, for j = 1 to n
Therefore, strategy ri will be chosen with probability pi and strategy cj
with probability qj. These strategies are chosen independently, so the
probability of getting a particular pay-off wij is pi qj. The expected
pay-off for the game is then:
Σ wij pi qj, for i = 1 to m and j = 1 to n.
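The expected pay-off formula translates directly into code. This minimal Python sketch (the example matrix and strategies in the comment are hypothetical) evaluates Σ wij pi qj:

```python
def expected_payoff(w, p, q):
    # E = sum over all i, j of w[i][j] * p[i] * q[j]
    assert abs(sum(p) - 1) < 1e-9 and abs(sum(q) - 1) < 1e-9  # mixed strategies sum to 1
    return sum(w[i][j] * p[i] * q[j]
               for i in range(len(p)) for j in range(len(q)))

# For matching pennies, w = [[1, -1], [-1, 1]], equal mixing by both
# players gives an expected pay-off of zero.
```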
Appendix A
Proof: step 1
Since, by definition, the maximum value of a variable cannot be smaller
than any other value and the minimum value cannot be bigger than any
other value, it follows that
maxp minq Σ wij pi qj = minq Σ wrj pr qj ≤ Σ wrc pr qc ≤ maxp Σ wic pi qc = minq maxp Σ wij pi qj
So:
maxp minq Σ wij pi qj ≤ minq maxp Σ wij pi qj
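The inequality of step 1 can be checked numerically. The sketch below is an illustration, not part of the proof: it uses a hypothetical 2 × 2 matrix and a coarse grid search over mixed strategies rather than exact optimisation.

```python
def expected(w, p, q):
    # Expected pay-off sum of w[i][j] * p[i] * q[j] for a 2 x 2 matrix.
    return sum(w[i][j] * p[i] * q[j] for i in range(2) for j in range(2))

def minimax_bounds(w, steps=100):
    # Grid over mixed strategies p = (t, 1 - t) and q = (s, 1 - s).
    grid = [k / steps for k in range(steps + 1)]
    maxmin = max(min(expected(w, (t, 1 - t), (s, 1 - s)) for s in grid)
                 for t in grid)
    minmax = min(max(expected(w, (t, 1 - t), (s, 1 - s)) for t in grid)
                 for s in grid)
    return maxmin, minmax
```

For the matching-pennies matrix [[1, −1], [−1, 1]] both bounds come out at (numerically) zero, the value of the game, so maxp minq ≤ minq maxp holds with equality.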
[Figure A.1: pay-off matrix for the student attendance example. Students choose between attending and not attending lessons; teachers choose between teaching on, passively supervising group study, and actively giving revision workshops.]
[Figure A.2: the pay-off points for player 1's two rows against player 2's three column choices, with the vertex of the region marked.]
[Figure A.3: the pay-off line between player 1's two rows.]
q2 = 45/62; q3 = 12/62
Proof: step 2
If
minq maxp Σ wij pi qj > 0
then the coordinates of any point in W cannot all be negative. At least
one must be zero or positive and, therefore, the region of the third
quadrant, Q, can have no common point with W.
Let a = (a1, a2, . . . , am) be the point in Q nearest W, and b = (b1,
b2, . . . , bm) be the point in W nearest a. Clearly, if ai is replaced by a
negative number ai* or zero, a is still a point in Q and is therefore no
nearer W. In other words:
(b1 − a1*)² + (b2 − a2)² + · · · + (bm − am)² ≥ (b1 − a1)² + (b2 − a2)² + · · · + (bm − am)²
which simplifies after cancellation to:
(b1 − a1*)² ≥ (b1 − a1)²
There are two cases to consider.
If b1 ≤ 0, then possibly a1* = b1, in which case a1 = b1. Likewise for the
remaining coordinates of a and b.
If b1 > 0 and it is assumed that a1* = 0, then b1² ≥ (b1 − a1)², which
simplifies to (2b1 − a1) a1 ≥ 0 and hence a1 = 0, because a1 ≤ 0 and b1 > 0.
Likewise for the remaining coordinates of a and b.
Proof: step 3
For any number t between 0 and 1 and any point w ∈ W, the point
tw + (1 − t)b = [tw1 + (1 − t)b1, tw2 + (1 − t)b2, . . . ,
twm + (1 − t)bm]
Proof: step 4
Each of the numbers (b1 − a1), (b2 − a2), . . . , (bm − am) is positive or
zero, according to the result of step 2. They cannot all be zero, because a
and b are different points and therefore cannot have all the same
coordinates. Therefore, the sum of all these numbers is positive, i.e.:
Σ (bi − ai) > 0
Therefore, if (bi − ai)/Σ (bi − ai) is denoted by αi, then α1, α2, . . . , αm are
each either zero or positive and Σ αi = 1, since Σ (bi − ai)/
Σ (bi − ai) = 1.
So α = (α1, α2, . . . , αm) satisfies the requirements of a mixed strategy.
Dividing each term of the expression at the end of step 3 by
Σ (bi − ai) gives:
α1w1 + α2w2 + · · · + αmwm > 0, for every w ∈ W
But according to the definition of W in the graphic model, the
coordinates of w are:
w = (Σ w1j qj, Σ w2j qj, . . . , Σ wmj qj)
so a mixed strategy has therefore been found for player 1 such that:
α1 Σ w1j qj + α2 Σ w2j qj + · · · + αm Σ wmj qj > 0
for every q, so that:
minq Σ wij αi qj > 0
Since this holds for α, it must hold for the mixed strategy p that
maximises
minq Σ wij pi qj
Therefore:
maxp minq Σ wij pi qj > 0
So it has been shown that:
if minq maxp Σ wij pi qj > 0, then maxp minq Σ wij pi qj > 0
Proof: step 5
Let k be any number and consider the pay-off matrix which has wij − k
in place of wij everywhere. All pay-offs are reduced by k in both pure
and mixed strategies. So:
(maxp minq Σ wij pi qj) is replaced by (maxp minq Σ wij pi qj − k)
and
(minq maxp Σ wij pi qj) is replaced by (minq maxp Σ wij pi qj − k)
It was proved in step 4 that:
if (minq maxp Σ wij pi qj − k) > 0, then (maxp minq Σ wij pi qj − k) > 0
So
if minq maxp Σ wij pi qj > k, then maxp minq Σ wij pi qj > k
Since k can be as close as necessary to minq maxp Σ wij pi qj, it follows
that:
maxp minq Σ wij pi qj ≥ minq maxp Σ wij pi qj
But we have already seen that:
Appendix B
Preamble
Bayes's theorem shows how a posteriori probabilities are calculated
from a priori ones – in other words, how probabilities are updated as
more information is received. In its simplest form it states:
p(A/B) = p(B/A) p(A) / [p(B/A) p(A) + p(B/Ac) p(Ac)]
and, more generally:
p(Ai/B) = p(B/Ai) p(Ai) / Σi p(B/Ai) p(Ai)
For example, if there are only two possibilities for Ai, then:
p(A1/B) = p(B/A1) p(A1) / [p(B/A1) p(A1) + p(B/A2) p(A2)]
and
p(A2/B) = p(B/A2) p(A2) / [p(B/A1) p(A1) + p(B/A2) p(A2)]
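The updating rule is mechanical, as a short sketch shows (the numerical priors and likelihoods in the comment are hypothetical, chosen only to illustrate the calculation):

```python
def bayes(priors, likelihoods):
    # Posterior p(Ai/B) = p(B/Ai) p(Ai) / sum over i of p(B/Ai) p(Ai)
    joint = [l * p for l, p in zip(likelihoods, priors)]
    total = sum(joint)          # p(B), by the law of total probability
    return [j / total for j in joint]

# With equal priors p(A1) = p(A2) = 0.5 and likelihoods p(B/A1) = 0.8,
# p(B/A2) = 0.2, the posteriors are 0.8 and 0.2.
```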
Proof
By definition,
p(Ai and B) = p(B/Ai) p(Ai)
and
p(Ai and B) = p(Ai/B) p(B)
Equating these two gives:
p(Ai/B) = p(B/Ai) p(Ai) / p(B)    (1)
or
p(B) p(Ai/B) = p(B/Ai) p(Ai)
Summing both sides over i gives:
p(B) Σi p(Ai/B) = Σi p(B/Ai) p(Ai)
But we know that
Σi p(Ai/B) = 1
so
p(B) = Σi p(B/Ai) p(Ai)    (2)
Substituting (2) into (1) gives the result:
p(Ai/B) = p(B/Ai) p(Ai) / Σi p(B/Ai) p(Ai)
Bibliography
Benoit, J.P. & Krishna, V. (1987) Dynamic duopoly: prices and quantities, Review of
Economic Studies, Vol.54, No.1(177), pp.23–35.
Bertrand, J. (1883) Théorie mathématique de la richesse sociale, par Léon Walras; recherches sur les principes mathématiques de la théorie des richesses, par Augustin
Cournot, Journal des Savants, September, pp.499–508.
Bierman, H.S. & Fernandez, L.F. (1998) Game Theory with Economic Applications (Reading,
MA, Addison-Wesley). [2nd edition.]
Binmore, K. (1987) Modelling rational players (part I), Economics and Philosophy, Vol.3,
No.2, pp.179–214.
Binmore, K.G. (1990) Essays on the Foundation of Game Theory (Oxford, Basil Blackwell).
Binmore, K. (1992) Fun and Games: A Text on Game Theory (Lexington, MA, D.C. Heath).
Borel, E. (1924) Sur les jeux où interviennent l'hasard et l'habileté des joueurs, in: J.
Hermann (Ed.) Théorie des Probabilités, pp.204–24 (Paris, Librairie Scientifique).
Translation: Savage, L.J. (1953) On games that involve chance and the skill of players,
Econometrica, Vol.21, No.1, pp.101–15.
Brams, S.J. (1990) Negotiation Games: Applying Game Theory to Bargaining and Arbitration
(London, Routledge).
Bresnahan, T.F. (1981) Duopoly models with consistent conjectures, American Economic
Review, Vol.71, No.5, pp.934–45.
Champsaur, P. (1975) Cooperation versus competition, Journal of Economic Theory,
Vol.11, No.3, pp.394–417.
Colman, A.M. (1982) Game Theory and Experimental Games: The Study of Strategic
Interaction (Oxford, Pergamon Press).
Corsi, J.R. (1981) Terrorism as a desperate game: fear, bargaining and communication in
the terrorist event, Journal of Conflict Resolution, Vol.25, No.2, pp.47–85.
Cournot, A.A. (1838) Recherches sur les Principes Mathématiques de la Théorie des Richesses
(Paris). [See Bacon, N.T. for English edition.]
Cowen, R. & Fisher, P. (1998) Security council reform: a game theoretic analysis,
Mathematics Today, Vol.34, No.4, pp.100–4.
David, F.N. (1962) Games, Gods and Gambling: The Origins and History of Probability
and Statistical Ideas from the Earliest Times to the Newtonian Era (London, Charles
Griffin).
Deegan, J. & Packel, E.W. (1978) A new index of power for simple n-person games,
International Journal of Game Theory, Vol.7, Issue 2, pp.113–23.
Dimand, R.W. & Dimand, M.A. (1992) The early history of the theory of strategic games
from Waldegrave to Borel, in: E.R. Weintraub (Ed.) Toward a History of Game Theory
(Durham, NC, Duke University Press).
Dixit, A.K. & Nalebuff, B.J. (1991) Thinking Strategically: The Competitive Edge in Business,
Politics, and Everyday Life (New York, Norton).
Dixit, A. & Skeath, S. (1999) Games of Strategy (New York, Norton).
Eatwell, J., Milgate, M. & Newman, P. (Eds) (1989) The New Palgrave: Game Theory
(London, Macmillan). [First published in 1987 as The New Palgrave: A Dictionary of
Economics.]
Mitchell, C.R. & Banks, M. (1996) Handbook of Conflict Resolution: The Analytical
Problem-Solving Approach (London, Pinter).
Morgenstern, O. (1976) The collaboration between Oskar Morgenstern and John von
Neumann on the theory of games, Journal of Economic Literature, Vol.14, No.3,
pp.805–16.
Myerson, R.B. (1984) Cooperative games with incomplete information, International
Journal of Game Theory, Vol.13, No.2, pp.69–96.
Nasar, S. (1998) A Beautiful Mind: The Life of Mathematical Genius and Nobel Laureate
John Nash (London, Faber).
Nash, J.F. (1950) Equilibrium points in n-person games, Proceedings of the National
Academy of Sciences of the United States of America, Vol.36, No.1, pp.48–9.
Nash, J. (1951) Non co-operative games, Annals of Mathematics, Vol.54, No.2,
pp.286–95.
O'Neill, B. (1987) Nonmetric test of the minimax theory of two-person zero-sum games,
Proceedings of the National Academy of Sciences of the United States of America, Vol.84,
No.7, pp.2106–9.
Peleg, B. (1963) Solutions to cooperative games without side payments, Transactions of the
American Mathematical Society, Vol.106, pp.280–92.
Phlips, L. (1995) Competition Policy: A Game Theoretic Perspective (Cambridge, Cambridge
University Press).
Plon, M. (1974) On the meaning of the notion of conflict and its study in social psychology,
European Journal of Social Psychology, Vol.4, pp.389–436.
Poundstone, W. (1993) Prisoner's Dilemma: John von Neumann, Game Theory, and the
Puzzle of the Bomb (Oxford, Oxford University Press). [First published, 1992, New York,
Doubleday.]
Radner, R. (1980) Collusive behaviour in noncooperative epsilon-equilibria of oligopolies
with long but finite lives, Journal of Economic Theory, Vol.22, No.2, pp.136–56.
Rapoport, A. (1967a) Exploiter, leader, hero and martyr: the four archetypes of the 2 × 2
game, Behavioral Science, Vol.12, pp.81–4.
Rapoport, A. (1967b) Escape from paradox, Scientific American, Vol.217, No.1, pp.50–6.
Rapoport, A. (1989) Prisoner's dilemma, in: J. Eatwell, M. Milgate & P. Newman (Eds) The
New Palgrave: Game Theory (London, Macmillan). [Originally published as The New
Palgrave: A Dictionary of Economics, 1987.]
Rapoport, A. & Guyer, M. (1966) A taxonomy of 2 × 2 games, General Systems, Vol.11,
Part V, pp.203–14.
Rapoport, A. & Orwant, C. (1962) Experimental games: a review, Behavioral Science, Vol.7,
pp.1–37.
Rees, R. (1993) Tacit collusion, Oxford Review of Economic Policy, Vol.9, No.2, pp.27–40.
Riker, W.H. (1962) The Theory of Political Coalitions (New Haven, CT, Yale University
Press).
Riker, W.H. (1992) The entry of game theory into political science, in: E.R. Weintraub
(Ed.) Toward a History of Game Theory (Durham, NC, Duke University Press).
Riker, W.H. & Ordeshook, P.C. (1973) An Introduction to Positive Political Theory
(Englewood Cliffs, NJ, Prentice Hall).
Robinson, M. (1975) Prisoner's dilemma: metagames and other solutions (critique and
comment), Behavioral Science, Vol.20, pp.201–5.
Romp, G. (1997) Game Theory: Introduction and Applications (Oxford, Oxford University
Press).
Rosenthal, R.W. (1979) Sequence of games with varying opponents, Econometrica, Vol.47,
No.6, pp.1353–66.
Rosenthal, R.W. (1980) New equilibria for non-cooperative two-person games, Journal of
Mathematical Sociology, Vol.7, No.1, pp.15–26.
Rosenthal, R.W. (1981) Games of perfect information, predatory pricing, and the
chain-store paradox, Journal of Economic Theory, Vol.25, No.1, pp.92–100.
Rosenthal, R.W. & Landau, H.J. (1979) A game-theoretic analysis of bargaining with
reputations, Journal of Mathematical Psychology, Vol.20, No.3, pp.233–55.
Rosenthal, R.W. & Rubinstein, A. (1984) Repeated two-player games with ruin,
International Journal of Game Theory, Vol.13, No.3, pp.155–77.
Savage, L.J. (1954) The Foundations of Statistics (New York, John Wiley).
Scarf, H.E. (1967) The core of an n person game, Econometrica, Vol.35, No.1, pp.50–69.
Schelling, T.C. (1960) The Strategy of Conflict (Cambridge, MA, Harvard University Press).
[1980 edition.]
Schmalensee, R. & Willig, R.D. (Eds) (1989) Handbook of Industrial Organisation
(Amsterdam, Elsevier Science/North-Holland). [2 vols.]
Selten, R. (1975) The reexamination of the perfectness concept for equilibrium points in
extensive games, International Journal of Game Theory, Vol.4, Issue 1, pp.25–55.
Selten, R. (1978) The chain-store paradox, Theory and Decision, Vol.9, No.2, pp.127–59.
Selten, R. (1980) A note on evolutionary stable strategies in asymmetric animal conflicts,
Journal of Theoretical Biology, Vol.84, No.1, pp.93–101.
Shapley, L.S. (1953) A value for n-person games, in: H.W. Kuhn & A.W. Tucker (Eds)
Contributions to the Theory of Games Vol.II, Annals of Mathematics Studies Number 28,
pp.307–18 (Princeton, NJ, Princeton University Press).
Shapley, L.S. & Shubik, M. (1954) A method for evaluating the distribution of power in a
committee system, American Political Science Review, Vol.48, No.3, pp.787–92.
Shapley, L.S. & Shubik, M. (1969) Pure competition, coalitional power and fair division,
International Economic Review, Vol.10, No.3, pp.337–62.
Shapley, L.S. & Snow, R.N. (1950) Basic solutions of discrete games, in: H.W. Kuhn & A.W.
Tucker (Eds) Contributions to the Theory of Games Vol.I, Annals of Mathematics Studies
Number 24, pp.27–35 (Princeton, NJ, Princeton University Press).
Shubik, M. (1959) Edgeworth market games, in: A.W. Tucker & R.D. Luce (Eds)
Contributions to the Theory of Games Vol.IV, Annals of Mathematics Studies Number 40,
pp.267–78 (Princeton, NJ, Princeton University Press).
Sidowski, J.B. (1957) Reward and punishment in a minimal social situation, Journal of
Experimental Psychology, Vol.54, No.5, pp.318–26.
Simon, H.A. (1997) Models of Bounded Rationality, Vol.3: Empirically Grounded Economic
Reason (Cambridge, MA, MIT Press).
Straffin, P.D. (1977) Homogeneity, independence, and power indices, Public Choice,
Vol.30, pp.107–18. [Quoted in: Riker, W.H. (1992) The entry of game theory into
political science, in: E.R. Weintraub (Ed.) Toward a History of Game Theory (Durham,
NC, Duke University Press).]
Sugden, R. (1991) Rational choice: a survey of contributions from economics and
philosophy, Economic Journal, July, Vol.101, pp.751–85.
Todhunter, I. (1865) A History of the Mathematical Theory of Probability (Cambridge,
Cambridge University Press). [Reprinted 1965.]
Touraine, A. (1969) La Société Post-Industrielle (Paris, Éditions Denoël). Translation:
Mayhew, L.F.X. (1974) The Post-Industrial Society (London, Wildwood House).
Ville, J.A. (1938) Sur la théorie générale des jeux où intervient l'habileté des joueurs, in:
Borel, E. (Ed.) Traité du Calcul des Probabilités et de ses Applications, Vol.4, pp.105–13
(Paris, Gauthier-Villars).
Von Neumann, J. (1928) Zur Theorie der Gesellschaftsspiele, Mathematische Annalen,
Band 100, pp.295–320. Translation: Bargmann, S. (1959) On the theory of games of
strategy, in: R.D. Luce & A.W. Tucker (Eds) Contributions to the Theory of Games Vol.IV,
Annals of Mathematics Studies Number 40, pp.13–42 (Princeton, NJ, Princeton
University Press).
Von Neumann, J. (1937) Über ein ökonomisches Gleichungssystem und eine
Verallgemeinerung des Brouwerschen Fixpunktsatzes, in: Menger, K. Ergebnisse eines
Mathematischen Seminars (Vienna). Translation: Morgenstern, G. (1945) A model of
general economic equilibrium, Review of Economic Studies, Vol.13, No.1, pp.1–9.
Von Neumann, J. & Morgenstern, O. (1953) Theory of Games and Economic Behaviour
(Princeton, NJ, Princeton University Press). [First published in 1944.]
Von Stackelberg, H. (1934) Marktform und Gleichgewicht (Vienna, Springer). [As described
in: H. Von Stackelberg (1952) The Theory of the Market Economy (London, Hodge).
Translation: Peacock, A.T. (First published in 1943 as Grundlagen der Theoretischen
Volkswirtschaftslehre).]
Wald, A. (1945) Statistical decision functions which minimise maximum risk, Annals of
Mathematics, Vol.46, No.2, pp.265–80.
Weyl, H. (1950) Elementary proof of a minimax theorem due to von Neumann, in: H.W.
Kuhn & A.W. Tucker (Eds) Contributions to the Theory of Games Vol.I, Annals of
Mathematics Studies Number 24, pp.19–25 (Princeton, NJ, Princeton University Press).
Wilson, R. (1978) Information, efficiency, and the core of an economy, Econometrica,
Vol.46, No.4, pp.807–16.
Zermelo, E. (1913) Über eine Anwendung der Mengenlehre auf die Theorie des
Schachspiels, in: E.W. Hobson & A.E.H. Love (Eds) Proceedings of the Fifth International
Congress of Mathematicians, Cambridge, 22–28 August, 1912, Vol.2, pp.501–4
(Cambridge, Cambridge University Press).
Index
maximin principle 46
minimax principle 47
probability theory 32, 33–7
regret matrices 47
with risk 32, 37–45
risk-averse players/functions 41
risk-neutral players/functions 41
risk-taking players/functions 42
six faces of die example 34–5
standard deviation 36
with uncertainty 32, 45–7
utility theory/value 38–45
variance 36–7
von Neumann–Morgenstern utility function 41–2
characteristic functions 153–5
children/successors of decision edges 49
coalescence or voting order 168–9
coalition factions, and power 155
coalitions see multi-person games; power analysis/indices
collective rationality 175
committee forming and voting power 170
complete information games 5
completeness concept 156
computer training course, chance games example 39–41
constant-sum games 77
constraint set 18, 20–1, 24–5
contract lines 121–2
cooperative games of strategy 1, 6
see also multi-person games, partially cooperative; sequential decision making, cooperative two-person games
Cournot model of duopoly 115–16, 116–22
balance and imbalance 121
paper and packaging sector example 116–20
see also Nash, John, Nash equilibrium
critical factions 156
decision making graphs 52–5
example 108–9
unveiled by Tucker, A.W. 13
private healthcare industry publicity funding, repeated games example 136, 141–2, 144–8
probability theory
basics 32, 33–4
expected value 35–6
six faces of die example 34–5
standard deviation 36
variance 36–7
see also chance games
proposing change under uncertainty, sequential change example 70–2
punishing strategies 137
pure strategy
about 4
union/redundant teacher dispute 4–5
Raiffa, Howard 11
rail high-speed link funding, chance games example 37–8
Rand Corporation and US research funds 12–13
Rapoport's winning strategy for prisoner's dilemma game 176–7
rationality/assumption/presumption 2, 3, 174–7
bounded rationality 176
categorical imperatives 175–6
collective rationality 175
instrumental rationality 175
Kant's notion of moral imperative 176
reaction functions 117–18
regret matrices 47
repeated games see finitely repeated games; infinitely repeated games
research and design funding, skill games example 28–31
risk in games of chance 32
computer training course example 39–41
high-speed rail link funding example 37–8
investment portfolio example 43–4
risk-averse players/functions 41
risk-neutral players/functions 41
risk-taking players/functions 42
utility theory/value 38–45
von Neumann–Morgenstern utility function 41–2
see also chance games
roots, sequential decision making 50–1
Rosenthal and Rubinstein 12
Royal Ballet fundraising, skill games example 22–3
saddle points, two-person zero-sum games 80–6
dominance and inadmissibility 85–6
ordinal pay-offs 95–7
158–9
Shapley value, about 156–7
Shapley–Shubik index, about 157–9
skill games
about 6, 7, 17
balancing full/part-time staff example 23–5
constraint set 18, 20–1, 24–5
derivatives of a function 18, 23
exam success by tutoring example 26–7
funding research and design example 28–31
fundraising at Royal Ballet example 22–3
hospital facilities example 20–2
Lagrange method of partial derivatives 27–31
Lagrangian function 27–8
linear programming/optimisation/basic calculus 18–27
optimiser/maximiser 18
pay-off matrix 20, 21
utility function 18
sponsorship organising, sequential decision example 55–9
standard deviation, probability theory 36
strategy games, about 6, 7
cooperative games 6
mixed motive games 6
zero-sum games 6
sub-optimal strategies 3
successors/children, of decision edges 49
supermarket sector, Bertrand duopoly example 125–7
suppliers, dealing with, sequential decisions examples 74–6, 75–6
teachers' union and a redundancy discussion
about 2
pure strategies 4–5
terminal nodes, sequential decision making 50
terminology 3–5
Theory of Games and Economic Behaviour 10–11
Todhunter, Isaac 8
Touraine, Alain 14
transitivity concept 156
travel timetable reforming, saddle point game example 83–4
trees, sequential decision making 50–2
trigger strategies 137–9
Tucker, A.W., and the prisoner's dilemma game 13, 107–13
tutoring/exam success rate, skill games example 26–7
two-person games see mixed-motive two-person games of strategy; sequential decision making, cooperative two-person games; zero-sum two-person games of strategy
uncertainty, games with 32
chance games with 45–7
maternity leave insurance example 45–7
Zermelo's theorem 8, 12
zero-sum non-cooperative games, about 2
zero-sum two-person games of strategy 6, 77–97
about 77–8
constant-sum games 77
dominance and inadmissibility 85–6
finite and infinite 78
information sets 78
interval scales for pay-offs 93–4
large matrices 90–3
law of conservation of utility value 77
medical incompetence example 82–3
minimax mixed strategies 89
mixed strategies 89
no saddle-point games 86–90
normal forms 78–80
ordinal pay-offs 94–7
Pareto-efficiency 77
pay-off matrix 83
perfect and imperfect information 78–9
re-allocation of airline pilot duties example 87–90
representation 78–80
saddle points 80–6
solutions, definition 78
student attendance example 91–3
travel timetable reforming example 83–4
zero-sum game representation 78–80