100% found this document useful (15 votes)

3K views

Decision Making Using Game Theory

Game Theory

Uploaded by

kaasim

Available Formats

Download as PDF, TXT or read online on Scribd

100% found this document useful (15 votes)

3K views

Decision Making Using Game Theory

Game Theory

Uploaded by

kaasim

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 215

Decision Making Using Game Theory

An Introduction for Managers

Game theory is a key element in most decision-making processes involving two or more
people or organisations. This book explains how game theory can predict the outcome of
complex decision-making processes, and how it can help you to improve your own
negotiation and decision-making skills. It is grounded in well-established theory, yet the
wide-ranging international examples used to illustrate its application oVer a fresh approach
to what is becoming an essential weapon in the armoury of the informed manager. The
book is accessibly written, explaining in simple terms the underlying mathematics behind
games of skill, before moving on to more sophisticated topics such as zero-sum games,
mixed-motive games, and multi-person games, coalitions and power. Clear examples and
helpful diagrams are used throughout, and the mathematics is kept to a minimum. It is
written for managers, students and decision makers in any Weld.
Dr Anthony Kelly is a lecturer at the University of Southampton Research & Graduate School

of Education where he teaches game theory and decision making to managers and students.

MMMM

Decision Making using

Game Theory
An introduction for managers

Anthony Kelly

Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, So Paulo
Cambridge University Press
The Edinburgh Building, Cambridge , United Kingdom
Published in the United States of America by Cambridge University Press, New York
www.cambridge.org
Information on this title: www.cambridge.org/9780521814621
Cambridge University Press 2003
This book is in copyright. Subject to statutory exception and to the provision of
relevant collective licensing agreements, no reproduction of any part may take place
without the written permission of Cambridge University Press.
First published in print format 2003
-
isbn-13 978-0-511-06494-4 eBook (NetLibrary)
-
isbn-10 0-511-06494-2 eBook (NetLibrary)
-
isbn-13 978-0-521-81462-1 hardback
-
isbn-10 0-521-81462-6 hardback

Cambridge University Press has no responsibility for the persistence or accuracy of

s for external or third-party internet websites referred to in this book, and does not
guarantee that any content on such websites is, or will remain, accurate or appropriate.

Contents

Preface

Introduction

Terminology
Classifying games
A brief history of game theory
Layout

3
6
8
14

Games of skill

Linear programming, optimisation and basic calculus

The Lagrange method of partial derivatives

18
27

Games of chance

An introduction to basic probability theory

Games of chance involving risk
Games of chance involving uncertainty

33
37
45

Contents

Sequential decision making and cooperative games of

strategy

Representing sequential decision making

Sequential decision making in single-player games
Sequential decision making in two-player and multi-player games
Cooperative two-person games

49
52
66
72

Two-person zero-sum games of strategy

Representing zero-sum games

Games with saddle points
Games with no saddle points
Large matrices generally
Interval and ordinal scales for pay-oVs

78
80
86
90
93

Two-person mixed-motive games of strategy

Representing mixed-motive games and the Nash equilibrium

Mixed-motive games without single equilibrium points:
archetype 1 leadership games
Mixed-motive games without single equilibrium points:
archetype 2 heroic games
Mixed-motive games without single equilibrium points:
archetype 3 exploitation games
Mixed-motive games without single equilibrium points:
archetype 4 martyrdom games
Summary of features of mixed-motive prototypes
The Cournot, von Stackelberg and Bertrand duopolies:
an interesting application of mixed-motive games
Solving games without Nash equilibrium points using mixed
strategies

99
102
104
105
107
113
115
129

vii

Contents

Repeated games

135

InWnitely repeated games

Finitely repeated games

135
139

Multi-person games, coalitions and power

149

Non-cooperative multi-person games

Mixed-motive multi-person games
Partially cooperative multi-person games
Indices of power: measuring inXuence

150
151
153
155

A critique of game theory

174

Rationality
Indeterminacy
Inconsistency
Conclusion

174
177
178
180

Appendix A Proof of the minimax theorem

Appendix B Proof of Bayess theorem

182
190

Bibiliography
Index

192
199

MMMM

Preface

And, greatest dread of all, the dread of games!

John Betjeman 19061984 Summoned by Bells

Game theory is the science of strategic decision making. It is a powerful

tool in understanding the relationships that are made and broken in
the course of competition and cooperation. It is not a panacea for the
shortcomings of bad management. For managers, or those who interact with management, it is simply an alternative perspective with which
to view the process of problem solving. It is a tool, which, like all others,
is best used by those who reXect on their own practice as a mechanism
for improvement. Chance favours a prepared mind and this book is
intended as much for those who are seeking eVectiveness as for those
who have already found it.
Game theory has been used to great eVect in sciences as diverse as
evolutionary biology and economics, so books on the subject abound.
They vary from the esoteric to the populist; from the pedantic to the
frivolous. This book is diVerent in a number of ways. It is designed for
both students and practitioners. It is theoretical insofar as it provides
an introduction to the science and mathematics of game theory; and
practical in that it oVers a praxis of that theory to illustrate the
resolution of problems common to management in both the commercial and the not-for-proWt sectors.
The book is intended to help managers in a number of ways:
To expand the conceptual framework within which managers operate and in doing so, encourage them to develop more powerful
generic problem-solving skills.
To resolve practical diYculties as and when they occur, more eYciently and with increased eVectiveness.
ix

Preface

To Wnd new solutions to familiar problems that have not been

satisfactorily resolved, by giving practitioners a deeper understanding of the nature of incentives, conXict, bargaining, decision making
and cooperation.
To oVer an alternative perspective on problems, both old and new,
which may or may not yield solutions, but which at worst, will lead
to an increased understanding of the objective nature of strategic
decision making.
To help managers understand the nature of power in multi-person
systems and thereby reduce the perception of disenfranchisement
among those who work in committee-like structures within
organisations.
The book is a self-contained, though by no means exhaustive, study
of game theory. It is primarily intended for those who work as managers, but not exclusively so. Students of politics, economics, management science, psychology and education may Wnd the approach used
here more accessible than the usual format of books on the subject. No
great mathematical prowess is required beyond a familiarity with
elementary calculus and algebra in two variables.
Game theory, by its very nature, oVers a rational perspective and, in
a society that has developed an aversion to such things, this will be
suYcient reason for some to criticise it. This is as unfortunate as it is
short-sighted. Research suggests that good managers are well informed,
multi-skilled and Xexible in their approach to problem solving.
Organisations themselves are increasingly complex places, which can
no longer aVord to live in isolation from the expectations of their
employees or the wider community. More than ever, they are workplaces where managers must continuously balance opposing forces.
The resulting tensions are ever-changing, and know-how, mathematical or otherwise, is often what separates a failing manager from a
successful one.
It has been said, by way of an excuse for curtailing knowledge, that a
person with two watches never knows what time it is! Unfortunately,
managers cannot aVord such blinkered luxury. Game theory has clearly
been successful in describing, at least in part, what it is to be a decision
maker today and this book is for those who are willing to risk knowing
more.

Introduction

Man is a gaming animal. He must always be trying to get the better in something or other.
Charles Lamb 17751834 Essays of Elia

Game theory is the theory of independent and interdependent decision

making. It is concerned with decision making in organisations where
the outcome depends on the decisions of two or more autonomous
players, one of which may be nature itself, and where no single decision
maker has full control over the outcomes. Obviously, games like chess
and bridge fall within the ambit of game theory, but so do many other
social situations which are not commonly regarded as games in the
everyday sense of the word.
Classical models fail to deal with interdependent decision making
because they treat players as inanimate subjects. They are cause and
eVect models that neglect the fact that people make decisions that are
consciously inXuenced by what others decide. A game theory model,
on the other hand, is constructed around the strategic choices available
to players, where the preferred outcomes are clearly deWned and
known.
Consider the following situation. Two cyclists are going in opposite
directions along a narrow path. They are due to collide and it is in both
their interests to avoid such a collision. Each has three strategies: move
to the right; move to the left; or maintain direction. Obviously, the
outcome depends on the decisions of both cyclists and their interests
coincide exactly. This is a fully cooperative game and the players need to
signal their intentions to one other.
However, sometimes the interests of players can be completely
opposed. Say, for example, that a number of retail outlets are each
1

Introduction

vying for business from a common Wnite catchment area. Each has to
decide whether or not to reduce prices, without knowing what the
others have decided. Assuming that turnover increases when prices are
dropped, various strategic combinations result in gains or losses for
some of the retailers, but if one retailer gains customers, another must
lose them. So this is a zero-sum non-cooperative game and unlike
cooperative games, players need to conceal their intentions from each
other.
A third category of game represents situations where the interests of
players are partly opposed and partly coincident. Say, for example, the
teachers union at a school is threatening not to participate in parents
evenings unless management rescinds the redundancy notice of a
long-serving colleague. Management refuses. The union now complicates the game by additionally threatening not to cooperate with
preparations for government inspection, if their demands are not met.
Management has a choice between conceding and refusing, and whichever option it selects, the union has four choices: to resume both
normal work practices; to participate in parents evenings only; to
participate in preparations for the inspection only; or not to resume
participation in either. Only one of the possible strategic combinations
leads to a satisfactory outcome from the managements point of view
management refusing to meet the unions demands notwithstanding
the resumption of normal work although clearly some outcomes are
worse than others. Both players (management and union) prefer some
outcomes to others. For example, both would rather see a resumption
of participation in parents evenings since staV live in the community
and enrolment depends on it than not to resume participation in
either. So the players interests are simultaneously opposed and coincident. This is an example of a mixed-motive game.
Game theory aims to Wnd optimal solutions to situations of conXict
and cooperation such as those outlined above, under the assumption
that players are instrumentally rational and act in their own best
interests. In some cases, solutions can be found. In others, although
formal attempts at a solution may fail, the analytical synthesis itself can
illuminate diVerent facets of the problem. Either way, game theory
oVers an interesting perspective on the nature of strategic selection in
both familiar and unusual circumstances.
The assumption of rationality can be justiWed on a number of levels.

Terminology

At its most basic level, it can be argued that players behave rationally by
instinct, although experience suggests that this is not always the case,
since decision makers frequently adopt simplistic algorithms which
lead to sub-optimal solutions.
Secondly, it can be argued that there is a kind of natural selection at
work which inclines a group of decisions towards the rational and
optimal. In business, for example, organisations that select sub-optimal
strategies eventually shut down in the face of competition from optimising organisations. Thus, successive generations of decisions are
increasingly rational, though the extent to which this competitive
evolution transfers to not-for-proWt sectors like education and the
public services, is unclear.
Finally, it has been suggested that the assumption of rationality that
underpins game theory is not an attempt to describe how players
actually make decisions, but merely that they behave as if they were not
irrational (Friedman, 1953). All theories and models are, by deWnition,
simpliWcations and should not be dismissed simply because they fail to
represent all realistic possibilities. A model should only be discarded if
its predictions are false or useless, and game theoretic models are
neither. Indeed, as with scientiWc theories, minor departures from full
realism can often lead to a greater understanding of the issues (Romp,
1997).

Terminology
Game theory represents an abstract model of decision making, not the
social reality of decision making itself. Therefore, while game theory
ensures that a result follows logically from a model, it cannot ensure
that the result itself represents reality, except in so far as the model is an
accurate one. To describe this model accurately requires practitioners
to share a common language which, to the uninitiated, might seem
excessively technical. This is unavoidable. Since game theory represents
the interface of mathematics and management, it must of necessity
adopt a terminology that is familiar to both.
The basic constituents of any game are its participating, autonomous
decision makers, called players. Players may be individual persons,
organisations or, in some cases, nature itself. When nature is desig-

Introduction

nated as one of the players, it is assumed that it moves without favour

and according to the laws of chance. In the terminology of game
theory, nature is not counted as one of the players. So, for example,
when a deck of cards is shuZed prior to a game of solitaire, nature the
second player is making the Wrst move in what is a one-player game.
This is intrinsically diVerent from chess for example, where nature
takes no part initially or subsequently.
A game must have two or more players, one of which may be nature.
The total number of players may be large, but must be Wnite and must
be known. Each player must have more than one choice, because a
player with only one way of selecting can have no strategy and therefore
cannot alter the outcome of a game.
An outcome is the result of a complete set of strategic selections by all
the players in a game and it is assumed that players have consistent
preferences among the possibilities. Furthermore, it is assumed that
individuals are capable of arranging these possible outcomes in some
order of preference. If a player is indiVerent to the diVerence between
two or more outcomes, then those outcomes are assigned equal rank.
Based on this order of preference, it is possible to assign numeric
pay-oVs to all possible outcomes. In some games, an ordinal scale is
suYcient, but in others, it is necessary to have interval scales where
preferences are set out in proportional terms. For example, a pay-oV of
six should be three times more desirable than a pay-oV of two.
A pure strategy for a player is a campaign plan for the entire game,
stipulating in advance what the player will do in response to every
eventuality. If a player selects a strategy without knowing which strategies were chosen by the other players, then the players pure strategies
are simply equivalent to his or her choices. If, on the other hand, a
players strategy is selected subsequent to those of other players and
knowing what they were, then there will be more pure strategies than
choices. For example, in the case of the union dispute cited above,
management has two choices and two pure strategies: concede or
refuse. However, the unions strategic selection is made after managements strategic selection and in full knowledge of it, so their pure
strategies are advance statements of what the union will select in
response to each of managements selections. Consequently, although
the union has only four choices (to resume both practices; to participate in parents evenings only; to participate in preparations for gov-

Terminology

Table 1.1 The unions pure strategies

Then the union will . . .

And if
management
chooses to . . .

Then the union will . . .

GAME THEORY

Games of skill

Games of chance

Games of strategy

Games involving
risk
Games involving
uncertainty

Two-person

Cooperative

Purely
cooperative

Mixed- motive

Minimal
social
situation

Zero-sum

Infinite

Multi-person

Coalitions not permitted

Finite

Non-cooperative

Cooperative

Perfect info
Imperfect info

Symmetric
games

No saddle

Saddle

Have no optimal
equilibrium points

Shapley
Snow

Non-essential
coalitions

Power
indices

Mixed strategy

Have optimal
equilibrium points

Essential
coalitions

Dominance &
admissability

Shapley

ShapleyShubik
Leadership
Johnston
Duopoly
models

Heroic
DeeganPackel
Exploitation
Repeated games

Banzhaf

Martyrdom

Figure 1.1

A taxonomy of games.

Games in which the interests of players are neither fully conXicting

nor fully coincident are called mixed-motive games of strategy.
Of the three categories, this last one represents most realistically the
intricacies of social interaction and interdependent decision making
and most game theory is concentrated on it.

Introduction

franchise not sincerely, according to their true preferences, but tactically, to bring about a preferred outcome. Thus the concept of strategic
voting was born. Following publication of a simpliWed version nine
years later (McKelvey & Niemi, 1978), it became an essential part of
political theory.
After that, game theory expanded dramatically. Important centres of
research were established in many countries and at many universities.
It was successfully applied to many new Welds, most notably evolutionary biology (Maynard Smith, 1982; Selten, 1980) and computer
science, where system failures are modelled as competing players in a
destructive game designed to model worst-case scenarios.
Most recently, game theory has also undergone a renaissance as a
result of its expansion into management theory, and the increased
importance and accessibility of economics in what Alain Touraine
(1969) termed the post-industrial era. However, such progress is not
without its dangers. Ever more complex applications inspire ever more
complex mathematics as a shortcut for those with the skill and knowledge to use it. The consequent threat to game theory is that the
fundamentals are lost to all but the most competent and conWdent
theoreticians. This would be a needless sacriWce because game theory,
while undeniably mathematical, is essentially capable of being understood and applied by those with no more than secondary school
mathematics. In a very modest way, this book attempts to do just that,
while oVering a glimpse of the mathematical wonderland beyond for
those with the inclination to explore it.

Layout
The book basically follows the same pattern as the taxonomy of games
laid out in Figure 1.1. Chapter 2 describes games of skill and the
solution of linear programming and optimisation problems using
diVerential calculus and the Lagrange method of partial derivatives. In
doing so, it describes the concepts of utility functions, constraint sets,
local optima and the use of second derivatives.
Chapter 3 describes games of chance in terms of basic probability
theory. Concepts such as those of sample space, random variable and
distribution function are developed from Wrst principles and

Layout

explained. Games involving risk are diVerentiated from those involving

uncertainty, using principles such as maximin and the von Neumann
utility function. Organisational characteristics such as risk aversion,
risk neutrality and risk taking are also considered.
Chapter 4 digresses from the typology of games to consider sequential and simultaneous decision making. Standard means of representing sequential decision making, like directed graphs and trees, are
discussed and examples are used to illustrate techniques such as the
method of backward induction and optimal sub-paths, for both singleplayer and multi-player games. A sub-section considers the common
but interesting case of single-player games involving uncertainty, the
notions of a priori and a posteriori probability and Bayess formula.
The chapter Wnishes by considering brieXy the category of games
known as two-person cooperative games and the minimal social situation.
The remaining chapters consider games of strategy. Chapter 5 considers two-person zero-sum games of strategy. Games with saddle
points are discussed in terms of the principles of dominance and
inadmissibility, and games without saddle points are solved using
mixed strategies. The solution of large matrices is considered using the
notion of embeddedness and examples of interval and ordinal scales
are shown to be adjustable using linear transformations.
Chapter 6 considers two-person mixed-motive games of strategy
and how to represent them. The famous prisoners dilemma game and
its suggested solution in metagame theory is discussed along with three
other categories of mixed-motive games without unique equilibrium
points: leadership games; heroic games; and exploitation games. The
Cournot, von Stackelberg and Bertrand duopoly models are fully
explored, as is the solution of games without Nash equilibrium points.
Chapter 7 examines how repeated dynamic games can be analysed
and how repetition itself aVects outcome. Finitely and inWnitely repeated games are considered, illustrated by example, and developed in the
context of important concepts such as credibility, threat and discounting future pay-oVs. The paradox of backward induction is also described and four theoretical methods of avoiding it are discussed.
Chapter 8 describes multi-player cooperative, non-cooperative and
mixed-motive games of strategy, coalitions and the real distribution of
power among voting factions on committees. Measurements of voting

Introduction

strength such as the Shapely value, the ShapleyShubik, Johnston,

DeeganPackel and Banzhaf indices are described and an extended
real-life example is fully explored, with some interesting results.
Finally, Chapter 9 presents a brief critique of game theory, considering the problems of rationality, indeterminacy and inconsistency, and
the future role of game theory in a learning society.
Nearly 100 illustrations and 40 worked examples hopefully make this
book accessible, even for those without formal mathematical training.
The examples are all drawn from commonplace situations and are
intended to illustrate the fundamental theoretical precepts upon which
problems and conXicts are resolved, rather than the complicated reality
of everyday decision making. Thus, some of the examples may appear
over-simpliWed or a triXe contrived, but better that than the principles
become obfuscated by detail, no matter how realistic. In addition, some
examples are worked and then re-worked under slightly diVerent
conditions in order to present a coherent progressive study. This has
the intended merit of demonstrating how subtle changes in circumstance can result in signiWcant diVerences in outcome, but it has the
unfortunate side eVect that readers who miss the point initially, become even more confused as the story unfolds. Great care has been
taken to explain these examples in the simplest terms especially in the
initial workings to avoid the likelihood of this happening. Hopefully,
the strategy has paid oV and the readers enjoyment of the book will not
be curtailed by the necessity to be diligent.

Games of skill

It is not from the benevolence of the butcher, the brewer, or the baker, that we expect our
dinner, but from their regard to their own interest.
Adam Smith 1789 The Wealth of Nations

Games of skill are one-player games. Since they do not involve any
other player, and when they involve nature it is under the condition of
certainty, they are not really regarded as genuine games. Nature does
not constitute a genuine second player, as in the case of games of
chance, because nothing nature does aVects the outcomes of the
players choices. The solitary player in games of skill knows for certain
what the outcome of any choice will be. The player completely controls
the outcomes. Solving a crossword puzzle is a game of skill, but playing
golf is not, since the choices that the player makes do not lead to
outcomes that are perfectly predictable. Golf is a game of chance
involving uncertainty, although some would call it a form of moral
eVort! Nature inXuences the outcomes to an extent which depends on
the players skill, but the probability of which is not known.
The operation of single-player decision making is discussed in the
following sections. The problem of linear programming and optimisation, where a player wishes to optimise some utility function within a
set of constraints, is considered with the help of some realistic
examples. The application of some basic concepts from calculus, including the Lagrange method of partial derivatives, is also discussed.

Games of skill

Linear programming, optimisation and basic calculus

The branch of mathematics known as linear programming or optimisation is devoted to games of skill. Typically, in linear programming, the
player wishes to maximise output or minimise input, given by a utility
function, from a set of alternatives, , called the constraint set. The
player also needs to devise some criteria for ranking the alternatives in
order of preference, represented by a real function:
:;R
so that + can be chosen such that () is maximised or minimised,
in which case is known as the optimiser or maximiser.
Since optimisation involves Wnding the local maxima and local
minima of functions (collectively called optima), diVerential calculus is
often the instrument of choice for solving problems.
The derivative of a function (x), denoted by '(x), expresses the rate
of change of the dependent variable (y) with respect to the independent
variable (x). Graphically then, '(x) represents the gradient of the
tangent to a curve at a particular point.
As can be seen on both Figures 2.1 and 2.2, the gradient of a tangent
is zero at a maximum and a minimum. This gives us a Wrst-order test
for local optima.
If a p b and '(p) : 0, then:
If '(a) 0 and '(b) 0, then p is a local maximum;
If '(a) 0 and '(b) 0, then p is a local minimum.
The second derivative of a function, denoted by "(x), is the derivative
of the derivative. Clearly, if the Wrst derivative changes from positive,
through zero, to negative (so that p is a local maximum), then its rate of
change is decreasing. Conversely, if the Wrst derivative changes from
negative, through zero, to positive (so that p is a local minimum), then
its rate of change is increasing. This gives us a second-order test for
functions. It amounts to the same thing as the Wrst-order test above,
but is quicker.
If "(p) 0, then p is a local maximum;
If "(p) 0, then p is a local minimum.

Linear programming, optimisation and basic calculus

'(p) = 0
'(a) > 0

b
'(b) < 0

y = x)

x
Figure 2.1

A function with a local maximum.

y = x)

'(a) < 0

'(b) > 0

b
p

'(p) = 0

x
Figure 2.2

A function with a local minimum.

The following examples illustrate how the techniques are used in

practice. Sometimes calculus is needed (Examples 2.2 and 2.3) and
sometimes not (Examples 2.1 and 2.4).

Games of skill

Example 2.1 Hospital in-patient and out-patient facilities

A hospital has raised a building fund of 480 000 with which it plans to convert
one of its old nurses residences to cater for the increased numbers using the
hospital.
The hospital caters for both in-patients and out-patients. Each in-patient
facility (bedroom, emergency, washing and catering facilities) costs 12 000 to
install. Each out-patient facility costs half that. The Hospital Trust governors
want to plan the renovation so that it maximises fee income. What is the optimal
balance between in-patient and out-patient facilities, and what are the implications for setting the level of fees charged to local medical practices? The Wre
safety and planning authorities have imposed an overall limit of 60 patients at
any one time on the new facility.

Let x represent the number of out-patients accepted in the renovated

house. Let y represent the number of in-patients accepted.
The Wnancial constraint imposed on the renovation can thus be
expressed as:
6000x ; 12 000y O 480 000
which reduces to:
x ; 2y O 800
The constraint imposed by the planning authority can be expressed as:
x ; y O 60
Common sense dictates that negative patients are impossible (at least
in the mathematical sense!), so:
x P 0 and y P 0
Figure 2.3 is a graphic representation of the constraint set, .
Clearly, there are three possible solutions. Either the hospital plans
for 40 in-patients only; or for 60 out-patients only; or for 40 outpatients and 20 in-patients.
If t is the proWt per month for each out-patient and n is the proWt per
month for each in-patient, Figure 2.4 shows the theoretical pay-oVs for
each of the three strategies.
A little algebra reveals that:
If t n, then the optimal strategy is to cater for 60 out-patients only.

Linear programming, optimisation and basic calculus

(40, 20)

x
60

Figure 2.3

The constraint set for the conversion of a nurses residence for in-patient and out-patient use.

Pay-off
(Profit)
tx + ny

Strategy
(Patient mix)

Figure 2.4

(0, 40)

40ny

(40, 20)

40tx + 20ny

(60, 0)

60tx

Pay-off matrix for the conversion of a nurses residence for in-patient and out-patient use.

If n t, then the optimal strategy is to cater for 40 out-patients and

20 in-patients.
If n 2t, then the optimal strategy is to cater for 40 in-patients only.
Figure 2.5 puts some notional numeric values on each of these three
strategies, to illustrate these features.

Games of skill

Profits per
month

Patient
mix

Figure 2.5

If t = 1600
and n = 1500

If t = 1400
and n = 1700

If t = 1000
and n = 2100

(0, 40)

60 000

68 000

84 000

(40, 20)

94 000

90 000

82 000

(60, 0)

96 000

84000

60 000

Sample numerical prots for the conversion of a nurses residence for in-patient and out-patient use.

Example 2.2 Fundraising

The Royal Ballet at Covent Garden, London, wishes to raise money by holding a
series of popular short matinee performances at lunchtime. The previous year,
the Royal Ballet sold similar tickets for 10 each and 300 patrons attended on
each of four afternoons. Experience from theatre impressarios on the Arts
Council advisory board suggests that every 1 added to the price of admission
results in 50 fewer people attending each afternoon; and every 1 reduction
results in an increase of 50 people. If the theatre can hold anything up to 500
people, how should the ballet company pitch its pricing so as to maximise
revenue?

If (10 ; x) represents the price of each ticket, the revenue per

matinee for each of four afternoons is given by the equation:
R1 : (300 9 50x)(10 ; x)
which expands to:
R1 : 3000 9 200x 9 50x 2
Attendance is subject to the constraint:
300 9 50x 500
which reduces to:
x 94

Linear programming, optimisation and basic calculus

The Royal Ballet seeks to maximise the R1 revenue expression, so

equating the Wrst derivative to zero reveals any stationary point. Thus:
R1' :9200 9 100x
has the solution:
x :92
The second derivative conWrms that this single stationary point is a
maximum, since:
R1" :9100 is less than zero.
So the Royal Ballet should charge 8 per ticket, in which event 400
people will attend each matinee, resulting in maximum revenue of
3200 per afternoon.
Figure 2.6 is a graphic representation of the revenue function, R1(x).
It can be shown that the same pay-oV would have resulted if the
problem had been calculated over four shows, since:
R4 : (1200 9 200x)(10 ; x)
has the same derivative as R1.
Example 2.3 Balancing full-time and part-time staff
A call centre in Ireland supporting IBMs voice recognition software package,
ViaVoice, has a staYng schedule which requires 680 hours per week cover time
(5 days per week; 8 hours per day; 17 lines). The staV comprises both full-time
and part-time employees. The former have 20 hours per week (maximum)
on-line contact, while the latter have 8 hours (maximum), and the initial staYng
allocation from head oYce is 40 full-time equivalents. The centre currently has
30 full-time permanent employees, the minimum number required under
Irelands tax-free employment incentive scheme.
Yearly staV on-costs, which are not pro rata with the number of hours
worked, are 40 per week for full-time staV and 14 per week for part-time staV.
Naturally, the company wishes to minimise this overhead.

Let x represent the number of full-time staV employed at the call centre.
Let y represent the number of part-time staV.
Clearly,
x P 30 and y P 0

Games of skill

R (x)

.
.

3200

3000

x = +4
2800

.
x

Figure 2.6

Revenue function for Royal Ballet fundraising.

and since each part-time worker (on maximum hours) is 8/20 of a

full-time equivalent:
x ; 2/5y O 40
Also, the minimum number of hours required per week imposes the
following constraint:
20x ; 8y P 680
Figure 2.7 represents the constraint set, , and Figure 2.8 is a tabulation
of the pay-oVs for each of the four possible strategies. It can be seen that
the combination of 30 full-time and 10 part-time staV minimises the
overheads.

Linear programming, optimisation and basic calculus

x > 30
(30, 25)

20x + 8y > 680

(30, 10)

x
34

x + 2/5y < 40
Figure 2.7

The constraint set for balancing full-time and part-time staff.

Cost
40x + 14y

Strategy

Figure 2.8

( 34, 0 )

1360

( 30, 10 )

1340

( 30, 25 )

1550

( 40, 0 )

1600

Pay-offs for balancing full-time and part-time staff.

Games of skill

Example 2.4 Examination success and time given to tutoring

KPMG (UK), the British subsidiary of the worldwide accountancy and business
services Wrm, has been analysing the examination results (for chartered status) of
its trainees over a number of years in relation to national trends and has found
that the number of hours of direct tutoring is one of the determinants of how
well students do. Up to a certain point, overall results (as measured by the
number of students achieving distinction grades) improve as more timetabled
instruction is given, but after that, results decline as the students practical
experience diminishes.
The relationship between the number of hours timetabled per week for direct
instruction (h) and the percentage by which the results are below the companys
international benchmark (r), which is complicated by other variables such as age
(a) and the percentage of students without a prior qualiWcation in a numerate
discipline (n), is found to be given by the equation:
r:

n[(h 9 4)1/2 9 h]
a

The company wishes to determine the optimal number of teaching hours for its
trainees so as to maximise overall results, subject to a minimum requirement of
4 hours per week imposed by the Institute of Chartered Accountants code of
practice.

Holding n and a constant, the derivative of r with respect to h is:

1
r'(h) : n/a[ (h 9 4)91/2 9 1]
2
Solving the equation r'(h) : 0 gives:
1
(h 9 4)91/2 : 1
2
1
(h 9 4)1/2 :
2
h : 4.25
Putting this result back into the equation for r gives:
r(h) : n/a[(0.25)1/2 9 4.25]
:93.75n/a
In other words, the companys results are maximised at [3.75n/a]

The Lagrange method of partial derivatives

h
4

3.5

(4.25, 3.75)

4.0

4.5

5.0
r (h )
5.5

Figure 2.9

A graphic representation of the solution to the problem of examination success and the time given to
direct tutoring.

below the benchmark when the number of hours per week given to
direct tutoring is 4.25.
A graphic representation of the above function and its solution can
be found on Figure 2.9, where the constant quotient n/a is normalised
to unity for convenience.

The Lagrange method of partial derivatives

Some problems of optimisation, where the function to be optimised is
a function of two variables, require a technique known as Lagranges
method of partial derivatives to solve them. The basic steps in the
Lagrange method are as follows.
Suppose there are two functions, (x, y) and g(x, y), and a new
function called the Lagrangian function, , is deWned by the equation

Games of skill

(x, y, ) : (x, y) ; [c 9 g(x, y)]

where c is a constant.
The solution to the optimisation problem occurs when all partial
derivatives of the Lagrangian are zero, which is analagous to the
Wrst-order test for stationary points mentioned already:

:
:
:0
y

x
In other words, the following equations must be solved for x, y and :

g
:
x
x

g
:
y
y
g(x, y) : c
in which case all the values which produce maxima and minima for
(x, y), subject to g(x, y) : c, will be contained in the solution set.
The following example illustrates the technique.
Example 2.5 Funding research and design
The German car manufacturer, BMW, allocates budgets internally to departmental teams on the basis of funding units for materials (M) of 25 each and
production time units (T ) of 60 each. The Creative Design team requires a mix
of material and time units to produce an acceptable standard of project work for
modelling and display.
The relationship between the number of projects modelled by the design team
(S) and funding units was studied over a number of years and found to be
directly proportional in the case of production time units the more production
time designers got, the greater the variety of project work produced (see Figure
2.10); and proportional to the square root of materials funding greater funding
produced greater output, but less and less so as funding increased (see Figure
2.11). Both these relationships are encapsulated in the formula:
S : 20TM1/2
Production time costs the company 40 per hour plus 50% on-costs

The Lagrange method of partial derivatives

(insurance, superannuation, etc.), so each time unit is valued at 60. Each

materials funding unit is valued at 25 and the Creative Design team is expected
to vary the balance between the two types of funding so as to maximise project
output while minimising expenditure for an expected 380 projects per week.

Let m : the number of materials funding units per week per project.
Let t : the number of time units per week per project.
Let s : the expected number of units of output : 380.
Let c : the cost of the projects per week.
Clearly,
c : 25m ; 60t
and the department wishes to minimise this equation subject to the
constraint
20tm1/2 : 380
The following three Lagrange equations must be solved:
s
c
:
m
m
c
s
:
t
t
20tm1/2 : 380
The partial derivatives are:
c
: 25
m
c
: 60
t
s
10t
:
m m 1/2
s
: 20m 1/2
t

Games of skill

ST
S

T
Figure 2.10

Relationship between modelling output and time units in the Creative Design department.

Substitution into the three Lagrange equations produces the following

set of equations:
5:

2t
m 1/2

3 : m 1/2
20tm 1/2 : 380
From the second Lagrange equation,
:

3
m 1/2

Putting this into the Wrst equation gives:

5m
6

and substituting this for t in the third Lagrange equation gives the Wrst
solution:
m : 8.04
Therefore, from the equation above,
t : 6.7

The Lagrange method of partial derivatives

1/2

M
Figure 2.11

Relationship between modelling output and materials units in the Creative Design department.

and the cost per project per week is

c : 25(8.04) ; 60(6.7)
c : 603
So to produce a project model, the design team should organise
material expenditure of approximately 200 per week and approximately 6 hours 40 minutes per week of individual project time by staV.

Games of chance

Chaos umpire sits, And by decision more embroils the fray By which he reigns; next him high
arbiter Chance governs all.
John Milton 16081674 Paradise Lost

Games of chance are one-player games against nature, but ones in

which the single player is not making decisions under the conditions of
certainty. In other words, nature aVects the outcomes resulting from
the players choices in an unpredictable way. Games of chance either
involve risk, where the probability of natures response is known; or
involve uncertainty, where the probability of natures response is not
known.
Those who seek to understand games of risk fully cannot but beneWt
from some knowledge of the concepts which underpin probability
theory. It is not strictly necessary, but it is desirable. There are many
outstanding texts on probability theory for readers wishing to deepen
their understanding of gaming in its more esoteric forms, but the
following synopsis should be suYcient for the average non-specialist
reader to understand the link between game theory and the probabilistic notions of distribution function and expected value.
The following sections describe some of the underlying concepts of
probability theory probability spaces, distribution functions, random
variables and expected value as a prelude to discussing games of
chance involving risk. Subsequently, utility value and games of chance
involving uncertainty are considered along with the various minimax
strategies used for their solution.

An introduction to basic probability theory

Figure 3.1

P{si } = pi

A sample space and probability of an event.

An introduction to basic probability theory

The fundamental concept in probability theory is the notion of a
probability space. If S is a set of events called the sample space,
S : s1, s2, s3, . . . , si, . . . , sn
and P is a function that assigns to every subset si a real number
Psi : pi, such that:
0 O pi O 1
then the ordered pair (S, P) is called the probability space and Psi : pi
is called the probability of si (see Figure 3.1).
The probability function, P(s), has the following six properties:
pi P 0, i
pi : 1, from i : 1 to n
If A 1 S, then PA : pi, for all events si + A
P : 0
PS : 1
If the sample space, S, is inWnite, then P(s) is deWned only on some
subset of S.
A number of other concepts Xow from those given above, among
them the idea of a random variable and its distribution function. A
random variable, X, is a function mapping S onto the real numbers.

Games of chance

Typically, X maps an event s onto a real number X(s), where s + S and

X(s) + R. Sometimes si is already a real number, but other times it is
not, in which case the random variable assigns one to it.
The distribution function of the random variable X is the function,
F(x), which maps the real numbers onto itself, such that:
F(x) : P(X), where X O x, x + R
In other words, the distribution function is the operation which transforms random variables into probabilities and it is the concept at the
heart of many game theory solutions, as the following example illustrates.

Example 3.1
The six faces of a die have colours rather than the usual numbers. The sample
space is:
S : red, orange, yellow, green, blue, white
The die is fair, so the probability of each outcome, Psi : pi : 1/6.

The random variable, X, assigns to each face of the die a real number. In
other words, say:
X(red) : 1, X(orange) : 2, X(yellow) : 3
X(green) : 4, X(blue) : 5, X(white) : 6
The distribution function, F, is now the function that turns each
random variable into a probability. In this example,
F(x) : P(X) :

Z(si)
6

where Z(si) is the number of integers less than or equal to the integer
representing X(si). So the obvious values for this distribution function
are:
F[x(r)] : 1/6, F[x(o)] : 1/3, F[x(y)] : 1/2
F[x(g)] : 2/3, F[x(b)] : 5/6, F[x(w)] : 1
and its graph can be seen on Figure 3.2

An introduction to basic probability theory

y = F (x)
1

0.5

x
1
Figure 3.2

The graph of the distribution function of the random variable in Example 3.1.

This step graph is typical of the graphs of distribution functions, all

of which have the following properties:
0 F(x) 1, x + R
F(x) is increasing
limx;9- F(x) : 0, i.e. the x-axis is a barrier
limx;;- F(x) : 1, i.e. the line y : 1 is a barrier
P(X) : p(x) : F(b) 9 F(a) if a X O b
P(X : a) is the jump in the distribution function F(x) at x : a
Expected value, variance and standard deviation

Three important measurements, commonly used in statistics, are associated with random variables and their distribution functions expected value, variance and standard deviation. In particular, expected
value is the basis for solving many games of chance.
The expected value, E(X), of the random variable X is the real
number:
E(X) : pi xi, from i : 1 to n, where xi : X(si)
In Example 3.1 above,
E(X) : 1/6(1) ; 1/6(2) ;1/6(3) ;1/6(4) ;1/6(5) ;1/6(6)
:3.5

Games of chance

The variance, V(X), of the random variable X is the real number:

V(X) : pi[xi 9 E(X)]2, from i : 1 to n, where xi : X(si)
Obviously, V(X) is always positive.
In Example 3.1 above,
E(X) : 1/6[1 9 3.5]2 ; 1/6[2 9 3.5]2 ; 1/6[3 9 3.5]2
; 1/6[4 9 3.5]2 ; 1/6[5 9 3.5]2 ; 1/6[6 9 3.5]2
: 2.917
The standard deviation, (X), of the random variable X is the real
number:
(X) : [V(X)]1/2
By convention, the positive square root is taken, so (X) P 0.
In Example 3.1 above,
(X) : 2.9171/2
: 1.708
For most practical game theory applications, the distribution function is usually continuous and provides enough information for a
solution. The sample space and random variable are rarely used. In the
case where the distribution function, F(X), is continuous, it can be
deWned in terms of its density function, (x), such that:
x

F(x) :
(x) dx, x + R
9-

The density function allows probabilities to be calculated using integral

calculus, as:
b

P(X) : F(x) :
(x) dx
a

: F(b) 9 F(a), if a X O b
Using this new (continuous function) notation, the expected value and
variance are given by the formulae:
;-

E(X) :
x (x) dx
9-

and

Games of chance involving risk

V(X) :
[x 9 E(X)]2 (x) dx
9-

Games of chance involving risk

A game of chance involving risk is a game in which a single player is
opposed by nature. The player can assign a probability to each of
natures moves and therefore to the various outcomes of his or her own
actions. Games of risk can be solved using the concept of expected
value discussed above, as the following example illustrates.
Example 3.2 Funding a new high-speed rail link
Railtrack, the company that owns and manages Britains rail structure, must
accept funding for Section 2 of the new high-speed rail link from the Channel
Tunnel to London (St Pancras), from one of the following bodies:
A structural fund of the European Union (EU) makes 20 grants annually of
varying amounts: three grants of 4000m; seven grants of 2000m; and ten
grants of 1000m. Each year, 50 applications are shortlisted for consideration
and the high-speed rail link proposal is one of them.
A Public Transport Reconstruction Fund (PTRF), supported by the UK
government and philanthropic donations from banks and businesses, makes
60 grants annually: ten grants of 2500m; 20 grants of 1500m; and 30 grants
of 750m. One hundred applications are shortlisted annually and the highspeed rail link proposal is one of them.
At this stage, the Railtrack company can only proceed with one application.
What are the expected values of the funding and which choice should the
company make?

The Wgures in Table 3.1 oVer the solution. The rail company should
apply for EU funding, since its expected value is 720m, compared with
690m from the Public Transport Reconstruction Fund.
However, simple and attractive though this technique may be, there
are a number of serious objections. In the Wrst place, not every decision
can be made on the basis of monetary value alone. If that were the case,
no one would ever take part in lotteries or go to casinos since, in the
long term, it is almost certain that one would lose. Even the notion of

Games of chance

Table 3.1 Pay-off matrix for a company applying for funding for a new high-speed rail link

Source of
grant
(i)

Amount of Number of Probability Probability of

grant ()a such grants of getting
getting this grant
(A)
(G)
any grant
(Q = P G/Gi)
(P)

Expected
value ()a
(E = Q A)

4000
2000
1000

3
7
10
20

0.40
0.40
0.40

0.40 3/20
0.40 7/20
0.40 10/20
0.40

240
280
200
720

PTRF

2500
1500
750

10
20
30
60

0.60
0.60
0.60

0.60 5/50
0.60 15/50
0.60 30/50
0.50

150
270
270
690

Values are in millions of pounds.

insurance would be redundant, since actuaries calculate insurance

premiums so as to make a proWt on average. There must be something
other than average monetary gain at stake when players play games.
For one thing, prudence appears to act as a counterbalance to
gambling everything on the chance of a huge gain and, furthermore,
there may be other than monetary values at stake in the game. For
example, political pressure may be brought to bear on a company to
deter it from accepting the solution with the maximum expected
monetary pay-oV. Or a companys creditors (numerous and inXuential
in the case of the Channel Tunnel) may demand the safest option,
irrespective of possible gain. Secondly, even in a game where the odds
of winning are slightly in the players favour, few people can aVord to
play the game long enough to take advantage of it. Thirdly, the size of
the outcome may be an important factor and size is a relative thing. In
Example 3.2, if the high-speed rail link costs 2 billion, applying to the
Public Transport Reconstruction Fund makes less sense than if the rail
link costs 1 billion.
Utility theory

Expected utility value is a better principle than expected value when it

comes to guiding players through games of risk. Utility theory assumes
that decisions are made on the basis of what the outcomes are worth to

Games of chance involving risk

the player, rather than their objective value, although of course, simple
pay-oV value sometimes equates to utility value. Moreover, both principles share some common ground. For example, a smaller amount can
never have a greater utility or pay-oV value than a larger amount.
The expected utility value of a choice c, U(c), for a continuous
distribution function, is deWned as:
U(c) : pi ui, from i : 1 to n
where ui is called the von NeumannMorgenstern utility function and
represents the players preferences among his or her expected values,
E(xi). In probabilistic terms, every decision involving risk is a lottery
whose pay-oV is the sum of the expected utility values, as the following
example demonstrates.
Example 3.3 The viability of computer training courses
The City & Guilds of London Institute, the UK training organisation, oVers a
range of education classes in autumn and spring for adults wishing to return to
work. On average, only one course in six actually runs; the others fail because of
insuYcient enrolment. The organisation is considering oVering a new Membership diploma course (Level 6) in Information Technology, which nets the
organisation 300 per capita in government capitation subsidies if it runs. At
present, the organisation oVers both Licentiateship (Level 4) and Graduateship
(Level 5) certiWcate courses. The former nets it 108 per capita and the latter nets
it 180 per capita.
Which course should the organisation oVer the new single diploma course
(D) or the established pair of certiWcate courses (C) if the organisation is: (i)
risk-neutral; (ii) averse to risk; (iii) risk-taking?

The relationship between an organisations expected utility value and

its expected pay-oV value can best be gauged by the manner in which it
tolerates risk. There are three possibilities:
(i) the organisation is risk-neutral, in which case the relationship is:
u(xi) . E(xi)
(ii) the organisation is risk-averse, in which case the relationship is
u(xi) . (E(xi)
(iii) the organisation is risk-taking, in which case the relationship is
u(xi) . E(xi)2

Games of chance

(i) The organisation is risk-neutral. The expected utility value for the
diploma option, U(D), is:
U(D) : 1/6(300) : 50
whereas the expected utility value for the double certiWcate option,
U(C), is:
U(C) : 1/6(180) ; 1/6(108) : 48
so the diploma option is marginally preferred.
(ii) The organisation is averse to risk. The expected utility value for the
diploma option, U(D), is:
U(D) : 1/6(300 : 2.887
whereas the expected utility value for the double certiWcate option,
U(C), is
U(C) : 1/6(180 ; 1/6(108 : 3.968
so the certiWcate option is preferred.
(iii) The organisation is willing to take risks. The expected utility value
for the diploma option, U(D), is:
U(D) : 1/6(300)2 : 15 000
whereas the expected utility value for the double certiWcate option,
U(C), is
U(C) : 1/6(180)2 ; 1/6(108)2 : 7344
so the diploma option is clearly preferred.

This example illustrates the point that the relationship between

simple expected pay-oV value and expected utility value is not necessarily linear. This agrees with most everyday experience. Doubling a
sum of money, for example, may not double its utility value. (In fact, it
was suggested by Bernoulli in the eighteenth century, that the most
common relationship between money and utility is probably logarithmic. The utility value increases in equal steps as the corresponding cash
value increases by equal proportions. For example, if 5 has a utility

Games of chance involving risk

u(x) E(x)
[ u(x ) = 0 ]

X
Figure 3.3

The von NeumannMorgenstern utility function and the expected value have a linear relationship.

value of 1 and 50 has a utility value of 2, then 500 has a utility value of
3.)
The results of Example 3.3 can be represented graphically and the
three categories of risk generalised to deWnitions, as follows:
If the von NeumannMorgenstern utility function, which represents
a players preferences among expected values, and the expected value
itself have a linear relationship, the player is said to be risk-neutral
(see Figure 3.3). In such a linear von NeumannMorgenstern utility
function, the player is essentially ranking the values of the game in
the same order as the expected values. This, by deWnition, is what it
means to be risk-neutral. (The values of the game in Example 3.3 are:
300 for the diploma option; and 144 for the certiWcate option,
being the average of 180 and 108.)
Notice that, for a linear function,
u"(x) : 0
If the von NeumannMorgenstern utility function is proportional to
any root of the expected value, the player is said to be risk-averse (see
Figure 3.4). Generally, risk-averse functions are of the form:
u(xi) . n'E(xi)
Notice that the derivative of a concave function such as this is clearly
decreasing, so:
u"(x) 0

Games of chance

u x

X
Figure 3.4

The von NeumannMorgenstern utility function is proportional to a root of the expected value.

u(x) E(x)n
[ u (x) < 0 ]

X
Figure 3.5

The von NeumannMorgenstern utility function is proportional to a power of the expected value.

If the von NeumannMorgenstern utility function is proportional to

any power of the expected value, the player is said to be risk-taking
(see Figure 3.5). Generally, risk-taking functions are of the form:
u(xi) . E(xi)n
and since the derivative of a convex function such as this is increasing,
u"(x) 0
A more complicated example of the usefulness of the von Neumann
Morgenstern utility function follows.

Games of chance involving risk

Example 3.4 A portfolio for investment over a waiting period

Coutts, the private bank, holds 100 000 on behalf of a client in a short-term
discretionary account. At the end of each year, some of the money is remitted to
the client for childrens school fees. For each 12-month period, the money is to
be divided in some way between a portfolio of shares and a Wxed-interest deposit
account currently yielding 6.7% p.a.
The share portfolio yields dividends and, it is hoped, capital appreciation. It
has been estimated that there is a 40% chance that the share portfolio will yield
2.4% and a 60% chance that it will yield 9.6%. How should the account manager
spread the investments so as to maximise the likely return, given that the client is
averse to risk on this account?

Let m represent the fraction of 100 000 invested in shares and (1 9 m)

the fraction of 100 000 invested in Wxed deposits, where 0 O m O 1.
There is a 40% chance of getting the following return:
106.7
102.4
; 100 000(1 9 m)
100
100
: 106 700 9 4300m

100 000m

There is a 60% chance of getting the following return:

106.7
109.6
; 100 000(1 9 m)
100
100
: 106 700 ; 2900m
100 000m

Assuming that the von NeumannMorgenstern utility function is

proportional to the square root of the expected value, the expected
utility value is:
U(x) : 0.4(106 700 9 4300m)1/2 ; 0.6(106 700 ; 2900m)1/2
: 4(1067 9 43m)1/2 ; 6(1067 ; 29m)1/2
Therefore,
U '(x) : 986(1067 9 43m)91/2 ; 87(1067 ; 29m)91/2
For a local maximum or minimum, U '(x) : 0, so:
86(1067 9 43m)91/2 : 87(1067 ; 29m)91/2
7396(1067 ; 29m) : 7569(1067 9 43m)
m : 0.342

Games of chance

So the account manager should invest (approximately) 34% in the

share portfolio and 66% in Wxed deposits.
The second derivative veriWes that U '(x) : 0 represents a local
maximum since:
U "(x) : 91849(1067 9 43m)93/2 9 1131(1067 ; 29m)93/2
and, clearly, if the case of the negative root of expected value is ignored,
U "(x) 0

Of course, any relationship between value and utility only makes

sense if the pay-oV is numerical which usually means monetary.
Although, in theory, it is possible to assign a utility value to any game
being played, there is no reason to assume that a relationship of any
kind deWnitely exists linear, logarithmic, root, power or anything else.
What is needed is an interval scale (a scale in which the units of
measurement and the Wxed points are arbitrarily, but proportionally
spaced) for solving games of risk that do not have numerical outcomes.
One theory, proposed by von Neumann and Morgenstern in 1944, is
based on the assumption that a player can express a preference not only
between outcomes, but also between any outcome and any lottery
involving another pair of outcomes. The upshot of this theory is that it
is possible to convert a players order of preference among outcomes
into numerical utility values. This is what makes gambling and insurance both rational games; the utility value of the gamble, involving the
probable loss of a small stake for the unlikely gain of a large prize, may
be positive, even if the average outcome (the cash value) over a long
period of time is negative. The von NeumannMorgenstern utility
theory thus assigns arbitrary utility values to each players least and
most preferred outcomes, like the Wxed points on a temperature scale.
The utility values of all the outcomes in between can then be determined as follows. If the player does not diVerentiate between two or
more outcomes, they are assigned the same utility value. Otherwise,
utility values can be assigned to each outcome by comparing each
outcome, like a yardstick, to a lottery involving the most and least
preferred. If the player does not distinguish between a lottery with a
known probability and an outcome, then that outcome can be assigned

Games of chance involving uncertainty

a utility value based on the same fraction of the most and least
preferred values.
Utility values reXect a players relative preferences and the von
NeumannMorgenstern theory suggests that players will always try to
maximise utility value, rather than expected value, although the two
may occasionally produce the same result.

Games of chance involving uncertainty

The second category of games of chance is the category of games
involving uncertainty. A game of chance involving uncertainty is a
game in which, like games of risk, a single player is opposed by nature,
but unlike games of risk, the player cannot assign probabilities to
natures moves. Three principles for making a decision in such circumstances have been suggested and the following example illustrates all of
them.
Example 3.5 Insurance against maternity leave
BNP Paribas, the French high-street bank, can take out an insurance policy to
make up the shortfall between the real cost to the company of an employee on
maternity leave and the entitlement under the European Unions statutory
Maternity Pay Scheme. The branch managers estimate, from previous experience, that the shortfall is of the order of 10 per cent of the cost of staV cover
which, including on-costs, pay-related social insurance and superannuation,
comes to 13 500 per maternity leave. The insurance premium per branch is
5000 per annum. Should branches take out the insurance policy?

If a branch takes out insurance and there is no maternity cover

required, then eVectively the bank has lost 5000. If the branch insures
and there is one case of maternity leave, the bank will still have lost
5000, but better that than losing 13 500 by not insuring! (Note: The
bank has not gained 8500. It has still lost 5000, but is 8500 better
oV than if it did not insure.) Conversely, if the branch does not take out
insurance and there is one case of maternity leave, it costs the bank
13 500. Of course, if the branch does not insure and there is no
maternity cover required, then there is no cost involved one way or the

Games of chance

Nature
Strategy

Bank

Insure

Do not insure

Figure 3.6

No maternity
leave

1 maternity
leave

5000

13 500

Insurance against maternity leave: pay-off matrix.

other. These four outcomes constitute the banks pay-oV matrix, which
can be seen on Figure 3.6.
Of course, the pay-oV matrix only shows monetary values and
ignores extraneous factors such as the diYculty of getting long-term
staV cover in relatively isolated rural areas or at certain times of the
year.

If natures probabilities were known to the bank, this problem would

be easily solved using expected or utility values. But natures probabilities are not known. The bank has no idea whether or not maternity
cover will be required. Three suggestions for making a decision in
circumstances such as these have been made: the maximax principle;
the maximin principle; and the minimax principle.
The maximax principle recommends that the player choose the
strategy that contains the greatest pay-oV. It is a super-optimistic
approach, comparable to the risk-taking strategy described in the
previous section. It throws caution to the wind and recommends that a
player choose the greatest pay-oV no matter what the downside
even when it is only marginally better than the next best strategy. The
greatest pay-oV on Figure 3.6 is zero, so the maximax principle tells the
branch not to take out insurance.
The maximin principle (Wald, 1945) recommends that a player
avoids the worst possible pay-oV. In other words, the player should
choose the strategy that oVers the best worst-case scenario. This is a
super-pessimistic risk-averse strategy approach and ignores the size of
any possible gain no matter how large. If all game players subscribed to
this principle, banks would never lend money and there would be

Games of chance involving uncertainty

Nature
Strategy

Bank

Insure

Do not insure

Figure 3.7

No maternity
leave

1 maternity
leave

5000

8500

Insurance against maternity leave: regret matrix.

precious little research and development in industry. It can be seen

from Figure 3.6 that the minimum Wgure on the second row is
913 500, whereas the minimum Wgure on the Wrst row is 5000.
Therefore, the worst-case scenario to be avoided is a loss of 13 500, so
the other strategy should be chosen. The branch should insure.
The minimax principle (Savage, 1954) recommends that a player
avoids the strategy of greatest regret and is a good balance between the
super-optimistic and the super-pessimistic. To understand the principle, the pay-oV matrix shown on Figure 3.6 must Wrst be transformed
into a regret matrix.
Regret matrices are constructed by calculating how much better a
player could have done by choosing a diVerent strategy if the player had
known natures choice in advance. Each pay-oV in a column is subtracted from the highest pay-oV in the column, to see how much better
each strategy could have been. The results are shown on Figure 3.7.
Clearly the greatest regret (8500) comes from the second row
strategy where the branch does not insure and where there is subsequently need for maternity cover and the minimax principle suggests
that this strategy should be avoided. The branch should therefore
insure.
Further analysis reveals that, as long as the insurance premium is less
than the beneWt (13 500), both the maximin and minimax principles
will recommend taking out insurance; and the maximax principle will
always recommend not taking out insurance, unless it is free!

Sequential decision making and

cooperative games of strategy

A wrong decision isnt forever; it can always be reversed. But the losses from a delayed
decision can never be retrieved.
J.K. Galbraith 1981 A Life in our Times

To make a decision is to choose a course of action, whether in a game of

skill, chance or strategy. Multiple decisions are sometimes taken simultaneously and sometimes sequentially, usually irrespective of the nature of the game, although games of skill are necessarily sequential
since they involve only one player who has complete control over all
the outcomes. Simultaneous decision making itself is relatively simple,
although resolving the ensuing game may be diYcult. Sequential decision making, on the other hand, can be very complex and certain
techniques have been developed to represent the process.
This chapter considers sequential decision making in all its forms
and develops the terminology used to describe directed graphs and
decision making trees. The method of backward induction is illustrated
by example and sequential decision making in single- and multi-player
games are thus explored. The diVering concepts of a priori and a
posteriori probabilities are developed from a consideration of sequential decision making involving uncertainty, and Bayess formula is used
to illustrate these diVerences in action. The chapter concludes with a
discussion on purely cooperative two-person games and the minimal
social situation, which (being games without conXict) are interesting
only for their decision making processes.
48

Representing sequential decision making

Figure 4.1

A directed graph.

Representing sequential decision making

Sequential decision making is usually represented by tree-like diagrams
of various sorts. The most basic type is a directed graph, GD, which is
deWned as an ordered pair (N, E), where N is a Wnite set of points called
nodes and E is a Wnite set of edges, each of which is an ordered pairs of
nodes (see Figure 4.1).
On Figure 4.1,
N:a, b, c, d, e, f
E:(a, b), (b, d), (d, c), (c, d), (c, a), (d, e), (c, e), (f, e)
For any given edge, e:(n1, n2), n1 is called the predecessor or parent of
n2, and n2 is called the successor or child of n1. On Figure 4.1, for
example, d is the predecessor of both c and e.
A path, P, from node n1 to node n2 is a set of edges that start at n1 and
Wnish at n2 , as
P(n1, n2) : (n1, a), (a, b), (b, c), . . . , (x, y), (y, z), (z, n2)
In other words, the path joins n1 and n2 on the arrow diagram. n1 is
called the ancestor of n2 and n2 is called the descendent of n1. On Figure
4.1, for example, the path Pdb is
Pdb : (d, c), (c, a), (a, b)
Eventually, of course, each path must end somewhere, since the sets of

Sequential decision making and cooperative games of strategy

b
f
d
a

e
c
Figure 4.2

A backward directed graph.

nodes and edges are themselves Wnite. Such nodes are called terminal
nodes, nt, and are recognisable as places where edges go in, but not out.
On Figure 4.1, for example, node e is a terminal node.
Conversely, every set of decisions and therefore every arrow diagram
must have a starting node, called a root, r, which is recognisable as the
node with no arrows going into it. Edges come out, but do not go in.
Node f is the only root on Figure 4.1. Roots are important features of a
special type of directed graphs, called a tree, discussed in greater detail
later.
In preparation for tracing decision-making strategies back through
time, from the pay-oV to the root, it is worth mentioning that, for every
directed graph, GD : (N, E), there exists another directed graph,
GDB : (N, EB), called the backward directed graph of GD, whose nodes
are the same as those of GD, but whose edges are reversed. So GDB for
the directed graph represented on Figure 4.1, for example, is:
N : a, b, c, d, e, f
E : (b, a), (d, b), (c, d), (d, c), (a, c), (e, d), (e, c), (e, f)
and this can be seen on Figure 4.2.
As was mentioned above, a special type of directed graph exists,
known as a tree, in which there is a root and (for all other nodes) one
and only one path (see Figure 4.3).
It can be seen that for trees, the root is an ancestor of every node,
every node is a descendent of the root and no node can have more than
one parent. Also, there are no reverse paths if there exists a path from
n1 to n2, one does not exist from n2 to n1.

Representing sequential decision making

l
m
n

e
A branch

o
f

p
q
g
r

u
c
v
w

j
d
x

y
z
Figure 4.3

A tree, also showing a branch or sub-tree.

Trees can have sub-trees or branches (see Figure 4.3), if:

the nodes of the branch form a subset of the nodes of the tree
the terminal nodes of the branch form a subset of the terminal nodes
of the tree
the edges that nodes of the branch form are the same as those formed
by the same nodes as part of the tree
the root of the branch is the same node as the root of the tree.

Sequential decision making and cooperative games of strategy

Example 4.1 Decision tree for a proposed building programme

BGS DArcy, a sub-division of the advertising agency BCom3, has the option of
building new premises or extending its existing facilities in Milan. Subsequent
decisions must then be made, depending on which option is exercised. For
example, if the agency opts for new premises on a green-Weld site on the
outskirts of the city, it must decide whether or not to include outdoor sports
facilities for staV.
Figure 4.4 represents these sequential decisions on a tree.

Sequential decision making in single-player games

A single decision maker often has to make a number of decisions in
sequence and the process is best represented on a decision-making
graph. A decision-making graph is a directed graph, like a tree but one
that allows more than one path from the root to the terminal nodes
which represent the pay-oVs (see Figure 4.5). Every tree is a decisionmaking graph, but a decision-making graph is only a tree in the special
case where there is only one path from the root to a terminal node.
Example 4.2 Decision graph for a system of sanction and reward
McDonalds, the chain of fast food outlets, has a recommended system of reward
and sanction for its employees. Poor quality work, uncertiWed absence and poor
punctuality not dealt with on the shop Xoor is referred to a shift supervisor or
directly to the manager if it is serious. Sanctions range from a simple reprimand
to forced redundancy.
Work or commitment likely to beneWt the company is rewarded with merit
stars which accumulate to a promotion, a higher hourly rate of pay and
end-of-year bonuses.
Figure 4.6 shows the system on a decision-making graph (not a tree!).

In a decision-making graph, if a path from ni to nt is optimal, then all

its sub-paths must also be optimal. This is the principle upon which the
method of backward induction is based. This is a method of tracing
decisions backwards from the pay-oVs at the terminal nodes, to the
root. The steps for backward induction in decision-making trees are:

Sequential decision making in single-player games

With sports fields

Without sports
fields
Green-field site
Disperse staff
to other offices
nearby
Existing site
New building

Move
during
work

Move en bloc to
another site

Stay in prefabs
on existing site
Extension
With indoor
sports
facilities

With theatre or
presentation facilities

With common rooms

With a full Art &
Design centre

Individual
studios

Computer
centre

Networked

Stand alone

Figure 4.4

The planning process for a building programme.

Alpha stage: All the parents of terminal nodes are selected and to each
of these alpha nodes is assigned the best possible pay-oV from their
terminals. If a node has only one terminal node, then that pay-oV
must be assigned. All but the best pay-oV nodes are deleted to get a
thinned-out tree.

Sequential decision making and cooperative games of strategy

f
k
r

c
i
Figure 4.5

A decision-making graph.

Dealt
with
in-shop
Shift
supervisor
Bad
behaviour

Referred
to
manager

Reprimand

Re-do work
without overtime

Lower hourly rate

Sacked
Good
behaviour
Praise

Merit stars

Praise in weekly
meetings
Increased
hourly rate

Promotion to manager

Figure 4.6

A decision-making graph for a system of sanction and reward.

Sequential decision making in single-player games

The best pay-oVs are not always the biggest numbers. Sometimes,
in cases of minimising cost, for example, the backward selection of
pay-oVs is based on selecting the lowest numbers.
Beta stage: The process is traced back a further step by looking at the
parents of the alpha nodes the beta nodes and assigning to each of
them the best pay-oVs of their alpha nodes. Then all but the best
pay-oV nodes are deleted.
Gamma stage et seq: The above steps are repeated to get an everthinning tree, until the root is the only parent left. What remains is
the optimal path.

Example 4.3 Organising sponsorship

Pearson plc, the media and information company and publishers of the Financial Times, is considering sponsoring one of three possible events: a series of
concerts or theatrical events to be held just before Christmas (proceeds to
charity); an international sporting event, to be held annually in mid-October; or
a literary prize to be awarded annually in March. All events involve considerable
staV time, though some are more urgent than others. The company wishes to
minimise the amount of staV down time and disruption involved.
Figure 4.7 (ae) shows the backward induction process for the organisation of
each of the three alternative events. The numbers against the tree represent the
estimated number of days spent by staV in active preparation and participation.

Clearly, after all stages have been carried out, the option which involves
the lowest time commitment from staV is the middle one sponsoring
the UK soccer event (50 days). The concert option involves the staV in a
minimum of 64 days of preparation and participation, and the literary
prize option in a minimum of 66 days of involvement.
The method of backward induction is slightly more cumbersome for
decision making graphs than for trees. The pay-oVs must be carried
backwards in a cumulative sense, because the optimal path can only be
determined when all the backward moves are made. This is a consequence of the fact that nodes may have more than one edge coming in.
The following example illustrates the method.

Sequential decision making and cooperative games of strategy

Classical

26
30

Pop

Music
32
Drama

Mixed

A run of one three-act play

30
48

Series of one-act plays

Concert series
40
10

Soccer
Rugby

One meeting,
all races

Sporting
event

Europe

N. America
4

Athletics

Baseball

Soccer

One race in a series

of meetings

Serie A (Italy)

A European cup

Literary
prize
Confined
to new
writers

Boxing

Poets

36
Open to all
40

Novelists

Poets

d stage

g stage

Novelists

b stage

a stage

(a)
Figure 4.7

(a ) A backward induction process for organising three sponsored events. The backward induction
process after (b ) the stage; (c ) the and stages; (d ) the , and stages; and (e ) all stages.

Sequential decision making in single-player games

Classical

26
30

Pop

Music
32
Drama

Mixed

A run of one three-act play

30
48

Series of one-act plays

Concert series
40
10

Soccer
Rugby

Sporting
event

Europe

N. America
4

Athletics

Baseball

Literary
prize
Confined
to new
writers

Boxing

Poets

36
Open to all
40

Novelists

Poets

d stage
(b)
Figure 4.7 (cont. )

g stage

One race in a series

of meetings

Soccer

Novelists

b stage

A European cup

Sequential decision making and cooperative games of strategy

Classical

26
40
(26)

Music

Drama

A run of one three-act play

(34)

Concert series

Soccer

(40)

Sporting
event

(46)

Europe
Soccer

N. America
4

(48)

Literary
prize
Confined
to new
writers

Boxing

(30)

36
30

Open to all

Poets
(27)

d stage
(c)
Figure 4.7 (cont. )

g stage

Novelists

A European cup

Sequential decision making in single-player games

(64)

Drama

A run of one
three-act play

Concert series

Soccer

10
UK

Sporting
event
(50)

Literary
prize
Confined
to new
writers
36
(66)

Novelists

d stage
(d )

40
Soccer
10
UK
(50)

(e)
Figure 4.7 (cont. )

Sporting
event

Sequential decision making and cooperative games of strategy

Example 4.4 Appointment options

The Learning & Skills Development Agency (LSDA) needs to recruit some IT
staV urgently to meet a contract for educational CD-ROMs. Figure 4.8(a, b)
shows the method of backward induction for their recruitment process. Time
is the critical factor. The agency has three appointment options: six part-time
operators; four full-time programmers; or another systems manager plus two
programmers. Subsequent decisions must be made about advertising the
positions, convening an appointments board and checking references. The
number of days involved in each step is shown beside each edge. The numbers
in brackets represent backward cumulative totals. The agency needs to
minimise the time for appointment.

Some nodes have more than one running total. In this event, the lowest
total is selected, since the agency wishes to minimise the time to
appointment. For example, there are two paths to Convene internal
board. One takes 21 days (15 ; 6) and the other takes 18 days
(10 ; 8). It can never make sense to select the former, so it is discarded.
For all calculations thereafter, this node is assigned the value 18, and
the backwards process continues. The lower Wgure is shown in bold
type on Figures 4.8, and the discarded one in light type.
The optimal strategy, as far as time is concerned, is to appoint one
systems manager and two programmers by advertising internally, convening an internal appointments board and not requiring a medical
examination before appointment.
Sequential decision making in single-player games involving uncertainty

In the preceding examples, it was assumed that the full consequences of

each selection were known. Of course, this is not always the case in
real-life situations and there is often a degree of uncertainty associated
with decision making. Fortunately, there is a method for dealing with
such uncertainty and that is to add nodes at which nature makes a
selection in the existing tree. The following example illustrates the
principle.

Sequential decision making in single-player games

Prepare
specifications

Advertise
externally

(43)
(39)

Convene full
appoint. board

(40)

Medical exam &

reference take-up

(20)

3
15

1 manager & 2
programmers

(42)
(36)
4 full-time
progrs

Prep.
spec

Advertise
internally

(44) / (40)
6 part-time operators

(21)
(18)
Convene
internal board

Appoint

8
28

(46)
(20)

(46)
(40)
20

Prepare
specifications

Do not
advertise

stage

(a)

(12)

Prepare
specifications

Appointment
by managing
director

stage

Advertise
externally
(40)

(39)

Take up
references

stage

Convene full
appoint. board

stage

Medical exam &

reference take-up

(20)

3
15

1 manager & 2
programmers

(36)
4 full-time
progrs

Prep.
spec

(18)

Advertise
internally

Convene
internal board

Appoint

(40)
6 part-time operators

8
10

Prepare
specifications

(b)
Figure 4.8

(12)

(20)

(40)
20

stage

Do not
advertise

stage

Appointment
by managing
director

stage

Take up
references

stage

A backward induction process (a) for three possible appointments options, where time is the critical
factor; and (b ) after discarding from four stages and showing the optimal strategy.

Sequential decision making and cooperative games of strategy

Example 4.5 A university research partnership

AstraZeneca, the pharmaceutical company whose R&D division is headquartered at Sodertajle, Sweden, was involved in a research partnership with a
leading Swedish university. Unfortunately, the partnership was not successful
and the agreement to collaborate lapsed. A newly appointed professor at the
university wishes to revive the consortium and the company is considering how
much to commit to it.
The company has decided on some involvement at least. If the company
undertakes a small commitment (i.e. another company carries out the clinical
trials) and it fails to produce any beneWt, it will lose 150 hours of staV time. If the
company undertakes a large commitment (i.e. AstraZeneca itself carries out the
clinical trials) and it fails to produce any beneWt, it will lose 2250 hours of staV
time.
It has been estimated that success will bring 8000 or 9000 hours of free
university laboratory time and professional training to the staV, depending on
whether the company commits on a small or large scale, respectively.
Figure 4.9 shows the decision tree for the game. Naturally, the company
wishes to maximise its beneWts, although they will only accrue to the company if
it continues its involvement beyond the Wrst year, because of the longitudinal
nature of the clinical trials.

The nodes marked chance represent the uncertainties associated with

the companys involvement; in other words, what nature does in
response to the companys selected strategy. The nodes marked company represent the selection of a strategy by the company.
The method of backward induction, stage, reveals that the company should never discontinue its involvement, irrespective of its level
of commitment. So the nodes marked company have the values
shown in brackets and the nodes marked chance have the values:
9000p 9 2250(1 9 p) (upper chance node)
and
8000q 9 150(1 9 q) (lower chance node)
Clearly, the company will opt for large-scale involvement only when:
9000p 9 2250(1 9 p) 8000q 9 150(1 9 q)

Sequential decision making in single-player games

9000
Continue involvement

Company
Produces
benefits

Discontinue involvement

(9000)
p

Chance

2250
Fails to produce benefits

Give high commitment

1 p

2250

Company
8000
Continue involvement

Give low commitment

Company
Produces
benefits

(8000)

Discontinue involvement

q
Chance

150

Fails to produce benefits

1 q
150

b stage
Figure 4.9

a stage

Decision tree for a company involved in a research partnership.

This condition reduces to:

p 0.724q ; 0.186
Interestingly, it is a consequence of this result that, even if success
following small-scale involvement is certain (q : 1), the company
should still opt for large-scale involvement as long as the probability of
success following large-scale commitment (p) is greater than 0.91.

Sequential decision making and cooperative games of strategy

r
9000
1 r
Company
Produces
benefits

Continue involvement

Discontinue involvement

(9000)

Chance

2250
Fails to produce benefits

Give high commitment

1 p

2250

r
Company
8000
1 r
Give low commitment

Continue involvement

Company
Produces
benefits

Discontinue involvement

(8000)

q
150

Chance

Fails to produce benefits

1 q
150

stage
Figure 4.10

stage

Decision tree for a company involved in a research partnership, with a priori probability.

The situation described above would be more complicated if the

company was unsure of the extent to which the beneWts of involvement
in research would actually result in increased eVectiveness within the
company.
If r represents the a priori probability that the beneWts (whatever
they are) would result in increased eVectiveness, the extended decision
tree is now represented as on Figure 4.10.
There is a diVerence between the a priori probability that something

Sequential decision making in single-player games

will happen (denoted by r on Figure 4.10) and how this gets revised to
an a posteriori probability when the pharmaceutical company gathers
more information from experience of involvement in research or from
other similar companies. The probability of one event occurring once
another event is known to have occurred is usually calculated using
Bayess formula.
If the a posteriori probability of event A happening given that B has
already happened is denoted by p(A/B); and Ac denotes the complementary event of A such that:
p(Ac) : 1 9 p(A)
then Bayess formula is:
p(A/B) :

p(B/A) p(A)
p(B/A)p(A) ; p(B/Ac)p(Ac)

Bayess formula is demonstrated in the following example.

Example 4.6 Schools coming out of special measures
Statistics from UK government inspections of schools by Ofsted, the oYce for
standards in education, reveal that 38% of schools in special measures continue
to fail after four years and are subsequently closed permanently. The preferred
option for schools in special measures is to close them down, install new
(sometimes private sector) management and staV, and re-open them under a
new name and with enhanced resources. Some 75% of schools that eventually
make it out of special measures go through this process, although 17% of the
schools which were eventually closed permanently, underwent the close and
re-open procedure. What is the probability of success for a school in special
measures after closing and re-opening?

We want to Wnd the a posteriori probability of the school successfully

passing its Ofsted inspection, given that it has undergone closure,
p(O/C), where O is the school passing its Ofsted inspection and C is the
school undergoing closure. It is known that:
p(Oc) : 0.38
p(O) : 0.62
p(C/O) : 0.75
p(C/Oc) : 0.17

Sequential decision making and cooperative games of strategy

where Oc denotes the complementary event of O, i.e. the school fails its
Ofsted inspection.
p(C/O)p(O)
p(C/O)p(O) ; p(C/Oc)p(Oc)
0.75 ; 0.62
:
0.75 ; 0.62 ; 0.17 ; 0.38
: 0.878

p(O/C) :

So schools in special measures have a 87.8% chance of passing their

Ofsted inspections after closure and re-opening.

Sequential decision making in two-player and multi-player games

The discussion so far has centred on decision making by single players.
The pay-oVs have therefore depended only on the decisions taken by
one individual (and nature, in the cases where there is uncertainty).
Games which involve more than one player, but in which moves are
still made one after the other, are now considered.
Multi-person sequential games can be thought of as decisionmaking games played in stages and, as such, they form a bridge between
the decision-making theory discussed above and the true games
discussed in later chapters. To understand their main features, some
familiar concepts need to be extended.
A tree is a multi-player game tree for n players if each decision node
(i.e. non-terminal node) belongs to one and only one player. There
should be a pay-oV at each terminal node for each player (see Figure
4.11), although it may happen that some players do not own any
decision nodes. (Players cannot own terminal nodes.)
The player who owns the root node (P3) chooses a strategy (rb say)
to start the game. This brings the decision-making process to another
node, b, owned by another player (P4), assuming it is not a terminal
node. This player in turn chooses a strategy (bg say) which leads to
node g and another player (P1), who chooses a strategy (gk say) which
leads to the terminal node, k. The pay-oVs at k are, say:
4 for player P1;

2 for P2;

1 for P3;

7 for P4.

Two or more nodes of a game tree are said to be equivalent if:

Sequential decision making in two-player and multi-player games

d (3, 1, 2, 0)
a

e (1, 0, 4, 3)

f (1, 0, 5, 2)

b
k (4, 2, 1, 7)

P1
P3

P4
g

l (3, 1, 2, 5)

h (1, 2, 8, 4)
c
i (2, 3, 4, 1)

P2
j (2, 1, 1, 3)

Figure 4.11

A multi-player game tree.

they are all (m) owned by the same player;

the same number of edges start from each, i.e. No. (E1) : No.
(E2) : : No. (Em);
the edges from all m nodes can be arranged in such a way that the
player views them as identical. Say the edges from n1 are:
E1 : (n1, a), (n1, b), . . . , (n1, i)
and the edges from n2 are
E2 : (n2, a), (n2, b), . . . , (n2, i)
and the edges from nm are
Em : (nm, a), (nm, b), . . . , (nm, i)

Sequential decision making and cooperative games of strategy

Choose X

n1
P2

Choose Y

Choose X

n3
f

P3
g
Figure 4.12

An information set for player 2.

Then this equivalence is written

(n1, i) (n2, i) (nm, i), i
The essence of equivalence, a notion that is particularly important in
games of imperfect information, is that a player cannot tell the diVerence between equivalent edges, since a player has no knowledge of, or
cares, at what node the selection is made.
An information set for a player is deWned as the set of all such
equivalent (and therefore non-terminal) nodes belonging to the player
such that no two nodes are related to one another, i.e. are not parent or
child within the set. They are denoted by being joined by a dotted line
(see Figure 4.12) and usually, but not always, they are all from the same
stage (, , , , etc.).
Multi-person sequential games can now be described as multi-player
game trees where decision nodes have been gathered into information
sets belonging to each player. A choice for a player at a node, n, is
represented by an edge coming out of n, and a choice function for a
player who owns a set of nodes, N, is a function, , which maps this set
onto the set of its children, C:
:N;C

Sequential decision making in two-player and multi-player games

In other words, a choice function is the set of all edges coming from a
players nodes. This is the players strategy the choices that the player
must make at each information set. It is a complete plan of what to do
at each information set in case the game, brought by the other players,
arrives at one of the information set nodes. And the set of all such
strategies uniquely determines which terminal node and pay-oV is
reached.
As was mentioned already, sequential games can be games of either
perfect or imperfect information. In the former case, each player moves
one at a time and in full knowledge of what moves have been made by
the other players. In the latter, players have to act in ignorance of one
anothers moves, anticipating what the others will do, but knowing that
they exist and are inXuencing the outcomes. The following examples
illustrate the diVerence.

Example 4.7 Proposing change

The newly appointed operations manager of Walmart Stores Inc., the US retail
company, must decide whether or not to propose some radical changes to the
way the subsidiary company, Sams Super Centres, is managing its outlets. The
board of directors must in turn decide whether to endorse these changes if and
when they are made. Credibility is at stake. If the operations manager proposes
changes and the board rejects them, the manager will lose considerable credibility with staV and board alike (let the ordinal pay-oV for the manager for this
outcome be 2 say), although the board will remain unaVected (0). If the
proposals are accepted, the manager and board both gain, but the former (;2)
more than the latter (;1), since the manager initiated them.
The tree for this game is shown on Figure 4.13. The pay-oVs are ordinal in the
sense that they represent the ranking of preferred outcomes only, without scale.
(The managers pay-oVs have been shown Wrst in the ordered pairs.)

The game is clearly one of perfect information. Both players know

precisely the nodes where they must make choices, so each player
knows all the previous choices that have been made.

It is noticeable that every information set has only one element and,
in fact, this is true for all games of perfect information a game will be

Sequential decision making and cooperative games of strategy

(2, 1)
Accept proposals

Reject proposals
Propose change

( 2, 0)
P1

Propose no change

(0, 0)

Figure 4.13

Proposing change: a game of perfect information.

one of perfect information if every information set has only one

element. Otherwise, the game will be one of imperfect information, as
the following example shows.
Example 4.8 Proposing change under uncertainty
Suppose in the situation outlined in the previous example (Example 4.7), the
board of directors had already been approached by the Sams Super Centres
regional manager who wished to take early retirement on health grounds. The
board then made its decision in relation to the regional manager but, upon
request, did not make the new operations manager aware of it before the latter
had made her recommendations for change.

This game can be represented as a three-player game the third player

being nature, which has decided whether or not the grounds for
ill-health retirement are suYciently strong (see Figure 4.14).
The ordinal pay-oVs shown on Figure 4.14 are explained as follows.
If the board has not accepted the regional managers retirement, the
pay-oVs revert to those of Example 4.7 (bottom half of Figure 4.14). If

Sequential decision making in two-player and multi-player games

Accept proposals

(4, 2)

Propose change

Reject proposals
Propose no change

( 4, 0)

Accept resignation

( 1, 0)

Nature
P2

(P3)

Accept proposals

(2, 1)

Reject resignation
Propose change
Reject proposals

Propose no change

(0, 0)

Figure 4.14

( 2, 0)

Proposing change under uncertainty: a game of imperfect information.

the board has accepted the retirement, unbeknown to the new operations manager, and she proposes no change, then the manager loses
some credibility with staV (91) and there is no alteration in the status
of the board (0). If, under the same circumstances, the operations
manager proposes change and the proposals are accepted by the board,
the manager gains massively (;4) and the board gains to a lesser
extent (;2) for their foresight and Xexibility. However, if the board
rejects the proposals for change, notwithstanding the regional managers resignation, the operations manager will appear out-of-touch
and unsympathetic (94), while the board will at least appear sympathetic in the eyes of staV (;1).
In this game, the operations manager cannot distinguish between
the two nodes in her information set. There is more than one node in
the information set, so the game is one of imperfect information. The
operations manager, at the time of making her choice, does not know

Sequential decision making and cooperative games of strategy

of natures outcome, although the board, making its decision after her,
does.

Cooperative two-person games

Cooperative two-person games are interesting only for their decisionmaking processes, since they are games without conXict in which the
players have coinciding interests. In fact, some writers on game theory
(Luce & RaiVa, 1989) have suggested that purely cooperative games are
not games at all in the true sense, since both players have a unitary
interest and may therefore be regarded as a single playing entity.
There are two classes of cooperative two-person games: purely
cooperative games, where the interests of both players coincide perfectly; and minimal social situation games (Sidowski, 1957), which are
characterised by ignorance on the players part as to their own strategic
interdependence, although it is still in their interests to cooperate.
Purely cooperative games may be games of perfect or imperfect information, but minimal social situation games are necessarily ones of
imperfect information.
In a purely cooperative game, it is in each players interest to
anticipate the other players selection so that a mutually beneWcial
outcome can be obtained. Each player knows that the other is similarly
motivated. This feature is absent from minimal situation games, either
through lack of communication or lack of understanding about the
rules, although it is still in the players mutual interest to cooperate.

Purely cooperative games

In purely cooperative games, both players must agree on their order of

preference of possible outcomes, so that there is no conXict of interest.
This is their incentive for cooperating that they achieve an outcome
which they both prefer. The games can be ones of either perfect or
imperfect information. In some games, players may be able to anticipate the strategies others will choose, if all players know the rules of the
game and the preferences of the other players (complete information).
In other games, where the condition of complete information does not

Cooperative two-person games

apply, no unique best combination of strategies exists and a solution

must be sought through informal analysis.
It has been shown (Schelling, 1960) that informal analysis often
reveals a focal point which possesses a certain prominence. Even if the
players cannot communicate explicitly with one another, there is
always the chance that each will notice the focal point and assume that
the other has noticed it too as the obvious solution. For example, in the
case of the two cyclists on the path heading towards one another, the
focal point might be the UK traYc regulation: road users must drive
on the left. Each cyclist would therefore assume that the other would
use this rule as a basis for choosing, and so each would choose
accordingly. The rule acts like a beacon and decision making tends
towards it. Such focal points may be rooted in national, local or
institutional culture and when they are, they are all the more preeminent for that.
Minimal social situation games

The minimal social situation is a class of games of imperfect information, where the players are ignorant of their own pay-oV functions and
the inXuence of other players. Each player only knows what choices
may be made. In some minimal situation games, one player may not
even know of the existence of the others.
Kelley et al. (1962) proposed a principle of rational choice for
minimal social situation games known as the winstay, losechange
principle. This principle states that, if a player makes a choice which
produces a positive pay-oV, the player will repeat that choice. On the
other hand, if a player makes a choice which produces a negative
pay-oV, the player will change strategy. Thus, strategies which produce
positive pay-oVs are reinforced and ones which produce negative
pay-oVs are not.
Again, the distinction must be made between players in a minimal
social situation game making decisions simultaneously and those in
which decisions are made sequentially. The following two examples
illustrate the signiWcant diVerences in outcome that can result from
these subtle diVerences in process.

Sequential decision making and cooperative games of strategy

Kelley, Drye & Warren LLP

Clifford
Chance
LLP

Strategy

Supplier A

Supplier B

Supplier
A

2 content; 1 content

2 content; 1 discontent

Supplier
B

2 discontent; 1 content

2 discontent; 1 discontent

Choosing supplier A makes the other player content,

whereas choosing supplier B has the opposite effect
Figure 4.15

Dealing with suppliers.

Example 4.9 Dealing with suppliers: acting successively

Each year, two New York commercial law Wrms, CliVord Chance LLP and
Kelley, Drye & Warren LLP, choose between two companies (A and B) who
supply stationery and oYce supplies. Successive decisions are made by each of
the Wrms two principal legal secretaries. If either law Wrm selects supplier A,
then the other is made content because supplier A showers them with special
oVers in an attempt to get their business also. (Supplier A enjoys an economy of
scale if it services both Wrms, since both are on Park Avenue.) If either Wrm
chooses supplier B, then the other is made discontent, because certain special
oVers are withdrawn.

Figure 4.15 shows the pay-oV matrix.

Analysis reveals three distinct sequences.
Say CliVord Chance chooses supplier A Wrst and Kelley, Drye &
Warren follows by choosing supplier A too. The series of decisions
looks like this:
A 9 A, A 9 A;

A 9 A, etc.

If CliVord Chance chooses B Wrst and Kelley, Drye & Warren follows
by choosing supplier B too, both become discontent. So the former
changes, thus rewarding the latter who naturally sticks with the
original strategy. Still receiving no satisfaction, CliVord Chance
changes again, which makes Kelley, Drye & Warren discontent, and

Cooperative two-person games

it now changes to supplier A. CliVord Chance is now content and

sticks with B, making Kelley, Drye & Warren discontent and causing
another change. The whole cycle repeats itself endlessly and the series
of decisions looks like this:
B 9 B, A 9 B, B 9 A;

B 9 B, etc.

If CliVord Chance chooses supplier A and Kelley, Drye & Warren

subsequently chooses supplier B, analysis reveals that the pattern
looks like this:
A 9 B, B 9 A, B 9 B;

A 9 B, etc.

If CliVord Chance chooses Supplier B and Kelley, Drye & Warren

subsequently chooses A, analysis reveals that the pattern looks like
this:
B 9 A, B 9 B, A 9 B;

B 9 A, etc.

Both cycles repeat themselves endlessly without ever achieving the

stability of mutually favourable outcomes.
Example 4.10 Dealing with suppliers: acting simultaneously
Now consider the same case, but where the law Wrms make their selections not
successively, but simultaneously, unaware of the others choice.

Analysis reveals that:

If both Wrms randomly choose supplier A, both will receive favourable outcomes, and so both will stay with their selections. The series
of decisions will look like this:
AA 9 AA 9 AA 9 AA, etc.
If both Wrms randomly choose supplier B, both will receive unfavourable outcomes and both will change their original strategies.
In this case, the series of decisions will look like this:
BB 9 AA 9 AA 9 AA, etc.
If CliVord Chance chooses supplier A and Kelley, Drye & Warren
simultaneously chooses supplier B, the former will be discontent and
change to B next time. The latter will be content with the Wrst choice
and so stick with it. Each Wrm will now become discontent and

Sequential decision making and cooperative games of strategy

change. The series of decisions will look like this:

AB 9 BB 9 AA 9 AA, etc.
A similar series emerges when CliVord Chance chooses B and Kelley,
Drye & Warren simultaneously chooses A.
Two conclusions may be drawn from this analysis of simultaneous
decision making. The Wrst is that the decision-making process will
eventually settle down and achieve stability around the outcome that
makes both players content (AA). The second conclusion is that,
unless players hit on the right combination Wrst time, they must pass
through a mutually unsatisfactory outcome (BB) before they can
achieve stability around the mutually satisfying one (AA).
A comparison between the two examples reveals that sequential
decision making in cooperative games diVers signiWcantly from
simultaneous decision making. For one thing, the series of strategic
selections does not inevitably lead to a mutually rewarding conclusion. In fact, the sequential selection of strategies never leads to
stability. It just goes round and round endlessly, which is just about
the worst-case scenario for any management team!
Admittedly, minimal social situation scenarios are a triXe contrived. In most realistic situations, players are not unaware of their
mutual interdependence, so strictly minimal social situations are
very rare. Players are usually informed of each others existence and
realise that their choices inXuence each others pay-oVs, although it
is common enough that players do not fully understand the pay-oV
structure.

Two-person zero-sum games of

strategy

He either fears his fate too much, Or his deserts are small, That puts it not unto the touch to
win or lose it all.
James Graham, Marquess of Montrose 16121650 My Dear and only Love

A two-person zero-sum game is one in which the pay-oVs add up to

zero. They are strictly competitive in that what one player gains, the
other loses. The game obeys a law of conservation of utility value, where
utility value is never created or destroyed, only transferred from one
player to another. The interests of the two players are always strictly
opposed and competitive, with no possibility of, or beneWt in, cooperation. One player must win and at the expense of the other; a feature
known as pareto-eYciency. More precisely, a pareto-eYciency is a
situation in which the lot of one player cannot be improved without
worsening the lot of at least one other player.
Game theory is particularly well-suited to the analysis of zero-sum
games and applications to everyday life (especially sporting contests)
abound. Actually, constant-sum games would be a better title since, in
some circumstances, the pay-oVs do not add up to zero because the
game is unfair. However, they do sum to a constant, which is the
prevalent feature of these strictly competitive games, so we will continue to use the term zero-sum even in these instances, for the sake of
simplicity.
The section that follows establishes a link between the tree representation described in the previous chapter and the normal form of the
game, represented by pay-oV matrices. Subsequent sections describe
77

Two-person zero-sum games of strategy

various methods for solving games with and without saddle points and
show how the issue of security leads inexorably to the notion of mixed
strategy. The chapter concludes with a discussion of interval and
ordinal scales and shows that game theory analyses can recommend
strategies even in cases where speciWc solutions cannot be found.

Representing zero-sum games

There are two classes of zero-sum games Wnite and inWnite. Finite
zero-sum games are those in which both players have a Wnite number
of pure strategies. InWnite zero-sum games are those in which at least
one player has an inWnite number of pure strategies from which to
choose and are thankfully fairly rare. Only some inWnite games have
solutions, but all Wnite ones do. A solution to a zero-sum game is a
speciWcation of the way each of the players should move. If both players
move according to this speciWcation, then the pay-oV that results is
known as the value of the game.
Two-person zero-sum Wnite games are usually represented using
pay-oV matrices, but in order to demonstrate the link with game trees
and decision making, Figure 5.1 shows the game in tree form.
As usual, the terminal nodes of the game tree represent the outcomes
and joining nodes here with a dotted line indicates an information
set. When making a move, a player cannot distinguish between nodes
in an information set. If the game is one of imperfect information,
where the players move in ignorance of any previous moves, the player
must choose without knowing what the other player has simultaneously chosen. In a game of perfect information, on the other hand, the
nodes on the game tree will all be individual information sets (see
Figure 5.2).
Figure 5.3 is the pay-oV matrix representation for the same game,
sometimes called the normal form of the game. Each row corresponds
to a pure strategy for player 1 and each column corresponds to a pure
strategy for player 2. The matrix element at the intersection of a row
and a column represents, by convention, the pay-oV to the row player.
The column players pay-oVs will be the negatives of those shown on
the matrix.

Representing zero-sum games

Player 1

Sometimes saddle points are unique and there is only one for the
game. Other times there is no saddle point or there are multiple ones.
Games without saddle points require fairly complicated methods to
Wnd a solution and are discussed later, but games with multiple saddle
points present no new problems. Figure 5.5 shows a pay-oV matrix
with two saddle points. The solution for player 1 is to choose either
strategy A or strategy B. The solution for player 2 is simply to choose
strategy B.
The following examples illustrate the methods by which two-person
zero-sum games may be represented and how they may be solved.

Two-person zero-sum games of strategy

Surgeon

Incompetent

Competent

Hospital board

Dismiss

20 000
Figure 5.6

Retrain

10 000

Dismiss

50 000

Retrain

Game-tree diagram for professional incompetence: retain and retrain.

Example 5.1 Medical incompetence: retain and retrain?

A hospital ethics committee in the UK has assessed a surgeon as falling short of
the level of care and competence demanded of professionals in that Weld. The
surgeon disputes this and has received advice from the appropriate professional
association. The surgeon has two options: to accept the committees assessment
(incompetence) or to reject it (competence). The hospital board has two
options; to dismiss the surgeon with a redundancy package, or to retain and
retrain the surgeon. Both surgeon and hospital must make submissions to the
General Medical Council (the licensing and disciplinary body for the medical
profession in the UK) and the government Department of Health in writing, by
the same date, declaring their positions for arbitration.

Figure 5.6 shows a game tree for the game.

The game is one of imperfect information. Each player must choose
without knowing what the other has decided, so the two nodes representing the hospital boards strategy are shown joined together by a
dotted line. The terminals represent the outcomes of the diVerent
strategies. If the surgeon accepts the committees assessment (declared
incompetent), retraining will cost the hospital 10 000. There will
obviously be no retraining cost if the surgeon is declared competent. If
the surgeon is dismissed and declared incompetent, it will cost the
hospital 20 000 in redundancy. If the surgeon is dismissed and yet

Games with saddle points

Hospital board
Strategy

Surgeon

Figure 5.7

Accept
incompetency
assessment
Reject
incompetency
assessment

Dismiss with
retirement
package

Retain
&
retrain

20 000

10 000

50000

Pay-off matrix for professional incompetence: retain and retrain.

declared competent, it will cost the hospital an estimated 50 000 in

compensation.
Figure 5.7 shows the pay-oV matrix for the game. Clearly, the
surgeons maximin strategy (10 000) is to accept the committees
assessment. The hospital boards minimax strategy (also 10 000) is to
retain and retrain the surgeon. This is therefore a unique saddle point
for the game. Neither the surgeon nor the hospital board can do better
by choosing any other strategy.
Example 5.2 Reforming a travel timetable
PaciWc Western, one of the bus operating companies for Pearson Airport,
Toronto, operates a 21-hour week for its drivers. The operations manager is
restructuring the travel timetable to the airport, oVering new routes and fewer
stops, and he has three models in mind. One is based on twenty 63-minute
journeys; another on twenty-one 60-minute journeys; and a third model based
on twenty-eight 45-minute journeys (dubbed the Airport Rocket). At the same
time, the personnel manager is considering whether to employ four, Wve or six
bus drivers on the airport run.
To meet the terms of a binding legal agreement with unions and the General
Toronto Airport Authority, all bus drivers must receive the same weekly employment at a Wxed rate, so some arrangements provide drivers with oV-theroad time which can be Wlled with clerical and maintenance duties - while
others do not. Table 5.1 shows the balance of driving and oV-road periods for
each of the nine strategies.
The personnel manager, choosing the number of drivers, wishes to maximise
driving time. The operations manager, choosing the travel schedule, wishes to
minimise it to keep more Xexibility for other (shorter) routes and indirectly
keep maintenance and secretarial staYng to a minimum.

Two-person zero-sum games of strategy

Table 5.1 Nine congurations for reforming a travel timetable

No. bus
drivers

No. journeys
per driver

No.
oV-the-road
periods

Driving time
(h/m)

20 63-min journeys

4
5
6

5
4
3

0
0
2

21/0
21/0
18/54

21 60-min journeys

4
5
6

5
4
3

1
1
3

20/0
20/0
18/0

28 45-min journeys

4
5
6

7
5
4

0
3
4

21/0
18/45
18/0

Weekly timetable

Operations
manager

Personnel
manager

Figure 5.8

Strategy

20 journeys

21 journeys

28 journeys

4 drivers

5 drivers

18 / 45

6 drivers

18 / 54

Pay-off matrix for reforming a travel timetable.

Figure 5.8 shows the pay-oV matrix for the game.

The saddle point is the element that is minimum in its row and
maximum in its column and this occurs when the timetable has 21
journeys per week and four drivers. The value of the game is thus 20
hours.
This technique for Wnding maximum saddle point solutions is a
useful one for very large matrices (see Figure 5.9). Each row minimum
can be examined in turn, to see which, if any, is also a maximum in its
column.

Games with saddle points

Player 2

Player 1

Figure 5.9

Figure 5.10

a
b
c
d
e
f
g

6
14
10
8

7
12
7
4

5
12
8
2

11
11
13
13

8
10
4
8

12
12
10
14

10
13
9
10

13
12
10

7
9
7

8
8
3

14
13
14

5
9
7

14
12
14

10
8
10

A large matrix with a saddle point.

Strategy

Column 1

Column 2

Row 1

Row 2

Dominance and admissability.

Figure 5.9 is a seven-by-seven matrix and closer inspection reveals a

saddle point at row b, column e, where the value of the game is 10. In a
zero-sum game, since the pay-oV for one player is the negative of the
pay-oV for the other, a game is said to be fair if the value of the game is
zero. None of the games we have considered so far has been fair in that
sense. In the case of the surgeon charged with incompetence, the game
is unfair because the surgeon cannot ignore the allegation. In the case
of the new Toronto Airport bus timetable, the game is unfair because
not changing the timetable is not an option.
Dominance and inadmissibility in large saddle point games

Sometimes one of a players pure strategies dominates the other ones in

that it yields an outcome at least as good against any strategy that the
opposing player may select. It cannot make sense for a rational player
to select a dominated strategy and it is thus said to be inadmissible.
More formally, one row dominates another when elements in the
dominant row are larger and as large (smaller in the case of dominant
columns) as the corresponding elements in the inadmissible row. For
example, in Figure 5.10, row 1 dominates row 2 if a c and b P d, for
all a, b, c, d + R.

Two-person zero-sum games of strategy

Player 2

Player 1

Figure 5.11

d
e
f
g

4
7
9
7

2
8
8
3

13
14
13
14

8
5
9
7

10
10
8
10

First elimination of dominated rows and columns.

The notions of dominance and inadmissibility provide an alternative

method to the minimax/maximin strategy for locating saddle points.
Consider Figure 5.9 again. Row c is dominated by row e (higher
numbers everywhere) so there is no reason why player 1 should ever
select it. Similarly, row a is dominated by row b. Once row a has been
eliminated, column a is dominated by column b (lower numbers
everywhere) so there is no reason why player 2 should ever select it.
Similarly, column f is dominated by column e. These four eliminations
reduce the matrix to the Wve-by-Wve matrix shown in Figure 5.11.
Four further eliminations reduce Figure 5.11 to a three-by-three
matrix. Column d is dominated by column e. Once column d is gone,
row e is dominated by row f and row d is dominated by row b. Once
row e is gone, column b is dominated by column c. The matrix has now
been reduced to the three-by-three matrix shown in Figure 5.12.
Another four eliminations produce the unique saddle point at row b,
column e (see Figure 5.13).
While this method of eliminating inadmissible rows and columns to
produce a saddle point eventually is cumbersome and certainly not as
immediate as the minimax strategy (where the saddle point is the
element which is minimum in its row and maximum in its column), it
is useful in that it demonstrates why a game of perfect information can
always be solved.

Games with no saddle points

Some two-person zero-sum games have no saddle point and are relatively complicated to solve.

Games with no saddle points

Player 2
c

f
g

8
3

9
7

8
10

Player 1

Figure 5.12

Second elimination of dominated rows and columns.

Player 2
e
b

Player 1

Figure 5.13

Final elimination of dominated rows and columns.

Example 5.3 Re-allocation of duties

The trade union representing airline pilots in Germany (Vereinigung Cockpit
e.V.) and that representing cabin crews (Unabhangige Flugbegleiter Organisation, UFO) are in conXict with each other over a possible reallocation of duties
at Lufthansa, the German airline. Each group has been given the opportunity to
change to a new rota either immediately or subsequent to proposals for a
redistribution of duties among airline staV made by representatives of the cabin
crews.
The pilots and cabin crews potentially stand to gain (or lose) a performance
related bonus from Lufthansa, depending on what they choose to do. If both
groups change simultaneously, the pilots will gain nothing since it will be
perceived that they are blindly following the leadership shown by their more
junior colleagues. The cabin crews will gain everything since it was their
initiative. However, if the pilots change Wrst and the cabin crews later, the pilots
will gain, since it will be assumed that it was they who took the lead. If the cabin
crews change now and the pilots change later, credibility and the performance
related bonus will probably be shared, with approximately 75% going to the
cabin crews.

Two-person zero-sum games of strategy

Cabin crews (UFO)

Pilots
(Cockpit e.V.)

Strategy

Change
now

Change
later

Change
now

100%

Change
later

25%

(Pay-offs for the row player only are shown)

Figure 5.14

Pay-off matrix for two groups in conict over re-allocation of duties.

Figure 5.14 shows the pay-oV matrix for the game. Clearly, the game is
an imperfect one and has no saddle point, since there is no element in
the matrix which is a row minimum and a column maximum. Such
matrices are not unusual and the larger the matrix, the more likely it is
that it will not have a saddle point. However, it is still possible to Wnd a
rational solution.
If the groups try to out-guess each other, they just go round in
circles. If the pilots use the minimax principle, it makes no diVerence
whether they change now or later the minimum gain is zero in either
case. The cabin crews will attempt to keep the pilots gain to a minimum and will therefore opt to change now. The most that the pilots
can gain thereafter is 25% of the performance bonus. However, the
pilots can reasonably anticipate the cabin crew strategy and will therefore opt to change later to guarantee themselves 25%. The cabin crews
can reasonably anticipate this anticipation and will consequently opt to
alter their choice to later, thereby keeping the pilots gain to zero. This
cycle of counter-anticipation can be repeated ad nauseam.
There is simply no combination of pure strategies that is in equilibrium and so there is no saddle point. All that can be said is that the
value of the game lies somewhere between zero and 25%. One of the
groups will always have cause to regret its choice once the other groups
strategy is revealed. A security problem therefore exists for zero-sum
games with no saddle point. Each group must conceal their intentions
from the other and, curiously, this forms the basis for a solution.

The best way for players to conceal their intentions is to leave their
selections completely to chance, assigning to each choice a predetermined probability of being chosen. This is called adopting a mixed

Games with no saddle points

Cabin crews (UFO)

Pilots
(Cockpit e.V.)

Strategy

Change
now

Change
later

Probability

Change
now

100%

p = 0.2

Change
later

25%

1 p = 0.8

Probability

q = 0.8

1 q = 0.2

(Pay-offs for the row player only are shown)

Figure 5.15

Pay-off matrix for the re-allocation of duties, with assigned probabilities.

strategy. Mixed strategies that are in equilibrium, where neither player

can proWt by deviating from them, give rise to strategies called minimax
mixed strategies. Such strategies are optimal in that they are the best
strategies available to rational players and, along with the value of the
game, constitute the solution. In a sense, games that have saddle points
are simply special cases of mixed-strategy games, in which a probability
of zero is assigned to every selected strategy except one, which is
assigned a probability of unity. This is a consequence of von
Neumanns fundamental minimax theorem (1928), which states that
every Wnite strictly competitive game possesses an equilibrium point in
mixed strategies. (A proof of the minimax theorem can be found in
Appendix A.)
The game represented in Example 5.3 can thus be solved by assigning
probabilities as follows (see Figure 5.15). Let the pilots Wrst-row mixed
strategy be assigned a probability p and the second-row mixed strategy
a probability 1 9 p. Since the game is zero-sum, the expected pay-oVs
yielded by the two strategies are equal. Therefore:
0(p) ; 25(1 9 p) : 100(p) ; 0(1 9 p)
which gives the solutions:
p : 0.2 and 1 9 p : 0.8
Similarly, let the cabin crews Wrst-column mixed strategy be assigned a
probability q and the second-column mixed strategy a probability
1 9 q. Again, the expected pay-oVs yielded by these two strategies are
equal. Therefore:
0(q) ; 100(1 9 q) : 25(q) ; 0(1 9 q)

Two-person zero-sum games of strategy

which gives the solutions:

q : 0.8 and 1 9 q : 0.2
The pilots should therefore choose the Wrst strategy (change now)
once in every Wve random selections and choose the second strategy the
other four times. Conversely, the cabin crews should choose to change
now four times in every Wve random selections and choose to change
later on the other occasion.
The groups can choose whatever method they like for randomising
the selection of their strategies. For example, one of the pilots may have
Wve diVerent denomination coins: 1p; 2p; 5p; 10p; and 50p, and may
choose the Wrst strategy if he or she randomly and blindly selects the 5p
say, and the second strategy if any of the other four coins is selected.
The value of any game is the expected pay-oV when both players
choose their optimal strategies. In the case of a two-by-two matrix,
however, the value of the game is the same even if only one player uses
an optimal mixed strategy. Therefore, the value of the game depicted in
Example 5.3 is got by substituting the solutions for p and q back into
either side of either equation:
Value of game : 0(p) ; 25(1 9 p)
: 0(0.2) ; 25(0.8)
: 20
Note that the value of the game, using mixed strategies with predetermined probabilities, guarantees at least no worse a pay-oV than other
strategic combinations. From the pilots point of view, they have
guaranteed a pay-oV of at least 20%, instead of the previous zero. From
the cabin crews point of view, they have guaranteed to keep the pilots
pay oV to a maximum of 20%, instead of the previous 25%. The
minimax mixed strategies are therefore in equilibrium. Neither faction
can get a better result by using any other strategy, assuming that both
opponents play optimally.

Large matrices generally

Deviating from an optimal strategy can never beneWt a player, no
matter what the size of the matrix of the game. In the case of matrices

Large matrices generally

bigger than two-by-two, the player who deviates from the optimal
strategy will be actually disadvantaged, since the value of the game is
assured only if both players adopt optimal mixed strategies.
Not all games with pay-oV matrices larger than two-by-two are
problematic. For example, it has been shown (Shapely and Snow, 1950)
that any game which can be represented by a matrix with either two
rows or two columns (i.e. a game in which one of the players has
speciWcally two strategies) can be solved in the same way as two-by-two
matrices, because one of the two-by-two matrices embedded in the
larger matrix is a solution of the larger matrix. Therefore, to solve the
larger matrix, it is necessary only to solve each two-by-two sub-matrix
and to check each solution.
Example 5.4 Student attendance
A school has a problem with student attendance in the run up to public
examinations, even among students who are not candidates. Teachers must
decide whether to teach on, passively supervise students studying in the (reduced) class groups or actively revise coursework already done during the year.
Students must decide whether to attend school or study independently at home.
Research has shown that, if teachers teach on, year-on-year results improve by
12% if students attend, but fall by 8% if they do not. If teachers passively
supervise group study, examination results improve by 2% if students study at
home and are unchanged if they attend. If teachers actively revise coursework,
results improve 5% if students attend the revision workshops and 1% if they do
not.

Figure 5.16 shows the pay-oV matrix for the game. It has no saddle
point, so the players must adopt a mixed strategy. One of the columns
the Wrst one say can be arbitrarily deleted and the residual two-bytwo matrix solved, as follows.
Let the students Wrst-row mixed strategy be assigned a probability p
and the second-row mixed strategy a probability 1 9 p. The expected
pay-oVs yielded by these two strategies are equal no matter what the
teachers do. Therefore:
0(p) ; 2(1 9 p) : 5(p) ; 1(1 9 p)
which yields the solutions:

Two-person zero-sum games of strategy

Teachers

Students
Figure 5.16

Strategy

Teach on

Passively
supervise
group study

Actively
give revision
workshops

Attend
lessons

Do not attend
lessons

Pay-off matrix for attendance of students during public examination period.

p : 1/6 and 1 9 p : 5/6

Similarly, let the teachers second-column mixed strategy be assigned a
probability q and the third-column mixed strategy a probability 1 9 q.
The expected pay-oVs yielded by these two strategies against the students are equal. Therefore:
0(q) ; 5(1 9 q) : 2(q) ; 1(1 9 q)
which gives the solutions:
q : 2/3 and 1 9 q : 1/3
The value of this game, got by back-substituting into either side of
either equation, is 5/3. If the solution to the two-by-two sub-matrix
given above is the solution to the two-by-three original matrix, then
the ignored strategy the teachers Wrst column will yield a worse
value than 5/3 for the teachers, i.e. a value bigger than 5/3 for the
students. However, this is not the case. The value against column one
is:
12(1/6) 9 8(5/6) :28/6
and therefore the strategy represented by column one should not be
ignored by the teachers.
If the second column is deleted, the residual matrix has a saddle
point (whose value is 5) and so the method of mixed strategies makes
no sense. However, this solution can be ignored since the game does
not have a saddle point and cannot be made to have one by ignoring
awkward options. (The reader can check that a minus probability is
always produced for p or q if such a nebulous saddle point appears in
any residual two-by-two matrix.)

Interval and ordinal scales for pay-offs

The search continues by deleting the third column. A method

similar to the one described above produces the following solutions:
p : 5/11 and 1 9 p : 6/11
and
q : 1/11 and 1 9 q : 10/11
The value of this game is 12/11. If this solution is the solution to the
original matrix, then the ignored strategy the teachers third column
will yield a worse value than 12/11 for the teachers, i.e. a value bigger
than 12/11 for the students. And this is indeed the case. The value
against column three is:
5(5/11) ; 1(6/11) : 31/11
So a solution has been found. The strategy represented by column three
(actively giving revision workshops) should be ignored by teachers.
They should adopt the teach on and passive supervision strategies
with the probabilities 1/11 and 10/11, respectively. Students should
attend or not attend with the probabilities 5/11 and 6/11, respectively.

Interval and ordinal scales for pay-offs

All the pay-oV matrices used so far have assumed an interval scale. For
example, in the case of the pay-oV matrix in Figure 5.16 (Example 5.4),
it is assumed that pay-oV 12 is six times more desirable than pay-oV
2. Moreover, it is assumed that the zero point on the pay-oV scale is
properly located, so that negative pay-oVs can have meaning.
Such interval pay-oV matrices can be adjusted, if need be, using
linear transformations. In other words, if x represents any one pay-oV
matrix element and and are real constants, then:
x' : x ;
where x' represents a modiWed version of the pay-oV so that all the
strategic properties of the matrix remain intact, i.e. the matrix has the
same relative preference values, the same optimal strategies and the
same game value. Linear transformations can be used to rid a matrix of
negative values or fractions. For example, the matrix on Figure 5.8

Two-person zero-sum games of strategy

Operations
manager

Personnel
manager

Figure 5.17

Strategy

20 journeys

21 journeys

28 journeys

4 drivers

2100

2000

2100

5 drivers

2100

2000

1875

6 drivers

1890

1800

Transformation of the pay-off matrix in Figure 5.8 (each pay-off multiplied by 100).

Teachers

Students

Figure 5.18

Strategy

Teach on

Passively
supervise
group study

Actively
give revision
workshops

Attend
lessons

Do not attend
lessons

Transformation of the pay-off matrix in Figure 5.16 (each pay-off has 8 added).

(representing the game in Example 5.2) has fractions which can be

removed by the linear transformation x' : 100x, producing the matrix
in Figure 5.17.
Similarly, the matrix in Figure 5.16 (representing the game in
Example 5.4) has a negative pay-oV which can be removed by the linear
transformation x' : x ; 8, producing the matrix in Figure 5.18.
It should be noted, however, that while the optimal solution strategies remain the same, the value of the game has to be factored up or
down depending on how the linear transformation has adjusted the
pay-oVs, to bring it back in line with the original zero point and scale.
In the examples above, the value of the game for Figure 5.17 has to be
divided by 100 and the value of the game for Figure 5.18 has to be
reduced by 8.
Ordinal pay-oVs on the other hand, have no regard for scale,

Interval and ordinal scales for pay-offs

Hospital board
Strategy

Surgeon

Figure 5.19

Dismiss with
retirement
package

Retain
&
retrain

Accept
incompetency
assessment
Reject
incompetency
assessment

A recongured matrix with ordinal pay-offs.

although they preserve the relative ranking of preferences. So pay-oV

3 is less favourable than 4 and more favourable than 2, although 4
is not twice as favourable as 2. Pay-oV matrices in mixed-motive
games (Chapter 6) are usually ordinal, while cooperative and zero-sum
games tend to have interval scales. In theory, ordinal pay-oVs can be
adjusted, like interval pay-oVs, by linear transformations, but in practice it makes no sense to do so.
Games with ordinal pay-offs and saddle points

Games with ordinal pay-oVs can be solved easily if they have saddle
points. Figure 5.19 is Figure 5.7 reconWgured as an ordinal pay-oV
matrix, and vb, b, g and vg represent very bad, bad, good and very
good, respectively.
It is still clear that b is a saddle point, since b is the minimum in its
row and the maximum in its column. In other words,
vb b g, for all values of vb, b, g + R
So, it is not necessary to have exact quantitative measurements of the
relative desirability of the pay-oVs in a zero-sum game in order to Wnd
a solution.
Games with ordinal pay-offs, but without saddle points

Consider Figure 5.20, an ordinal reconWguration of Figure 5.15, in

which w, b and e represent the worst, bad and excellent pay-oVs for the
senior deputy. p and 1 9 p are the senior deputys mixed-strategy
probabilities and q and 1 9 q and those for the junior deputy.

Two-person zero-sum games of strategy

Cabin crews
Change
now

Change
later

Probability

Change
now

Change
later

1 p

1 q

Assigned
probabilities

Strategy

Pilots

Figure 5.20

A recongured ordinal pay-off matrix, with probabilities.

As before:
w(p) ; b(1 9 p) : e(p) ; w(1 9 p)

(i)

and
w(q) ; e(1 9 q) : b(q) ; w(1 9 q)

(ii)

From Equation (i):

p : (w 9 b)/(2w 9 b 9 e)
and
1 9 p : (w 9 e)/(2w 9 b 9 e)
So the following ratios are equivalent
p:19p:w9b:w9e
Clearly, if the diVerence between worst and bad (25% in the case of
Example 5.3) is smaller than the diVerence between worst and excellent (100% in the case of Example 5.3), then the pilots should assign a
higher probability to changing now, i.e. p should be higher than
1 9 p.
From Equation (ii):
q : (w 9 e)/(2w 9 b 9 e)
and
1 9 q : (w 9 b)/(2w 9 b 9 e)
So the following ratios are equivalent

Interval and ordinal scales for pay-offs

q:19q:w9e:w9b
and the cabin crews should assign a higher probability to changing
later, which is the inverse strategy and probability of that of the pilots.
Although this ordinal game has not been completely solved, because
the probabilities and the value of the game have not been determined,
nevertheless the analysis has at least given both players some indication
of what their optimal strategies are. This is a recurring feature of game
theory as applied to more intractable problems: it does not always
produce a solution, but it does provide a greater insight into the nature
of the problem and the whereabouts of the solution.

Two-person mixed-motive games of

strategy

Consider what you think is required and decide accordingly. But never give your reasons; for
your judgement will probably be right, but your reasons will certainly be wrong.
Earl of Mansfield 17051793 Advice to a new governor

Whereas games of cooperation are games in which there is no conXict

of interest and the pay-oVs are identical for both players; and whereas
zero-sum games are games in which the players interests are totally
opposed and what is good for one player is necessarily bad for the
other; mixed-motive games come somewhere between the two.
In a mixed-motive game, the sum of the pay-oVs diVers from
strategy to strategy, so they are sometimes called variable-sum games,
although the term is not strictly accurate since cooperative games are
also variable. They rarely produce pure solutions, but they are interesting for the real-life situations they represent and for providing an
insight into the nature of conXict resolution.
Even the simplest mixed-motive games, represented by two-by-two
matrices, have many strategically distinct types. There are 12 distinct
symmetrical two-by-two mixed-motive games, of which eight have
single stable Nash equilibrium points and four do not (Rapoport &
Guyer, 1966). The Wrst section of this chapter reWnes some familiar
concepts for use in mixed-motive games and illustrates the features of
the eight mixed-motive games that have stable Nash equilibrium
points (Figure 6.4). The group of four games that do not have Nash
equilibria is more interesting and is considered in detail in the second
section. In fact, all two-by-two mixed-motive games without stable
98

Representing mixed-motive games and the Nash equilibrium

Nash equilibrium points are categorised according to their similarity to

one of these four archetypes, so they are of prime importance in any
typology of mixed-motive games (Rapoport, 1967a; Colman, 1982).
One of the four categories that of martyrdom games includes the
famous prisoners dilemma game which, unlike the other three, has a
single Nash equilibrium, but is one which is curiously paradoxical. A
proposed solution in metagame theory is discussed.
The chapter Wnishes with a detailed examination of the famous
Cournot, von Stackelberg and Bertrand duopolies and a section on
how to solve games that do not have any Nash equilibrium points,
using mixed strategies.

Representing mixed-motive games and the Nash equilibrium

Mixed-motive games are represented in a slightly diVerent way from
cooperative and zero-sum games. They always use simple ordinal
pay-oVs, where only relative preference is indicated by the numbers in
the matrix. Both players pay-oVs are displayed on the pay-oV matrix,
with the row player coming Wrst, by convention, in the coordinate
pair.
The terms and concepts introduced in the previous chapters, such as
pay-oV matrices, dominance, inadmissibility, saddle or equilibrium
points and mixed strategies, need to be developed further or adjusted
slightly for mixed-motive games, so that the principles of approach can
be rigorously described.
Firstly, the deWnition of the games themselves.
A two-player mixed-motive game is deWned as a game in which:
player 1 (row) has a Wnite set of strategies S1 : r1, r2, . . . , rm, where
No. (S1) : m;
player 2 (column) has a Wnite set of strategies S2 : c1, c2, . . . , cn,
where No. (S2) : n;
the pay-oVs for the players are the utility functions u1 and u2 and the
pay-oV for player 1 of outcome (r, c) is denoted by u1(r, c) + S1 ; S2
(the cartesian product).
Using this new notation, the concepts of dominance and inadmissibility can be reWned (see Figure 6.1). A strategy ri of player 1 is said to
dominate another strategy rj of player 1 if

100

Two-person mixed-motive games of strategy

Player 2

Player
1
Figure 6.1

Strategy

u1(r1,c1) , u2(r1,c1)

u1(r1,c2), u2(r1,c2)

u1(r2,c1), u2(r2,c1)

u1(r2,c2), u2(r2,c2)

A mixed-motive game with two players.

Player 2

Player
1
Figure 6.2

Strategy

2, 2

4, 3

3, 4

1, 1

A mixed-motive game with no dominant strategies.

u1(ri, c) P u1(rj, c), c + S2

Strategy rj of player 1 is now said to be inadmissible in that player 1
cannot choose it and at the same time claim to act rationally.
The dominance of ri over rj is said to be strict if:
u1(ri, c) u1(rj, c), c + S2
and weak if:
u1(ri, c) P u1(rj, c), c + S2
In the previous chapter, the method of iterated elimination of
dominated strategies was described as an alternative way of solving
game matrices (see Figures 5.115.13). Unfortunately, it cannot be
used to solve many mixed-motive games. The matrix represented on
Figure 6.2, for example, has no dominant or dominated strategies and
therefore cannot be solved using the elimination method.
Instead, such games must be solved using the concept of the Nash
equilibrium (Plon, 1974). A pair of strategies (rN, cN) + S1 ; S2 is said to
be a Nash equilibrium if:

101

Representing mixed-motive games and the Nash equilibrium

Player 2

Player
1

Figure 6.3

Strategy

1, 0

0, 3

3, 1

0, 2

1, 1

4, 0

0, 2

3, 4

6, 2

A Nash equilibrium.

u1(rN, cN) P u1(r, cN), r + S1

u2(rN, cN) P u2(rN, c), c + S2
In other words, rN is bigger than any other r in the same column and cN
is bigger than any other c in the same row. In the example represented
on Figure 6.3, three player 1 pay-oVs are maximum in their columns
(1, 0), (6, 2) and (3, 4) but only the last one is simultaneously
maximum in its row for player 2.
So a Nash equilibrium is a unique pair of strategies from which
neither player has an incentive to deviate since, given what the other
player has chosen, the Nash equilibrium is optimal. In many ways, the
concept of the Nash equilibrium is a self-fulWlling prophecy.
If both players know that both know about the Nash equilibrium,
then they will both want to choose their Nash equilibrium strategy.
Conversely, any outcome that is not the result of a Nash equilibrium
strategy will not be self-promoting and one player will always want to
deviate. For example, the matrix on Figure 6.2 has two Nash equilibria
(4, 3) and (3, 4) and it can be seen that, if either of the other pairs of
strategies were chosen, there would be regret from at least one of the
players.
It is worth noting that, if a game can be solved using the method of
iterated elimination, then the game must have a single unique Nash
equilibrium that would, of course, give the same solution. For example,
the game represented on Figure 6.4 has a Nash equilibrium at (4, 4) and
could easily have been solved by the principles of dominance and
inadmissibility, since strategy r1 dominates r2 for player 1, and strategy
c1 dominates c2 for player 2. Neither player has any incentive to deviate
from these strategies r1 and c1.

102

Two-person mixed-motive games of strategy

Player 2

Player
1
Figure 6.4

Strategy

4, 4

2, 3

3, 2

1, 1

Pay-off matrix for a two-person mixed-motive game with a single Nash equilibrium point.

Unfortunately, just as not every game can be solved using iterated

elimination, not every game has a unique Nash equilibrium point. The
ones that have no Nash equilibrium points at all must be solved using
the method of mixed strategies discussed in Chapter 5 (see Example
5.3) and the ones that have multiple or unstable equilibrium points are
categorised according to their similarity to one of the following archetypes.

Mixed-motive games without single equilibrium points: archetype 1

leadership games
All leadership games have pay-oV matrices like Figure 6.5. Since the
pay-oVs are ordinal rather than interval, the matrix can be made to
represent many games, but one example should serve to illustrate its
main features.
Example 6.1 A leadership game
There are two trade unions in a factory and each has proposed its own candidate
to chair the staV relations committee. Each candidate must decide whether to
accept or decline the nomination. If both accept, then the matter will be decided
by a potentially divisive vote, which is the worst possible outcome for all
concerned. If both decline the nominations, then the divisive vote will be
avoided in favour of a third agreed candidate the second worst pay-oV for
both, since neither nominee beneWts. If one candidate accepts the nomination
and the other declines in favour of the other candidate, the accepting candidate
beneWts most obviously, but the other candidate retains hopes of an unopposed
nomination the following year. These pay-oVs are represented on Figure 6.5.

103

Mixed-motive games without single equilibrium points: archetype 1 leadership games

Nominee 2

Nominee

Strategy

Decline
nomination

Accept
nomination

Decline
nomination

2, 2

3, 4

Accept
nomination

4, 3

1, 1

Figure 6.5

Pay-off matrix for leadership games.

It can be seen from the matrix that there are no dominant or inadmissible strategies. Neither candidate can select a strategy that will yield the
best pay-oV no matter what the other candidate does. The minimax
principle fails too because, according to it, both candidates should
choose their Wrst strategy (decline the nomination) so as to avoid the
worst pay-oV (1, 1). Yet, if they do this, both candidates regret it once
the others choice becomes known. Hence, the minimax strategies are
not in equilibrium and the solution (2, 2) is not an equilibrium point.
It is unstable and both players are tempted to deviate from it, although
it should be pointed out that the worst case scenario (1, 1) arises when
both players deviate from their minimax strategies.
Despite the failure of both the elimination and the minimax approaches, there are two equilibrium points on the Figure 6.5 matrix. If
nominee 1 chooses to accept the nomination, nominee 2 can do no
better than decline; and if nominee 1 chooses to decline the nomination, nominee 2 can do no better than accept. So there are two
equilibrium points those with pay-oVs (4, 3) and (3, 4).
Unlike zero-sum games, the value of the game is not a constant
because the players do not agree about preferability and the two
equilibrium points are therefore asymmetrical. There is no formal
solution beyond this. Informal factors such as explicit negotiation and
cultural prominence must be explored if a more deWnite outcome
is required. For example, a younger nominee may defer in favour of
an older one in companies where seniority is the prominent basis
for promotion; or the two candidates may negotiate a political
arrangement. Either way, it is in the interests of both players in a

104

Two-person mixed-motive games of strategy

College 2

College
1

Figure 6.6

Strategy

Submit
preferred
calendar

Submit
unpreferred
calendar

Submit
preferred
calendar

2, 2

4, 3

Submit
unpreferred
calendar

3, 4

1, 1

Pay-off matrix for heroic games

mixed-motive game to communicate their intentions to one another,

which is the opposite paradigm to that which prevails in zero-sum
games, and informal considerations are common. Games with this type
of pay-oV matrix are called leadership games (Rapoport, 1967a) because the player who deviates from the minimax strategy beneWts both
self and the other player, but self more and as such is leading from the
front.

Mixed-motive games without single equilibrium points: archetype 2

heroic games
All heroic games have pay-oV matrices like Figure 6.6. Again, an
example should serve to illustrate the main features.
Example 6.2 An heroic game
Two colleges in the same conurbation are required, as far as possible, to
coordinate their closures so that school buses do not have to run unnecessarily,
and they must submit their proposed calendars to the local education authority
by a certain date. End of term opening and closing dates are relatively uncontentious, but a major disagreement has arisen over mid-term closures.
The worst case scenario is that both colleges submit their less preferred
options. If both submit their preferred options, then the outcome is not as bad,
but far from ideal, since the local authority is bound to arrive at some partially
unsatisfactory compromise. Much better if only one college elects to submit its
preferred calendar. It maximises its own pay-oV of course, but the pay-oV for the

105

Mixed-motive games without single equilibrium points: archetype 3 exploitation games

other college reXects its hope of reversing the arrangement next year. The
ordinal pay-oVs are displayed on Figure 6.6.

As was the case with the leadership game described in the previous
example, there are no dominant or inadmissible strategies and the
minimax principle, in which both colleges choose their Wrst strategy
(submit preferred calendar) so as to avoid the worst pay-oV (1, 1) fails.
Again, as was the case with leadership games, the minimax strategies
are unstable and both players are tempted to deviate from it.
There are, nevertheless, two equilibrium points on the Figure 6.6
matrix. If college 2 chooses to submit its preferred calendar, college 1
can do no better than submit its less preferred calendar; and vice versa.
So there are two equilibrium points with pay-oVs (4, 3) and (3, 4).
Again, like leadership games, the value of the game is not a constant
because the players do not agree about preferability.
Games with this type of pay-oV matrix are called heroic games
(Rapoport, 1967a) because the player who deviates from the minimax
strategy beneWts both self and the other player, but beneWts the other
player more and as such is exhibiting heroic unselWsh behaviour.
Like leadership games, there is no formal solution beyond this,
although it is clearly in the interests of both players to communicate
their intentions to one another. Informal considerations suggest that it
is a good strategy to convince the other player of ones own determination! For example, if college 2 convinces college 1 that it has a school
tour abroad planned which is impossible to cancel, college 1 serves its
own interest best by acting heroically and choosing its less preferred
option (Luce & RaiVa, 1989). It can also be seen from this example that
the commonly held notion of keeping all options open is erroneous,
as many game theoreticians have pointed out. Better to adopt a
scorched earth policy, like Napoleons advance on Moscow, or at least
convince the other player of ones intention to do so!

Mixed-motive games without single equilibrium points: archetype 3

exploitation games
Exploitation games have pay-oV matrices like Figure 6.7. The following
example illustrates their common features.

106

Two-person mixed-motive games of strategy

Ericsson
Strategy

Nokia

Figure 6.7

Issue
shares
later

Issue
shares
now

Issue
shares
later

3, 3

2, 4

Issue
shares
now

4, 2

1, 1

Pay-off matrix for exploitation games.

Example 6.3 An exploitation game

Nokia and Ericsson, two publicly quoted Scandinavian companies in the telecommunications sector, are considering a share issue to raise funds for investment. The Wnancial considerations are complicated by uncertainty over the US
Federal Reserves intention regarding money rates. All other things being equal,
it would be better for both companies to wait until the next quarters inXation
Wgures are known.
The worst case scenario (Wnancially) occurs when both companies decide to
issue shares now, since they will both be undersubscribed, and the maximum
pay-oV for a company occurs when it issues shares and the other does not. In
such circumstances, market demand would support a high issue price and the
issuing company would be perceived among investors as the market leader,
though the non-issuing company would also beneWt from increased sector
conWdence. Figure 6.7 is the ordinal pay-oV matrix for the game.

Once again, there are no dominant or inadmissible strategies and the

minimax principle, in which both companies choose to issue later, fails.
Although the minimax strategies intersect at (3, 3), they are unstable
since both players are tempted to deviate from it and both regret their
selections once the others selection becomes known.
The two equilibrium points on Figure 6.7 are the asymmetric ones
with pay-oVs (4, 2) and (2, 4). If Nokia chooses to opt out, Ericsson can
do no better than choose the opposite, since to opt out as well would

107

Mixed-motive games without single equilibrium points: archetype 4 martyrdom games

result in the worst possible pay-oV (1, 1). The converse is true in the
case where Ericsson chooses to opt out.

Games with this type of pay-oV matrix are called exploitation games
(Rapoport, 1967a) because the player who deviates unilaterally from
the safe minimax strategy beneWts only himself and at the expense of
the other player. In addition, in going after the best possible pay-oV, the
deviant risks disaster for both!
Even more than heroic games, it is imperative in games of exploitation that the player who intends to deviate from the minimax convinces the other that he is resolute in his intent. Put crudely, the most
convincing player always wins exploitation games! In addition, the
more games of this sort a player wins, the more likely the player is to
continue winning, since the players seriousness of intent has been
amply demonstrated and the player has become more conWdent. Reputation the sum of a players historical behaviour in previous trials of
the game is everything. As Colman (1982) puts it, nothing succeeds
like success in the Weld of brinkmanship! The more reckless, selWsh and
irrational players are perceived to be, the greater is their advantage in
games of exploitation, since opposing players know that they risk
disaster for everyone if they try to win. This psychological use of
craziness can be seen in terrorist organisations (Corsi, 1981), political
leaders and among small children, though it should be noted that
although the player is perceived to be irrational, he or she is nevertheless
acting rationally throughout with a view to winning the game. (Schelling, 1960; Howard, 1966; Brams, 1990).

Mixed-motive games without single equilibrium points: archetype 4

martyrdom games
Martyrdom games have pay-oV matrices like Figure 6.8. Its most
famous prototype is the prisoners dilemma game, so-called in 1950 by
A.W. Tucker, who two years earlier had convinced the young John
Nash to study at Princeton. It is the most famous and analysed game in
game theory, and the example below is a variation on that well-known
theme.

108

Two-person mixed-motive games of strategy

Stockbroker
Strategy

Lawyer

Figure 6.8

Refuse to
cooperate
with
investigators
Cooperate
with
investigators

Refuse to
cooperate
with
investigators

Cooperate with
investigators

3, 3

1, 4

4, 1

2, 2

Pay-off matrix for martyrdom (or prisoners dilemma) games.

Example 6.4 A martyrdom game

A stockbroker and a company lawyer are suspected of insider trading and are
held in separate oYces to be questioned by investigators from the Metropolitan
Police Fraud Department and the Serious Fraud OYce, London. Evidence from
colleagues is circumstantial and is not suYcient to convict either party unless
one of them incriminates the other. Consequently, investigators oVer them
immunity and the bonus of being regarded by Wnancial institutions as honest
beyond reproach, if they give evidence against the other.
If both refuse to cooperate, they will both get oV with a reprimand, in the
absence of any evidence to impose more serious sanctions. If both cooperate,
they will both be permanently suspended from trading and excluded from
company directorships, but not jailed. If one incriminates the other by cooperating with the investigators, the latter will be jailed and the former will have
obtained the best possible pay-oV that of having his/her reputation for honesty
enhanced (see Figure 6.8).

This type of game is a genuine paradox. The minimax strategies

intersect at (2, 2) and suggest that both players should choose to
cooperate with the investigators. Unlike the other three prototype
games above, this minimax solution does form an equilibrium point,
since neither player can do better by choosing another strategy once the
other players strategy becomes known. For example, even if the stockbroker knew that the lawyer was going to cooperate, the stockbroker
could not do any better by refusing to cooperate. It can also be seen
that the second strategy (cooperation) for both lawyer and broker

109

Mixed-motive games without single equilibrium points: archetype 4 martyrdom games

dominates. So from every point of view, there is a stable minimax

solution at (2, 2).
However and this is the paradox this dominant solution is worse
than the other strategy where both players agree to do the same thing,
i.e. refuse to cooperate with the investigators (3, 3)! It appears that
there is a conXict between individual self-interest and collective selfinterest. Furthermore, the latter strategy, where both players optimise
their collective pay-oVs, (3, 3), is itself unstable, since each player is
tempted to deviate from it. In other words, in the event of both
suspects refusing to cooperate with the investigators, each will regret
doing so after it becomes apparent that the other player has also refused
to cooperate! It appears that neither the individual self-interest Nash
equilibrium at (2, 2) nor the collective self-interest equilibrium at (3, 3)
oVers an acceptable solution.

Games such as this are called martyrdom games (Rapoport, 1967a)

because if both players deviate from the minimax strategy, they are
doing so to beneWt the other as much as self. And yet, the martyr who
defects from this mutuality of martyrdom will always win the game,
guaranteeing a better pay-oV no matter what the other does! Martyrdom games have other unique features too. Unlike leadership, heroic
and exploitation games, martyrdom games do not have pairs of asymmetric equilibrium points and, unlike them too, the worst possible
outcome does not occur when both players choose non-minimax
solutions. In martyrdom games, both players have dominant strategies
and one equilibrium point. If one player deviates from the minimax, he
suVers himself (martyr) and beneWts the other the complete opposite
of exploitation games. And if both players deviate from the minimax
solution, the pay-oV is better for both.
Contrary to what is sometimes falsely described as a condition of the
game, the players may communicate with each other if they so choose
(Aumann, 1989). It makes no diVerence. They might agree to refuse to
cooperate with investigators before the game, but they will still choose
selWshly to cooperate with investigators when faced with the actual
decision, if acting rationally.
It has been suggested that a formal solution to the prisoners dilemma and other martyrdom games can be found with the help of

110

Two-person mixed-motive games of strategy

Stockbroker

Lawyer

Figure 6.9

Strategy

Refuse to
cooperate
regardless
of lawyer

Cooperate
regardless
of lawyer

Choose
same
strategy as
lawyer

Choose
opposite
strategy to
lawyer

Refuse to
cooperate
with the
investigators

3, 3

1, 4

3, 3

1, 4

Cooperate
with the
investigators

4, 1

2, 2

4, 1

Pay-off matrix for the level-one martyrdom metagame.

metagame theory (Howard, 1966). Metagame theory is the construction of any number of higher level games based on the original game. A
player is then assumed to choose from a collection of meta-strategies,
each of which depends on what the other player chooses.
Consider the case outlined in Example 6.4 and represented by Figure
6.8. The lawyer has two pure strategies: to refuse to cooperate and to
cooperate. For each of these, the stockbroker has four meta-strategies:
to refuse to cooperate regardless of what the lawyer chooses,
to cooperate regardless of what the lawyer chooses,
to choose the same strategy as the lawyer is expected to choose, and
to choose the opposite strategy to the one the lawyer is expected to
choose.
This two-by-four matrix constitutes the Wrst-level metagame and is
represented on Figure 6.9. There is an equilibrium point at (row 2,
column 2) with pay-oV (2, 2), since if the lawyer chooses row 1, the
stockbroker should choose column 2 or column 4. In addition, if the
lawyer chooses row 2, the stockbroker should choose column 2 or
column 3. This corresponds to the same paradox point in the original
game and a higher level metagame must be constructed in order to
eliminate it.
Suppose the lawyer selects a meta-strategy depending on which of
the four meta-strategies the stockbroker chooses. The lawyer can
choose:

111

Mixed-motive games without single equilibrium points: archetype 4 martyrdom games

to refuse no matter which of the four columns the broker chooses,

to refuse unless the broker chooses the fourth column,
to refuse unless the broker chooses the third column,
to refuse unless the broker chooses the second column,
to refuse unless the broker chooses the Wrst column,
to refuse if the broker chooses Wrst or second column,
to refuse if the broker chooses Wrst or third column,
to refuse if the broker chooses second or third column,
to refuse if the broker chooses Wrst or fourth column,
to refuse if the broker chooses second or fourth column,
to refuse if the broker chooses third or fourth column,
to cooperate unless the broker chooses the Wrst column,
to cooperate unless the broker chooses the second column,
to cooperate unless the broker chooses the third column,
to cooperate unless the broker chooses the fourth column,
to cooperate no matter which of the four columns the broker
chooses.
This sixteen-by-four matrix constitutes the second-level metagame and
is displayed on Figure 6.10. This metagame has the same paradox
equilibrium point (row 16, column 2), but another two equilibrium
points with pay-oVs (3, 3) have appeared in addition (row 7, column 3)
and (row 14, column 3). They are equilibrium points because neither
player has cause to regret his or her choice of strategy. The paradox
equilibrium point (2, 2) is clearly not the solution to the game, since a
better pay-oV can be obtained for both players from either of the other
two equilibrium points. Of the other two equilibrium points, row 14
dominates row 7, and so row 7 is inadmissible as a strategy for the
lawyer.
So the solution is this. The lawyer should choose to refuse to
cooperate only if he or she expects the stockbroker to choose the same
strategy as the lawyer is expected to make otherwise the lawyer should
choose to cooperate (row 14). The stockbroker should choose the same
strategy as the lawyer is expected to choose (column 3). The value of
the game is 3 for each player.
In theory, there is an inWnite number of metagames and their pay-oV
matrices get very large very quickly. Fortunately, games represented by
two-by-two matrices do not require metagames beyond level two, but
opinion is divided as to whether the concept of metagame is legitimate.

112

Two-person mixed-motive games of strategy

Broker
Strategy

Refuse no matter
which column
the broker choses
Refuse unless the
broker chooses 4

Lawyer

Cooperate
regardless
of the
lawyer

Choose
same
strategy as
the lawyer

Choose
opposite
strategy to
the lawyer

3, 3

1, 4

3, 3

1, 4

3, 3

1, 4

3, 3

4, 1

Refuse unless the

broker chooses 3

3, 3

1, 4

2, 2

1, 4

Refuse unless the

broker chooses 2

3, 3

2, 2

3, 3

1, 4

Refuse unless the

broker chooses 1

4, 1

1, 4

3, 3

1, 4

3, 3

1, 4

2, 2

4, 1

3, 3

2, 2

3, 3

4, 1

1, 4

3, 3

4, 1

3, 3

2, 2

1, 4

4, 1

1, 4

2, 2

1, 4

4, 1

2, 2

3, 3

1, 4

Refuse unless the

broker chooses 1 or
2
Refuse unless the
broker chooses 1 or
3
Refuse unless the
broker chooses 2 or
3
Refuse unless the
broker chooses 1 or
4
Refuse unless the
broker chooses 2 or
4
Refuse unless the
broker chooses 3 or
4
Cooperate unless
the broker chooses 1

3, 3

2, 2

4, 1

Cooperate unless
the broker chooses 2

4, 1

1, 4

2, 2

4, 1

Cooperate unless
the broker chooses 3

4, 1

2, 2

3, 3

4, 1

Cooperate unless
the broker chooses 4

4, 1

2, 2

1, 4

4, 1

2, 2

4, 1

Cooperate no matter
which column
the broker choses

Figure 6.10

Refuse to
cooperate
regardless
of the
lawyer

Pay-off matrix for the level-two martyrdom metagame.

Some theorists (Harris, 1969; Robinson, 1975) dispute the somewhat

Jesuitical solution, while others trumpet its success (Rapoport, 1967b).
Either way, the solution depends on each player being able to predict
what the other player will elect to do and although this is usually

113

Summary of features of mixed-motive prototypes

impossible in real-life situations, the technique at worst sheds light on

what is an interesting category of game. Mixed-motive games like this
are often aVected by what other players are expected to do, making it
important for players to conceal their intentions or to misrepresent
them deliberately; metagame theory, if it does nothing else, illuminates
this deception.
What experimental evidence there is supports the theory outlined
above (Axelrod, 1981, described in Chapter 9). One-oV prisoners
dilemma games show the predicted predisposition to selWshness (2, 2);
and the results from Wnitely repeated prisoners dilemma games, although they reveal a predisposition towards unselWshness (3, 3), can
also be explained by theory and by the somewhat contrived nature of
the situation.

Summary of features of mixed-motive prototypes

Leadership games

There are no dominant or inadmissible strategies.

Heroic games

There are no dominant or inadmissible strategies.

There is a minimax solution, but it is not stable. There is a temptation to deviate.
There are two asymmetrical equilibrium points, which are unstable

114

Two-person mixed-motive games of strategy

because the players do not agree on preferability. The value of the

game is not constant.
The worst case scenario comes about when both players deviate from
the minimax.
There is no formal solution.
The player who deviates from minimax strategy beneWts both self
and the other player, but the other player more.
It is good strategy to convince the other player of ones own determination.
The commonly held notion of keeping all options open is erroneous.

Exploitation games

There are no dominant or inadmissible strategies.

There is a minimax solution, but it is not stable. There is a temptation to deviate.
There are two asymmetrical equilibrium points, which are unstable
because the players do not agree on preferability. The value of the
game is not constant.
The worst case scenario comes about when both players deviate from
the minimax.
There is no formal solution.
The player who deviates unilaterally from the safe minimax strategy
beneWts only self and at the expense of the other player.
In going after the best possible pay-oV, a deviant player risks disaster
for everyone.
The player who intends to deviate from the minimax must convince
the other player of his resolution in that regard.
The more games of this sort a player wins, the more likely the player
is to continue winning.
Players perceived as reckless, selWsh and irrational have an advantage.
Martyrdom games

The minimax strategies are dominant at the one and only equilibrium point, but paradoxically this pay-oV is worse for both players
than their inadmissible strategies.

115

The Cournot, von Stackelberg and Bertrand duopolies: an application of mixed-motive games

The value of the game is constant.

If both players deviate from the minimax strategy, they beneWt the
other as much as they beneWt self. If one player deviates from the
minimax, he suVers himself and beneWts the other player.
A formal solution can (possibly) be found with the help of metagame
theory. Games represented by two-by-two matrices do not require
metagames beyond level two.
Martyrdom games represent the clash of individual and collective
rationality.

The Cournot, von Stackelberg and Bertrand duopolies: an interesting

application of mixed-motive games
Oligopoly is a term used to describe the situation where a number of
organisations dominate a particular market and where the competitors
are interdependent. As a result of that interdependence, the behaviour
of one organisation aVects the proWts made by other organisations in
the same sector. Oligopoly comes somewhere between the extremes of
monopoly and pure market, but is distinguished from both by the
condition of interdependency described above. In a pure market, all
Wrms are independent of each other because the price of any commodity is set solely by competition; and in a monopolistic situation, by
deWnition, there is no interdependence since there is only one producer.
The paper and packaging industry, dominated by two Wrms, SmurWt-Stone and International Paper, is an example of a duopoly an
oligopoly of two organisations. Their proWts (and share prices) are
largely determined by whether or not the market can sustain linerboard price increases, which in turn is determined by overall levels of
production. Clearly, the behaviour of either company inXuences the
well-being of both.
There are three classic duopolistic models von Stacklberg, Cournot
and Bertrand and they attempt to explain and predict the behaviour
of organisations in such circumstances. In their simplest form, they are
treated as one-oV games and are distinguished from each other by their
market structure. In the Cournot model, organisations compete simply
in terms of production levels. In the von Stackelberg model, at least one

116

Two-person mixed-motive games of strategy

organisation pre-commits to a particular level of production and the

other players respond to it. In the Bertrand model, organisations
compete simply in terms of the price they charge customers.
The Cournot and Bertrand models of duopoly are examples of static
or simultaneous games of complete information ones in which players
simultaneously choose courses of action and each players pay-oV
structure is common knowledge across the game, like a sealed-bid
auction. This does not necessarily mean that all decisions are made
literally at the same time, but rather as if the decisions were made at the
same time (Gibbons, 1992a). The von Stackelberg model of duopoly,
on the other hand, is an example of a dynamic or sequential game of
perfect information. It is a game in which each player knows both the
pay-oV structures and the history of the game thus far, like an English
auction, and the game has a sequence to its order of play, since players
can observe the actions of others before deciding upon their own
optimal responses.
The Cournot duopoly

The Cournot duopoly, as originally conceived, describes how two

manufacturers selling the same product can settle on their respective
factory output levels so as to maximise their own proWts (Cournot,
1838). Aggregate supply is determined and the price is set, though not
in any illegal sense, so that supply just meets demand, assuming that
each Wrm supplies the market without observing the others level of
production.
The following example illustrates the features of the Cournot and
von Stackelberg duopolies.
Example 6.5 The paper and packaging sector as a Cournot and von Stackelberg duopoly
SmurWt-Stone and International Paper dominate the global paper and packaging sector, though they are of slightly diVerent size. SmurWt-Stone produces R1
tonnes of linerboard every year and International Paper produces R2 tonnes. The
price per tonne (P) charged to customers is a function of production output; the
greater the total output of linerboard (R : R1 ; R2), the lower the price
charged. They are related linearly and negatively as:
P(R) : A 9 R, where A is some Wxed constant.

117

The Cournot, von Stackelberg and Bertrand duopolies: an application of mixed-motive games

If c1 is the marginal cost of production per tonne of linerboard for SmurWtStone and c2 is the marginal cost of production per tonne for International
Paper, both constants for the year, how many tonnes should each Wrm produce
in order to maximise proWt? The New York Stock Exchange, of which both Wrms
are conscientious members, stipulates that Wrms must set their price structures
and production levels independently.

SmurWt-Stone and International Paper are not strictly in competition

with each other, but are partly competing and partly cooperating in a
market. Although neither Wrm could alone cater for the entire market,
it can be assumed that each Wrm could supply any non-negative level of
output within reason, so the duopoly can be modelled as a mixedmotive game. Each Wrm needs to maximise proWt subject to what the
market will take. Once SmurWt-Stone and International Paper have
decided on their respective optimal levels of production, the market
price is made, the pay-oVs (proWts) are eVectively determined and the
game is assumed to be over. (The game is assumed to be a one-oV.)
The cost to SmurWt-Stone of producing R1 tonnes is c1R1 and the cost
to International Paper of producing R2 tonnes is c2R2, assuming no
Wxed costs. Therefore, the proWt functions are:
1 : (A 9 R)R1 9 c1R1
2 : (A 9 R)R2 9 c2R2
where 1 represents SmurWt-Stones proWt and 2 represents that of
International Paper. (Notice that the proWt for each Wrm depends on
the output of the other Wrm as well as its own, since R : R1 ; R2.)
Substituting for R gives:
1 : (A 9 R1 9 R2)R1 9 c1R1
: AR1 9 R12 9 R1R2 9 c1R1
and
2 : (A 9 R1 9 R2)R2 9 c2R2
: AR2 9 R22 9 R1R2 9 c2R2
The Wrms choose their strategies independently and simultaneously, so
the concept of the Nash equilibrium oVers a solution. This involves
drawing each Wrms reaction function, which is a curve that shows every

118

Two-person mixed-motive games of strategy

optimal production level for every possible production level of the

other Wrm.
Each Wrms reaction function can be found by diVerentiating its
proWt function with respect to output, giving:
1
: A 9 2R1 9 R2 9 c1
R1
and
2
: A 9 2R2 9 R1 9 c2
R2
If

1
2
: 0 and
:0
R1
R2

then
2R1 ; R2 : A 9 c1
and
R1 ; 2R2 : A 9 c2
where A, c1 and c2 are constants. These two equations are the reaction
functions and reveal that the optimal level of production for each Wrm
is negatively related to the expected level of supply from the other. Note
also that:
21
22
:92
and
:92
R12
R22
which indicate local maxima.
The method of simultaneous equations,
4R1 ; 2R2 : 2A 9 2c1
R1 ; 2R2 : A 9 c2
produces the Nash solutions
RN1 :
and

A ; c2 9 2c1
3

119

The Cournot, von Stackelberg and Bertrand duopolies: an application of mixed-motive games

RN2 :

A ; c1 9 2c2
3

The total output at the Nash equilibrium is therefore:

RN : RN2 ; RN2
2A 9 c1 9 c2
:
3
and the maximum price that can be aVorded by the market is:
P(R) : A 9 RN
A 9 2A 9 c1 9 c2
:
3
A 9 c1 9 c2
:
3
This solution, being a Nash equilibrium, means that both Wrms are
content operating on these parameters, since neither feels a tendency to
deviate.
Figure 6.11 represents the Cournot duopoly diagrammatically the
scales assume that c1 and c2 are of the same order of magnitude, though
not necessarily equal and further analysis is interesting, if a little
complicated.
The two reaction function lines, having negative gradients, demonstrate the fact that the optimal level of production (and hence proWt)
for one Wrm is negatively related to the levels of production of the
other. The semicircles on Figure 6.11 represent a pair of the Wrms
iso-proWt curves, which plot the diVerent combinations of production
levels that yield the same proWt for each Wrm in turn. Not surprisingly,
proWts are greatest when each Wrm is the sole supplier, in which case it
supplies (A 9 c1)/2 or (A 9 c2)/2 tonnes of linerboard to the market
the intercept value of the respective reaction functions.
The sets of iso-proWt curves for SmurWt-Stone and International
Paper are centred, respectively, at the Nash equilibrium coordinates
(A 9 2c1 ; c2)/3 and (A ; c1 9 2c2)/3 and these production levels
represent the solution. The further away a semicircle is from its axis
from its monopolistic situation the lower the Wrms proWt. At the
Nash equilibrium, marked N on Figure 6.11, both Wrms are maximising
proWts simultaneously, so the equilibrium is unique and the tangents to

120

Two-person mixed-motive games of strategy

R2
A c1
Reaction function for
Smurfit-Stone

(A c 2 )/2
C

(A + c 1 2c2 ) / 3

Nash equilibrium (also the Cournot

equilibrium)

Reaction function for

International Paper

(A c 1) / 2
(A
Figure 6.11

2c 1 + c 2) / 3

The CournotNash equilibrium.

the two iso-proWt curves are mutually perpendicular.

Although the concept of the Nash equilibrium has provided a
solution, Cournot arrived at the same solution in a diVerent way. He
analysed how the two Wrms would behave if they were out of equilibrium. Figure 6.12 shows the reaction functions for SmurWt-Stone and
International Paper. Say initially that SmurWt-Stone is operating a
monopoly, it will produce (A 9 c1)/2 tonnes of linerboard. If International Paper now enters the market and assumes that SmurWt-Stone
will maintain that production level, it will produce at a level vertically
above that point, but on its own reaction curve. This point is oV the
SmurWt-Stone reaction curve, so SmurWt-Stone will move its production level horizontally onto its own reaction curve. This incremental
process continues, in a zigzag fashion, until the Cournot equilibrium
point (C) is reached, at the intersection of the two reaction curves.
The Nash and Cournot approaches produce the same solution, but
the Cournot concept is weaker because it supposes that one Wrm is able
to react to the other Wrms entry to the market, contradicting the
games assumption of static-ness or simultaneity. The concept of

121

The Cournot, von Stackelberg and Bertrand duopolies: an application of mixed-motive games

R2
A c1

Reaction function for

Smurfit-Stone

(A c2 ) / 2
Cournot equilibrium
C

Reaction function for

International Paper

(A
Figure 6.12

c1 ) /2

Arriving at the Cournot equilibrium.

Cournot balance is reasonable, but that of Cournot imbalance is not

(Romp, 1997).
The NashCournot equilibrium is not pareto-eYcient. It is not the
case that one player is made better oV at the expense of the other. If the
two Wrms could cooperate, they would both increase their combined
proWt. Diagrammatically, the Nash equilibrium is ineYcient because
the iso-proWt curves are not tangential there, which in turn means that
there are other production combinations that weakly dominate the
Nash combination, i.e. one Wrm is better oV and the other is no worse
oV. The combinations for which the iso-proWt curves are tangential
constitute the area denoted by T on Figure 6.11, which is bounded by
the two iso-proWt curves that intersect at the Nash equilibrium. Inside
T, both Wrms are better oV (strict dominance) and on the boundary,
one Wrm is better oV and the other is no worse oV (weak dominance).
Figure 6.11 also shows the contract line, C, which represents all the
points where the outcomes are pareto-eYcient. This line therefore
represents a zero-sum (or a constant-sum) sub-game of the Cournot
duopoly. If the two Wrms could coordinate their production levels, they
would maximise joint proWt by choosing strategic combinations on
this line. Given that they would also choose strategies inside area T, the

122

Two-person mixed-motive games of strategy

optimal combination would be both on C and inside T. Naturally, the

particular point chosen from this inWnite set would be the one that
divides income in the fairest way equally if the two Wrms are equally
powerful. However, not being a Nash equilibrium and being oV the
reaction function lines, this optimal mid-point of C in T would be
unstable because each Wrm would have an incentive to deviate from it.
This is analogous to Martyrdom games like the prisoners dilemma,
discussed already in this chapter, and explains why cartels, both legal
and illegal, tend to be unstable each party to the collusion has an
incentive to deviate unilaterally.
The von Stackelberg duopoly

In the von Stackelberg (1934) model, at least one organisation precommits to a particular level of production and the other responds to
it. Let us assume, for the purposes of this example, that the precommitting Wrm is SmurWt-Stone (which becomes the market leader)
and the responding Wrm is International Paper. Unlike the Cournot
duopoly model, the von Stackelberg model is a dynamic game, where
International Paper can observer the actions of SmurWt-Stone before
deciding upon its optimal response. Unlike static (or simultaneous)
games, dynamic (or sequential) games carry the tactical possibility of
giving false information and the need to conceal ones true intentions
from the other player. If either Wrm believes the false production levels
of the other, no matter how unlikely, the game will have multiple Nash
equilibria. For example, International Paper may threaten to Xood the
market with linerboard in the hope that SmurWt-Stone will reduce its
production to zero in response which it will if it believes the threat
and thereby produce a Nash equilibrium. And yet, such a threat is
illogical, since it would not be in either players interests to carry it
through. To exclude such idle production threats, the von Stackelberg
model imposes the condition that the predicted outcome of the game
must be sub-game perfect that the predicted solution to the game
must be a Nash equilibrium in every sub-game.
The method of backward induction may be applied to this von
Stackelberg game, starting with International Papers output response
decision, which is its attempt to maximise its own proWt, 2 , given by:
2 : (A 9 R)R2 9 c2R2

123

The Cournot, von Stackelberg and Bertrand duopolies: an application of mixed-motive games

: (A 9 R1 9 R2)R2 9 c2R2
: AR2 9 R22 9 R1R2 9 c2R2
DiVerentiating with respect to R2 and setting equal to zero for a
maximum, gives:
2
: A 9 2R2 R1 9 c2
R2
then
R1 ; 2R2 : A 9 c2

(1)

This is International Papers reaction function and to SmurWt-Stone,

the only believable threat about production that International Paper
can make will be on this function line.
Continuing the method of backward induction. SmurWt-Stone
knows that the eventual outcome must be on International Papers
reaction function line and so it must attempt to maximise its own
proWt, 1, subject to that constraint:
1 : (A 9 R)R1 9 c1R1
: (A 9 R1 9 R2)R1 9 c1R1
: AR1 9 R12 9 R1R2 9 c1R1
Substituting for R2 from equation (1) gives:
1 : AR1 9 R12 9 R1[(A 9 c2 9 R1)/2] 9 c1R1
which diVerentiates with respect to R1 as:
1
: A/2 9 R1 9 c1 ; c2/2
R1
Equating this to zero gives the von StackelbergNash equilibrium level
of production of linerboard for SmurWt-Stone:
R1 : (A 9 2c1 ; c2)/2

(2)

Substituting (2) into (1) gives International Papers optimal production:

R2 : (A ; 2c1 9 3c2)/4

(3)

Looking at Equations (2) and (3), it can be seen that, if the marginal
cost of production for both Wrms is the same (equal to c say), R2 will be

124

Two-person mixed-motive games of strategy

A c1
Reaction function for
Smurfit-Stone

(A c 2 ) / 2
(CN)

(A + c1 2c 2 ) / 3

( Cournot Nash equilibrium )

von Stackelberg Nash equilibrium

(A + 2c1 3c 2 ) / 4
Reaction function for International Paper

R1
(A c1 ) / 2
(A 2c1 + c 2) / 3
Figure 6.13

The von StackelbergNash equilibrium

half of R1 and total production of linerboard will be given by the

equation:
R : 3(A 9 c)/4

(4)

Given that SmurWt-Stone knows that International Paper is managed

rationally and therefore will produce according to its own reaction
function, SmurWt-Stone will maximise proWts where its iso-proWt curve
is tangential to International Papers reaction function (see Figure
6.13).
SmurWt-Stone would, of course, ideally like to have a monopoly on
linerboard production. On Figure 6.13, this is the point where the
SmurWt-Stone reaction function crosses the R1 axis and has the value
(A 9 c1)/2 tonnes. However, if International Papers output is on its
own reaction line, the lowest iso-proWt curve SmurWt-Stone can reach
is the one through the von StackelbergNash equilibrium point,
marked SN on Figure 6.13. EVectively, at the von StackelbergNash
equilibrium point, SmurWt-Stone is producing a monopoly-level out-

125

The Cournot, von Stackelberg and Bertrand duopolies: an application of mixed-motive games

put of linerboard, but is not making monopoly-level proWts because

International Papers non-zero production level pushes down the market price.
The CournotNash equilibrium point is also shown on Figure 6.13,
marked CN, so that the reader can see that the von StackelbergNash
equilibrium point means higher production and more proWt for the
pre-committing Wrm and less for the responding Wrm. The responding
Wrm is actually worse oV knowing what the market leader is doing and,
conversely, the pre-committing Wrm has made itself better oV by
pre-committing to a certain level of production. As Romp (1997)
rightly states, the von Stackelberg duopoly model is one in which the
Wrm with the Wrst move has the advantage.

The Bertrand duopoly

The strategic variable for Wrms in both the Cournot and the von
Stackelberg duopolies is the level of production (R). In the Bertrand
duopoly (1883), the strategic variable is the price charged in the
marketplace (P). The Wrms simultaneously decide their pricing structures and market forces then decide how much product is absorbed.
Like the Cournot duopoly, the Bertrand duopoly is a static game, but
one in which the two Wrms compete in terms of the price they charge
customers, rather than production levels.
Consider the following example.
Example 6.6 The UK supermarket sector as a Bertrand duopoly
Sainsbury and Tesco dominate the UK food supermarket sector. Competition is
so Werce and proWt margins so thin that it has been termed a price war in the
Wnancial press. Suppose Sainsbury decides to sell a quantity R1 of some product
at a price P1 and Tesco decides to sell a quantity R2 of the same or some other
product at a price P2. Let 1 represents Sainsburys proWt and2 represents that
of Tesco. How would the two supermarket chains set their prices so as to
maximise proWts in a stable market?

In the Bertrand duopoly model, stability depends on whether or not

the products sold by the competing Wrms are identical. Consider Wrstly
the case where the product lines are indistinguishable. If the food

126

Two-person mixed-motive games of strategy

products are identical, customers will only buy from the supermarket
that oVers the lowest price. Say, for the purposes of this example, that
Tesco initially oVers lower prices and makes higher than normal proWts.
It gains a monopoly, although Sainsbury is eventually forced to challenge it by undercutting prices in an attempt to win some market share
for itself. However, if Tesco initially oVers lower prices and makes lower
than normal proWts or none at all, then it must raise its prices to normal
proWt levels or go out of business. So, either way, it is clear that
charging diVerent prices never results in a Nash equilibrium for competing Wrms in a Bertrand duopoly.
If the food products are identical and both supermarkets charge the
same prices, and if each supermarket is making higher or lower than
normal proWts, then each will have an incentive to deviate. One will
slightly undercut the other to increase market share if it is making
higher than normal proWts; and it will slightly overcharge the other to
increase proWt margins if it is making lower than normal proWts.
So the only Nash equilibrium is where both Wrms charge the same
prices and make normal proWts. The situation where as few as two Wrms
make a competitive outcome without any collusion to increase proWts
above the normal, is known as the Bertrand paradox.
One way to overcome the Bertrand paradox is to have Wrms sell
distinguishable products. If product lines are distinguishable, Tesco
and Sainsbury face a negative demand curve and their interdependency
is not as strong as when they sold identical product lines. If Sainsbury
decides to sell at a price P1 0 and Tesco decides to sell at a price P2
0, then the Bertrand duopoly model assumes that customers will
demand quantities:
R1 : A 9 P1 ; BP2
and
R2 : A 9 P2 ; BP1
from each of the two supermarkets, respectively; where A is a constant
as in the previous duopoly models and B is a constant that reXects the
extent to which Sainsburys products are substitutes for Tescos and
vice versa. These two equations, called the demand functions for the two
Wrms, are somewhat unrealistic, however, because demand for one
supermarkets product is positive even when it charges an arbitrarily

127

The Cournot, von Stackelberg and Bertrand duopolies: an application of mixed-motive games

high price, provided the other supermarket charges a high enough

price (Gibbons, 1992a). As will be shown below, this only makes sense
if:
B2
The proWt functions for the supermarkets are:
1 : P1R1 9 c1R1
2 : P2R2 9 c2R2
Substituting for R1 and R2 gives:
1 : (A 9 P1 ; BP2)P1 9 c1(A 9 P1 ; BP2)
: AP1 9 P12 ; BP1P2 9 c1A ; c1P1 9 c1BP2
and
2 : (A 9 P2 ; BP1)P2 9 c2(A 9 P2 ; BP1)
: AP2 9 P22 ; BP1P2 9 c2A ; c2P2 9 c2BP1
Each supermarkets reaction function can now be found by diVerentiating its respective proWt function with respect to price:
1
: A 9 2P1 ; BP2 ; c1
P1
and
2
: A 9 2P2 ; BP1 ; c2
P2
If

1
2
: 0 and
:0
P1
R2

then
P1 : (A ; BP2 ; c1)/2
P2 : (A ; BP1 ; c2)/2
And since
21
22
:92 and
:92
P1
P2
local maxima are indicated (see Figure 6.14).

128

Two-person mixed-motive games of strategy

Reaction function
for Tesco

T
A+c
2 B

Bertrand

Nash equilibrium

(A + c ) / 2
(B = 0)
Reaction function for Sainsbury

P1
(A + c ) / 2
(B = 0)
Figure 6.14

A+c
2 B

(Assumes c1 = c 2 = c)

The BertrandNash equilibrium.

The method of simultaneous equations,

2P1 9 BP2 : A ; c1
9BP1 ; 2P2 : A ; c2
produces the BertrandNash equilibrium, which has solutions:
P1 :

A(B ; 2) ; 2c1 ; Bc1

4 9 B2

and
P2 :

A(B ; 2) ; Bc1 ; 2c2

4 9 B2

If c1 : c2, these two solutions reduce to:

P1 : P2 : (A ; c)/(2 9 B)
Note that since the price charged to customers cannot be negative,
B 2.

129

Solving games without Nash equilibrium points using mixed strategies

The reaction curves for the Bertrand duopoly have positive gradients, unlike those for the Cournot and von Stackelberg models. They
are positively related and are said to complement each other strategically. In the case of the Cournot and von Stackelberg duopolies, the
reaction functions have negative gradients. One Wrms output causes
the other Wrm to decrease its output. In such cases, the reaction
functions are said to substitute for each other strategically .
To maximise their proWts and to arrive at the BertrandNash equilibrium, both supermarket Wrms must be on their reaction function
lines (marked BN on Figure 6.14). As with the previous duopoly
models, the Nash equilibrium point is not pareto-eYcient, since both
Sainsbury and Tesco could make higher proWts if they set higher prices.
This set of possibilities is shown marked T on Figure 6.14, but since
each Wrm has an incentive to deviate from these arrangements, they do
not oVer a more likely solution than the Bertrand-Nash equilibrium.
Pareto-ineYciency is a feature of all three duopoly models. If the
competitors collude they can increase proWts, but since this requires (at
best) an agreement that is diYcult to enforce given the incentives to
deviate from it, and (at worst) an illegal cartel, such solutions are not
realisable in practice.

Solving games without Nash equilibrium points using mixed

strategies
Unfortunately, not every mixed-motive game has a Nash equilibrium,
though some can nevertheless be solved fairly easily. Consider the
matrix on Figure 6.15.
Clearly, there is no Nash equilibrium for this game. If player 1
chooses r1, player 2 will choose c1. However, if player 1 thought that
player 2 was going to choose c1, he or she would have chosen r2, in
which case player 2 would select c2; in which case, player 1 would revert
to r1; and so on, ad nauseam. All that can be concluded from this
terrible circularity is that if either side reveals its intent, it will guarantee
a bad pay-oV. Each player should keep the other guessing and the best
way to do this is for both players to randomise their strategic selections.
Random strategies thus selected mixed strategies have already been
considered for the simple case of two-person zero-sum games, where

130

Two-person mixed-motive games of strategy

Player 2

Player
1
Figure 6.15

Strategy

1, 4

3, 0

2, 1

1, 2

A two-person mixed-motive game with no Nash equilibrium point.

Player 2
Strategy

:
:
:

u1(rm, cn), u2(rm, cn)

The general matrix for mixed strategies in mixed-motive games.

the same pay-oV matrix represents both players (Example 5.3). The
general case for mixed-motive games must now be considered.
Suppose player 1 has m strategies
S1 : r1, r2, . . . , rm
and player 2 has n strategies
S2 : c1, c2, . . . , cn
and u1(ri, cj) represents the pay-oV to player 1 when player 1 chooses
strategy ri and player 2 chooses strategy cj, then the game can be
represented by the m ; n matrix shown on Figure 6.16.
An abbreviated version of the matrix is shown on Figure 6.17, where
Uij is the utility pay-oV function for player 1 for strategies ri and cj ; and
Vij is the utility pay-oV function for player 2 for strategies ri and cj.
A re-deWnition of the Nash equilibrium can now be made for such a

131

Solving games without Nash equilibrium points using mixed strategies

Player 2
Strategy

........

U 11, V 11

U 12, V12

........

U1n , V1n

Player

U 21, V 21

U 22, V22

........

U2n , V2n

:
:
:

Um 1, Vm 1

Figure 6.17

:
:
:
........

:
:
:

Um n , Vmn

The abbreviated general matrix for mixed strategies in mixed-motive games.

matrix. It is a pair of strategies (ri, cj) such that:

Uij is maximum in its column cj, and
Vij is maximum in its row ri.
A mixed strategy for player 1 is a selection of probabilities pi such that
0 O pi O 1 and
pi : 1, for i : 1 to m
A mixed strategy for player 2 is a selection of probabilities q j such that
0 O q j O 1 and
q j : 1, for j : 1 to n
A mixed strategy becomes pure if pi (or q j) is 1 and the probabilities
assigned to every other strategy are zero.
If player 1 and player 2 choose their strategies according to their
mixed strategies (p and q, respectively), each has no way of knowing the
other players strategy, though the question remains, of course, as to
how these mixed strategies and resulting pay-oVs are calculated.
If player 2 chooses strategy cj, then player 1 has a pay-oV:
piuij, for i : 1 to m
However, player 2 is selecting strategy cj with a probability q j, so the
cumulative pay-oV for player 1 is:
1 :piq juij, for i : 1 to m and for j : 1 to n
and for player 2:

132

Two-person mixed-motive games of strategy

2 :piq jvij, for i : 1 to m and for j : 1 to n

The form of the game has now been changed to a game of mixed
strategies and although the original game in strategic form did not have
a Nash equilibrium, the game in its new form does! In fact, every game
has a Nash equilibrium in mixed strategies (Nash, 1951), and pure
strategies, when they exist, are just special cases in which the probabilities of all the players strategies are zero, except for one, which is unity.
Of course, every Nash equilibrium in pure strategies is also a Nash
equilibrium of the game in mixed strategies.
Mixed strategies and the Nash equilibrium pay-oV can be calculated
from the partial derivative equations:
1
:0
pi
and
2
:0
pi
as the following example illustrates.
Example 6.7 Mutual societies and life companies changing status: a case study
In 2000, a group of policyholders of Scottish insurance giant Standard Life
campaigned for it to convert to a public company and pay windfalls to each of its
2.3 million member policyholders, averaging 2500 apiece. In June of that year,
the company announced that only 46% of members had voted for demutualisation, far fewer than the 75% needed to secure victory. However, a group of
dissatisWed policyholders continued their campaign on the basis that the company directors were overly conservative in the value they placed on the business
(12 billion) and hence on the potential pay-oVs for members.
Some members favoured demutualisation, some wished to remain a mutual
society and others remained uncommitted. The concerns of those uncommitted
policyholders centred largely on whether or not investment and Wnancial services would continue to be customer-focused. Independent research carried out
on behalf of one of the action groups suggested that each factions ordinal
pay-oVs, in terms of winning over the uncommitted, were as represented on
Figure 6.17.
If both factions (the pro-mutual and the pro-change) oVered a community-

133

Solving games without Nash equilibrium points using mixed strategies

focused bias, the pro-mutual lobby would do well at the expense of the other,
since community-focus was already the strength of the status quo, although the
pro-change lobby would gain some small measure of credibility (1, 4). If both
factions suggested a criterion-based focus for future business, the pro-mutual
faction would just about prevail (1, 2). However, if the pro-mutual lobby oVered
a criterion-based focus, and the pro-change lobby did not, the latter would fare
better, since it would totally undermine the argument for mutuality (3, 0). And
if the pro-mutual group oVered a continuation of community-focused service
and the pro-change group oVered a change to criterion-based service, the vote
would probably go the way of change (2, 1).
As things turned out, the action failed and Standard Life remains a mutual
society to this day, but what strategy should each side have adopted, if they had
accepted this analysis?

The game represented by Figure 6.18 has no Nash equilibrium in pure

strategies, but it does in mixed strategies and so can be solved.
Let the probabilities p1, p2, q1 and q2 be as indicated, with:
p2 : 1 9 p1 and q2 : 1 9 q1
If 1 represents the expected pay-oV functions for the prodemutualisation (pro-change) lobby and 2 represents the expected
pay-oV functions for the pro-mutual lobby, then:
1 : p1q11 ; p1q23 ; p2q12 ; p2q21
: p1q1 ; 3p1(1 9 q1) ; 2 (1 9 p1) q1 ; (1 9 p1)(1 9 q1)
:93p1q1 ; 2p1 ; q1 ; 1
and
2 : p1q14 ; p1q20 ; p2q11 ; p2q22
: 4p1q1 ; (1 9 p1) q1 ; (1 9 p1)(1 9 q1)2
: 5p1q1 9 2p1 9 q1 ; 2
So,
1
: 3q1 ; 2 : 0
p1
and
2
: 5p1 9 1 : 0
q2

134

Two-person mixed-motive games of strategy

Pro-mutual lobby
Strategy

Pro-change
lobby

Figure 6.18

Community-focused
business

Criterion-based
business

Communityfocused
business

1, 4

3, 0

Criterionbased
business

2, 1

1, 2

p2 = 1 p1

q2 = 1 q1

Assigned
probabilities

Pay-off matrix for a mutual society changing status.

which assigns the following probabilities to the strategies:

p1 : 1/5; p2 : 4/5; q1 : 2/3; and q1 : 1/3
and the pay-oV value of the game for each player, back substituting
these four values, is:
1 : 5/3
2 : 8/5
Therefore, the pro-change lobby should have adopted the community-focused business strategy with a probability of 1/5 and the criterion-based business strategy the rest of the time. The pro-mutual
lobby should have adopted the community-focused business strategy
with a probability of 2/3 and the criterion-based business strategy the
rest of the time.
It may seem a strange concept, even counter intuitive, to advocate
randomising strategic decisions like this. Organisational culture is one
where decision makers are often inclined towards advocacy, sometimes
at the expense of the game itself, so it should be understood that the
randomisation of strategies is merely a subterfuge for ones intentions,
since to reveal them is to lose beneWt. The extent of the randomisation
is then a reXection of the relative merits of each position. It may be
Machiavellian, but it is not irrational!

Repeated games

Life is an offensive, directed against the repetitious mechanism of the universe.

A.N. Whitehead 1933 Adventures of Ideas

In everyday life, when people interact, they usually do so as part of a

developing dynamic relationship, and when people interact with organisations, they do so on a continuous basis rather than as a series of
one-oV events. In such circumstances of repeated interaction, individual players learn to coordinate their strategies so as to avoid ineYcient
outcomes. This chapter examines how such repeated dynamic games
can be analysed and how repetition aVects those outcomes. It examines
the important concepts of credibility, threat and sub-game perfection
as applied to dynamic games.
Initially, the chapter examines inWnitely repeated games where the
one-oV game has a unique Nash equilibrium. It is demonstrated that,
provided players do not discount future returns too much, a
cooperative non-collusive outcome can be sustained, but that this
result collapses if the game is Wnitely repeated. The chapter then goes
on to look at Wnitely repeated games where the one-oV game has a
unique Nash equilibrium and examines the paradox of backward
induction and four proposals for avoiding it bounded rationality,
multiple Nash equilibria, uncertainty, and incompleteness of information.

Infinitely repeated games

Consider a martyrdom or prisoners dilemma type game with cardinal,
rather than ordinal, pay-oVs (cf. Example 6.4; Figure 6.8).
135

136

Repeated games

BUPA

GHG

Strategy

Large
subsidy
for NHS

Small
subsidy
for NHS

Large
subsidy
for NHS

20, 20

40, 10

Small
subsidy
for NHS

10, 40

30, 30

Figures represent profits in millions of pounds sterling.

Figure 7.1

Pay-off matrix for two private healthcare rms.

Example 7.1 Funding publicity

Two Wrms dominate the UK private healthcare industry: the General Healthcare
Group (GHG) and British United Provident Association (BUPA) (Source: Laing
& Buisson). The more the state-run National Health Service (NHS) uses private
healthcare facilities to make up the shortfall in public health provision, the
greater the proWts made by the two Wrms. However, the relationship is far from
simple. The private healthcare Wrms actually subsidise the cost of healthcare to
the state by selling at a discount, but their involvement and the good publicity
surrounding it enhances their proWts by attracting more subscribing customers.
Generally, each Wrms proWt depends negatively on how much the other
company subsidises NHS provision. The more GHG subsidises it, the fewer
recruits BUPA gets, and vice versa. If matters are simpliWed by categorising
subsidies to the NHS as either large or small, then the pay-oVs will be as
shown on Figure 7.1 (Wgures represent proWt in millions of pounds sterling).

A comparison between Example 7.1 and Example 6.4 in the previous

chapter reveals the similarity. In both cases, higher pay-oVs are more
desirable than lower ones (though some models of the prisoners
dilemma game focus on the penalties inXicted in each eventuality, in
which case lower values indicate more preferred outcomes).
It can be seen that both Wrms are better oV if they oVer small
subsidies to the NHS, (30, 30), as opposed to big subsidies, (20, 20),
because if both Wrms oVer large subsidies simultaneously, market share
will be unaVected and proWts will fall. Nevertheless, each Wrm has an

137

Infinitely repeated games

incentive to increase its level of subsidy above the other (10, 40) and
(40, 10).
As with the prisoners dilemma, if this game is played only once there
is a Nash equilibrium where the minimax strategies intersect, at (20,
20). Neither Wrm can do better by choosing another strategy once the
other Wrms strategy becomes known. However, this dominant solution is worse than the other strategy where both Wrms do the same
thing, (30, 30), and the problem for competing Wrms is how to coordinate their strategies on the optimal outcome, (30, 30), without priceWxing. In the one-oV game this is not possible as there is a clear
incentive to increase subsidies. However, if the interaction between
BUPA and GHG is inWnitely repeated, it is possible for the two Wrms to
coordinate their actions on the pareto-eYcient outcome.
Two concepts that of adopting a punishing strategy and not
discounting the future too much help explain how and why this
happens. A punishing strategy is one where a player selects a strategy
based purely on what the other player has done, in order to punish him
if he deviates from the pareto-eYcient outcome. The shadow of the
gallows deters players from deviation and the pareto-eYcient outcome
can thus be maintained indeWnitely. Of course, the punishment and the
punisher must both have credibility, so it must be in the interests of the
punisher to punish the deviant player if and when the need arises.
A punishment strategy will only be eVective if it is part of the
sub-game perfect Nash equilibrium for the entire game. In Example
7.1, it could be that each Wrm starts with small-subsidy strategies and
that this arrangement is allowed to persist as long as no one deviates
from the status quo. If, however, either Wrm adopts a large-subsidy
strategy, then in that event the opposing Wrm guarantees to undertake
large-subsidy strategies ever after.
This particular type of punishment strategy is known as a trigger
strategy, where the actions of one player in a game cause the other
player permanently to switch to another course of action. In the case of
Example 7.1, it threatens an inWnite punishment period if either player
opts for a large-subsidy strategy. Once one Wrm increases its level of
subsidy, the other Wrm guarantees to do the same thereafter, thus
precluding the possibility of ever returning to the pareto-eYcient
outcome. The Wrm that Wrst adopts the large-subsidy strategy will
increase proWts from 30m to 40m in the initial period, but will drop

138

Repeated games

to 20m per annum thereafter. The game will reach equilibrium at (20,
20), a sub-optimal outcome for both parties.
For a trigger strategy to maintain a pareto-eYcient outcome (30,
30) in the above example both the punishment and the agreement to
maintain the pareto-eYcient outcome must not be ridiculous. In
Example 7.1, the threat of punishment is perfectly reasonable because if
one Wrm switches to a large-subsidy strategy, then it is rational for the
other Wrm to also switch to a large-subsidy strategy, since that move
guarantees to increase the latters proWt from 10m to 20m. The
punishment strategy corresponds to the Nash equilibrium for the
one-oV game. This is always credible because, by deWnition, it is the
optimal response to what is expected of the other player.
The promise to maintain the implicit agreement of small-subsidy
strategies in Example 7.1 is also credible. Organisations in the forproWt sector generally seek to maximise total discounted proWt, so the
cooperative outcome at (30, 30) will be maintained indeWnitely as long
as the present value of cooperation is greater than the present value of
deviating (Romp, 1997), and as long as Wrms do not discount the future
too much. Since an inWnitely repeated game develops over time, future
pay-oVs need to be discounted to some extent. Pay-oVs lose value over
time, so a sum of money to be received in the future should be assigned
a lower value today. Conversely, a pay-oV received today should be
assigned a higher value in the future since it could gain interest over the
intervening period.
Suppose r is the potential rate of interest, then d : 1/(1 ; r) is the
rate of discount. With this rate of discount, the present value, Vnow, of
maintaining a small-subsidy strategy, Vnow(small), is given by the expression:
Vnow(small) : 30 ; 30d ; 30d 2 ;
Therefore:
dVnow(small) : 30d ; 30d 2 ; 30d 3 ;
So,
(1 9 d)Vnow(small) : 30
or
Vnow(small) : 30/(1 9 d)

(1)

139

Finitely repeated games

The present value of deviating from this cooperative outcome and

adopting a large-subsidy strategy, Vnow(large), is given by:
Vnow(large) : 40 ; 20d ; 20d 2 ;
Therefore:
dVnow(large) : 40d ; 20d 2 ; 20d 3 ;
So,
(1 9 d)Vnow(large) : 40 ; 20d 9 40d ; 20d 2 9 20d 2 ;
: 40(1 9 d) ; 20d
Therefore:
Vnow(large) : 40 ; 20d/(1 9 d)

(2)

As long as Vnow(small) P Vnow(large), the cooperative outcome will be

maintained indeWnitely. In other words, if
30/(1 9 d) P 40 ; 20d/(1 9 d)
30 P 40 9 20d
1
dP
2

(since d 1)

So, in an inWnitely repeated version of the game described in Example

7.1, with the punishment trigger described, both healthcare Wrms will
maintain a pareto-eYcient equilibrium at (30, 30) as long as their rate
of discount is greater than 1/2. If the rate of discount is less than 1/2,
then each Wrm will deviate from the (30, 30) equilibrium. It cannot be
maintained with the given punishment trigger because the future
threat of punishment is not a suYcient deterrent. The players assign
too great a signiWcance to current proWts at the expense of future
proWts, and the promise to maintain the implicit arrangement is no
longer credible.

Finitely repeated games

In the previous section on inWnitely repeated games, it was shown that
it is possible for Wrms to maintain a cooperative outcome diVerent
from the Nash equilibrium for the one-oV game. To do this, players

140

Repeated games

must adopt appropriate punishment triggers and not discount the

future beyond a certain threshold. The extent to which this result
continues to hold in the context of Wnitely repeated games will now be
considered.
Backward induction and its inherent paradox

The method of backward induction, as applied to Wnitely repeated

games, reveals that if a one-oV game has a unique Nash equilibrium,
then the sub-game perfect Nash equilibrium for the entire game is this
same Nash equilibrium played in every repetition of the game, no
matter how many. Say, for example, a game has a unique Nash
equilibrium and is played a Wnite number of times. To Wnd the
sub-game perfect Nash equilibrium for the game, the Wnal iteration is
examined Wrst. This is a one-oV game in itself and the predicted
outcome is the unique Nash equilibrium for that single game. Both
players know that in the last iteration of the game, the Nash equilibrium must be played, irrespective of what has gone before, so there can
be no credible inducement or threat for a player to play anything other
than the unique Nash equilibrium in the penultimate iteration. Both
players know this and so the Nash equilibrium is again played. This
argument can be applied to all preceding one-oV iterations, so the
sub-game perfect Nash equilibrium for the entire game is simply the
same Nash equilibrium played in each one-oV game. Therefore, a
cooperative non-collusive outcome is not possible.
Consider the game described in Example 7.1, played over two consecutive years. In the second year, the predicted solution is that both
Wrms will adopt large-subsidy strategies and make a proWt of 20m
each the Nash equilibrium for the one-oV game. Since the outcome
for the second year is fully determined, the equilibrium for the Wrst year
must be the same, i.e. that both Wrms will adopt large-subsidy strategies. A similar analysis applied to any Wnite number of repetitions
gives the same result that the unique one-oV game Nash equilibrium
will be played in every iteration.
This general result is known as the paradox of backward induction, so
called because no matter how many times a Wnite game is repeated, it
never produces the same result as an inWnitely repeated game. The
relationship between inWnitely and Wnitely repeated games is not a

141

Finitely repeated games

continuous function. There is a break in the continuum, which is

counter-intuitive, since with a large number of iterations of a game, it
would seem reasonable to assume that players would Wnd some way of
implicitly coordinating on the pareto-eYcient outcome.
Essentially, the diVerence between inWnitely and Wnitely repeated
games and the root cause of the paradox is that in the former, the
structure of the game does not change over time and there is no place
from which to start the process of backward induction. A number of
concepts have been developed to overcome the paradox. These include
the notions of bounded rationality (see also Chapter 9), multiple Nash
equilibria, uncertainty and incompleteness of information, and these
are each discussed in turn below.
Avoiding the paradox of backward induction: bounded rationality
Bounded or near rationality allows people to be rational, but only
within certain limits. Players are allowed to play sub-optimal strategies
as long as the pay-oV per iteration is within P 0 of their optimal
strategy (Radner, 1980). An equilibrium occurs when all players play
these best sub-optimal strategies and, if the number of repetitions is
large enough, playing the cooperative outcome, even if it is not a
sub-game perfect Nash equilibrium, can still be an equilibrium.
Consider again Example 7.1. Say both healthcare Wrms adopt the
same punishment strategy, but for a game repeated a Wnite number of
times, t, say. The pay-oV for continuing to play according to the
punishment strategy if the other Wrm adopts a large-subsidy strategy,
assuming no discounting over time, is calculated as follows. Both GHG
and BUPA are using small-subsidy strategies, so the pay-oV is 30m for
each Wrm. One Wrm, say BUPA, deviates from this pareto-eYcient
outcome and adopts a large-subsidy strategy. It increases its pay-oV to
40m, but the GHG pay-oV falls to 10m. The punishment strategy
now kicks in and GHG changes to a large-subsidy strategy, thereby
settling both pay-oVs at 20m. So, the expression:
10 ; 20(tR 9 1)
describes the pay-oV for either Wrm continuing to play the punishment
strategy, where tR represents the Wnite number of iterations remaining
in the game.
The pay-oV for the Wrm that Wrst deviates from the pareto-eYcient

142

Repeated games

BUPA

GHG

Figure 7.2

Large
subsidy

Moderate
subsidy

Small
Subsidy

Large
subsidy

20, 20

40, 10

0, 0

Moderate
subsidy

10, 40

30, 30

0, 0

Small
subsidy

0, 0

25, 25

The NHS subsidy game with multiple Nash equilibria.

equilibrium, BUPA in the above scenario, is given by the expression:

40 ; 20(tR 9 1)
So breaking the cooperative equilibrium yields a beneWt to the deviant
of 30m, or 30/tR per iteration. According to the deWnition of bounded
rationality then, the cooperative outcome is an equilibrium as long as:
30/tR
Obviously, if tR is very large, this condition is always satisWed, since
(30/tR) ; 0, and the cooperative outcome (30, 30) becomes prominent
at least in the early stages of the game.
There are inherent diYculties, however, contained within the notion
of bounded rationality. Friedman (1986), for example, argues that
bounded rationality implies that players only calculate optimal strategies for a limited number of iterations and that therefore the game
becomes shorter and the result of backward induction more likely.
Avoiding the paradox of backward induction: multiple Nash equilibria
With multiple Nash equilibria, there is no unique prediction concerning the last iteration, so players have a credible threat with which to
induce other players to play the cooperative solution. Consider the
following game, a variation of the one in Example 7.1, but one in which

143

Finitely repeated games

BUPA

GHG

Figure 7.3

Large
subsidy

Moderate
subsidy

Small
Subsidy

Large
subsidy

40, 40

60, 30

20, 20

Moderate
subsidy

30, 60

55, 55

20, 20

Small
subsidy

20, 20

45, 45

The extended NHS subsidy pay-off matrix for the entire game played over two iterations.

each Wrm is allowed three strategies rather than two and that is played
twice (see Figure 7.2).
The one-oV game has two Nash equilibria, shaded in Figure 7.2, at
(20, 20) and (25, 25). They are both pareto-ineYcient because if players
could coordinate on (30, 30), then both players would be better oV. For
the second iteration of the game, suppose that the two Wrms adopt the
following punishment strategy: In the initial iteration, adopt a moderate subsidy strategy. In the second iteration, adopt a small subsidy
strategy if the other player has also adopted a moderate subsidy strategy
in the Wrst iteration; otherwise, adopt a large subsidy strategy (Romp,
1997).
In terms of the pay-oV matrix for the entire game, this punishment
strategy has the eVect of increasing each pay-oV by 20m, with the
exception of the case where both Wrms adopt moderate-subsidy
strategies, in which case pay-oV is increased by 25m. Figure 7.3 shows
the pay-oV matrix for the entire game, assuming no discount of
pay-oVs over time.
The game now has three Nash equilibria, shaded above, at (40, 40),
(55, 55) and (45, 45). Adopting moderate-subsidy strategies in the Wrst
iteration and small-subsidy strategies in the second is a sub-game
perfect Nash equilibrium, and players thus avoid the paradox of backward induction.

144

Repeated games

Avoiding the paradox of backward induction: uncertainty

The paradox of backward induction can also be avoided if it is assumed
that there is a Wxed probability that the game will end after a given
iteration, though when exactly might not be known. So, like an inWnitely repeated game, the structure of the remaining Wnite game does
not change over time and there is no point from which to start the
process of backward induction. Since backward induction is only
applicable to games with an end point, the paradox is avoided.
A similar analysis to that advanced already for the case of inWnitely
repeated games can be made, although the rate of discount, d, must be
redeWned. Instead of this depending only on a potential rate of interest,
r, it now also depends on the probability that the game will end after
any given iteration. Players now discount the future more heavily, since
there is a possibility that future returns will not be received at all, so the
rate of discount is deWned as:
d : (1 9 p)/(1 ; r)
where p is the probability that the game will end after any given
iteration.
Avoiding the paradox of backward induction: incomplete information
As deWned already in Chapter 1, a game is said to have complete
information if everyone knows everyone elses pay-oV function and
everyone knows everyone knows it. In contrast, incomplete information means that players are not sure what other players know or what
their pay-oV functions are.
In everyday life, when people interact, they are unlikely to have
complete information and this lack of common knowledge complicates
games and strategic selection. Consider Example 7.1 again, played
once, but this time suppose GHGs involvement in the healthcare
market may or may not be bound by internal Wnancial or ethical
constraints, known only to itself and not to BUPA. Clearly, the game is
now one of incomplete information. Figure 7.4(a) shows the game
when GHG is free of any additional constraints; and (b) shows the
game when GHG is bound by some additional internal constraints.
If GHG is free of additional constraints, the pay-oV matrix, shown in
Figure 7.4(a), remains unchanged from that illustrated on Figure 7.1.
On the other hand, if GHG is bound by some internal Wnancial or

145

Finitely repeated games

GHG

Large subsidy

Small subsidy

BUPA

Large sub.

Small sub.

Large sub.

Small sub.

(20, 20)

(40, 10)

(10, 40)

(30, 30)

BUPA
Strategy

GHG

Large
subsidy

Small
subsidy

Large
subsidy

20, 20

40, 10

Small
subsidy

10, 40

30, 30

(a)
Figure 7.4

GHG is (a ) free of any additional constraints and (b, overleaf ) bound by some internal constraints.

ethical constraints, the pay-oV matrix becomes that shown on Figure

7.4(b), and GHG no longer has the strictly dominant strategy of a
large-subsidy strategy. Its behaviour now depends on which type of
competitor it thinks it is playing against.
It has been shown (Harsanyi, 1966; 1967) that games with incomplete information can be transformed into games of complete but
imperfect information, by assuming in this case that nature determines
whether GHG is free or bound. Hence nature determines the payoVs for the game, according to a probability distribution, and this is
assumed to be common knowledge.
This game is now one of complete but imperfect information. If
nature makes GHG free of constraints with a probability p, and bound
by constraints with a probability 1 9 p, the game can be represented as
on Figure 7.5.

146

Repeated games

GHG

Large subsidy

Small subsidy

BUPA

Large sub.

Small sub.

Large sub.

Small sub.

(0, 20)

(5, 10)

(10, 20)

(30, 30)

BUPA
Strategy

GHG

Large
subsidy

Small
subsidy

Large
subsidy

0, 20

5, 10

Small
subsidy

10, 20

30, 30

(b)

Figure 7.4

(cont.)

This game can be solved using the principle of iterated strict dominance, which produces a unique Nash equilibrium. Row 1 dominates
row 2, when GHG is free of additional constraints; row 4 dominates
row 3, when GHG is bound by additional constraints. Now BUPA
knows that GHG will adopt a large-subsidy strategy with probability p
and a small-subsidy strategy with probability 1 9 p. Therefore, BUPA
can calculate its own expected proWt conditional on its own subsidy
strategy. If it decides on a large-subsidy strategy, its expected proWt
level is:
20p ; 20(1 9 p)
If it decides on a small-subsidy strategy, its expected proWt level is:
10p ; 30(1 9 p)

147

Finitely repeated games

Nature

GHG is free of
constraints

GHG is bound
by constraints

GHG

Large sub.

GHG

Small sub.

Large sub.

Small sub.

BUPA

Large
sub.

(20, 20)

BUPA

Smalll
sub.

Large
sub.

(40, 10)

(10, 40)

Small
sub.

Large
sub.

(30, 30)

(0, 20)

Small
sub.

(5, 10)

BUPA
Large
subsidy

Small
subsidy

Free &
large
subsidy

20, 20

40, 10

Free &
small
subsidy

10, 40

30, 30

Prob. = 1 p

Bound &
large
subsidy

0, 20

5, 10

Prob. = 1 p

Bound &
small
subsidy

10, 20

30, 30

Strategy

Prob. = p

GHG

Figure 7.5

The game transformed by introducing nature.

Large
sub.

Small
sub.

(10, 20)

(30, 30)

148

Repeated games

So, BUPA will adopt a large-subsidy strategy if and only if:

20p ; 20(1 9 p) 10p ; 30(1 9 p)
or
p

1
2

So if the probability of GHG being free of internal constraints is greater

than 1/2, BUPA will adopt a large subsidy strategy. It is indiVerent if
that probability is exactly 1/2. And it will adopt a small-subsidy strategy
otherwise. In this game, the healthcare Wrms achieve a pareto-eYcient
solution, (30, 30), if both adopt small subsidy strategies and p 1/2.
Games of incomplete information can therefore lead to paretoeYcient outcomes, but there are complications. In dynamic games of
incomplete information, players may be able to learn what other
players are like by observing their past actions. This gives players the
opportunity to inXuence the behaviour of opposing players by modifying their own actions. For example, if GHG can convince BUPA that it
is bound by constraints, then BUPA will adopt a small-subsidy strategy,
expecting a (30, 30) pay-oV. However, GHG may actually be free of
constraints, resulting in a (40, 10) pay-oV. It is clearly in GHGs interest
to mislead BUPA, since that increases proWt at GHG by 10m. This
kind of subterfuge is usual: players should cultivate a reputation for
being Wnancially or ethically constrained (in this case) so as to mislead
the opposition.
Of course, players know that other players have this incentive to
conceal their true intentions, a fact that should be reXected in the
probabilities assigned to the game. These in turn should be determined
by observation of previous behaviour and up-dated in accordance with
Bayess theorem (see Chapter 4 and Appendix B). The equilibrium
concept used in such circumstances is the Bayesian sub-game perfect
Nash equilibrium, which is particularly important in the case of repeated games because even small amounts of uncertainty concerning the
nature of opposing players are greatly magniWed by repetition. This
probabilistic adjustment of the pay-oVs for players often leads to a
pareto-eYcient outcome, so the paradox of backward induction is
again overcome. (See also the centipede game in Chapter 9.)

Multi-person games, coalitions and

power

The management of the balance of power is a permanent undertaking, not an exertion that
has a forseeable end.
Henry Kissinger 1979 The White House Years

Multi-person games consist of three or more players and diVer theoretically from single- and two-person games because they potentially
involve coalitions. If the interests of the players coincide exactly, so that
coalitions are unnecessary or meaningless, then the games are ones of
pure coordination and reduce to the case of two-person cooperative
games discussed already in Chapter 4. In such cases, the only possible
coalition is the grand coalition, which involves all players acting in
unison, and coordination is eVected either by explicit communication
or by informal expectation.
Zero-sum multi-person games, on the other hand, are radically
aVected by the possibility of coalition, since they introduce the potential for cooperation into a game that would otherwise not have any.
These non-cooperative multi-person games use an approach which is
an extension of the saddle/equilibrium point approach.
Partially cooperative and mixed-motive games come somewhere
between the two extremes of purely cooperative and zero-sum games.
Partially cooperative and mixed-motive games have more realistic
solutions than those arising from completely non-cooperative games,
although some have approaches which tend towards obscurity (von
Neumann & Morgenstern, 1953).
Following a brief discussion on non-cooperative multi-person
149

150

Multi-person games, coalitions and power

games, this chapter begins by extending some concepts and deWnitions to mixed-motive and partially cooperative multi-person games.
Theories such as the minimal winning coalition theory and the minimum resource theory are discussed as useful predictors of coalition
forming on committees. The bulk of the chapter is devoted to developing methods for analysing the distribution of power among factions on a committee. Five diVerent indices of power are described
and two in particular are developed from Wrst principles and used in
a detailed examination of power on boards of governance. Power and
pay-oVs for both majority and minority factions are considered, as
are voting tactics and the implications for structuring committees
generally.

Non-cooperative multi-person games

Decision-makers often have to choose independently from among
alternative courses of action. Communication may be impossible or
undesirable and there may be no prospect of forming a coalition. In
some cases, coalitions may even be illegal or actively discouraged, as in
the case of price-Wxing cartels and share support schemes.
The formal solution to a multi-person non-cooperative game is
based on its equilibrium points, which is the outcome that gives none
of the players any cause for regret when the choices of the other players
are revealed. The minimax theorem (von Neumann, 1928), which
established that every Wnite strictly competitive game possesses an
equilibrium point in mixed strategies, was extended by Nash in 1951
to prove that every Wnite multi-person game possesses at least one
equilibrium point in pure or mixed strategies and that each solution corresponds to a Nash equilibrium point. In many cases, more
than one solution and hence more than one Nash equilibrium point
exists.
In two-person zero-sum games, diVerent equilibrium points can be
equivalent in that they yield the same pay-oVs, and interchangeable
insofar as any combination of strategies or Nash points can be used
together. Unfortunately, these desirable features are not shared by
equilibrium points even in two-person mixed-motive games, as was
demonstrated in the previous chapter, because the players cannot agree
on preferability. In multi-person games, these problems are com-

151

Mixed-motive multi-person games

pounded. There are often many non-equivalent and non-interchangeable Nash equilibrium points and there is no easy way of Wnding them,
never mind sorting them. In fact, the outcome of a multi-person game
may not be a Nash equilibrium point at all.

Mixed-motive multi-person games

A mixed-motive multi-person game, with n players say, is a game such
that:
each player i has a Wnite set of strategies Si, i + 1, 2, . . . , n;
each player i has a pay-oV utility function, ui + S1 S2 Sn ; R
Each player i chooses (simultaneously) a strategy si + Si and receives the
pay-oV ui. So in order to describe a game, the various sets of strategies
S1, S2 Sn need to be known, as do the pay-oV functions u1, u2, . . . , un.
Each players pay-oV (ui) is, of course, a function of all n strategies and
not just the players own.
The Nash equilibrium for a mixed-motive multi-person game can now
be deWned as the set of strategies sN1, sN2 sNn such that:
ui (sN1, sN2 sNn) P ui (s1, s2 sn, si + Si
As in two-person games, it is in the interests of each player to play a
Nash equilibrium strategy only if all the other players are going to do
the same. Each player must know about the Nash equilibrium strategy
and must know that the others know too. It must be accepted by all, in
which case it becomes a self-fulWlling prophecy and no player will have
an incentive to deviate.
The solution sN1, sN2 sNn will be a Nash equilibrium if:
ui
(sN1, sN2 sNn) : 0
si
and when ui/si : 0 has a unique solution, then there will be one and
only one Nash equilibrium for the game.
Also, each sNi will be the only stationary point of the function ui and
2ui
0, i
si2
will indicate a local maximum.

152

Multi-person games, coalitions and power

Example 8.1 Maximising funding

The New York city OYce of Management & Budget (OMB) has three sources of
funding: per capita revenue from local taxes and user charges which link funding
to population (u1); additional per capita subsidies from the state of New York
for disadvantaged areas (u2); and an additional allocation from the federal
government in Washington to take account of special services which the city
provides, like United Nations diplomatic liaison and consular protection (u3).
The three funding formulae (pay-oV functions) for the city council are:
u1(a, b, c) : 36a ; 5bc 9 a2
u2(a, b, c) : (a 9 3b)1/2 ; c, where a P 3b
u3(a, b, c) : 4824c ; 2ab 9 ac2
where a is the population of each of the Wve boroughs, b is the percentage of the
population from disadvantaged areas and c is the number of man-hours spent
providing additional facilities, averaged over the last three years (expense
budget).
What arrangements allow the Managers OYce to achieve maximum funding? Trial and error attempts in previous executive budgets have failed to
optimise income.

It can easily be seen that

u1
: 92a ; 36
a
u2
: 93/2(a 9 3b)1/2
b
u3
: 4824 9 2ac
c
Equating each of these partial derivatives to zero gives, from (1):
a : 18
Putting this value into (2) gives:
0 : 93/2(18 9 3b)1/2
b:6
And putting this value into (3) gives:
c : 134

(1)
(2)
(3)

153

Partially cooperative multi-person games

The three funding amounts, which follow from this unique solution,
are therefore:
u1 : $4344 per capita
u2 : $134 per capita
u3 : $323 424 lump sum for special services
The second derivatives are:
2u1
: 92
a2
2u2
: 99/4(a 9 3b)3/2
b2
2u3
: 92a
c2
Clearly, all three second derivatives are always less than zero, since a, b
and c are all positive numbers, so the solution is a unique Nash
equilibrium that maximises funding income for the police force.

Partially cooperative multi-person games

Partially cooperative multi-person games are ones in which coalitionforming is allowed and sometimes even essential, but not to the extent
that the games reduce to purely cooperative ones. Their study is really
the study of committee-like structures, the most interesting feature of
which is the way power is distributed and used by those who wield it to
achieve a desired pay-oV.
Partially cooperative multi-person games have associated with them
the notion of a characteristic function, which is a rule assigning a
numerical value to every possible coalition. Consider again the martyrdom game outlined in Example 6.4. Although this game (G) involves
no more than two players, there are four possible coalitions: the null
coalition consisting of nobody (C0); the single coalition consisting of
the lawyer only (C1); the single coalition consisting of the stockbroker
only (C2); and the grand coalition consisting of both (C3). This can all
be represented as:
G : i (Ci) from i : 0 to 3

154

Multi-person games, coalitions and power

Suppose that coalition Ci has a characteristic function (or security level,

as it is sometimes called) denoted by Ci, which is deWned as the
minimum pay-oV that coalition Ci can guarantee to its member factions (von Neumann & Morgenstern, 1953). The value of the game to
the null coalition is zero by deWnition, so:
C0 : 0
The single coalition consisting solely of the lawyer can guarantee a
minimum pay-oV of 2 by choosing to cooperate with the investigators.
So, the security level or characteristic function for the lawyer is:
C1 : 2
Similarly, the single coalition consisting solely of the stockbroker can
guarantee a minimum pay-oV of 2, so the characteristic function for
the stockbroker is:
C2 : 2
Finally, if both lawyer and broker form their grand coalition and act in
unison, agreeing not to cooperate with the investigators, they can
guarantee a pay-oV of 6:
C3 : 6
It can be seen from the four characteristic functions that, if the players
wish to maximise individual gain, they will form the grand coalition.
In some committee-type games with three or more players, more
than one coalition may provide the maximum pay-oV. The characteristic function does not in itself determine which of these winning
coalitions will form, so in order to narrow the possibilities down, a
concept known as the minimal winning coalition theory was developed
by Riker in 1962. Put simply, it states that a winning coalition will seek
to minimise its membership so as to avoid sharing the pay-oV unnecessarily. SelWshly, the smallest winning coalition will usually form and
there will seldom be an incentive to form a grand coalition. This theory
may not single out any given coalition as the one destined to form, but
it does eliminate some non-minimal coalitions from consideration.
A more speciWc concept, known as minimum resource theory, was
developed to narrow the Weld of possible coalitions even further (Gam-

155

Indices of power: measuring influence

son, 1961). It proposes that factions in a coalition should and do

demand a share of the pay-oV commensurate with their relative voting
strengths. Therefore, the coalition that will form will be the one that
wins with the smallest majority.
The notion of a characteristic function which cannot solve all
multi-person games it should be said is based on dividing the pay-oV
among coalition partners such that individual and collective greed are
both satisWed. Martyrdom games, such as the one in Example 6.4,
represent a clash of individual and collective rationality, which has to
be reconciled in the way the pay-oV is proportionately allocated. Such
proportional payment is known as an imputation. Clearly, some imputations are inadmissible because they are dominated by other imputations and, when they have been eliminated, what remains is known as
the core of the game. Each faction should receive at least what it would
receive if it acted independently and the winning coalition should win
the whole pot. In other words, the pay-oV should be the same as would
be the case if the coalition were a grand one.

Indices of power: measuring influence

There is a tendency to analyse power only from the perspective of how
participants exercise it or how others perceive it, which is legitimate of
course, but it can obfuscate the reality of having and using it. Understanding the distribution of power among players in a multi-person
game is a vital undertaking, as sometimes it is the only check on the
exercise of political power among the parties represented.
In cooperative multi-person majority games, coalitions are free to
form and disintegrate as the agenda changes. A coalition is said to
consist of factions the term is not intended to indicate any belligerence and can be deWned as one or more of these factions voting
together, by agreement or by chance. A winning coalition is therefore
one that commands a majority of the votes and a minimal winning
coalition is one that cannot suVer any defection without losing its
majority.
The order of voting can sometimes be important and so it is
necessary to introduce the notion of pivotal and critical factions. A
faction is deWned as pivotal to its coalition if it turns the coalition from

156

Multi-person games, coalitions and power

a losing one into a winning one by virtue of its vote; and as critical if its
withdrawal causes that coalition to change from a winning one to a
losing one.
Underlying assumptions: sincerity, completeness and transitivity

By their very nature, coalitions often have to choose one course of

action from several alternatives according to a set of formal principles,
such as majority voting, designed to ensure fair outcomes. A tacit
understanding underlying these principles is that voting is always
sincere and along self-interest lines; in other words, that factions always
vote for the choices they prefer, and that coalitions have not been
prearranged, but come about through self-interest in order to better
their pay-oV.
Other axioms, like those of completeness and transitivity, also
underpin the principles of cooperative action. Completeness refers to
the assumption that a preference is real and irreversible; in other words,
that factions which prefer choice X to choice Y do not also prefer choice
Y to choice X. Transitivity assumes a consistent hierarchy of preferences; in other words, factions which prefer choice X to choice Y, and
choice Y to choice Z, necessarily prefer choice X to choice Z.
There are several ways of measuring power. Two of them, the
Shapley value and the ShapleyShubik index, will now be developed
using the principles outlined above. They calculate the distribution of
power among factions in coalitions and their application is demonstrated in Example 8.2.
The Shapley value

The Shapley value (Shapley, 1953) rates each faction according to its a
priori power. In other words, in proportion to the value added to the
coalition by that faction joining it. Suppose a game, G, has n factions
(not players) and some of them vote together to form a coalition C.
Suppose an individual faction of C is denoted by fi and the size of the
coalition C is s, then:
G : f1, f2, f3 . . . , fn; C : f1, f2, . . . , fi, . . ., fs; C is a subset of G, not
Clearly, fi has s 9 1 partners, selected from n 9 1 players. Therefore,
there are

157

Indices of power: measuring influence

(n 9 1)!
(s 9 1)! [(n 9 1) 9 (s 9 1)]!
ways of re-arranging the coalition partners of i. The reciprocal of this
expression is:
(s 9 1)! (n 9 s)!
(n 9 1)!
and represents the probability of each such selection.
Assuming that all sizes of coalition are equally likely, a particular size
occurs with a probability of 1/n. Therefore, the probability of any
particular coalition of size s containing the individual faction i, from n
factions, is given by the expression:
(s 9 1)! (n 9 s)!
n!

(1)

Suppose now that coalition C has a characteristic function denoted by

C, and that the characteristic function that the remaining factions
have if fi is removed from C is denoted by C 9 i, then the contribution that fi alone makes to C is:
C 9 C 9 i

(2)

The Shapley value, S(fi), is now deWned as the product of expressions

(1) and (2), summed over s from 1 to n:
(s 9 1)! (n 9 s)!
S(fi) : s
[C 9 C 9 i]
n!
Regarding all sizes of coalition as equally likely may initially appear to
be an unreasonable assumption in political situations, but experience
suggests that all coalitions are possible and do happen in reality.
Motivation by factional self-interest sees to that!
The ShapleyShubik index

The ShapleyShubik index is a friendlier variation of the Shapley value

(Shapley & Shubik, 1954; Cowen & Fisher, 1998). Suppose that a game,
G, has n factions, which form themselves into various coalitions, C, for
voting purposes. Then, as before:

158

Multi-person games, coalitions and power

G : f1, f2, f3, . . . , fn; C : f1, f2, . . . , fi, . . . , fs; C is a subset of G, not
The ShapleyShubik index, SS(fi), is deWned as:
SS(fi) :

iCi where fi is pivotal

iCi, for i : 1 to n

The ShapleyShubik index is normalised, since 0 O SS(fi) O 1, and 1

represents absolute power.
Shapley and Shubik famously used their index in an analysis of
power in the United Nations Security Council. Up to 1965, there were
Wve permanent members of the Security Council (USA, the former
USSR, UK, France and China) and six non-permanent members.
Analysis showed that the permanent members, who had (and still have)
power of veto, controlled 98.7% of the power. In 1965, in an attempt to
increase the power of the non-permanent faction, the number of
non-permanent members was increased to ten, but the ShapleyShubik
analysis showed that the power of the same Wve permanent members
had only decreased marginally, to 98%. Clearly, membership ratios are
not true reXections of actual power.
The following example illustrates the Shapley value and the Shapley
Shubik index of power at work in weighted committees in which
diVerent factions have diVerent voting strengths.
Example 8.2 Power and school governance in a divided society
Consider the following three archetypes of school governance boards in Northern Ireland:
Model A: Voluntary maintained schools, which are mostly non-selective Roman
Catholic schools, typically having the following representations on their boards:
four Roman Catholic church trustees
two Education and Library Board representatives (ELB)
one parent
one teacher
one representative from the Department of Education (DE)
Model B: Controlled secondary schools, which are mostly non-selective Protestant schools, typically having the following representations on their boards:
four Protestant church trustees (transferors)

159

Indices of power: measuring influence

two Education and Library Board representatives (ELB)

two parents
one teacher
Model C: Out-of-state schools in the Republic of Ireland also educate pupils
from Northern Ireland. Typically, they have boards of governors consisting of:
three Education and Library Board representatives (ELB)
three parents
two majority religious representatives
one minority religious representative
two teachers
What real power does each faction have on each of the three types of
governing body?

Irrespective of religious aYliation or constitution, schools everywhere

have become more complex organisations as a result of the increased
participation of stakeholders. This has manifested itself in a proliferation of committee-like structures, such as boards of governors, which
mediate between society and the organisation. Representation on these
committees usually reXects an imputational entitlement to power and
many assumptions about the relative voting strengths of factions on
such committees are frequently made. Take the case of the Model C
school described above. At Wrst sight, it appears that the voting power
of the majority religious body is twice that of the minority religious
body. Similarly, the ELB and parent factions are assumed to be 50%
more powerful than the teaching or majority religious bodies. These
casual assumptions are dangerous on two counts. Firstly, they are
simply wrong; and secondly, they support an illusion of participative
democracy and empowerment that distracts from the need to bring
about more fundamental change. The advantage of a game theoretic
approach is that it can model majority voting situations, such as exist
on school governing boards, so that these inherent fallacies become
apparent.
An analysis of power on voluntary maintained boards (Model A)
Let the Wve factions on the board of governors be denoted as follows:

160

Multi-person games, coalitions and power

Church nominees (four votes): C

Education and Library Board (two votes): L
Parent body (one vote): P
Teaching staV (one vote): T
Department of Education (one vote): D
For any one particular order of C, L, P, T and D, there are:
5 one-faction, single coalitions
20 two-faction coalitions
60 three-faction coalitions
120 four-faction coalitions
120 Wve-faction, grand coalitions
A little consideration reveals the following:
None of the single coalitions is a winning one.
Eight two-faction coalitions are winning ones and they all include C.
Some 36 three-faction coalitions are winning ones and they all
include C.
All four-faction and grand coalitions are winning ones.
In summary, there are 284 winning coalitions out of a possible 325,
though the three-faction, four-faction and grand coalitions require
further investigation.
Three-faction coalitions
There are 36 winning three-faction coalitions. If C is Wrst, the second
voter will be pivotal; if C is second or third (12 occasions each) then C
itself will be pivotal. Therefore, C is pivotal 24 times and L, P, T and D
are pivotal on three occasions each.
Four-faction coalitions
There are 120 four-faction coalitions and they are all winning. If C is
Wrst in the voting, then the second faction to vote will be pivotal (six
times each for L, P, T and D). If C is second, third or fourth in the
voting, then C itself will be pivotal. In the case of the 24 coalitions that
do not include C, the last faction will always be the pivotal one.
Therefore, C will be pivotal 72 times and L, P, T and D will be pivotal
on 12 occasions each.
Grand coalitions
In grand coalitions, the last faction voting will never be pivotal, even if

161

Indices of power: measuring influence

Table 8.1 Summary table for voluntary maintained boards

C pivotal
Single
Two-faction
Three-faction
Four-faction
Grand
Totals

L pivotal

P pivotal

T pivotal

D pivotal

Totals

0
4
24
72
72

0
1
3
12
12

0
8
36
120
120

172

284

Table 8.2 Shapley values for voluntary maintained boards

(s 9 1)!(n 9 s)!
n!
Single
Two-faction
Three-faction
Four-faction
Grand
Shapley values

1/5
1/20
1/30
1/20
1/5

Total

0
4
24
72
72

0
1
3
12
12

0
8
36
120
120

19.0

3.15

it is C. Therefore, these cases reduce to the four-faction coalition

analysis outlined already.
Table 8.1 is a summary table of the extent to which each faction is
pivotal in each of the Wve possible coalition sizes.
The two actual power indices the Shapley value and the Shapley
Shubik index for each of the Wve participating factions can now be
calculated.
The Shapley value for each faction
For the Shapley value equation in this case, n : 5 and s : 1, 2, 3, 4, 5.
We assume that the contribution of fi to each successful coalition in
which it is pivotal, namely C 9 C 9 i, is unity; and that the
contribution of fi to each unsuccessful coalition is zero. The results are
summarised on Table 8.2.
The ShapleyShubik index for each faction
The results are summarised on Table 8.3.

162

Multi-person games, coalitions and power

Table 8.3 ShapleyShubik index for voluntary maintained boards

C pivotal
Number of coalitions
where pivotal
172
Number of possible
winning coalitions
284
ShapleyShubik

0.61

L pivotal

P pivotal

T pivotal

D pivotal

284

0.099

An analysis of power on controlled secondary boards (Model B)

Let the four factions on the board of governors be denoted as follows:
Church nominees (four votes): C
Education and Library Board (two votes): L
Parent body (two votes): P
Teaching staV (one vote): T
For any one particular order of C, L, P and T there are:
4 one-faction, single coalitions
12 two-faction coalitions
24 three-faction coalitions
24 four-faction, grand coalitions
None of the single coalitions is a winning one.
Six two-faction coalitions are winning ones the ones that include
C.
All three-factions and grand coalitions are winning ones.
Consequently, there are 54 winning coalitions out of a possible 64.
Two-faction coalitions
Three have C voting last and pivotal and the other three have C voting
Wrst.
Three-faction coalitions
There are 24 winning three-faction coalitions. If C is Wrst to vote, the
second faction will become pivotal (twice each for L, P and T). If C is
second (six times) or third (six times) in the voting, then C itself will be
pivotal. In the other six coalitions without C, the last one to vote will
become pivotal. Therefore, C is pivotal 12 times and L, P and T are
pivotal on four occasions each.

163

Indices of power: measuring influence

Table 8.4 Summary table for controlled secondary boards

C pivotal

L pivotal

P pivotal

T pivotal

Totals

Single
Two-faction
Three-faction
Grand

0
3
12
12

0
1
4
4

0
6
24
24

Totals

Table 8.5 Shapley values for controlled secondary boards

(s 9 1)!(n 9 s)!
n!
Single
Two-faction
Three-faction
Grand
Shapley values

1/4
1/12
1/12
1/4

C
0
3
12
12
4.25

0
1
4
4

1.42

Total
0
6
24
24

Grand coalitions
These cases reduce to the three-faction coalition analysis outlined
above.
Table 8.4 summarises the extent to which each faction is pivotal in
each of the Wve possible coalition sizes.
The two actual power measurements for each of the Wve participating factions can now be calculated.
The Shapley value for each faction
For the Shapley value equation in the case of controlled secondary
boards, n : 4 and s : 1, 2, 3, 4. Again, we assume that the contribution of fi to each coalition in which it is pivotal, namely
C 9 C 9 i, is unity and that the contribution of fi to each
unsuccessful coalition is zero. The results are summarised on Table 8.5.
The ShapleyShubik index for each faction
The results are summarised on Table 8.6.
An analysis of power on out-of-state boards (Model C)
Let the Wve factions on the board of governors be denoted as follows:

164

Multi-person games, coalitions and power

Table 8.6 ShapleyShubik index for controlled secondary boards

Number coalitions where pivotal

Number of possible winning coalitions
ShapleyShubik

C pivotal

L pivotal

P pivotal

T pivotal

27
54

9
54

0.50

0.167

Education and Library Board (three votes): L

Parent body (three votes): P
Majority religious body (two votes): R
Minority religious body (one vote): r
Teaching staV (two votes): T
For any one particular order of L, P, R, r and T, there are:
5 one-faction, single coalitions
20 two-faction coalitions
60 three-faction coalitions
120 four-faction coalitions
120 Wve-faction, grand coalitions
None of the single coalitions is a winning one.
Only two of the two-faction coalitions are winning ones (LP and
PL).
The only three-faction coalitions that are not winning ones are the
six variations of TrR.
All four-faction and grand coalitions are winning ones.
Consequently, there are 296 winning coalitions out of a possible 325,
but the three-faction, four-faction and grand coalitions require further
investigation.
Three-faction coalitions
There are 54 winning three-faction coalitions. Twelve of these Wnish
with voting from L; 12 with voting from P; and 10 with voting from
each of R, T and r. Of these last 30, six start with PL or LP. Therefore, L
and P are each pivotal in 15 three-faction coalitions; and R, T and r are
each pivotal in eight.
Four-faction coalitions
There are 120 four-faction coalitions, all winning, and in one-Wfth of
them, L votes Wrst. One-quarter of those 24 times, P will vote next, so P

165

Indices of power: measuring influence

Table 8.7 Summary table for out-of-state boards

L pivotal

P pivotal

R pivotal

T pivotal

r pivotal

Totals

Single
Two-faction
Three-faction
Four-faction
Grand

0
1
15
36
36

0
0
8
16
16

0
2
54
120
120

Totals

296

will be pivotal in these six coalitions. R, T and r (in any order) will vote
Wrst on a further 12 occasions and, half that time, P will be pivotal. In
all other coalitions, the pivotal position will be third in the voting and
P will be in this position on 24 occasions. In total then, P will be pivotal
for 36 coalitions.
Similar analysis reveals that L will also be pivotal for 36 four-faction
coalitions and R, T and r will each be pivotal for 16.
Grand coalitions
Since grand coalitions have 11 votes and the largest faction commands
only three, the last faction voting can never be pivotal. Therefore, these
cases reduce to the four-faction coalition analysis outlined above.
Table 8.7 is a summary table of the extent to which each faction is
pivotal in each of the Wve possible coalition sizes.
The Shapley value and the ShapleyShubik index for each of the Wve
participating factions can now be calculated.
The Shapley value for each faction
For the Shapley value equation in the case of out-of-state boards,
n : 5 and s : 1, 2, 3, 4, 5. We assume that the contribution of fi to
each coalition in which it is pivotal, namely C 9 C 9 i, is unity
and that the contribution of fi to each unsuccessful coalition is zero.
The results are summarised on Table 8.8.
The ShapleyShubik index for each faction
The results are summarised on Table 8.9.

166

Multi-person games, coalitions and power

Table 8.8 Shapley values for out-of-state boards

(s 9 1)!(n 9 s)!
n!
Single
Two-faction
Three-faction
Four-faction
Grand

1/5
1/20
1/30
1/20
1/5

Total

0
1
15
36
36

0
0
8
16
16

0
2
54
120
120

Shapley values

9.55

4.27

Table 8.9 ShapleyShubik index for out-of-state boards

Number of coalitions where pivotal

Number of possible winning coalitions
ShapleyShubik

L pivotal

P pivotal

R pivotal

T pivotal

r pivotal

88
296

40
296

0.297

0.135

Conclusions
The relative power of major and minor players
On voluntary maintained boards, the church nominees have four
seats on the board. Parent, teacher and Department of Education
(DE) representatives have one each, while the Education and Library
Board (ELB) has two seats. However, analysis reveals that church
nominees have more than six times the power of any of the other
factions!
On controlled secondary boards, the church nominees have four
seats on the board, and parent and ELB representatives have two
each. There is one teacher seat. Analysis from both indices reveals
that church nominees have three times the power of any of the other
factions.
On the sample out-of-state board, the Shapley value and the ShapleyShubik index both reveal that the power of the ELB faction and
the parent body is approximately 2.2 times that of each of the other
three factions. This is a truer reXection of power than the ratio of
memberships: 3:2 in the case of both teachers and majority religious

167

Indices of power: measuring influence

factions; 3:1 in the case of the minority religious body.

The distribution of power is most balanced on out-of-state boards
and most skewed in favour of the majority on voluntary maintained boards twice as much in fact as the skew towards the
majority on controlled secondary boards.
The relative power of the minor players
On voluntary maintained boards, the minor factions all have equal
power. The ELB does not have twice the strength of parents, teachers
or DE, as might be assumed. This is a reXection of the fact that each
of the four minor factions is equally useful in forming winning
coalitions.
The minor factions all have equal power on controlled boards too
a minor pleasant surprise for teachers, but a disappointment to
parents and the ELB.
On the sample out-of-state board, both indices reveal that the
power of the minority religious body is the same as that of the
majority religious body, although Wrst impressions would suggest
that the latter was twice as powerful.
It is interesting and perhaps signiWcant that majority and minority
religious bodies wield equal inXuence on out-of-state boards, despite the diVerent number of seats. It may encourage cooperation
while maintaining a countenance of proportionality.
The pay-oV for winning coalitions
If the Shapley value was used to award pay-oVs commensurate with
contribution, as suggested by minimum resource theory (Gamson,
1961), then it would be a way of achieving a market-like outcome
where a market did not exist per se, such as with school boards of
governors. However, members of boards of governors are not rewarded in market-like terms, if at all, and the motivational pay-oV
for the pivotal factions can best be understood in terms of political
inXuence and control. Experience suggests that the pay-oV for a
winning coalition is simply that its members are perceived to be
inXuential and share, to a greater or lesser extent, the power and
control associated with winning.

168

Multi-person games, coalitions and power

Table 8.10 Most pivotal position in the voting sequence for voluntary maintained board factions

C
L
P
T
D

% times pivotal in 2nd

% times pivotal in 3rd

% times pivotal in 4th

37
57
57
57
57

35
0
0
0
0

28
43
43
43
43

The order of voting or coalescence

Although not all committee decisions are made on the basis of voting
(and some votes are taken by secret ballot where the order is
unimportant), nevertheless, open voting (or making known ones
voting intentions) is a common occurence on committees. Table
8.10 reveals that minority factions on maintained boards should
avoid voting third! Voting second is slightly better than voting third
(57% of the time the faction will be pivotal, as opposed to 43%),
though it does not matter so much for the four church members.
Voting Wrst or last means that the faction cannot be pivotal, so
there is a commensurate loss of power in so doing.
On controlled boards, Table 8.11 reveals that, even for the majority
faction (the church nominees), there is no great advantage to
coalescing second as opposed to third (56% as opposed to 44%), but
again, factions should avoid voting Wrst or last.
On out-of-state boards, analysis reveals that a faction is most
powerfully placed when it is the third faction to make its position
known. This is particularly so for the three minor factions, as Table
8.12 shows. It can be seen that, for these to be pivotal, they must
coalesce third in the winning coalition. Even for the two majority
factions, there is considerable advantage to voting third in the order,
and second is slightly better than fourth. As always, factions should
avoid voting Wrst or last.
Minimal winning coalitions
Coalescing with the pivotally positioned faction is the next-best
thing to being pivotal oneself. If it is assumed that there is no
political reward for losing, and that all factions want to share the
spoils of winning, then the strategy should always be to end up on

169

Indices of power: measuring influence

Table 8.11 Most pivotal position in the voting sequence for controlled secondary board factions

C
L
P
T

% times pivotal in 2nd

% times pivotal in 3rd

56
56
56
56

44
44
44
44

Table 8.12 Most pivotal position in the voting sequence for out-of-state board factions

L
P
R
T
r

% times pivotal in 2nd

% times pivotal in 3rd

% times pivotal in 4th

18
18
0
0
0

68
68
100
100
100

14
14
0
0
0

the winning side. Unfortunately, this is opposed by an equal and

opposite desire on the part of those who have already formed a
winning coalition not to accept superXuous members. Just as there is
no incentive for the last voting faction to dissent, there is no incentive for the Wrst three or four factions to form grand coalitions, since
the last faction is never important. This idea is analogous to the
minimal winning coalition theory referred to previously (Riker,
1962), which states that if a coalition is large enough to win it should
avoid accepting additional factions, since these new members will
demand to share in the pay-oV without contributing essential votes
to the consortium.
This idea of minimal winning coalitions forms the basis for
another power index the DeeganPackel index which is discussed
below.
Other applications

While this discussion has concentrated on school governance, analysis

of this sort can easily be applied to other competitive voting situations,
both within education and without. For example, when the forerunner
of the European Union was set up by the Treaty of Rome in 1958, there

170

Multi-person games, coalitions and power

were Wve member states, of which Luxembourg was one. Luxembourg

had one vote out of a total of seventeen, but it has subsequently been
shown that no coalition of member states ever needed Luxembourg in
order to achieve a majority. In eVect, Luxembourg was a powerless
bystander in terms of competitive voting, though no doubt it beneWted
in other ways from membership of the grand coalition.
Implications for forming committees

Game theory analysis of multi-person coalitions raises additional practical implications for how committees are constituted, whether they are
dissemination forums or statutory decision-making bodies.
The numerical voting strength of a faction on a committee is not a
reXection of its real voting power. This can lead to frustration, but it
can also be a source of stability.
Statutory decision-making committees should be constituted so as
to reXect accurately the desired or entitled proportional representation.
Managers need to be aware of the possibility of disproportionate
voting power, particularly when setting up structures for staV involvement in decision making. StaV committees, which appear to
reXect the relative sizes of diVerent groupings within organisations
for example, may be dangerously skewed.
There are other measurements of power, such as the Johnston index
(Johnston, 1978), which looks at the reciprocal of the number of
critical factions; the DeeganPackel index (Deegan & Packel, 1978),
which looks at the reciprocal of the number of minimal factions; and
the Banzhaf index (Banzhaf, 1965), which looks at the number of
coalitions for which a faction is both critical and pivotal.
The Johnston index

The Johnston index uses the reciprocal of the number of critical

factions as its base.
Suppose, as before and using the same notation, that some of the n
factions in the game vote together to form a winning coalition C.
Suppose an individual faction of C is denoted by fi and the size of the
winning coalition C is s. Then:

171

Indices of power: measuring influence

C : f1, f2, . . . , fi, . . . , fs

Let k be the number of such winning coalitions in the game for which
the defection of fi is critical. Then:
Winning coalitions in which fi is critical : C1, C2 Ci Ck
Let the number of critical factions in any coalition be denoted by m,
where m s. Then, m1 denotes the number of critical factions in C1; m2
denotes the number of critical factions in C2; mi denotes the number of
critical factions in C i ; mk denotes the number of critical factions in Ck;
and so on.
The total Johnston power is now deWned as:
jp(fi) : 1/m1 ; 1/m2 ; ; 1/mi ; ;1/mk
and this is normalised to the Johnston index, J(fi), as:
J(fi) :

jp(fi)
i jp(fi), for i : 1 to n

Since the Johnston index is normalised, 0 O J(fi) O 1, where 1 represents absolute power.
The DeeganPackel index

The DeeganPackel index uses the reciprocal of the number of minimal

factions as its base. Let k be the number of minimal winning coalitions.
Let the number of factions in any coalition be denoted by m, such that
m1 denotes the number of factions in C1; m2 denotes the number of
factions in C2; mi denotes the number of factions in C i ; mk denotes the
number of factions in Ck; and so on. Using the same notations as
before, the total DeeganPackel power is deWned as:
dp(fi) : 1/m1 ; 1/m2 ; ;1/mi ; ;1/mk
and this is normalised to the DeeganPackel index, D(fi), as:
D(fi) :

dp(fi)
i dp(fi), for i : 1 to n

172

Multi-person games, coalitions and power

Since the DeeganPackel index is normalised, 0 O D(fi) O 1, where 1

represents absolute power.
The Banzhaf index

The Banzhaf index looks at the number of coalitions for which a faction
is both critical and pivotal. Using the same notation as before, the total
Banzhaf power, b(fi), is deWned as the number of winning coalitions in
which fi is a pivotal and critical member. This is normalised to the
Banzhaf index, B(fi), as:
B(fi) :

b(fi)
i b(fi), for i : 1 to n

Since the Banzhaf index is normalised, 0 O B(fi) O 1, where 1 represents absolute power.
Summary

Each of the Wve power indices has its own characteristics. Three of them
the Shapley, ShapleyShubik and Banzhaf indices depend on the
order in which the winning coalition is formed. The Johnston index,
which looks at coalitions that are winning but not minimal, may
contain factions that are not critical, i.e. their defection does not cause
the coalition to become a losing one. The DeeganPackel index looks at
the number of factions in minimal winning coalitions and thus regards
all such factions as having equal power.
The two indices used in Example 8.2 are the most straightforward
and popular, although they are limited in a minor way by the axioms
and assumptions already noted. These include the assumption that
factions always vote sincerely and along rational self-interest lines; that
voting is open; that coalitions have not been pre-arranged; that all
coalitions are equally likely to appear; and that there is a reward for
being part of a winning coalition. The appropriateness of these assumptions is, of course, a matter for judgement. Each faction judges
the suitability of a particular solution according to the favourableness
of its outcomes and not by any innate attractiveness. Therefore, power
is ultimately judged by its actual exercise, rather than by its perceived
distribution; and perceptions can be mistaken, as game theoretic

173

Indices of power: measuring influence

analysis shows. The actual exercise of authority and inXuence is a

diVerent matter. The constitution of powerful committees is sometimes taken as representative of something deeper happening in society
generally, as notions about democratisation and empowerment are
transferred to and from diVerent organisational settings. However, as
this chapter has illustrated, the perception of how power is distributed,
never mind exercised, is often Xattering to deceive. The reality is often
disappointing.

A critique of game theory

How selsh soever man may be supposed, there are evidently some principles in his nature
which interest him in the fortunes of others, and render their happiness necessary to him,
though he deserves nothing from it except the pleasure of seeing it.
Adam Smith 1795 The Theory of Moral Sentiments

Although game theory has been outstandingly successful at developing

a deeper understanding of how rational players make decisions under
interdependent circumstances, several criticisms have been made of
some of its assumptions. Some are fatuous; others are challenging.
Among the former, is the criticism that players who act irrationally
gain the upper hand in some games and that, therefore, the rational
basis for game theory is undermined. This only needs to be stated for its
absurdity to become apparent. It is the perception that players are
unpredictable and irrational that gives them the edge in some games,
not irrationality itself. In fact, they are being eminently rational in
deliberately giving that impression, while attempting to win the game
by the same conscious or subconscious manoeuvre.
Of the serious challenges to game theory mounted over the last few
years, the three issues of rationality, indeterminacy and inconsistency
are the most interesting. Being unresolved, it is Wtting that the book
Wnishes with them and if the reader is enticed to greater study as a
result, their inclusion will have been proWtable.

Rationality
Game theory is based on a presumption of rationality, which at Wrst
sight appears to be optimistic. At the very least, there is need for more
174

175

Rationality

experimental evidence to support the contention that individuals select

critical strategies and make complex decisions under uncertainty, on
the basis of rationality. Furthermore, in games which have no rational
solution, it may even be the case that players have to resort to some
kind of irrationality in order to progress.
The assumption of rationality in game theory can be partly justiWed
on a number of diVerent levels. Firstly, there is some evidence to
suggest that a kind of natural selection is at work which inclines
successive generations of decision makers towards the rational, on the
basis that organisations which select sub-optimal strategies eventually
shut down in the face of competition. Thus, successive generations of
decisions are increasingly rational, although the extent to which this
competitive evolution applies to all commercial and not-for-proWt
sectors is unknown.
Secondly, justiWcation can be demonstrated to hinge on which of the
various deWnitions of rationality is selected. Instrumental rationality,
for one, is the presumption that players always act selWshly in their own
interest (homo economicus (von Neumann, 1928)) and that they are
able to determine, at least probabilistically, the outcome of their actions and to rank them in order of preference. Thus, if a player makes a
seemingly crazy choice, it is because the player bases selection on an
irrational belief. Although the belief itself is irrational and not the
selection, deWning rationality thus, on the basis of outcome, is suspect.
For one thing, it is sometimes self-defeating in that a player may
achieve an optimal outcome by not acting selWshly, as in the case of the
prisoners dilemma game. Instrumental rationality suggests that the
players in such a game should refuse to cooperate, yet they can both do
better by rejecting this strategy. It seems that what is at work in these
circumstances is not irrationality, but another type of rationality
collective rationality.
One alternative deWnition of rationality that is often discussed in
relation to game theory is based on the work of Kant in the eighteenth
century (Beck, 1969; Beck, 1988; Guyer & Wood, 1998). Kant deWnes
rationality as behaviour in line with categorical imperatives or laws that
prescribe a certain type of behaviour derived from reason alone. Since
all individuals are endowed with the ability to reason, rationality
dictates behaviours with which everyone can agree and all individuals
use their ability to reason to formulate the same imperatives. Rational

176

A critique of game theory

players should therefore behave according to laws which they would

like to see made universal, by virtue of their rationality. If it is not
possible for every player to select a certain strategy, then it is, by
deWnition, irrational. Decisions based on Kants notion of moral imperative, rather than on the concept of self-interest, can lead to diVerent outcomes for game theoretic problems because players may still be
acting rationally even though they have chosen unselWshness, simply
because not all players Wnd it possible morally to choose selWshness. So,
depending on deWnition, it may yet be rational for players in a
prisoners dilemma game to cooperate after all.
Kantian and instrumental rationality are extreme alternatives to one
another and there are many less radical deWnitions that lie somewhere
between these two. One such is the notion of bounded rationality
(Simon, 1997). According to this concept, individuals have limited
computational ability and as a result, sometimes adopt very simple
decision-making rules to achieve desired outcomes. These can be very
successful and can sometimes outperform more rational decisionmaking algorithms.
The most compelling evidence for this comes from an experiment
designed and conducted by Robert Axelrod (1981). He invited game
theorists to submit computer programs on how best to play a repeated
(200 times) prisoners dilemma game. The 14 entrants were randomly
paired in a round-robin competition against each other (repeated Wve
times) and each programs aggregate score against every other program
was calculated.
The most successful strategy was the simplest one the so-called
tit-for-tat strategy submitted by Anatole Rapaport whose opening
gambit was cooperation and whose subsequent moves just mimicked
whatever the opponent had done previously. Axelrod later conducted a
follow-up experiment. He made participants aware of tit-for-tats
strategy and invited participants to re-design their programs to beat it.
At Wrst, aggressive programs did well at the expense of tit-for-tat, but
the more often the aggressive programs played each other, the better
tit-for-tat did. It won again! Commenting on the result, Rapaport
noted that his strategy did not actually beat any of its opponents, but
triumphed because opposing strategies designed to beat tit-for-tat
reduced each others scores by playing against each other.
Rapaports winning strategy did have an Achilles heel, however. It

177

Indeterminacy

got locked into a losing strategy for both itself and its opponent
whenever the opponent made random irrational errors a doomsday
scenario from which was needed another irrational error from the
opponent in order to escape. To investigate further, Axelrod conducted
a third run of the experiment, generating random error for tit-for-tat
and its opponents. This time it was beaten by more tolerant opponents
ones which waited to see whether aggression was a mistake or a
deliberate strategy.
The paradox that sometimes it is rational to act irrationally can only
be resolved by altering the deWnition of what it means to be rational.
The importance of such a deWnition is more than mere semantics. The
success or otherwise of game theory as a model for behaviour depends
on it. It may mean diVerent things in diVerent circumstances to
diVerent people, but it undermines or underpins the very foundations
of game theory, whatever it is.

Indeterminacy
The second major criticism of game theoretic constructs is that they
sometimes fail to deliver unique solutions, usually because the game
has more than one equilibrium. In such cases, the optimal strategy
remains undetermined and selections are usually made on the basis of
what players think other players will do. Therefore, strategic selection is
not necessarily rational. It may centre on prominent features of the
game focal points towards which decision making gravitates (Schelling, 1960). These salient features act like beacons for those playing the
game, so that the Wnal outcome is in equilibrium. They are usually
experiential or cultural, rather than rational.
The problem of indeterminacy aVects, in particular, mixed-strategy
Nash equilibrium solutions because, if one player expects the other to
choose a mixed strategy, then he or she has no reason to prefer a mixed
strategy to a pure one. To overcome this, some writers have suggested
that mixed-strategy probabilities represent what players subjectively
believe other players will do, rather than what they will actually do
(Aumann, 1987). This is akin to the Harsanyi doctrine, which states that
if rational players have the same information, then they must necessarily share the same beliefs, although it is undermined in turn by the fact

178

A critique of game theory

B
Pass 1

A
Pass 2

Pass 3

Take 1

Take 2

Take 3

(1, 0)

(0, 2)

(3, 0)

Pass 50

Take 4

(0, 4)

(0, 0)

Take 50

(0, 50)

Player A pay-offs shown first

Figure 9.1

The centipede game.

that rational players with the same information do not always make the
same suggestions or reach similar conclusions.

Inconsistency
The third major criticism of game theory, that of inconsistency (Binmore, 1987), concerns the technique of backward induction and the
assumption of common knowledge of rationality in Bayesian sub-game
perfect Nash equilibria. The criticism is best illustrated by way of an
example.
The centipede game, so-called because of the appearance of its game
tree, was developed by Rosenthal (1981) from Selten (1978), and has
since been extended to include a number of variations (Megiddo, 1986;
Aumann, 1988; McKelvey & Palfrey, 1992). The original basic version
has two players, A and B, sitting across from each other at a table. A
referee puts 1 on the table. Player A is given the choice of taking it and
ending the game, or not taking it, in which case the referee adds
another 1 and oVers player B the same choice take the 2 and end the
game, or pass it back to the referee who will add another 1 and oVer
the same choice to player A again. The pot of money is allowed to grow
until some pre-arranged limit is reached 50 say which is known in
advance to both players. Figure 9.1 shows the decision tree for the
game.
The method of backward induction tells us that, since player B must
surely take the 50 at the Wnal node, a rational player A should accept

179

Inconsistency

the pot of 49 at the penultimate node. Therefore player B should take

the pot of money (48) at the previous node and so on back to the
initial node, where a rational player A should take the Wrst 1 and end
the game there and then. This is the sub-game perfect Nash equilibrium.
However, such reasoning is inconsistent with the assumption of
common knowledge of rationality, where all players believe that all
players are rational, and which dictates that such a game should end at
the Wrst node. It is redundant to ask what a player would do in the
impossible event of a subsequent node being reached. These nodes
should never be reached and so any argument based on this reasoning
is spurious.
In reality, these games do not stop at the initial node; they go on for
at least a few rounds. By behaving irrationally, the players do better
than if they behaved selWshly and rationally. Some games have even
been known to go right to the Wnal node, with players splitting the 50
prize money, without having any prior agreement to do so (Dixit &
Skeath, 1999). So it appears that, in games such as the centipede game
and repeated martyrdom (prisoners dilemma) games, where initial
cooperation can make both players better oV, a form of unspoken
eudaemonic cooperation and a sense of fair play seems to usurp the
rationality of backward induction, although how long such a spirit of
cooperation would be maintained if the prize money at the Wrst node
was very large, is a moot point.
It is impossible to reconcile the Xaws inherent in the concept of
common knowledge of rationality since it attempts to impose a false
certainty into a game where there is, by deWnition, uncertainty. In the
centipede game, for example, it rules out the possibility of any player
ever refusing the pot of money even though this is a legitimate choice
(Romp, 1997). If player A does act irrationally and refuse the prize
money, this provides incontrovertible evidence to player B that player
A is not rational, and the progress of the game thereafter becomes
uncertain. Player B might decide to string player A along for a while in
the certain knowledge that the prize money will escalate and in the
hope that player A will not suddenly turn rational and take the pot!
And from player As point of view, it becomes rational to pretend to be
irrational, at least initially, in the hope that player B wont see through
the ruse. Whatever the outcome or the procedure of the game, the

180

A critique of game theory

assumption of common knowledge of rationality is demonstrably

inconsistent.
Various attempts have been made to overcome the obstacles imposed by the need for some kind of rational consistency. Selten (1975)
suggests that one way to resolve this inconsistency is to assume that
players make random mistakes while playing games. This so-called
trembling hand assumption allows a player in the centipede game to
refuse the prize money without violating the assumption of rationality.
More radically, Binmore (1987) proposes redeWning rationality as
something procedural, requiring players to adopt arbitrary stopping
rules. This model avoids situations where deviations from sub-game
perfection are inconsistent with rationality, by deWning diVerent types
of rationality simply in terms of alternative stopping rules.

Conclusion
Game theory clearly fails to describe the reality of decision making in
some circumstances, although in its defense, it should be said that it
primarily seeks to provide a prescriptive analysis that better equips
players to make good strategic decisions. It does not make moral or
ethical recommendations. It merely explores what happens when certain selWsh incentives are assumed. Game theory cannot be held responsible for selWsh behaviour, no more than medicine can be held
responsible for sickness.
Game theory is in Xux. It is continually being developed and researched. Not all predictions have been found to be supported by
empirical evidence and this has led to reWnement and reconstruction.
So it should be! New and more complex variables have been introduced, largely as a result of its application to neo-classical and neoKeynesian economics, though the extent to which in-game learning
inXuences both success and rationality has not yet been fully explored.
Fundamental questions such as whether learning increases pay-oV or
determines strategy, whether good learners play better games and
which type of learning best equips players for which type of games,
have been left unasked and unanswered. Such questions are of fundamental importance in education, training and organisational development, of course. The rapidly changing nature of society and its

181

Conclusion

post-industrial economy brings new challenges almost daily. Information is no longer precious, the property of the privileged few. It is
immediate, available in real time and irrespective of individual status.
Organisational intelligence has thus become the shared faculty of the
many, and the worth of collectives has become rooted in notions of
social and intellectual capital.
If surviving and thriving in the face of change is the name of the
game, then everyone involved in it is a player. Individuals and organisations need to learn generic concepts of strategic networking and
problem resolution, as a cultural expectation and over a lifetime.
Decision making needs to be informed and sure-footed. The pay-oV
for keeping apace is eVectiveness, the price for failing to do so is
degeneration, and the strategy for avoiding failure lies at the interface
of game theory and learning. It is an interaction that can only grow
stronger.

Proof of the minimax theorem

Preamble
Suppose that the pay-oV matrix for a two-person zero-sum game has m
rows and n columns and that player 1 and player 2 choose their
strategies, represented by row ri and column cj, simultaneously and
with pay-oV uij. Both players randomise their selections using mixed
strategies in which a probability is assigned to each available option: p
for player 1 and q for player 2 say). Of course, the sum of each players
mixed strategies is unity and can be written:
p : (p1, p2, . . . , pi, . . . , pm)
pi : 1, for i : 1 to m.
and
q : (q1, q2, . . . , q j, . . . , qn)
q j : 1, for j : 1 to n
Therefore, strategy ri will be chosen with probability pi and strategy cj
with probability q j. These strategies are chosen independently, so the
probability of getting a particular pay-oV wij is pi q j. The expected
pay-oV for the game is then:
wij pi q j, for i : 1 to m and j : 1 to n.
182

183

Appendix A

Player 1 wants to maximise this expected pay-oV and player 2 to

minimise it. If player 1 knew that player 2 was going to choose column
c say, he or she would choose the strategy which maximised the
expected pay-oV:
maxp wic pi qc
Similarly, if player 2 knew that player 1 was going to choose row r say,
he or she would choose the strategy which minimised the expected payoV:
minq wrj pr q j
Although these counter-strategies cannot be used in practice since
neither player knows what the other will do, the players can nevertheless maximise their security levels by assuming the most pessimistic
choice by the opposing player. So player 1 can guarantee that the
expected pay-oV will not be less than:
maxp minq wij pi q j
and player 2 can guarantee that the expected pay-oV will not be more
than:
minq maxp wij pi q j
The minimax theorem states that these two expressions are equal. In
other words:
maxp minq wij pi q j : minq maxp wij pi q j
for any pay-oV matrix, denoted by (wij).

Proof: step 1
Since, by deWnition, the maximum value of a variable cannot be smaller
than any other value and the minimum value cannot be bigger than any
other value, it follows that
maxp minq wij pi q j : minq wrj pr q j O wrc pr qc O maxp wic pi qc
: minq maxp wij pi q j
So:
maxp minq wij pi q j O minq maxp wij pi q j

184

Appendix A

Teachers

Students

Strategy

Teach on

Passively
supervise
group study

Actively
give revision
workshops

Attend
lessons

Do not attend
lessons

Figure A.1

The intention is to now prove that:

maxp minq wij pi q j P minq maxp wij pi q j
thereby proving the theorem.

A graphic model for the game

If player 1 chooses row 1, say, and player 2 chooses a mixed strategy,
then the expected pay-oV will be:
w11q1 ; w12q2 ; ; w1n qn : w1j q j
If player 1 had chosen row 2 instead, then the expected pay-oV would
have been:
w2j q j
and so on, for m possible rows. Therefore, any of player 2s mixed
strategies can be represented graphically by a point in an m-dimension
coordinate system, W, with m axes. That point can be written:
(w1j q j, w2j q j , . . . , wmj q j )
Consider the following example, in two dimensions, represented on
Figure A.1. (This example can be contextualised by reference to
Example 5.4.)
Every point in W corresponds to a strategy for player 2 and the
vertices are the pure strategies (see Figures A.2 and A.3).
Any point inside W corresponds to a mixed strategy for player 2
corresponding to
(q1, q2, q3), where q3 : 1 9 (q1 ; q2)

185

Appendix A

Player 2
chooses
column 1

Player 2
chooses
column 2

Player 2
chooses
column 3

( 20, 0 )

( 8, 10 )

( 3, 9 )

Player 1
chooses
row 1
Player 1
chooses
row 2
Vertex
Figure A.2

Player 1
chooses
row 2
b

Player 1
chooses
row 1

Figure A.3

For example, consider the point (8, 9). Clearly:

w1j q j : 8 : 20q1 ; 8q2 ; 3(1 9 q1 9 q2)
and
w2j q j : 9 : 0q1 ; 10q2 ; 9(1 9 q1 9 q2)
These two equations produce the following pair of simultaneous equations:
5 : 17q1 ; 5q2
0 : q2 9 9q1

186

Appendix A

which in turn produce the following solution set:

q1 : 5/62;

q2 : 45/62;

q3 : 12/62

So the point (8, 9) corresponds to the mixed strategy (5/62, 45/62,

12/62).

Proof: step 2
If
minq maxp wij pi q j 0
then the coordinates of any point in W cannot both be negative. At least
one must be zero or positive and therefore, the region of the third
quadrant, Q, can have no common point with W.
Let a : (a1, a 2, . . . , am) be the point in Q nearest W, and b : (b1,
b2, . . . , bm) be the point in W nearest a. Clearly, if ai is replaced by a
negative number ai* or zero, a is still a point in Q and is therefore no
nearer W. In other words:
(b1 9 a1*)2 ; (b2 9 a2)2 ; ; (bm 9 am)2 P (b1 9 a1)2
; (b2 9 a2)2 ; ; (bm 9 am)2
which simpliWes after cancellation to:
(b1 9 a 1*)2 P (b1 9 a1)2
There are two cases to consider.
If b1 O 0, then possibly a1* : b1, in which case a1 : b1. Likewise a
and b.
If b1 0 and if it is assumed that a1* : 0, then b12 P (b1 9 a1)2,
which simpliWes to (b1 9 a1) a1 : 0 because a1 O 0 and b1 0.
Likewise a and b.

Proof: step 3
For any number t between 0 and 1 and any point w + W, the point
tw ; (1 9 t)b : [tw1 ; (1 9 t)b1, tw2 ; (1 9 t)b2, . . . ,
twm ; (1 9 t)bm]

187

Appendix A

on the straight line between w and b is a probability weighted average of

the points w and b. Since w and b are in W, they in turn are probability
weighted averages of the vertices of W and therefore, so is the point
tw ; (1 9 t)b.
So t(w) ; (1 9 t)b also belongs to W and is thus no nearer to a than
b. Thus:
[tw1 ; (1 9 t)b1 9 a1]2 ; [tw2 ; (1 9 t)b2 9 a2]2 ;
; [twm ; (1 9 t)bm 9 am]2 P (b1 9 a1)2 ; (b2 9 a2)2;
; (bm 9 am)2
If t is small, this expression becomes:
w1(b1 9 a 1) ; w2(b2 a2) ; ; wm(bm 9 am)
P (b1 9 a1)2 ; (b2 9 a 2)2 ; ; (bm 9 am)2
The right-hand side of this equation cannot be negative, therefore:
w1(b1 9 a1) ; w2(b2 9 a2) ; ;wm(bm 9 am) P 0

Proof: step 4
Each of the numbers (b1 9 a1), (b2 9 a2), . . . , (bm 9 am) is positive or
zero, according to the result of step 2. They cannot all be zero, because a
and b are diVerent points and therefore cannot have all the same
coordinates. Therefore, the sum of all these numbers is positive, i.e:
(bi 9 ai) 0
Therefore, if (bi 9 ai)/(bi 9 ai) is denoted by i, then 1, 2, . . . , m are
each either zero or positive and i : 1, since (bi 9 ai)/
(bi 9 ai) : 1.
So : (1, 2, . . . , m) satisWes the requirements of a mixed strategy.
Dividing each term of the expression at the end of step 3 by
(bi 9 ai) gives:
a11 ; a22 ; ; amm 0, for every w + W
But according to the deWnition of W in the graphic model, the
coordinates of w are:
w : (w1j q j, w2j q j, . . . , wmj q j)

188

Appendix A

so a mixed strategy has therefore been found for player 1 such that:
1w1j q j ; 2w2j q j ; ; mwmj q j 0
for every q, so that:
minq wij i q j 0
Since this holds for , it must hold for the mixed-strategy p that
maximises
minq wij pi q j
Therefore:
maxp minq wij pi q j 0
So it has been shown that:
if minq maxp wij pi qj 0, then maxp minq wij pi qj 0

Proof: step 5
Let k be any number and consider the pay-oV matrix which has wij 9 k
in place of wij everywhere. All pay-oVs are reduced by k in both pure
and mixed strategies. So:
(maxp minq wij pi q j) is replaced by (maxp minq wij pi q j 9 k)
and
(minq maxp wij pi q j) is replaced by (minq maxp wij pi q j 9 k)
It was proved in step 4 that:
if (minq maxp wij pi q j 9 k) 0, then (maxp minq wij pi q j 9 k) 0
So
if minq maxp wij pi q j k, then maxp minq wij pi q j k
Since k can be as close as necessary to minq maxp wij pi q j, it follows
that:
maxp minq wij pi q j P minq maxp wij pi q j
But we have already seen that:

189

Appendix A

maxp minq wij pi q j O minq maxp wij pi q j

Therefore, the two must be equal, i.e:
maxp minq wij pi q j : minq maxp wij pi q j

Proof of Bayess theorem

Preamble
Bayess theorem shows how a posteriori probabilities are calculated
from a priori ones. In other words, how probabilities are updated as
more information is received. In its simplest form it states:
p(A/B) :

p(B/A) p(A)
p(B/A) p(A) ; p(B/Ac) p(Ac)

where Ac denotes the complementary event of A such that:

p(Ac) : 1 9 p(A)
and the a posteriori or conditional probability of event A happening,
given that B has already happened, is denoted by p(A/B).
Alternatively, if event B has a non-zero probability of occurring for
each event Ai, of which there are n possibilities say, then Bayess
theorem may be stated as:
p(Ai/B) :

p(B/Ai) p(Ai)
ip(B/Ai) p(Ai)

For example, if there are only two possibilities for Ai, then:
p(A1/B) :
190

p(B/A1) p(A1)
p(B/A1) p(A1) ; p(B/A2) p(A2)

191

Appendix B

and
p(A2/B) :

p(B/A2) p(A2)
p(B/A1) p(A1) ; p(B/A2) p(A2)

where p(A2) : 1 9 p(A1) of course.

Proof
By deWnition,
p(Ai and B) : p(B/Ai) p(Ai)
and
p(Ai and B) : p(Ai/B) p(B)
Equating these two gives:
p(Ai/B) :

p(B/Ai) p(Ai)
p(B)

(1)

or
p(B) p(Ai/B) : p(B/Ai) p(Ai)
Summing both sides over i gives:
p(B) i p(Ai/B) : i p(B/Ai) p(Ai)
But we know that
i p(Ai/B) : 1
so
p(B) : i p(B/Ai) p(Ai)
Substituting (1) into (2) gives the result:
p(Ai/B) :

p(B/Ai) p(Ai)
i p(B/Ai) p(Ai)

(2)

Bibliography

Abreu, D. (1986) Extremal equilibria of oligopolistic supergames, Journal of Economic

maximax principle 467
maximin principle 46
minimax principle 47
regret matrices 47
see also chance games
union candidates, leadership games example
1024
university research partnership, sequential
decision example 625
utility theory/value/function 18, 3845
variable-sum games see mixed-motive
two-person games of strategy
variance, probability theory 367
von Neumann, John 911
and Maurice Frechet 9
minimax theorem 89
von NeumannMorgenstern utility function
39, 412
von Stackelberg model of duopoly 11516
paper and packaging sector example 11620
Von StackelbergNash equilibrium level
1234
voting order 168
Waldegraves solution 8
Walmart Stores Inc change proposals,
sequential change examples 6970
whole pot payments 155

Zermolos theorem 8, 12
zero-sum non-cooperative games, about 2
zero-sum two-person games of strategy 6,
7797
about 778
constant-sum games 77
dominance and inadmissibility 856
Wnite and inWnite 78
information sets 78
interval scales for pay-oVs 934
large matrices 903
law of conservation of utility value 77
medical incompetence example 823
minimax mixed strategies 89
mixed strategies 89
no saddle-point games 8690
normal forms 7880
ordinal pay-oVs 947
pareto-eYciency 77
pay-oV matrix 83
perfect and imperfect information
789
re-allocation of airline pilot duties example
8790
representation 7880
saddle points 806
solutions, deWnition 78
student attendance example 913
travel timetable reforming example 834
zero-sum game representation 7880

Game Theory - 4th Edition - Owen, Guillermo
100% (6)
Game Theory - 4th Edition - Owen, Guillermo
627 pages
DHAMI, Sanjit - The Foundations of Behavioral Economic Analysis
No ratings yet
DHAMI, Sanjit - The Foundations of Behavioral Economic Analysis
2,709 pages
Petition For Change of Name
100% (10)
Petition For Change of Name
5 pages
BSC PT - Inpatient PT Evaluation Form
50% (2)
BSC PT - Inpatient PT Evaluation Form
1 page
Strategic Thinking in Complex Problem Solving
100% (15)
Strategic Thinking in Complex Problem Solving
300 pages
Repeated Games and Reputations Long PDF
No ratings yet
Repeated Games and Reputations Long PDF
664 pages
Avinash Dixit and Barry Nalebuff - Thinking Strategically
100% (6)
Avinash Dixit and Barry Nalebuff - Thinking Strategically
400 pages
(Universitext) Peter Morris (Auth.) - Introduction To Game Theory (1994, Springer-Verlag New York)
89% (9)
(Universitext) Peter Morris (Auth.) - Introduction To Game Theory (1994, Springer-Verlag New York)
240 pages
Exercises For The Midterm - Summer 2023
No ratings yet
Exercises For The Midterm - Summer 2023
6 pages
Behavioural Econ PDF
100% (3)
Behavioural Econ PDF
354 pages
Game Theory
67% (3)
Game Theory
52 pages
Enjoyable Econometrics-Cambridge University Press (2018)
100% (4)
Enjoyable Econometrics-Cambridge University Press (2018)
302 pages
Games and Decisions: Introduction and Critical Survey
From Everand
Games and Decisions: Introduction and Critical Survey
R. Duncan Luce
4/5 (14)
(Saul Sathl) A Gentle Introduction To Game Theory PDF
100% (4)
(Saul Sathl) A Gentle Introduction To Game Theory PDF
190 pages
Game Theory
50% (2)
Game Theory
12 pages
Fernando Vega-Redondo Economics and The Theory of Games 2003
No ratings yet
Fernando Vega-Redondo Economics and The Theory of Games 2003
526 pages
Systems Thinking
100% (10)
Systems Thinking
62 pages
Macroeconomics For Managers
100% (5)
Macroeconomics For Managers
838 pages
Case Study of Phobias
No ratings yet
Case Study of Phobias
8 pages
Game Theo
100% (2)
Game Theo
587 pages
The Managers Guide To Systems Thinking
100% (23)
The Managers Guide To Systems Thinking
227 pages
Game Theory
100% (8)
Game Theory
587 pages
Game Theory PDF
100% (5)
Game Theory PDF
295 pages
Models in Cooperative Game Theory
100% (2)
Models in Cooperative Game Theory
203 pages
Introduction To Game Theory
100% (2)
Introduction To Game Theory
30 pages
Behavioral Economics
75% (4)
Behavioral Economics
145 pages
Behavioral Economics PDF
No ratings yet
Behavioral Economics PDF
13 pages
Behavioral Economics: Masao Ogaki Saori C. Tanaka
100% (6)
Behavioral Economics: Masao Ogaki Saori C. Tanaka
218 pages
Game Theory and Its Applications
100% (1)
Game Theory and Its Applications
39 pages
Game Theory - Strategies Equilibria and Theorems PDF
100% (2)
Game Theory - Strategies Equilibria and Theorems PDF
396 pages
Theory of Decision Under Uncertainty
100% (2)
Theory of Decision Under Uncertainty
232 pages
A Bayesian Theory of Games: Iterative Conjectures and Determination of Equilibrium
100% (4)
A Bayesian Theory of Games: Iterative Conjectures and Determination of Equilibrium
109 pages
Game Theory Interactive Strategies in Economics and Management 1nbsped 0521176042 9780521176040 978052176
100% (3)
Game Theory Interactive Strategies in Economics and Management 1nbsped 0521176042 9780521176040 978052176
461 pages
Behavioral Economics
100% (2)
Behavioral Economics
248 pages
The Art of Logical Thinking
100% (5)
The Art of Logical Thinking
210 pages
Behavioral Economics-eBook PDF
100% (2)
Behavioral Economics-eBook PDF
32 pages
Economics and The Theory of Games - Vega-Redondo PDF
67% (3)
Economics and The Theory of Games - Vega-Redondo PDF
526 pages
Introduction To Game Theory in Business and Economics (Webster, Thomas J) (Z-Library)
100% (3)
Introduction To Game Theory in Business and Economics (Webster, Thomas J) (Z-Library)
425 pages
A Primer in Game Theory
No ratings yet
A Primer in Game Theory
164 pages
Theory of Games Economic Behavior
100% (8)
Theory of Games Economic Behavior
674 pages
Game Theory For Applied Economists
100% (3)
Game Theory For Applied Economists
142 pages
The Economics of Uncertainty: Course Guidebook
100% (1)
The Economics of Uncertainty: Course Guidebook
189 pages
Game_Theory_and_Applications_Florian_Bartholomae_&_Marcus_Wiens
No ratings yet
Game_Theory_and_Applications_Florian_Bartholomae_&_Marcus_Wiens
337 pages
Game Theory: Compiled By: Gaurav Rampall Gautam Mudgal Gurjot Singh Bhalla Jigyasa Srivastava
100% (1)
Game Theory: Compiled By: Gaurav Rampall Gautam Mudgal Gurjot Singh Bhalla Jigyasa Srivastava
19 pages
The Oxford Handbook of Behavioral Economics and The Law (PDFDrive) PDF
100% (5)
The Oxford Handbook of Behavioral Economics and The Law (PDFDrive) PDF
841 pages
X Game
100% (1)
X Game
365 pages
Probability For Finance
83% (6)
Probability For Finance
115 pages
Constraint Theory Multidimensional Mathematical Model Management
100% (3)
Constraint Theory Multidimensional Mathematical Model Management
235 pages
Tut 8 Solution
100% (1)
Tut 8 Solution
2 pages
Game Theory and Politics
From Everand
Game Theory and Politics
Steven J. Brams
4/5 (1)
Decision Making Using Game Theory: An Introduction For Managers
No ratings yet
Decision Making Using Game Theory: An Introduction For Managers
8 pages
Game Theory Thesis Ideas
100% (2)
Game Theory Thesis Ideas
8 pages
Game Theory Limitations Summary Monopoly, The Vivarium and Game Theory: Default Limitations of Simulation Models
No ratings yet
Game Theory Limitations Summary Monopoly, The Vivarium and Game Theory: Default Limitations of Simulation Models
1 page
Economics
No ratings yet
Economics
7 pages
How Can Gaming Help Test Your Theory_ _ RAND
No ratings yet
How Can Gaming Help Test Your Theory_ _ RAND
7 pages
Chapter 3: Game Theory: Game Theory Extends The Theory of Individual Decision-Making To
100% (2)
Chapter 3: Game Theory: Game Theory Extends The Theory of Individual Decision-Making To
20 pages
Decision Making and Business Strategy
No ratings yet
Decision Making and Business Strategy
20 pages
Game Theory: Sri Sharada Institute of Indian Management - Research
No ratings yet
Game Theory: Sri Sharada Institute of Indian Management - Research
20 pages
Bachelor Thesis Game Theory
100% (3)
Bachelor Thesis Game Theory
4 pages
Australian Journal of Agricultural Economics - June 1969 - Longworth - MANAGEMENT GAMES AND THE TEACHING OF FARM MANAGEMENT
No ratings yet
Australian Journal of Agricultural Economics - June 1969 - Longworth - MANAGEMENT GAMES AND THE TEACHING OF FARM MANAGEMENT
10 pages
Strategic Decisions: The Mathematics of Game Theory
From Everand
Strategic Decisions: The Mathematics of Game Theory
Lexa N. Palmer
No ratings yet
Session 01
No ratings yet
Session 01
18 pages
Introduction Game Theory
No ratings yet
Introduction Game Theory
3 pages
Handi Logo
No ratings yet
Handi Logo
1 page
Data
No ratings yet
Data
7 pages
Introduction of Smart Meters in Pakistan and Role of Mobilik
No ratings yet
Introduction of Smart Meters in Pakistan and Role of Mobilik
6 pages
Vocabulary Guidelines
No ratings yet
Vocabulary Guidelines
12 pages
The Villain in the Book_ Lost All Cultivation, The Heroine Reversed 1 to 30
No ratings yet
The Villain in the Book_ Lost All Cultivation, The Heroine Reversed 1 to 30
125 pages
Lax Tribune Preview
No ratings yet
Lax Tribune Preview
4 pages
Synthesis of Cellulose Acetate
100% (1)
Synthesis of Cellulose Acetate
4 pages
Chase Banking Mortgage Training Guide
100% (3)
Chase Banking Mortgage Training Guide
46 pages
Vanita, Homosexuality in India
No ratings yet
Vanita, Homosexuality in India
2 pages
Lily Cronin-Miles Aldridge Presentation Photography II
No ratings yet
Lily Cronin-Miles Aldridge Presentation Photography II
21 pages
Colorado Preschool Program: 2010 Legislative Report
No ratings yet
Colorado Preschool Program: 2010 Legislative Report
15 pages
Lectures 2
No ratings yet
Lectures 2
14 pages
SIS Book - Chapter 02 - SIL Allocation
No ratings yet
SIS Book - Chapter 02 - SIL Allocation
36 pages
Engineering Thermodynamics Paper Set
No ratings yet
Engineering Thermodynamics Paper Set
36 pages
The Cambridge Dictionary of Philosophy 3° Edition Robert Audi (Editor) All Chapters Instant Download
100% (3)
The Cambridge Dictionary of Philosophy 3° Edition Robert Audi (Editor) All Chapters Instant Download
50 pages
Cabadbaran City Resolution 2013
No ratings yet
Cabadbaran City Resolution 2013
2 pages
Destillation of Wood Activity
No ratings yet
Destillation of Wood Activity
6 pages
Almond Cakes - Savour Online Classes
No ratings yet
Almond Cakes - Savour Online Classes
4 pages
Nicolas Vs - Ca Digest
No ratings yet
Nicolas Vs - Ca Digest
3 pages
LAB - The Axial Skeleton: Objectives
No ratings yet
LAB - The Axial Skeleton: Objectives
4 pages
Librt: The Philippine Government Must Prohibit Political Dynasty
No ratings yet
Librt: The Philippine Government Must Prohibit Political Dynasty
3 pages
Lesson Plan
No ratings yet
Lesson Plan
2 pages
Nte
No ratings yet
Nte
8 pages
Suzuki Grand Vitara - Grand Vitara 2.0 TDi (2001 - 2005)
No ratings yet
Suzuki Grand Vitara - Grand Vitara 2.0 TDi (2001 - 2005)
3 pages
Reviewer (Book) Chapter 1-3
67% (3)
Reviewer (Book) Chapter 1-3
41 pages
Determinants of Employee Engagement and Their Impact On Employee Performance
No ratings yet
Determinants of Employee Engagement and Their Impact On Employee Performance
1 page
Applying Multiple Techniques of Critical Thinking in Teaching Of, E.g., Economics 507, 461, 360, 361
No ratings yet
Applying Multiple Techniques of Critical Thinking in Teaching Of, E.g., Economics 507, 461, 360, 361
28 pages
SSC CHSL Online Mock Test
No ratings yet
SSC CHSL Online Mock Test
11 pages
Instruction Manual For EMT777 Digital Timer: 1.) Main Functions
No ratings yet
Instruction Manual For EMT777 Digital Timer: 1.) Main Functions
3 pages
Sorting and Filtering
No ratings yet
Sorting and Filtering
4 pages
Health Examination Record: 2019 SHD Form 4-A
100% (1)
Health Examination Record: 2019 SHD Form 4-A
3 pages