GOS5 CH10 Solutions
SOLVED EXERCISES
S1. False. The players are not assured that they will reach the cooperative outcome. Rollback
reasoning shows that the subgame-perfect equilibrium of a finitely played repeated prisoners’ dilemma
will entail constant cheating.
S2. (a) The payoffs are ranked as follows: high payoff from cheating (72) > cooperative payoff
(64) > defect payoff (57) > low payoff from cooperating (20). This conforms to the pattern in the text so
the game is a prisoners’ dilemma, as can also be seen in the payoff table given below:
                              Kid’s Korner
                          High           Low
  (Rival firm)   High    64, 64         20, 72
                 Low     72, 20         57, 57
If the game is played once, the Nash equilibrium strategies are (Low, Low) and payoffs are (57, 57).
(b) Total profits at the end of four years = 4 × 57 = 228. Firms know that the game ends in
four years so they can look forward to the end of the game and use rollback to find that it’s best to cheat
in year 4. Similarly, it is best to cheat in each preceding year as well. It follows that it is not possible to
sustain cooperation in the finite game.
(c) The one-time gain from defecting = 72 – 64 = 8. Loss in every future period = 64 – 57 =
7. Cheating is beneficial here if the gain exceeds the present discounted value of future losses, or if 8 >
7/r. Thus, r > 7/8 (equivalently, δ < 8/15) makes cheating worthwhile, and r < 7/8 lets the grim strategy
sustain cooperation between the firms in the infinite version of the game. If r = 0.25, cooperation can be
sustained.
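As a quick numerical check of the grim-trigger condition in part (c), the comparison of the one-time gain with the present value of future losses can be coded directly (the function name is illustrative):

```python
# Grim-trigger check: a one-time gain from cheating versus a perpetual
# per-period loss starting next period, whose present value is loss/r.
def cheating_pays(gain, per_period_loss, r):
    return gain > per_period_loss / r

# With gain = 8 and loss = 7, the critical rate is r = 7/8.
assert not cheating_pays(8, 7, 0.25)  # r = 25%: cooperation is sustained
assert cheating_pays(8, 7, 0.90)      # r = 90% > 7/8: cheating pays
```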
(d) Total profits after four years = 4 × 64 = 256. With no known end of the world, the firms
can sustain cooperation if r < 7/8 as in part (c). This answer is different from that in part (b) because the
firms see no fixed end point of the game and can’t use backward induction. Instead, they assume the
game is infinite and use the grim strategy to sustain cooperative outcome.
S3. (a) Payoffs are in thousands of dollars of salary. Each manager has a dominant strategy to
expend high effort. The Nash equilibrium is (High, High) with payoffs of 150 to each. This is not a
prisoners’ dilemma because the (Low, Low) outcome does not provide the managers with higher salaries
than they receive in the Nash equilibrium:
[Payoff table not reproduced; Manager 2 is the column player, with strategies High and Low.]
(b)
Managers still have a dominant strategy to expend high effort, so the Nash equilibrium is still (High,
High), but payoffs are now 90 to each in that equilibrium. The payoffs from the (Low, Low) outcome are
now better than those achieved in the Nash equilibrium; this game is a prisoners’ dilemma:
[Payoff table not reproduced; Manager 2 is again the column player, with strategies High and Low.]
S4. (a) The game tree below shows payoffs in the order (You, Friend). The rollback equilibrium
is shown by making the chosen branches thicker at each node; the equilibrium is (Don’t invest, Cheat if
invest):
(b) In the repeated version, honesty could be sustained by a “grim trigger strategy,” where if
your friend ever cheats you, you will never invest with him again. The friend can get an extra $120 (= 130
– 10) any one time, but will lose $10 every time thereafter. Then he will not cheat you if 10 > 120r, or r <
10/120 = 8.33%. Then 100(1 + r) < 120 is also true, so it is optimal for you to invest in your friend’s
business so long as he is following the equilibrium strategy, that is, not cheating first.
(c) If the rate of interest is 10%, the above agreement cannot be sustained. Suppose an
alternative agreement is that your friend gets x and you get 130 – x. By cheating you once, he can get an
extra 130 – x, but will lose x each period thereafter. He will not cheat you if x > 0.1(130 – x), or 1.1x > 13,
that is, x > 13/1.1 ≈ $11.82. For you to be willing to invest, you also need 130 – x > 100(1.1) = 110, or x <
20; any split with 11.82 < x < 20 therefore sustains cooperation.
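The cutoff implied by 1.1x > 13 can be verified numerically (a minimal sketch):

```python
# S4(c): the friend will not cheat if x > 0.1 * (130 - x), i.e. 1.1x > 13.
x_min = 13 / 1.1
# The friend's per-period share must exceed roughly $11.82.
assert abs(x_min - 11.818181818181818) < 1e-9
```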
S5. (a) One manager is designated to choose High and the other Low. The High chooser makes a
side payment to the Low chooser so that each gets [(200 – 60) + 80]/2 = 110 each period. The necessary
side payment is 30 (thousand).
(b) Defection entails refusing to make the side payment, so the cheater gets an extra 30 for
one period. But then the game collapses to the single-shot Nash equilibrium in which payoffs are 90 to
each manager, so the cost of cheating is 110 – 90 = 20 each subsequent period. Defection is beneficial if
30 > 20/r, or if r > 2/3 = 66.67%. This is unlikely to be the case.
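The same grim-trigger arithmetic as in S2 applies here; a short check (the function name is illustrative):

```python
# S5(b): cheating gains 30 once, then costs 110 - 90 = 20 every later period,
# a future loss with present value 20/r.
def defection_pays(gain, per_period_loss, r):
    return gain > per_period_loss / r

assert not defection_pays(30, 20, 0.10)  # typical rates: cooperation holds
assert defection_pays(30, 20, 0.70)      # r above 2/3: defection pays
```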
S6. In the k < 1 case, (Swerve, Swerve) maximizes the players’ joint payoff. Maintaining this type
of cooperation, however, is essentially impossible. This game differs from a prisoners’ dilemma because a
cheater in a prisoners’ dilemma can rationally expect retaliation. When one player establishes a pattern of
playing Defect, it is individually optimal for the other player also to play Defect. Therefore, a potential
cheater must compare the immediate gain from cheating with the future loss from the breakdown of
cooperation.
In this chicken game, in contrast, if one driver succeeds in being the first to drive Straight (and
will continue to do so), it is not rational for the other driver to retaliate; if James is going Straight, Dean’s
best response is to Swerve. James can thus lock in an outcome in which he achieves his most preferred
result. Thus, any attempt to establish a pattern of (Swerve, Swerve) outcomes is likely to break down as
each player tries to be the first to establish that he will choose Straight.
In the k > 1 case, a pattern in which each player alternates between Swerve and Straight is, once
established, almost certain to last. Both (Swerve, Straight) and (Straight, Swerve) are Nash equilibria in a
single-play game. Thus, once the pattern of alternating actions has been established, neither driver can
gain by deviating from it.
One difficulty that is likely to arise in this situation is in determining who gets the k payoff (and
who gets the –1 payoff) in the first round. With either discounting or an uncertain end to the game (or
an odd number of rounds), the player who drives Straight in the first round will have an advantage;
if both players attempt to get this advantage, the alternating pattern may be hard to establish. Of course,
either player would prefer to always drive Straight, while having the other driver respond by choosing
Swerve. This Always Straight strategy, however, is not optimal if you expect the other driver to alternate.
S7. (b) When each country i produces half of the joint-profit-maximizing quantity Q = 75 found in
part (a), so that qi = 37.5, each will earn
πi = (P – c) * qi = (180 – 75 – 30) * 37.5 = 2,812.5.
(c) Assume that Korea decides to defect. (Because the per-unit costs are the same, the answer
will be identical if it is Japan that decides to defect.) Given that Japan is cooperating, Korea’s profit
function is
πK = (P – c) * qK = (180 – 37.5 – qK – 30) * qK = 112.5qK – (qK)²,
which is maximized at qK = 56.25. The resulting profit for Korea in that year is
112.5 * 56.25 – (56.25)² = 3,164.0625.
Japan’s profits will be
(180 – 37.5 – 56.25 – 30) * 37.5 = 2,109.375.
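The best-response calculation in part (c) is easy to reproduce (variable names are my own):

```python
# Korea's profit when Japan holds at 37.5: (180 - 37.5 - qK - 30) * qK
# = 112.5*qK - qK**2, maximized where marginal profit 112.5 - 2*qK = 0.
qK = 112.5 / 2                                 # best response: 56.25
profit_korea = 112.5 * qK - qK**2              # 3164.0625
profit_japan = (180 - 37.5 - qK - 30) * 37.5   # 2109.375
```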
(d) Each year, Korea and Japan are playing the following game (rounding to the nearest tenth
and assuming that when both defect each knows the other is also defecting, so that in that case they play
their Nash strategies):
                                Japan
                       Cooperate            Defect
Korea   Cooperate   2,812.5, 2,812.5    2,109.4, 3,164
        Defect      3,164, 2,109.4      2,500, 2,500
(e) The one-time gain from cheating is 3,164 – 2,812.5 = 351.5. The loss in each subsequent
year is 2,812.5 – 2,500 = 312.5. If each country is using a grim-trigger strategy, to sustain cooperation the
interest rate r needs to satisfy 351.5r < 312.5, or r < 88.9%.
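The cutoff interest rate can be recomputed with the unrounded profits from parts (b) and (c):

```python
# One-time gain from defecting versus per-period loss under grim trigger.
gain = 3164.0625 - 2812.5   # 351.5625
loss = 2812.5 - 2500.0      # 312.5
r_max = loss / gain         # cooperation is sustainable iff r < r_max
assert round(r_max, 3) == 0.889
```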
S8. (a) Suppose that Player 1 decides to Roll (rather than Steal) with a prize pot of $X. With
probability 1/6, the die comes up 1 and everything is lost; the expected loss is $X/6. With probability 5/6,
however, the die comes up 2, 3, 4, 5, or 6 and increases the prize pot on average by $4,000 and Player 1
has the chance to decide again whether to roll. So long as $X < $20,000, the expected benefit from rolling
to increase the prize pot (5/6 × $4,000) exceeds the expected loss from losing everything (1/6 × $X); so,
given any prize pot $X < $20,000, Player 1 should roll again. On the other hand, given any $X > $20,000,
the expected loss exceeds the expected benefit, and Player 1 prefers to steal.
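The $20,000 cutoff in part (a) comes from comparing the two expected changes (the function name is illustrative):

```python
# Rolling loses the pot X with probability 1/6 but adds $4,000 on average
# with probability 5/6; rolling is attractive while (5/6)*4000 > X/6.
def should_roll(pot):
    return (5 / 6) * 4000 > pot / 6

assert should_roll(18000)       # below $20,000: keep rolling
assert not should_roll(21000)   # above $20,000: steal instead
```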
(b) By part (a), Player 1 prefers to steal now even if she believes that Player 2 is certain to
roll in all future periods. Moreover, Player 1’s incentive to steal is even higher if Player 2 may steal now
or in the next period. (If Player 2 steals now, Player 1 gains nothing unless she steals now. If Player 2
steals in the next period, Player 1 only gets half of the prize pot when stealing next period, increasing the
relative attractiveness of stealing now.) Thus, Player 1 must steal; by the same reasoning, so must Player
2.
(c) To show that this is a prisoners’ dilemma, we need to show that both players have a
dominant strategy (to steal now) and that they are both worse off when they both play their dominant
strategies than when they both play another strategy (to roll one more time and then steal). By (a), both
players are better off rolling one more time and then splitting the pot, compared to splitting $18,000.
However, both players have a dominant strategy to steal right away. To see why, suppose that Player 1
believes that Player 2 is stealing now with probability p. Stealing now gives Player 1 a payoff of $9,000 ×
p + $18,000 × (1 – p) = $18,000 – $9,000p. By contrast, since the best possible outcome after rolling is to
split a prize pot of $24,000, rolling now gives Player 1 an expected payoff less than 0 × p + ($24,000/2) ×
(1 – p) = $12,000 – $12,000p, which is less than $18,000 – $9,000p for any value of p.
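The dominance argument in part (c) can be checked over a grid of beliefs p:

```python
# Stealing now yields 18000 - 9000*p; rolling yields strictly less than
# 12000 - 12000*p (the best case of splitting a $24,000 pot next period).
for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    assert 18000 - 9000 * p > 12000 - 12000 * p
```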
(d) Yes! Even though both players must steal when the prize pot gets sufficiently large, they
have an incentive not to steal at first, as long as they don’t believe that the other player is stealing. To see
why, suppose that the prize pot is $1,000 and that Player 1 believes that Player 2 is certain to roll. Player
1’s payoff from stealing now is $1,000. However, rolling now (and stealing next time!) gives a payoff of
at least 1/6 × $0 + 1/6 × ($3,000/2) + 1/6 × ($4,000/2) + 1/6 × ($5,000/2) + 1/6 × ($6,000/2) + 1/6 ×
($7,000/2) = 5/6 × $2,500, slightly more than $2,000. So, this is Player 1’s best response when believing
that Player 2 is going to roll herself, and vice versa for Player 2. Consequently, there exists an equilibrium
in which both roll at least once.
(e) Let X* be the highest prize pool given which both players choose to roll in some rollback
equilibrium. Because the players are certain to steal given any larger prize pool, they are certain to split
the pool next period if they do roll now; so, if they roll, each player expects to get payoff 1/6 × 0 + 1/6 ×
($X* + $2,000)/2 + 1/6 × ($X* + $3,000)/2 + 1/6 × ($X* + $4,000)/2 + 1/6 × ($X* + $5,000)/2 + 1/6 ×
($X* + $6,000)/2 = 5X*/12 + $5,000/3. For rolling to be an equilibrium choice, this must be at least the
payoff X* from deviating and stealing the whole pot, which requires X* ≤ $60,000/21 ≈ $2,857.
Because prize-pot amounts are always multiples of $1,000, we can conclude that both players will always
steal in any rollback equilibrium as soon as the prize pot reaches or exceeds $3,000. In particular, while a
rollback equilibrium exists in which the players roll once, they never roll twice!
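The rollback threshold in part (e) can be confirmed numerically (the function name is mine):

```python
# If both players will split the pot next period, rolling with a pot X
# gives (1/6)*0 + sum over k = 2..6 of (1/6)*(X + 1000*k)/2
# = 5*X/12 + 5000/3; deviating and stealing the whole pot gives X.
def roll_value(pot):
    return sum((pot + 1000 * k) / 2 for k in range(2, 7)) / 6

assert roll_value(2000) >= 2000  # rolling is still sustainable
assert roll_value(3000) < 3000   # once the pot hits $3,000, steal
```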
S9. (a) Each transaction generates profit of $100 (= $200 – $100). The present value of this profit
stream is PV = 100 + 100δ + 100δ² + ⋯ = 100/(1 – δ). Given δ = 2/3, PV = $300.
(b) The game tree when Buyer moves first is shown below. To explain the payoffs: In the
outcome (Pay, Deliver), Buyer gets $100 net benefit because $200 is paid for something of $300 value,
while Seller gets $300 as shown in part (a); these include the present value of future transactions. On the
other hand, in the outcome (Pay, Cheat), Buyer loses $200 and Seller gains $200 because the money is
stolen and there are no future transactions (since, by presumption, Seller will be kicked off the forum and
not be allowed to sell usernames in the future).
The benefit of cheating a Buyer is that the Seller does not need to pay $100 to create a ready-to-sell eBay
username. The cost of cheating is that all future transactions will be lost; the present value of these lost
future transactions is 100δ + 100δ² + ⋯ = 100/(1 – δ) – 100 = $200. Since $200 > $100, the Seller
will choose not to cheat if the Buyer pays; anticipating this outcome, the Buyer pays and each player
enjoys a net benefit of $100. In the unique rollback equilibrium, Seller plays the strategy “Deliver if
Buyer chooses Pay”; Buyer chooses Pay; and the rollback equilibrium outcome is (Pay, Deliver).
(c) The benefit of cheating a Buyer remains $100, but now the cost of lost future business is
only 20δ + 20δ² + ⋯ = 20/(1 – δ) – 20 = $40, where $20 (= $120 – $100) is the profitability of each
transaction. Since $40 < $100, the Seller will choose to cheat any Buyer who pays; anticipating this, the
Buyer will not pay, and no trade takes place.
(d) There are three basic ways in which the Aspkin Forum could “change the game” to
incentivize Sellers not to cheat.
First, the forum could create a penalty in addition to loss of future sales that cheating Sellers would have
to pay. For instance, Sellers could be required to post a bond of $100 that they will lose if they ever cheat
a Buyer. This increases the cost of cheating from $40 (derived in part (c)) to $140. Since $140 > $100,
Sellers now prefer not to cheat.
Second, the forum could take steps to increase the value of future business. For instance, reduce the
number of Sellers who are allowed to participate, thereby increasing the volume of sales for each
remaining Seller. If each remaining Seller expects to sell (say) three usernames per year, being expelled
from the site now entails losing 3 × 20δ + 3 × 20δ² + ⋯ = 3 × 20/(1 – δ) – 3 × 20 = $120 in future-year
profits, enough to deter them from cheating today.
Finally, the forum could reduce Seller’s temptation to cheat in the first place. For instance, suppose that
the forum shared “best practices” with its Sellers, helping them to create fake eBay usernames more
easily. If such efforts drove down the cost of creating such usernames, Sellers would have less to gain by
cheating Buyers. For example, if the cost of creating a ready-to-sell eBay username falls from $100 to
$60, the profitability of each trade will increase from $20 to $60, and the present value of future business
will rise to 60δ + 60δ² + ⋯ = 60/(1 – δ) – 60 = $120, again enough to deter cheating.
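All three present-value comparisons in part (d) use the same geometric series; a compact check with δ = 2/3 (the function name is illustrative):

```python
# PV of a perpetual per-period profit pi starting next period:
# pi*delta + pi*delta**2 + ... = pi/(1 - delta) - pi.
delta = 2 / 3
def pv_future(pi):
    return pi / (1 - delta) - pi

assert round(pv_future(20), 6) == 40.0       # part (c): too small to deter
assert round(pv_future(3 * 20), 6) == 120.0  # more sales per Seller
assert round(pv_future(60), 6) == 120.0      # cheaper username creation
```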