
Moral Hazard

Shota Ichihashi∗

ECON 813

1 Principal-Agent Models
We begin the course with the study of principal-agent models: A principal designs a contract for
an agent to join. We ask what contracts are optimal for the principal. Here are some examples:

1. Employment contract: A firm hires a worker for a project, which will generate some output.
The firm commits to an output-contingent wage. After signing a contract, the worker exerts
effort and gets paid.

2. Insurance contract: A car insurance company offers an insurance contract, which specifies
payments conditional on a normal event and on an accident (i.e., premium and deductible).
After a person signs a contract, they decide how carefully to drive.

3. Selling products: A smartphone company designs a menu of price-quality pairs to supply.


The buyer chooses an item from the menu, pays the price, and gets the phone.

A key ingredient in most principal-agent models is that the agent has private information, e.g.,
only the agent knows the effort level, how carefully to drive, or the willingness to pay for high-quality versus low-quality phones. Facing the agent who has private information, the principal
designs a contract to incentivize the agent to act properly or report truthfully.

2 Moral Hazard and Adverse Selection


Principal-agent models typically consider two types of information problems, which differ in when
the agent acquires private information. Models of moral hazard consider a situation in which an
agent has relevant private information that arises after the contract is signed. Models of adverse
selection or screening consider a situation in which an agent has relevant private information at

Various sections of this note draw heavily on lecture slides written by Paul Milgrom, notes written by Alexander
Wolitzky, and “Microeconomic Theory” by Mas-Colell, Whinston, and Green. The opinions expressed in this article
are the author’s own and do not reflect the views of the Bank of Canada. These notes are not to be circulated without
my permission.

the time of contracting. Sometimes people classify moral hazard as “hidden action” and adverse
selection as “hidden information.” However, we can better understand moral hazard and adverse
selection in terms of the timings at which the agent has private information. For example, let
us go back to the example of the employment contract (example 1). To be concrete, suppose
that a firm hires a salesperson and offers a sales-contingent contract. If the salesperson privately
exerts effort and acquires information about customer demands after signing the contract, we may
analyze it with a model of moral hazard, even though the situation involves hidden information.
If the relevant information is that the salesperson knows their ability to sell the products before
signing the contract, we could study it as a model of adverse selection.
Having emphasized that moral hazard can be relevant not just to hidden action but to hidden
information, in this lecture note, we still study models of moral hazard with hidden action. We
present a model and begin with the analysis of the first-best, which is the optimal contract when
the agent’s action is observable. We then consider unobservable effort (our main focus) and provide
conditions under which the principal can attain the first-best outcome. We then consider variations
of the simple model. These variations do not satisfy the conditions for the first-best, so we analyze
the principal’s optimal (“second best”) contract. The appendix of this lecture note also covers the
basics of monotone comparative statics, which is a powerful tool to establish comparative statics
without concavity or differentiability of the objective function.

3 A Simple Model of Moral Hazard


The purpose of this section is to describe a model of moral hazard. We focus on hidden action
as in MWG. We examine when the principal can attain the first-best. The model is as follows.
There is a principal and an agent (e.g., a firm and a worker). First, the principal offers a contract,
w(·) : R → R. A contract w(·) maps a realized output x ∈ R to compensation w(x). The agent
decides whether to sign the contract. If the agent refuses, he earns an outside option of u ∈ R,
which is deterministic and commonly known. Otherwise, the agent privately chooses an effort level
e ∈ E, where E is the set of possible effort levels. (The set E can be finite and does not need to be
a subset of R.) Then the output x = g(e) + ε is realized, where g : E → R. Note that the output
depends on noise ε. For simplicity, assume E[ε] = 0. Finally, the output x is observed, and the
agent receives w(x). The ex post payoff of the agent is u(w(x), e), and the payoff of the principal
is x − w(x). The principal and the agent maximize expected payoffs.

First-Best

We derive the first-best solution of the principal. The term “first-best” typically refers to a solution
in which the agent has no private information. For moral hazard with hidden action, it refers to
a situation in which the agent’s effort is observable. If effort e is directly observable, the principal
can directly specify what e the agent should choose. (The principal can write in the contract that

if the agent doesn't choose e, they go to court.) However, the principal cannot force the agent
to sign the contract. Thus, the principal faces the “participation constraint,” which requires that
the contract offer the agent a payoff of at least u. Thus the first-best problem of the principal is

    max_{e, w(·)}  E[g(e) − w(g(e) + ε)],

subject to

    (PC)  E[u(w(g(e) + ε), e)] ≥ u.

Note that the expected value of the output is E[x] = E[g(e) + ε] = g(e), which is why we have g(e)
in the first term of the principal’s objective. Here, PC refers to the participation constraint.
What would the solution to this problem look like? Suppose that for each e ∈ E, u(w, e) is strictly
increasing and strictly concave in w, i.e., the agent is risk averse. For any (e, w(·)) that satisfies
the PC, the principal can get a weakly higher payoff by using (e, w*(·)) such that w*(x) = w_CE for
all x, where E[u(w(g(e) + ε), e)] = u(w_CE, e); i.e., the principal is weakly better off paying the
certainty equivalent of the random payment w(g(e) + ε) independently of the output. As a result,
the principal's optimal w(·) to enforce e is w*(x) = w_e* for all x, where u(w_e*, e) = u. Having
solved this equation for w_e*, the principal can choose e to maximize E[g(e) − w_e*]. To sum up,
if the agent is risk averse and the principal is risk neutral, then the first-best contract pays the
same amount regardless of the realized output.
An alternative way to show the same result (which MWG adopts) is the following. For notational
simplicity, assume that there are finitely many possible outputs g(e) + ε, that is, the noise term
ε takes finitely many values. Let π1, …, πn denote the possible outputs, and let p_j(e) denote the
probability of output π_j given effort e. The Lagrangian for the principal's problem is

    L = Σ_{j=1}^n p_j(e)(π_j − w_j) + λ [ Σ_{j=1}^n p_j(e)u(w_j, e) − u ].

The first-order condition with respect to w_j is

    ∂L/∂w_j = −p_j(e) + λ p_j(e) ∂u(w_j, e)/∂w_j = 0  ⇒  ∂u(w_j, e)/∂w_j = 1/λ.

Since u is strictly concave in w, ∂u/∂w is strictly decreasing in w, so this condition pins down a
single wage: w1 = · · · = wn, i.e., the payment does not vary across possible outputs.
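As a numerical sanity check on the first-best logic, here is a small sketch. All primitives are illustrative assumptions, not from the note: u(w, e) = √w − e, outside option u = 0.5, a finite effort set, and g(e) = 3√e. For each e, the cheapest contract satisfying PC is the flat wage w_e* solving √(w_e*) − e = u; the principal then picks e to maximize g(e) − w_e*.

```python
import math

# Illustrative primitives (assumptions, not from the note):
# u(w, e) = sqrt(w) - e (risk-averse agent), outside option u_bar.
u_bar = 0.5
E = [0.0, 0.5, 1.0, 1.5, 2.0]            # finite effort set
g = lambda e: 3.0 * math.sqrt(e)          # expected output given effort e

# Cheapest contract enforcing e: the flat wage solving sqrt(w) - e = u_bar.
def flat_wage(e):
    return (u_bar + e) ** 2

# First-best effort: maximize g(e) minus the flat wage enforcing e.
profit = {e: g(e) - flat_wage(e) for e in E}
e_fb = max(profit, key=profit.get)
print(e_fb, round(profit[e_fb], 3))
```

Because the flat wage exposes the agent to no risk, PC pins down the wage for each e and the problem reduces to a one-dimensional choice over effort.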

Risk-Neutral Case

We now assume that the principal cannot observe the agent’s effort e. In general, the principal’s
problem under moral hazard is as follows.

    max_{e, w(·)}  E[g(e) − w(g(e) + ε)],

subject to

    (IC)  e ∈ arg max_{e′ ∈ E}  E[u(w(g(e′) + ε), e′)]

    (PC)  E[u(w(g(e) + ε), e)] ≥ u.

Here, IC refers to the incentive constraint: The principal must choose w(·) so that the agent’s
optimal effort level (i.e., the right-hand side of IC) is what the principal wants to induce (the left-
hand side of IC). Assume also that the agent is risk-neutral and effort involves a linear cost, i.e.,
u(w, e) = w − e. Then the problem becomes

    max_{e, w(·)}  E[g(e) − w(g(e) + ε)],

subject to

    (IC)  e ∈ arg max_{e′ ∈ E}  E[w(g(e′) + ε) − e′]

    (PC)  E[w(g(e) + ε) − e] ≥ u.

When the agent is risk-neutral, the principal can attain the first-best by “selling the firm.” Consider
the following contract (recall the notation x = g(e) + ε). Let e∗ ∈ arg maxe∈E (g(e) − e) and
w(x) = x − p, where p = g(e∗ ) − e∗ − u. Directly substituting these expressions into the objective
and constraints, we can confirm that this contract (i) maximizes the sum g(e) − e of the expected
payoffs of the principal and the agent, (ii) makes PC binding, and (iii) satisfies IC. (For example, the
left-hand side of PC is E [w(x) − e∗ ] = E [x − p − e∗ ] = E [x − (g(e∗ ) − e∗ − u) − e∗ ] = u.) Parts
(i) and (iii) imply that the agent’s effort level maximizes the joint surplus across all effort levels,
and Part (ii) implies that the agent receives the minimum utility u. Therefore even if the principal
can directly choose any e (subject to PC), she cannot earn a strictly higher profit than the above
contract. In other words, the above contract gives the principal the same profit as the first-best
outcome. This contract is as if the principal sells the firm to the agent at a price of p.
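A quick numerical check of the "selling the firm" logic, under assumed primitives (a finite effort set and g(e) = 2e − 0.25e²); since E[ε] = 0 and the agent is risk neutral, expected payoffs depend on effort only through g(e):

```python
# Sketch of the "selling the firm" contract with an assumed finite effort
# set and u(w, e) = w - e. Expected output given effort is g(e).
u_bar = 0.2
E = [0.0, 1.0, 2.0, 3.0]
g = lambda e: 2.0 * e - 0.25 * e * e

# First-best effort and the "price of the firm".
e_star = max(E, key=lambda e: g(e) - e)
p = g(e_star) - e_star - u_bar

# Under w(x) = x - p, the agent's expected payoff from effort e is g(e) - p - e.
agent_payoff = lambda e: g(e) - p - e
e_agent = max(E, key=agent_payoff)

assert e_agent == e_star                             # IC: agent picks first-best e
assert abs(agent_payoff(e_agent) - u_bar) < 1e-9     # PC binds
print(e_star, p)
```

The two assertions verify parts (ii) and (iii) in the text: the agent's best response is the joint-surplus-maximizing effort, and the agent is held exactly to the outside option.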
So far, we have considered settings in which the principal can attain the first-best and the
optimal contract attains a point on the Pareto-frontier. From the next section on, we study various
settings in which the principal cannot attain the first-best.

Remark 1. The above problem also provides a solution to the efficient contracting problem in general:
Any efficient contract maximizes the principal’s payoff subject to a fixed minimum level u of the
agent’s payoff, but this problem is exactly what we have solved above. Thus under the first-best

[Figure 1: Efficient contracts. The set of all outcomes (varying e and w(·)) and its Pareto frontier; every efficient contract maximizes the principal's utility subject to a fixed minimum level of the agent's utility.]

problem (or in an environment in which the first-best is attainable), by varying u, we can find
contracts that correspond to the points on the Pareto frontier.

4 Limited Liability
Recall that under the "selling the firm" solution above, the agent may have to pay a large
penalty to the principal when x takes a negative value. The principal cannot use such a contract
if the agent is protected by limited liability. What would the optimal contract look like under the
agent's limited liability?
We study the following setting: Given an effort level e ≥ 0, there are only two possible outcomes:
With probability g(e) ∈ (0, 1), the output is “success,” which has value π to the principal. With
probability 1 − g(e), the output is “failure,” which has value 0 to the principal. Note that we now
use g(e) as a probability of success, not the expected amount of the output as in Section 3. Assume
g is smooth, increasing, and strictly concave, and g(0) = 0—i.e., no effort, no success. We can write
any compensation function w(·) as (w1 , w0 ), where w1 is the payment given success, and w0 is the
payment given failure. Limited liability means that w0 and w1 (and, generally, compensation
given any possible output) must be bounded from below. We consider the case in which this bound
is zero, so w1 , w0 ≥ 0 must hold.
The agent’s behavior is simple: Given (w1 , w0 ), the agent chooses e to maximize g(e)w1 + (1 −
g(e))w0 − e. The agent’s optimum is characterized by the first-order condition:

    g′(e)(w1 − w0) = 1  ⇒  w1 = w0 + 1/g′(e).

The optimal contract must solve the following:

    max_{e, w1, w0}  g(e)(π − w1) + (1 − g(e))w0,

subject to

    (IC)  w1 = w0 + 1/g′(e)

    (PC)  g(e)w1 + (1 − g(e))w0 − e ≥ u

    (LL)  w1, w0 ≥ 0.

We consider two cases. Suppose that at the solution of the original problem without LL, we already
have w1 , w0 ≥ 0. Because the agent is risk-neutral, the optimal contract (e.g., the “selling the firm”
solution) in the original problem leads to the first-best effort level with the binding PC. Thus we
have e* ∈ arg max_e πg(e) − e, and

    w1 = w0 + 1/g′(e*)   and   w0 = (u + e* − g(e*)w1) / (1 − g(e*)) = u + e* − g(e*)/g′(e*).

Here, we obtain the last equality by plugging w1 = w0 + 1/g′(e*) into w0 = (u + e* − g(e*)w1)/(1 − g(e*))
and solving the resulting equation with respect to w0.
In contrast, if u is low (say, u = 0), we may find that LL binds for w0 (note that whenever LL
binds, it binds for w0, since IC implies w1 > w0). Then we obtain

    w0 = 0  and  w1 = 1/g′(e).

The agent's payoff ("rent") is Rent(e) = g(e)/g′(e) − e > 0, because he gets paid w1 = 1/g′(e)
with probability g(e) and incurs cost e.¹ As we can see from Figure 2, the agent's rent is increasing in e.

1
“Rent” is used in various contexts. In the principal-agent problem, rent typically refers to the premium of the
agent’s utility above the minimum utility required for participation. If we set u = 0, the rent equals the agent’s
utility itself.

[Figure 2: Graphical way to show that Rent(e) is increasing in e. The tangent line through (e, g(e)) has slope g′(e), which also equals g(e)/(seg + e), where seg is the length of the thick red segment. Since g′(e) = g(e)/(seg + e), Rent(e) = g(e)/g′(e) − e = g(e) · (seg + e)/g(e) − e = seg. By considering a shallower tangent line through (e′, g(e′)) with e′ > e, we can see that Rent(e) is increasing in e.]

The principal chooses e by solving


 
    max_e  πg(e) − e − ( g(e)/g′(e) − e ).

The following result compares the first-best and the second-best effort levels. The proof uses the
basic results of monotone comparative statics, which we will cover later in the lecture and which
you can find in the appendix of this note.

Claim 1. Whenever LL binds, the second-best effort level is weakly lower than the first-best effort
level. (To be precise, the set of the second-best effort levels is weakly smaller than the set of first-best
effort levels in the strong set order.)
Proof. The agent's rent Rent(e) = g(e)/g′(e) − e is increasing in e. Consider the problem
max_e πg(e) − e − (1 − θ)Rent(e) for θ ∈ [0, 1]. The objective has increasing differences in (e, θ):
its derivative in θ is Rent(e), which is increasing in e. Thus a higher θ leads to a higher optimal e.
If θ = 0, the problem is the second-best problem. If θ = 1, the problem is the first-best problem.
By Topkis's theorem, the second-best effort level is weakly lower than the first-best effort level.

Intuitively, the principal can induce higher effort by decreasing w0 or increasing w1. LL may
prevent the principal from lowering w0, so the only way to induce higher effort is to increase w1.
LL thus makes it more costly for the principal to induce high effort, so the second-best effort level
can be lower than the first-best level.
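The claim can be illustrated by grid search under assumed primitives: g(e) = e/(1 + e) (so g(0) = 0, increasing, strictly concave), success value π = 16, and u = 0, using the binding-LL expressions w0 = 0 and w1 = 1/g′(e):

```python
# Grid-search sketch of first-best vs second-best effort under limited
# liability. Functional forms are illustrative assumptions: g(e) = e/(1+e),
# success value pi = 16, outside option u_bar = 0.
pi = 16.0
grid = [i / 1000.0 for i in range(1, 10_000)]   # effort e in (0, 10)

g = lambda e: e / (1.0 + e)
g_prime = lambda e: 1.0 / (1.0 + e) ** 2
rent = lambda e: g(e) / g_prime(e) - e          # agent's rent when LL binds

# First-best: maximize joint surplus pi*g(e) - e.
e_fb = max(grid, key=lambda e: pi * g(e) - e)
# Second-best: principal also gives up the rent.
e_sb = max(grid, key=lambda e: pi * g(e) - e - rent(e))

assert e_sb <= e_fb      # Claim 1: LL weakly lowers the induced effort
print(e_fb, e_sb)
```

With this g, the rent simplifies to e², so the rent term penalizes high effort and pulls the second-best effort strictly below the first-best level.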

Remark 2. The analysis suggests a three-step approach to solving the optimal contracting problem:
First, for any given e, we find which compensation schedules w_e(·) can elicit e and satisfy the
participation constraint. Second, if there are multiple contracts that induce e in the first step, we
find the one that maximizes the principal's payoff. For example, if the principal is risk-neutral,
this step would select the w_e(·) that minimizes the expected payment. Finally, we plug w_e(·) into
the principal's objective and solve for the optimal e.

5 Risk Averse Agent


“Selling the firm” also fails to be optimal when the agent is risk averse. This section
formulates and solves the optimal contract for a risk-averse agent. We also introduce a general
formulation with multiple outcomes.
Suppose now that there are n ≥ 2 possible outcomes, 1, . . . , n. The agent has K different
effort levels, i.e., E = {e1 , . . . , eK }. The agent’s effort e ∈ E determines the probability vector
p(e) = (p1(e), …, pn(e)), where p_i(e) ∈ [0, 1] is the probability of outcome i. The value of output
i to the principal is π_i. Thus given e, the expected value of the output to the principal is
π(e) := Σ_{i=1}^n p_i(e)π_i. The contract is written as (e, w1, …, wn), where w_i is the payment
conditional on outcome i. The agent's payoff is now u(w, e) = u(w) − c(e), where u is smooth and
strictly concave. We saw before that when e is observable, the optimal contract pays a fixed amount
independently of the realized output. However, if the principal used the same contract when the
effort is unobservable, the agent would choose the cost-minimizing effort. Thus, to induce an effort
level other than arg min_{e∈E} c(e), the principal needs a contract that differs from a constant wage.
If the effort is not observed, the principal’s problem is
    max_{e, w1, …, wn}  π(e) − Σ_{i=1}^n p_i(e)w_i

subject to

    (IC)  −c(e) + Σ_{i=1}^n p_i(e)u(w_i) ≥ −c(e_k) + Σ_{i=1}^n p_i(e_k)u(w_i),  k = 1, …, K

    (PC)  −c(e) + Σ_{i=1}^n p_i(e)u(w_i) ≥ u = 0.

The first step in solving the problem is to derive the cost-minimizing way of inducing a given
effort level e. The principal's cost-minimization problem is

    min_{w1, …, wn}  Σ_{i=1}^n p_i(e)w_i

subject to

    (IC)  −c(e) + Σ_{i=1}^n p_i(e)u(w_i) ≥ −c(e_k) + Σ_{i=1}^n p_i(e_k)u(w_i),  k = 1, …, K

    (PC)  −c(e) + Σ_{i=1}^n p_i(e)u(w_i) ≥ 0.

The Lagrangian is

    L = −Σ_{i=1}^n p_i(e)w_i + Σ_{k=1}^K λ_k [ c(e_k) − c(e) + Σ_{i=1}^n (p_i(e) − p_i(e_k))u(w_i) ] + µ [ −c(e) + Σ_{i=1}^n p_i(e)u(w_i) ].

The FOC with respect to wi is

    ∂L/∂w_i = −p_i(e) + [ Σ_{k=1}^K λ_k (p_i(e) − p_i(e_k)) ] u′(w_i) + µ p_i(e) u′(w_i) = 0,    (1)

which implies

    1/u′(w_i) = µ + Σ_{k=1}^K λ_k − ( Σ_{k=1}^K λ_k p_i(e_k) ) / p_i(e).    (2)

Note that wi typically varies with i, i.e., compensation depends on the realized output. The fact
that wi may vary with output i highlights the principal’s trade-off between risk and incentive. On
the one hand, the principal has to make payment dependent on output in order to induce high
effort (if payment is independent of output, then the agent would only choose the cost-minimizing
effort level). On the other hand, since the agent is risk averse, the principal has to pay more
in expectation when payment depends on output. Under moral hazard with a risk-averse agent,
the second-best contract balances this trade-off. Now, how does the principal exactly resolve this
trade-off? Let’s take a look at (2).

A Statistical Interpretation of Optimal Contracts


Compensation w_i depends on output i only through the statistic ( Σ_{k=1}^K λ_k p_i(e_k) ) / p_i(e). The denominator is the probability
of output i when the agent chooses e as instructed. The numerator is proportional to the probability of the same
ity of output i when the agent chooses e as instructed. The numerator is the probability of the same
output when the agent takes a mixed action that chooses effort ek with probability proportional to
λk . If this “test statistic” is low at output i, the agent is more likely to be taking e compared to
the alternative hypothesis. Correspondingly, the agent gets a higher wi . If the statistic is high at
output i, the principal sets wi low. Where does the alternative hypothesis come from? Intuitively,

under the alternative hypothesis, the agent takes ek with a higher probability when λk is high,
i.e., the agent has a stronger incentive to take ek at the optimal contract. (Recall that λk is the
Lagrange multiplier, or the shadow price, of the constraint that the agent prefers e to ek .) Finally,
note that the above statistical interpretation does not mean that the principal is doing hypothesis
testing: The principal knows what e the agent is choosing, because (e, w1 , . . . , wn ) solves IC. The
optimal (w1 , . . . , wn ) is still a result of the trade-off between risk-sharing and motivating an effort.
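The statistic can be made concrete in the smallest case: two outcomes and one alternative effort (K = 2). All numbers are illustrative assumptions: u(w) = ln w, c(e_H) = 1, c(e_L) = 0, u = 0, and success probabilities 0.8 under e_H and 0.5 under e_L. With a single alternative effort, both IC and PC bind, which pins down the promised utilities:

```python
import math

# Two-outcome, two-effort sketch of the cost-minimizing contract inducing
# high effort. Assumed primitives: u(w) = ln(w), c(e_H) = 1, c(e_L) = 0,
# u_bar = 0, P(success | e_H) = 0.8, P(success | e_L) = 0.5.
pH, pL = 0.8, 0.5
cH, cL = 1.0, 0.0

# With one alternative effort, IC and PC both bind. In promised utilities
# v1 (success) and v0 (failure):
#   IC:  (pH - pL) * (v1 - v0) = cH - cL
#   PC:  v0 + pH * (v1 - v0)   = cH
spread = (cH - cL) / (pH - pL)
v0 = cH - pH * spread
v1 = v0 + spread
w1, w0 = math.exp(v1), math.exp(v0)      # invert u(w) = ln(w)

# The outcome with the lower likelihood ratio p_i(e_L)/p_i(e_H) is stronger
# evidence of high effort and receives the higher wage.
lr_success = pL / pH                     # 0.625
lr_failure = (1 - pL) / (1 - pH)         # 2.5
assert lr_success < lr_failure and w1 > w0

# The agent weakly prefers e_H and earns exactly u_bar = 0.
EU = lambda p, c: p * v1 + (1 - p) * v0 - c
assert abs(EU(pH, cH)) < 1e-9 and EU(pH, cH) >= EU(pL, cL) - 1e-9
print(round(w1, 3), round(w0, 3))
```

The wage spread across outcomes tracks the likelihood ratio exactly as in the discussion above: outputs that are relatively more likely under the alternative effort are punished.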

6 Multi-Tasking Incentives
We conclude the main part of this lecture with multi-task incentives. So far, we have assumed that
the agent faces a single task—e.g., a firm asks a worker to increase sales, and the worker chooses
how much effort to exert. In practice, the agent may be asked to allocate efforts across multiple
tasks:

1. A teacher may be asked to increase students’ test scores and communication skills.

2. A production worker may be responsible both for production and for taking care of machines.

3. A CEO may be asked to increase short-term profits and take care of long-term investment
strategy.

4. A worker of a cat cafe may be responsible for the number of visitors they serve per day and
the cats’ health.

How would incentives for different tasks interact? The key idea from Holmstrom and Milgrom
(1991) is that incentivizing an effort for a task that is easy to measure could backfire, because it
distorts the effort level of another task that is hard to measure. For example, paying a teacher
according to the test scores of students may be a bad idea, because the teacher would then lower
the effort to teach students communication skills. This intuition rests on several ideas: First, the
agent's efforts devoted to multiple tasks are substitutes, in that increasing effort on one task
makes it harder to do so on other tasks. Second, the principal cares about outputs from multiple
tasks, not just one. Finally, some types of performance are harder to measure than others.
We consider a simple setup that captures this idea. The agent has two types of effort to choose,
(e1 , e2 ) ∈ R2 . The agent is risk neutral and receives a payoff of w − c(e1 , e2 ), where c is strictly
convex. The principal’s (expected) value of (e1 , e2 ) is π1 e1 + π2 e2 , where π1 , π2 > 0. For simplicity,
suppose that the principal can observe e1 but not e2 . The principal can then offer a fixed wage of w
and ask the agent to choose e1 . However, the principal can neither choose e2 nor base compensation

on a noisy output related to e2. Thus the principal's problem is

    max_{w, e1, e2}  π1 e1 + π2 e2 − w

subject to

    (IC)  e2 ∈ arg min_e c(e1, e)

    (PC)  w − c(e1, e2) ≥ 0.

This problem is equivalent to

    max_{e1}  π1 e1 + π2 e2*(e1) − c(e1, e2*(e1)),    (3)

where e2*(e1) is the agent's cost-minimizing e2 given e1. The first-best problem is

    max_{e1, e2}  π1 e1 + π2 e2 − c(e1, e2)

    ⇔  max_{e1} [ max_{e2}  π1 e1 + π2 e2 − c(e1, e2) ]

    ⇔  max_{e1}  π1 e1 + π2 e2^{FB}(e1) − c(e1, e2^{FB}(e1)).    (4)

The first-best problem can differ from the principal's problem (3), because e2*(·) is different from
e2^{FB}(·). Let us compare the derivatives with respect to e1: The derivative of (3) with respect to e1 is

    π1 − c1 + (π2 − c2) · de2*/de1(e1),

where c_i = ∂c/∂e_i. The derivative of (4) with respect to e1 is, by the envelope theorem
(π2 = c2 at e2 = e2^{FB}(e1)),

    π1 − c1.    (5)

Thus, the gap between the second-best and the first-best first-order conditions is
(π2 − c2) · de2*/de1(e1). This term could be positive or negative, depending on primitives. Suppose
(π2 − c2) · de2*/de1(e1) is negative and large in magnitude, which could arise if the principal highly
values e2 (π2 > c2) but promoting a high e1 reduces e2 (de2*/de1(e1) < 0). The second-best e1 can
then be lower than the first-best. In other words, the principal chooses not to enforce as much
effort as in the first-best, because providing incentives to increase e1 can distract the agent from
the other task e2 that the principal values.
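A grid-search sketch with an assumed quadratic cost c(e1, e2) = ½e1² + ½e2² + γe1e2 (γ > 0 makes the two efforts substitutes), π1 = π2 = 1, and efforts restricted to a nonnegative grid (a simplification of the note's (e1, e2) ∈ R²). In this parameterization, the unmeasured task collapses to zero in the second best while the contractible task is pushed above its first-best level; with other primitives the gap term can push e1 the other way, as discussed above.

```python
# Multitask distortion sketch. Assumed primitives: quadratic cost with
# interaction gamma > 0 (efforts are substitutes), pi1 = pi2 = 1.
pi1, pi2, gamma = 1.0, 1.0, 0.5
grid = [i / 100.0 for i in range(0, 201)]     # efforts in [0, 2]
c = lambda e1, e2: 0.5 * e1**2 + 0.5 * e2**2 + gamma * e1 * e2

# Second best: only e1 is contractible; the agent sets e2 to minimize cost.
def e2_star(e1):
    return min(grid, key=lambda e2: c(e1, e2))

sb = max(grid, key=lambda e1: pi1 * e1 + pi2 * e2_star(e1) - c(e1, e2_star(e1)))

# First best: the principal could dictate both efforts.
fb = max(((e1, e2) for e1 in grid for e2 in grid),
         key=lambda pair: pi1 * pair[0] + pi2 * pair[1] - c(pair[0], pair[1]))

# The unmeasured task collapses to zero in the second best.
assert e2_star(sb) == 0.0 and fb[1] > 0.0
print(sb, e2_star(sb), fb)
```

Since the agent is never compensated for e2, cost minimization drives e2 to the boundary, and the principal's choice of e1 no longer internalizes its effect on e2.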

7 Appendix: Monotone Comparative Statics


This appendix introduces concepts and results on monotone comparative statics, focusing on the
case in which all relevant variables are real numbers. Take X ⊂ R, Θ ⊂ R, and f : R × R → R. We

consider a maximization problem
max f (x, θ).
x∈X

We assume that the problem has a solution. We want to know when the optimal x is non-decreasing
in θ. Notice that this question is not well-defined if the problem has multiple solutions, which we
will take care of. We begin with a few definitions.

Definition 1. A function f : R × R → R has increasing differences in (x, θ) if, whenever
xH ≥ xL and θH ≥ θL, we have

    f(xH, θH) − f(xL, θH) ≥ f(xH, θL) − f(xL, θL).

Intuitively, increasing differences mean that the return to choosing a higher value of x, i.e.,
f(xH, θ) − f(xL, θ), is non-decreasing in θ. The above definition does not require differentiability
of f, but if f is differentiable, we may be able to simplify the task of checking whether f has
increasing differences:

Theorem 1. If f is twice continuously differentiable, then f has increasing differences if and only if

    ∂²f(x, θ)/∂x∂θ ≥ 0,  ∀x ∈ X, ∀θ ∈ Θ.
Depending on the differentiability of f, you may find some version of the above result useful.
For example, we can establish increasing differences by showing that ∂f(x, θ)/∂x is increasing in θ.
This result would be useful if f is differentiable in x but not necessarily twice continuously
differentiable. Similarly, if ∂f(x, θ)/∂θ is increasing in x, f has increasing differences.
The next definition formalizes the idea that one set is “greater” than the other set.

Definition 2. A set A ⊂ R is greater than a set B ⊂ R in the strong set order if, for any a ∈ A
and b ∈ B,

max {a, b} ∈ A and min {a, b} ∈ B.

Theorem 2. For each θ ∈ Θ, define X ∗ (θ) := arg maxx∈X f (x, θ). If f has increasing differences
in (x, θ), then X ∗ (θ) is increasing in the strong set order, i.e., for any θH and θL , X ∗ (θH ) is greater
than X ∗ (θL ) in the strong set order.

We do not cover the proof in this class. Milgrom and Shannon (1994) offer a much more general
treatment, where X and Θ can be partially ordered sets.
We may encounter a situation where ∂²f(x, θ)/∂x∂θ ≤ 0, i.e., f has decreasing differences in (x, θ).
Function f having decreasing differences in (x, θ) is equivalent to f having increasing differences
in (x, −θ). Thus, we can apply the above result to show that X*(θ) is increasing in the strong set
order in −θ, which roughly means that X*(θ) is “decreasing” in θ.
If X ∗ (θ) is a singleton (i.e., the maximization problem has a unique solution for each θ), the
maximizer of f is increasing in θ in the usual sense.
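Theorem 2 can be illustrated on a finite grid. The example function f(x, θ) = θx − x² is an assumption for illustration; its cross partial ∂²f/∂x∂θ = 1 > 0, so it has increasing differences:

```python
# Finite-grid illustration of Theorem 2: f(x, theta) = theta*x - x**2 has
# increasing differences, so the argmax set rises with theta in the
# strong set order.
X = [i / 100.0 for i in range(0, 301)]        # x in [0, 3]
f = lambda x, theta: theta * x - x ** 2

def argmax_set(theta, tol=1e-12):
    values = [f(x, theta) for x in X]
    m = max(values)
    return {x for x, v in zip(X, values) if v >= m - tol}

A = argmax_set(2.0)   # theta_H = 2
B = argmax_set(1.0)   # theta_L = 1

# Strong set order: for a in A and b in B, max{a,b} in A and min{a,b} in B.
assert all(max(a, b) in A and min(a, b) in B for a in A for b in B)
print(sorted(A), sorted(B))
```

Here each argmax set is a singleton (x = θ/2 lies on the grid), so the strong set order reduces to the maximizer being increasing in θ in the usual sense.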

We now show that the previous theorems generalize to X ⊂ Rn and Θ ⊂ Rm . For generalization,
we have two issues. First, if x and y are vectors, how do we define max(x, y) or min(x, y) in the
definition of strong set order? Second, as we will see, it is not enough to have complementarity
between x and θ when they are vectors; we need complementarity within components of x.
The following is a generalization of max and min. Define

    x ∧ y = (min{x1, y1}, …, min{xn, yn})
    x ∨ y = (max{x1, y1}, …, max{xn, yn}).

x ∧ y is called the meet, and x ∨ y is called the join of x and y.

Definition 3. A set A ⊂ Rn is greater than a set B ⊂ Rn in the strong set order if, for any
a ∈ A and b ∈ B,

a ∨ b ∈ A, and
a ∧ b ∈ B.

A lattice is a set X ⊂ Rn such that x ∧ y ∈ X and x ∨ y ∈ X for all x, y ∈ X.

For example, for non-empty sets X1 , . . . , Xn ⊂ R, a product set X = X1 × · · · × Xn is a lattice.


The definition of increasing differences in (x, θ) is the same as before: f has increasing differences
in (x, θ) if xH ≥ xL and θH ≥ θL implies

    f(xH, θH) − f(xL, θH) ≥ f(xH, θL) − f(xL, θL).

Here, xH ≥ xL means that xH_j ≥ xL_j for j = 1, …, n.
Increasing differences in (x, θ) no longer guarantee that X*(θ) is increasing in the strong set
order. Intuitively, even if a higher θ1 leads to a higher x1, without complementarity within x, an
increase in x1 might push x2 down, which may further affect other components of x. Thus we need
complementarity within the components of x, not just between x and θ. Supermodularity takes care of this:

Definition 4. A function f : X × Θ → R is supermodular in x ∈ X if, for all x, y ∈ X and
θ ∈ Θ, we have

    f(x ∨ y, θ) − f(x, θ) ≥ f(y, θ) − f(x ∧ y, θ).    (6)

Theorem 3. If the function f : Rn × Rm → R is twice continuously differentiable, then f has
increasing differences iff

    ∂²f(x, θ)/∂x_i∂θ_j ≥ 0,  ∀x ∈ X, ∀θ ∈ Θ, i ∈ {1, …, n}, j ∈ {1, …, m},

and f is supermodular in x iff

    ∂²f(x, θ)/∂x_i∂x_j ≥ 0,  ∀x ∈ X, ∀θ ∈ Θ, i ≠ j ∈ {1, …, n}.

We can now extend the previous theorem.

Theorem 4. If X ⊂ Rn is a lattice, Θ ⊂ Rm, and f : X × Θ → R has increasing differences in
(x, θ) and is supermodular in x, then X*(θ) := arg max_{x∈X} f(x, θ) is increasing in the strong set order.

References
Holmstrom, Bengt and Paul Milgrom (1991), “Multitask principal-agent analyses: Incentive contracts,
asset ownership, and job design.” Journal of Law, Economics, & Organization, 7, 24–52.

Milgrom, Paul and Chris Shannon (1994), “Monotone comparative statics.” Econometrica, 62, 157–180.
