Whelan Lecture Notes
Karl Whelan
Foreword
This is a collection of lecture notes that I have used over a number of years teaching Advanced Macroeconomics. Some of the material has also been used to teach first-term Masters students. The material is not intended to be a comprehensive treatment of modern macroeconomics; rather, it reflects my own interests and preferences and is being made available in this format because some people may find it useful.
Over the years, I have made most of these notes available on my own website but they tend to go up and down on the website each year depending on when I am teaching a course. The material also changes from year to year, so some topics that had been previously covered disappear from my site. Because I know that some people are interested in the notes and use them to assist with their own teaching or learning, I have decided to make the full set available in one place.
A few observations on the notes. The first two parts of the book deal with issues related to short-run macroeconomics, starting with a version of the IS-LM model featuring monetary policy rules. The approach taken is largely borrowed from Carl Walsh's 2002 paper "Teaching Inflation Targeting: An Analysis for Intermediate Macro" in the Journal of Economic Education, but the analysis here applies the model to a much wider range of issues. There are also similarities with David Romer's manuscript, Short-Run Fluctuations[1], particularly in the treatment of the zero bound on interest rates, but there are also many differences.
My intention with this material is to teach what are probably the key insights of New Keynesian economics: that fiscal and monetary policy can be effective in influencing output in the short run but not the long run, that the process by which the public formulates inflation expectations is crucial, that there are advantages to having tough central bankers, that monetary policy rules can be stabilising if they satisfy the Taylor principle, and that some of these key insights are reversed if the economy is in a liquidity trap, all without relying on the particular restrictive assumptions of formal New Keynesian models.
Part 2, in contrast, explores rational expectations in some detail. The unifying theme of
this section is the importance of understanding first-order stochastic difference equations: how
to solve them for forward-looking solutions and how to relate these solutions to observable
data. These methods are used to explore a range of standard macroeconomic topics with
some chapters (such as those on consumption and asset prices) more focused on reviewing the
empirical evidence than others. In terms of pedagogy, my approach is to explain the important
role that rational expectations plays in modern macroeconomic modelling but to encourage
students to realise that empirical testing tends to uncover weaknesses in this approach.
The final two parts of the book cover long-run growth theory. The first few topics (growth accounting, Solow and Romer models) are fairly standard while the remaining topics (institutions, technology diffusion, Malthusian dynamics, and growth and resources) partly reflect my own interests.
Beamer slides are available for almost all the chapters in this book but I do not have time to adapt my current set of slides to precisely cover all the topics in the book. Anybody who may wish to get teaching slides to accompany their teaching of some of the chapters in this book is welcome to contact me.

[1] Romer's manuscript is available at http://eml.berkeley.edu/~dromer/
Part I
Chapter 1
In the first part of this course, we are going to revisit some of the ideas from those models and update them in a number of ways:

- Rather than the traditional LM curve, we will describe monetary policy in a way that is more consistent with how it is now implemented, i.e. we will assume the central bank follows a rule that dictates how it sets nominal interest rates. We will focus on how the properties of the monetary policy rule influence the behaviour of the economy.
- We will have a more careful treatment of the roles played by real interest rates.
- We will model the zero lower bound on interest rates and discuss its implications for policy.
The model has three elements:

- A Phillips Curve describing how inflation depends on inflation expectations and the state of the economy.
- An IS Curve describing how output depends upon interest rates.
- A Monetary Policy Rule describing how the central bank sets interest rates depending on inflation.

Putting these three elements together, I will call it the IS-MP-PC model (i.e. the Income-Spending/Monetary Policy/Phillips Curve model). I will describe the model with equations. I will also merge together the second two elements (the IS curve and the monetary policy rule) to give a new IS-MP curve that can be combined with the Phillips curve so that the model can be analysed using graphs.
Perhaps the most common theme in economics is that you can't have everything you want. Life
is full of trade-offs, so that if you get more of one thing, you have to have less of another thing.
In macroeconomics, there are important trade-offs facing governments when they implement
policy. One of these relates to a trade-off between desired outcomes for inflation and output.
What form does this relationship take? Back when macroeconomics was a relatively young discipline, in 1958, a study by the LSE's A. W. Phillips seemed to provide the answer. Phillips documented a strong negative relationship between wage inflation and unemployment: Low unemployment was associated with high inflation, presumably because tight labour markets stimulated wage inflation. Figure 1.1 shows one of the graphs from Phillips's paper illustrating this relationship.
In 1960, a paper by MIT economists Robert Solow and Paul Samuelson (both of whom would go on to win the Nobel prize in economics for other work) replicated these findings for the US and emphasised that the relationship also worked for price inflation. Figure 1.2 reproduces the graph in their paper describing the relationship they found. The Phillips curve quickly became the basis for the discussion of macroeconomic policy decisions. Economists advised that governments faced a tradeoff: Lower unemployment could be achieved, but only at the cost of higher inflation.
However, Milton Friedman's 1968 presidential address to the American Economic Association produced a well-timed and influential critique of the thinking underlying the Phillips
curve. Friedman pointed out that it was expected real wages that affected wage bargaining.
If low unemployment means workers have a strong bargaining position, then high nominal
wage inflation on its own is not good enough: They want nominal wage inflation greater than
price inflation.
Friedman argued that if policy-makers tried to exploit an apparent Phillips curve tradeoff,
then the public would get used to high inflation and come to expect it. Inflation expectations
would move up and the previously-existing tradeoff between inflation and output would disap-
pear. In particular, he put forward the idea that there was a natural rate of unemployment
and that attempts to keep unemployment below this level could not work in the long run.
Evidence on the Phillips Curve
Monetary and fiscal policies in the 1960s were very expansionary around the world, partly because governments following Phillips curve recipes chose booming economies with low unemployment at the cost of somewhat higher inflation.
At first, the Phillips curve seemed to work: Inflation rose and unemployment fell. However, as the public got used to high inflation, the Phillips tradeoff got worse. By the late 1960s inflation was rising even though unemployment had moved up. Figure 1.3 shows the US time series for inflation and unemployment. This stagflation combination of high inflation and high unemployment got even worse in the 1970s. This was exactly what Friedman had predicted.
Today, the data no longer show any sign of a negative relationship between inflation and unemployment. In fact, if you look at the scatter plot of US inflation and unemployment data shown in Figure 1.4, the correlation is positive: The original formulation of the Phillips curve no longer fits the data at all.
Figure 1.1: One of A. W. Phillips's Graphs
Figure 1.2: Solow and Samuelson's Description of the Phillips Curve
Figure 1.3: The Evolution of US Inflation and Unemployment
[Time-series chart: US inflation and unemployment rates, 1955-2010]
Figure 1.4: The Failure of the Original Phillips Curve
[Scatter plot: US inflation (x-axis) against unemployment (y-axis)]
Equations: Variables, Parameters and All That
We will use both graphs and equations to describe the models in this class. Now I know many students don't like equations and believe they are best studiously avoided. However, that won't be a good strategy for doing well in this course, so I strongly encourage you to engage with the technical material in this class. It isn't as hard as it might look to start with.
Variables and Coefficients: The equations in this class will generally have a certain format.

y_t = α + β x_t    (1.1)
There are two types of objects in this equation. First, there are the variables, y_t and x_t. These will correspond to economic variables that we are interested in (inflation or GDP for example). We interpret y_t as meaning the value that the variable y takes during the time period t. For most models in this course, we will treat time as marching forward in discrete intervals, i.e. period 1 is followed by period 2, period t is followed by period t + 1 and so on.
Second, there are the parameters or coefficients. In this example, these are given by α and β. These are assumed to stay fixed over time. There are usually two types of coefficients: Intercept terms like α that describe the value that series like y_t will take when other variables all equal zero, and coefficients like β that describe the impact that one variable has on another. In this case, if β is a big number, then a change in the variable x_t has a big impact on y_t, while if β is small, the impact is small.
Some of you are probably asking what those squiggly shapes α and β are. They are Greek letters. While it's not strictly necessary to use these shapes to represent model parameters, it's pretty common in economics. So let me introduce them: α is alpha (Al-Fa), β is beta (Bay-ta), γ is gamma, δ is delta, θ is theta (Thay-ta) and naturally enough π is pi.
Dynamics: One of the things we will be interested in is how the variables we are looking at will change over time. For example, we will have equations along the lines of

y_t = δ y_{t-1} + β x_t    (1.2)
Reading this equation, it says that the value of y at time t will depend on the value of x at
time t and also on the value that y took in the previous period, i.e. t - 1. By this, we mean
that this equation holds in every period. In other words, in period 2, y depends on the value
that x takes in period 2 and also on the value that y took in period 1. Similarly, in period 3,
y depends on the value that x takes in period 3 and also on the value that y took in period
2. And so on.
Subscripts and Superscripts: When we write y_t, we mean the value that the variable y takes at time t. Note that the t here is a subscript: it goes at the bottom of the y. Some students don't realise this is a subscript and will just write yt, but this is incorrect (it reads as though y is being multiplied by t).
We will also sometimes put indicators above certain variables to indicate that they are special variables. For example, in the model we present now, you will see a variable written as π_t^e which will represent the public's expectation of inflation. In the model, π_t is inflation at time t and the e above the π in π_t^e is there to signify that this is not inflation itself but rather the public's expectation of it.
Model Element One: Our Version of the Phillips Curve
We will use both graphs and equations to describe the various elements of our model. Our first element is a Phillips curve relationship in which inflation depends on inflation expectations, the gap between output and its natural level and a temporary inflationary shock. Our equation for this is the following:

π_t = π_t^e + γ(y_t - y_t*) + ε_t^π    (1.3)
In this equation π represents inflation and by π_t we mean inflation at time t. The right-hand side of the equation has three elements:

1. Inflation Expectations, π_t^e: This represents the public's inflation expectations at time t. We have put a time subscript on this variable because the public's expectations may change over time. Note that the coefficient on expected inflation is one: this is because we are assuming that people bargain over real wages and higher expected inflation translates one-for-one into their wage bargaining, which in turn is passed into price inflation.
2. The Output Gap, (y_t - y_t*): This is the gap between GDP at time t, as represented by y_t, and what we will term the natural level of output, which we denote y_t*. This is the level of output at time t that would be consistent with unemployment equalling its natural rate. (Note we are describing inflation as being dependent on the output gap rather than the gap between unemployment and its natural rate because the latter would require adding an extra element to the model, i.e. the link between unemployment and output). We would expect this natural level of output to gradually increase over time as productivity levels improve. The coefficient γ (pronounced gamma) describes exactly how much inflation is generated by a 1 percent increase in the gap between output and its natural level.

3. Inflationary Shocks, ε_t^π: While inflation expectations and the output gap may be key drivers of inflation, they won't capture all the factors that influence inflation at any time. For example, supply shocks like a temporary increase in the price of imported oil can drive up inflation for a while. We call ε_t^π the inflationary shock (the π superscript will distinguish it from the output shock that we will also add to the model) and the t subscript indicates that these shocks change over time.
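The Phillips curve equation is simple enough to compute directly. Here is a minimal sketch in Python; the function name and parameter values are illustrative choices, not taken from the text:

```python
def phillips_curve(pi_e, y, y_star, gamma=0.5, eps_pi=0.0):
    """Phillips curve: pi_t = pi_e + gamma*(y - y_star) + eps_pi."""
    return pi_e + gamma * (y - y_star) + eps_pi

# With output at its natural level and no shock, inflation equals expectations.
print(phillips_curve(pi_e=2.0, y=100.0, y_star=100.0))  # -> 2.0

# A positive output gap of 2 with gamma = 0.5 adds one point of inflation.
print(phillips_curve(pi_e=2.0, y=102.0, y_star=100.0))  # -> 3.0
```

The first call illustrates the key property discussed below: when y_t = y_t* and ε_t^π = 0, inflation equals expected inflation.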
Figure 1.5 shows how to describe our Phillips curve equation in a graph. The graph shows a positive relationship between inflation and output. The key points to notice are the markings on the two axes indicating what happens when output is at its natural rate. This graph illustrates the case when there are no temporary inflationary shocks, so ε_t^π = 0. In this case,

π_t = π_t^e + γ(y_t - y_t*)    (1.4)

So when y_t = y_t* we get π_t = π_t^e. This is a key aspect of this graph. If you are asked to draw this graph in the final exam and you just draw an upward-sloping curve without describing the key points on the inflation and output axes, you won't score many points.
The curve can move up or down depending on what happens to the inflationary shocks, ε_t^π, and with inflation expectations. Figure 1.6 illustrates what happens when there is a positive inflationary shock so that ε_t^π goes from being zero to being positive. In this case, even when output equals its natural level (i.e. y_t = y_t*) we still get inflation being above its expected level. Figure 1.7 illustrates how the curve changes when expected inflation rises from π_1^e to π_2^e. The whole curve shifts upwards because of the increase in expected inflation.
Figure 1.5: The Phillips Curve with ε_t^π = 0
[Graph: upward-sloping PC curve; inflation (y-axis) against output (x-axis)]
Figure 1.6: The Phillips Curve as we move from ε_t^π = 0 to ε_t^π > 0
[Graph: PC(ε_t^π = 0) and higher PC(ε_t^π > 0) curves; inflation against output]
Figure 1.7: The Phillips Curve as we move from π_t^e = π_1^e to π_t^e = π_2^e
[Graph: PC(π_1^e) and higher PC(π_2^e) curves; inflation against output]
Model Element Two: The IS Curve
The second element of the model is one that should be familiar to you: An IS curve relating
output to interest rates. The higher interest rates are, the lower output is. However, I want
to stress something here that is often not emphasised in introductory treatments of the IS
curve. The IS relationship is between output and real interest rates, not nominal rates. Real
interest rates adjust the headline (nominal) interest rate by subtracting off inflation.
Think for a second about why it is that real interest rates are what matters. You know already that high interest rates discourage aggregate demand by reducing consumption and investment spending. But what is a high interest rate? Suppose I told you the interest rate was 5%. You might be inclined to say, "Yes, this is a high interest rate", but the answer is that it really depends on inflation. Consider the decision to save for tomorrow or spend today. The argument for saving is that it can allow you to consume more tomorrow. If the real interest rate is positive, then this means that you will be able to purchase more goods and services tomorrow with the money that you set aside. For instance, if the interest rate is 5% but inflation is 2%, then you'll be able to buy 3% more stuff next year because you saved. In contrast, if the interest rate is 5% but inflation is 8%, then you'll be able to buy 3% less stuff next year even though you have saved your money and earned interest. For these reasons, it is the real interest rate that determines whether consumers think saving is attractive relative to spending.
The same logic applies to firms thinking about borrowing to make investment purchases. If inflation is 10%, then a firm can expect that its prices (and profits) will be increasing by that much over the next year and a 10% interest rate won't seem so high. But if prices are falling, then a 10% interest rate on borrowings will seem very high.
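The arithmetic in the saving example above is just the approximate definition of the real rate: nominal rate minus inflation. A hypothetical helper (the function name is my own):

```python
def real_rate(nominal, inflation):
    """Approximate real interest rate: nominal rate minus inflation (percentage points)."""
    return nominal - inflation

# 5% nominal, 2% inflation: saving buys about 3% more goods next year.
print(real_rate(5.0, 2.0))  # -> 3.0

# 5% nominal, 8% inflation: saving buys about 3% less, despite earning interest.
print(real_rate(5.0, 8.0))  # -> -3.0
```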
With these ideas in mind, our version of the IS curve will be the following:

y_t = y_t* - α(i_t - π_t - r*) + ε_t^y    (1.5)

Expressed in words, this equation states that the gap between output and its natural rate depends on two elements:

1. The Real Interest Rate: The nominal interest rate at time t is represented by i_t, so the real interest rate is i_t - π_t. The coefficient α describes the effect of a one point increase in the real interest rate on output. The equation has been constructed in a particular way so that it explicitly defines the real interest rate at which output will, on average, equal its natural rate. This rate can be termed the natural rate of interest, a term first used by early 20th century Swedish economist Knut Wicksell. Specifically, we can see from the equation that if ε_t^y = 0 then a real interest rate of i_t - π_t = r* delivers y_t = y_t*.
2. Aggregate Demand Shocks, ε_t^y: The IS curve is an even more threadbare model of output than the Phillips curve model is of inflation. Many other factors beyond the real interest rate influence aggregate spending decisions. These include fiscal policy, asset prices and consumer and business sentiment. We will model all of these factors as temporary deviations from zero of an aggregate demand shock, ε_t^y. Note that this shock has a superscript y to distinguish it from the aggregate supply shock ε_t^π that appears in the Phillips curve.
Model Element Three: A Monetary Policy Rule
Thus far, our model has described how inflation depends on output and how output depends
on interest rates. We can complete the model by describing how interest rates are determined.
Traditionally, in the IS-LM model, this is where the LM curve is introduced. The LM curve links the demand for the real money stock with nominal interest rates and output, via a money demand relationship of the form

m_t/p_t = δ_0 + δ_1 y_t - δ_2 i_t    (1.6)

which can be re-arranged to give a positive relationship between output and interest rates of the form

y_t = (1/δ_1) (m_t/p_t - δ_0 + δ_2 i_t)    (1.7)
This positive relationship between output and interest rates is combined with the negative relationship between these variables in the IS curve to determine unique values for output and interest rates, something that can be illustrated in a graph with an upward-sloping LM curve and a downward-sloping IS curve. Monetary policy is then described as taking the form of the central bank adjusting the money supply m_t in a way that sets the position of the LM curve.

We will not take this approach here. First, modern central banks do not implement policy by setting a specified level of the monetary base. Instead, they use their power to supply base money to keep short-term interest rates at a chosen target level. The supply of base money ends up being whatever emerges from enforcing the interest rate target. This approach, which has been the practice at all the major central banks for at least 30 years, makes the LM curve (and the money supply) of secondary interest when thinking about core macroeconomic issues. Our approach will be to describe how the central bank sets interest rates and we won't focus on the money supply.
Second, the IS-LM framework on its own determines only output and interest rates. Then a separate AS-AD model is used to describe the determination of prices (and thus, implicitly, inflation). However, the reality is that rather than being determined independently of inflation, most modern central banks set interest rates with a very close eye on inflationary developments. A model that integrates the determination of inflation with the setting of monetary policy is thus more realistic. Because it boils everything down to a single model, this approach is also simpler than one that requires two different sets of graphs.
We will consider two different types of monetary policy rules that may be followed by the central bank in our model. The first one is a simple one in which the central bank adjusts its interest rate in line with inflation with the goal of meeting an inflation target. Specifically,

i_t = r* + π* + β_π(π_t - π*)    (1.8)

In English, the rule can be interpreted as follows: The central bank adjusts the nominal interest rate, i_t, upwards when inflation, π_t, goes up and downwards when inflation goes down (we are assuming that β_π > 0) and it does so in a way that means when inflation equals a target level, π*, chosen by the central bank, real interest rates will be equal to their natural level.
That's a bit of a mouthful, so let's see that this is the case and then try to understand why the monetary policy rule would take this form. First, note what the nominal interest rate will be if inflation equals its target level (i.e. π_t = π*). The term after the final plus sign in equation (1.8) will equal zero and the nominal interest rate will be

i_t = r* + π*    (1.9)

Because π_t = π* in this case, this can be written as

i_t = r* + π_t    (1.10)

so that the real interest rate equals its natural level:

i_t - π_t = r*    (1.11)
Now think about why a rule of this form might be a good idea. Suppose the central bank has a target inflation rate of π* that it wants to achieve. Ideally, it would like the public to understand that this is the likely inflation rate that will occur, so that π_t^e = π*. If that can be achieved, then the Phillips curve (equation 1.3) tells us that, on average, inflation will equal π* provided we have y_t = y_t*. And the IS curve tells us that, on average, we will have y_t = y_t* when i_t - π_t = r*. So this is a policy that can help to enforce an average inflation rate equal to the central bank's desired target, provided the public adjusts its inflationary expectations accordingly.
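The rule and its real-rate implication are easy to check numerically. A minimal sketch, with illustrative parameter values of my own choosing (r* = 2, π* = 2, β_π = 1.5):

```python
def policy_rate(pi, r_star=2.0, pi_target=2.0, beta_pi=1.5):
    """Simple inflation-targeting rule: i_t = r* + pi* + beta_pi*(pi_t - pi*)."""
    return r_star + pi_target + beta_pi * (pi - pi_target)

# At target inflation, the nominal rate is r* + pi* and the real rate equals r*.
i = policy_rate(pi=2.0)
print(i, i - 2.0)  # -> 4.0 2.0

# When inflation rises above target, the real rate rises above r* (beta_pi > 1).
i = policy_rate(pi=4.0)
print(i - 4.0)  # -> 3.0, i.e. a real rate above r* = 2
```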
The IS-MP Curve
Thats the model. It consists of three equations. Lets pull them together in one place. They
t = te + (yt yt ) + t (1.12)
The IS curve:
it = r + + (t ) (1.14)
Now you may recall that I had promised a graphical representation of this model. However, this is a system of three variables, which makes it hard to express on a graph with two axes. To make the model easier to analyse using graphs, we are going to reduce it down to a system with two main variables (inflation and output). We can do this because the monetary policy rule makes interest rates a function of inflation, so we can substitute this rule into the IS curve to get a new relationship between output and inflation that we will call the IS-MP curve.
We do this as follows. Replace the term i_t in equation (1.13) with the right-hand-side of equation (1.14):

y_t = y_t* - α[r* + π* + β_π(π_t - π*)] + α(π_t + r*) + ε_t^y    (1.15)

Multiplying out the terms, this becomes

y_t = y_t* - αr* - απ* - αβ_π(π_t - π*) + απ_t + αr* + ε_t^y    (1.16)
We can bring together the terms απ_t and -απ* into α(π_t - π*), and cancel out the terms in αr*, to get

y_t = y_t* - αβ_π(π_t - π*) + α(π_t - π*) + ε_t^y    (1.17)

which simplifies to

y_t = y_t* - α(β_π - 1)(π_t - π*) + ε_t^y    (1.18)

This is the IS-MP curve. It combines the information in the IS curve and the MP curve into one relationship.
What would this curve look like on a graph? It turns out it depends especially on the value of β_π, the parameter that describes how the central bank reacts to inflation. The IS-MP curve says that an extra unit of inflation implies a change of -α(β_π - 1) in output. Is this positive or negative? Well, we are assuming that α > 0 (we put a negative sign in front of this when determining the effect of real interest rates on output), so this combined coefficient will be negative if β_π - 1 > 0.
In other words, the IS-MP curve will have a negative slope (slope downwards) provided the central bank reacts to inflation so that β_π > 1. The explanation for this is as follows. If inflation changes by an amount x, the rule moves nominal interest rates by β_π x, so real interest rates change by (β_π - 1)x. This means that if β_π > 1 an increase in inflation leads to higher real interest rates and, via the IS curve relation, to lower output. So if β_π > 1 then the IS-MP curve can be depicted as a downward-sloping curve. In contrast, if β_π < 1, then an increase in inflation leads to lower real interest rates and higher output, implying an upward-sloping IS-MP curve.
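The algebra leading to equation (1.18) can be verified numerically: substituting the policy rule into the IS curve should give the same output as the reduced-form IS-MP curve for any inflation rate. A sketch with illustrative parameter values:

```python
# Check that substituting the policy rule (1.14) into the IS curve (1.13)
# reproduces the reduced-form IS-MP curve (1.18). Values are illustrative.
alpha, beta_pi = 1.0, 1.5
r_star, pi_target = 2.0, 2.0
y_star, eps_y = 100.0, 0.0

for pi in [0.0, 2.0, 5.0]:
    i = r_star + pi_target + beta_pi * (pi - pi_target)       # policy rule (1.14)
    y_is = y_star - alpha * (i - pi - r_star) + eps_y         # IS curve (1.13)
    y_ismp = y_star - alpha * (beta_pi - 1) * (pi - pi_target) + eps_y  # IS-MP (1.18)
    assert abs(y_is - y_ismp) < 1e-12
    print(pi, y_is)
```

With β_π = 1.5 > 1 the printed pairs trace out a downward-sloping relationship: output is 101 when inflation is 0, 100 at target inflation of 2, and 98.5 when inflation is 5.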
For now, we will assume that β_π > 1 so that we have a downward-sloping IS-MP curve, but we will revisit this issue after a few more lectures. Figure 1.8 thus shows what the IS-MP curve looks like when the aggregate demand shock ε_t^y = 0. Again, the key thing to notice is the value of inflation that occurs when output equals its natural rate. When y_t = y_t* we get π_t = π*. As with the Phillips curve, if you are asked to draw this graph in the final exam and you just draw a downward-sloping curve without describing the key points on the inflation and output axes, you won't score many points. Figure 1.9 shows how the IS-MP curve shifts to the right if there is a positive value of ε_t^y corresponding to a positive shock to aggregate demand.
Figure 1.8: The IS-MP Curve with ε_t^y = 0
[Graph: downward-sloping IS-MP curve; inflation (y-axis) against output (x-axis)]
Figure 1.9: The IS-MP curve as we move from ε_t^y = 0 to ε_t^y > 0
[Graph: IS-MP(ε_t^y = 0) and rightward-shifted IS-MP(ε_t^y > 0) curves; inflation against output]
The Full Model
The full IS-MP-PC model can be illustrated in the traditional fashion as a graph with one
curve that slopes upwards (the Phillips curve) and one that slopes downwards (the IS-MP
curve provided we assume that > 1.) Figure 1.10 provides the simplest possible example
of the graph. This is the case where both the temporary shocks, t and yt equal zero and the
publics expectation of inflation is equal to the central banks inflation target. Note that I have
labelled the PC and IS-MP curves to explicitly indicate what the expected and target rates
of inflation are and it will be a good idea for you to do the same when answering questions
In the next set of notes, we will analyse this model in depth, examining what happens
when various types of events occur and focusing carefully on how inflation expectations change
over time.
Figure 1.10: The IS-MP-PC Model When Expected Inflation Equals the Inflation Target
[Graph: IS-MP and PC(π_t^e = π*) curves intersecting at y_t = y_t*, π_t = π*; inflation against output]
A More Complicated Monetary Policy Rule: The Taylor Rule
Before moving on to analyse the model in more depth, I want to describe the more complex
version of the monetary policy rule that I alluded to earlier. This rule takes a form that is
associated with Stanford economist John Taylor. In a famous paper published in 1993 called
Discretion Versus Policy Rules in Practice (you will find a link on the class webpage) Taylor
noted that Federal Reserve policy in the few years before his paper was published seemed to
be characterised by a rule in which interest rates were adjusted in response to both inflation
Within our model structure, we can amend our monetary policy rule to be more like this
it = r + + (t ) + y (yt yt ) (1.19)
It turns out that the properties of the IS-MP-PC model don't really change if we adopt this more complicated monetary policy rule. If we substitute this expression for the nominal interest rate into the IS curve and bring together all the terms involving the output gap y_t - y_t*, we get

y_t - y_t* = -[α(β_π - 1)/(1 + αβ_y)] (π_t - π*) + [1/(1 + αβ_y)] ε_t^y    (1.23)
This equation shows us that broadening the monetary policy rule to incorporate interest rates responding to the output gap doesn't change the essential form of the IS-MP curve. As long as β_π > 1, the curve will slope downwards and will feature π_t = π* when y_t = y_t* and there are no inflationary shocks. So while the addition of an output gap response to the monetary policy rule changes the coefficients of the IS-MP model a bit, it doesn't change the underlying economics. In the analysis in the next sets of notes, we will stick with the model that uses the simpler monetary policy rule.
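The point that the output-gap response changes the coefficients but not the sign of the slope can be seen by comparing the two slope expressions numerically. A sketch with illustrative parameter values:

```python
# Compare the IS-MP slope under the simple rule (1.18) and the Taylor-type
# rule (1.23). Parameter values are illustrative, not taken from the text.
alpha, beta_pi, beta_y = 1.0, 1.5, 0.5

slope_simple = -alpha * (beta_pi - 1)                         # from (1.18)
slope_taylor = -alpha * (beta_pi - 1) / (1 + alpha * beta_y)  # from (1.23)

print(slope_simple)  # -> -0.5
print(slope_taylor)  # flatter, but still negative because beta_pi > 1
```

Adding the output-gap term divides the slope by 1 + αβ_y, flattening the curve, but the slope stays negative whenever β_π > 1.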
Chapter 2
In the last set of notes, we put together the IS-MP-PC model. We will now work through its properties.
Let's start by repeating a graph from the last time. Figure 2.1 shows the simplest possible example of the model. This is the case where both the temporary shocks, ε_t^π and ε_t^y, equal zero and the public's expectation of inflation equals the central bank's inflation target. Specifically, the graph shows a case where the public's expectation of inflation is π_t^e = π_1 and the central bank's inflation target is π* = π_1. With no temporary shocks, the value of output consistent with π_t = π_1 for the IS-MP curve is y_t*. Similarly, the value of output consistent with π_t = π_1 for the PC curve is also y_t*. So the model generates an outcome where π_t = π_1 and y_t = y_t*.
Now consider a case in which the public's inflation expectations shift to being higher than the central bank's target rate. Figure 2.2 illustrates this case. It shows the PC curve shifting upwards to the red line. The position of this red line is determined by the new higher level of expected inflation. Specifically, the public's inflation expectations are now determined by π_t^e = π_2^e. Note that π_2^e is the higher level of inflation noted on the y-axis and that this level is consistent with y_t = y_t* in the new Phillips curve described by the red line.
The outcomes for inflation and output of the increase in inflation expectations are described by the intersection of the new red PC line and the old unchanged IS-MP curve. The actual outcome for inflation (denoted as π_2 on the graph) ends up being higher than the central bank's inflation target but lower than the public's inflationary expectations. Output ends up being lower than its natural rate (consistent with a slump or perhaps a full-blown recession) because the higher level of inflation leads the central bank to raise real interest rates, which reduces output.
When studying this graph, it's important to understand the various markings on the curves and the axes. If I ask you on the final exam to illustrate the impact of an increase in inflation expectations using this model, my preference would be to see the various assumptions about inflation targets and inflation expectations explicitly marked out, rather than just a graph with unlabelled curves shifting about.
Figure 2.2 is a good example of how we can use graphs to illustrate a model's properties. It gets the basic story across as to what happens when inflation expectations rise above target when the central bank is pursuing a monetary policy rule that increases real rates in response to higher inflation.
Still, one could look to dig a bit deeper. The inflation outcome as drawn in Figure 2.2 is slightly more than halfway towards the public's inflation expectations relative to the central bank's inflation target. But what actually determines this outcome? In other words, what determines how far away from target inflation will move when the public's inflation expectations change? How much does it depend on the monetary policy rule? How much does it depend on other aspects of the model, like the impact of real interest rates on output and the impact of output on inflation? It would be tricky to get these answers from a graph. However, using the equations underlying the model, we can get a full solution that fully answers all these questions.
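Before working through the algebra, the experiment in Figure 2.2 can be previewed numerically by treating the PC and IS-MP equations as two simultaneous linear equations. A minimal sketch; all parameter values are illustrative, not taken from the text:

```python
# Jointly solve the Phillips curve and IS-MP curve for inflation and output
# when expectations exceed the target. Parameter values are illustrative.
alpha, beta_pi, gamma = 1.0, 1.5, 0.5
pi_target, y_star = 2.0, 100.0
pi_e = 4.0   # expectations two points above the target

# PC:    pi = pi_e + gamma*(y - y_star)
# IS-MP: y  = y_star - alpha*(beta_pi - 1)*(pi - pi_target)
# Substituting IS-MP into PC and solving the resulting linear equation for pi:
k = alpha * gamma * (beta_pi - 1)
pi = (pi_e + k * pi_target) / (1 + k)
y = y_star - alpha * (beta_pi - 1) * (pi - pi_target)

print(pi, y)  # -> 3.6 99.2
```

Inflation (3.6) lands between the target (2) and the public's expectations (4), and output falls below its natural level, matching the story told by the graph.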
Figure 2.1: The IS-MP-PC Model When Expected Inflation Equals the Inflation Target
[Graph: IS-MP and PC(π_t^e = π*) curves; inflation against output]
Figure 2.2: The IS-MP-PC Model When Expected Inflation Rises
[Graph: original PC(π_1^e), new higher PC(π_2^e) and unchanged IS-MP curves; inflation against output]
The IS-MP-PC Model Solution for Inflation
Let's repeat the equations describing our two curves as presented in our last set of notes. The PC curve is

π_t = π_t^e + γ(y_t - y_t*) + ε_t^π    (2.1)

and the IS-MP curve is

y_t = y_t* - α(β_π - 1)(π_t - π*) + ε_t^y    (2.2)
Taking all the other elements of the model as given, we can view this as two linear equations in the two variables π_t and y_t. These equations can be solved to give solutions that describe how these two variables depend on all the other elements of the model.
This can be done as follows. First, we will derive a complete expression for the behaviour of inflation and then derive an expression for output. We derive the expression for inflation by starting with the Phillips curve and replacing the output gap term y_t - y_t* with the variables that the IS-MP curve tells us determine this gap. This gives us the following equation:

π_t = π_t^e + γ[-α(β_π - 1)(π_t - π*) + ε_t^y] + ε_t^π    (2.3)

Gathering the terms involving π_t on the left-hand side, we get

[1 + αγ(β_π - 1)] π_t = π_t^e + αγ(β_π - 1) π* + γ ε_t^y + ε_t^π    (2.4)

Dividing across by 1 + αγ(β_π - 1) gives the solution

π_t = [1/(1 + αγ(β_π - 1))] π_t^e + [αγ(β_π - 1)/(1 + αγ(β_π - 1))] π* + [1/(1 + αγ(β_π - 1))] (γ ε_t^y + ε_t^π)    (2.5)
There are a lot of symbols in this equation, which make it a bit hard to read. One way to
simplify it is to take the term multiplying inflation expectations and denote it by a single
42
symbol. In this case, we will denote it by the Greek letter (theta, pronounced thay-ta
1
= (2.6)
1 + ( 1)
t = te + (1 ) + (yt + t ) (2.7)
This equation shows that, apart from the shocks to output and inflation (the θ(γε^y_t + ε^π_t) terms), inflation is a weighted average of the public's inflation expectations and the central bank's inflation target, i.e. it must lie between these two values as long as 0 < θ < 1 (which it should be). What determines whether inflation depends more on the public's expectations or the central bank's target? In other words, what determines the value of θ? Three different factors matter:
1. γ: This is the parameter that determines how inflation changes when output changes. The central bank can only influence inflation by influencing output. If the effect of output on inflation gets bigger, then the central bank's inflation target will have more influence on the outcome for inflation.

2. α: This is the parameter that determines how output changes when interest rates change. If the effect of interest rates on output gets bigger, then the central bank's inflation target will have more influence on the outcome for inflation.

3. β_π: Let's continue to assume β_π > 1 (we'll return to this in the next set of notes). Then as β_π gets bigger, the central bank is reacting more to inflation being above its target level. So this parameter getting bigger means less weight on inflation expectations in determining the outcome for inflation and more weight on the central bank's inflation target.
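To see how the pieces fit together, here is a minimal numerical sketch in Python that checks the closed-form solution for inflation in equation (2.7) against a direct iterative solution of the two curves (2.1) and (2.2). All parameter values below are illustrative assumptions, not values from the notes:

```python
# Cross-checking the closed-form solution (2.7) against a direct solution of
# the two curves (2.1) and (2.2). All parameter values are illustrative.
alpha, gamma, beta_pi = 0.5, 0.4, 1.5  # IS sensitivity, PC slope, rule coefficient
pi_star, pi_e = 2.0, 4.0               # inflation target, expected inflation
eps_y, eps_pi = 0.0, 0.0               # shocks switched off
y_nat = 100.0                          # natural level of output

theta = 1.0 / (1.0 + alpha * gamma * (beta_pi - 1.0))
pi_closed = theta * pi_e + (1 - theta) * pi_star + theta * (gamma * eps_y + eps_pi)

# Solve the same two equations by iterating to the fixed point:
pi = pi_star
for _ in range(200):
    y = y_nat - alpha * (beta_pi - 1.0) * (pi - pi_star) + eps_y  # IS-MP (2.2)
    pi = pi_e + gamma * (y - y_nat) + eps_pi                      # PC (2.1)

print(round(pi_closed, 6), round(pi, 6))  # both 3.818182: the solutions agree
```

As the formula predicts, the solution lies between the expectation (4) and the target (2), much closer to the expectation because θ is close to one for these parameter values.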
Next we provide an expression for output. Looking at the IS-MP curve, we see that the output gap depends on how far inflation is from the central bank's target as well as on the demand shock term ε^y_t. We can use the equation determining inflation, equation (2.7), to get an expression for the gap between inflation and the target level. Subtracting π* from either side of equation (2.7) gives

π_t − π* = θ(π^e_t − π*) + θ(γε^y_t + ε^π_t) (2.8)

We can now replace the π_t − π* on the right-hand side of the IS-MP curve, equation (2.2), with this expression:

y_t = y*_t − α(β_π − 1) [θ(π^e_t − π*) + θ(γε^y_t + ε^π_t)] + ε^y_t (2.9)

which simplifies to

y_t = y*_t − α(β_π − 1)θ(π^e_t − π*) + θε^y_t − α(β_π − 1)θε^π_t (2.10)

This equation tells us that whether output is above or below its natural rate depends upon the gap between expected inflation and the inflation target as well as on the two temporary shocks ε^y_t and ε^π_t. Provided we have the usual condition that β_π > 1, the combined coefficient −α(β_π − 1)θ is negative, so inflation expectations above the inflation target end up having a negative effect on output. Inflationary supply shocks (positive values of ε^π_t) also have a negative effect on output, while positive aggregate demand shocks raise output.
How far does output fall short of its natural rate when inflation expectations rise above the central bank's target? The coefficient determining this is −α(β_π − 1)θ. The larger the impact of interest rates on output (a bigger α) and the larger the central bank's response to inflation (a bigger β_π), the larger the shortfall in output will be when inflation expectations rise above the central bank's target. (A bigger γ, by contrast, shrinks this output effect: when inflation responds strongly to output, less of an output contraction is needed to contain inflation.)
While the calculations on these pages may seem difficult, they illustrate that a formal mathematical solution can sometimes give us a much more complete insight into the properties of a model than graphs. While graphs are often useful for illustrating a particular feature of a model, they also often fall short of explaining the full properties of a model.
Let's go back to Figure 2.2 now. We have seen that after the public's inflation expectations rise, the result is a fall in output below its natural rate and an increase in inflation, though this increase is smaller than had been expected by the public. What happens next? How does the economy adjust over time?

Friedman's 1968 paper "The Role of Monetary Policy" suggested that people gradually adapt their expectations based on past outcomes for inflation. Consider now a simple model of this idea of adaptive expectations by assuming that, each period, the expected level of inflation is simply equal to the level that prevailed last period. Formally, this can be written as

π^e_t = π_{t−1} (2.11)
Under this formulation of expectations, the Phillips curve becomes

π_t = π_{t−1} + γ(y_t − y*_t) + ε^π_t (2.12)

which can be re-arranged as

π_t − π_{t−1} = γ(y_t − y*_t) + ε^π_t (2.13)

In other words, there should be a positive relationship between the change in inflation and the output gap. There are various methods for measuring output gaps but one quick and easy method is to use the unemployment rate as an indicator of what the output gap might be. If unemployment is high, then output is likely to be below its natural rate, so the output gap is negative. In contrast, a low unemployment rate is an indicator that the output gap is likely to be positive. So if the adaptive expectations formulation of the Phillips curve was correct, then we would expect to see a negative relationship in the data between the change in inflation and the unemployment rate.
Figure 2.3 uses the same US quarterly data that we used for Figure 1.4 in the previous chapter. That figure showed that there was very little relationship between the level of the unemployment rate and the level of inflation. Figure 2.3 shows a scatter plot of datapoints on the change in inflation (measured as the four-quarter percentage change in the price level minus the percentage change in the price level over the preceding four quarters) and the unemployment rate. Consistent with the adaptive expectations formulation of the Phillips curve, it shows a clear and strong negative relationship between the change in inflation and the unemployment rate.
Figure 2.3: Evidence for Adaptive Inflation Expectations [scatter plot omitted; horizontal axis: change in inflation, vertical axis: unemployment rate]
These results suggest that the adaptive expectations approach appears to provide a reasonable model of how people formulate inflation expectations. That said, people are unlikely to simply use mechanical formulae to arrive at their expectations, and one can imagine conditions in which people's inflation expectations could radically depart from what had happened in the past, e.g. the appointment of a new central bank governor with a different approach to inflation, the adoption of a new currency or other major events. Let's examine for now, however, how the IS-MP-PC model behaves when people have adaptive inflation expectations.
After inflation expectations moved up to π^h, the outcome was that inflation moved from π_1 (which is the central bank's inflation target) to π_2. If people follow adaptive expectations then, the next period, they will set π^e = π_2. Figure 2.4 shows what happens after this. The PC curve moves back downwards and inflation moves down to a lower level, denoted on this graph by π_3. Figure 2.5 indicates how the process plays out. If the public has adaptive expectations, then inflation and output gradually converge back to the point where output is at its natural rate and inflation no longer deviates from the central bank's inflation target.

But if the public has adaptive expectations, how could inflation expectations just jump upwards in the first place? Rather than a random unexplained increase in inflation expectations, the more likely explanation for the Phillips curve shifting upwards is a sequence of temporary supply shocks, i.e. ε^π_t being positive for a number of periods. Under adaptive expectations, the public becomes used to higher inflation and so the Phillips curve will remain above its long-run position even after the temporary supply shock has been reversed.
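The gradual convergence shown in Figure 2.5 can be illustrated with a few lines of Python. Combining equation (2.7) with adaptive expectations and setting the shocks to zero gives π_t = θπ_{t−1} + (1 − θ)π*; the parameter values below are illustrative assumptions:

```python
# Convergence back to target under adaptive expectations: combining the
# solution (2.7) with pi_e[t] = pi[t-1] and no shocks gives
# pi[t] = theta*pi[t-1] + (1 - theta)*pi_star. Parameters are illustrative.
alpha, gamma, beta_pi = 0.5, 0.4, 1.5
pi_star = 2.0
theta = 1.0 / (1.0 + alpha * gamma * (beta_pi - 1.0))

pi = 5.0  # expectations (and hence inflation) start well above the 2% target
path = [pi]
for _ in range(30):
    pi = theta * pi + (1 - theta) * pi_star
    path.append(pi)

print(round(path[0], 2), round(path[-1], 2))  # starts at 5.0, ends close to 2.0
```

Each period inflation moves a fraction (1 − θ) of the remaining distance back towards the target, so the convergence is gradual but monotonic.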
Figure 2.4: Inflation Expectations Adjusting Back Downwards [figure omitted; axes: inflation against output]

Figure 2.5: Inflation and Output Adjust Back to Starting Position [figure omitted; axes: inflation against output]
A Temporary Aggregate Demand Shock
Having looked at what happens under adaptive expectations when the Phillips curve shifts, let's consider what happens when we have a temporary shock to aggregate demand, so ε^y_t takes a value different from zero, which means a shift in the IS-MP curve. In Figures 2.6 to 2.9, we illustrate a case where there is a shift towards a positive value of ε^y_t for a couple of periods.
Figure 2.6 shows the immediate impact of a positive aggregate demand shock. Output and inflation both go up, with inflation reaching the point denoted as π_2 in the figure. If the public has adaptive expectations, then this results in an increase in inflation expectations the following period. Figure 2.7 shows what happens when the aggregate demand shock persists but inflation expectations move up to match the previous period's inflation rate. The inflation rate now rises again, to π_3. Figure 2.8 shows how this triggers a further increase in inflation expectations and another rise in inflation.
Figure 2.9 shows what happens if the aggregate demand shock then reverses itself in the next period. The IS-MP curve shifts back to its original position but the Phillips curve remains elevated. The result is a nasty combination of high inflation and output below its natural rate. Figure 2.9 contains arrows showing the full set of movements generated by this temporary demand shock: an initial increase in both output and inflation, further increases in inflation as expectations adjust upwards, then high inflation combined with a decline in output once the shock reverses, and finally a further decline in inflation accompanied by an increase in output as inflationary expectations adjust back downwards.
This chart shows that when the public has adaptive inflation expectations, temporary positive aggregate demand shocks lead to counter-clockwise loops on graphs that have output on the horizontal axis and inflation on the vertical axis. It turns out that much of the data on inflation and output correspond to these kinds of movements. Figure 2.10 is borrowed from notes on Stanford economist Charles I. Jones's website. They show US data on inflation and an estimated output gap from 1960 to 1983. The figure shows a number of periods of clear counter-clockwise movements. Figure 2.11 shows the same data from 1983-2009. This figure also shows some evidence of counter-clockwise loops, though the movements are smaller than for the pre-1983 period.
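These loops can be generated directly from the model. The sketch below (Python, with purely illustrative parameter values) feeds a three-period positive demand shock through equations (2.7) and (2.2) under adaptive expectations and confirms the boom-then-bust pattern:

```python
# Sketch of the boom-bust loop: a temporary positive demand shock under
# adaptive expectations. All parameter values are illustrative assumptions.
alpha, gamma, beta_pi = 0.5, 0.4, 1.5
pi_star, y_nat = 2.0, 100.0
theta = 1.0 / (1.0 + alpha * gamma * (beta_pi - 1.0))

eps_y_path = [2.0, 2.0, 2.0] + [0.0] * 17  # demand shock lasts three periods
pi_prev = pi_star  # start at target with expectations anchored there
points = []
for eps_y in eps_y_path:
    pi = theta * pi_prev + (1 - theta) * pi_star + theta * gamma * eps_y  # (2.7)
    y = y_nat - alpha * (beta_pi - 1.0) * (pi - pi_star) + eps_y          # (2.2)
    points.append((y, pi))
    pi_prev = pi  # adaptive expectations: next period's expected inflation

# While the shock lasts, output and inflation are both high; once it
# reverses, inflation is still elevated and output falls below y_nat.
boom_y, boom_pi = points[0]
bust_y, bust_pi = points[3]
print(boom_y > y_nat, bust_y < y_nat, bust_pi > pi_star)  # True True True
```

Plotting the (y, π) points in order traces out exactly the counter-clockwise loop described above.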
Figure 2.6: A Temporary Aggregate Demand Shock (ε^y_t > 0) [figure omitted]

Figure 2.7: Inflation Expectations Adjust Upwards to π_2 [figure omitted]

Figure 2.8: Inflation Expectations Adjust Upwards Further to π_3 [figure omitted]

Figure 2.9: Reversal of Aggregate Demand Shock Leads to Recession [figure omitted]

Figure 2.10: From Chad Jones's Notes: US Inflation-Output Loops 1960-1983 [figure omitted]

Figure 2.11: From Chad Jones's Notes: US Inflation-Output Loops 1983-2009 [figure omitted]
What if Inflation Expectations Dont Adjust?
The evidence presented in Figure 2.3 suggests that adaptive expectations is a reasonable model for how people have formulated their expectations of inflation. And it can be argued that it is a fairly convincing model of how people behave: most people don't have the time or knowledge to fully understand exactly what's going on in the economy, and anticipating that last year's conditions provide a guide to what will happen this year probably works well enough for most people. Indeed, if the value of θ is relatively high, then inflation will only change slowly under adaptive expectations, making the adaptive expectations assumption more accurate.
All that said, it is also possible to imagine situations in which the public's inflation expectations are not formed adaptively. For example, if the public believes that the central bank will always act to return inflation quickly towards its target, then they may simply expect inflation to remain close to that target.

Figure 2.12 shows how the economy reacts to a temporary positive demand shock when inflation expectations don't change. The outcome here is much nicer than the counter-clockwise cycle described in Figure 2.9. There is no recession at any point, just a short period of output being above its natural rate and inflation being above its target, followed by a return to their starting levels.
Figure 2.12: Adjustment if Inflation Expectations Don't Change [figure omitted; axes: inflation against output]
Implications for Monetary Policy
The previous examples provide food for thought about what kind of monetary policy we would like a central bank to implement. Do we want a "soft" central bank that limits the increase in real interest rates when inflation rises to protect the economy and which isn't too concerned about getting inflation back to target quickly? Or do we want a "tough" central bank that raises interest rates aggressively and is very concerned about getting inflation back to target? Suppose we wish to avoid large recessions if possible. You might imagine the soft central bank would be more likely to deliver this. However, our model says the exact opposite: recessions are smaller and the economy less volatile when the central bank acts aggressively against inflation.
We can see this in two results from the model. First, the economy responds better to shocks if the central bank reacts strongly to changes in inflation. We can see this from equations (2.7) and (2.10): the coefficient θ multiplying the temporary shocks in both equations is smaller the more aggressively the central bank responds to inflation (the higher β_π is), so a given shock moves inflation and output by less. Equations (2.6) and (2.7) also tell us that a more aggressive approach means a smaller increase in inflation in response to a rise in inflation expectations. Under adaptive expectations, smaller movements in inflation mean expectations themselves adjust by less, so the process of a gradual return to the natural level of output and the central bank's target inflation rate is also faster. So recessions are smaller and shorter with an aggressive central bank.
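A quick way to see the role of β_π is to compute, for assumed illustrative parameter values, the impact effect on inflation of a jump in expected inflation and the speed at which the inflation gap then decays under adaptive expectations:

```python
# Comparing a "soft" and a "tough" policy rule (all parameter values are
# illustrative assumptions). A higher beta_pi lowers theta, so a jump in
# expected inflation passes through less into actual inflation, and under
# adaptive expectations the inflation gap decays back to target faster.
import math

alpha, gamma = 0.5, 0.4
pi_star, pi_e0 = 2.0, 5.0

def theta_of(beta_pi):
    return 1.0 / (1.0 + alpha * gamma * (beta_pi - 1.0))

for label, beta_pi in [("soft", 1.2), ("tough", 3.0)]:
    th = theta_of(beta_pi)
    pi_impact = th * pi_e0 + (1 - th) * pi_star  # equation (2.7), no shocks
    # periods until the inflation gap shrinks to 10% of its initial size
    decay_periods = math.log(0.1) / math.log(th)
    print(label, round(pi_impact, 2), round(decay_periods, 1))
```

For these assumed values, the tough rule both limits the initial rise in inflation and brings the economy back to target in far fewer periods.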
Second, the more people believe that a central bank is maintaining its low inflation target, the less likely the economy is to go through boom-bust cycles. We can see this by comparing the dynamics from Figure 2.9, where inflation expectations shift over time (perhaps because the public believes the central bank is willing to be flexible about its target), and Figure 2.12, which
shows what happens when inflation expectations do not change after an expansionary shock.
These results predict that we get better outcomes if we have a tough central bank that acts aggressively against inflation and which the public believes is committed to keeping inflation low. How can this outcome be achieved? The academic literature on this topic has suggested a number of approaches:
1. Political Independence: A central bank that plans for the long-term (and does not worry about economic performance during election years) is more likely to stick to a low-inflation policy.

2. Conservative Central Bankers: If the central bank is run by someone known to be strongly averse to inflation, and the public believes this, the economy gets closer to the ideal low-inflation outcome even without commitment. So the government may choose to appoint a "conservative" central banker.

3. Consequence for Bad Inflation Outcomes: Introducing laws so that bad things happen to the central bankers when inflation is high is one way to make the public believe that inflation will be kept low.

These ideas have had a considerable influence on the legal structure of central banks around the world:
1. Political Independence: There has been a substantial move around the world towards making central banks more independent. Close to home, the Bank of England was made independent in 1997 (previously the Chancellor of the Exchequer had set interest rates).
2. Conservative Central Bankers: All around the world, central bankers talk much more now about the evils of inflation and the benefits of price stability. This may be because they believe this to be the case. But there is also a marketing element. Perhaps they can face a better macroeconomic tradeoff if the public believes the central bank is strongly committed to low inflation.
3. Consequence for Bad Inflation Outcomes: In tandem with the move towards increased independence, many central banks now have legally imposed inflation targets and various types of bad things happen to the chief central banker when the inflation target is not met. For instance, the Governor of the Bank of England has to write a letter to the Chancellor explaining why the target was not met.
Chapter 3
Up to now, we have assumed that the central bank responds to higher inflation by implementing a bigger change in interest rates. In terms of the equation for our monetary policy rule, this means we are assuming β_π > 1. With this assumption, real interest rates go up when inflation rises and go down when inflation falls. For this reason, our IS-MP curve slopes downwards: along this curve, higher inflation means lower output. Because John Taylor's original proposed rule had the feature that β_π > 1, the idea that monetary policy rules should have this feature has become known as the Taylor Principle. In these notes, we will examine what happens when this principle is not satisfied.
Recall from our last set of notes that inflation in the IS-MP-PC model is given by

π_t = θπ^e_t + (1 − θ)π* + θ(γε^y_t + ε^π_t) (3.1)

where

θ = 1/(1 + αγ(β_π − 1)) (3.2)
Under adaptive expectations, π^e_t = π_{t−1}, and the model can be re-written as

π_t = θπ_{t−1} + (1 − θ)π* + θ(γε^y_t + ε^π_t) (3.3)

The value of θ turns out to be crucial to the behaviour of inflation and output in this model. There are three cases to consider.

Case 1: β_π > 1

If the Taylor principle is satisfied, then αγ(β_π − 1) > 0. That value being positive means that the denominator in the formula for θ is greater than one, giving us a value of θ that is positive but less than one. So β_π > 1 translates into the case 0 < θ < 1.

Case 2: 1 − 1/(αγ) < β_π < 1

If β_π is below one, then αγ(β_π − 1) < 0 and the denominator in the formula for θ is below one, giving a value of θ that is greater than one. As β_π falls farther below one, θ gets bigger and bigger and heads towards infinity as β_π approaches 1 − 1/(αγ) (this is the value of β_π that makes the denominator in the formula equal zero). As long as we assume that β_π stays above this level, θ remains positive.

Case 3: 0 < β_π < 1 − 1/(αγ)

In this case, the denominator in the formula for θ is negative, so θ is negative, meaning an increase in inflation expectations actually reduces inflation. We are not going to dwell on this somewhat strange case.
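The three cases can be checked numerically. Note that the values below are illustrative; I use α = 2 and γ = 1 so that the critical value 1 − 1/(αγ) is positive and all three cases arise for positive β_π:

```python
# The three regimes for theta = 1/(1 + alpha*gamma*(beta_pi - 1)).
# Illustrative values: alpha = 2.0, gamma = 1.0, so 1 - 1/(alpha*gamma) = 0.5.
# (For the third case to arise with a positive beta_pi we need alpha*gamma > 1.)
alpha, gamma = 2.0, 1.0
crit = 1.0 - 1.0 / (alpha * gamma)  # beta_pi at which the denominator is zero

def theta_of(beta_pi):
    return 1.0 / (1.0 + alpha * gamma * (beta_pi - 1.0))

print("denominator zero at beta_pi =", crit)       # 0.5
print("Case 1 (beta_pi = 1.50):", theta_of(1.5))   # 0.5: between zero and one
print("Case 2 (beta_pi = 0.75):", theta_of(0.75))  # 2.0: greater than one
print("Case 3 (beta_pi = 0.25):", theta_of(0.25))  # -2.0: negative
```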
Macro Dynamics and Difference Equations
These calculations tell us that as long as the Taylor principle is satisfied, the value of θ lies between zero and one, but that if β_π slips below one, then θ becomes greater than one. It turns out this is a very important distinction. To understand the difference between these two cases, we need a brief detour into difference equations: equations that determine a variable from its own past values. The sequences generated by these equations can be understood as a pattern over time for a variable of interest. After supplying some starting values, the difference equation provides a sequence explaining how the variable changes over time. For example, consider a case in which the first value for a variable z is z_1 = 1 and subsequent values are given by

z_t = z_{t−1} + 2 (3.4)
This will give z_2 = 3, z_3 = 5, z_4 = 7 and so on. More relevant to our case is the multiplicative model

z_t = bz_{t−1} (3.5)

For a starting value of z_1 = x, this difference equation delivers a sequence of values that looks like x, bx, b²x, b³x and so on, i.e. z_t = b^(t−1)x. Note that if b is between zero and one, then this sequence converges to zero over time no matter what value x takes, but if b > 1, the sequence will explode off towards either plus or minus infinity depending on whether the initial value was positive or negative. The same logic prevails if we add a constant term to the difference equation. Consider this equation:
z_t = a + bz_{t−1} (3.6)
If b is between zero and one, then no matter what the starting value is, the sequence converges over time to a/(1 − b), but if b > 1, the sequence explodes towards infinity. Similarly, if we add a random shock each period,

z_t = a + bz_{t−1} + ε_t (3.7)

where ε_t is a series of independently drawn zero-mean random shocks, then the presence of the shocks will mean the series won't simply converge to a constant or steadily explode. But as long as we have 0 < b < 1, the series will tend to oscillate above and below the average value of a/(1 − b), while if b > 1 the series will tend to explode to infinity over time.
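These convergence and explosion properties are easy to verify numerically; here is a small Python check using arbitrary illustrative values of a, b and the starting value:

```python
# Numerical check of z_t = a + b*z_{t-1} (a, b and z0 are arbitrary
# illustrative values): convergence to a/(1-b) when 0 < b < 1, explosion
# when b > 1.
def simulate(a, b, z0, periods):
    z = z0
    for _ in range(periods):
        z = a + b * z
    return z

print(round(simulate(a=2.0, b=0.5, z0=10.0, periods=50), 6))  # 4.0 = a/(1-b)
print(simulate(a=2.0, b=1.5, z0=10.0, periods=50) > 1e8)      # True: explodes
```

Notice that the limiting value 4.0 does not depend on the starting value z0; only the speed of convergence does.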
These considerations explain why the Taylor principle is so important. If β_π > 1, then inflation dynamics in the IS-MP-PC model can be described by an AR(1) model with a coefficient on past inflation that is between zero and one (the θ in equation (3.3) plays the role of the coefficient b in the models just considered). So a policy rule that satisfies the Taylor principle produces a stable time series for inflation under adaptive expectations. And because output depends on the gap between inflation expectations and the central bank's inflation target, stable inflation goes together with stable output.

In contrast, once β_π falls below 1, the coefficient on past values of inflation in equation (3.3) becomes greater than one and the coefficient on the inflation target becomes negative. In this case, any change in inflation produces a greater change in the same direction next period and inflation ends up exploding off to either plus or minus infinity. Similarly, output also becomes unstable.
Why does β_π matter so much for macroeconomic stability? Obeying the Taylor principle means that shocks that boost inflation (whether they be supply or demand shocks) raise real interest rates (because nominal rates go up by more than inflation does) and thus reduce output, which contains the increase in inflation and keeps the economy stable. In contrast, when β_π falls below 1, shocks that raise inflation result in lower real interest rates and higher output, which further fuels the initial increase in inflation (similarly, declines in inflation become self-reinforcing).
You might be tempted to think that the arguments in favour of obeying the Taylor principle as explained here depend crucially on the assumption of adaptive expectations, but this isn't the case. Even before assuming adaptive expectations, from equation (3.1) we can see that when θ > 1, the coefficient on the central bank's inflation target is negative. So if you introduced a more sophisticated model of expectations formation, the public would realise that the central bank's inflation target doesn't have its intended influence on inflation and so there would be no reason to expect this value of inflation to come about. But if people know that changes in expected inflation are translated more than one-for-one into changes in actual inflation, then this could produce self-fulfilling inflationary spirals, even if the public had a more sophisticated method of forming expectations than the adaptive one employed here.
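Applying the difference-equation logic directly to equation (3.3), the simulation below contrasts the stable and explosive cases (the θ values and starting point are illustrative assumptions, and shocks are switched off):

```python
# Equation (3.3) without shocks: pi_t = theta*pi_{t-1} + (1 - theta)*pi_star.
# theta values below are illustrative: 0.9 for a rule obeying the Taylor
# principle, 1.25 for one that does not.
pi_star = 2.0

def inflation_path(theta, pi0, periods=20):
    path = [pi0]
    for _ in range(periods):
        path.append(theta * path[-1] + (1 - theta) * pi_star)
    return path

stable = inflation_path(theta=0.9, pi0=3.0)    # heads back towards the 2% target
unstable = inflation_path(theta=1.25, pi0=3.0) # moves ever further from target
print(round(stable[-1], 2), round(unstable[-1], 2))
```

After only twenty periods, the stable rule has inflation back near target while the explosive case has it far above, illustrating how quickly failure of the Taylor principle destabilises inflation.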
Graphical Representation
We can use graphs to illustrate the properties of the IS-MP-PC model when the Taylor principle is not obeyed. Recall that the IS-MP curve is given by this equation:

y_t = y*_t − α(β_π − 1)(π_t − π*) + ε^y_t (3.8)

The slope of the curve depends on whether or not β_π > 1. In our previous notes, we assumed β_π > 1 and so the slope −α(β_π − 1) < 0, meaning the IS-MP curve slopes down. With β_π < 1, the IS-MP curve slopes up. Figure 3.1 illustrates the IS-MP-PC model in this case under the assumption that π^e_t = π* = π_1, i.e. that the public expects inflation to equal the central bank's inflation target.
One technical point about this graph is worth noting. I have drawn the upward-sloping IS-MP curve as a steeper line than the upward-sloping Phillips curve. On the graph as we've drawn it in inflation-output space, the slope of the IS-MP curve is 1/(α(1 − β_π)) while the slope of the Phillips curve is γ. One can show that the condition 1/(α(1 − β_π)) > γ is the same as requiring θ > 0, i.e. we are ruling out values of β_π associated with the strange third case noted above.
Now consider what happens when there is an increase in inflation expectations when β_π falls below one. Figure 3.2 shows a shift in the Phillips curve due to inflation expectations increasing from π_1 to π^h (you can see that the value of inflation on the red Phillips curve when y_t = y*_t is π_t = π^h). Notice now that, because the IS-MP curve is steeper than the Phillips curve, inflation increases above π^h to take the higher value of π_2. Inflation overshoots the increase in inflation expectations.

Figure 3.3 shows what happens next if the public have adaptive expectations. In this next period, we have π^e_t = π_2 and inflation jumps all the way up to the even higher value of π_3. We won't show any more graphs, but the process would continue with inflation increasing every period. These figures thus show graphically what we've already demonstrated with equations: the IS-MP-PC model generates explosive dynamics when the monetary policy rule fails to satisfy the Taylor principle.
Figure 3.1: The IS-MP-PC Model when 1 − 1/(αγ) < β_π < 1 [figure omitted]

Figure 3.2: An Increase in π^e when 1 − 1/(αγ) < β_π < 1 [figure omitted]

Figure 3.3: Explosive Dynamics when 1 − 1/(αγ) < β_π < 1 [figure omitted]
An Increase in the Inflation Target
Figure 3.4 illustrates what happens in the IS-MP-PC model when the central bank raises its inflation target. The increase in the inflation target shifts the IS-MP curve upwards, i.e. each level of output is associated with a higher level of inflation. However, because the IS-MP curve is steeper than the Phillips curve, the outcome is a reduction in inflation. Output also falls.

Even though this is exactly what our earlier equations predicted (the coefficient on the inflation target is 1 − θ, which is negative in this case), this seems like a very strange outcome. The central bank sets a higher inflation target and then inflation falls. Why is this?
The answer turns out to reflect the particular form of the monetary policy rule that we have assumed:

i_t = r* + π* + β_π(π_t − π*) (3.9)

You might expect that a higher inflation target would lead to the central bank setting a lower interest rate, i.e. that they would ease up to allow the economy to expand and let inflation move higher. However, if you look closely at this formula, you can see that an increase in the inflation target raises the interest rate when β_π < 1.
This can be explained as follows. The inflation target appears twice in equation (3.9). It appears with a negative sign inside the term β_π(π_t − π*), and if this were the only place that it appeared, then indeed a higher inflation target would lead to lower interest rates. However, the first part of the rule relates to setting the interest rate so that, when inflation equals its target, real interest rates equal their natural rate r*. The rule is set on the basis that if inflation is going to be higher on average, then the nominal interest rate also needs to be higher if real interest rates are to remain unchanged (this is commonly called the Fisher effect of inflation on interest rates).
Putting these two effects together, we see that an increase of x in the inflation target raises the nominal interest rate by x due to the Fisher-effect component and reduces it by β_π x due to inflation now falling below target. If β_π < 1, then the net effect is that the higher inflation target results in higher interest rates and thus lower output. This is the pattern shown in Figure 3.4.
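The arithmetic of the two channels can be checked in a couple of lines (Python, with assumed illustrative values r* = 2 and β_π = 0.75):

```python
# The two channels through which a higher inflation target moves the policy
# rate in rule (3.9), i_t = r_star + pi_star + beta_pi*(pi_t - pi_star).
# With inflation unchanged at pi_t, raising the target by x moves the rate by
# x (Fisher term) minus beta_pi*x (inflation is now further below target).
def policy_rate(pi_t, pi_star, r_star=2.0, beta_pi=0.75):
    return r_star + pi_star + beta_pi * (pi_t - pi_star)

pi_t = 2.0
before = policy_rate(pi_t, pi_star=2.0)
after = policy_rate(pi_t, pi_star=3.0)  # target raised by x = 1
print(after - before)  # (1 - beta_pi)*x = 0.25 > 0: rates rise when beta_pi < 1
```

With β_π > 1 the same calculation gives a negative number, which is why the strange outcome only arises when the Taylor principle fails.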
Figure 3.4: An Increase in π* when 1 − 1/(αγ) < β_π < 1 [figure omitted]
Evidence on Monetary Policy Rules and Macroeconomic Stability
Is there any evidence that obeying the Taylor principle provides greater macroeconomic stability?

An important paper on this topic was "Monetary Policy Rules and Macroeconomic Stability: Evidence and Some Theory" by Richard Clarida, Jordi Gali and Mark Gertler. These
economists reported that estimated policy rules for the Federal Reserve appeared to show a change after Paul Volcker was appointed Chairman in 1979. They estimated that post-1979 monetary policy appeared consistent with a rule in which the coefficient on inflation was greater than one, while pre-1979 policy seemed consistent with a rule in which this coefficient was less than one. They also introduced a small model in which the public adopts rational expectations (more on what this means later) and showed that failure to obey the Taylor principle can lead to the economy generating cycles based on self-fulfilling fluctuations. They argue that failure to obey the Taylor principle could have been responsible for the poor macroeconomic performance of the pre-Volcker period, featuring the stagflation combination of high inflation and high unemployment.
There are a number of differences between the approach taken in the Clarida, Gali and Gertler paper and these notes (in particular, their estimated policy rule is a forward-looking one in which policy reacts to expected future values of inflation and output) and the econometrics are perhaps more advanced than you have seen, but it's still a pretty readable paper and well worth a look.

That said, this being economics, there have been some dissenting voices on the Clarida, Gali and Gertler findings. One notable critique comes from Athanasios Orphanides.
Orphanides is critical of Taylor rule regressions that use measures of the output gap that are based on detrending data from the full sample. This includes information that wasn't available to policy-makers when they were formulating policy in real time, and so perhaps gives a misleading picture of the policy they were actually pursuing.

This point is particularly relevant for assessing monetary policy prior to 1979. During the 1970s, growth rates for major international economies slowed considerably. Policy-makers thought their economies were falling far short of their potential levels. In retrospect, it is clear that potential output growth rates were falling and true output gaps were small. Using real-time estimates that were available to the Fed at the time, Orphanides reports regressions which suggest that the 1970s Fed obeyed the Taylor principle with respect to reacting to inflation and that their mistake was over-reacting to mismeasured output gaps. See Athanasios Orphanides (2004), "Monetary Policy Rules, Macroeconomic Stability and Inflation: A View from the Trenches", Journal of Money, Credit and Banking, vol. 36(2).
Chapter 4
Liquidity Trap
Up to now, we have assumed that the central bank in our model economy sets its interest rate according to a specific policy rule. Whatever the rule says the interest rate should be, the central bank sets that interest rate. But what if the rule recommends setting a negative interest rate?

Well, let's step back a minute and think about what the interest rate i_t in our model
actually means. This interest rate appears in the IS curve because we think higher real
interest rates discourage consumption and investment spending by the private sector. So
the relevant interest rates here are interest rates that private sector agents borrow at. And
indeed, if you look at the interest rates that central banks target, they are generally short-term
money market interest rates on transactions featuring private sector agents on both sides of
the deal. For example, the Federal Reserve targets the federal funds rate, which is the rate at
which banks borrow and lend short-term funds to and from each other. The ECB similarly targets short-term money market interest rates in the euro area.
With this in mind, consider why these private sector interest rates are unlikely to ever be negative. If I loan you $100 and only get $101 back next period, I haven't earned much interest, but at least I earned some. A negative interest rate would mean me loaning you $100 and getting back less than that next year. Why would I do that? Since money maintains its nominal face value, I'd be better off just keeping the money in my bank account or under a mattress.
With this in mind, we are going to adapt our model to take into account that there are times when the central bank would like to set its interest rate below zero but is not able to do so.
When will the zero lower bound become a problem for a central bank? In our IS-MP-PC model, it depends on the form of the monetary policy rule. Up to now, we have been assuming the rule

i_t = r* + π* + β_π(π_t − π*) (4.1)

This rule sees the nominal interest rate adjusted upwards and downwards as inflation changes. So the lower bound problem occurs when inflation goes below some critical value. This value might be negative, so it may occur only when there is deflation, meaning prices are falling. Amending our model to remove the possibility that interest rates could become negative, our monetary policy rule becomes

i_t = Maximum [r* + π* + β_π(π_t − π*), 0] (4.2)
Because the intended interest rate of the central bank declines with inflation, this means that there is a particular inflation rate, π^ZLB, such that if π_t < π^ZLB then the interest rate will equal zero. So what determines this specific value π^ZLB that triggers the zero lower bound?
Algebraically, we can characterise π^ZLB as satisfying

r* + π* + β_π(π^ZLB − π*) = 0 (4.3)

Re-arranging, this gives

β_π π^ZLB = (β_π − 1)π* − r* (4.4)

π^ZLB = (1 − 1/β_π)π* − r*/β_π (4.5)

Equation (4.5) tells us that three factors determine the value of inflation at which the zero lower bound kicks in:
1. The inflation target: The higher the inflation target, the higher the level of inflation at which a central bank will be willing to set interest rates equal to zero.
2. The natural rate of interest: A higher value of r*, the natural real interest rate, lowers the level of inflation at which a central bank will be willing to set interest rates equal to zero. An increase in this rate makes the central bank set higher interest rates in general, so inflation must fall further than previously before the rule prescribes a zero rate.
3. The central bank's response to inflation: β_π appears twice in this formula. A higher β_π increases the first term, (1 − 1/β_π)π*, and makes the second term (which has a negative sign) smaller in magnitude. Both effects mean a higher β_π translates into a higher value for π^ZLB: central banks that react more aggressively to inflation prescribe deeper rate cuts as inflation falls, so they hit the zero bound at higher levels of inflation.
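A short numerical check of equation (4.5), using illustrative parameter values, confirms both the formula and the comparative statics for β_π:

```python
# Equation (4.5): the inflation rate at which the policy rule hits zero.
# All parameter values are illustrative assumptions.
def pi_zlb(pi_star, r_star, beta_pi):
    return (1.0 - 1.0 / beta_pi) * pi_star - r_star / beta_pi

threshold = pi_zlb(pi_star=2.0, r_star=2.0, beta_pi=1.5)
# Check it satisfies r_star + pi_star + beta_pi*(pi - pi_star) = 0:
rule_at_threshold = 2.0 + 2.0 + 1.5 * (threshold - 2.0)

print(round(threshold, 3))                        # -0.667: deflation is needed
print(abs(rule_at_threshold) < 1e-9)              # True: the rule is zero there
print(pi_zlb(2.0, 2.0, beta_pi=3.0) > threshold)  # True: higher beta_pi, higher threshold
```

With a 2 percent target and a 2 percent natural real rate, the zero bound only binds under mild deflation for this rule, matching the discussion above.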
The IS-MP Curve and the Zero Lower Bound
Given this characterisation of when the zero lower bound kicks in, we need to re-formulate the IS-MP curve. Once inflation falls below π^ZLB, the central bank cannot keep cutting interest rates in line with its monetary policy rule. Recall that the IS curve is

y_t = y*_t − α(i_t − π_t − r*) + ε^y_t (4.6)

We had previously derived the IS-MP curve by substituting the monetary policy rule formula (4.1) in for the i_t term. This gave us the IS-MP curve as:

y_t = y*_t − α(β_π − 1)(π_t − π*) + ε^y_t (4.7)

However, when π_t ≤ π^ZLB, we need to substitute in zero instead of the negative value that the rule would otherwise prescribe. This gives

y_t = y*_t + α(r* + π_t) + ε^y_t (4.8)
The effect of inflation on output in this revised IS-MP curve changes when inflation moves
below ZLB . Above ZLB , higher values of inflation are associated with lower values of output.
Below ZLB , higher values of inflation are associated with higher values of output. Graphically,
this means the IS-MP curve shifts from being downward-sloping to being upward-sloping when
inflation falls below ZLB . Figure 4.1 provides an example of how this looks.
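A minimal sketch of this kinked IS-MP relation follows. The functional forms are those of the model; all parameter values are illustrative assumptions:

```python
def is_mp_output(pi, y_star=100.0, alpha=1.0, beta_pi=1.5,
                 r_star=0.02, pi_star=0.02, eps_y=0.0):
    """Output implied by the IS-MP curve, with the kink at pi_zlb."""
    pi_zlb = pi_star - (r_star + pi_star) / beta_pi
    if pi > pi_zlb:
        # normal region: the policy rule operates, so the curve slopes down
        return y_star - alpha * (beta_pi - 1) * (pi - pi_star) + eps_y
    # zero lower bound region: i_t = 0, so the curve slopes up
    return y_star + alpha * (r_star + pi) + eps_y
```

The two branches agree exactly at π = π^ZLB, so the curve is continuous but kinked, as in Figure 4.1.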
Equation (4.8) also explains the conditions under which the zero lower bound is likely to be relevant. If there are no aggregate demand shocks, so ε^y_t = 0, then the zero lower bound is likely to kick in at a point where output is above its natural rate; this is the case illustrated in Figure 4.1. But this combination of high output and low inflation is unlikely to be an equilibrium in the model unless the public expects very low inflation or deflation, so that the Phillips curve intersects the IS-MP curve along the section that has output above its natural rate.

However, if we have a large negative aggregate demand shock, so that ε^y_t < 0, then it is possible to have output below its natural rate and inflation falling below π^ZLB. As illustrated in Figure 4.2, this situation is more likely to be an equilibrium (i.e. this position for the IS-MP curve is more likely to intersect with the Phillips curve) even if inflation expectations are close to the central bank's inflation target.
Figure 4.1: The IS-MP Curve with the Zero Lower Bound
[Figure: IS-MP curve (ε^y = 0) in inflation-output space.]

Figure 4.2: A Negative Aggregate Demand Shock
[Figure: IS-MP curves for ε^y = 0 and ε^y < 0, with the Phillips curve PC(π^e), in inflation-output space.]
The Liquidity Trap

Once the zero bound has been hit, output is given by

y_t = y*_t + α(r* + π_t) + ε^y_t   (4.9)

while inflation is still determined by the Phillips curve

π_t = π^e_t + γ(y_t - y*_t) + ε^π_t   (4.10)

Using the expression for the output gap when the zero lower bound limit has been reached from equation (4.9), we get an expression for inflation under these conditions as follows

π_t = π^e_t + γ(αr* + απ_t + ε^y_t) + ε^π_t   (4.11)

Solving for π_t, this becomes

π_t = (1/(1 - αγ))π^e_t + (αγ/(1 - αγ))r* + (γ/(1 - αγ))ε^y_t + (1/(1 - αγ))ε^π_t   (4.12)

The coefficient on expected inflation, 1/(1 - αγ), is greater than one. So, just as with the Taylor
principle example from the last notes, changes in expected inflation translate into even bigger
changes in actual inflation. As we discussed the last time, this leads to unstable dynamics.
Because these dynamics take place only when inflation has fallen below the zero lower bound,
the instability here relates to falling inflation expectations, leading to further declines in
inflation and further declines in inflation expectations. Because output depends positively
on inflation when the zero-bound constraint binds, these dynamics mean falling inflation (or deepening deflation) is accompanied by ever-larger falls in output.
This position in which nominal interest rates are zero and the economy falls into a deflationary spiral is known as the liquidity trap. Figures 4.3 and 4.4 illustrate how the liquidity trap operates in our model. Figure 4.3 shows how a large negative aggregate demand shock
can lead to interest rates hitting the zero bound even when expected inflation is positive.
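The deflationary spiral can be illustrated with a short simulation. This is a sketch of the dynamics implied by equation (4.12) with the shocks set to zero and adaptive expectations π^e_t = π_{t-1}; the parameter values (γ = 0.2, α = 1, r* = 0.02) are assumptions for illustration:

```python
# Sketch of the unstable dynamics at the zero bound (assumed parameters).
gamma, alpha, r_star = 0.2, 1.0, 0.02
coef = 1.0 / (1.0 - gamma * alpha)   # coefficient on expected inflation, > 1
pi = -0.03                            # start below the unstable point pi = -r*
path = [pi]
for _ in range(10):
    # pi_t = (pi_e + gamma*alpha*r*) / (1 - gamma*alpha), with pi_e = last pi
    pi = (pi + gamma * alpha * r_star) / (1.0 - gamma * alpha)
    path.append(pi)
print(path[0], path[-1])   # inflation keeps falling: a deflationary spiral
```

Because the coefficient on expected inflation exceeds one, a starting point below the steady state generates ever-deeper deflation.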
Figure 4.4 illustrates how expected inflation has a completely different effect when the
zero lower bound has been hit. It shows a fall in expected inflation after the negative demand
shock (this example isn't adaptive expectations because I haven't drawn inflation expectations
falling all the way to the deflationary outcome graphed in Figure 4.3). In our usual model
set-up, a fall in expected inflation raises output. However, once at the zero bound, a fall in expected inflation lowers output and worsens the slump.
Figure 4.3: Equilibrium At the Lower Bound
[Figure: IS-MP curves for ε^y = 0 and ε^y < 0, with the Phillips curve PC(π^e), in inflation-output space.]

Figure 4.4: Falling Expected Inflation Worsens Slump
[Figure: as Figure 4.3, with a second Phillips curve drawn for lower expected inflation.]
The Liquidity Trap with a Taylor Rule

For the simple monetary policy rule that we have been using, the zero lower bound is hit at a particular trigger level of inflation. Plugging reasonable parameter values into equation (4.5), this trigger value will most likely be negative. In other words, with the monetary policy rule that we have been using, the zero lower bound will only be hit when there is deflation.

However, if we have a different monetary policy rule, this result can be overturned. For example, remember the Taylor-type rule we considered in the first set of notes

i_t = r* + π* + β_π(π_t - π*) + β_y(y_t - y*_t)   (4.13)

Imposing the zero lower bound, the rule becomes

i_t = max[ r* + π* + β_π(π_t - π*) + β_y(y_t - y*_t), 0 ]   (4.14)

so the bound is hit whenever

r* + π* + β_π(π_t - π*) + β_y(y_t - y*_t) = 0   (4.15)

which can be re-arranged as

β_π(π_t - π*) + β_y(y_t - y*_t) = -(r* + π*)   (4.16)
In other words, there is a whole series of different combinations of inflation gaps and output gaps that can lead to monetary policy hitting the zero lower bound. For example, if y_t = y*_t, the lower bound will be hit at the value of inflation given by equation (4.5), i.e. the level we have defined as π^ZLB. In contrast, inflation could equal its target level but policy would hit the zero bound if output fell as low as y*_t - (r* + π*)/β_y.
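Equation (4.16) defines a line of output-inflation combinations at which the Taylor rule delivers a zero interest rate. A hedged sketch, with hypothetical parameter values:

```python
# Sketch of equation (4.16). All parameter values are hypothetical.
def boundary_output(pi, y_star=100.0, r_star=0.02, pi_star=0.02,
                    beta_pi=1.5, beta_y=0.5):
    """Output level at which i_t = 0 for a given inflation rate pi."""
    return y_star - ((r_star + pi_star) + beta_pi * (pi - pi_star)) / beta_y

# at pi = pi*, the bound binds once output falls to y* - (r* + pi*)/beta_y
print(boundary_output(0.02))
```

Pairs (π, y) on or below this downward-sloping line give a non-positive intended rate, i.e. they lie in the zero-bound area of Figure 4.5.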
Graphically, we can represent all the combinations of output and inflation that produce zero interest rates under the Taylor rule as the area under a downward-sloping line in inflation-output space. Figure 4.5 gives an illustration of what this area would look like. We showed in the first set of notes that when we are above the zero bound, the IS-MP curve under the Taylor rule is of the same downward-sloping form as under our simple inflation-targeting rule. At the zero bound, the arguments we've already presented here also apply, so that the IS-MP curve becomes upward-sloping.
Figure 4.6 illustrates two different cases of IS-MP curves when monetary policy follows a Taylor rule. The right-hand curve corresponds to the case ε^y_t = 0 (no aggregate demand shocks) and this curve only intersects with the zero bound area when there is a substantial deflation. In contrast, the left-hand curve corresponds to the case in which ε^y_t is highly negative (a large negative aggregate demand shock) and this curve intersects with the zero bound area even at levels of inflation that are positive and aren't much below the central bank's target.
Figure 4.5: Zero Bound is Binding in Blue Triangle Area
[Figure: inflation-output space; the zero lower bound area lies below a downward-sloping line.]

Figure 4.6: Zero Bound Can Be Hit With Positive Inflation
[Figure: two IS-MP curves under a Taylor rule (ε^y = 0 and ε^y < 0), with the zero bound area.]
The Liquidity Trap: Reversing Conventional Wisdom
An important aspect of this model of the liquidity trap is it shows that some of the predictions
that our model made (and which are now part of the conventional wisdom among monetary
Up to now we have seen that as long as the central bank maintains its inflation targets,
then the model with adaptive expectations predicts that deviations of the publics inflation
expectations from this target will be temporary and the economy will tend to converge back
towards its natural level of output. However, once interest rates have hit the zero bound,
this is no longer the case. Instead, the adaptive expectations model predicts the economy can
Similarly, our earlier model predicted that a strong belief from the public that the central
bank would keep inflation at target was helpful in stabilising the economy. However, once
you reach the zero bound, convincing the public to raise its inflation expectations (perhaps
The most obvious way that a liquidity trap can end is if there is a positive aggregate demand
shock that shifts the IS-MP curve back upwards so that the intersection with the Phillips
curve occurs at levels of output and inflation that get the economy out of the liquidity trap.
However, in reality, liquidity traps have often occurred during periods when there are
ongoing and persistent slumps in aggregate demand. For example, after decades of strong
growth, the Japanese economy went into a slump during the 1990s. Housing prices crashed
and businesses and households were hit with serious negative equity problems. This type of "balance sheet recession" doesn't necessarily reverse itself quickly. The result in Japan has been a persistent deflation since the mid-1990s (see Figure 4.7) combined with a weak economy. The Bank of Japan has set short-term interest rates close to zero throughout this period (see Figure 4.8).

Given that economies in liquidity traps tend not to self-correct with positive aggregate demand shocks from the private sector, governments can try to boost the economy by using fiscal policy to stimulate aggregate demand. Japan has used fiscal stimulus on various occasions during this period.
What about monetary policy? With its policy interest rates at zero, can a central bank
do any more to boost the economy? Debates on this topic have focused on two areas.
The first area relates to the fact that while the short-term interest rates that are controlled by central banks may be zero, that doesn't mean the longer-term rates that many people borrow at will equal zero. By signalling that they intend to keep short-term rates low for a long period of time, and perhaps by directly intervening in the bond market (i.e. quantitative easing), central banks may be able to reduce these longer-term rates.

The second area relates to inflation expectations. Our model tells us that output can be boosted when the economy is in a liquidity trap by raising inflation expectations. This acts to raise inflation (or reduce deflation) and this reduces real interest rates and boosts output.
As an academic, and during his early years as a member of the Federal Reserve Board of Governors (prior to becoming Chairman), Ben Bernanke advocated that the Bank of Japan adopt a price-level target, which would require a period of inflation above their target level of 1%. In a 2003 speech titled "Some Thoughts on Monetary Policy in Japan", he explained:

What I have in mind is that the Bank of Japan would announce its intention to restore the price level (as measured by some standard index of prices, such as the consumer price index excluding fresh food) to the value it would have reached if, instead of the deflation of the past five years, a moderate inflation of, say, 1 percent per year had occurred.
The Bank of Japan did not take Bernanke's advice. More recently, however, under pressure from the new Japanese government, the Bank of Japan has changed its inflation target from 1% per year to 2% per year. Though not as radical as the steps recommended by Bernanke and others, it is clear that the current Japanese authorities recognise the importance of raising inflation expectations.
Figure 4.7: Consumer Price Index in Japan
[Figure: time series chart.]

Figure 4.8: Short-Term Interest Rates in Japan
[Figure: time series chart.]
Are the US and Euro Area in Liquidity Traps?
In recent years, the US economy has been in a position that, in some ways, resembles the
position of the Japanese economy during its long liquidity trap period. The economic recovery that started in 2009 has been very weak by historical standards and unemployment has
remained fairly high. In response to this weakness, the Federal Reserve has kept its policy
rate (the federal funds rate) at close to zero since late 2008.
Has the US been in a liquidity trap and should the Fed have been considering more radical
policies such as signalling its intent to allow a temporary rise in inflation? Once Ben Bernanke
became Chairman of the Fed, he was not as keen to implement the ideas he recommended as
an academic. One argument that Bernanke advanced against providing price level guidance was that the US was not in a liquidity trap because inflation was still positive. However, as we've seen above, for a central bank that responds to deviations of output from its natural rate (and clearly the Fed does this), you can have liquidity-trap conditions even with positive inflation. The key feature of the liquidity trap is zero short-term rates, not deflation. And this feature has held in the U.S. for a number of years.
Why did Bernanke not adopt the policy he had recommended to the Japanese? The explanation seems to be that he was concerned that non-standard policies would undermine the Fed's credibility on inflation. In his words:

I guess the question is, does it make sense to actively seek a higher inflation rate in order to achieve a slightly increased pace of reduction in the unemployment rate? The view of the Committee is that that would be very reckless. We have spent 30 years building up credibility for low and stable inflation, which has proved extremely valuable in that we've been able to take strong accommodative actions in the last four or five years to support the economy without leading to an unanchoring of inflation expectations. To risk that asset for what I think would be quite tentative and perhaps doubtful gains on the real side would be unwise.
This suggests that Chairman Bernanke was still more focused on the benefits of well-anchored low inflation expectations during normal times than on the potential benefits of non-standard policies at the zero bound.

Nobel prize winner Paul Krugman, Bernanke's former colleague at Princeton, was critical of this stance, writing an article titled "Chairman Bernanke Should Listen to Professor Bernanke".1 From his research on Japan in the late 1990s, Krugman has discussed the tension that central bankers feel when in a liquidity trap. When up against the zero bound, they might like to raise inflation expectations but then they are concerned that this could make inflation go higher than they would like. The public's awareness that the central bank will clamp down on inflation if the economy picks up then prevents there being a sufficient increase in inflation rates to get the economy out of the liquidity trap. Krugman thus stresses the need for central banks facing a liquidity trap to commit to a temporary period of inflation being higher than they would normally like. But central bankers are a conservative crowd and even temporary irresponsibility does not come easy to them.
The Fed has adopted a number of new policies in recent years such as quantitative easing and enhanced forward guidance in which they signal that rates will stay low for a long period. More recently, they have discussed conditions under which they will reduce their QE purchases and outlined the conditions under which they will keep rates at zero. So there are signs that the Fed is willing to be flexible. Time will tell if the debate about price-level targeting or raising the target inflation rate produces a change in policy at the Fed. With many unconventional tools having been introduced over the past few years, it will be interesting to see if any of the leading central banks will consider moving away from narrow inflation targeting.

1 This article is at http://www.nytimes.com/2012/04/29/magazine/chairman-bernanke-should-listen-to-professor-bernanke.html
Short-term interest rates in the euro area are also now close to zero. While Mario Draghi argues there is no cause for concern because there is no deflation as of yet, inflation is extremely low and the ECB has effectively run out of room for influencing the economy through the regular interest rate channel. So the euro area also appears to resemble the liquidity trap situation we have described here. Given the ECB's legal requirement to place price stability above all other goals, it seems very unlikely to adopt the kind of policies that can take an economy out of a liquidity trap.
Figure 4.9: The Fed Has Been at the Lower Bound Since 2008
[Figure: time series chart.]

Figure 4.10: The ECB Has Also Hit the Lower Bound
[Figure: time series chart.]
Part II
Rational Expectations
Chapter 5

Rational Expectations and Asset Prices
One of the things we've focused on is how people formulate expectations about inflation. We
put forward one model of how these expectations were formulated, an adaptive expectations
model in which people formulated their expectations by looking at past values for a series.
Over the next few weeks, we will look at an alternative approach that macroeconomists call
rational expectations. This approach is widely used in macroeconomics and we will cover
its application to models of asset prices, consumption and other macroeconomic variables.
Almost all economic transactions rely crucially on the fact that the economy is not a one-period game. In the language of macroeconomists, most economic decisions have an intertemporal element to them. For example:

We accept cash in return for goods and services because we know that, in the future, this cash can be turned into goods and services for ourselves.

You don't empty out your bank account today and go on a big splurge because you're still going to be around tomorrow and will have consumption needs then.

Conversely, sometimes you spend more than you're earning because you can get a bank loan in anticipation of earning more in the future, and paying the loan off then.

Similarly, firms will spend money on capital goods like trucks or computers largely in anticipation of the income this spending will help to generate in the future.
Another key aspect of economic transactions is that they generally involve some level of uncertainty, so we don't always know what's going to happen in the future. Take two of the examples just given. While it is true that one can accept cash in anticipation of turning it into goods and services in the future, uncertainty about inflation means that we can't be sure of the exact quantity of these goods and services. Similarly, one can borrow in anticipation of higher income at a later stage, but few people can be completely certain of their future incomes.

For these reasons, people have to make most economic decisions based on their subjective expectations of future economic variables: in accepting cash, we must have some expectation of future values of inflation; in taking out a bank loan, we must have some expectation of our
future income. These expectations will almost certainly turn out to have been incorrect to
some degree, but one still has to formulate them before making these decisions.
So, a key issue in macroeconomic theory is how people formulate expectations of economic variables in the presence of uncertainty. Prior to the 1970s, this aspect of macro theory was largely ad hoc. Different researchers took different approaches, but generally it was assumed that agents used some simple extrapolative rule whereby the expected future value of a variable was close to some weighted average of its recent past values. However, such models were widely criticised in the 1970s by economists such as Robert Lucas and Thomas Sargent. Lucas and Sargent instead promoted the use of an alternative approach which they called "rational expectations". This approach had been introduced in an important 1961 paper by John Muth.
The idea that agents' expectations are somehow "rational" has various possible interpretations. However, Muth's concept of rational expectations meant two very specific things:

They use publicly available information in an efficient manner. Thus, they do not make systematic mistakes when formulating expectations.

They understand the structure of the model economy and base their expectations of variables on this structure.

This is a natural assumption in economic models: we routinely assume that agents behave in an optimal fashion, so why would we assume that the agents don't understand the structure of the economy, and formulate expectations in some sub-optimal fashion? That said, rational expectations models generally produce quite strong predictions, and these can be tested. Ultimately, any assessment of a rational expectations model must be based upon how well these predictions fit the data.
We will start with some terminology to explain how we will represent expectations. Suppose our model economy features uncertainty, so that people do not know what is going to happen in the future. Then we will write E_t Z_{t+2} to mean the expected value the agents in the economy have at time t for what Z is going to be at time t + 2. In other words, we assume people have a distribution of potential outcomes for Z_{t+2} and E_t Z_{t+2} is the mean of this distribution. Sometimes this notation comes with a qualifier explaining that we are dealing with people's prior expectations of Z_{t+2} rather than its realised value, but the meaning should be clear from the context here.
Throughout these notes, we will use some basic properties of the expected value of distributions. Specifically, we will use the fact that the expected value is what is known as a linear operator. Some examples of this are the following. The expected value of five times a series equals five times the expected value of the series. And the expected value of the sum of two series equals the sum of the two expected values. We will use these properties a lot, so I won't be stopping all the time to explain that they are being used.
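Written out, the linearity properties just described are:

```latex
\[
  E_t\,(5 Z_{t+1}) = 5\,E_t Z_{t+1},
  \qquad
  E_t\,(X_{t+1} + Z_{t+1}) = E_t X_{t+1} + E_t Z_{t+1},
\]
and, more generally, for any constants $a$ and $b$,
\[
  E_t\,(a X_{t+1} + b Z_{t+1}) = a\,E_t X_{t+1} + b\,E_t Z_{t+1}.
\]
```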
Asset Prices
The first class of rational expectations models that we will look at relates to the determination of
asset prices. Asset prices are an increasingly important topic in macroeconomics. Movements
in asset prices affect the wealth of consumers and businesses and have an important influence
on spending decisions. In addition, while most of the global recessions that preceded the year
2000 were due to boom and bust cycles involving inflation getting too high and central banks
slowing the economy to constrain it, the most recent two global recessions (the dot com recession of 2000/01 and the great recession of 2008/09) were triggered by big declines in asset prices following earlier large increases. A framework for discussing these movements is therefore an important part of modern macroeconomics.
In these notes, we will start by considering the case of an asset that can be purchased today for price P_t and which yields a dividend of D_t. While this terminology obviously fits with the asset being a share of equity in a firm and D_t being the dividend payment, it could also be a house and D_t could be the net return from renting this house out, i.e. the rent minus any costs incurred in maintenance or management fees. If this asset is sold tomorrow for price P_{t+1}, then the rate of return on the asset is

r_{t+1} = (D_t + P_{t+1} - P_t)/P_t   (5.4)

This rate of return has two components: the first reflects the dividend received during the period the asset was held, and the second reflects the capital gain (or loss) due to the price of the asset changing from period t to period t + 1. This can also be written in terms of the so-called gross return, which is just one plus the rate of return:

1 + r_{t+1} = (D_t + P_{t+1})/P_t   (5.5)
A useful re-arrangement of this equation that we will repeatedly work with is the following:

P_t = D_t/(1 + r_{t+1}) + P_{t+1}/(1 + r_{t+1})   (5.6)
We will now consider a rational expectations approach to the determination of asset prices. Rational expectations means investors understand equation (5.6) and that all expectations of the future variables appearing in it are formulated rationally, so that

P_t = E_t [ D_t/(1 + r_{t+1}) + P_{t+1}/(1 + r_{t+1}) ]   (5.8)

where E_t means the expectation of a variable formulated at time t. The stock price at time t is thus determined by expectations of future dividends and prices.

A second assumption that we will make for the moment is that the expected return on assets equals some constant value r for all future periods, unrelated to the dividend process. One way to think of this is that there is a required return, determined perhaps by the rate of return available on some other asset, which this asset must deliver. With this assumption in hand, and assuming that everyone knows the value of D_t, equation (5.8) can be re-written as

P_t = D_t/(1 + r) + E_t P_{t+1}/(1 + r)   (5.10)
The Repeated Substitution Method

It is worth writing down the general approach to solving these equations, rather than just focusing only on our current asset price example. In general, this type of equation, known as a first-order stochastic difference equation,1 can be written as

y_t = a x_t + b E_t y_{t+1}   (5.11)

Its solution is derived using a technique called repeated substitution. This works as follows. Equation (5.11) holds in all periods, so under the assumption of rational expectations, the agents in the economy understand the equation and formulate their expectation of y_{t+1} in the same way:

E_t y_{t+1} = a E_t x_{t+1} + b E_t E_{t+1} y_{t+2}

Note that this last term (E_t E_{t+1} y_{t+2}) should simplify to E_t y_{t+2}: it would not be rational if you expected that next period you would have a higher or lower expectation for y_{t+2}, because that implies you already have some extra information and are not using it. This is known as the law of iterated expectations. Substituting this expression into equation (5.11), and then repeating the method by substituting in for E_t y_{t+2}, and then E_t y_{t+3} and so on, we get a solution of the form

y_t = a x_t + a b E_t x_{t+1} + a b^2 E_t x_{t+2} + .... + a b^{N-1} E_t x_{t+N-1} + b^N E_t y_{t+N}   (5.15)

1 Stochastic means random or incorporating uncertainty. It applies to this equation because agents do not know future values of x_t and y_t with certainty.
which can be written in more compact form as

y_t = a Σ_{k=0}^{N-1} b^k E_t x_{t+k} + b^N E_t y_{t+N}   (5.16)
For those of you unfamiliar with the summation sign terminology, summation signs work like this:

Σ_{k=0}^{2} z_k = z_0 + z_1 + z_2   (5.17)

Σ_{k=0}^{3} z_k = z_0 + z_1 + z_2 + z_3   (5.18)

Σ_{k=0}^{4} z_k = z_0 + z_1 + z_2 + z_3 + z_4   (5.19)

and so on.
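The solution formula can be verified numerically. The sketch below uses assumed values a = 1, b = 0.5 and a constant x path; with a deterministic path the expectations are trivial, so backward recursion on y_t = a·x_t + b·y_{t+1} should match the sum a·Σ_k b^k x_{t+k}:

```python
a, b, N = 1.0, 0.5, 50   # illustrative values, not from the text
x = [1.0] * N

# backward recursion with terminal condition y_N = 0
y = 0.0
for t in reversed(range(N)):
    y = a * x[t] + b * y

# direct evaluation of the truncated sum
direct = a * sum(b ** k * x[k] for k in range(N))
print(y, direct)   # both approach a/(1-b) = 2 as N grows
```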
Comparing equations (5.10) and (5.11), we can see that our asset price equation is a specific case of this general first-order stochastic difference equation, with

y_t = P_t   (5.20)

x_t = D_t   (5.21)

a = 1/(1 + r)   (5.22)

b = 1/(1 + r)   (5.23)

Substituting these values into the general solution (5.16) gives

P_t = Σ_{k=0}^{N-1} (1/(1 + r))^{k+1} E_t D_{t+k} + (1/(1 + r))^N E_t P_{t+N}   (5.24)
Another assumption usually made is that this final term tends to zero as N gets big:

lim_{N→∞} (1/(1 + r))^N E_t P_{t+N} = 0   (5.25)
What is the logic behind this assumption? One explanation is that if it did not hold, then we could set all future values of D_t equal to zero and the asset price would still be positive. But an asset that never pays out should be inherently worthless, so this condition rules this possibility out. Imposing it and letting N go to infinity, the solution becomes

P_t = Σ_{k=0}^{∞} (1/(1 + r))^{k+1} E_t D_{t+k}   (5.26)

This equation, which states that asset prices should equal a discounted present-value sum of expected future dividends, is known as the dividend-discount model.
The repeated substitution solution is really important to understand, so let me try to explain it without equations. Suppose I told you that the right way to price a stock was as follows:

Today's stock price should equal today's dividend plus half of tomorrow's expected stock price.

Now suppose it's Monday. Then that means the right formula should be

Monday's stock price should equal Monday's dividend plus half of Tuesday's expected stock price.

But the same formula should apply on Tuesday:

Tuesday's stock price should equal Tuesday's dividend plus half of Wednesday's expected stock price.

If people had rational expectations, then Monday's stock price would equal

Monday's dividend plus half of Tuesday's expected dividend plus one-quarter of Wednesday's expected stock price.

And being consistent about it, factoring in what Wednesday's stock price should be, you'd get that Monday's price equals

Monday's dividend plus half of Tuesday's expected dividend plus one-quarter of Wednesday's expected dividend plus one-eighth of Thursday's expected stock price

and so on.
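The Monday story can be turned into arithmetic. The dividend numbers below are made up purely for illustration:

```python
# price_today = dividend_today + 0.5 * expected_price_tomorrow, solved forward
dividends = [1.0, 2.0, 4.0, 0.0, 0.0]   # Monday, Tuesday, Wednesday, ...
price = 0.0
for d in reversed(dividends):
    price = d + 0.5 * price
print(price)   # Monday's price = 1 + 0.5*2 + 0.25*4 = 3.0
```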
A useful special case that is often used as a benchmark for thinking about stock prices is the case in which dividend payments are expected to grow at a constant rate g, such that

E_t D_{t+k} = (1 + g)^k D_t   (5.27)

In this case, the dividend-discount model predicts that the stock price should be given by

P_t = (D_t/(1 + r)) Σ_{k=0}^{∞} ((1 + g)/(1 + r))^k   (5.28)
Now, remember the old multiplier formula, which states that as long as 0 < c < 1, then

1 + c + c^2 + c^3 + .... = Σ_{k=0}^{∞} c^k = 1/(1 - c)   (5.29)

This geometric series formula gets used a lot in modern macroeconomics, not just in examples involving the multiplier. Here we can use it as long as (1 + g)/(1 + r) < 1, i.e. as long as r (the expected return on the stock market) is greater than g (the growth rate of dividends). We will assume that this is the case.
Applying it, we get

P_t = (D_t/(1 + r)) · 1/(1 - (1 + g)/(1 + r))   (5.30)

 = (D_t/(1 + r)) · (1 + r)/(1 + r - (1 + g))   (5.31)

 = D_t/(r - g)   (5.32)

When dividend growth is expected to be constant, prices are a multiple of current dividend payments, where that multiple depends positively on the expected future growth rate of dividends and negatively on the expected future rate of return on stocks. This formula is often called the Gordon growth model, after the economist who popularized it.2 It is often used as a benchmark for assessing whether an asset is above or below the fair value implied by rational expectations. Valuations are often expressed in terms of dividend-price ratios, and the Gordon model implies a constant dividend-price ratio of

D_t/P_t = r - g   (5.33)
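A quick numerical sketch of the Gordon formula; the dividend, return and growth numbers are made up for illustration:

```python
def gordon_price(dividend, r, g):
    """Gordon growth model: P = D/(r - g), valid only when r > g."""
    if r <= g:
        raise ValueError("requires expected return r above dividend growth g")
    return dividend / (r - g)

p = gordon_price(2.0, 0.07, 0.02)
print(p, 2.0 / p)   # the implied dividend-price ratio equals r - g
```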
A more flexible way to formulate expectations about future dividends is to assume that dividends are the sum of a trend component and a cyclical component:

D_t = c(1 + g)^t + u_t   (5.34)

u_t = ρ u_{t-1} + ε_t   (5.35)

2 The formula appeared in Myron Gordon's 1962 book The Investment, Financing and Valuation of the Corporation.
These equations state that dividends are the sum of two processes: the first grows at rate g each period; the second, u_t, measures a cyclical component of dividends, and this follows what is known as a first-order autoregressive process (AR(1) for short). Here ε_t is a zero-mean random shock term. Over large samples, we would expect u_t to have an average value of zero, but deviations from zero will be more persistent the higher is the value of the parameter ρ.

We will now derive the dividend-discount model's predictions for stock prices when dividends follow this process. Let's split the present-value sum into two. First, the trend component:
Σ_{k=0}^{∞} (1/(1 + r))^{k+1} E_t c(1 + g)^{t+k} = (c(1 + g)^t/(1 + r)) Σ_{k=0}^{∞} ((1 + g)/(1 + r))^k   (5.37)

 = (c(1 + g)^t/(1 + r)) · 1/(1 - (1 + g)/(1 + r))   (5.38)

 = (c(1 + g)^t/(1 + r)) · (1 + r)/(1 + r - (1 + g))   (5.39)

 = c(1 + g)^t/(r - g)   (5.40)
Next, the cyclical component. Because u_t is an AR(1) process, E_t u_{t+k} = ρ^k u_t, so

Σ_{k=0}^{∞} (1/(1 + r))^{k+1} E_t u_{t+k} = (u_t/(1 + r)) Σ_{k=0}^{∞} (ρ/(1 + r))^k

 = (u_t/(1 + r)) · 1/(1 - ρ/(1 + r))   (5.45)

 = (u_t/(1 + r)) · (1 + r)/(1 + r - ρ)   (5.46)

 = u_t/(1 + r - ρ)   (5.47)

Putting the trend and cyclical components together, the stock price is

P_t = c(1 + g)^t/(r - g) + u_t/(1 + r - ρ)   (5.48)
In this case, stock prices don't just grow at a constant rate. Instead they depend positively on the cyclical component of dividends, u_t, and the more persistent are these cyclical deviations (the higher is ρ), the larger is their effect on stock prices. To give a concrete example, suppose r = 0.1. Then with ρ = 0.9 we have

1/(1 + r - ρ) = 1/(1.1 - 0.9) = 5   (5.49)

while with ρ = 0.6 we have

1/(1 + r - ρ) = 1/(1.1 - 0.6) = 2   (5.50)

so a unit increase in the cyclical component of dividends raises the price by five in the first case but only by two in the second.
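The price formula with trend and cyclical components, and the two persistence examples, can be reproduced in a few lines (r = 0.1 as in the text; the other inputs are illustrative assumptions):

```python
def ddm_price(c, g, r, rho, t, u_t):
    """Trend-plus-cycle dividend-discount price: c(1+g)^t/(r-g) + u_t/(1+r-rho)."""
    return c * (1 + g) ** t / (r - g) + u_t / (1 + r - rho)

r = 0.1
for rho in (0.9, 0.6):
    # effect on the price of a unit cyclical dividend shock
    print(rho, round(1 / (1 + r - rho), 6))
```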
Note also that when taking averages over long periods of time, the u components of dividends and prices will average to zero. Thus, over longer averages, the Gordon growth model would be approximately correct, even though the dividend-price ratio isn't always constant. Instead, prices would tend to be temporarily high relative to dividends during periods when dividends are expected to grow at above-average rates for a while, and would be temporarily low when dividend growth is expected to be below average for a while. This is why the Gordon formula is normally seen as a guide to long-run average valuations rather than as a description of valuations at every point in time.
Unpredictability of Stock Returns
The dividend-discount model has some very specific predictions for how stock prices should change over time. It implies that the change in prices from period t to period t + 1 should be

P_{t+1} - P_t = Σ_{k=0}^{∞} (1/(1 + r))^{k+1} E_{t+1} D_{t+k+1} - Σ_{k=0}^{∞} (1/(1 + r))^{k+1} E_t D_{t+k}   (5.51)

Taking away the summation signs and writing this out in long form, it looks like this:

P_{t+1} - P_t = [ (1/(1 + r)) D_{t+1} + (1/(1 + r))^2 E_{t+1} D_{t+2} + (1/(1 + r))^3 E_{t+1} D_{t+3} + .... ]
 - [ (1/(1 + r)) D_t + (1/(1 + r))^2 E_t D_{t+1} + (1/(1 + r))^3 E_t D_{t+2} + ... ]   (5.52)
We can re-arrange this equation in a useful way by grouping together each of the two terms that involve D_{t+1}, D_{t+2}, D_{t+3} and so on. (There is only one term involving D_t.) This can be written as follows:

P_{t+1} - P_t = -(1/(1 + r)) D_t + [ (1/(1 + r)) D_{t+1} - (1/(1 + r))^2 E_t D_{t+1} ]
 + [ (1/(1 + r))^2 E_{t+1} D_{t+2} - (1/(1 + r))^3 E_t D_{t+2} ]
 + [ (1/(1 + r))^3 E_{t+1} D_{t+3} - (1/(1 + r))^4 E_t D_{t+3} ] + ....   (5.53)
This equation identifies three reasons why prices change from period t to period t + 1:

P_{t+1} differs from P_t because it does not take into account D_t: this dividend has been paid now and has no influence any longer on the price at time t + 1. This is the first term in equation (5.53).

P_{t+1} applies a smaller discount rate to future dividends because we have moved forward one period in time, e.g. it discounts D_{t+1} by 1/(1 + r) instead of (1/(1 + r))^2.
People formulate new expectations for the future path of dividends, e.g. E_t D_{t+2} is gone and has been replaced by E_{t+1} D_{t+2}.

In general, the first two items should not be too important. A single dividend payment being made shouldn't have too much impact on a stock's price, and the discount rate shouldn't change too much over a single period (e.g. if r is relatively small, then 1/(1 + r) and (1/(1 + r))^2 shouldn't be too different). This means that changing expectations about future dividends should be the main source of changes in stock prices.
In fact, it turns out there is a very specific result linking the behaviour of stock prices with changing expectations. Ultimately, it is not stock prices, per se, that investors are interested in. Rather, they are interested in the combined return incorporating both price changes and dividend payments, as described by equation (5.4). It turns out that the price change in equation (5.53) can be re-expressed as

P_{t+1} - P_t = -D_t + r P_t + Σ_{k=1}^{∞} (1/(1 + r))^k (E_{t+1} D_{t+k} - E_t D_{t+k})   (5.54)

Recalling the definition of the one-period return on a stock from equation (5.4), this return can be written as

r_{t+1} = (D_t + P_{t+1} - P_t)/P_t = r + [ Σ_{k=1}^{∞} (1/(1 + r))^k (E_{t+1} D_{t+k} - E_t D_{t+k}) ] / P_t   (5.55)
This is a very important result. It tells us that, if the dividend-discount model is correct, then the rate of return on stocks depends on how people change their minds about what they expect to happen to dividends in the future: the E_{t+1} D_{t+k} - E_t D_{t+k} terms on the right-hand side of equation (5.55) describe the difference between what people expected at time t + 1 for the dividend at time t + k and what they expected for this same dividend payment at time t.
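As a check on this decomposition, here is a toy simulation in which every number is an illustrative assumption: dividends are a constant c plus an AR(1) cycle u_t, prices follow the dividend-discount formula P_t = c/r + u_t/(1 + r - ρ), and realised returns are computed from the definition in equation (5.4). Conditional on time-t information, the expected return is exactly r, so the sample average should sit close to r:

```python
# Toy simulation: realised returns = r plus unforecastable dividend news.
import random

random.seed(0)
r, rho, c = 0.05, 0.8, 1.0
u = 0.0
prices, divs = [], []
for t in range(20000):
    divs.append(c + u)                        # dividend: constant trend + cycle
    prices.append(c / r + u / (1 + r - rho))  # dividend-discount price
    u = rho * u + random.gauss(0.0, 0.1)      # next period's cyclical shock
returns = [(divs[t] + prices[t + 1] - prices[t]) / prices[t]
           for t in range(len(prices) - 1)]
avg = sum(returns) / len(returns)
print(round(avg, 3))   # average realised return should be close to r
```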
Importantly, if we assume that people formulate rational expectations, then the return on stocks should be unpredictable. This is because, if we could tell in advance how people were going to change their expectations of future events, then that would mean people had not been using information in an efficient manner. So, with rational expectations, the term in the summation sign in equation (5.55) must be zero on average and must reflect news that could not have been forecasted at time t. The only predictable part of the return is the constant r; everything else is unforecastable news.
One small warning about this result. It is often misunderstood as a prediction that stock prices (rather than stock returns) should be unpredictable. This is not the case. The series that should be unpredictable is the total stock return including the dividend payment. Indeed, the model predicts that a high dividend payment at time t lowers stock prices at time t + 1. Consider, for example, a firm that promises to make a huge dividend payment next month but says it won't make any payments after this for a long time. In that case, we would expect the price of the stock to fall after the dividend is paid. This shows that, even with rational expectations, stock price movements can sometimes be predictable. Because dividend payments are only made on an occasional basis, this prediction can be tested, and various studies have indeed found so-called "ex-dividend" effects whereby a stock price falls after a dividend has been paid.
Evidence on Predictability and Efficient Markets
The theoretical result that stock returns should be unpredictable was tested in a series of
empirical papers in the 1960s and 1970s, most notably by University of Chicago professor
Eugene Fama and his co-authors. Fama's famous 1970 paper, "Efficient Capital Markets: A Review of Theory and Empirical Work", reviewed much of this work. This literature came to
a clear conclusion that stock returns did seem to be essentially unpredictable. The idea that
you could not make easy money by timing the market entered public discussion, with Burton Malkiel's famous 1973 book, A Random Walk Down Wall Street, being particularly influential. A random walk is a series whose changes cannot be forecasted, and rational expectations imply that total stock returns should behave in roughly this way.
The work of Fama and his co-authors was very important in establishing key facts about
how financial markets work. One downside to this research, though, was the introduction
of a terminology that proved confusing. Fama's 1970 paper describes financial markets as being "efficient" if they "fully reflect all available information". In general, the researchers
contributing to this literature concluded financial markets were efficient because stock returns
were difficult to forecast. However, this turned out to be a bit of a leap. It is certainly true
that if stock prices incorporate all available information in the rational manner described
above, then returns should be hard to forecast. But the converse doesn't necessarily apply: showing that it was difficult to forecast stock returns turned out not to be the same thing as showing that markets were efficient in this sense.
Robert Shiller on Excess Volatility
The idea that financial markets were basically efficient was widely accepted in the economics
profession by the late 1970s. Then, a Yale economist in his mid-thirties, Robert Shiller,
dropped something of a bombshell on the finance profession. Shiller showed that the dividend-
discount model beloved of finance academics completely failed to match the observed volatility
of stock prices.3 Specifically, stock prices were much more volatile than could be justified by
the model.
To understand Shiller's basic point, we need to take a step back and think about some basic concepts relating to the formulation of expectations. First note that the ex post outcome for any variable can be expressed as the sum of its ex ante value expected by somebody and the unexpected component (i.e. the amount by which that person's expectation was wrong):
X_t = E_{t-1} X_t + \epsilon_t \qquad (5.56)
From statistics, we know that the variance of the sum of two variables equals the sum of their
two variances plus twice their covariance. This means that the variance of X_t can be described by

Var\left(X_t\right) = Var\left(E_{t-1} X_t\right) + Var\left(\epsilon_t\right) + 2\, Cov\left(E_{t-1} X_t, \epsilon_t\right) \qquad (5.57)
Now note that this last covariance term (between the surprise element \epsilon_t and the ex ante expectation E_{t-1} X_t) should equal zero if expectations are fully rational. If there was a correlation (for instance, such that a low value of the expectation tended to imply a high value for the error) then this would mean that you could systematically construct a better
3 "Do Stock Prices Move Too Much to be Justified by Subsequent Changes in Dividends?", American Economic Review, 1981.
forecast once you had seen the forecast that was provided. For example, if a low forecasted
value tended to imply a positive error then you could construct a better forecast by going for
a higher figure. But this contradicts the idea that investors have rational expectations and use all available information efficiently. With the covariance term equal to zero, the variance of the observed series must equal the variance of the ex ante expectation plus the variance of the unexpected component:

Var\left(X_t\right) = Var\left(E_{t-1} X_t\right) + Var\left(\epsilon_t\right) \qquad (5.58)

Provided there is uncertainty, so there is some variance in the unexpected component, this implies

Var\left(X_t\right) > Var\left(E_{t-1} X_t\right) \qquad (5.59)

In other words, the variance of the ex post outcome should be higher than the variance of the ex ante expectation.
This reasoning has implications for the predicted volatility of stock prices. Equation (5.26) says that stock prices are an ex ante expectation of a discounted sum of future dividends. Shiller's observation was that rational expectations should imply that the variance of stock prices be less than the variance of the present value of subsequent dividend movements:

Var\left(P_t\right) < Var\left[\sum_{k=0}^{\infty}\left(\frac{1}{1+r}\right)^{k+1} D_{t+k}\right] \qquad (5.60)
A check on this calculation, using a wide range of possible values for r, reveals that this
inequality does not hold: Stocks are actually much more volatile than suggested by realized
movements in dividends.4
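The variance logic can be illustrated with a small simulation. The sketch below is my own illustration (not Shiller's actual calculation): dividends follow an AR(1) process, the rational-expectations price is the discounted sum of expected dividends (which has the closed form P_t = \mu/r + (D_t - \mu)/(1 + r - \rho) for this process), and the ex-post realized present value of dividends turns out to be more volatile than the price, exactly as rational expectations requires. Shiller's point was that in the data this ordering is reversed.

```python
import random
from statistics import pvariance

random.seed(42)

r, rho, mu = 0.05, 0.9, 10.0     # discount rate, dividend persistence, mean dividend
beta = 1.0 / (1.0 + r)
T, H = 2000, 200                 # sample size and truncation horizon for the ex-post sum

# Simulate AR(1) dividends: D_{t+1} = mu + rho*(D_t - mu) + eps_{t+1}
D = [mu]
for _ in range(T + H):
    D.append(mu + rho * (D[-1] - mu) + random.gauss(0.0, 1.0))

# Rational-expectations price: P_t = sum_k beta^(k+1) E_t D_{t+k} = mu/r + (D_t - mu)/(1+r-rho)
P = [mu / r + (d - mu) / (1.0 + r - rho) for d in D[:T]]

# Ex-post realized present value of the next H dividends (the truncated tail is negligible)
P_star = [sum(beta ** (k + 1) * D[t + k] for k in range(H)) for t in range(T)]

# Under rational expectations the price is a forecast of P_star, so it must be less volatile
print(pvariance(P), pvariance(P_star))
```

With these (arbitrary) parameter values, the realized present value is noticeably more volatile than the price series.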
4 While technically the infinite sum of dividends can't be calculated because we don't have data going past the present, Shiller filled in all terms after the end of his sample based on plausible assumptions, and the conclusion is not sensitive to how this is done.
Figure 5.1 on the next page reproduces the famous graph from Shiller's 1981 paper, showing actual stock prices (the solid line) moving around much more over time than his estimate of the discounted present value of subsequent dividends.
Longer-Run Predictability
We saw earlier that the dividend-discount model predicts that when the ratio of dividends to
prices is low, this suggests that investors are confident about future dividend growth. Thus,
a low dividend-price ratio should help to predict higher future dividend growth. Shiller's volatility research pointed out, however, that there appears to be a lot of movements in stock prices that never turn out to be fully justified by later changes in dividends. In fact, later research went a good bit further. For example, Campbell and Shiller (2001) show that over longer periods, dividend-price ratios are of essentially no use at all in forecasting future dividend growth.5 In fact, a high ratio of prices to dividends, instead of forecasting high growth in dividends, tends to forecast lower future returns on the stock market, albeit with a relatively low R-squared.
This last finding seems to contradict Fama's earlier conclusions that it was difficult to forecast stock returns, but these results turn out to be compatible with both those findings and the volatility results. Fama's classic results on predictability focused on explaining short-run stock returns, e.g. can we use data from this year to forecast next month's stock returns? However, the form of predictability found by Campbell and Shiller (and indeed a number of earlier studies) related to predicting average returns over multiple years. It turns out that an inability to find short-run predictability is not the same thing as an inability to find longer-run predictability.
To understand this, we need to develop some ideas about forecasting time series. Consider the first-order autoregressive, AR(1), process

y_t = \rho y_{t-1} + \epsilon_t \qquad (5.61)
5
NBER Working Paper No. 8221.
where \epsilon_t is a random and unpredictable noise process with a zero mean. If \rho = 1 then the process can be written as

y_t - y_{t-1} = \epsilon_t \qquad (5.62)

so the series is what we described earlier as a random walk, a process whose changes cannot be predicted. Suppose, however, that \rho was close to but a bit less than one, say \rho = 0.99.
Now suppose you wanted to assess whether you could forecast the change in the series based
on last period's value of the series. You could run a regression of the change in y_t on last period's value of the series. The true coefficient in this relationship is \rho - 1 = -0.01, with \epsilon_t being the random error. This coefficient of -0.01 is so close to zero that you will probably be unable to reject that the true coefficient is zero unless you have far more data than economists usually have.
But what if you were looking at forecasting changes in the series over a longer time-horizon?
To understand why this might be different, we can do another repeated substitution trick.
The series yt depends on its lagged value, yt1 and a random shock. But yt1 in turn depended
on yt2 and another random shock. And yt2 in turn depended on yt3 and another random
shock. And so on. Plugging in all of these substitutions you get the following.
yt = yt1 + t
= 2 yt2 + t + t1
125
Now suppose you wanted to forecast the change in y_t over N periods using the value of the series N periods ago. Subtracting y_{t-N} from both sides of equation (5.64) gives

y_t - y_{t-N} = \left(\rho^N - 1\right) y_{t-N} + \sum_{k=0}^{N-1} \rho^k \epsilon_{t-k} \qquad (5.65)
Again the change in yt over this period can be written as a function of a past value of the
series and some random noise. The difference in this case is that the coefficient on the lagged
value doesnt have to be small anymore even if had a near-random walk series. For example,
suppose = 0.99 and N = 50 so were looking at the change in the series over 50 periods.
In this case, the coefficient is (0.9950 1) = 0.4. For this reason, regressions that seek to
predict combined returns over longer periods have found statistically significant evidence of
predictability even though this evidence cannot be found for predicting returns over shorter
periods.
It is very easy to demonstrate this result using any software that can generate random
numbers. For example, in an appendix at the back of the notes I provide a short programme
written for the econometric package RATS. The code is pretty intuitive and could be replicated in lots of other packages. The programme generates random AR(1) series with \rho = 0.99 by
starting them off with a value of zero and then drawing random errors to generate full time
series. Then regressions for sample sizes of 200 are run to see if changes in the series over one
period, twenty periods and fifty periods can be forecasted by the relevant lagged values. This
is done 10,000 times and the average t-statistics from these regressions are calculated.
Table 1 shows the results. The average t-statistic for the one-period forecasting regression
is -1.24, not high enough to reject the null hypothesis that there is no forecasting power. In
contrast, the average t-statistic for the 20-period forecasting regression is -5.69 and the average
t-statistic for the 50-period forecasting regression is -8.95, so you can be very confident that
there is statistically significant forecasting power over these horizons.
The intuition behind these results is fairly simple. Provided \rho is less than one in absolute value, AR(1) series are what is known as mean-stationary. In other words, they tend to revert back to their average value. In the case of the y_t series here, this average value is zero. The speed at which you can expect them to return to this average value will be slow if \rho is high, but they will eventually return. So if you see a high value of y_t, you can't really be that confident that it will fall next period, but you can be very confident that it will eventually tend to fall back towards its mean.
Pulling these ideas together to explain the various stock price results, suppose prices were
given by
P_t = \sum_{k=0}^{\infty}\left(\frac{1}{1+r}\right)^{k+1} E_t D_{t+k} + u_t \qquad (5.66)
where
u_t = \rho u_{t-1} + \epsilon_t \qquad (5.67)

with \rho being close to one and \epsilon_t being an unpredictable noise series. This model says that
stock prices are determined by two elements. The first is the rational dividend-discount price
and the second is a non-fundamental AR(1) element reflecting non-rational market sentiment.
The latter could swing up and down over time as various fads and manias affect the market.
If prices behaved in this way, then the following would be true:

1. Short-term stock returns would be very hard to forecast. This is partly because of the unpredictability of news about dividends and partly because, with \rho close to one, one-period changes in the non-fundamental component are themselves almost impossible to predict.

2. Returns over longer horizons would display statistically significant predictability, though with a relatively low R-squared. This is because the fundamental element that accounts for much of the variation cannot be forecasted, while you can detect a statistically significant tendency for the non-fundamental element to revert towards its mean over long periods.

3. Stock prices would be more volatile than predicted by the dividend-discount model, perhaps significantly. This is because non-fundamental series of the type described here
This suggests a possible explanation for the behaviour of stock prices. On average, they
appear to be determined by something like the dividend-discount model but they also have
a non-fundamental component that sees the market go through temporary (but potentially
long) swings in which it moves away from the values predicted by this model.
Figure 5.2: Campbell and Shiller's 2001 Chart
Table 1: Illustrating Long-Run Predictability

Forecast horizon (periods)     Average t-statistic
1                              -1.24
20                             -5.69
50                             -8.95
Example: The Prognosis for U.S. Stock Prices
These results would suggest that it may be possible to detect whether a stock market is over-valued or under-valued and thus to forecast its future path. Let's consider an example and look at the current state of the U.S. stock market as measured by the S&P 500, which is a stock index based on 500 large US companies.

Figure 5.3 shows that the U.S. market has been on a tear over the past few years and has moved well past previous historical highs. The last two times the market expanded rapidly in this manner, large declines followed.
Figure 5.4 shows the ratio of dividends to prices for the S&P 500 index of US stocks over
the period since the second world war. The measure of dividends used in the numerator is
based on the average value of dividends over a twelve month period to smooth out volatile
month-to-month movements. The chart shows that the dividend-price ratio for this index, at
about 2 percent over the past few years, is very low by historical standards. In other words,
prices are very high relative to dividends, which is normally a bad sign.
Still, comparisons with the long-run historical average might be a bad idea. The average
value of this ratio over the period since 1945 is 3.3 percent. The only point since the mid-1990s
that the ratio has exceeded that historical average was a brief period in early 2009, due to the plunge in stock prices after the Lehman Brothers bankruptcy and the emergence of the worst global downturn since the 1930s.
One reason for this change is that many firms have moved away from paying dividends. In
more recent years, it is particularly clear that dividends have been low relative to how much
companies can afford to pay. The ratio of dividends to earnings for S&P 500 firms is about
half its historical level. There are two reasons for this. There has been a long-run trend of
moving away from paying dividends and towards using earnings to fund share repurchases.
This increases the average value of outstanding shares (each of them can get a higher share
of future dividend payments) without the shareholders explicitly receiving dividend income
at present (which would be taxable). In recent years, the reduction in dividend payout rates
may be related to corporate deleveraging, as US firms pay down the large amounts of debt
built up during the period prior to the financial crisis and build up cash buffers as a protection
against the kind of problems that many firms ran into during the great recession of 2008/09.
Because of these factors, many analysts instead look at the ratio of total corporate earnings to prices. Figure 5.5 compares this to the dividend-price ratio. The ratio of average earnings over the previous twelve months to prices was about 5.1 percent in June 2014. This series was roughly in line with its historical average of about 6.9 percent in late 2011, but over the past couple of years it has fallen below that average. So this series suggests that stocks are perhaps a bit over-valued relative to historical norms, but not by as much as the dividend-price ratio suggests.
A final consideration, however, is the discount rate being used to value stocks. We know
from the Gordon growth model that one reason stock prices might be high relative to dividends
(or earnings) is that the expected rate of return r may be low. Assuming the required rate
of return on stocks reflects some premium over safe investments such government bonds, this
could provide another explanation for high stock price valuations. Figure 6 shows that real
interest rates on US Treasury bonds are at historically low levels (this series is the yield on
ten-year Treasury bonds minus inflation over the previous year). If these rates are being used
as a benchmark for calculating the required rate of return on stocks, then one might expect
This shows that figuring out whether stocks are under- or over-valued is rarely easy. My
assessment is they are probably about in line with reasonable valuations but I may well prove
to be wrong.
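The discount-rate point can be made concrete with the Gordon growth model mentioned above, under which P = D/(r - g) and hence D/P = r - g. The numbers below are my own illustrative choices, not figures from the text:

```python
def gordon_price(dividend, required_return, growth):
    """Gordon growth model price: P = D / (r - g)."""
    assert required_return > growth
    return dividend / (required_return - growth)

D, g = 2.0, 0.02
print(gordon_price(D, 0.07, g))  # r = 7% gives a price of about 40 (dividend-price ratio of 5%)
print(gordon_price(D, 0.05, g))  # lowering r to 5% raises the price to about 67 (D/P = 3%)
```

A two-percentage-point fall in the required return raises the justified price by two-thirds here, which shows why low real interest rates can matter so much for valuation judgements.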
Figure 5.3: The S&P 500 Index
Figure 5.4: Dividend-Price Ratio for S&P 500
Figure 5.5: Dividend-Price and Earnings-Price Ratios for S&P 500
Figure 5.6: Real 10-Year Treasury Bond Rate
Time-Varying Expected Returns
One response to the volatility findings, described in the previous section, is a model in which stock prices are also determined by a temporary but volatile non-fundamental component. However, the last observation in the previous section about discount rates suggests another way to mend the dividend-discount model and perhaps explain the extra volatility that affects stock prices: change the model to allow for variations in expected returns. Consider the finding that a high value of the dividend-price ratio predicts poor future stock returns. Shiller suggests that this is due to temporary irrational factors gradually disappearing. But another possibility is that the high value of this ratio rationally reflects an expectation of low future returns on stocks.

We can reformulate the dividend-discount model with time-varying returns as follows. Let
R_t = 1 + r_t \qquad (5.68)

Start again from the first-order difference equation for stock prices

P_t = \frac{D_t}{R_{t+1}} + \frac{P_{t+1}}{R_{t+1}} \qquad (5.69)
where R_{t+1} is the gross return on stocks in period t+1. Moving the time-subscripts forward one period and repeatedly substituting for the future price gives a discounted sum of dividends plus a terminal price term.
The general formula is

P_t = \sum_{k=0}^{N-1} \frac{D_{t+k}}{\prod_{m=1}^{k+1} R_{t+m}} + \frac{P_{t+N}}{\prod_{m=1}^{N} R_{t+m}} \qquad (5.73)
where \prod_{n=1}^{h} x_n means the product of x_1, x_2, ..., x_h. Again setting the limit of the t+N term to zero and taking expectations, we get a version of the dividend-discount model augmented to allow for time-varying expected returns:

P_t = E_t \sum_{k=0}^{\infty} \frac{D_{t+k}}{\prod_{m=1}^{k+1} R_{t+m}} \qquad (5.74)
This equation gives a potential explanation for the failure of news about dividends to explain
stock price fluctuations. Stock prices depend positively on expected future dividends. But
they also depend negatively on the Rt+k values which measure the expected future return on
stocks. So perhaps news about future stock returns explains movements in stock prices: When
investors learn that future returns are going to be lower, this raises current stock prices.
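Equation (5.73) is just the first-order difference equation unrolled by repeated substitution. As a quick numerical consistency check (my own sketch with arbitrary numbers, not part of the original notes), the backward recursion and the closed-form sum deliver the same price:

```python
import random
from math import prod

random.seed(0)
N = 8
D = [random.uniform(1.0, 2.0) for _ in range(N)]    # dividends D_t, ..., D_{t+N-1}
R = [random.uniform(1.03, 1.08) for _ in range(N)]  # gross returns R_{t+1}, ..., R_{t+N}
P_terminal = 50.0                                   # terminal price P_{t+N}

# Backward recursion on the difference equation P_t = (D_t + P_{t+1}) / R_{t+1}
P = P_terminal
for k in reversed(range(N)):
    P = (D[k] + P) / R[k]

# Closed-form repeated-substitution formula (5.73)
P_formula = sum(D[k] / prod(R[: k + 1]) for k in range(N)) + P_terminal / prod(R)

print(P, P_formula)
```

The same check works for any horizon N; letting N grow and the discounted terminal term shrink to zero gives the infinite-sum version in (5.74).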
In 1991, Eugene Fama provided his updated overview of the literature on the predictability
of stock returns. By this point, Fama accepted the evidence on long-horizon predictability and
had contributed to this literature. However, Fama and French (1988) put forward predictable
time-variation in expected returns as the likely explanation for this result.6 This explanation
has also been promoted by leading modern finance economists such as Harvard's John Campbell and Chicago's John Cochrane, and it is currently the leading hypothesis for reconciling the efficient-markets approach with the evidence on volatility and longer-run predictability.
What About Interest Rates?
Changing interest rates on bonds are the most obvious source of changes in expected returns
on stocks. Up to now, we only briefly discussed what determines the rate of return that investors require to invest in the stock market, but it is usually assumed that there is an arbitrage-type relationship linking expected returns on stocks and bonds:

E_t r_{t+1} = E_t i_{t+1} + \lambda \qquad (5.75)

In other words, next period's expected return on the market needs to equal next period's expected interest rate on bonds, i_{t+1}, plus a risk premium, \lambda, which we will assume is constant.
Are interest rates the culprit accounting for the volatility of stock prices? They are cer-
tainly a plausible candidate. Stock market participants spend a lot of time monitoring the Fed
and the ECB and news interpreted as implying higher interest rates in the future certainly
tends to provoke declines in stock prices. Perhaps surprisingly, then, Campbell and Shiller (1988) showed that this type of equation still doesn't help that much in explaining stock market fluctuations.8 Their methodology involved plugging forecasts for future interest rates and dividend growth into the right-hand-side of (5.74) and checking how close the resulting series is to the actual dividend-price ratio. They concluded that expected fluctuations in interest rates contribute little to explaining the volatility in stock prices. A study co-authored by Federal Reserve Chairman Ben Bernanke examining the link between monetary policy and the stock market reached a similar conclusion.
Time-Varying Risk Premia or Behavioural Finance?
So, changes in interest rates do not appear to explain the volatility of stock market fluctuations.
The final possible explanation for how the dividend-discount model may be consistent with the
data is that changes in expected returns do account for the bulk of stock market movements,
but that the principal source of these changes comes, not from interest rates, but from changes
in the risk premium that determines the excess return that stocks must generate relative to
bonds: the risk premium \lambda in equation (5.75) must be changing over time. According to this explanation, asset price booms are often driven by investors being willing to take risks and receive a relatively low compensation for them (when investors are "risk-on" in the commonly-used market terminology) while busts often happen when investors start to demand higher risk premia.
One problem with this conclusion is that it implies that, most of the time, when stocks are
increasing it is because investors are anticipating lower stock returns at a later date. However,
the evidence that we have on this seems to point in the other direction. For example, surveys have shown that even at the peak of the most recent bull market, average investors still expected future stock returns to be high rather than low.
If one rejects the idea that, together, news about dividends and news about future returns
explain all of the changes in stock prices, then one is forced to reject the rational expectations dividend-discount model as a complete model of the stock market. What is missing
from this model? Many believe that the model fails to take account of various human behavioural traits that lead people to act in a manner inconsistent with pure rational expectations. Economists like Shiller point to the various asset price bubbles of the past twenty years, such as the dot-com boom and bust and the rise and fall in house prices in countries like the U.S. and Ireland, as clear evidence that investors go through periods of "irrational exuberance" which see asset prices become completely detached from the fundamental values implied by rational pricing models.

Indeed, the inability to reconcile aggregate stock price movements with rational expectations is not the only well-known failure of modern financial economics. For instance, there
are many studies documenting the failure of rational optimisation-based models to explain
various cross-sectional patterns in asset returns, e.g. why the average return on stocks ex-
ceeds that on bonds by so much, or discrepancies in the long-run performance of small- and
large-capitalisation stocks. Eugene Fama is the author of a number of famous papers with Kenneth French that have demonstrated these discrepancies, though he interprets these results as most likely due to a rational pricing of the risk associated with certain kinds of assets.
For many, the answers to these questions lie in abandoning the pure rational expectations, optimising approach. Indeed, the field of behavioural finance is booming, with various researchers proposing all sorts of different non-optimising models of what determines asset prices.10
As a final note on this topic, in an interesting development, the 2013 Nobel Prize for
Economics was shared by Shiller and Fama, along with Lars Peter Hansen, who has also worked on empirical asset pricing. The joint award prompted plenty of commentary about the contrasting views of these economists and the current state of play of the finance profession.
10 The papers presented at the bi-annual NBER workshop on behavioural finance give a good flavour of this research.
Appendix 1: Proof of Equation (5.54)
Using the dividend-discount expressions for P_t and P_{t+1}, the change in the stock price can be written as

P_{t+1} - P_t = -\frac{1}{1+r} D_t + \left[\frac{1}{1+r} D_{t+1} - \left(\frac{1}{1+r}\right)^2 E_t D_{t+1}\right]
+ \left[\left(\frac{1}{1+r}\right)^2 E_{t+1} D_{t+2} - \left(\frac{1}{1+r}\right)^3 E_t D_{t+2}\right]
+ \left[\left(\frac{1}{1+r}\right)^3 E_{t+1} D_{t+3} - \left(\frac{1}{1+r}\right)^4 E_t D_{t+3}\right] + \dots
This shows that the change in stock prices is determined by a term relating to this period's dividend dropping out and then a whole bunch of terms that involve period t+1 and period t expectations of future dividends. To be able to pull all the terms for each D_{t+k} together, we both add and subtract a set of terms of the form \left(\frac{1}{1+r}\right)^k E_t D_{t+k}. The equation then becomes
P_{t+1} - P_t = \frac{1}{1+r}\left[D_{t+1} - E_t D_{t+1}\right]
+ \left(\frac{1}{1+r}\right)^2 \left[E_{t+1} D_{t+2} - E_t D_{t+2}\right]
+ \left(\frac{1}{1+r}\right)^3 \left[E_{t+1} D_{t+3} - E_t D_{t+3}\right] + \dots
- \frac{1}{1+r} D_t
+ \frac{1}{1+r}\left(1 - \frac{1}{1+r}\right) E_t D_{t+1}
+ \left(\frac{1}{1+r}\right)^2 \left(1 - \frac{1}{1+r}\right) E_t D_{t+2}
+ \left(\frac{1}{1+r}\right)^3 \left(1 - \frac{1}{1+r}\right) E_t D_{t+3} + \dots \qquad (5.76)
The sequence summarised on the first three lines of equation (5.76) can be described using a
summation sign as
\sum_{k=1}^{\infty}\left(\frac{1}{1+r}\right)^k \left(E_{t+1} D_{t+k} - E_t D_{t+k}\right) \qquad (5.77)
This is an infinite discounted sum of changes to people's expectations about future dividends.
The sequence summarised on the last three lines of equation (5.76) can be simplified to be
\frac{r}{1+r}\left[(1+r) P_t - D_t\right] = r P_t - \frac{r}{1+r} D_t \qquad (5.78)
Combining the pieces, we get

P_{t+1} - P_t = -\frac{r}{1+r} D_t - \frac{1}{1+r} D_t + r P_t + \sum_{k=1}^{\infty}\left(\frac{1}{1+r}\right)^k \left(E_{t+1} D_{t+k} - E_t D_{t+k}\right)
= -D_t + r P_t + \sum_{k=1}^{\infty}\left(\frac{1}{1+r}\right)^k \left(E_{t+1} D_{t+k} - E_t D_{t+k}\right) \qquad (5.79)
which is the equation we were looking for. So the return on stocks can be written as
r_{t+1} = \frac{D_t + P_{t+1} - P_t}{P_t} = r + \frac{1}{P_t}\sum_{k=1}^{\infty}\left(\frac{1}{1+r}\right)^k \left(E_{t+1} D_{t+k} - E_t D_{t+k}\right) \qquad (5.80)
Appendix 2: Programme For Return Predictability Results
Below is the text of a programme to generate the return predictability results reported in
Table 1. The programme is written for the econometric package RATS but a programme of
this sort could be written for any package that has a random number generator.
allocate 10000
set y = 0
set tstats_1lag = 0
set tstats_20lag = 0
set tstats_50lag = 0
do k = 1,10000
* Generate an AR(1) series with rho = 0.99 and standard normal errors
set y 2 300 = 0.99*y{1} + %ran(1)
* Changes in the series over 1, 20 and 50 periods
set dy = y - y{1}
set dy20 = y - y{20}
set dy50 = y - y{50}
* Regress each change on the relevant lagged value and store the slope t-statistic
linreg(noprint) dy 101 300
# y{1}
comp tstats_1lag(k) = %tstats(1)
linreg(noprint) dy20 101 300
# y{20}
comp tstats_20lag(k) = %tstats(1)
linreg(noprint) dy50 101 300
# y{50}
comp tstats_50lag(k) = %tstats(1)
end do k
stats tstats_1lag
stats tstats_20lag
stats tstats_50lag
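For readers without access to RATS, the following is my own rough Python translation of the same experiment (using fewer replications than the 10,000 behind Table 1, so the average t-statistics will only approximate the reported values of -1.24, -5.69 and -8.95):

```python
import random
from statistics import mean

def slope_tstat(x, dy):
    """t-statistic on the slope from an OLS regression of dy on x (with an intercept)."""
    n = len(x)
    mx, my = mean(x), mean(dy)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, dy)) / sxx
    a = my - b * mx
    ssr = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, dy))
    return b / (ssr / (n - 2) / sxx) ** 0.5

random.seed(123)
reps, T = 500, 300
tstats = {1: [], 20: [], 50: []}

for _ in range(reps):
    # Generate an AR(1) series with rho = 0.99, starting from zero
    y = [0.0]
    for _ in range(T):
        y.append(0.99 * y[-1] + random.gauss(0.0, 1.0))
    # Regress the N-period change y_t - y_{t-N} on y_{t-N}, using observations 101 to 300
    for N in tstats:
        x = [y[t - N] for t in range(101, T + 1)]
        dy = [y[t] - y[t - N] for t in range(101, T + 1)]
        tstats[N].append(slope_tstat(x, dy))

for N, ts in tstats.items():
    print(N, round(mean(ts), 2))
```

As in Table 1, the one-period regression produces t-statistics that would not lead you to reject a zero coefficient, while the 20- and 50-period regressions produce large negative t-statistics.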
Chapter 6
Consumption and Asset Pricing
Elementary Keynesian macro theory assumes that households make consumption decisions
based only on their current disposable income. In reality, of course, people have to base their spending decisions not just on today's income but also on the money they expect to earn in the future. During the 1950s, important research by Ando and Modigliani (the Life-Cycle Hypothesis) and Milton Friedman (the Permanent Income Hypothesis) presented significant evidence that people plan their expenditures in a systematic pattern, smoothing consumption over time.
In these notes, we will use the techniques developed in the last topic to derive a rational
expectations version of the Permanent Income Hypothesis. We will use this model to illustrate
some pitfalls in using econometrics to assess the effects of policy changes. We will discuss
empirical tests of this model and present some more advanced topics. In particular, we will
discuss the link between consumption spending and the return on various financial assets.
The Household Budget Constraint
We start with an identity describing the evolution of the stock of assets owned by households.
Letting At be household assets, Yt be labour income, and Ct stand for consumption spending,
this identity is

A_{t+1} = \left(1 + r_{t+1}\right)\left(A_t + Y_t - C_t\right) \qquad (6.1)

where r_{t+1} is the return on household assets at time t+1. Note that Y_t is labour income (income earned from working), not total income, because total income also includes the capital income earned on assets (i.e. total income is Y_t + r_{t+1} A_t). Note, we are assuming that Y_t is
As with the equation for the return on stocks, this can be written as a first-order difference equation

A_t = C_t - Y_t + \frac{A_{t+1}}{1 + r_{t+1}} \qquad (6.2)
We will assume that agents have rational expectations. Also, in this case, we will assume that the return on assets is constant at r. Taking expectations, the difference equation becomes

A_t = C_t - Y_t + \frac{1}{1+r} E_t A_{t+1} \qquad (6.3)
Using the same repeated substitution methods as before this can be solved to give
A_t = \sum_{k=0}^{\infty} \frac{E_t\left(C_{t+k} - Y_{t+k}\right)}{(1+r)^k} \qquad (6.4)
Note that we have again imposed the condition that the final term in our repeated substitution
\frac{E_t A_{t+k}}{(1+r)^k}
goes to zero as k gets large. Effectively, this means that we are assuming that people
consume some of their capital income (i.e. that assets are used to finance a level of consumption
Ct that is generally larger than labour income Yt ). If this is the case, then this term tends to
zero.
One way to understand this equation comes from re-writing it as
\sum_{k=0}^{\infty} \frac{E_t C_{t+k}}{(1+r)^k} = A_t + \sum_{k=0}^{\infty} \frac{E_t Y_{t+k}}{(1+r)^k} \qquad (6.5)
This is usually called the intertemporal budget constraint. It states that the present value sum
of current and future household consumption must equal the current stock of financial assets
plus the present value sum of current and future labour income.
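Since the intertemporal budget constraint follows mechanically from rolling the asset identity forward, it can be verified numerically. The sketch below (my own illustration, with arbitrary paths for income and consumption) checks a finite-horizon version of equation (6.4):

```python
import random

random.seed(1)
r, N = 0.04, 40
A0 = 100.0
Y = [random.uniform(40.0, 60.0) for _ in range(N)]  # labour income path
C = [y + 2.0 for y in Y]                            # consumption a bit above labour income

# Roll assets forward using the identity A_{t+1} = (1 + r)(A_t + Y_t - C_t)
A = [A0]
for t in range(N):
    A.append((1 + r) * (A[-1] + Y[t] - C[t]))

# Finite-horizon version of (6.4): A_0 = sum_k (C_k - Y_k)/(1+r)^k + A_N/(1+r)^N
pv = sum((C[k] - Y[k]) / (1 + r) ** k for k in range(N)) + A[-1] / (1 + r) ** N
print(A0, pv)
```

The two numbers printed agree to floating-point precision: the current asset stock equals the present value of consumption in excess of labour income, plus the discounted terminal asset position.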
A consumption function relationship can be derived from this equation by positing some
theoretical relationship between the expected future consumption values, Et Ct+k , and the
current value of consumption. This is done by appealing to the optimising behaviour of the
consumer.
Some of you may be aware of Thomas Piketty's now infamous book Capital in the Twenty-First Century. If you're not, scroll down a few pages to check him out. Perhaps Piketty's most famous conjecture is that there is a natural tendency in capitalist economies for wealth to accumulate faster than income. This conjecture can be understood on the basis of the simple framework described here.

Consider the simple version of our budget constraint with a constant return on assets:

A_{t+1} = (1 + r)\left(A_t + Y_t - C_t\right) \qquad (6.6)

If assets grew at rate r or faster, then this would likely mean they were growing faster than GDP, because r is generally higher than GDP growth. So what is the growth rate of the stock of assets? Subtracting A_t from both sides of equation (6.6), we get

A_{t+1} - A_t = r A_t + (1 + r)\left(Y_t - C_t\right) \qquad (6.7)
So the growth rate of assets is given by
\frac{A_{t+1} - A_t}{A_t} = r + \frac{(1+r)\left(Y_t - C_t\right)}{A_t} \qquad (6.8)
This means the growth rate of assets equals r plus an additional term that will be positive
as long as Yt > Ct i.e. as long as labour income is greater than consumption. So this tells us
that the growth rate of assets equals r plus a term that depends upon whether consumption
is greater than or less than labour income. If consumption is less than labour income, assets
grow at a rate that is greater than r, while they will grow at a rate slower than r if consumption is greater than labour income.
Piketty bases his ideas about the tendency for wealth to rise faster than income on the
fact that the rate of return on assets r has tended historically to be higher than the growth
rate of GDP. If we observed Yt > Ct , then assets would grow at a rate greater than r and
so this would generally also be higher than the growth rate of GDP. In general, however, we
probably dont expect consumption to be greater than labour income. If the income people
earn from their assets doesn't ever boost their consumption spending, then what is the point
of it? And indeed, the data generally show that consumption is greater than labour income,
so that people consume some of their capital income (i.e. their income from assets) and total
assets should generally grow at a rate that is less than r. Still, Piketty points out that it is
possible for people to consume some of their capital income and still have assets growing at a
Under what conditions will assets grow at a faster rate than the growth rate of GDP, which
1 For example, page 564: "If r > g, it suffices to reinvest a fraction of the return on capital equal to the growth rate g and consume the rest (r - g)."
Piketty terms g? Our previous equation tells us this happens when
g < r + \frac{(1+r)\left(Y_t - C_t\right)}{A_t} \qquad (6.9)
Rearranging, this says that assets will grow faster than GDP as long as the amount of capital income that people consume (i.e. the amount they consume above their labour income) as a share of total assets is less than the gap between r and g:

\frac{(1+r)\left(C_t - Y_t\right)}{A_t} < r - g \qquad (6.10)
Is there any result in economics that leads us to believe that this last inequality should
generally hold? Not to my knowledge. In this sense, Piketty perhaps overstates the extent
to which, on its own, the fact that r > g is a fundamental force for divergence. What is
required for assets to steadily grow relative to income is not only this condition but also an
additional, relatively arbitrary, restriction on how much people can consume and this latter
condition may or may not hold at various times. However, what can be said is that during periods of high returns on capital, when the gap between r and g is particularly high, the right-hand-side of equation (6.10) will be bigger and it is perhaps more likely that the condition will be satisfied.
Most likely, however, the key empirical developments that Pikettys book focuses onrising
assets relative to income and growing inequality of wealthare being driven by other forces
that are making the income distribution more unequal and reducing the share of income going
to workers rather than being related to some innate law of capitalism that drives wealth up
Figure 6.1: Thomas Piketty
Optimising Behaviour by the Consumer
We will assume that consumers wish to maximize a welfare function of the form

W = \sum_{k=0}^{\infty} \left(\frac{1}{1+\beta}\right)^k U(C_{t+k})   (6.11)

where U(C_t) is the instantaneous utility obtained at time t, and \beta is a positive number that
describes the fact that households prefer a unit of consumption today to a unit tomorrow.
If the future path of labour income is known, consumers who want to maximize this welfare
function subject to the constraints imposed by the intertemporal budget constraint must solve
the Lagrangian problem

L(C_t, C_{t+1}, ....) = \sum_{k=0}^{\infty} \left(\frac{1}{1+\beta}\right)^k U(C_{t+k}) + \lambda \left[A_t + \sum_{k=0}^{\infty} \frac{Y_{t+k} - C_{t+k}}{(1+r)^k}\right]   (6.12)

For every current and future value of consumption, C_{t+k}, this yields a first-order condition of
the form

\left(\frac{1}{1+\beta}\right)^k U'(C_{t+k}) - \frac{\lambda}{(1+r)^k} = 0   (6.13)
For k = 0, this implies

U'(C_t) = \lambda   (6.14)

For k = 1, it implies

U'(C_{t+1}) = \left(\frac{1+\beta}{1+r}\right) \lambda   (6.15)

Putting these two equations together, we get the following relationship between consumption
today and consumption tomorrow:

U'(C_t) = \left(\frac{1+r}{1+\beta}\right) U'(C_{t+1})   (6.16)

When there is uncertainty about future labour income, this optimality condition can just be
re-written as

U'(C_t) = \left(\frac{1+r}{1+\beta}\right) E_t\left[U'(C_{t+1})\right]   (6.17)
This implication of the first-order conditions for consumption is sometimes known as an Euler
equation.
In an important 1978 paper, Robert Hall proposed a specific case of this equation.2 Hall's
version assumed that

U(C_t) = a C_t - \frac{b}{2} C_t^2   (6.18)

and that

r = \beta   (6.19)

In other words, Hall assumed that the utility function was quadratic and that the real interest
rate equalled the household discount rate. In this case, the Euler equation becomes

a - b C_t = E_t\left[a - b C_{t+1}\right]   (6.20)

which simplifies to

C_t = E_t C_{t+1}   (6.21)

This states that the optimal solution involves next period's expected value of consumption
equalling the current value. Because the Euler equation holds for all time periods, we have

E_t C_{t+k} = C_t   (6.22)

In other words, all future expected values of consumption equal the current value. Because it
implies that changes in consumption are unpredictable, this is sometimes called the random
walk theory of consumption.
The Rational Expectations Permanent Income Hypothesis
Hall's random walk hypothesis has attracted a lot of attention in its own right, but rather
than focusing only on changes in consumption, we can also use it to derive the implications
for the level of consumption.

To do this, insert E_t C_{t+k} = C_t into the intertemporal budget constraint, (6.5), to get
\sum_{k=0}^{\infty} \frac{C_t}{(1+r)^k} = A_t + \sum_{k=0}^{\infty} \frac{E_t Y_{t+k}}{(1+r)^k}   (6.24)

Now we can use the geometric sum formula to turn this into a more intuitive formulation:

\sum_{k=0}^{\infty} \frac{1}{(1+r)^k} = \frac{1}{1 - \frac{1}{1+r}} = \frac{1+r}{r}   (6.25)

So, Hall's assumptions imply the following equation, which we will term the Rational Expectations Permanent Income Hypothesis:

C_t = \frac{r}{1+r} A_t + \frac{r}{1+r} \sum_{k=0}^{\infty} \frac{E_t Y_{t+k}}{(1+r)^k}   (6.26)
This equation is a rational expectations version of the well-known permanent income hypothesis
(I will use the term RE-PIH below), which states that consumption today depends on a present
discounted value of expected lifetime resources rather than just on today's income.

Let's look at this equation closely. It states that the current value of consumption is driven
by three factors:

1. The expected present discounted sum of current and future labour income.

2. The current value of household assets. This wealth effect is likely to be an important
channel through which changes in asset prices, such as house or stock prices, affect consumer
spending.
3. The expected return on assets, r: This determines the coefficient, r/(1+r), that multiplies both
assets and the expected present value of labour income. In this model, an increase in
this expected return raises this coefficient, and thus boosts consumption.
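Equation (6.26) is straightforward to evaluate numerically. The Python sketch below (the asset level, interest rate and income path are invented for illustration, and a long finite horizon stands in for the infinite sum) computes the RE-PIH level of consumption:

```python
# Sketch of the RE-PIH consumption function, equation (6.26):
#   C_t = (r/(1+r)) * A_t + (r/(1+r)) * sum_k E_t[Y_{t+k}] / (1+r)^k
# All numbers below are illustrative assumptions, not values from the text.

def re_pih_consumption(assets, expected_income, r):
    """Consumption implied by equation (6.26) for a given expected income path."""
    pv_income = sum(y / (1 + r) ** k for k, y in enumerate(expected_income))
    return (r / (1 + r)) * (assets + pv_income)

r = 0.05
# Labour income expected to stay at 100; 400 periods approximate an infinite horizon
income_path = [100.0] * 400
c = re_pih_consumption(assets=500.0, expected_income=income_path, r=r)
print(round(c, 2))   # annuity value of assets plus (approximately) permanent income
```

With a flat income path, the answer is close to (r/(1+r))A_t + Y_t: the annuity value of assets plus income that is already "permanent".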
This RE-PIH model can be made more concrete by making specific assumptions about expectations
concerning future growth in labour income. Suppose, for instance, that households
expect labour income to grow at a constant rate g:

E_t Y_{t+k} = (1+g)^k Y_t   (6.27)

This implies

C_t = \frac{r}{1+r} A_t + \frac{r Y_t}{1+r} \sum_{k=0}^{\infty} \left(\frac{1+g}{1+r}\right)^k   (6.28)

As long as g < r (and we will assume it is) then we can use the geometric sum formula to
write the sum as (1+r)/(r-g), so that consumption is given by

C_t = \frac{r}{1+r} A_t + \frac{r}{r-g} Y_t   (6.31)
The fact that the coefficients of so-called reduced-form relationships, such as the consumption
function equation (6.31), depend on expectations about the future is an important theme in
modern macroeconomics.
Robert Lucas pointed out that the assumption of rational expectations implied that these
coefficients would change if expectations about the future changed.3 In our example, the
MPC from current income will change if expectations about future growth in labour income
change.
Lucas was particularly critical of the common practice of using reduced-form regressions to
assess the impact of policy changes. He pointed out that changes in
policy may change expectations about future values of important variables, and that these
changes in expectations may change the coefficients of reduced-form relationships. This type
of problem can limit the usefulness for policy analysis of reduced-form econometric models
based on historical data. This problem is now known as the Lucas critique of econometric
models.
As an example, consider a temporary tax cut on labour income. As noted above, we can consider
Y_t to be after-tax labour income, so it would be temporarily boosted by the tax cut. Now suppose
the policy-maker wants an estimate of the likely effect on consumption of the tax cut. They may
get their economic advisers to run a regression of consumption on assets and after-tax labour
income. If, in the past, consumers had generally expected income growth of g, then the
econometric regressions will report a coefficient of approximately r/(r-g) on labour income.
So, the economic adviser might conclude that for each extra dollar of labour income produced
by the tax cut, there will be an increase in consumption of r/(r-g) dollars.
3 Robert Lucas, "Econometric Policy Evaluation: A Critique", Carnegie-Rochester Conference Series on Public Policy, Vol. 1, pages 19-46, 1976.
However, if households have rational expectations and operate according to equation (6.26)
then the true effect of the tax cut could be a lot smaller. For instance, if the tax cut is only
expected to boost this period's income, and to disappear tomorrow, then each dollar of tax cut
will produce only r/(1+r) dollars of extra consumption. The difference between the true effect
and the estimated effect can be very large.

For instance, plugging in some numbers, suppose r = 0.06 and g = 0.02. In this case, the
economic advisor concludes that the effect of a dollar of tax cuts is an extra 1.5 (= 0.06/(0.06-0.02))
dollars of consumption. In reality, the tax cut will produce only an extra 0.057 (= 0.06/1.06) dollars
of consumption.
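The arithmetic in this example can be checked in a couple of lines (values of r and g as in the text):

```python
# Lucas critique example: with r = 0.06 and g = 0.02, a regression estimated
# on data from a period when income growth of g was expected gives an MPC of
# r/(r-g), while a purely temporary tax cut only raises consumption by r/(1+r).
r, g = 0.06, 0.02

naive_mpc = r / (r - g)   # what the adviser's regression would imply
true_mpc = r / (1 + r)    # true effect of a one-period tax cut under the RE-PIH

print(round(naive_mpc, 3))   # 1.5
print(round(true_mpc, 3))    # 0.057
```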
The Lucas critique has played an important role in the increased popularity of rational
expectations economics. Examples like this one show the benefit in using a formulation such
as equation (6.26) that explicitly takes expectations into account, instead of relying only on
reduced-form relationships estimated from historical data.
Ricardian Equivalence
Like households, governments also have budget constraints. Here we consider the implications
of these constraints for consumption spending in the Rational Expectations Permanent Income
Hypothesis. First, let us re-formulate the household budget constraint to explicitly incorporate
taxes:

A_{t+1} = (1+r)(A_t + Y_t - T_t - C_t)   (6.32)

where T_t is the total amount of taxes paid by households. Taking the same steps as before,
we can re-write the intertemporal budget constraint as

\sum_{k=0}^{\infty} \frac{E_t C_{t+k}}{(1+r)^k} = A_t + \sum_{k=0}^{\infty} \frac{E_t (Y_{t+k} - T_{t+k})}{(1+r)^k}   (6.33)
Now let's think about the government's budget constraint. The stock of public debt, D_t,
evolves over time according to

D_{t+1} = (1+r)(D_t + G_t - T_t)   (6.34)

where G_t is government spending. Applying the same repeated substitution method as before,
this implies an intertemporal budget constraint for the government:

\sum_{k=0}^{\infty} \frac{E_t T_{t+k}}{(1+r)^k} = D_t + \sum_{k=0}^{\infty} \frac{E_t G_{t+k}}{(1+r)^k}   (6.35)

This states that the present discounted value of tax revenue must equal the current level of
debt plus the present discounted value of government spending. In other words, in the long-run,
the government must raise enough tax revenue to pay off its current debts as well as its
planned future spending.

Consider the implications of this result for household decisions. If households have rational
expectations, then they will understand that the government's intertemporal budget
constraint, equation (6.35), pins down the present value of tax revenue. In this case, we can
substitute the right-hand-side of (6.35) into the household budget constraint to replace the
present value of tax revenue. Doing this, the household budget constraint becomes

\sum_{k=0}^{\infty} \frac{E_t C_{t+k}}{(1+r)^k} = A_t - D_t + \sum_{k=0}^{\infty} \frac{E_t (Y_{t+k} - G_{t+k})}{(1+r)^k}   (6.36)
Consider now the implications of this result for the impact of a temporary cut in taxes. Before,
we had discussed how a temporary cut in taxes should have a small effect. This equation gives
us an even more extreme result: unless governments plan to change the profile of government
spending, a cut to taxes today has no impact at all on consumption spending. This is
because households anticipate that lower taxes today will just trigger higher taxes tomorrow.
This result, that rational expectations implies a deficit-financed cut in taxes should
have no impact on consumption, was first presented by Robert Barro in a famous 1974 paper.4
It was later pointed out that some form of this result was alluded to in David Ricardo's writings
in the nineteenth century. Economists love fancy names for things, so the result is now often
known as Ricardian equivalence.
There have been lots of macroeconomic studies on how well the RE-PIH fits the data. One
problem worth noting is that there are some important measurement issues when attempting
to test the theory. In particular, the model's assumption that consumption expenditures only
yield a positive utility flow in the period in which the money is spent clearly does not apply
to durable goods, such as cars or computers, which yield a steady flow of utility over time. For
this reason, most empirical research has focused only on spending on nondurables (e.g. food)
and services.
There are various reasons why the RE-PIH may not hold. Firstly, it assumes that it
is always feasible for households to smooth consumption in the manner predicted by the
theory. For example, even if you anticipate earning lots of money in the future and would
like to have a high level of consumption now, you may not be able to find a bank to fund
a lavish lifestyle now based on your promises of future millions. These kinds of liquidity
constraints may make households' consumption spending more sensitive to their current incomes than
4 Robert Barro (1974). "Are Government Bonds Net Wealth?", Journal of Political Economy, Volume 82(6).
the RE-PIH predicts. Secondly, people may not have rational expectations and may not plan
their spending decisions in the calculating, optimising fashion assumed by the theory.

Following Hall's 1978 paper, the 1980s saw a large amount of research on whether the RE-PIH
fitted the data. The most common conclusion was that consumption was excessively sensitive
to predictable movements in income, with changes in consumption more forecastable than they
should be if Hall's random walk idea was correct. Campbell and Mankiw
(1990) is a well-known paper that provides a pretty good summary of these conclusions.5
They present a model in which a fraction of the households behave according to the RE-PIH
while the rest simply consume all of their current income. They estimate the fraction
of these "rule of thumb" consumers to be about half. One implication of
this conclusion would be that financial sector reforms that boost access to credit could have
significant effects on the behaviour of aggregate consumption.
There is also a large literature devoted to testing the Ricardian equivalence hypothesis. In
addition to the various reasons the RE-PIH itself may fail, there are various other reasons
why Ricardian equivalence may not hold. Some are technical points. People don't actually
live forever (as we had assumed in the model) and so they may not worry about future tax
increases that could occur after they have passed away; taxes take a more complicated form
than the simple lump-sum payments presented above; the interest rate in the government's
budget constraint may not be the same as the interest rate in the household's constraint.
5 John Campbell and Gregory Mankiw (1990). "Permanent Income, Current Income, and Consumption", Journal of Business and Economic Statistics.
(You can probably think of a few more.) More substantively, people may often be unable
to tell whether tax changes are temporary or permanent. Most of the macro studies on this
topic (in particular those that use Vector Autoregressions) tend to find the effects of fiscal
policy are quite different from the Ricardian equivalence predictions. Tax cuts and increases
in government spending do appear to boost consumer spending.
Perhaps the most interesting research on this area has been the use of micro data to examine
the effect of changes in taxes that are explicitly predictable and temporary. One recent
example is the paper by Parker, Souleles, Johnson and McClelland which examines
the effect of tax rebates provided to U.S. taxpayers in 2008.6 This programme saw the U.S.
government make one-off payments to the vast majority of households.
Since these payments were being financed by expanding the government deficit, Ricardian
equivalence predicts that consumers should not have responded. Parker et al, however, found
the opposite using data from the Consumer Expenditure Survey. A quick summary:
We find that, on average, households spent about 12-30% (depending on the specification)
of their stimulus payments on nondurable expenditures during the three-month
period in which the payments were received. Further, there was also a
significant effect on the purchase of durable goods, in particular
vehicles, bringing the average total spending response to about 50-90% of the
payments.
You might suspect that these results are driven largely by liquidity constraints but the
various microeconomic studies that have examined temporary fiscal policy changes have not
always been consistent with this idea. For example, research by Parker (1999) showed that
consumption responds when workers' take-home pay predictably rises after they stop paying
their social security taxes (which stop at a certain point in the year when workers reach a
maximum threshold) while Souleles (1999) found excess sensitivity results for consumer
spending after people received tax rebate cheques.7 These results show excess sensitivity even
for households that are unlikely to be liquidity constrained.

6 "Consumer Spending and the Economic Stimulus Payments of 2008", American Economic Review, 103(6), October 2013.
At the same time, this doesn't mean that households go on a splurge every time they
get a large payment. For example, Hsieh (2003) examines how people in Alaska responded
to large anticipated annual payments that they received from a state fund that depends
largely on oil revenues.8 Unlike the evidence on temporary tax cuts, Hsieh finds that Alaskan
households respond to these payments in line with the predictions of the Permanent Income
Hypothesis, smoothing out their consumption over the year. One possible explanation is that
these large and predictable payments are easier for people to understand and plan around, and
the consequences of spending them too quickly are more serious than for smaller once-off federal
tax changes. There is clearly room for more research in this important area.
Precautionary Savings
I want to return to a subtle point that was skipped over earlier. If we keep the assumption
that the real interest rate equals the discount rate (r = \beta), the Euler equation is

U'(C_t) = E_t\left[U'(C_{t+1})\right]   (6.37)

You might think that this equation is enough to deliver the property of constant expected
7 Jonathan Parker, "The Reaction of Household Consumption to Predictable Changes in Social Security Taxes", American Economic Review, Vol 89 No 4, September 1999. Nicholas Souleles, "The Response of Household Consumption to Income Tax Refunds", American Economic Review, Vol 89 No 4, September 1999.

8 Chang-Tai Hsieh, "Do Consumers React to Anticipated Income Changes? Evidence from the Alaska Permanent Fund", American Economic Review, March 2003.
consumption. We generally assume declining marginal utility, so the function U' is monotonically
decreasing. In this case, surely the expectation of next period's marginal utility being the
same as this period's is the same as next period's expected consumption level being the same
as this period's?
The problem with this thinking is that the E_t here is a mathematical expectation, i.e. a
weighted average over a set of possible outcomes. And for most functions F, in general
E(F(X)) ≠ F(E(X)). In particular, for concave functions (functions like utility functions
which have negative second derivatives) a famous result known as Jensen's inequality states
that E(F(X)) < F(E(X)). This underlies the mathematical formulation of why people are
averse to risk: The average utility expected from an uncertain level of consumption is less
than the utility from the sure thing associated with obtaining the average level of consumption. The
sign of the Jensen's inequality result is reversed for convex functions, i.e. those with positive
second derivatives.
In this example, we are looking at the properties of E_t[U'(C_{t+1})]. Whether marginal
utility is concave or convex depends on its second derivative, so it depends upon the third
derivative of the utility function, U'''. Most standard utility functions have positive third
derivatives, implying convex marginal utility and thus E_t[U'(C_{t+1})] > U'(E_t C_{t+1}). What we
can see now is why the quadratic utility function was such a special case. Because this function
has U''' = 0, its marginal utility is neither concave nor convex and the Jensen relationship is an
equality. So, in this very particular case, the utility function displays certainty equivalence:
The uncertain outcome is treated the same way as if people were certain of achieving the
expected outcome.
Here's a specific example of when certainty equivalence doesn't hold.9 Suppose consumers
have the utility function

U(C_t) = -\frac{1}{\alpha} \exp\left(-\alpha C_t\right)   (6.38)

where exp is the exponential function. This implies marginal utility of the form

U'(C_t) = \exp\left(-\alpha C_t\right)

Now suppose the uncertainty about C_{t+1} is such that it is perceived to have a normal distribution
with mean E_t(C_{t+1}) and variance \sigma^2. A useful result from statistics is that if a variable
X has a normal distribution with mean \mu and variance \sigma^2, then E[\exp(X)] = \exp(\mu + \sigma^2/2).
Applying this result to the Euler equation (6.37), we get

\exp\left(-\alpha C_t\right) = \exp\left(-\alpha E_t(C_{t+1}) + \frac{\alpha^2 \sigma^2}{2}\right)   (6.45)
9 This particular example was first presented by Ricardo Caballero (1990), "Consumption Puzzles and Precautionary Savings", Journal of Monetary Economics, Volume 25, pages 113-136.
Taking logs of both sides, this becomes

-\alpha C_t = -\alpha E_t(C_{t+1}) + \frac{\alpha^2 \sigma^2}{2}   (6.46)

which simplifies to

E_t(C_{t+1}) = C_t + \frac{\alpha \sigma^2}{2}   (6.47)
Even though expected marginal utility is flat, consumption tomorrow is expected to be higher
than consumption today. Thus, uncertainty induces an upward tilt to the consumption
profile. And this upward tilt has an effect on today's consumption: We cannot sustain a path
along which consumption is expected to keep rising without cutting back on consumption today.

Indeed, it turns out that this result allows us to calculate exactly what the effect of
uncertainty is on today's consumption. Because the Euler equation links all adjacent periods,
we have

E_t(C_{t+k}) = C_t + \frac{k \alpha \sigma^2}{2}   (6.48)

Inserting this into the intertemporal budget constraint and using the result (obtained mainly by
repeatedly applying the well-known geometric sum formula)

\sum_{k=1}^{\infty} \frac{k}{(1+r)^k} = \frac{1+r}{r^2}   (6.50)

it can be shown that

C_t = \frac{r}{1+r} A_t + \frac{r}{1+r} \sum_{k=0}^{\infty} \frac{E_t Y_{t+k}}{(1+r)^k} - \frac{\alpha \sigma^2}{2r}

This is exactly as before apart from an additional precautionary savings term \alpha\sigma^2/2r. The
more uncertainty there is, the lower the current level of consumption will be.
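The statistical result used in this derivation, that E[exp(X)] = exp(μ + σ²/2) for normal X, and the resulting consumption tilt in (6.47) can be checked by simulation. The parameter values below are arbitrary illustrative choices:

```python
# Monte Carlo check of the result behind equations (6.45)-(6.47): if
# C ~ N(mu, sigma^2), then E[exp(-a*C)] = exp(-a*mu + a**2 * sigma**2 / 2),
# which is why expected marginal utility is only flat when consumption is
# expected to rise by a*sigma**2/2 each period.
import math
import random

random.seed(0)
a, mu, sigma = 2.0, 1.0, 0.3   # arbitrary illustrative parameters

draws = [random.gauss(mu, sigma) for _ in range(200_000)]
mc_estimate = sum(math.exp(-a * c) for c in draws) / len(draws)
exact = math.exp(-a * mu + a ** 2 * sigma ** 2 / 2)

print(abs(mc_estimate - exact) / exact < 0.01)   # True: simulation matches formula

tilt = a * sigma ** 2 / 2   # upward consumption tilt from equation (6.47)
print(round(tilt, 3))
```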
This particular result obviously relies on very specific assumptions about the form of the
utility function and the distribution of uncertain outcomes. However, since almost all utility
functions feature positive third derivatives, the key property underlying the precautionary
savings result (marginal utility averaged over the uncertain outcomes being higher than marginal
utility at the average level of consumption) will generally hold. It is an important result because
some of the more important changes in the savings rate observed over time appear consistent with
this type of precautionary savings behaviour. So, for example, during the global financial
crisis, when there was so much uncertainty about how long the recession would last and what
impact it would have, it is very likely that this greater uncertainty depressed consumption.
Time-Varying Interest Rates
One simplification that we have made up to now is that consumers expect a constant return
on assets. Here, we allow expected asset returns to vary. The first thing to note here is that
one can still obtain an intertemporal budget constraint via the repeated substitution method:

\sum_{k=0}^{\infty} \frac{E_t C_{t+k}}{\prod_{m=1}^{k+1}(1+r_{t+m})} = A_t + \sum_{k=0}^{\infty} \frac{E_t Y_{t+k}}{\prod_{m=1}^{k+1}(1+r_{t+m})}

where \prod_{m=1}^{h} x_m means the product of x_1, x_2, ..., x_h. The steps to derive this are identical to the
steps used to derive equation (71) in the previous set of notes (Rational Expectations and
Asset Prices).
The optimisation problem of the consumer does not change much. This problem now has
the Lagrangian

L(C_t, C_{t+1}, ....) = \sum_{k=0}^{\infty} \left(\frac{1}{1+\beta}\right)^k U(C_{t+k}) + \lambda \left[A_t + \sum_{k=0}^{\infty} \frac{E_t Y_{t+k}}{\prod_{m=1}^{k+1}(1+r_{t+m})} - \sum_{k=0}^{\infty} \frac{E_t C_{t+k}}{\prod_{m=1}^{k+1}(1+r_{t+m})}\right]
The first-order conditions imply an Euler equation of the form

U'(C_t) = \frac{1}{1+\beta} E_t\left[(1+r_{t+1}) U'(C_{t+1})\right]

or, letting

R_t = 1 + r_t   (6.55)

we can write this as

U'(C_t) = \frac{1}{1+\beta} E_t\left[R_{t+1} U'(C_{t+1})\right]   (6.56)

Previously, we had used an equation like this to derive the behaviour of consumption, given an
assumption about the determination of asset returns. However, Euler equations have taken on
a double role in modern economics because they are also used to consider the determination
of asset returns, taking the path of consumption as given. The Euler equation also takes on
greater importance than it might seem based on our relatively simple calculations because,
once one extends the model to allow the consumer to allocate their wealth across multiple
asset types, it turns out that equation (6.56) must hold for all of these assets. This means
that for any asset i with uncertain return R_{i,t+1}, we must have

U'(C_t) = \frac{1}{1+\beta} E_t\left[R_{i,t+1} U'(C_{t+1})\right]   (6.57)
So, for example, consider a pure risk-free asset that pays a guaranteed rate of return next
period. The nearest example in the real world is a short-term US Treasury bill. Because there
is no uncertainty about this rate of return, call it R_{f,t+1}, it can be taken outside the
expectation term:

U'(C_t) = \frac{R_{f,t+1}}{1+\beta} E_t\left[U'(C_{t+1})\right]   (6.58)

which implies

R_{f,t+1} = \frac{(1+\beta) U'(C_t)}{E_t\left[U'(C_{t+1})\right]}   (6.59)
To think about the relationship between risk-free rates and returns on other assets, it is
useful to recall the definition of a covariance: The expectation of a product of two variables equals
the product of the expectations plus the covariance between the two variables. This allows one
to re-write (6.57) as

U'(C_t) = \frac{1}{1+\beta} \left[E_t(R_{i,t+1}) E_t(U'(C_{t+1})) + \mathrm{Cov}(R_{i,t+1}, U'(C_{t+1}))\right]   (6.61)

Note now that, by equation (6.58), the left-hand-side of this equation equals \frac{R_{f,t+1}}{1+\beta} E_t[U'(C_{t+1})].
So, we have

E_t(R_{i,t+1}) = R_{f,t+1} - \frac{\mathrm{Cov}(R_{i,t+1}, U'(C_{t+1}))}{E_t\left[U'(C_{t+1})\right]}   (6.63)
This equation tells us that the expected rate of return on risky assets equals the risk-free rate
minus a term that depends on the covariance of the risky return with the marginal utility of
consumption. This equation is known as the Consumption Capital Asset Pricing Model or
Consumption CAPM, and it plays an important role in modern finance. Most asset returns
depend on payments generated by the real economy and so they are procyclical: they do
better in expansions than during recessions. However, the usual assumption of diminishing
marginal utility implies that U' depends negatively on consumption. This means that the
covariance term is negative for assets whose returns are positively correlated with consumption,
and these assets will have a higher rate of return than the risk-free rate. Indeed, the higher
the correlation of the asset return with consumption, the higher will be the expected return.
Underlying this behaviour is the fact that consumers would like to use assets to hedge
against consumption variations. Given two assets that have the same rate of return, a risk-averse
consumer would prefer the one that was negatively correlated with consumption to the
one that is positively correlated with consumption. For investors to be induced into holding
both assets, the rate of return on the asset with a positive correlation with consumption
needs to be higher.
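A toy two-state example may help to see the consumption CAPM at work. Everything here (probabilities, consumption levels, payoffs, the risk-free rate, the CRRA coefficient) is an invented illustration of equation (6.63), not data from the text:

```python
# Toy consumption-CAPM calculation, equation (6.63):
#   E(R_i) = R_f - Cov(R_i, U'(C)) / E[U'(C)]
# Two equally likely states (boom, recession); all numbers are made up.
probs = [0.5, 0.5]
consumption = [1.05, 0.95]                # consumption in boom and recession
gamma = 2.0                               # CRRA coefficient
mu = [c ** -gamma for c in consumption]   # marginal utility U'(C) = C^-gamma

def mean(x):
    return sum(p * v for p, v in zip(probs, x))

def cov(x, y):
    return mean([a * b for a, b in zip(x, y)]) - mean(x) * mean(y)

r_f = 1.02                    # gross risk-free rate (assumed)
returns = [1.10, 0.98]        # a procyclical asset: pays more in the boom

implied = r_f - cov(returns, mu) / mean(mu)
print(round(implied, 3))      # 1.026: above the risk-free rate
```

Because the asset pays off when consumption is high and marginal utility is low, its covariance with U'(C) is negative, and equation (6.63) assigns it an expected return above R_f.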
In theory, the consumption CAPM should be able to explain to us why some assets, such
as stocks, tend to have such high returns while others, such as government bonds, have such
low returns. However, it turns out that it has great difficulty in doing so. In the US, the
average real return on stocks over the long run has been about six percent per year while the
average return on Treasury bonds has been about one percent per year. In theory, this could
be explained by the positive correlation between stock returns and consumption. In practice,
this is not so easy. Most studies use simple utility functions such as the Constant Relative
Risk Aversion (CRRA) preferences

U(C_t) = \frac{C_t^{1-\gamma}}{1-\gamma}   (6.64)

so marginal utility is

U'(C_t) = C_t^{-\gamma}   (6.65)
For plausible values of the coefficient of relative risk aversion \gamma, the covariance term
on the right-hand side of (6.63) is not nearly big enough to justify the observed equity premium. It
requires values such as \gamma = 25, which turns out to imply people are incredibly risk averse: For
instance, it implies they are indifferent between a certain 17 percent decline in consumption
and a 50-50 risk of either no decline or a 20 percent decline. One way to explain this finding
is as follows. In practice, consumption tends to be quite smooth over the business cycle (our
earlier model helps to explain why) so, for standard values of \gamma, marginal utility doesn't change
that much over the cycle and one doesn't need to worry too much about equities being procyclical.
However, if \gamma is very high, then the gap between marginal utility in booms and recessions
is much bigger: Marginal utility is really high in recessions and consumers really want an asset
that pays off well during these times.
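The risk-aversion claim can be verified directly. With CRRA utility and γ = 25, the certainty equivalent of a 50-50 gamble between no decline and a 20 percent decline in consumption is about a 17-18 percent certain decline:

```python
# Check of the gamma = 25 claim: certainty equivalent of a 50-50 gamble
# between consumption of 1.0 and 0.8 under CRRA utility.
gamma = 25.0

def u(c):
    return c ** (1 - gamma) / (1 - gamma)

def certainty_equivalent(outcomes, probs):
    expected_utility = sum(p * u(c) for c, p in zip(outcomes, probs))
    return (expected_utility * (1 - gamma)) ** (1 / (1 - gamma))

ce = certainty_equivalent([1.0, 0.8], [0.5, 0.5])
print(round(1 - ce, 3))   # 0.177: close to the 17 percent figure in the text
```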
One route that doesn't seem to work is arguing that people really are that risk averse, i.e.
that \gamma = 25 somehow is a good value. The reason for this is that this value of \gamma would imply
a much higher risk-free rate than we actually see. Plugging the CRRA utility function into
our formula (6.59) for the risk-free rate, we get

R_{f,t+1} = \frac{(1+\beta) C_t^{-\gamma}}{E_t\left[C_{t+1}^{-\gamma}\right]}   (6.67)
Neglecting uncertainty about consumption growth, this formula implies that, on average, the
risk-free rate will be

R_f = (1+\beta)(1+g_C)^{\gamma}   (6.68)

where g_C is the growth rate of consumption. Plugging in the average growth rate of consumption,
a value of \gamma = 25 would imply a far higher risk-free rate than we actually see on
government bonds.
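Equation (6.68) makes this problem easy to quantify. With illustrative values β = 0.02 and g_C = 0.02 (assumptions made for this sketch, not values from the text), γ = 25 delivers an absurdly high risk-free rate:

```python
# Net risk-free rate implied by equation (6.68): R_f = (1 + beta)(1 + g_C)^gamma
# beta and g_C below are illustrative assumptions.
beta, g_c = 0.02, 0.02

def risk_free_rate(gamma):
    return (1 + beta) * (1 + g_c) ** gamma - 1

print(round(risk_free_rate(2), 3))    # 0.061: plausible for a standard gamma
print(round(risk_free_rate(25), 3))   # 0.673: wildly above observed rates
```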
There is now a very large literature dedicated to solving the equity premium and risk-free
rate puzzles.10

10 The paper that started this whole literature is Rajnish Mehra and Edward Prescott, "The Equity Premium: A Puzzle", Journal of Monetary Economics, 15, 145-161. For a review, see Narayana Kocherlakota, "The Equity Premium: It's Still a Puzzle", Journal of Economic Literature, 34, 42-71.
Chapter 7
Exchange Rates, Interest Rates and
Expectations
Our next example of the role of expectations in macroeconomics is an important one: The link
between interest rates and exchange rates and the behaviour of flexible exchange rates.

Why do exchange rates matter? Consider the euro-pound exchange rate, defined so that one
euro is worth X pounds. Now suppose X goes up, so the euro is worth more relative to the pound.
What will happen to exports from Ireland to the UK and imports to Ireland from the UK?
1. Exports: For each pound in sterling revenues that an Irish firm earns, they now get less
revenue in euros unless they increase their UK price. Because most of their costs (in
particular wages) will be denominated in euros, this means that exporting will become
less profitable at prevailing prices. Irish firms may react to this by increasing the price
they charge in the UK: This will reduce demand for their product, so exports will
still decline. Alternatively, some firms that feel they cannot raise prices to restore
profitability may simply exit from exporting. Between these two mechanisms, an increase
in the value of the euro relative to the pound will reduce Irish exports to the UK.
2. Imports: Because the value of the euro has increased, UK firms will get more sterling
revenues from exporting to Ireland at the same prices, so UK firms that hadn't previously
been exporting to Ireland may start to do so. Alternatively, UK firms already exporting
to Ireland may decide to lower their euro-denominated prices in Ireland and increase
their market share while still getting the same sterling revenue per unit. Either way,
Irish imports from the UK will increase.
So while an increase in the value of a country's currency may sound like a good thing,
it tends to reduce exports, increase imports, and thus reduce the country's real GDP. In
contrast, a depreciation of the currency boosts exports and has a positive effect on economic
growth. For these reasons, a depreciation of the currency is often welcome in a recession, and
the absence of this tool when the exchange rate is fixed is often pointed to as a downside of
such regimes. That said, there are some caveats to this argument:
1. Inflation: Depreciation tends to make imports more expensive and so add to inflation.
This is one reason why central bankers tend to say they favour a strong currency: they
are indicating their preference for low inflation. For small open economies that import
much of what they consume, this effect can be substantial.

2. Temporary Boost: The boost to growth from a devaluation is often temporary. Over
time, the increase in import prices may feed through to higher wages and this gradually
erodes the competitive benefits from devaluation. The more open an economy is, the
faster this erosion is likely to occur.
Free Movement of Capital: Uncovered Interest Parity
Consider the case where there is free mobility of capital: In other words, people can move
money from one country to another immediately and without incurring any fees or taxes.
Specifically, consider the case where money can flow easily between the US and the Euro area.
Suppose now that investors can buy either US or European risk-free one-period bonds, paying
interest rates i_t^{US} and i_t^{E} respectively, and let the exchange rate e_t
represent the amount of dollars that can be obtained in exchange for one euro: Currently e_t
is about 1.30.

Now let's think about the return to a US investor who wants to invest $1 in a euro-denominated
bond at time t and then convert the money back into dollars at time t+1. They
do this as follows. First, they exchange their $1 for 1/e_t euros and use this money to buy a
European bond, which will pay them (1+i_t^E)/e_t euros at time t+1. They then expect to
exchange these euros back into dollars at the rate E_t e_{t+1}, so they expect to end up with

\left(1+i_t^E\right)\frac{E_t e_{t+1}}{e_t} \text{ dollars}
If we abstract from risk aversion (the exchange rate movement is presumably uncertain), then
investors will be indifferent between the two bonds when this expected payoff equals the
1 + i_t^{US} dollars obtained from the US bond:

\left(1+i_t^E\right)\frac{E_t e_{t+1}}{e_t} = 1 + i_t^{US}   (7.1)

This can be re-written as

\left(1+i_t^E\right)\left(1 + \frac{E_t e_{t+1} - e_t}{e_t}\right) = 1 + i_t^{US}   (7.2)

Multiplying out the terms on the left-hand-side, we get

1 + i_t^E + \frac{E_t e_{t+1} - e_t}{e_t} + i_t^E\,\frac{E_t e_{t+1} - e_t}{e_t} = 1 + i_t^{US}   (7.3)
Subtracting the 1 from each side, we get

i_t^E + \frac{E_t e_{t+1} - e_t}{e_t} + i_t^E\,\frac{E_t e_{t+1} - e_t}{e_t} = i_t^{US}   (7.4)

Since both i_t^E and \frac{E_t e_{t+1} - e_t}{e_t} are going to be relatively small, the product of them will usually
be close to zero, so the condition for the investor to be indifferent between the two investment
strategies is

i_t^E + \frac{E_t e_{t+1} - e_t}{e_t} = i_t^{US}   (7.5)

This condition, which says that the foreign interest rate plus the expected percentage change
in the value of the foreign currency should equal the domestic interest rate, is known as the
uncovered interest parity (UIP) condition.
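The accuracy of the approximation in (7.5) can be checked with plausible magnitudes (the interest rate and exchange rates below are invented for illustration):

```python
# Exact indifference condition (7.1) versus the UIP approximation (7.5).
# Interest rate and exchange rate values are illustrative assumptions.
i_e = 0.02       # euro-area one-period interest rate
e_t = 1.30       # dollars per euro today
e_next = 1.33    # expected dollars per euro next period

exact_us = (1 + i_e) * e_next / e_t - 1    # US rate making (7.1) hold exactly
approx_us = i_e + (e_next - e_t) / e_t     # approximation (7.5)

print(round(exact_us, 4))    # 0.0435
print(round(approx_us, 4))   # 0.0431
# The gap is the cross-product term i_e * (E[e_{t+1}] - e_t)/e_t, which is tiny.
```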
Why should we expect this condition to hold? Why would we expect investors to be
indifferent between US and European bonds? Well, suppose it turned out that the European
bonds offered a better deal than the US bonds: The combination of interest rate and expected
exchange rate appreciation makes the rate of return on European bonds better than that on
US bonds. Well, if there is perfect capital mobility, then this would mean that there would be
a rush for investors to purchase European bonds rather than US bonds. European institutions
who borrow via selling these bonds (governments, highly rated corporations) would figure out
that they could borrow at a lower interest rate and still find investors willing to buy their
bonds rather than US bonds. By this logic, deviations from Uncovered Interest Parity (UIP)
should be temporary, with borrowers adjusting the interest rates on their bonds to ensure that
expected returns are equalised across countries.
Note that the UIP condition states that if European interest rates are lower than US rates, then
the euro must be expected to appreciate. This might seem counter-intuitive: Before reading this,
you might have expected the country that has higher interest rates to be the one with an
appreciating currency.
If the UIP relationship approximately holds, then this has important implications for the
links between a country's choice of exchange rate regime and its choice of monetary policy.
Specifically, if UIP holds, then it is not possible to have all three of the following:

1. Free capital mobility (money moving freely in and out of the country).

2. A fixed exchange rate.

3. An independent monetary policy, with interest rates set to suit domestic conditions.

You can have any two of these three things, but not the third:

1. You can have free capital mobility and a fixed exchange rate (so that E_t e_{t+1} = e_t) but
then your interest rates must equal those of the area you have fixed exchange rates
against (i_t^{US} = i_t^E). For example, Ireland had a fixed exchange rate with the UK for
many years and interest rates here were the same as in the UK.
2. You can have free capital mobility and set your own monetary policy (i_t^{US} ≠ i_t^E) but
then your exchange rate cannot simply be fixed (so that E_t e_{t+1} ≠ e_t). For example, in
the UK, the Bank of England sets short-term interest rates and the sterling exchange
rate floats against other currencies.

3. You can set your own monetary policy and fix your exchange rate against another country,
but then you must intervene in capital markets to prevent people taking advantage
of investment arbitrage opportunities. For example, China has a fixed exchange rate
with the US dollar and also sets its own monetary policy but it does not allow free
movement of capital.
This idea that you can only have two from three of free capital mobility, a fixed exchange
rate and independent monetary policy is commonly known as the trilemma of international
finance.
Let's think about how exchange rates should behave under free capital mobility. Recall our example involving US and European bonds. The condition for the expected return on the two bonds to be equal is

\left(1 + i_t^E\right) \frac{E_t e_{t+1}}{e_t} = 1 + i_t^{US}    (7.6)
You may have thought at this point that you had escaped from first-order stochastic difference
equations. Unfortunately not. Equation (7.6) isn't a linear first-order stochastic difference equation of the type that we have studied up to now. However, if we take logs, it becomes

\log\left(1 + i_t^E\right) + E_t \log e_{t+1} - \log e_t = \log\left(1 + i_t^{US}\right)    (7.7)
This is a linear stochastic difference equation describing the properties of the log of the exchange rate. Re-arranging, we get

\log e_t = \log\left(1 + i_t^E\right) - \log\left(1 + i_t^{US}\right) + E_t \log e_{t+1}    (7.8)
Going back to our description of first-order stochastic difference equations, this is another example of an equation of the form y_t = a x_t + b E_t y_{t+1}, this time with y_t = \log e_t, x_t = \log\left(1 + i_t^E\right) - \log\left(1 + i_t^{US}\right), and a = b = 1. If we apply the repeated substitution technique, we get

\log e_t = \sum_{k=0}^{\infty} E_t \left[ \log\left(1 + i_{t+k}^E\right) - \log\left(1 + i_{t+k}^{US}\right) \right]    (7.9)
It turns out, however, that this is not the only possible solution. To see this, note that for any constant \log e^*, equation (7.8) can equally be written as

\log e_t - \log e^* = \log\left(1 + i_t^E\right) - \log\left(1 + i_t^{US}\right) + E_t\left(\log e_{t+1} - \log e^*\right)    (7.10)
In other words, because the coefficient on the expected future exchange rate equals one (because b = 1), the repeated substitution method works not just for \log e_t but for any \log e_t - \log e^*, giving solutions of the form

\log e_t = \log e^* + \sum_{k=0}^{\infty} E_t \left[ \log\left(1 + i_{t+k}^E\right) - \log\left(1 + i_{t+k}^{US}\right) \right]    (7.11)
where the theory does not predict what the value of e^* is. Because the natural log function satisfies \log(1 + x) \approx x for small x, this can be approximated as

\log e_t = \log e^* + \sum_{k=0}^{\infty} E_t\left(i_{t+k}^E - i_{t+k}^{US}\right)    (7.12)
UIP tells us something about the dynamics of the exchange rate but it does not make
definitive predictions about the level an exchange rate should be at, i.e. it does not pin
down a unique value of e. Other theories, such as Purchasing Power Parity (the idea
that exchange rates should adjust so each currency has equivalent purchasing power) do make such predictions, though they don't work very well in practice.
This unexplained e^* can be seen as a sort of long-run equilibrium exchange rate, because this is the rate that holds when the average interest rate on European bonds is expected to equal the average interest rate on US bonds.
The model predicts that deviations from the long-run exchange rate e^* are determined by expectations that interest rates will differ across areas. In this example, the euro will be higher than e^* if people expect European interest rates to be higher in the future than
US rates.
The model explains the slightly puzzling result we discussed earlier: that higher interest rates in Europe imply the euro is expected to depreciate. Suppose in period t-1, Euro and US interest rates were equal to each other and expected to stay that way. Equation (7.12) implies that under these circumstances we would have \log e_{t-1} = \log e^*. Now suppose that, in period t, Euro interest rates unexpectedly went above US interest rates just for one period. What would happen? The Euro must end up back at e^* (because interest rates in the two areas are going to equal each other after period t) and the Euro must also be expected to depreciate between period t and period t+1 (because Euro rates are temporarily above US rates).
So, in response to the surprise temporary increase in European interest rates, the Euro immediately jumps upwards and then depreciates back to e^*. This conforms with our intuition that higher European interest rates should make the Euro more attractive.
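This jump-and-depreciate pattern can be traced out with a short numerical sketch of equation (7.12). The numbers below (a one-period, 2-percentage-point interest differential) are invented purely for illustration:

```python
import numpy as np

# Illustrative sketch of equation (7.12):
# log e_t = log e* + sum_{k>=0} E_t (i^E_{t+k} - i^US_{t+k}).
# The one-period, 2-percentage-point differential is a made-up example.
log_e_star = 0.0               # long-run log exchange rate e*
T = 8
diff = np.zeros(T)             # expected path of i^E - i^US
diff[0] = 0.02                 # surprise differential in period 0 only

# At each date t, the log exchange rate is log e* plus the differentials
# expected from t onwards (expectations are realised here for simplicity).
log_e = [log_e_star + float(diff[t:].sum()) for t in range(T)]

print(log_e[0])   # 0.02: the euro jumps up on the surprise...
print(log_e[1])   # 0.0:  ...then depreciates straight back to e*
```

The euro appreciates on impact by exactly the size of the differential and is then expected to depreciate by that amount over the following period, which is just what UIP requires.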
During the period after the second world war up to the 1970s, most of the world's economies
operated the so-called Bretton Woods system of quasi-fixed exchange rates. The 1970s saw
the widespread introduction of market-determined flexible exchange rates. Prior to the introduction of this system, advocates of market-based flexible exchange rates had predicted that exchange rates would be relatively stable, adjusting gradually in line with economic fundamentals.
The truth turned out to be the opposite: exchange rates change by very large amounts on a daily, weekly and monthly basis. See Figure 7.1, which shows the Euro-Dollar exchange rate. It has also gone through big swings, reaching lows of 0.8 in 2000 and highs of 1.6 in 2008. In addition, there are often large day-to-day movements where the exchange rate will go up or down by large amounts.
The model just developed, combining UIP with rational expectations, helps to explain why exchange rates are so volatile. Using equation (7.12) for the level of exchange rates, the change in the exchange rate can be written as

\Delta \log e_t = \sum_{k=0}^{\infty} E_t\left(i_{t+k}^E - i_{t+k}^{US}\right) - \sum_{k=0}^{\infty} E_{t-1}\left(i_{t-1+k}^E - i_{t-1+k}^{US}\right)    (7.13)
We will simplify this a bit via a slightly loose piece of notation, meaning that we will write (E_t - E_{t-1}) x_{t+k} to mean E_t x_{t+k} - E_{t-1} x_{t+k}, i.e. this means the change between time t-1 and time t in what people expect x_{t+k} to be. Given this, we can re-write the previous equation as
\Delta \log e_t = i_{t-1}^{US} - i_{t-1}^E + \sum_{k=0}^{\infty} \left(E_t - E_{t-1}\right)\left(i_{t+k}^E - i_{t+k}^{US}\right)    (7.14)
This equation tells us a lot about how exchange rates should behave if investors have
rational expectations. Exchange rate changes reflect not only the expected change due to the past interest rate differential but also the surprise changes in the projected path of future interest rate differentials. This means that all information that affects expectations of future Euro-area and US interest rates feeds directly into today's
exchange rate. Because interest rates are set by central banks in response to developments in
the macroeconomy, this means that exchange rates should react to all types of macroeconomic
news.
Problems for the UIP-Rational Expectations Theory
The UIP theory helps to explain a number of important aspects of the behaviour of exchange
rates. However, there have been many examples of where the theory just outlined does not
seem to work well. Indeed, quite commonly, there have been examples where the theory has predicted for an extended period of time that a currency depreciation or appreciation should occur and the opposite has happened.
One potential explanation for this apparent failure that could still be consistent with
the model is that E_t e_{t+1} - e_t is not the same as e_{t+1} - e_t: the mathematical expectation of something and its actual outcome can sometimes differ from each other for quite a while. This
is sometimes called the Peso problem. Sometimes interest rates in developing economies (such as Mexico, after which the term is named) are high because markets think there is a probability (perhaps a small probability) that a large depreciation may be coming. Just because the depreciation doesn't happen during a particular sample doesn't mean the expectation was irrational.
But evidence also seems to exist of more systematic errors for the UIP theory. Take one example: Japanese interest rates were well below European levels for most of the decade from 2001 to 2008. The UIP-Rational Expectations approach would have predicted that the Yen should have been appreciating against the Euro. In fact, the opposite happened systematically from 2001 to 2008: see Figure 7.2. Many traders systematically exploited this,
borrowing at low interest rates in Yen, using the funds to buy Euro bonds that yielded higher
interest rates and then repaying their debts in depreciated Yenthe so-called Yen carry trade.
That said, as Figure 7.2 also shows, the carry trade unwound itself fairly spectacularly in 2008.
The leading explanations for the apparent failures of the UIP-RE theory involve introducing risk aversion (we have assumed investors are risk-neutral) and home bias (the preference
for assets denominated in your home currency). For instance, in relation to the theory's failure to explain the Yen carry trade period, it's worth noting that many Japanese investors have a strong preference for Yen-denominated assets and don't want to take on the extra exchange rate risk involved in holding foreign-currency bonds.
These kinds of preferences may lead to short-term violations of the stronger predictions
of the UIP-RE theory. However, they will not allow countries to escape from the restrictions
of the Trilemma: a country that attempts to adopt a systematically different interest rate policy than another country simply will not be able to maintain a fixed exchange rate with that country.
Figure 7.2: Daily Data on the Euro-Yen Exchange Rate
Chapter 8
Sticky Prices and the Phillips Curve
One of the important themes of macroeconomics is that the behaviour of prices is crucial in determining how the macro-economy responds to shocks. In the IS-LM model, we needed to
assume that prices were sticky in the short-run to obtain real effects for fiscal and monetary
policy but we assumed that prices were flexible in the long-run so that the economy returned
to its full employment level over time. In the IS-MP-PC theory, we formalised this idea a bit
more: this model featured prices that adjusted gradually over time in response to the state of the real economy.
In these notes, we will return to the topic of price setting and the relationship over time
between inflation and the business cycle. We will emphasise the role of price flexibility and
expectations.
When we discussed IS-LM, we assumed that the price level did not keep moving to constantly
equate GDP with the level of output consistent with a natural rate of unemployment. Instead,
we assumed that prices only changed gradually over time in response to the real economy.
The idea that prices may be sticky has a long history in Keynesian macroeconomics but,
until recent decades, there was comparatively little evidence on the extent to which prices actually are sticky.
This has changed since the statistical agencies have made available the micro-data that
underlie Consumer Price Indices. To construct CPIs, these agencies collect large numbers of
quotes of prices on individual items (e.g. they can tell you the price in April of a bottle of
Heinz ketchup at a particular store). These individual price quote data can be used to assess how frequently individual prices change.
Studies of this type now exist for a large number of countries. For example, Bils and Klenow's 2004 paper provided evidence for consumer prices in the United States.1 An important
finding from this research is that the data show a very wide range of the frequency with which
different prices change. Figure 8.1 shows a histogram from Bils and Klenow's paper showing the distribution of the monthly percentage probability of a price change. These vary
from prices that only have a one percent probability of changing each month (Coin-operated
apparel laundry and dry cleaning) to those that have an 80 percent probability of changing in any given month.
The table on the following page shows that the median price duration is about four months. In other words, half of the prices quoted in the CPI index last less than four months before changing, while the other half last longer. Research for the euro area has shown
that price durations are even longer in Europe. For example, Alvarez et al (2006) report a median price duration for the euro area of 10.6 months.2
1 Mark Bils and Peter Klenow (2004). "Some Evidence on the Importance of Sticky Prices," Journal of Political Economy, Volume 112, Number 5.
2 Luis Alvarez, Emmanuel Dhyne, Marco Hoeberichts, Claudia Kwapil, Herve Le Bihan, Patrick Lunnemann, Fernando Martins, Roberto Sabbatini, Harald Stahl, Philip Vermeulen and Jouko Vilmunen (2006). "Sticky Prices in the Euro Area: A Summary of New Micro-Evidence," Journal of the European Economic Association.
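The link between the monthly probability of a price change and the implied length of a price spell can be made concrete with a little arithmetic. The 25 percent figure below is invented for illustration, not taken from the studies cited:

```python
import math

def mean_duration(p):
    """Average spell length in months when a price changes with probability p each month."""
    return 1.0 / p

def median_duration(p):
    """Number of months by which half of all price spells have ended."""
    return math.log(0.5) / math.log(1.0 - p)

# A price with a 25 percent monthly chance of changing lasts 4 months on average,
print(mean_duration(0.25))               # → 4.0
# but the median spell is shorter, since spell lengths are skewed to the right:
print(round(median_duration(0.25), 2))   # → 2.41
```

This is why median and mean duration figures reported in this literature can differ noticeably for the same underlying change frequency.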
Figure 8.1: The Distribution of Monthly Percent Probability of Price Changes
Bils and Klenow Evidence on Price Durations
New Classical and New Keynesian Macroeconomics
After Milton Friedman's critique of the Phillips curve, macroeconomists began to pay more
attention to the question of how expectations were formed. In particular, a number of papers
by Robert Lucas and Thomas Sargent introduced rational expectations into macroeconomic
modelling. These early papers tended to assume that prices were perfectly flexible, which
limited the ability of fiscal and monetary policy to influence output. This school of thought became known as New Classical macroeconomics.
In a number of famous New Classical papers, Robert Lucas argued that monetary policy could still have short-run effects even if prices were flexible and people had rational expectations. Lucas's model relied on the idea that firms had difficulty in the short run distinguishing between movements in their own prices and movements in the overall price level. For
this reason, an increase in the money supply that provoked an increase in prices could, in the
short-run, provoke higher output because firms may believe this is increasing their relative
price and making production more profitable. Lucas emphasised, however, that once people
had rational expectations, the impact of policy on output could only be short-lived. In par-
ticular, he stressed that only unpredictable fiscal and monetary policies would have an impact
because people with rational expectations would anticipate the impact of predictable policy changes.
Once we allow prices to be sticky, however, these points no longer hold. Because some prices
will not change even after the government changes fiscal or monetary policy, these policies
will have the traditional short-run impacts described in the IS-LM model even if people have
rational expectations. There are lots of different ways of formulating the idea that prices
may be sticky. Some of the best known formulations were those introduced in papers in
the late seventies by John Taylor and Stanley Fischer.3 These papers assumed that only a
certain fraction of firms set prices each period but those who did change their prices would set
them in an optimal manner using rational expectations. This work, which combined rational
expectations with sticky prices, invented what is now known as New Keynesian economics.
Pricing à la Calvo
The New Keynesian literature contains a number of different formulations of sticky prices.
For the rest of these notes, we will use a formulation of sticky prices known as Calvo pricing,
after the economist who first introduced it.4 Though not the most realistic formulation of
sticky prices, it turns out to provide analytically convenient expressions, and has implications
that are very similar to those of more realistic (but more complicated) formulations.
The form of price rigidity faced by the Calvo firm is as follows. Each period, only a random fraction (1 - \theta) of firms are able to reset their price; all other firms keep their prices unchanged. When firms do get to reset their price, they must take into account that the price may be fixed for many periods. We assume they do this by choosing a log-price, z_t, that minimizes

\sum_{k=0}^{\infty} (\beta\theta)^k E_t\left(z_t - p_{t+k}^*\right)^2    (8.1)

where \beta is between zero and one, and p_{t+k}^* is the log of the optimal price that the firm would set in period t+k if there were no price rigidities.
This expression probably looks a bit intimidating, so it's worth discussing its components in turn.
The term E_t\left(z_t - p_{t+k}^*\right)^2 describes the expected loss in profits for the firm at time t+k due to the fact that it will not be able to set a frictionless optimal price in that period. This quadratic function is intended just as an approximation to some more general profit function. What is important here is to note that because the firm may be stuck with the price z_t for some time, it will lose profits relative to what it would have been able to earn with fully flexible prices.
The summation \sum_{k=0}^{\infty} shows that the firm considers the implications of the price set today for all future periods.
However, the fact that \beta < 1 implies that the firm places less weight on future losses than on today's losses. A dollar today is worth more than a dollar tomorrow because it can be re-invested. By the same argument, a dollar lost today is more important than a dollar lost tomorrow.
Future losses are actually discounted at rate (\beta\theta)^k, not just \beta^k. This is because the firm only considers the expected future losses from the price being fixed at z_t. The chance that the price will still be fixed at t+k is \theta^k, so the period t+k loss is weighted by this probability. There is no point in the firm worrying too much about losses that might occur from having the wrong price far off in the future, when it is unlikely that the price will still be in place.
The Optimal Reset Price
After all that, the actual solution for the optimal value of z_t (i.e. the price chosen by the firms who get to reset) is quite simple. Each of the terms featuring the choice variable z_t, that is, each of the \left(z_t - p_{t+k}^*\right)^2 terms, needs to be differentiated with respect to z_t and the sum of these derivatives set equal to zero. Separating out the z_t terms from the p_{t+k}^* terms, this implies

\left[\sum_{k=0}^{\infty} (\beta\theta)^k\right] z_t = \sum_{k=0}^{\infty} (\beta\theta)^k E_t p_{t+k}^*    (8.3)
Now, we can use our old pal the geometric sum formula to simplify the left side of this equation, giving

z_t = (1 - \beta\theta) \sum_{k=0}^{\infty} (\beta\theta)^k E_t p_{t+k}^*

Stated in English, all this equation says is that the optimal solution is for the firm to set its price equal to a weighted average of the prices that it would have expected to set in the future if there weren't any price rigidities. Unable to change price each period, the firm chooses to get it right on average over the periods during which the price will be fixed.
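As a numerical check, the weighted-average formula for the reset price can be compared against a brute-force minimisation of the quadratic loss. The parameter values and the expected path of frictionless prices below are invented for illustration:

```python
import numpy as np

beta, theta = 0.99, 0.75        # made-up discount and stickiness parameters
K = 500                         # truncation of the infinite sum
weights = (beta * theta) ** np.arange(K)
p_star = 0.02 * np.arange(K)    # made-up expected path E_t p*_{t+k}

# Reset price from the weighted-average formula
z_formula = (1 - beta * theta) * np.sum(weights * p_star)

# Brute force: minimise sum_k (beta*theta)^k (z - p*_{t+k})^2 over a grid
grid = np.linspace(p_star[0], p_star[-1], 20001)
losses = [float((weights * (z - p_star) ** 2).sum()) for z in grid]
z_grid = grid[int(np.argmin(losses))]

print(abs(z_formula - z_grid) < 1e-3)   # → True: the two answers coincide
```

The grid search and the formula agree to within the grid spacing, confirming that the weighted average really is the loss-minimising choice.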
And what is this frictionless optimal price, p_t^*? We will assume that the firm's optimal pricing strategy without frictions would involve setting its price as a fixed markup \mu over marginal cost:

p_t^* = \mu + mc_t    (8.7)

Substituting this into the reset-price formula gives

z_t = (1 - \beta\theta) \sum_{k=0}^{\infty} (\beta\theta)^k E_t\left(\mu + mc_{t+k}\right)    (8.8)
Now, we can show how to derive the behaviour of aggregate inflation in the Calvo economy. The aggregate price level in this economy is just a weighted average of last period's aggregate price level and the new reset price, where the weight is determined by \theta:

p_t = \theta p_{t-1} + (1 - \theta) z_t    (8.9)

This can be re-arranged to express the reset price as a function of the current and past aggregate price levels. Examining equation (8.8), we can see that z_t must obey a first-order stochastic difference equation of the form y_t = a x_t + b E_t y_{t+1} with

y_t = z_t    (8.13)
x_t = \mu + mc_t    (8.14)

a = 1 - \beta\theta    (8.15)

b = \beta\theta    (8.16)

Using these values together with equation (8.9), we obtain

\frac{1}{1-\theta}\left(p_t - \theta p_{t-1}\right) = \frac{\beta\theta}{1-\theta}\left(E_t p_{t+1} - \theta p_t\right) + (1 - \beta\theta)\left(\mu + mc_t\right)    (8.18)

which can be re-arranged to give

\pi_t = \beta E_t \pi_{t+1} + \frac{(1-\theta)(1-\beta\theta)}{\theta}\left(\mu + mc_t - p_t\right)    (8.19)

where \pi_t = p_t - p_{t-1} is the inflation rate.
This equation is known as the New-Keynesian Phillips Curve. It states that inflation is a function of two things: expected inflation next period, E_t \pi_{t+1}, and the gap between the frictionless optimal price level \mu + mc_t and the current price level p_t. Another way to state this is that inflation depends positively on real marginal cost, mc_t - p_t.
Why is real marginal cost a driving variable for inflation? Firms in the Calvo model would
like to keep their price as a fixed markup over marginal cost. If the ratio of marginal cost to
price is getting high (i.e. if mc_t - p_t is high) then this will spark inflationary pressures because
those firms that are re-setting prices will, on average, be raising them.
For simplicity, we will denote the deviation of real marginal cost from its frictionless level of -\mu as

mc_t^r = mc_t - p_t + \mu

Unfortunately, we do not directly observe data on real marginal cost. National accounts data contain information on the factors that affect average costs, such as wages, but do not tell us about the cost of producing an additional unit of output. That said, it seems very likely that marginal costs are procyclical,
and more so than prices. When production levels are high relative to potential output, there
is more competition for the available factors of production, and this leads to increases in
real costs, i.e. increases in the costs of the factors over and above increases in prices. Some
examples of the procyclicality of real marginal costs are fairly obvious. For example, the
existence of overtime wage premia generally means a substantial jump in the marginal cost of
labour once output levels are high enough to require more than the standard workweek.
For these reasons, many researchers implement the NKPC using a measure of the output
gap (the deviation of output from its potential level) as a proxy for real marginal cost. In particular, assuming

mc_t^r = \gamma y_t    (8.22)
where y_t is the output gap. This implies a New-Keynesian Phillips curve of the form

\pi_t = \beta E_t \pi_{t+1} + \kappa y_t    (8.23)

where

\kappa = \frac{\gamma (1-\theta)(1-\beta\theta)}{\theta}    (8.24)
And this approach can be implemented empirically using various measures for estimating
potential output.5
The New-Keynesian approach assumes that firms have rational expectations. Thus, we can apply the repeated substitution method to solve this equation forward, giving

\pi_t = \kappa \sum_{k=0}^{\infty} \beta^k E_t y_{t+k}    (8.25)
Inflation today depends on the whole sequence of expected future output gaps. Thus, the NKPC sees inflation as behaving according to the classic asset-price logic that we saw earlier when discussing exchange rates.
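The forward solution can be checked numerically. In the sketch below the expected output gap decays geometrically, so the infinite sum has a closed form; all parameter values are invented for illustration:

```python
import numpy as np

beta, kappa = 0.99, 0.1      # made-up discount factor and output-gap coefficient
rho = 0.9                    # made-up decay rate of the expected output gap
K = 2000                     # truncation of the infinite sum

y_gap = 0.01 * rho ** np.arange(K)   # E_t y_{t+k} = 0.01 * rho^k

# Inflation as the discounted sum of expected output gaps, equation (8.25)
pi_forward = kappa * np.sum(beta ** np.arange(K) * y_gap)

# The same number from the geometric sum formula: kappa * y_0 / (1 - rho*beta)
pi_closed = kappa * 0.01 / (1 - rho * beta)

print(round(float(pi_forward), 6), round(pi_closed, 6))   # → 0.009174 0.009174
```

Note how a persistent expected gap (rho close to one) raises inflation today by much more than a transitory one would, which is the asset-price logic at work.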
The vast majority of macroeconomists now accept Friedman's critique of the original Phillips curve. Thus, it is widely accepted that inflation expectations will move upwards over time if output remains above its potential level, and that there is little or no scope for policy-makers to choose a tradeoff between inflation and output. However, as we discussed in earlier lecture
5 Roberts (1995) shows that a number of other models of sticky prices also imply a formulation for inflation similar to the New Keynesian Phillips curve.
notes, there is empirical evidence for a relationship of the form

\pi_t = \pi_{t-1} + \alpha\left(u^* - u_t\right)    (8.26)
So there is a relationship between the change in inflation and the level of unemployment. In
this formulation, the lagged inflation term reflects how last period's level of inflation changes people's expectations and so feeds into today's inflation. This so-called accelerationist Phillips
curve fits the data quite well (or, more precisely, empirical approaches based on a weighted average of past inflation rates, not just last period's, fit the data well) and comes with its own non-accelerating inflation rate of unemployment (NAIRU), u^*. This is the unemployment rate consistent with steady inflation, and policy recommendations are often made on the basis of whether unemployment is above or below this NAIRU level.6
The NKPC model provides a different view of this empirical relationship. While advocates
of the NKPC will concede that the accelerationist model, equation (8.26), fits the data reasonably well, they view this as a so-called reduced-form relationship, not a structural relationship. If inflation is really determined by the NKPC, so that

\pi_t = \beta E_t \pi_{t+1} + \kappa y_t    (8.28)

then equation (8.26) might have a good statistical fit because \pi_{t-1} is likely to be correlated with E_t \pi_{t+1}. However, they would warn policy-makers not to rely on this relationship, because
6 Note, though, that the NAIRU terminology is actually a misnomer. If unemployment is below u^*, then inflation will be increasing, but not accelerating. The price level is what will be accelerating. Perhaps the NAIRU should be changed to the NAPLRU, but this isn't so catchy so the slipped derivative is probably here to stay.
changes in policy may produce a break in the correlation between E_t \pi_{t+1} and \pi_{t-1}, and at that point the reduced-form relationship would no longer be reliable.
The NKPC also has important implications for how a government can approach reducing
inflation. Consider again the accelerationist Phillips curve, equation (8.26). The fact that
inflation depends on its own lagged values in this formulation means that it would be very
difficult to reduce inflation quickly without a significant increase in unemployment. So, this
Phillips curve suggests that gradualist policies are the best way to reduce inflation.
But the implications of the NKPC are completely different. There may be a statistical
relationship between current and lagged inflation but the NKPC says that there is no structural
relationship at all. Thus, there is no need for gradualist policies to reduce inflation. According
to the NKPC, low inflation can be achieved immediately by the central bank announcing (and
the public believing) that it is committing itself to eliminating positive output gaps in the future.
Whether the empirical evidence fits with the NKPCs predictions is open for debate. For
example, there has been plenty of evidence that reductions in inflation do tend to be costly
in terms of lost output and high unemployment. Some, however, have put this down to the
failure of governments and central banks to credibly convince the public of their commitment to low inflation.
Chapter 9
Investment With Adjustment Costs
In the previous chapters, we have seen a number of examples of forward-looking first-order stochastic difference equations of the form y_t = a x_t + b E_t y_{t+1}, whose solution via repeated substitution is

y_t = a \sum_{k=0}^{\infty} b^k E_t x_{t+k}    (9.2)
so that y_t is a completely forward-looking variable. Note that this means that y_t does not depend at all on its own past values. We will now turn to an example which does not have this property.
Specifically, we will look at a theory of the determination of the capital stock (and thus
investment). Empirical studies show that the capital stock does not change very much from
period to period. Economists usually rationalise this by assuming that there is some form of adjustment cost that prevents firms from changing their capital stock too quickly. In this
chapter, we will consider a model of investment with adjustment costs, show that it implies
a second-order stochastic difference equation, and examine the methods used to solve these
types of equations.
The Firm's Problem
Consider now the following model of firm investment. We will assume that, each period,
there is a level of the log of the capital stock, kt , that the firm would choose if there were
no adjustment costs. We will call this the frictionless optimal capital stock. With adjustment
costs, the firm has to choose a planned sequence of capital stocks \{k_t, E_t k_{t+1}, E_t k_{t+2}, \ldots\} to minimise the loss function

L\left(k_t, k_{t+1}, k_{t+2}, \ldots\right) = \sum_{m=0}^{\infty} \beta^m E_t\left[\left(k_{t+m} - k_{t+m}^*\right)^2 + c\left(k_{t+m} - k_{t+m-1}\right)^2\right]    (9.3)
This might look a bit intimidating but it's not too complicated:
Firstly, for each period t+m, there is a term \left(k_{t+m} - k_{t+m}^*\right)^2 that describes the loss in profits suffered by the firm from not having its capital stock equated with the frictionless optimal level.
Secondly, there is a term c\left(k_{t+m} - k_{t+m-1}\right)^2 which describes the concept of adjustment costs formally: ceteris paribus, changes in the capital stock have a negative effect on firm profits.
The reason we are assuming that kt is actually the log of the stock, as opposed to
the stock itself, is that this way these losses can be viewed in percentage terms: It is
the percentage gap between capital and its frictionless optimal that matters and also
the percentage change in the stock. This makes more sense than levels of these gaps
mattering because economic growth will make levels of these variables grow over time.
Finally, the parameter \beta is a discount rate less than one, which tells us that firms care more about current losses than about losses in the future.
This loss function can be re-written as

L\left(k_t, k_{t+1}, k_{t+2}, \ldots\right) = \left(k_t - k_t^*\right)^2 + c\left(k_t - k_{t-1}\right)^2 + \beta E_t\left[\left(k_{t+1} - k_{t+1}^*\right)^2 + c\left(k_{t+1} - k_t\right)^2\right] + \beta^2 E_t\left[\left(k_{t+2} - k_{t+2}^*\right)^2 + c\left(k_{t+2} - k_{t+1}\right)^2\right] + \ldots    (9.4)
An optimal plan is arrived at by differentiating this with respect to each of the capital stock terms k_{t+m} and setting these derivatives equal to zero. Consider first differentiating with respect to k_t:

2\left(k_t - k_t^*\right) + 2c\left(k_t - k_{t-1}\right) - 2\beta c E_t\left(k_{t+1} - k_t\right) = 0    (9.5)

Differentiating with respect to k_{t+1} gives

E_t\left[2\left(k_{t+1} - k_{t+1}^*\right) + 2c\left(k_{t+1} - k_t\right) - 2\beta c\left(k_{t+2} - k_{t+1}\right)\right] = 0    (9.6)
This is the exact same as the previous first-order condition, only shifted forward one period.
In fact, one can show that all of the FOCs describing the optimal dynamics of the capital stock take the form

E_t k_{t+1} - \left(1 + \frac{1}{\beta} + \frac{1}{\beta c}\right) k_t + \frac{1}{\beta} k_{t-1} = -\frac{1}{\beta c} k_t^*    (9.9)
Because the maximum difference between time subscripts is two, this is a second-order stochas-
tic difference equation. There are two different methods that are commonly used to solve
equations of this form. I will discuss the so-called factorization method. For completeness, I have also attached the derivation of the solution using the other method, known as the method of undetermined coefficients.
Lag Operators
The factorization method makes use of what are known as lag and forward operators. These are commonly used in calculations relating to time series, and they work as follows. The lag operator L takes a variable back one period in time: L y_t = y_{t-1}. Lag operators can be multiplied and added just like normal variables. So, for instance, one can write

L^k y_t = y_{t-k}    (9.11)
The forward operator has the reverse effect of the lag operator:

F^k y_t = y_{t+k}    (9.12)
Lag and forward operators also obey a form of the geometric sum formula. Recall that for a first-order stochastic difference equation

y_t = b E_t y_{t+1} + x_t    (9.14)

repeated substitution gives a forward-looking solution. Equation (9.14) can be re-written as

y_t = E_t\left[\frac{1}{1 - bF}\right] x_t = \sum_{k=0}^{\infty} b^k E_t x_{t+k}    (9.16)

Similarly, a backward-looking equation

y_t = \lambda y_{t-1} + x_t    (9.19)

can be re-written as y_t = \frac{1}{1 - \lambda L} x_t = \sum_{k=0}^{\infty} \lambda^k x_{t-k}.
Armed with this knowledge of lag and forward operators, we can solve the second-order stochastic difference equation (9.9).

The Factorization Method

This method first re-writes equation (9.9) in terms of lag and forward operators. Written this way, it is

E_t\left[F - \left(1 + \frac{1}{\beta} + \frac{1}{\beta c}\right) + \frac{1}{\beta} L\right] k_t = -\frac{1}{\beta c} k_t^*    (9.21)

Next, the method re-expresses the left-hand-side in terms of a quadratic equation in F multiplied by L:

E_t\left[F^2 - \left(1 + \frac{1}{\beta} + \frac{1}{\beta c}\right) F + \frac{1}{\beta}\right] L k_t = -\frac{1}{\beta c} k_t^*    (9.22)
Now, you may recall that polynomials of the form

g(x) = x^2 + bx + c    (9.23)

can be factored as

g(x) = (x - \lambda_1)(x - \lambda_2)    (9.24)

where

\lambda_1 + \lambda_2 = -b    (9.25)

\lambda_1 \lambda_2 = c    (9.26)

In our case, one can show that the quadratic

x^2 - \left(1 + \frac{1}{\beta} + \frac{1}{\beta c}\right) x + \frac{1}{\beta}    (9.27)

has two roots such that one root (\lambda) is between zero and one while the other equals \frac{1}{\beta\lambda}. This means that the optimality condition for the capital stock can be re-expressed as

E_t\left[(F - \lambda)\left(F - \frac{1}{\beta\lambda}\right) L k_t\right] = -\frac{1}{\beta c} k_t^*    (9.28)
Dividing across by F - \frac{1}{\beta\lambda}, this becomes

E_t\left[(F - \lambda) L k_t\right] = E_t\left[-\frac{1}{\beta c}\left(\frac{1}{F - \frac{1}{\beta\lambda}}\right) k_t^*\right]    (9.29)

Now we can use the properties of lag operators just derived to show that

\frac{1}{F - \frac{1}{\beta\lambda}} = \frac{-\beta\lambda}{1 - \beta\lambda F} = -\beta\lambda \sum_{k=0}^{\infty} (\beta\lambda)^k F^k    (9.30)

so the solution can be written as

k_t = \lambda k_{t-1} + \frac{\lambda}{c} \sum_{k=0}^{\infty} (\beta\lambda)^k E_t k_{t+k}^*    (9.31)
Note now how adding adjustment costs changes the solution for a rational expectations model. This produces a second-order difference equation, and the solution is no longer completely forward-looking: the capital stock still depends on a geometric discounted sum of expected future driving variables, but it also has a backward-looking component, whereby it depends on its own lagged value.
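The root structure claimed above can be verified numerically for particular parameter values. The values of \beta and c below are invented for illustration:

```python
import numpy as np

beta, c = 0.95, 2.0   # made-up discount factor and adjustment-cost parameter

# The quadratic x^2 - (1 + 1/beta + 1/(beta*c)) x + 1/beta from the factorization
coeffs = [1.0, -(1 + 1/beta + 1/(beta * c)), 1/beta]
roots = np.roots(coeffs).real        # both roots are real for these values
lam, other = roots.min(), roots.max()

print(0 < lam < 1)                          # → True: one stable root
print(abs(other - 1/(beta * lam)) < 1e-9)   # → True: the other equals 1/(beta*lam)
```

Because the product of the roots is 1/\beta, once one root lies inside the unit circle the other must lie outside it, which is exactly what makes the factorization into a stable backward part and an explosive (solved-forward) part possible.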
The model can be fleshed out by stating the determinants of the frictionless optimal capital stock. For instance, if the production function is of the Cobb-Douglas form, then the frictionless optimal capital stock is proportional to output divided by the cost of capital:

K_t^* = \frac{\alpha Y_t}{C_t}    (9.32)

where Y_t is output and C_t is the cost of capital. Using lower-case letters to denote logs (and ignoring the constant term), this can be written as

k_t^* = y_t - c_t    (9.33)
Now assume that output and the cost of capital both follow AR(1) processes:

y_t = \rho_y y_{t-1} + \epsilon_t^y    (9.34)

c_t = \rho_c c_{t-1} + \epsilon_t^c    (9.35)

These imply E_t y_{t+n} = \rho_y^n y_t and E_t c_{t+n} = \rho_c^n c_t, so that

\sum_{n=0}^{\infty} (\beta\lambda)^n E_t y_{t+n} = \frac{1}{1 - \beta\lambda\rho_y} y_t    (9.37)

while

\sum_{n=0}^{\infty} (\beta\lambda)^n E_t c_{t+n} = \frac{1}{1 - \beta\lambda\rho_c} c_t    (9.38)
So, the capital stock process is

k_t = \lambda k_{t-1} + \frac{\lambda}{c}\left(\frac{1}{1 - \beta\lambda\rho_y} y_t - \frac{1}{1 - \beta\lambda\rho_c} c_t\right)    (9.39)
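A small sketch of how the reduced-form coefficients in this equation depend on persistence. The parameter values are invented; the coefficient formula is the one appearing in equation (9.39):

```python
import numpy as np

beta, c = 0.95, 2.0   # made-up discount factor and adjustment-cost parameter

# Stable root of x^2 - (1 + 1/beta + 1/(beta*c)) x + 1/beta = 0
lam = np.roots([1.0, -(1 + 1/beta + 1/(beta * c)), 1/beta]).real.min()

def coeff(rho):
    """Reduced-form weight on a driving variable with AR(1) persistence rho."""
    return (lam / c) / (1 - beta * lam * rho)

# More persistent driving processes receive larger coefficients:
print(coeff(0.95) > coeff(0.5) > 0)   # → True
```

Raising the persistence parameter shrinks the denominator 1 - \beta\lambda\rho, so a shock that is expected to last longer moves the capital stock by more today.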
This gives us a reduced-form relationship between the capital stock, the lagged capital stock, output and the cost of capital.
Note that the magnitudes of the coefficients on output and the cost of capital depend positively on the persistence of these variables. If \rho_y is close to one, then the coefficient on output will be high, with the same applying for \rho_c and the cost of capital. One example of
an application of this type of reasoning is Tevlin and Whelan (2003).1 This paper presents
regressions of equipment investment on output and the cost of capital. It reports much
larger coefficients on the cost of capital for investment in computers than for non-computing
equipment, and uses a model of this sort to provide an explanation. The cost of capital for
computing equipment is largely determined by very persistent shocks that tend to produce
ever-decreasing computer prices. In contrast, for non-computing equipment, the cost of capital
depends on a set of less persistent variables such as interest rates and tax incentives. This
suggests that the cost of capital should have a smaller coefficient in a regression for the non-computing category.
1 Stacey Tevlin and Karl Whelan (2003). "Explaining the Investment Boom of the 1990s," Journal of Money, Credit, and Banking, Volume 35, pages 1-22.
Appendix: The Undetermined Coefficients Method
The other method used to solve these models starts by assuming that one knows the form of the solution. So, one guesses that the solution is of the form

k_t = \lambda_1 k_{t-1} + \delta E_t\left[\sum_{n=0}^{\infty} \lambda_2^n k_{t+n}^*\right]
From there, one goes on to figure out a unique set of values for \lambda_1, \lambda_2 and \delta that are consistent with this equation, and with the optimality conditions for the capital stock. In this case

E_t k_{t+1} = \lambda_1 k_t + \delta E_t\left[\sum_{n=0}^{\infty} \lambda_2^n k_{t+n+1}^*\right]
So, we have

\lambda_1 k_t + \delta E_t\left[\sum_{n=0}^{\infty} \lambda_2^n k_{t+n+1}^*\right] - \left(1 + \frac{1}{\beta} + \frac{1}{\beta c}\right) k_t + \frac{1}{\beta} k_{t-1} = -\frac{1}{\beta c} k_t^*

which can be re-arranged as

\left(1 + \frac{1}{\beta} + \frac{1}{\beta c} - \lambda_1\right) k_t = \frac{1}{\beta} k_{t-1} + \frac{1}{\beta c} k_t^* + \delta E_t\left[\sum_{n=0}^{\infty} \lambda_2^n k_{t+n+1}^*\right]

Matching coefficients with the guessed solution gives

\lambda_1 = \frac{1/\beta}{1 + \frac{1}{\beta} + \frac{1}{\beta c} - \lambda_1}

\delta = \frac{1/(\beta c)}{1 + \frac{1}{\beta} + \frac{1}{\beta c} - \lambda_1} = \frac{\lambda_1}{c}

\lambda_2 = \beta \lambda_1

The solution is

k_t = \lambda k_{t-1} + \frac{\lambda}{c} E_t\left[\sum_{n=0}^{\infty} (\beta\lambda)^n k_{t+n}^*\right]    (9.40)

where \lambda solves

\lambda\left(1 + \frac{1}{\beta} + \frac{1}{\beta c} - \lambda\right) = \frac{1}{\beta}    (9.41)
This can be re-written as

\lambda^2 - \left(1 + \frac{1}{\beta} + \frac{1}{\beta c}\right)\lambda + \frac{1}{\beta} = 0    (9.42)
so the solution is the same as that derived from the factorization method above.
Personally, I am less fond of this method because it involves guessing the form of the
solution, which is a bit of a cheat, because it is still quite algebra-intensive, and because it
becomes impractical to apply once one moves to higher-order difference equations. In contrast,
the factorization method can be used to characterize the solutions of difference equations of
any order.
Part III
Long-Run Growth
Chapter 10
Growth Accounting
The chapters in this section will focus on what is known as growth theory. Unlike most
of macroeconomics, which concerns itself with what happens over the course of the business
cycle (why unemployment or inflation go up or down during expansions and recessions), this
branch of macroeconomics concerns itself with what happens over longer periods of time. In
particular, it looks at the question What determines the growth rate of the economy over
the long run and what can policy measures do to affect it? As we will also discuss, this is
related to the even more fundamental question of what makes some countries rich and others
poor.
In this set of notes, we will cover what is known as growth accounting a technique for
Production Functions
The usual starting point for growth accounting is the assumption that total real output in an
economy is produced using an aggregate production function technology that depends on the
total amount of labour and capital used in the economy. For illustration, assume that this
Y_t = A_t K_t^α L_t^β   (10.1)
209
where Kt is capital input and Lt is labour input. Note that an increase in At results in
higher output without having to raise inputs. Macroeconomists usually call increases in At
technological progress and often refer to this as the "technology" term. As such, it is easy to imagine increases in A_t to be associated with people inventing new technologies that allow output to be produced more efficiently. That said, A_t is best thought of as a broad measure of efficiency and it may go up or down for other reasons, e.g. with the imposition or elimination of regulations. Because A_t reflects the efficiency of all the economy's factors, it is also sometimes known as Total Factor Productivity (TFP), and this is the term most commonly used in empirical papers that attempt to calculate this series.
Usually, we will be more interested in the determination of output per person in the econ-
omy, rather than total output. Output per person is often labelled productivity by economists
with increases in output per worker called productivity growth. Productivity is obtained by dividing total output by the number of workers.
Technological progress: Improving the efficiency with which an economy uses its inputs,
i.e. increases in At .
Increases in the number of workers: Note that this only adds to growth if α + β > 1, i.e. if there are increasing returns to scale. Most growth theories assume constant returns to scale: A doubling of inputs produces a doubling of outputs. If a doubling of inputs manages to more than double outputs, you could argue that the efficiency of production has improved and so perhaps this should be considered an increase in A rather than something that stems from higher inputs. If there are constant returns to scale, then
Let's consider what determines growth with a constant returns to scale Cobb-Douglas production function

Y_t = A_t K_t^α L_t^(1−α)   (10.5)

and let's assume that time is continuous. In other words, the time element t evolves smoothly rather than in discrete jumps.
How do we characterise how this economy grows over time? Let's denote the growth rate of output at time t by

G_t^Y = (1/Y_t) (dY_t/dt)   (10.6)

In other words, the growth rate at any point in time is the change in output (the derivative of output with respect to time, dY_t/dt) divided by the level of output. We can characterise
the growth rate of Y_t as a function of the growth rates of labour, capital and technology by differentiating the right-hand-side of equation (10.5) with respect to time. Before we do this, you should recall the product rule for differentiation, i.e. that
d(AB)/dx = (dA/dx) B + A (dB/dx)   (10.7)
For products of three variables (like we have in this case) this implies
d(ABC)/dx = (dA/dx) BC + A (dB/dx) C + AB (dC/dx)   (10.8)
Applying this to the production function, we get

dY_t/dt = d(A_t K_t^α L_t^(1−α))/dt = (dA_t/dt) K_t^α L_t^(1−α) + A_t L_t^(1−α) (dK_t^α/dt) + A_t K_t^α (dL_t^(1−α)/dt)   (10.9)

We can use the chain rule to calculate the terms involving the impact of changes in capital and labour:

dK_t^α/dt = α K_t^(α−1) (dK_t/dt)

dL_t^(1−α)/dt = (1 − α) L_t^(−α) (dL_t/dt)
The growth rate of output is calculated by dividing both sides of this by Y_t, which is the same as dividing by A_t K_t^α L_t^(1−α). This gives

(1/Y_t)(dY_t/dt) = [(K_t^α L_t^(1−α))/(A_t K_t^α L_t^(1−α))] (dA_t/dt) + α [(A_t K_t^(α−1) L_t^(1−α))/(A_t K_t^α L_t^(1−α))] (dK_t/dt) + (1 − α) [(A_t K_t^α L_t^(−α))/(A_t K_t^α L_t^(1−α))] (dL_t/dt)   (10.13)
Cancelling the terms that appear in both the numerators and the denominators, we get

(1/Y_t)(dY_t/dt) = (1/A_t)(dA_t/dt) + α (1/K_t)(dK_t/dt) + (1 − α) (1/L_t)(dL_t/dt)   (10.14)
This can be written in more intuitive form as

G_t^Y = G_t^A + α G_t^K + (1 − α) G_t^L   (10.15)
The growth rate of output equals the growth rate of the technology term plus a weighted
average of capital growth and labour growth, where the weights are determined by the parameter α. This is the key equation in growth accounting studies. These studies provide estimates of
how much GDP growth over a certain period comes from growth in the number of workers,
how much comes from growth in the stock of capital and how much comes from improvements
One can also show that the growth rate of output per worker is the growth rate of output minus the growth rate of the number of workers. This is a re-statement in growth rate terms of our earlier decomposition of output growth into technological progress and capital deepening when the production function has constant returns to scale.
It is good to understand how equation (10.15) was derived but, more generally, it is useful to remember what it implies: raising K_t by a factor of (1 + x) raises output by a factor of (1 + x)^α and, because α is assumed to be less than one, this is a smaller increase than comes from increasing A_t by a factor of (1 + x).
213
How to Calculate the Sources of Growth: Solow (1957)
For most economies, we can calculate GDP, as well as the number of workers and also get some
estimate of the stock of capital (this last is a bit trickier and usually relies on assumptions
about how investment cumulates over time to add to the stock of capital.) We don't directly observe the value of the Total Factor Productivity term, A_t. However, if we knew the value of the parameter α, we could figure out the growth rate of TFP from the following equation

G_t^A = G_t^Y − α G_t^K − (1 − α) G_t^L   (10.17)
But where would we get a value of α from? In a famous 1957 paper, MIT economist Robert Solow pointed out that we could arrive at an estimate of α by looking at the shares of GDP paid to capital and to labour.
To see how this method works, consider the case of a perfectly competitive firm that is
seeking to maximise profits. Suppose the firm sells its product for a price Pt (which it has no
control over), pays wages to its workers of Wt and rents its capital for a rental rate of Rt (this
last assumptionthat the firm rents its capitalisnt important for the points that follow
but it makes the calculations simpler.) This firm's profits are given by

π_t = P_t Y_t − R_t K_t − W_t L_t   (10.18)
    = P_t A_t K_t^α L_t^(1−α) − R_t K_t − W_t L_t   (10.19)
Now consider how the firm chooses how much capital and labour to use. It will maximise profits by differentiating the profit function with respect to capital and labour and setting these derivatives equal to zero:

∂π_t/∂K_t = α P_t A_t K_t^(α−1) L_t^(1−α) − R_t = 0   (10.20)

∂π_t/∂L_t = (1 − α) P_t A_t K_t^α L_t^(−α) − W_t = 0   (10.21)

These conditions can be re-written as

∂π_t/∂K_t = α (P_t Y_t / K_t) − R_t = 0   (10.22)

∂π_t/∂L_t = (1 − α) (P_t Y_t / L_t) − W_t = 0   (10.23)

which in turn imply

α = (R_t K_t)/(P_t Y_t)   (10.24)

1 − α = (W_t L_t)/(P_t Y_t)   (10.25)
Wt Lt is the total amount of income paid out as wages (the wage rate times number of
workers).
Rt Kt is the total amount of income paid to capital (the rental rate times the amount of
capital).
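The optimality conditions above can also be checked symbolically. The following sketch (using Python's sympy, purely as an illustration of the algebra) differentiates the profit function and confirms that the implied income shares are α for capital and 1 − α for labour:

```python
import sympy as sp

alpha, A, K, L, P, R, W = sp.symbols('alpha A K L P R W', positive=True)

# Cobb-Douglas output and the competitive firm's profits
Y = A * K**alpha * L**(1 - alpha)
profit = P * Y - R * K - W * L

# First-order conditions: set the derivatives with respect to K and L to zero
R_star = sp.solve(sp.Eq(sp.diff(profit, K), 0), R)[0]  # rental rate of capital
W_star = sp.solve(sp.Eq(sp.diff(profit, L), 0), W)[0]  # wage

# Factor income shares implied by the optimality conditions
capital_share = sp.simplify(R_star * K / (P * Y))  # simplifies to alpha
labour_share = sp.simplify(W_star * L / (P * Y))   # simplifies to 1 - alpha
print(capital_share, labour_share)
```

Note that this verification relies on the assumptions in the text: price-taking behaviour and the Cobb-Douglas form. With imperfect competition the shares would also reflect the markup.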
These equations tell us that we can calculate 1 − α as the fraction of income paid to workers rather than to compensate capital. (In real-world economies, non-labour income mainly takes the form of interest, dividends, and retained corporate earnings.) National income accounts
come with various decompositions. One of them describes how different types of incomes
add up to GDP. In most countries, these statistics show that wage income accounts for most
of GDP, meaning α < 0.5. A standard value that gets used in many studies, based on US estimates, is α = 1/3. I would note, however, that some studies do this calculation assuming firms are imperfectly competitive; if this is the case (as it is in the real world) then the
shares of income earned by labour and capital depend on the degree of monopoly power. So
one needs to be cautious about growth accounting calculations as they rely on theoretical assumptions that may not hold exactly in practice.
Solows 1957 paper concluded that capital deepening had not been that important for U.S.
growth for the period that he examined (1909-1949). In fact, he calculated that TFP growth
accounted for 87.5% of growth in output per worker over that period. The calculation became very famous: it was one of the papers cited by the Nobel committee when awarding Solow the prize for economics in 1987. TFP is sometimes called the Solow residual because it is a backed-out calculation that makes things add up: You calculate it as the part of
output growth not due to input growth in the same way as regression residuals in econometrics
are the part of the dependent variable not explained by the explanatory variables included in
the regression.
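As a minimal worked example (the growth rates below are made up for illustration, not actual data), the Solow residual can be backed out directly from equation (10.17):

```python
# Hypothetical annual growth rates in percent: output, capital and labour.
g_Y, g_K, g_L = 3.0, 4.5, 1.0
alpha = 1 / 3  # capital's share of income, the standard US-based value

# Equation (10.17): TFP growth is output growth minus weighted input growth
g_A = g_Y - alpha * g_K - (1 - alpha) * g_L
print(round(g_A, 3))  # 0.833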
Most growth accounting calculations are done as part of academic studies. However, in some
countries the official statistical agencies produce growth accounting calculations. In the U.S.
the Bureau of Labor Statistics (BLS) produces them under the name "multifactor productivity" calculations (i.e. they use the term MFP instead of the term TFP but conceptually they are the same thing.) Many of the studies add some bells and whistles to the basic calculations just described. For example, the BLS tries to account for improvements in the quality
of the labour force by accounting for improvements in the level of educational qualifications
216
and work experience of employees. In other words, they view the production function as being
of the form
Y_t = A_t K_t^α (q_t L_t)^(1−α)   (10.26)

where q_t is an index of the quality of labour input.
Figure 10.1 shows a summary of the BLS's calculations of the sources of growth in the US from 1987 to 2010. They conclude that average growth of 2.2 percent in the U.S. private
nonfarm economy can be explained as follows: 0.9 percent comes from capital deepening, 0.3
percent comes from changes in labour composition and 0.9 percent comes from changes in
what they call multifactor productivity. Looking at different sample periods, however, we can see that these contributions have varied considerably over time:
From 1987-1995, productivity growth averaged only 1.5 percent and MFP growth was
weak, contributing only 0.5 percent per year to growth. During this period, there was
a lot of discussion about the slowdown in growth relative to previous eras, with much
of the focus on the poor performance of TFP growth. Paul Krugman's first popular economics book was called The Age of Diminished Expectations because people seemed
to have accepted that the US economy was doomed to low productivity growth.
From 1995-2007, productivity growth averaged a very respectable 2.65 percent, with
MFP growth contributing 1.35 percent. During this period, there was a lot of discussion of the impact of new Internet-related technologies that improved efficiency. While the peak of this enthusiasm came around the time of the dot-com bubble, the productivity performance of the period as a whole was still pretty good.
217
From 2007-2013, productivity growth has been weaker than in the previous decade,
averaging only 1.3 percent. MFP growth has been particularly weak, averaging only 0.6
percent over this period. "New Economy" optimism has receded, a topic that we will return to later.
return to later.
In addition to the poor performance of U.S. productivity growth, another factor that is
weighing on the potential for output growth is a slow growth rate of the labour force. After
years of increasing numbers of people available for work due to normal population growth,
immigration and increased female labour participation, the US labour force has flattened out
(see Figure 10.2). This is being driven by long-run demographic trends as the large baby
boom generation starts to retire. This trend is set to continue over the next few decades.
Figure 10.3 shows that the dependency ratio (the ratio of non-working to working people) is
projected to increase significantly as the population grows older on average. See my blog
218
Figure 10.1: Growth Accounting Calculations for the U.S.
219
Figure 10.2: The U.S. Labour Force
220
Figure 10.3: The Ratio of Non-Working to Working People in U.S.
221
Example: The Euro Area
Longer-term growth prospects in Europe appear to be worse than in the United States. I am
currently working on a paper with Kieran McQuinn that does a growth accounting analysis
for the euro area and constructs longer-term growth projections. The following discussion is based on that work.
Table 1 shows that growth in output per worker in the countries that make up the euro
area has gradually declined over time. In particular, TFP growth has collapsed. From 2.7
percent per year over 1970-76, TFP growth has fallen to an average of 0.2 percent per year
over the period 2000-2012. Table 2 shows that weak performances for TFP growth can be seen across the individual member countries.
Europe is also on the cusp of a significant demographic change that will reduce the potential
for GDP growth: See Figure 10.4. Population growth is slowing and total population is set to peak before the middle of this century. The population is also ageing significantly. Indeed, the total number of people aged between 15 and 64 (i.e. the usual definition of working-age population) has peaked and is set to decline substantially over the next half century.
While Europe has many short-term macroeconomic problems due to weak aggregate demand and high levels of public and private debt, it is also the case that it faces severe challenges in relation to long-term growth. Maintaining growth rates at close to those experienced historically will likely involve reforms to raise the size of the labour force (such as raising retirement ages).
222
Table 1: The Euro Areas Growth Performance
223
Table 2: Country-by-Country Growth Performance 2000-2013
224
Figure 10.4: Demographic Projections for the Euro Area from Eu-
rostat
225
Example: A Tale of Two Cities
Alwyn Young's 1992 paper "A Tale of Two Cities: Factor Accumulation and Technical Change in Hong Kong and Singapore" compares the growth experiences of these two small Asian economies from the early 1970s to 1990. Young explained his motivation for picking these two economies in terms of their similarities:
In the prewar era, both economies were British colonies that served as entrepôt trading ports, with little domestic manufacturing activity ... In the postwar era, both developed large manufacturing sectors. Both economies have passed through a similar set of industries,
moving from textiles, to clothing, to plastics, to electronics, and then, in the 1980s,
gradually moving from manufacturing into banking and financial services ... The
Southern China ... While the Hong Kong government has emphasized a policy of
laissez faire, the Singaporean government has, since the early 1960s, pursued the
Both economies were successful: Hong Kong had total growth of 147% between the early
1970s and 1990 and Singapore had growth of 154%. But Young was interested in exploring the
extent to which TFP contributed to growth in these two economies. The results of his growth
accounting calculations are shown on the next page. He found that Singapore's approach did not produce any TFP growth, while Hong Kong's more free-market approach led to strong TFP growth, with this element accounting for almost half of the growth in output per worker.
2. Available at www.nber.org/chapters/c10990.pdf
226
One can argue this was a better outcome because Hong Kong achieved the growth without having to divert a huge part of national income towards investment rather than consumption. As we will see in the next lecture, TFP-based growth also has an advantage over growth driven by capital accumulation: it is easier to sustain.
227
Chapter 11
The Solow Model
We have discussed how economic growth can come from either capital deepening (increased
amounts of capital per worker) or from improvements in total factor productivity (sometimes
termed technological progress). This suggests that economic growth can come about from
saving and investment (so that the economy accumulates more capital) or from improvements
in productive efficiency. In these notes, we consider a model that explains the role these two el-
ements play in generating sustained economic growth. The model is also due to Robert Solow,
whose work on growth accounting we discussed in the last lecture, and was first presented in his famous 1956 paper, "A Contribution to the Theory of Economic Growth."
The Solow model assumes that output is produced using a production function in which output
depends upon capital and labour inputs as well as a technological efficiency parameter, A.
Y_t = A F(K_t, L_t)   (11.1)

Both factors are assumed to have positive marginal products:

∂Y_t/∂K_t > 0   (11.2)

∂Y_t/∂L_t > 0   (11.3)
228
However, the model also assumes there are diminishing marginal returns to capital accumula-
tion. In other words, adding extra amounts of capital gives progressively smaller and smaller
increases in output. This means the second derivative of output with respect to capital is
negative.
∂²Y_t/∂K_t² < 0   (11.4)
See Figure 11.1 for an example of how output can depend on capital with diminishing returns.
Think about why diminishing marginal returns is probably sensible: If a firm acquires an extra
unit of capital, it will probably be able to increase its output. But if the firm keeps piling on
extra capital without raising the number of workers available to use this capital, the increases
in output will probably taper off. A firm with ten workers would probably like to have at
least ten computers. It might even be helpful to have a few more; perhaps a few laptops for
work from home or some spare computers in case others break down. But at some point, just adding extra computers will do little or nothing to raise output.
We will use a very stylized description of the other parts of this economy: This helps us to
focus in on the important role played by diminishing marginal returns to capital. We assume
a closed economy with no government sector or international trade. This means all output takes the form of either consumption or investment

Y_t = C_t + I_t   (11.5)

and that savings equals investment

S_t = Y_t − C_t = I_t   (11.6)

The capital stock changes over time according to

dK_t/dt = I_t − δK_t   (11.7)
229
In other words, the addition to the capital stock each period depends positively on investment and negatively on depreciation, which takes place at rate δ.
The Solow model does not attempt to model the consumption-savings decision. Instead it assumes that a constant fraction s of output is saved:

S_t = s Y_t   (11.8)
230
Figure 11.1: Diminishing Marginal Returns to Capital
231
Capital Dynamics in the Solow Model
Because savings equals investment in the Solow model, equation (11.8) means that investment is given by

I_t = s Y_t   (11.9)

which means we can re-state the equation for changes in the stock of capital as

dK_t/dt = s Y_t − δ K_t   (11.10)
Whether the capital stock expands, contracts or stays the same depends on whether investment is greater than, less than or equal to depreciation:

dK_t/dt > 0 if δK_t < sY_t   (11.11)

dK_t/dt = 0 if δK_t = sY_t   (11.12)

dK_t/dt < 0 if δK_t > sY_t   (11.13)

This means that if the capital-output ratio is

K_t/Y_t = s/δ   (11.14)

then the stock of capital will stay constant. If the capital-output ratio is lower than this level, then the capital stock will be increasing and if it is higher than this level, it will be decreasing.
Figure 11.2 illustrates these dynamics. Depreciation, δK, is a straight-line function of the stock of capital while output is a curved function of capital, featuring diminishing marginal returns. When the level of capital is low, sY_t is greater than δK. As the capital stock increases, the additional investment due to the extra output tails off but the additional depreciation does not, so at some point sY_t equals δK and the stock of capital stops increasing. The figure labels the point at which the capital stock remains unchanged as K*.
232
In the same way, if we start out with a high stock of capital, then depreciation, δK, will tend to be greater than investment, sY_t. This means the stock of capital will decline. When it reaches K*, it will stop declining. This is an example of what economists call convergent dynamics. For any fixed set of the model parameters (s and δ) and other inputs into the production function (A_t and L_t), there will be a defined level of capital such that, no matter where the capital stock starts, it will converge over time towards this level.
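These convergent dynamics are easy to verify numerically. The sketch below (all parameter values are purely illustrative) iterates dK_t/dt = sY_t − δK_t for a Cobb-Douglas technology and checks that capital approaches the level K* at which sY equals δK:

```python
# Simple Euler simulation of the Solow capital dynamics dK/dt = s*Y - delta*K.
alpha, s, delta = 0.33, 0.20, 0.06   # illustrative parameters
A, L = 1.0, 1.0                      # technology and labour held fixed
K = 1.0                              # start well below the steady state
dt = 0.01
for _ in range(200_000):             # simulate 2000 periods
    Y = A * K**alpha * L**(1 - alpha)
    K += (s * Y - delta * K) * dt

# With L = 1, the steady state solves s*A*K^alpha = delta*K
K_star = (s * A / delta) ** (1 / (1 - alpha))
print(K, K_star)
```

Starting instead from a K above K* produces the same limit from above, which is exactly the convergence the figures illustrate; raising s or A in this sketch shifts K* up and the simulated path then climbs to the new level.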
Figure 11.3 provides an illustration of how the convergent dynamics determine the level of output in the Solow model. It shows output, investment and depreciation as a function of the capital stock. The gap between the green line (investment) and the orange line (output) shows the level of consumption. The economy converges towards the level of output associated with the equilibrium capital stock K*.
An Increase in the Savings Rate
Now consider what happens when the economy has settled down at an equilibrium unchanging level of capital K1 and then there is an increase in the savings rate from s1 to s2.
Figure 11.4 shows what happens to the dynamics of the capital stock. The line for in-
vestment shifts upwards: For each level of capital, the level of output associated with it
translates into more investment. So the investment curve shifts up from the green line to the
red line. Starting at the initial level of capital, K1 , investment now exceeds depreciation. This
means the capital stock starts to increase. This process continues until capital reaches its new
equilibrium level of K2 (where the red line for investment intersects with the black line for
depreciation.) Figure 11.5 illustrates how output increases after this increase in the savings rate.
233
Figure 11.2: Capital Dynamics in The Solow Model
234
Figure 11.3: Capital and Output in the Solow Model
235
Figure 11.4: An Increase in the Saving Rate
236
Figure 11.5: Effect on Output of Increased Saving
237
An Increase in the Depreciation Rate
Now consider what happens when the economy has settled down at an equilibrium unchanging
level of capital K1 and then there is an increase in the depreciation rate from δ1 to δ2.
Figure 11.6 shows what happens in this case. The depreciation schedule shifts up from the black line associated with the original depreciation rate, δ1, to the new red line associated with the new depreciation rate, δ2. Starting at the initial level of capital, K1, depreciation now exceeds investment. This means the capital stock starts to decline. This process continues until capital falls to its new equilibrium level of K2 (where the red line for depreciation intersects with the green line for investment.) So the increase in the depreciation rate leads to permanently lower levels of capital and output.
An Increase in Technological Efficiency
Now consider what happens when technological efficiency A_t increases. Because investment is given by

I_t = s A_t F(K_t, L_t)   (11.15)

a one-off increase in A thus has the same effect as a one-off increase in s. Capital and output gradually rise to a new higher level. Figure 11.7 shows the increase in capital due to an increase in technological efficiency.
238
Figure 11.6: An Increase in Depreciation
239
Figure 11.7: An Increase in Technological Efficiency
240
Solow and the Sources of Growth
In the last lecture, we described how capital deepening and technological progress were the
two sources of growth in output per worker. Specifically, we derived an equation in which
output growth was a function of growth in the capital stock, growth in the number of workers and growth in TFP.
Our previous discussion had pointed out that a one-off increase in technological efficiency,
At , had the same effects as a one-off increase in the savings rate, s. However, there are
important differences between these two types of improvements. The Solow model predicts
that economies can only achieve a temporary boost to economic growth due to a once-off
increase in the savings rate. If they want to sustain economic growth through this approach,
then they will need to keep raising the savings rate. However, there are likely to be limits in
any economy to the fraction of output that can be allocated towards saving and investment,
particularly if it is a capitalist economy in which savings decisions are made by private citizens.
Unlike the savings rate, which will tend to have an upward limit, there is no particular
reason to believe that technological efficiency At has to have an upper limit. Indeed, growth
accounting studies tend to show steady improvements over time in At in most countries. Going
back to Youngs paper on Hong Kong and Singapore discussed in the last lecture, you can
see now why it matters whether an economy has grown due to capital deepening or TFP
growth. The Solow model predicts that a policy of encouraging growth through more capital
accumulation will tend to tail off over time, producing a once-off increase in output per worker. In contrast, a policy that promotes the growth rate of TFP can lead to a sustained higher growth rate of output per worker.
241
The Capital-Output Ratio with Steady Growth
Up to now, we have only considered once-off changes in output. Here, however, we consider
how the capital stock behaves when the economy grows at steady constant rate GY . Specif-
ically, we can show in this case that the ratio of capital to output will tend to converge to a
specific value. Recall from the last lecture that if we have something of the form
Z_t = U_t W_t   (11.16)

then we have the following relationship between the various growth rates

G_t^Z = G_t^U + G_t^W   (11.17)
Adjusting equation (11.10), the growth rate of the capital stock can be written as

G_t^K = (1/K_t)(dK_t/dt) = s (Y_t/K_t) − δ   (11.19)

so the growth rate of the capital-output ratio is

G_t^(K/Y) = s (Y_t/K_t) − δ − G^Y   (11.20)
This gives a slightly different form of convergence dynamics from those we saw earlier. This
equation shows that the growth rate of the capital-output ratio depends negatively on the
level of this ratio. This means the capital-output ratio displays convergent dynamics. When
it is above a specific equilibrium value it tends to fall and when it is below this equilibrium
value it tends to increase. Thus, the ratio is constantly moving towards this equilibrium value.
242
We can express this formally as follows:
G_t^(K/Y) > 0 if K_t/Y_t < s/(δ + G^Y)   (11.21)

G_t^(K/Y) = 0 if K_t/Y_t = s/(δ + G^Y)   (11.22)

G_t^(K/Y) < 0 if K_t/Y_t > s/(δ + G^Y)   (11.23)
We can illustrate these dynamics using a slightly altered version of our earlier graph. Figure 11.8 amends the depreciation line to show the amount of capital spending necessary not just to replace depreciation but also to deliver a percentage increase in the capital stock that matches the increase in output. The diagram shows that the economy will tend to move towards a capital stock such that sY_t = (δ + G^Y)K_t, meaning the capital-output ratio is K_t/Y_t = s/(δ + G^Y).
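A quick simulation can confirm this convergence (all parameter values below are illustrative). With TFP growing at rate g and labour at rate n, steady-state output growth works out to g/(1 − α) + n (this is derived later in the chapter as equation (11.41)), and the capital-output ratio heads to s/(δ + G^Y):

```python
# Solow dynamics with growing A and L: check K/Y converges to s/(delta + G_Y).
alpha, s, delta = 0.33, 0.25, 0.05   # illustrative parameters
g, n = 0.01, 0.01                    # TFP and labour growth rates
G_Y = g / (1 - alpha) + n            # steady-state output growth rate
A, L, K = 1.0, 1.0, 1.0
dt = 0.01
for _ in range(200_000):             # simulate 2000 periods
    Y = A * K**alpha * L**(1 - alpha)
    K += (s * Y - delta * K) * dt
    A *= 1 + g * dt
    L *= 1 + n * dt

ratio = K / (A * K**alpha * L**(1 - alpha))
print(ratio, s / (delta + G_Y))      # the two numbers should be close
```

Note the contrast with the no-growth case: with G^Y > 0 the economy never settles at a fixed K, but the ratio of capital to output still converges.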
243
Figure 11.8: The Equilibrium Capital Stock in a Growing Economy
244
Briefly, Back to Piketty
In Chapter 6, we discussed one of Thomas Piketty's explanations for why capital may tend to grow faster than income (the r > g argument). Piketty has a different argument for why capital may grow faster than income that relates to the result we have just derived.
In his book, Piketty describes a different assumption about savings in the economy from the one we have just used. Specifically, he works with a net savings rate, s̃, which is defined as follows

I_t − δK_t = s̃ Y_t   (11.24)

In other words, defined like this, s̃ is a savings rate that subtracts off the share of GDP taken up by capital depreciation. In the same way, net national product is defined as GDP minus depreciation. Given this definition, we can write the change in the capital stock as

dK_t/dt = s̃ Y_t   (11.25)
Repeating the calculations from above with this model, the growth rate of capital

G_t^K = (1/K_t)(dK_t/dt) = (I_t − δK_t)/K_t   (11.26)

becomes

G_t^K = (1/K_t)(dK_t/dt) = s̃ (Y_t/K_t)   (11.27)

So the growth rate of the capital-output ratio is

G_t^(K/Y) = s̃ (Y_t/K_t) − G^Y   (11.28)
G_t^(K/Y) > 0 if K_t/Y_t < s̃/G^Y   (11.29)

G_t^(K/Y) = 0 if K_t/Y_t = s̃/G^Y   (11.30)

G_t^(K/Y) < 0 if K_t/Y_t > s̃/G^Y   (11.31)
So the capital-output ratio converges to K_t/Y_t = s̃/G^Y. Again showing his gift for grand terminology, Piketty calls this result the second fundamental law of capitalism. His research has argued that growth appears to be slowing around the world and thus, with G^Y in the denominator of this formula, the capital-output ratio is set to rise significantly.
This prediction will, of course, also hold for the standard Solow model formulation in which the capital-output ratio converges to s/(δ + G^Y). The most obvious difference, however, is that Piketty's formulation suggests that when G^Y tends towards zero we could see the capital-output ratio head towards infinity, because his steady-state ratio does not have the depreciation rate in the denominator. However, using the model, you can show that the net savings rate along a steady growth path will be

s̃ = I_t/Y_t − δ (K_t/Y_t)   (11.32)

  = s − δ s/(G^Y + δ)   (11.33)

  = s G^Y/(G^Y + δ)   (11.34)
So when output growth goes to zero, the net savings rate also goes to zero. This means we shouldn't just look at Piketty's formula of s̃/G^Y for the steady-state capital-output ratio and imagine the denominator (G^Y) heading to zero while the numerator s̃ stays fixed. From this discussion, you can take it that slower output growth is likely to raise the ratio of capital to income, but not without limit.
246
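The algebra in (11.32)-(11.34) can be checked with a few lines of arithmetic; the parameter values below are illustrative:

```python
# Net saving rate along a steady growth path: s_net = s - delta*(K/Y),
# evaluated at the steady-state ratio K/Y = s/(delta + G_Y).
s, delta = 0.25, 0.05                # illustrative gross saving and depreciation rates
for G_Y in (0.04, 0.02, 0.005):
    KY = s / (delta + G_Y)           # steady-state capital-output ratio
    s_net = s - delta * KY           # net saving rate, as in (11.32)-(11.33)
    print(G_Y, round(s_net, 4), round(s * G_Y / (G_Y + delta), 4))
```

The two printed values agree for each G^Y, and both shrink towards zero as G^Y does: the net saving rate is endogenous, which is why the ratio s̃/G^Y does not explode as growth slows.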
Why Growth Accounting Can Be Misleading
Of the cases just considered in which output and capital both increase (an increase in the savings rate and an increase in the level of TFP), the evidence points to increases in TFP being more important as a generator of long-term growth. Rates of savings and investment for most countries tend to stay within certain ranges, while large increases in TFP over time have been recorded for many countries. It's worth noting, then, that growth accounting studies can perhaps be a bit misleading when considering the ultimate sources of growth.
Consider a country that has a constant share of GDP allocated to investment but is
experiencing steady growth in TFP. The Solow model predicts that this economy should
experience steady increases in output per worker and increases in the capital stock. A growth
accounting exercise may conclude that a certain percentage of growth stems from capital
accumulation but ultimately, in this case, all growth (including the growth in the capital
stock) actually stems from growth in TFP. The moral here is that pure accounting exercises may miss the deeper determinants of growth.
I encourage you to read Paul Krugman's 1994 article "The Myth of Asia's Miracle".1 It discusses a number of cases of economies where growth was based largely on capital accumulation. In addition to the various Asian countries covered in Alwyn Young's research, Krugman (correctly) predicted a slowdown in growth in Japan, even though at the time many US commentators were focused on the idea that Japan was going to overtake the US as the world's leading economy.
247
Perhaps most interesting is his discussion of growth in the Soviet Union. Krugman notes that the Soviet economy grew strongly after World War 2 and many in the West believed it would become more prosperous than the capitalist economies. The Soviet Union's achievement in placing the first man in space provoked Kennedy's acceleration of the space programme, mainly to show the U.S. was not falling behind communist systems. However, some economists who had examined the Soviet economy were less impressed. Here's an extended quote from Krugman's article:
When economists began to study the growth of the Soviet economy, they did so using the tools of growth accounting. Of course, Soviet data posed some problems. Not only was it hard to piece together usable estimates of output and input (Raymond Powell, a Yale professor, wrote that the job in many ways resembled an archaeological dig); in a socialist economy one could hardly measure capital input using market returns, so researchers had to base their estimates on returns in market economies at similar levels of development. Still, when the efforts began, researchers were pretty sure about what they would find. Just as capitalist growth had been based on growth in both inputs and efficiency, with efficiency the main source of rising per capita income, they expected to find that rapid Soviet growth reflected both rapid input growth and rapid efficiency growth.
But what they actually found was that Soviet growth was based on rapid growth in inputs, end of story. The rate of efficiency growth was not only unspectacular, it was well below the rates achieved in Western economies. Indeed, by some estimates, it was virtually nonexistent.
The immense Soviet efforts to mobilize economic resources were hardly news. Stalinist planners had moved millions of workers from farms to cities, pushed millions of women into the labor force and millions of men into longer hours, pursued massive programs of education, and plowed an ever-growing proportion of the country's industrial output back into the construction of new factories.
Still, the big surprise was that once one had taken the effects of these more or less measurable inputs into account, there was nothing left to explain. The most shocking thing about Soviet growth was its comprehensibility.
This comprehensibility implied two crucial conclusions. First, claims about the superiority of planned over market economies turned out to be based on a misapprehension. If the Soviet economy had a special strength, it was its ability to mobilize resources, not its ability to use them efficiently. It was obvious to everyone that the Soviet Union in 1960 was much less efficient than the United States. The surprise was that it showed no signs of closing the gap. Second, because input-driven growth is an inherently limited process, Soviet growth was virtually certain to slow down. Long before the slowing of Soviet growth became apparent, it was predicted on the basis of growth accounting.
The Soviet leadership did a good job for a long time of hiding from the world that their economy had stopped growing, but ultimately the economic failures of the central planning model (combined with its many political and ethnic tensions) ended in a dramatic implosion of the entire system.
249
A Formula for Steady Growth
All of the results so far apply for any production function with diminishing marginal returns
to capital. However, we can also derive some useful results by making specific assumptions
about the form of the production function. Specifically, we will consider the constant returns
to scale Cobb-Douglas production function

Y_t = A_t K_t^α L_t^(1−α)    (11.35)

Taking growth rates, this implies

G_t^Y = G_t^A + α G_t^K + (1−α) G_t^L    (11.36)
Now consider the case in which the growth rate of labour input is fixed at n

G_t^L = n    (11.37)

and the growth rate of technology is fixed at g

G_t^A = g    (11.38)

In this case, output growth is given by

G_t^Y = g + α G_t^K + (1−α) n    (11.39)
This means all variations in the growth rate of output are due to variations in the growth
rate for capital. If output is growing at a constant rate, then capital must also be growing at
a constant rate. And we know that the capital-output ratio tends to move towards a specific
equilibrium value. So along a steady growth path, the growth rate of output equals the growth
250
which can be simplified to
G_t^Y = g/(1−α) + n    (11.41)
The growth rate of output per worker is
G_t^Y − n = g/(1−α)    (11.42)
So the economy tends to converge towards a steady growth path and the growth rate of output
per worker along this path is g/(1−α). Without growth in technological efficiency, there can be
no long-run growth in output per worker.
In the case of the Cobb-Douglas production function, output per worker can be written as

Y_t/L_t = A_t (K_t/L_t)^α    (11.43)
In other words, output per worker is a function of technology and of capital per worker. A
drawback of this representation is that we know that increases in A_t also increase capital
per worker, so it has misleading implications about the role of capital accumulation in driving
growth. For this reason, we will now derive an alternative representation of output per
worker, one that we will use again. First, we'll define the capital-output ratio as

x_t = K_t / Y_t    (11.44)

The production function can then be written as

Y_t = A_t (x_t Y_t)^α L_t^(1−α)    (11.45)

because the definition of the capital-output ratio implies

K_t = x_t Y_t    (11.46)
Dividing both sides of this expression by Y_t^α, we get

Y_t^(1−α) = A_t x_t^α L_t^(1−α)    (11.47)

Taking both sides of the equation to the power of 1/(1−α), we arrive at

Y_t = A_t^(1/(1−α)) x_t^(α/(1−α)) L_t    (11.48)
Equation (11.48) shows that all movements in output per worker must come from either
technological progress or changes in the capital-output ratio. When considering the relative
role of technological progress or policies to encourage accumulation, we will see that this
decomposition is more useful than equation (11.43) because the level of technology does not
affect x_t in the long run while it does affect K_t/L_t. So, this decomposition offers a cleaner picture
of the part of growth due to technology and the part that is not.
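As a quick numerical sanity check (the parameter values below are arbitrary, chosen only for illustration), the two ways of writing output per worker give the same answer:

```python
# Compare Y = A * K**alpha * L**(1 - alpha) with the decomposition (11.48):
# Y = A**(1/(1-alpha)) * x**(alpha/(1-alpha)) * L, where x = K/Y.
alpha, A, K, L = 1 / 3, 2.0, 90.0, 10.0

Y = A * K ** alpha * L ** (1 - alpha)        # production function (11.35)
x = K / Y                                    # capital-output ratio (11.44)
Y_alt = A ** (1 / (1 - alpha)) * x ** (alpha / (1 - alpha)) * L

print(round(Y, 6), round(Y_alt, 6))          # the two expressions coincide
```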
Because At is assumed to grow at a constant rate each period, this means that all of the
interesting dynamics for output per worker in this model stem from the behaviour of the
capital-output ratio. We will now describe in more detail how this ratio behaves. Before
doing so, I want to introduce a new piece of terminology that we will use in the next few
lectures.
A useful mathematical shorthand that saves us from having to write down derivatives with
respect to time is the "dot" notation:

Ẏ_t = dY_t/dt    (11.50)
What we are really interested in, though, is growth rates of series, so we need to scale this by
the level of output itself. Thus, Ẏ_t/Y_t is our mathematical expression for the growth
rate of a series. For our Cobb-Douglas production function, we can use the result we derived
above to write

Ẏ_t/Y_t = Ȧ_t/A_t + α K̇_t/K_t + (1−α) L̇_t/L_t    (11.51)
x_t = K_t Y_t^(−1)    (11.55)
To get an expression for the growth rate of the capital stock, we re-write the capital
accumulation equation as
K̇_t = s Y_t − δ K_t    (11.57)

K̇_t/K_t = s (Y_t/K_t) − δ    (11.58)

K̇_t/K_t = s/x_t − δ    (11.59)
Now using equation (11.54) for output growth and equation (11.59) for capital growth, we
can derive a useful equation for the dynamics of the capital-output ratio:
ẋ_t/x_t = (1−α) K̇_t/K_t − g − (1−α) n    (11.60)

ẋ_t/x_t = (1−α) ( s/x_t − g/(1−α) − n − δ )    (11.61)
This dynamic equation has a very important property: The growth rate of xt depends nega-
tively on the value of xt . In particular, when xt is over a certain value, it will tend to decline,
and when it is under that value it will tend to increase. This provides a specific illustration
of the tendency of the capital-output ratio to move towards an equilibrium value.
What is the long-run steady-state value of x_t, which we will label x*? It is the value
consistent with ẋ_t/x_t = 0. This implies that

s/x* − g/(1−α) − n − δ = 0    (11.62)

which can be solved to give

x* = s / ( g/(1−α) + n + δ )    (11.63)
Now take the equation for the dynamics of the capital-output ratio

ẋ_t/x_t = (1−α) ( s/x_t − g/(1−α) − n − δ )    (11.64)
Multiplying and dividing the right-hand-side of this equation by ( g/(1−α) + n + δ ):

ẋ_t/x_t = (1−α) ( g/(1−α) + n + δ ) [ (s/x_t) / ( g/(1−α) + n + δ ) − 1 ]    (11.65)
Because x* = s/( g/(1−α) + n + δ ), this can be written as

ẋ_t/x_t = (1−α) ( g/(1−α) + n + δ ) ( x*/x_t − 1 )    (11.67)

ẋ_t/x_t = (1−α) ( g/(1−α) + n + δ ) ( (x* − x_t)/x_t )    (11.68)
This equation states that each period the capital-output ratio closes a fraction equal to
λ = (1−α)( g/(1−α) + n + δ ) of the gap between the current value of the ratio and its
steady-state value.
Often, the best way to understand dynamic models is to load them onto the computer and
see them run. This is easily done using spreadsheet software such as Excel or econometrics-
oriented packages such as RATS. Figures 11.9 to 11.11 provide examples of the behaviour over
time of two economies, one that starts with a capital-output ratio that is half the steady-state
level, and another that starts with a capital-output ratio that is 1.5 times the steady-state level.
The parameters chosen were s = 0.2, α = 1/3, g = 0.02, n = 0.01, δ = 0.06. Together these
parameters are consistent with a steady-state capital-output ratio of 2. To see this, plug these
values into the formula for x*: x* = 0.2/(0.03 + 0.01 + 0.06) = 2.
Figure 11.9 shows how the two capital-output ratios converge, somewhat slowly, over time
to their steady-state level. This slow convergence is dictated by our choice of parameters: Our
convergence speed is

λ = (1−α)( g/(1−α) + n + δ ) = (2/3)(1.5 × 0.02 + 0.01 + 0.06) = 0.067    (11.70)
So, the capital-output ratio converges to its steady-state level at a rate of about 7 percent
per period. These are fairly standard parameter values for annual data, so this should be
read as saying that convergence is slow: at this rate, only about half of an initial gap is
closed after ten years.
Figure 11.10 shows how output per worker evolves over time in these two economies. Both
economies exhibit growth, but the capital-poor economy grows faster during the convergence
period than the capital-rich economy. These output per worker differentials may seem a little
small on this chart, but Figure 11.11 shows the behaviour of the growth rates, and this
chart makes it clear that the convergence dynamics can produce substantially different growth
rates depending on whether an economy is above or below its steady-state capital-output ratio.
During the initial transition periods, the capital-poor economy grows at rates over 6 percent,
while the capital-rich economy grows at under 2 percent. Over time, both economies converge
to the steady-state growth rate for output per worker of 3 percent.
Figure 11.9: Convergence Dynamics for the Capital-Output Ratio
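Paths like those in Figure 11.9 are easy to generate yourself. The sketch below is my own (in Python rather than the Excel or RATS setups mentioned above) and treats the continuous-time equation for ẋ_t/x_t as an annual difference equation:

```python
# Simulate the capital-output ratio for a capital-poor and a capital-rich economy,
# using a discrete-time (annual) approximation to the dynamics in (11.61).
s, alpha, g, n, delta = 0.2, 1 / 3, 0.02, 0.01, 0.06
x_star = s / (g / (1 - alpha) + n + delta)   # steady-state ratio, here equal to 2

def simulate(x0, periods=50):
    """Path of the capital-output ratio starting from x0."""
    path = [x0]
    for _ in range(periods):
        x = path[-1]
        growth = (1 - alpha) * (s / x - g / (1 - alpha) - n - delta)
        path.append(x * (1 + growth))
    return path

poor = simulate(0.5 * x_star)   # starts at half the steady-state level
rich = simulate(1.5 * x_star)   # starts at 1.5 times the steady-state level
print(round(poor[-1], 2), round(rich[-1], 2))   # both paths approach x* = 2
```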
Figure 11.10: Convergence Dynamics for Output Per Worker
Figure 11.11: Convergence Dynamics for the Growth Rate of Output
Per Worker
Illustrating Changes in Key Parameters
Figures 11.12 to 11.14 examine what happens when the economy is moving along the steady-
state path consistent with the parameters just given, and then one of the parameters is
changed in period 25.
Consider first an increase in the savings rate to s = 0.25. This has no effect on the
steady-state growth rate. But it does change the steady-state capital-output ratio from 2 to
2.5. So the economy now finds itself with too little capital relative to its new steady-state
capital-output ratio. The growth rate jumps immediately and only slowly returns to the long-
run 3 percent value. The faster pace of investment during this period gradually brings the
capital-output ratio up to its new, higher steady-state level.
The increase in the savings rate permanently raises the level of output per worker relative
to the path that would have occurred without the change. However, for our parameter values,
this effect is not that big. This is because the long-run effect of the savings rate on output per
worker is determined by s^(α/(1−α)), which in this case is s^0.5. So in our case, a 25 percent increase in
the savings rate produces an 11.8 percent increase in output per worker (1.25^0.5 = 1.118). More
generally, a doubling of the savings rate raises output per worker by 41 percent (2^0.5 = 1.41).
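The arithmetic here is easy to verify:

```python
# Long-run level effect of the savings rate on output per worker is a factor
# of s**(alpha/(1-alpha)); with alpha = 1/3 the exponent is 0.5.
alpha = 1 / 3
exponent = alpha / (1 - alpha)
print(round(1.25 ** exponent, 3))   # a 25 percent rise in s: 1.118
print(round(2.0 ** exponent, 2))    # a doubling of s: 1.41
```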
The charts also show the effect of an increase in the depreciation rate to δ = 0.11. This
reduces the steady-state capital-output ratio to 4/3 and the effects of this change are basically
the mirror image of those of the savings rate increase: the economy now has too much capital
relative to its new steady-state ratio, so growth is temporarily below its steady-state rate.
Finally, there is the increase in the rate of technological progress. I've shown the effects of
a change from g = 0.02 to g = 0.03. This increases the steady-state growth rate of output per
worker to 0.045. However, as the charts show there is another effect: A faster steady-state
growth rate for output reduces the steady-state capital-output ratio. Why? The increase in
g raises the long-run growth rate of output; this means that each period the economy needs
to accumulate more capital than before just to keep the capital-output ratio constant. Again,
without a change in the savings rate that causes this to happen, the capital-output ratio will
decline. So, the increase in g means that, as in the depreciation rate example, the economy
starts out in period 25 with too much capital relative to its new steady-state capital-output
ratio. For this reason, the economy doesn't jump straight to its new 4.5 percent growth rate
of output per worker. Instead, after an initial jump in the growth rate, there is a very gradual
transition the rest of the way to the 4.5 percent growth rate.
Figure 11.12: Capital-Output Ratios: Effect of Increases In ...
Figure 11.13: Growth Rates of Output Per Hour: Effect of Increases
In ...
Figure 11.14: Output Per Hour: Effect of Increases In ...
Convergence Dynamics in Practice
The Solow model predicts that no matter what the original level of capital an economy starts
out with, it will tend to revert to the equilibrium levels of output and capital indicated by
the economy's underlying features. Does the evidence support this idea?
One source of evidence comes from episodes in which countries have suddenly been left with
far less capital than is consistent with their fundamental features. Wars have provided the
natural experiments in which various countries have had huge amounts of their capital
destroyed. The evidence has generally supported Solow's prediction that economies that
experience negative shocks should tend to recover from these setbacks and return to their
pre-shock levels of capital and output. For example, both Germany and Japan grew very
strongly after the Second World War, recovering prosperity despite the massive damage done
to their stocks of capital.
A more extreme example, perhaps, is a study by Edward Miguel and Gerard Roland of the
long-run impact of U.S. bombing of Vietnam in the 1960s and 1970s.2 Miguel and Roland
found large variations in the extent of bombing across the various regions of Vietnam. Despite
large differences in the extent of damage inflicted on different regions, Miguel and Roland
found little evidence of lasting relative damage to the most-bombed regions by 2002. (Note
that this is not the same as saying there was no damage to the economy as a whole; the study
focuses on whether those areas that lost more capital than average ended up being poorer
than average.)
2
http://eml.berkeley.edu/ groland/pubs/vietnamoct09.pdf
Chapter 12
Endogenous Technological Change
The Solow model identified technological progress or improvements in total factor productivity
(TFP) as the key determinant of growth in the long run, but did not provide any explanation
of what determines it. In the technical language used by macroeconomists, long-run growth
in the Solow model is exogenous, i.e. determined outside the model.
In these notes, we consider a particular model that makes technological progress endogenous,
meaning determined by the actions of the economic agents described in the model. The model,
due to Paul Romer ("Endogenous Technological Change", Journal of Political Economy, 1990)
starts by accepting the Solow model's result that technological progress is what determines
long-run growth in output per worker. But, unlike the Solow model, Romer attempts to
explain where this technological progress comes from.
So what is this technology term A anyway? The Romer model takes a specific concrete view:
A is the number of different types of capital goods available, and output is produced using

Y = L_Y^(1−α) ( x_1^α + x_2^α + .... + x_A^α ) = L_Y^(1−α) Σ_{i=1}^{A} x_i^α    (12.1)
where L_Y is the number of workers producing output and the x_i's are different types of capital
goods. The crucial feature of this production function is that diminishing marginal returns
apply, not to capital as a whole, but separately to each of the individual capital goods.
If A were fixed, the pattern of diminishing returns to each of the separate capital goods
would mean that growth would eventually taper off to zero. However, in the Romer model,
A is not fixed. Instead, there are L_A workers engaged in R&D and this leads to the invention
of new capital goods. This is described using a production function for the change in the
number of capital goods:

Ȧ = L_A^σ A^φ    (12.2)
The change in the number of capital goods depends positively on the number of researchers
(σ is an index of how slowly diminishing marginal productivity sets in for researchers) and
also on the prevailing value of A itself. This latter effect, whose strength is governed by φ,
stems from the "giants' shoulders" effect.1 For instance, the invention of a new piece of
software will have relied on the previous
invention of the relevant computer hardware, which itself relied on the previous invention of
many other technologies.
Romer's model contains a full description of the factors that determine the fraction of
workers that work in the research sector. The research sector gets rewarded with patents that
allow it to maintain a monopoly in the product it invents; wages are equated across sectors,
so the research sector hires workers up to the point where their value to it is as high as it is to
producers of final output. In keeping with the spirit of the Solow model, I'm going to just
treat the share of workers in the research sector as an exogenous parameter (but will discuss
1
Stemming from Isaac Newton's observation "If I have seen farther than others, it is because I was standing
on the shoulders of giants."
later some of the factors that should determine this share). So, we have
L = L_A + L_Y    (12.3)

L_A = s_A L    (12.4)

And again we assume that the total number of workers grows at an exogenous rate n:

L̇/L = n    (12.5)

The aggregate capital stock is the sum of the stocks of the individual capital goods, and it
accumulates in the usual way:

K = Σ_{i=1}^{A} x_i    (12.6)

K̇ = s_K Y − δ K    (12.7)
One observation that simplifies the analysis of the model is the fact that all of the capital
goods play an identical role in the production process. For this reason, we can assume that
the demand from producers for each of these capital goods is the same, implying that
x_i = x,    i = 1, 2, ...., A    (12.8)
Y = A L_Y^(1−α) x^α    (12.9)

This looks just like the Solow model's production function. The TFP term is written as
A^(1−α) as opposed to just A as it was in our first handout, but this makes no difference to the
substance of the analysis.
You can use the same arguments as before to show that this economy converges to a steady-
state growth path in which capital and output grow at the same rate. So, we can derive the
analogous steady-state results to those in the Solow model, where the share of workers
producing output is

s_Y = 1 − s_A    (12.13)
Now use the fact that the steady-state growth rates of capital and output are the same to
derive the steady-state growth rate of output.
Finally, because the share of labour allocated to the non-research sector cannot be changing
along the steady-state path (otherwise the fraction of researchers would eventually go to zero
or become greater than one, which would not be feasible) we have
Ẏ/Y − L̇/L = Ȧ/A    (12.16)
The steady-state growth rate of output per worker equals the steady-state growth rate of A.
The only difference from the Solow model is that writing the TFP term as A^(1−α) makes this
growth rate Ȧ/A as opposed to (1/(1−α)) Ȧ/A.
The big difference relative to the Solow model is that the A term is determined within the
model as opposed to evolving at some fixed rate unrelated to the actions of the agents in the
model economy. To derive the steady-state growth rate in this model, note that the growth
rate of A can be written as

Ȧ/A = (s_A L)^σ A^(φ−1)    (12.17)
The steady-state of this economy features A growing at a constant rate. This can only be the
case if the growth rate of the right-hand-side of (12.17) is zero. Using our usual procedure for
calculating growth rates, this requires

σ ( ṡ_A/s_A + n ) + (φ − 1) Ȧ/A = 0

Again, in steady-state, the growth rate of the fraction of researchers (ṡ_A/s_A) must be zero. So,
along the model's steady-state growth path, the growth rate of the number of capital goods is

Ȧ/A = σn/(1 − φ)
The long-run growth rate of output per worker in this model depends positively on
three factors:
The parameter σ, which describes the extent to which diminishing marginal productivity
sets in as researchers are added: the higher is σ, the slower these diminishing returns
arrive and the faster the growth rate will be.
The strength of the standing on shoulders effect, φ. The more past inventions help to
boost the rate of current inventions, the faster the growth rate will be.
The growth rate of the number of workers n. The higher this is, the faster the economy
adds researchers. This may seem like a somewhat unusual prediction, but it holds well if
one takes a very long view of world economic history. Prior to the industrial revolution,
growth rates of population and GDP per capita were very low. The past 200 years have
seen both population growth and economic growth rates increase. See the figures on
the next two pages (the first comes from Greg Clark's book A Farewell to Alms), which
illustrate this pattern.
Figure 12.1: World Economic History
Figure 12.2: Global Population
The Steady-State Level of Output Per Worker
Just as with our discussion of the Solow model, we can decompose output per worker into a
capital-output ratio component and a TFP component. In other words, one can re-arrange
the production function to express output per worker as the product of a capital-output ratio
term, the share of workers producing output and a technology term.
Note that the s_A term reflects the reduction in the production of goods and services due
to a fraction of the labour force being employed as researchers. One can also use the same
arguments to show that, along the steady-state growth path the capital-output ratio is
K/Y = s_K / ( σn/(1−φ) + n + δ )    (12.22)
(The σn/(1−φ) here takes the place of the g/(1−α) in the first handout's expression for the steady-state
capital-output ratio because this is the new formula for the growth rate of output per worker.)
Finally, we can also figure out the level of A along the steady-state growth path as follows.
Ȧ/A = (s_A L)^σ A^(φ−1) = σn/(1−φ)    (12.23)

Solving this for A gives

A_t = ( (1−φ)/(σn) )^(1/(1−φ)) (s_A L_t)^(σ/(1−φ))    (12.24)
Convergence Dynamics for A
We noted already that the arguments showing that the capital-output ratio tends to converge
towards its steady-state are the same here as in the Solow model. What about the A term?
How do we know, for instance, that A always reverts back eventually to the path given by
equation (12.24)? To see this, write the growth rate of A as

g_A = Ȧ/A = (s_A L)^σ A^(φ−1)    (12.26)
so that

ġ_A/g_A = σ ( ṡ_A/s_A + n ) − (1 − φ) g_A    (12.27)
One can use this equation to show that gA will be falling whenever
g_A > σn/(1−φ) + ( σ/(1−φ) ) ( ṡ_A/s_A )    (12.28)
So, apart from periods when the share of researchers is changing, the growth rate of A will be
declining whenever it is greater than its steady-state value of σn/(1−φ). The same argument works in
reverse when gA is below its steady-state value. Thus, the growth rate of A displays convergent
dynamics, always tending back towards its steady-state value. And equation (12.24) tells us
exactly what the level of A has to be if the growth rate of A is at its steady-state value.
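These convergent dynamics can be illustrated with a small simulation. The parameter values below are invented for illustration, and the Euler discretisation is my own:

```python
# Simulate the ideas production function A_dot = (s_A * L)**sigma * A**phi and
# check that the growth rate of A converges to sigma * n / (1 - phi).
sigma, phi, n, s_A = 0.75, 0.5, 0.01, 0.05
L, A, dt = 100.0, 50.0, 0.1

for _ in range(20000):                       # 2,000 "years" of small Euler steps
    g_A = (s_A * L) ** sigma * A ** (phi - 1)
    A += g_A * A * dt
    L += n * L * dt

g_A = (s_A * L) ** sigma * A ** (phi - 1)
print(round(g_A, 4), round(sigma * n / (1 - phi), 4))   # both close to 0.015
```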
Optimal R&D?
We haven't discussed the various factors that may determine the share of the labour force
allocated to the research sector, s_A. However, in equation (12.25) we have diagnosed two
separate offsetting effects that s_A has on output: a negative one caused by the fact that
researchers don't actually produce output, and a positive one due to the positive effect of the
research share on the long-run level of technology.
Equation (12.25) looks very complicated but it looks simpler if we just take all the terms
that don't involve s_A and bundle them together, calling them X, and also write Z = σ/(1−φ). In
this notation, output per worker along the steady-state path is proportional to (1 − s_A) s_A^Z X.
Written like this, it is a relatively simple calculus problem to figure out the level of s_A that
maximises the level of output per worker along the steady-state growth path. In other words,
one can differentiate equation (12.25) with respect to s_A, set the derivative equal to zero, and
solve to get the maximising share, s_A = Z/(1 + Z).
When one fills in the model to determine sA endogenously, does the economy generally
arrive at this optimal level? No. The reason for this is that research activity generates
externalities that affect the level of output per worker, but which are not taken into account by
private individuals or firms when they make the choice of whether or not to conduct research.
Looking at the ideas production function, equation (12.2), one can see both positive and
negative externalities:
A positive externality due to the giants' shoulders effect. Researchers don't take into
account the effect their inventions have in boosting the future productivity of other
researchers. The higher is φ, the more likely it is that the R&D share will be too low.
A negative externality due to the fact that σ < 1, so diminishing marginal productivity
applies to research labour: an additional researcher lowers the productivity of existing
researchers, an effect the individual researcher does not take into account.
Whether there is too little or too much research in the economy relative to the optimal level
depends on the strength of these various externalities. However, using empirical estimates of
the parameters of equation (12.2), Charles Jones and John Williams have calculated that
it is far more likely that the private sector will do too little research relative to the social
optimum.2
To give some insight into this result, note that the steady-state growth rate in this model is
σn/(1−φ), so Z = σ/(1−φ) is the ratio of the growth rate of output per worker to the growth rate
of population. Suppose this equals one, so growth in output per worker equals growth in
population, perhaps a reasonable ballpark assumption. In this case Z = 1 and the optimal
share of researchers is one-half. Indeed, for any reasonable steady-state growth rate, the
optimal share of researchers is very high, so it is hardly surprising that the economy does not
automatically generate this share.
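This calculation is easy to check numerically. The code below assumes, as in the simplification described above, that steady-state output per worker is proportional to (1 − s_A) s_A^Z:

```python
# Maximising (1 - s) * s**Z over s in (0, 1) gives s* = Z / (1 + Z).
def optimal_share(Z):
    return Z / (1 + Z)

Z = 1.0                                      # growth per worker = population growth
grid = [i / 1000 for i in range(1, 1000)]
best = max(grid, key=lambda s: (1 - s) * s ** Z)
print(optimal_share(Z), round(best, 3))      # both equal 0.5
```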
This result points to the potential for policy interventions to boost the rate of economic
growth by raising the number of researchers. For instance, laws to strengthen patent pro-
tection may raise the incentives to conduct R&D. This points to a potential conflict between
macroeconomic policies aimed at raising growth and microeconomic policies aimed at reducing
the inefficiencies due to monopoly power: Some amount of monopoly power for patent-holders
2
Charles I. Jones and John C. Williams, "Too Much of a Good Thing? The Economics of Investment in
R&D", Journal of Economic Growth, March 2000, Vol. 5, No. 1, pp. 65-85.
may be necessary if we want to induce a high level of R&D and thus a high level of output.
Many of the facts about economic history back up Romer's vision of economic growth. Robert
Gordon's paper "Is US economic growth over? Faltering innovation confronts the six head-
winds" provides an excellent description of the various phases of technological invention and
also provides an interesting perspective on the potential for future technological progress.3
Gordon highlights how economic history can be broken into different periods based on the
pace and reach of innovation, distinguishing three industrial revolutions. In his words:
The First Industrial Revolution: centered in 1750-1830 from the inventions of the steam
engine and cotton gin through the early railroads and steamships, but much of the impact
of railroads on the American economy came later between 1850 and 1900. At a minimum it
took 150 years for IR1 to have its full range of effects.
The Second Industrial Revolution: within the years 1870-1900 created within just a few years
the inventions that made the biggest difference to date in the standard of living. Electric light
and a workable internal combustion engine were invented in a three-month period in late
1879. The number of municipal waterworks providing fresh running water to urban homes
multiplied tenfold between 1870 and 1900. The telephone, phonograph, and motion pictures
were all invented in the 1880s. The benefits of IR2 included subsidiary and complementary
inventions, from elevators, electric machinery and consumer appliances; to the motorcar, truck,
and airplane; to highways, suburbs, and supermarkets; to sewers to carry the wastewater away.
All this had been accomplished by 1929, at least in urban America, although it took longer
3
CEPR Policy Insight, Number 63
to bring the modern household conveniences to small towns and farms. Additional follow-up
inventions continued and had their main effects by 1970, including television, air conditioning,
and the interstate highway system. The inventions of IR2 were so important and far-reaching
that they took a full 100 years to have their main effect.
The Third Industrial Revolution: is often associated with the invention of the web and
internet around 1995. But in fact electronic mainframe computers began to replace routine
and repetitive clerical work as early as 1960.
Gordon's paper is well worth reading for understanding how the innovations associated
with the second industrial revolution completely altered people's lives. He describes life in
1870 as follows:
most aspects of life in 1870 (except for the rich) were dark, dangerous, and involved
backbreaking work. There was no electricity in 1870. The insides of dwelling units
were not only dark but also smoky, due to residue and air pollution from candles
and oil lamps. The enclosed iron stove had only recently been invented and much
cooking was still done on the open hearth. Only the proximity of the hearth or
stove was warm; bedrooms were unheated and family members carried warm bricks to bed.
But the biggest inconvenience was the lack of running water. Every drop of water
for laundry, cooking, and indoor chamber pots had to be hauled in by the house-
wife, and wastewater hauled out. The average North Carolina housewife in 1885
had to walk 148 miles per year while carrying 35 tonnes of water.
Gordon believes that the technological innovations associated with computer technologies
are far less important than those associated with the second industrial revolution and that
growth may sputter out over time. Figure 1 repeats a chart from Gordons paper showing
the growth rate of per capita GDP for the worlds leading economies (first the UK, then
the US). It shows growth accelerating until 1950 and declining thereafter. Figure 2 shows a
To illustrate why he believes modern inventions don't match up with past improvements,
Gordon poses the following thought experiment:
You are required to make a choice between option A and option B. With option
A you are allowed to keep 2002 electronic technology, including your Windows 98
laptop accessing Amazon, and you can keep running water and indoor toilets; but
you can't use anything invented since 2002.
Option B is that you get everything invented in the past decade right up to Face-
book, Twitter, and the iPad, but you have to give up running water and indoor
toilets. You have to haul the water into your dwelling and carry out the waste.
Even at 3am on a rainy night, your only toilet option is a wet and perhaps muddy
walk to the outhouse.
You probably won't be surprised to find out that most people pick option B.
Gordon also discusses other factors likely to hold back growth in leading countries, such
as demographics, education, inequality, globalisation, debt, and environmental
and energy-related constraints. It's worth noting, though, that while Gordon's paper is very
well researched and well argued, economists are not very good at forecasting the invention
of new technologies or their impact on the economy. For all we know, the next industrial
revolution could be around the corner to spark a new era of rapid growth. Joel Mokyr's
article "Technological progress: A thing of the past?" provides a useful counterpoint to Gordon's
scepticism.4
4
Available at www.voxeu.org/article/technological-progress-thing-past
Figure 12.4: Gordon's Hypothetical Path for Growth
Chapter 13
Cross-Country Technology Diffusion
So far, we've been discussing how the invention of new technologies promotes economic growth
by pushing out the technological frontier and allowing capital to be allocated across new
and old technologies without diminishing returns setting in. This is clearly an important aspect
of economic growth. However, we should remember that only a very few countries in the
world are on the technological frontier; most places are not relying on Apple to invent a
new gadget to promote efficiency. One way to illustrate this point is to estimate the level of
total factor productivity across countries.
An important paper that did these calculations and used them to shed light on cross-
country income differences is Hall and Jones (1999).1 The basis of the study is a levels
accounting exercise using the production function

Y_i = K_i^α (h_i A_i L_i)^(1−α)    (13.1)
Like the BLS multifactor productivity calculations that we discussed a few lectures ago,
Hall and Jones account for the effect of education on the productivity of the labour force.
Specifically, they construct measures of human capital based on estimates of the returns to
years of schooling.
Hall and Jones show that their production function can be re-formulated as
Y_i/L_i = (K_i/Y_i)^(α/(1−α)) h_i A_i    (13.2)
Hall and Jones then constructed a measure of h_i using evidence on levels of educational attain-
ment and they also set α = 1/3. This allowed them to use (13.2) to express all cross-country
differences in output per worker in terms of three multiplicative terms: capital intensity, hu-
man capital per worker, and technology or total factor productivity. They found that output
per worker in the richest five countries was 31.7 times that in the poorest five countries. This
The results from this paper show that differences in total factor productivity, rather than dif-
ferences in factor accumulation, are the key explanation of cross-country variations in income
levels. A more detailed table of Hall and Joness calculations is reproduced on the next page.
These calculations show that most countries are very far from the technological frontier, so
understanding how technology diffuses from the leading countries to the rest of the world is
central to understanding their growth.
Table from Hall-Jones Paper
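To see how the multiplicative decomposition in equation (13.2) works, here is a toy example; the numbers are invented, not taken from Hall and Jones:

```python
# Toy Hall-Jones style decomposition: output per worker is the product of a
# capital-intensity term, a human-capital term and a TFP term (alpha = 1/3).
alpha = 1 / 3

def y_per_worker(k_over_y, h, A):
    return k_over_y ** (alpha / (1 - alpha)) * h * A

rich = y_per_worker(k_over_y=3.0, h=2.0, A=10.0)   # hypothetical rich country
poor = y_per_worker(k_over_y=1.5, h=1.0, A=1.0)    # hypothetical poor country

# The income ratio factors exactly into the three component ratios:
factors = (3.0 / 1.5) ** (alpha / (1 - alpha)) * (2.0 / 1.0) * (10.0 / 1.0)
print(round(rich / poor, 2), round(factors, 2))    # the two numbers agree
```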
Leaders and Followers
The Romer model probably should not be thought of as a model of growth in any one partic-
ular country. No country uses only technologies that were invented in that country; rather,
products invented in one country end up being used all around the world. Thus, the model is
best thought of as a model of the leading countries in the world economy. How then should
long-run growth rates be determined for individual countries? By itself, the Romer model
has no clear answer, but it suggests a model in which the ability to learn about the usage of
new technologies invented in leading countries determines growth elsewhere.
We will now describe such a model. The mathematics of the model are also formally
equivalent to a well-known model of Nelson and Phelps (AER, 1966), though the application
there is different, their subject being the diffusion of technological knowledge over time within
an individual country.
The Model
We will assume that there is a lead country in the world economy that has technology level
A_t, which grows at a constant rate g:

Ȧ_t/A_t = g    (13.3)
All other countries in the world, indexed by j, have technology levels given by A_jt < A_t. The
growth rate of technology in these countries is given by

Ȧ_jt/A_jt = λ_j + σ_j ( (A_t − A_jt)/A_jt )    (13.4)

where λ_j < g and σ_j > 0. This tells us that technology growth in all countries apart from the
leader has two components:
Learning: The second term says that a country's technology level will grow faster the bigger
is the percentage gap between its level of technology, A_jt, and the level of the leader,
A_t. The larger is the parameter σ_j, the better the country is at learning about and
adopting the leader's technology.
The first term, λ_j, indicates the country's capacity for increasing its level of technology
without learning from the leader. We impose the condition λ_j < g. This means that
country j can't grow faster than the lead country without the learning that comes from
closing the technology gap.
Exponential Growth
You've probably heard about exponential functions before but, even if you have, it's worth a
quick reminder. The number e ≈ 2.71828 is a very special number such that the function e^x
satisfies

d(e^x)/dx = e^x    (13.5)
One way to see why the number is 2.718 is to use something called the Taylor series
approximation for a function, which states that you can approximate a function f(x) around
a point a as

f(x) = f(a) + f′(a)(x−a) + (1/2)f″(a)(x−a)^2 + (1/3!)f‴(a)(x−a)^3 + ... + (1/n!)f^(n)(a)(x−a)^n + ...    (13.6)
where n! = (1)(2)(3)...(n−1)(n). If there is a number e such that f(x) = e^x has the property
f(x) = f′(x), then that means that all of its derivatives also equal e^x. In this case, we have

e^x = e^a + e^a(x−a) + (1/2)e^a(x−a)^2 + (1/3!)e^a(x−a)^3 + ...    (13.7)

Setting a = 0 and x = 1, this gives

e = 1 + 1 + 1/2 + 1/3! + 1/4! + .....    (13.8)
This converges to 2.71828. OK, that's not on the test but worth knowing. Now note that, by
the chain rule,

d(e^(gt))/dt = [ d(e^(gt))/d(gt) ] [ d(gt)/dt ] = g e^(gt)    (13.9)
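The series for e and the derivative rule in (13.9) are easy to confirm numerically (a quick sketch):

```python
import math

# The factorial series e = 1 + 1 + 1/2! + 1/3! + ... converges quickly ...
e_approx = sum(1 / math.factorial(k) for k in range(15))

# ... and d(e**(g*t))/dt = g * e**(g*t), checked with a small forward difference:
g, t, h = 0.02, 10.0, 1e-6
deriv = (math.exp(g * (t + h)) - math.exp(g * t)) / h
print(round(e_approx, 5), round(deriv / math.exp(g * t), 4))   # 2.71828 0.02
```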
Now let's relate this back to our model. The fact that the lead country has growth such that

dA_t/dt = Ȧ_t = g A_t    (13.10)

means that this country is characterised by what is known as exponential growth, i.e.

A_t = A_0 e^(gt)    (13.11)

We write the first term as A_0 because e^(g·0) = 1, so whatever term multiplies e^(gt) is the
value of the series at time zero.
Dynamics of Technology
Now we are going to try to figure out how the technology variable behaves in the follower
country. First, let's take equation (13.4) and multiply across by A_jt to get

Ȧ_jt = λ_j A_jt + σ_j ( A_t − A_jt )    (13.12)
This is what is known as a first-order linear differential equation (differential equation because
it involves a derivative; first-order because it only involves a first derivative; linear because it
doesn't involve any terms raised to powers other than one). These equations can be solved
to illustrate how A_jt changes over time. To do this, we first draw some terms together to
re-write it as

Ȧ_jt + ( σ_j − λ_j ) A_jt = σ_j A_t    (13.13)
Recalling equation (13.11) for the technology level of the leader country, this differential
288
Now we'll move on to figuring out how an $A_{jt}$ that satisfies this equation needs to behave.
Let's think about what we learned about exponential functions to help us see what form a potential solution might take. The derivative of $A_{jt}$ with respect to time plus $(\lambda_j - \sigma_j)$ times $A_{jt}$ itself must add up to a multiple of $e^{gt}$. Looked at this way, we might guess that one possible solution for an $A_{jt}$ process that will satisfy this equation is something of the form $B_j e^{gt}$, where $B_j$ is some unknown coefficient. Indeed, it turns out that this is the case. Let's figure out what $B_j$ must be. It must satisfy

$$gB_j e^{gt} + (\lambda_j - \sigma_j) B_j e^{gt} = \lambda_j A_0 e^{gt} \tag{13.15}$$

Dividing both sides by $e^{gt}$ and solving, we get

$$B_j = \frac{\lambda_j A_0}{\lambda_j + g - \sigma_j} \tag{13.16}$$
A General Solution
Is that it or could we add on an additional term and still get a solution? Suppose we look for a solution of the form

$$A_{jt} = B_j e^{gt} + D_{jt} \tag{13.18}$$

Then the solution would have to obey

$$gB_j e^{gt} + \frac{dD_{jt}}{dt} + (\lambda_j - \sigma_j)\left(B_j e^{gt} + D_{jt}\right) = \lambda_j A_0 e^{gt} \tag{13.19}$$

All the terms in $e^{gt}$ cancel out because, by construction of $B_j$, they satisfy equation (13.15). This leaves

$$\frac{dD_{jt}}{dt} + (\lambda_j - \sigma_j) D_{jt} = 0 \tag{13.20}$$

Again using the properties of the exponential function, this equation is satisfied by anything of the form

$$D_{jt} = D_{j0} e^{(\sigma_j - \lambda_j)t} \tag{13.21}$$

where $D_{j0}$ is a parameter that can take on any value. So, given the differential equation (13.12), all possible solutions for technology in country $j$ must take the form

$$A_{jt} = \left(\frac{\lambda_j}{\lambda_j + g - \sigma_j}\right) A_t + D_{j0} e^{(\sigma_j - \lambda_j)t} \tag{13.22}$$
Now we'd like to examine the properties of this solution. Does technology in the follower country catch up and, if not, where does it end up and why? To answer these questions, it is useful to express $A_{jt}$ as a ratio of the frontier level of technology. This can be written as

$$\frac{A_{jt}}{A_t} = \frac{\lambda_j}{\lambda_j + g - \sigma_j} + \frac{D_{j0}}{A_t} e^{(\sigma_j - \lambda_j)t} \tag{13.23}$$

Using $A_t = A_0 e^{gt}$, this becomes

$$\frac{A_{jt}}{A_t} = \frac{\lambda_j}{\lambda_j + g - \sigma_j} + \frac{D_{j0}}{A_0} e^{(\sigma_j - \lambda_j - g)t} \tag{13.24}$$
To understand the properties of this solution, recall that we assumed $\sigma_j < g$, which means that on its own (without catch-up growth) the follower country's level of technology grows slower than the leader country's, and also that $\lambda_j > 0$ (some learning takes place). Putting these together, we have

$$\lambda_j + g - \sigma_j > 0 \tag{13.25}$$

This means that the second term in (13.24) tends towards zero. So, over time, as this term disappears, the country converges towards a level of technology that is a constant ratio of the leader's:

$$\lim_{t \to \infty} \frac{A_{jt}}{A_t} = \frac{\lambda_j}{\lambda_j + g - \sigma_j} \tag{13.26}$$

Note that

$$0 < \frac{\lambda_j}{\lambda_j + g - \sigma_j} < 1 \tag{13.27}$$
so each country never actually catches up to the leader but instead converges to some fraction of the lead country's technology level. This makes sense if you think about it. Because of their inferiority at developing their own technologies ($\sigma_j < g$), the follower countries will always be falling further behind the leader unless there is a gap between their level of technology and the leader's. So, to have a steady-state in which everyone's technology is growing at the same rate, the followers must all have technology levels below that of the leader.
The equilibrium ratio of the country's technology to the leader's depends positively on the learning parameter $\lambda_j$. The higher this parameter (the more of the gap to the leader that it closes each period) the closer the ratio gets to one and the higher up the pecking order the country ends up. The equilibrium ratio also depends positively on $\sigma_j$. In other words, the more growth the country can generate each period independent of learning from the leader, the higher will be its equilibrium ratio of technology relative to the leader.
Going back to the equation for the ratio of technology in country $j$ to the leader's, equation (13.24), we noted already that the second term tends to disappear towards zero over time. That doesn't mean it's unimportant. How a country behaves along its transition path depends on the value of $D_{j0}$.

If $D_{j0} < 0$, then the term that is disappearing over time is a negative term that is a drag on the level of technology. This means that the country starts out below its equilibrium technology ratio and grows faster than the leader for some period of time, with this growth advantage fading as it approaches the equilibrium ratio.

If $D_{j0} > 0$, then the term that is disappearing over time is a positive term that is boosting the level of technology. This means that the country starts out above its equilibrium technology ratio and grows slower than the leader for some period of time, with its growth rate gradually rising towards $g$ as it approaches the equilibrium ratio.
We have illustrated how these dynamics work in Figures 13.1 to 13.3. These charts show model simulations for a leader economy with $g = 0.02$ and a follower economy with $\sigma_j = 0.01$ and $\lambda_j = 0.04$. These values mean

$$\frac{\lambda_j}{\lambda_j + g - \sigma_j} = \frac{0.04}{0.04 + 0.02 - 0.01} = 0.8 \tag{13.30}$$

so the follower economy converges to a level of technology that is 20 percent below that of the leader. The first collection of charts shows what happens when this economy has a value of $D_{j0} = -0.5$, so that it starts out with a technology level only 30 percent that of the leader. It grows faster than the leader country for a number of years before it approaches the 0.8 equilibrium ratio, and then its growth rate settles down to the same rate as that of the leader.
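These dynamics are easy to verify numerically. The following sketch (my own illustration, using the parameter values just described) integrates equation (13.12) with small Euler steps and checks that the technology ratio rises from 0.3 towards the 0.8 equilibrium:

```python
# Leader-follower technology model: dA_j/dt = sigma*A_j + lambda*(A - A_j)
# Parameter values from the text: g = 0.02, sigma_j = 0.01, lambda_j = 0.04
g, sigma, lam = 0.02, 0.01, 0.04
dt = 0.01                        # Euler step size (fraction of a year)
A, Aj = 1.0, 0.3                 # leader starts at 1, follower at 0.3 (D_j0 = -0.5)
steps_per_year = int(1 / dt)
ratios = []
for step in range(300 * steps_per_year):   # simulate 300 years
    if step % steps_per_year == 0:
        ratios.append(Aj / A)              # record the ratio once a year
    Aj += (sigma * Aj + lam * (A - Aj)) * dt
    A += g * A * dt

print(round(ratios[0], 2))    # 0.3: starts 70 percent below the leader
print(round(ratios[-1], 3))   # 0.8: converges to lambda/(lambda + g - sigma)
```

The simulated ratio path is exactly the catch-up pattern shown in the first collection of charts: fast growth early on, then convergence to the 0.8 steady-state ratio.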
The second collection of charts shows what happens when this economy has a value of $D_{j0} = 0.5$, so that it starts out with a technology level 30 percent above that of the leader, even though the equilibrium value is 20 percent below. Technology levels in this follower country never actually decline, but they do go through a long period of slow growth before eventually heading towards the same growth rate as the leader as they approach the 0.8 equilibrium ratio.
Finally, we show how the model may also be able to account for the sort of growth miracles that are occasionally observed when countries suddenly start experiencing rapid growth. If a country can increase its value of $\lambda_j$ via education or science-related policies, its position in the steady-state distribution of income may move upwards substantially, with the economy then going through a phase of rapid growth. The third collection of charts shows what happens when, in period 21, an economy changes from having $\lambda_j = 0.005$ to $\lambda_j = 0.04$. The equilibrium technology ratio changes from one-third to 0.8 and the economy experiences a prolonged phase of rapid growth as it moves to the new ratio. An important message from this model is that, for most countries, it is not their ability to invent new capital goods that is key to high living standards, but rather their ability to learn from the technological leaders.
[Figure 13.1: simulation charts showing Technology Levels Over Time, Ratio of Follower to Leader Technology, and Growth Rates of Technology for the leader and follower]
Figure 13.2: Falling Back [simulation charts showing Technology Levels Over Time, Ratio of Follower to Leader Technology, and Growth Rates of Technology for the leader and follower]
Figure 13.3: A Growth Miracle [simulation charts showing Technology Levels Over Time, Ratio of Follower to Leader Technology, and Growth Rates of Technology for the leader and follower]
Chapter 14
Institutions and Efficiency
We have documented huge differences in total factor productivity across countries. What
determines these differences? One answer is provided by the combination of the Romer model
and the leader-follower model. According to these models, large differences in TFP reflect
variations in the extent to which countries have adopted the latest technologies.
However, this is perhaps too mechanistic a view of what generates cross-country differences in efficiency. TFP doesn't just reflect the technologies a country's people use. It is a measure of the efficiency with which an economy makes use of its resources, and there are a whole range of factors that influence this efficiency. For example:

Crime: Time spent on crime does not produce output. Neither do resources devoted to protecting people and property against it.
This still leaves open the question of what drives the pace of technology adoption in poorer countries. Ultimately, the models so far don't answer the question of the deeper determinants of economic success. We will now discuss the idea that the ultimate explanation for patterns of economic success across countries lies in differences in institutions.
There is now a large literature that focuses on the idea that differences in institutions provide the key to understanding TFP differences across countries. Economic activity does not take place in a vacuum. Firms need to take account of the legal and regulatory environment, the tax system, and the services provided by government, as well as the political setting in which they operate.

The work of economic historian Douglass North, winner of the 1993 Nobel prize for economics, was particularly influential in stressing the key importance of good institutions for economic performance. As he put it:
A theory of institutional change is essential for further progress in the social sciences ... [Neo-classical theory] (and other theories in the social scientists' toolbag) at present cannot satisfactorily account for the very diverse performance of societies and economies both at a moment of time and over time. The explanations derived from neo-classical theory are not satisfactory because, while the models may account for most of the differences [in performance between economies on the basis of differential investment in] education, savings rates, etc., they do not account for why economies would fail to undertake the appropriate activities if they had a high payoff. Institutions determine the payoffs. While the fundamental neo-classical assumption of scarcity and hence competition has been robust (and is basic to this analysis), the assumption of a frictionless exchange process has led economic theory astray. Institutions are the structure that humans impose on human interaction and therefore define the incentives that (together with the other constraints: budget, technology, etc.) determine the choices that individuals make that shape the performance of societies ...
He goes on to discuss the link between institutions and the profit-maximising decisions made by firms and other organisations:

... [informal constraints] (conventions, and self-imposed codes of conduct) and the enforcement characteristics of both ... If institutions are the rules of the game, organizations are the players. They are groups of individuals engaged in purposive activity. The constraints imposed by the institutional framework (together with the other constraints) define the opportunity set and therefore the kind of organizations that will come into existence ... If the highest rates of return in a society are to be made from piracy, then organizations will invest in knowledge and skills that will make them better pirates.
This paper contains a discussion of some aspects of the US's institutional history that have been positive for economic growth. Much of North's other work focuses on the development of institutions, such as secure property rights, that made some countries such as the UK successful early developers.
An Example of the Importance of Institutions
The division of Korea provides a striking example of how institutions can determine the success of an economy. After World War II, Korea was split into a northern zone that became the Democratic People's Republic of Korea, a Soviet-style socialist republic, while South Korea developed into a market economy.
North Korea received external support from the USSR for many years but no longer receives this external aid. It remains a centrally planned economy with only one political party. The economy has failed to prosper and there are reliable reports of large numbers of deaths from famine in the 1990s. In contrast, South Korea has been a huge economic success and is now one of the world's advanced economies.
The figure on the next page illustrates the gap between North and South Korea. While the two areas began with few substantive differences, sharing a common culture and identity, their different economic institutions mean that they are now completely different. Viewed from the sky at night, you can see lights showing development all over South Korea while North Korea is almost fully dark.
Figure 14.1: The Korean Peninsula at Night
An Econometric Approach
The historical approach adopted by North and isolated examples of extreme events (such as the Korean split) have been very valuable in highlighting cases where good institutions have facilitated economic growth and where bad institutions have prevented it. More recently, there has been an attempt to assess the role of institutions in economic development using more formal econometric techniques. An early paper in this literature was the 1999 Quarterly Journal of Economics paper by Robert Hall and Charles I. Jones (recall that we previously discussed this paper's calculations of the sources of differences in output per worker). They used the term social infrastructure to describe the institutions that affect incentives to produce and invest. Their approach was to collect data on a large number of countries and then estimate regressions of the form

$$\frac{Y_i}{L_i} = \alpha + \beta S_i + \epsilon_i \tag{14.1}$$

where $S_i$ measures the extent to which institutions in country $i$ facilitate economic activity. Hall and Jones constructed their social infrastructure measure from two variables:

1. An index of government anti-diversion policies built from assessments relating to (i) law and order (ii) bureaucratic quality (iii) corruption (iv) risk of expropriation and (v) government repudiation of contracts.

2. An index that focuses on the openness of a country to trade with other countries.
There are two potentially serious econometric problems when assessing the linkage between productivity and institutions. The first is endogeneity. Do countries get rich because they have good institutions or do countries have good institutions because they are rich? The latter linkage certainly exists. Citizens in richer countries have substantial incentives to keep good institutions that promote productive efficiency because they would have a lot to lose if their markets ceased to work well; these incentives may be substantially weaker in the world's poorer countries. Hall and Jones thus describe their social infrastructure variable as being determined by

$$S_i = \gamma + \delta \frac{Y_i}{L_i} + \theta X_i + \nu_i \tag{14.2}$$

In this case, with $\delta > 0$, a simple OLS regression of $\frac{Y_i}{L_i}$ on $S_i$ will produce a positive estimate of the effect of institutions on output per worker, even if the true causal effect of institutions was zero.
The second econometric problem is measurement error. The variables used as measures of
institutional quality can only ever be proxies, and possibly poor proxies, for the true measure
of institutional quality that actually affects economic output. The use of proxies like this is the
same as using variables that are affected by measurement error. One of the standard results
from econometrics is that measurement error can result in downward bias in coefficients. In
other words, the OLS coefficient might be less than the true coefficient.
So the presence of these econometric problems means OLS estimation will produce biased
estimates, though whether the bias is upwards or downwards depends on the source of the bias.
The usual solution to these econometric problems is estimation via instrumental variables. This involves finding instruments: variables that may be correlated with the institutions variable but that are not affected by the problems of reverse causality or measurement error. By isolating the variation in institutions that is due to exogenous factors that are not determined by output per worker, the researcher can try to identify the true causal effect of institutions.
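The logic of the two biases and the IV fix can be illustrated with a small simulation (a sketch of my own with entirely made-up data and coefficients, not Hall and Jones's dataset; the exogenous variable here is a stand-in for an instrument like distance from the equator):

```python
import random
random.seed(0)

# Structural model (assumed for illustration):
#   Y = beta*S + eps         output depends on true institutions S
#   S = X + delta*Y + nu     institutions also depend on output (endogeneity)
# We only observe S with measurement error, and use X as an instrument.
beta, delta = 1.0, 0.5
n = 20000
X, Y, S_obs = [], [], []
for _ in range(n):
    x = random.gauss(0, 1)       # exogenous driver of institutions (instrument)
    nu = random.gauss(0, 1)      # other shocks to institutions
    eps = random.gauss(0, 1)     # output shock
    y = (beta * x + beta * nu + eps) / (1 - beta * delta)  # solve the system
    s = x + delta * y + nu
    X.append(x)
    Y.append(y)
    S_obs.append(s + random.gauss(0, 2))   # noisy proxy for institutions

def cov(a, b):
    ma, mb = sum(a) / n, sum(b) / n
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / n

ols = cov(S_obs, Y) / cov(S_obs, S_obs)   # biased by both problems
iv = cov(X, Y) / cov(X, S_obs)            # simple IV (Wald) estimate
print(round(ols, 2), round(iv, 2))        # IV lands near the true beta = 1
```

In this setup the OLS estimate mixes upward simultaneity bias with downward attenuation bias, while the IV estimate recovers the true coefficient because the instrument is uncorrelated with both the output shock and the measurement error.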
Finding good instruments for this problem can be tricky. Many of the papers in this literature
have focused on either geography or history as their inspiration for truly exogenous sources of
variations in institutions.
In relation to geography, a country's geographical features cannot be caused by its current prosperity. But certain types of geographical features may be correlated with whether a country has good institutions or not. Hall and Jones used the country's distance from the equator as an instrument. Other papers have also used coastal access, average temperatures and other climate-related variables.

In relation to history, many countries around the world were colonised by various European countries and their current institutions (e.g. whether a country uses a French or English legal system) were often determined, in a somewhat random fashion, by which countries colonised them. Hall and Jones used instruments measuring the fraction of people speaking English as a native language and the fraction speaking one of the main European languages.
Using their selected instrument set, Hall and Jones found a positive and significant effect of their social infrastructure variable when estimating using IV methods, with the coefficient being higher than the OLS estimate. They concluded from this that there is a large causal effect from institutions to productivity and that measurement error is a more important source of bias in the OLS estimates than reverse causality.
Some Other Papers
Acemoglu, Johnson, and Robinson (AER, 2001) assess the effect of institutions on GDP per capita using a new instrument measuring settler mortality in different European colonies. They argue that countries where mortality for initial settlers was low were places where Europeans were more likely to settle and set up good institutions, with the reverse working when settler mortality was high. With this variable as an instrument, they find a very strong effect of institutions on GDP per capita.
Rodrik, Subramanian and Trebbi (Journal of Economic Growth, 2004). These authors
assess the role of institutions (as proxied by a variable measuring the strength of the rule
of law), openness to trade and geography (as measured by distance from the equator).
To be able to assess whether geography has a direct effect on income per capita, they use
other variables such as the AJR settler mortality variable and language-related variables
as instruments. They conclude that institutions, in the form of their rule of law variable,
are the key determinant of economic success and do not find a significant role for trade
or geography.
Gillanders and Whelan (2014) compare the effect of the Rule of Law variable preferred by Rodrik, Subramanian and Trebbi with a new variable that measures the ease of doing business. Both are institutional variables but they measure different types of institutions. This paper also applies IV methods using geographical variables as instruments and concludes that it is the ease of doing business that is the key determinant of economic performance.
Part IV
Growth and Resources
Chapter 15
The Malthusian Model
We have devoted the last few weeks to studying economies that grow steadily over time. For many countries around the world, that has been a reasonable description of their behaviour since the start of the Industrial Revolution. However, prior to around the year 1800, there is very little evidence of steady growth in income levels. The chart on the next page repeats the chart shown earlier from the book A Farewell to Alms by economic historian Greg Clark. It summarises world economic history as a long period in which living standards fluctuated over time, showing no growth trend, before the Industrial Revolution led to steady growth over time (though Clark notes that this take-off did not occur in all countries and some remain exceptionally poor).
Data on income levels in the distant past are estimates rather than figures compiled at the time by well-resourced statistical agencies. So it's hardly surprising that there is a lot of controversy over Clark's particular interpretation of the evidence as implying no trend growth at all in living standards prior to 1800. Other studies show slow but gradual increases in living standards prior to the Industrial Revolution, but all agree that the average rate of economic growth was very low before 1800. In addition, Figure 15.2 shows that global population growth was extremely slow until 1800 and then increased to much higher rates.
What explains these patterns? Our previous models would suggest the pace of technological progress must have been slower before the Industrial Revolution and this is true. But
cumulatively, there was a lot of technological progress in the years prior to 1800, with many advances made in science and in the organisation of economic life. One might have expected this
to translate into growth in average living standards over time but the evidence suggests such
progress was limited. In these notes, we will present the Malthusian model, which explains
how the world works very differently when rates of technological progress are slow.
Figure 15.1: World Economic History (from Greg Clark's book)
Figure 15.2: Global Population
Life Expectancy and Income Levels
The Malthusian model has two key elements: a negative relationship between income levels and the size of the population, and a positive relationship between income levels and population growth.
By definition, population growth increases with birth rates and falls with death rates. Death rates, in turn, are the key determinant of life expectancy. Throughout history, there has been a strong relationship between a country's average level of income per capita and its average life expectancy. This relationship still holds strongly today. Figure 15.3 shows a chart taken from a wonderful website called Gapminder, which allows you to make animated charts showing developments over time and around the world in income levels, health outcomes and other indicators.
Figure 15.3 shows the relationship between average life expectancy and real income per
person. Each dot corresponds to a country, with the size of the dot corresponding to its
population (the big red dot is China, the big blue dot is India, the yellow dot at the top right
is the US). The chart shows that in the poorest countries in the world in 2012, average life
expectancy was as low as under 50 years of age while the richest countries tend to have average
life expectancy of over 80 years. Figure 15.4 shows a relationship of this kind holding inside
a large country: U.S. counties with higher income per capita have longer life expectancy.
The income-life expectancy relationship is not simply a matter of better healthcare in richer countries that allows people to live much longer. It is more influenced by very high rates of child mortality. Figure 15.5 shows another Gapminder chart. This one shows that mortality among children under 5 is still very common in the world's poorest countries due to the conditions associated with poverty.
This relationship between income levels and the rate of death among the population will
be a key element of the version of the Malthusian model that we will cover.
Figure 15.3: Life Expectancy and Real GDP Per Capita Around the
World in 2012
Figure 15.4: Life Expectancy and Income Levels: U.S. Counties
Figure 15.5: Child Mortality and Real GDP Per Capita Around the
World in 2012
Population and Income Levels
The second element of the Malthusian model is a negative relationship between income levels and the level of population. Before discussing Malthus's thoughts on this issue, it's worth illustrating the mechanics of this idea using a Cobb-Douglas production function

$$Y_t = AK^{\alpha} L_t^{1-\alpha} \tag{15.1}$$

Here, I've assumed that both capital and technology are fixed (and so have no time subscript), so that labour input is the only factor that produces changes in output. We can figure out the demand for labour by assuming that the firms in the economy maximise profits in a competitive fashion, i.e. they maximise

$$\pi = pAK^{\alpha} L_t^{1-\alpha} - wL_t - rK \tag{15.2}$$

where $p$ is the price of output, $w$ is the wage rate and $r$ is an implicit rental rate for capital. The first-order condition for labour is

$$(1-\alpha)pAK^{\alpha} L_t^{-\alpha} - w = 0 \tag{15.3}$$

which implies a real wage of

$$\frac{w}{p} = (1-\alpha)A\left(\frac{K}{L_t}\right)^{\alpha} \tag{15.4}$$

Setting labour input equal to the population

$$L_t = N_t \tag{15.5}$$

we get

$$\frac{w}{p} = (1-\alpha)A\left(\frac{K}{N_t}\right)^{\alpha} \tag{15.6}$$
The higher the population, the lower will be the real wage. This is because of diminishing marginal returns to labour and the fact that workers are being paid their marginal product.
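As a quick numerical sketch (with illustrative parameter values of my own choosing, not from the notes), the real wage implied by equation (15.6) falls steadily as population rises:

```python
# Real wage from w/p = (1 - alpha) * A * (K/N)^alpha, with A and K fixed
A, K, alpha = 1.0, 100.0, 0.3   # illustrative values only

def real_wage(N):
    return (1 - alpha) * A * (K / N) ** alpha

wages = [real_wage(N) for N in (50, 100, 200, 400)]
print([round(w, 3) for w in wages])   # [0.862, 0.7, 0.569, 0.462]
```

Each doubling of population lowers the real wage by the same factor, $2^{-\alpha}$, reflecting the diminishing marginal product of labour when capital and technology are held fixed.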
Now note that the direct link between higher population and lower wage rates (and thus lower living standards) works here because technology and capital are held constant. In the Solow growth model, there is both rising population and increasing wages because technology improvements and capital accumulation offset the negative effects on wages of rising population. In this example, we have assumed something quite different, i.e. no technological progress. We will return, however, to the question of what happens when there is a slow but positive rate of technological progress.
Malthus (1798)
Thomas Malthus's 1798 book An Essay on the Principle of Population put together the two ideas that we have just discussed. He noted that rising living standards can lead to higher population growth, but the famously-gloomy Malthus believed that this increase in population would ultimately reverse any gains in living standards.
Malthus placed a somewhat different emphasis on the various links than in our discussion. In relation to the link between demographics and living standards, Malthus focused on two mechanisms ("checks" on living standards) that would cause population growth to increase as living standards rose and thus ultimately see the increase in living standards reversed.
The first mechanism, which Malthus labelled the "preventative check", was the tendency to see more births when real wages are high. In pre-Industrial Revolution Britain, the tradition was for people to marry relatively late as they waited to accumulate the wealth to be able to support a family. This tended to keep fertility rates relatively low. In practice, as discussed in Greg Clark's book on the Malthusian model, the evidence for a link between living standards and birth rates prior to the Industrial Revolution is fairly weak, and I will assume a constant birth rate in the model treatment below (though the logic of the model is unchanged if you allow birth rates to rise with income).
The second mechanism, which Malthus labelled the "positive check", was the negative effect of living standards on death rates. Evidence for this mechanism is stronger and it is still relevant today. In Malthus's words:

The actual distresses of some of the lower classes, by which they are disabled from giving the proper food and attention to their children, act as a positive check to the natural increase of population.

This is the mechanism that we will focus on in our description of the model.
In relation to the negative effect of population on living standards, I've used a production function approach and emphasised the role played by the assumption of technology increases failing to offset the effect of increased population. Malthus focused more on the idea of increased population raising the demand for a limited supply of food, writing that this would:

... have the same effect in lowering the value of each man's patent. The food must ... would arise either from an increase of population faster than the means of subsistence ...
The Model and its Convergent Dynamics
We will now describe a Malthusian model in somewhat more formal terms than Malthus did. Basically, I'm following Greg Clark's version of the model as described in Chapter 2 of his book, though I'm using a constant birth rate rather than one that depends on income levels.

The model has four equations. First, there is the definition of the change in the population, which just states that population equals last period's population plus last period's births minus deaths. (There are lots of different possible timing conventions here. I have in mind that the population level is measured at the start of each period, while births and deaths occur over the course of the period, but the particular timing convention adopted isn't important):

$$N_t = N_{t-1} + B_{t-1} - D_{t-1} \tag{15.7}$$
Second, births are a constant fraction of the population:

$$\frac{B_t}{N_t} = b \tag{15.8}$$

Third, the death rate is a negative function of income per person:

$$\frac{D_t}{N_t} = d_0 - d_1 Y_t \tag{15.9}$$
Finally, real income per person is a negative function of the population size:

$$Y_t = a_0 - a_1 N_t \tag{15.10}$$
Figure 15.6 shows how the death and birth rate equations combine together to make population dynamics a function of income per person. The death rate depends negatively on income per person, so at sufficiently high income levels (in this case, levels above $Y^*$) births are greater than deaths and population is growing, while population is falling at income levels below $Y^*$.
Figure 15.7 then shows that the economy tends to return to this equilibrium level of income. When income is above $Y^*$, population is growing. But Figure 15.7 shows that growing population means income levels are falling. So income levels tend to fall when income is above $Y^*$ and increase when it is below $Y^*$. Similarly, population tends to fall when it is above the level of population associated with $Y^*$ (call this $N^*$) and rise when it is below this level. This means that both income and population display what we have called convergent dynamics in our discussion of the Solow model: wherever the economy starts out, it tends to converge towards these specific levels of income and population. Because the economy tends to revert back to the same levels of income and population, this phenomenon is often called the Malthusian trap.
Figure 15.8 shows how the birth and death schedules, on the one hand, and the income-population schedule, on the other, combine to determine the model's properties. Perhaps surprisingly, it is the birth and death schedules and not the income-population schedule that determines the long-run level of real income per person in the model. The income-population schedule then determines how many people are alive, given that level of income.
Figure 15.6: Birth and Death Rate Schedules [birth and death rates plotted against income per person]

Figure 15.7: The Income-Population Schedule [income per person plotted against population, with equilibrium population N*]

Figure 15.8: The Full Model [the birth and death rate schedules combined with the income-population schedule, with equilibrium population N*]
Calculating the Long-Run Equilibrium
We can figure out $N^*$ and $Y^*$ algebraically as follows. Combining the birth and death schedules with the population accumulation identity, the growth rate of the population is

$$\frac{N_t - N_{t-1}}{N_{t-1}} = b - d_0 + d_1 Y_{t-1} \tag{15.11}$$

Substituting in the income-population schedule, this becomes

$$\frac{N_t - N_{t-1}}{N_{t-1}} = b - d_0 + d_1 a_0 - d_1 a_1 N_{t-1} \tag{15.12}$$
This shows that the growth rate of population depends negatively on last period's level of population: this is what determines the convergent dynamics. The level of $N$ such that

$$b - d_0 + d_1 a_0 - d_1 a_1 N = 0 \tag{15.13}$$

is the equilibrium level of population

$$N^* = \frac{b - d_0 + d_1 a_0}{d_1 a_1} \tag{15.14}$$

The equilibrium level of population depends positively on the birth rate ($b$) and on $a_0$, which effectively measures the level of technology in the model (if this increases it can offset the negative effect of higher population on income levels). It depends negatively on the exogenous element of the death rate ($d_0$) and on the sensitivity of income levels to population ($a_1$).
The long-run equilibrium level of real income per person can be derived as the income level at which the population growth rate equals zero:

$$\frac{N_t - N_{t-1}}{N_{t-1}} = b - d_0 + d_1 Y^* = 0 \implies Y^* = \frac{d_0 - b}{d_1} \tag{15.15}$$
This level of income, as we noted above from the graphical illustration of the model, depends only on the parameters of the birth and death schedules and not at all on the parameters of the income-population schedule. So, for example, even if there was an increase in $a_0$, so that people could earn higher wages at each level of population, this would result, over time, only in higher population rather than higher income levels. Income levels depend negatively on birth rates, positively on death rates and negatively on the sensitivity of death rates to income levels.
A final way of illustrating the convergent dynamics of the model is to note that equation (15.12) can be re-written as

$$\frac{N_t - N_{t-1}}{N_{t-1}} = (d_1 a_1)\left(N^* - N_{t-1}\right) \tag{15.17}$$

In other words, the growth rate of population is determined by how far population is from its equilibrium level, with the speed of adjustment to this equilibrium, $d_1 a_1$, determined by the sensitivity of income levels to population and the sensitivity of the death rate to income levels.
Finally, we consider three kinds of shocks to the Malthusian economy. In each case, we assume the economy starts at an equilibrium with population of $N_0$ and income level $Y_0$. First, consider an increase in $d_0$, which shifts the death rate schedule up. Figure 15.9 illustrates what happens: at the starting level of income, $Y_0$, death rates now start to exceed birth rates. Population falls and income rises until we reach a new equilibrium with a higher level of income and a lower population.

Figure 15.10 illustrates the consequences of an increase in the birth rate, $b$. This shock works in the opposite direction: the equilibrium level of income falls while the equilibrium population rises. Finally, Figure 15.11 shows what happens after an increase in $a_0$, so that people are able to earn more money at each level of population. The initial response to this shock is higher income levels. However, these higher income levels reduce the death rate and, over time, income levels return to their original equilibrium level. While income levels return to their original level, population is permanently higher because the new level of productivity permits a higher level of population than the old level.
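Both the convergent dynamics and the response to a productivity shock can be checked with a short simulation (my own sketch with made-up parameter values): income per person converges to $Y^* = (d_0 - b)/d_1$, and a permanent rise in $a_0$ leaves long-run income unchanged while population settles permanently higher:

```python
# Malthusian model simulation (illustrative parameter values, not from the notes)
b = 0.02             # constant birth rate
d0, d1 = 0.04, 0.02  # death rate = d0 - d1 * Y
a0, a1 = 2.0, 0.001  # income per person: Y = a0 - a1 * N

Y_star = (d0 - b) / d1               # long-run income from eq (15.15): 1.0
N = 600.0                            # start below equilibrium population
path = []
for t in range(3000):
    if t == 1500:
        a0 = 2.5                     # permanent productivity improvement
    Y = a0 - a1 * N
    path.append((N, Y))
    N = N * (1 + b - d0 + d1 * Y)    # population dynamics, eq (15.12)

N_mid, Y_mid = path[1499]            # state just before the shock
N_end, Y_end = path[-1]
print(round(Y_mid, 3), round(N_mid))   # prints 1.0 1000
print(round(Y_end, 3), round(N_end))   # prints 1.0 1500
```

Income returns to exactly the same $Y^* = 1.0$ after the $a_0$ shock, while the equilibrium population rises from 1000 to 1500, which is precisely the point made in the text: productivity improvements buy people, not prosperity, in a Malthusian world.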
There is an interesting contrast here between what happens when there is technological progress in the Solow model and when technology improves in the Malthusian model. The difference relates to the assumption in the Solow model that there is a consistent and non-trivial pace of technological improvement. In the Malthusian model, the instantaneous effect of an increase in efficiency is an improvement in living standards. But this is offset over time by the population growth that the higher income levels generate.
In the Solow model, technology keeps increasing and keeps pushing up incomes every period, so the population can steadily increase without pushing income levels down. Greg Clark argues that while, cumulatively, there was a large increase in technology from ancient times to 1800, the pace of this increase was never fast enough to prevent population growth eroding its effects on living standards, so that prior to the Industrial Revolution, improvements in technology showed up as higher population rather than higher living standards.
Figure 15.9: A Shift in the Death Rate Schedule
BIRTH
AND
DEATH
RATE
BIRTH RATE
POPULATION
N0
N1
326
Figure 15.10: A Shift in the Birth Rate Schedule
BIRTH
AND
DEATH BIRTH RATE NEW
RATE
BIRTH RATE OLD
DEATH RATE
POPULATION
N1
N0
Figure 15.11: An Increase in Technological Efficiency
[Diagram of birth and death rate schedules against population, with population moving from N0 to N1.]
Malthus on the Poor Laws
The Malthusian model is one in which our usual understanding of what is good and what is bad is turned on its head. Things that we think are good, such as people living longer, turn out to be bad for average living standards, and things that we think are bad, like plagues and diseases, have a positive effect on those who survive. This non-intuitive worldview translated into Malthus's own policy recommendations. For example, he argued strongly against the poor laws, which provided relief to the poorest in society:
The poor laws of England tend to depress the general condition of the poor in these two ways. Their first obvious tendency is to increase population without increasing the food for its support. A poor man may marry with little or no prospect of being able to support a family in independence. They may be said therefore in some measure to create the poor which they maintain, and as the provisions of the country must, in consequence of the increased population, be distributed to every man in smaller proportions, it is evident that the labour of those who are not supported by parish assistance will purchase a smaller quantity of provisions than before and consequently more of them must be driven to ask for support.

Secondly, the quantity of provisions consumed in workhouses upon a part of the society that cannot in general be considered as the most valuable part diminishes the shares that would otherwise belong to more industrious and more worthy members, and thus in the same manner forces more to become dependent. If the poor in the workhouses were to live better than they now do, this new distribution of the money of the society would tend more conspicuously to depress the condition of those out of the workhouses.
Over the years, Malthus has often been criticised for being overly-pessimistic about the fate of
mankind and for opposing socially-progressive policies. However, it is worth noting the date
that he wrote his famous essay: 1798. Up until the time that he wrote his essay, his version
of how the world worked actually described the economy remarkably well. It was only after
his book was written that technological progress became fast enough to render his analysis
less relevant.
Chapter 16
Malthus and the Environment
The Malthusian model may seem of interest today only for the light that it sheds on how
the world worked before the Industrial Revolution ushered in an era of growth and increasing
prosperity. Recall, however, that Malthus's views on how rising population reduced living
standards focused on how increasing numbers of people placed pressures on the allocation
of scarce resources, particularly food. In a world in which global population has just passed
7 billion, up from 4 billion in 1960 and 2 billion in 1927, it is reasonable to ask whether
important global resources, such as energy sources, agricultural land and the global resource base more generally, can cope with the pressures being placed on them.
In these notes, we will study a model that combines a Malthusian approach to population
dynamics with an approach to modelling changes in a renewable resource base, which can
expand or contract. The model was first presented by James A. Brander and M. Scott Taylor
in their 1998 American Economic Review paper The Simple Economics of Easter Island: A Ricardo-Malthus Model of Renewable Resource Use.
Easter Island
On Easter Sunday 1722, a Dutch explorer called Jacob Roggeveen came across a Pacific
island that is believed to be the most remote inhabitable place in the world. Situated over
two thousand miles west of Chile (see Figure 16.1), it is about 1300 miles east of its nearest inhabited neighbour, Pitcairn Island. Known as Easter Island since Roggeveen's brief visit, the island, called Rapa Nui by its inhabitants, has had a long and fascinating history.
There is no written history of events at Easter Island prior to Roggeveen's visit, so we are reliant on archeological evidence for our understanding of what happened there a long time ago. The interpretation I'm passing on in these brief notes comes from my reading of a chapter in Jared Diamond's book, Collapse, but there are archeologists and scientists who dispute aspects of this interpretation.
Easter Island was probably first populated sometime around 900 AD. That it was ever
populated, given its remoteness, is somewhat extraordinary. It seems likely that, once populated, it had little (and possibly no) contact with the outside world. The most remarkable
feature of the island is its collection of hundreds of carved ceremonial statues featuring torsos
and heads (see Figures 2 and 3) which were mainly built between 1100 and 1500. The natives
most likely erected the statues as a form of religious worship. Evidence suggests that the
island was divided into twelve tribes and they competed with each other (perhaps for local
pride, perhaps for favour with the gods) by building larger and larger statues over time.
The statues were enormous. On average, they were 12 feet high and weighed 14 tons,
while the largest weighs 82 tons. There is plenty of evidence to show that the statues required
huge resources and that at least some of these resources were organised on a shared basis by a
centralised leadership. Large teams of carvers were needed to create the statues and as many
as 250 people were required to spend days transporting the statues around the island. When
first populated, the island had large amounts of palm trees which supplied the resources for
canoes, for tools for hunting and for materials used to transport the statues (sleds, rope, levers
etc.) Estimates of peak population vary but it appears that the population peaked at about
By the time Europeans began to visit the island one hundred years later, however, the
island was largely deforested and population seemed to be as low as 3,000. Without palm
trees, the islanders no longer had materials with which to build good canoes and this limited
their abilities to catch fish. Without forests, the island lost most of its land birds, which had
been an important source of meat. By the 1700s, the population survived mainly on farming,
with chickens the main source of protein, but deforestation had also reduced water retention in
the soil and led to soil erosion (the island is quite windy) so agricultural yields also declined.
Statue building had ceased by the early 1600s: Many of the statues remain today in various
states of completion at the quarry at Rano Raraku where they were carved. Archeological
evidence shows increasing numbers of spears and daggers appearing around this time, as well
as evidence of people starting to live in caves and fortified dwellings. By the time Europeans
arrived in the following century, tensions over food shortages had spilled over into intra-tribal
rivalries with tribes knocking over the statues of their rivals. By the mid-1800s, all the statues
had been toppled, so today's standing statues have been put in place in modern times.
There are many gaps in our understanding of what happened at Easter Island but the
basic story appears to be that the population expanded to the point where the islands re-
sources began to diminish and once population started to decline, the island went into a
downward spiral. By the time Europeans visited in the eighteenth century, both population
and resources had been greatly diminished from their peak levels.
The model laid out over the next few pages provides a description of how this can happen.
We conclude with some thoughts about why it was allowed to happen and the potential
implications for current global environmental problems.
Figure 16.2: Easter Island Statues
Figure 16.3: Some Standing, Some Toppled
The Model
The model economy consists of a population of Nt people at time t, who sustain themselves by harvesting an available stock of resources, St.

The model consists of three elements. The first element describes the change in population. This depends positively on the size of the harvest, Ht (a bigger harvest means less deaths and perhaps more births), and on an exogenous factor d > 0 such that, without a harvest, there is a steady decline in population:

dNt/dt = -d Nt + θ Ht    (16.1)

where θ measures the sensitivity of population growth to the harvest. The next element describes the harvest. We assume that the harvest reaped per person is proportional to the available stock of resources:

Ht/Nt = γ St    (16.2)

where the parameter γ describes the productivity of the harvesting technology.
The final element, describing the change in the resource stock, is perhaps the most important. We are describing a resource stock that is renewable. It doesn't simply decline when harvested until it is all gone. Instead, it has its own capacity to increase. For example, stocks of fish can be depleted but will increase naturally again if fishing is cut back. So, our equation for the change in the resource stock has two elements:

dSt/dt = G(St) - Ht    (16.3)

The second element is simply the amount that is harvested. The first element is more interesting. It describes the ability of the resource to grow. Brander and Taylor use a logistic function to describe how the resource stock renews itself:

G(St) = r St (1 - St)    (16.4)
This equation can be interpreted as follows. The maximum level of resources is St = 1: At this level, there can be no further increase in St. Also, if St = 0 so the resource base has disappeared, then it cannot be regenerated. For all levels in between zero and one, we can note that

G(St)/St = r (1 - St)    (16.5)

So the amount of natural renewal as a fraction of the stock decreases steadily as the stock approaches its maximum value of one. This means that if the stock gets very low, it can grow at a fast rate if there is limited harvesting. However, if the stock is starting from a low base, the absolute amount of renewal will still be small, so it can take a long time for a depleted stock to rebuild itself.
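These three building blocks are straightforward to write down in code. The sketch below is a minimal Python rendition of equations (16.1) to (16.4); the notes' own simulations use RATS, so this translation is mine, with parameter defaults taken from the appendix programme.

```python
# Minimal sketch of the model's three elements (a Python translation, not
# code from the notes). Defaults follow the appendix programme:
# d = death rate, theta = sensitivity of population to the harvest,
# gamma = harvesting productivity, r = resource renewal rate.

def harvest(S, N, gamma=2.0):
    """Equation (16.2): harvest per person is gamma*S, so H = gamma*S*N in total."""
    return gamma * S * N

def growth_G(S, r=0.075):
    """Equation (16.4): logistic renewal of the resource stock."""
    return r * S * (1.0 - S)

def dN_dt(S, N, d=0.075, theta=0.1, gamma=2.0):
    """Equation (16.1): dN/dt = -d*N + theta*H."""
    return -d * N + theta * harvest(S, N, gamma)

def dS_dt(S, N, r=0.075, gamma=2.0):
    """Equation (16.3): dS/dt = G(S) - H."""
    return growth_G(S, r) - harvest(S, N, gamma)

# Renewal vanishes at S = 0 and S = 1 and is largest at S = 0.5
assert growth_G(0.0) == 0.0 and growth_G(1.0) == 0.0
assert growth_G(0.5) > growth_G(0.25) and growth_G(0.5) > growth_G(0.75)
```

Note that dS_dt(1.0, N) is negative for any N > 0: at the maximum stock there is no natural renewal, so any harvesting at all reduces the stock.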
Dynamics of Population
We are going to describe the dynamics of this model using what is known as a phase diagram,
which is a diagram that shows the direction in which variables are moving depending upon
the values that they take. In our case, we are going to describe the joint dynamics of Nt and
St .
Inserting the equation for the harvest, equation (16.2), into equation (16.1) for population growth, we get

dNt/dt = -d Nt + θγ St Nt    (16.6)

This equation shows us that the change in population is a positive function of the resource stock. This means there is a particular value of the resource stock, S*, for which population growth is zero. When resources are higher than S* population increases and when they are lower than S* population declines. The value of S* can be calculated as the value for which the right-hand side of equation (16.6) equals zero:

-d Nt + θγ S* Nt = 0 ⟹ S* = d/(θγ)    (16.7)

The resource stock consistent with an unchanged population depends positively on the exogenous death rate of the population, d, and negatively on the sensitivity of the population to the size of the harvest, θ, and on γ, which describes the productivity of the harvesting technology.
Figure 16.4 shows how we illustrate the dynamics with a phase diagram. We put population on the x-axis and the stock of resources on the y-axis. Unchanged population corresponds to a straight line at S*. For all values of resources above S* population is increasing: Thus in the area above the line, we show an arrow pointing right, meaning population is increasing. In the area below this line, there is an arrow pointing left, meaning population is falling.
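As a quick sanity check on this threshold, the following Python snippet (my own construction, with parameter values borrowed from the appendix programme) confirms that population growth from equation (16.6) switches sign exactly at S* = d/(θγ):

```python
# Population growth switches sign at the threshold S* = d/(theta*gamma)
d, theta, gamma = 0.075, 0.1, 2.0
S_star = d / (theta * gamma)            # 0.375 for these parameters

def dN_dt(S, N):
    """Equation (16.6): dN/dt = -d*N + theta*gamma*S*N."""
    return -d * N + theta * gamma * S * N

N = 0.02                                # any positive population will do
assert dN_dt(S_star + 0.1, N) > 0       # resources above S*: population rising
assert dN_dt(S_star - 0.1, N) < 0       # resources below S*: population falling
assert abs(dN_dt(S_star, N)) < 1e-12    # exactly at S*: population unchanged
```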
Dynamics of Resources
The dynamics of resources are derived by substituting the logistic resource renewal function, equation (16.4), and the equation for the harvest, equation (16.2), into equation (16.3) to get

dSt/dt = r St (1 - St) - γ Nt St    (16.8)

The stock of resources will be unchanged for all combinations of St and Nt that satisfy

r St (1 - St) - γ Nt St = 0 ⟹ Nt = r (1 - St)/γ    (16.9)
This means that there is a downward-sloping line in N-S space along which each point is a point such that the change in resources is zero. This line is shown on Figure 16.5. The upper point crossing the S axis corresponds to no change because S = 1 and there are no people; as we move down the line we get points that correspond to no change in the stock of resources because, while there are progressively larger numbers of people, the stock gets smaller and so renews itself at a faster proportional rate, allowing the larger harvest to be exactly offset.
Remembering that equation (16.8) tells us that the change in the stock of resources depends negatively on the size of the population, note now that every point that lies to the right of the downward-sloping dSt/dt = 0 line has a higher level of population than the points on the line. That means that the stock of resources is declining for every point to the right of the line and increasing for every point to the left of it. Thus, in the area above the downward-sloping line on Figure 16.5, we show an arrow pointing down, meaning the stock of resources is falling. In the area below this line, there is an arrow pointing up, meaning the stock of resources is increasing.
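The same kind of check works for the resource dynamics: at any point on the Nt = r(1 - St)/γ line the stock is unchanged, while adding people makes it fall and removing people makes it rise. A short sketch (again with parameter values from the appendix programme):

```python
# Sign of resource change either side of the dS/dt = 0 line
r, gamma = 0.075, 2.0

def dS_dt(S, N):
    """Equation (16.8): dS/dt = r*S*(1-S) - gamma*N*S."""
    return r * S * (1 - S) - gamma * N * S

S = 0.5
N_line = r * (1 - S) / gamma        # equation (16.9): population on the line
assert abs(dS_dt(S, N_line)) < 1e-12   # on the line: stock unchanged
assert dS_dt(S, N_line + 0.01) < 0     # right of the line: stock falling
assert dS_dt(S, N_line - 0.01) > 0     # left of the line: stock rising
```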
In Figure 16.6, we put together the four arrows drawn in Figures 16.4 and 16.5. This phase diagram shows that the joint dynamics of population and resources can be divided up into four quadrants, each with its own combination of rising or falling population and resources.

We can also see that there is one point at which both population and resources are unchanged, and thus the model stays at this point if it is reached. We know already from equation (16.7) that the level of the resource stock at this point is S* = d/(θγ). We can calculate the level of population associated with this point by inserting this formula into equation (16.9):

N* = r (1 - d/(θγ))/γ = r (θγ - d)/(θγ²)    (16.10)
This level of population depends positively on r (so faster resource renewal raises population) and on θ (the sensitivity of population growth to the harvest) and negatively on d (the exogenous death rate).
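These comparative statics can be verified numerically. The Python sketch below (my own translation; parameter values from the appendix programme) computes S* and N* from equations (16.7) and (16.10) and checks that N* moves in the stated directions:

```python
# Numerical check of the equilibrium formulas (16.7) and (16.10)
d, r, theta, gamma = 0.075, 0.075, 0.1, 2.0

S_star = d / (theta * gamma)                           # equation (16.7)
N_star = r * (theta * gamma - d) / (theta * gamma**2)  # equation (16.10)

assert abs(S_star - 0.375) < 1e-12
assert abs(N_star - r * (1 - S_star) / gamma) < 1e-12  # the two forms in (16.10) agree

def N_of(d, r, theta, gamma):
    """Equilibrium population as a function of the parameters."""
    return r * (theta * gamma - d) / (theta * gamma**2)

assert N_of(d, r * 1.1, theta, gamma) > N_star   # faster renewal -> more people
assert N_of(d, r, theta * 1.1, gamma) > N_star   # more sensitivity -> more people
assert N_of(d * 1.1, r, theta, gamma) < N_star   # higher death rate -> fewer people
```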
This point is clearly some kind of equilibrium in the sense that once the economy reaches this point, it tends to stay there. But is the economy actually likely to end up at this point? The answer is yes: From any interior point (i.e. a point in which there is a non-zero population and resource stock) the economy eventually ends up at (N*, S*). It's beyond the scope of this class to prove formally that this is the case (the Brander-Taylor paper goes through all the gory details) but I can note that, after messing around with the equations, one can show that
(1/Nt) dNt/dt = θγ (St - S*)    (16.11)

(1/St) dSt/dt = -γ (Nt - N*) - r (St - S*)    (16.12)

so the dynamics of both population and the resource stock are driven by how far each variable is from its equilibrium value.
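Readers can confirm the rearrangement numerically: equations (16.11) and (16.12) are identities given the definitions of S* and N*. A quick Python check at an arbitrary interior point (the values of S and N here are arbitrary choices of mine):

```python
# Equations (16.11)-(16.12) hold exactly, given S* = d/(theta*gamma)
# and N* = r(1 - S*)/gamma
d, r, theta, gamma = 0.075, 0.075, 0.1, 2.0
S_star = d / (theta * gamma)
N_star = r * (1 - S_star) / gamma

S, N = 0.6, 0.03                        # arbitrary interior point

lhs_N = -d + theta * gamma * S          # (1/N) dN/dt from equation (16.6)
rhs_N = theta * gamma * (S - S_star)    # equation (16.11)
assert abs(lhs_N - rhs_N) < 1e-12

lhs_S = r * (1 - S) - gamma * N                   # (1/S) dS/dt from equation (16.8)
rhs_S = -gamma * (N - N_star) - r * (S - S_star)  # equation (16.12)
assert abs(lhs_S - rhs_S) < 1e-12
```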
What does changing the parameter γ (which determines the fraction of the resources that is harvested) do to the equilibrium level of population? There are two different effects. On the one hand, a higher γ means that a smaller number of people can consume the natural growth in resources that occurs in steady-state; this would tend to reduce the sustainable level of population. On the other hand, the smaller stock of resources associated with the higher value of γ implies a higher harvest per person, which could sustain more people. We can calculate the net effect by differentiating equation (16.10) with respect to γ:

dN*/dγ = -r/γ² + 2rd/(θγ³)    (16.13)
= (r/γ²) (2d/(θγ) - 1)    (16.14)

= (r/γ²) (2S* - 1)    (16.15)

This shows that whether an increase in γ raises or reduces the equilibrium population depends on the size of the equilibrium level of resources. If the equilibrium level of resources is over half the original maximum amount (which we have set equal to one) then we have 2S* - 1 > 0 and a more intensive rate of harvesting raises the population even though it reduces the total amount of resources. On the other hand, if the equilibrium level of resources is less than half the original maximum amount then we have 2S* - 1 < 0 and a more intensive rate of harvesting reduces the equilibrium population.
An economy like Easter Island, which ended up with a hugely diminished amount of resources, likely corresponds to the latter case, so it was an example of an economy that would have had a higher long-run population if it had harvested less.
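The sign condition in equation (16.15) can be checked with a finite-difference derivative. In this Python sketch (my construction; d, r and θ are taken from the appendix programme, with γ varied), γ = 1 puts S* above one half and γ = 2 puts it below:

```python
# Finite-difference check that sign(dN*/dgamma) = sign(2S* - 1)
d, r, theta = 0.075, 0.075, 0.1

def S_star(gamma):
    return d / (theta * gamma)                 # equation (16.7)

def N_star(gamma):
    return r * (1 - S_star(gamma)) / gamma     # equation (16.10)

def dN_dgamma(gamma, h=1e-6):
    """Central finite difference of N* with respect to gamma."""
    return (N_star(gamma + h) - N_star(gamma - h)) / (2 * h)

# gamma = 1 -> S* = 0.75 > 0.5: more intensive harvesting raises N*
assert S_star(1.0) > 0.5 and dN_dgamma(1.0) > 0
# gamma = 2 -> S* = 0.375 < 0.5: more intensive harvesting lowers N*
assert S_star(2.0) < 0.5 and dN_dgamma(2.0) < 0
```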
Let's go back to Easter Island and imagine the island in its early days with a full stock of resources and very few residents. What happens next? Figure 16.7 provides an illustration. The economy starts out in what we can call the happy quadrant, with resources above the long-run equilibrium and an expanding population. How do we know the dynamics take the curved form displayed in Figure 16.7? Well, when the economy crosses into the bottom right quadrant, in which population is now falling, the economy doesn't suddenly jump off in a different direction; the model's equations don't allow for any sudden jumps. Thus, the turnaround from increasing population to falling population must occur gradually over time.
So what happens to our theoretical Easter Island?

At first, it travels through the happy quadrant: population rises while the resource stock declines towards its long-run level.

Then, when it moves into the bottom right quadrant, population falls and resources keep declining.

Then the economy moves into the bottom left quadrant where population keeps falling but the resource stock starts to recover.

Then the economy moves into the quadrant in the triangle under the two curves, where both population and resources are increasing.

Finally, the economy moves back into the quadrant where it started but with less population and lower resources. The process is repeated with smaller fluctuations until it eventually settles down at the equilibrium (N*, S*).

Our theoretical Easter Island sees its population far overshoot its long-run equilibrium level before collapsing below this level and then oscillating around the long-run level with ever-smaller fluctuations until it finally settles down.
Figure 16.4: Population Dynamics
Figure 16.5: Resource Stock Dynamics
Figure 16.6: Dynamics Differ In Four Quadrants
Figure 16.7: Illustrative Dynamics Starting from Low Population and High Resources
Numerical Example: A Lower Harvesting Rate
One of the ways to explore the properties of models like this one is to use software like Excel or an econometric package to simulate discrete-time versions of the model. Figures 16.8 and 16.9 show time series for resources and population generated from a RATS programme that simulates a discrete-time adaptation of the model. The programme is shown at the back in an appendix. It implements a version of the model with parameter values d = 0.075, r = 0.075, γ = 2 and θ = 0.1.

The parameter values are set so that the equilibrium level of resources is S* = d/(θγ) = 0.075/0.2 = 0.375 while the equilibrium level of population is N* = r(1 - S*)/γ = (0.075)(1 - 0.375)/2 = 0.0234375.
Figure 16.8 shows that, for these parameter values, the stock of resources falls to about
half of its long-run equilibrium value, then rises and overshoots this value and then oscillates
before settling down at this equilibrium level. Figure 16.9 shows the associated movements
in population. We see population surge to levels that are over twice the long-run sustainable
level, then dramatically drop to undershoot this level before eventually settling down.
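For readers without RATS, the same discrete-time adaptation is easy to reproduce. The Python sketch below is my translation of the recursion in the appendix programme, and it replicates the boom-bust-convergence pattern shown in Figures 16.8 and 16.9:

```python
# Discrete-time adaptation of the model, following the appendix programme:
#   s(k) = s(k-1) + r*s(k-1)*(1 - s(k-1)) - h(k-1)
#   h(k) = gamma*s(k-1)*n(k-1)
#   n(k) = (1 - d)*n(k-1) + theta*h(k)
d, r, theta, gamma = 0.075, 0.075, 0.1, 2.0
S, N, H = 1.0, 0.0001, 0.0            # initial conditions from the programme

path_S, path_N = [S], [N]
for _ in range(10000):
    S_next = S + r * S * (1 - S) - H  # stock updated with last period's harvest
    H = gamma * S * N                 # this period's harvest
    N = (1 - d) * N + theta * H       # population update
    S = S_next
    path_S.append(S)
    path_N.append(N)

S_star = d / (theta * gamma)          # 0.375
N_star = r * (1 - S_star) / gamma     # 0.0234375
assert abs(S - S_star) < 1e-6 and abs(N - N_star) < 1e-6  # converges
assert max(path_N) > 1.5 * N_star     # population overshoots its long-run level
assert min(path_S) < S_star           # resources undershoot theirs
```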
Because we have chosen a base case in which S* < 0.5, this is a case where there would be higher resources and population in the long-run if we had a somewhat lower rate of harvesting. Indeed, you can pick a rate of harvesting that avoids a collapse scenario altogether. Figures 16.10 and 16.11 compare the base case we have just looked at with a case in which the rate of harvesting was 40 percent lower, so γ = 1.2. In this case, the resource stock only slightly undershoots its long-run level and the population only slightly overshoots. The economy ends up converging smoothly to its long-run equilibrium, avoiding the boom and bust of the base case.
Figure 16.8: Resources in a Simulated Easter Island Economy
[Time series plot of the resource stock over 500 simulated periods.]
Figure 16.9: Population in a Simulated Easter Island Economy
[Time series plot of population over 500 simulated periods.]
Figure 16.10: Resource Stock with Less Harvesting
[Time series plot over 500 periods comparing the Base Case and Less Harvesting scenarios.]
Figure 16.11: Population with Less Harvesting
[Time series plot over 500 periods comparing the Base Case and Less Harvesting scenarios.]
Why Doesn't Someone Shout Stop?
The pattern demonstrated in the model, in which the economy far overshoots its long-run level before collapsing to an equilibrium with lower population and depleted resources, may seem to fit what happened at Easter Island. But it raises plenty of questions: Why did the residents of the island allow this to happen? Why didn't they establish better governance rules to prevent the deforestation that proved so devastating? And could this model possibly be a warning that today's global economy could represent an overshooting, with a significant correction still to come?
In his book, Collapse, Jared Diamond discusses Easter Island and a number of other cases in which societies saw dramatic collapses, many triggered by long-term environmental damage. Diamond points to a number of potential explanations for why societies can let environmental damage occur:
The Tragedy of the Commons: It may simply never be in anyone's interests at any point in time to cut back on their own use of a common resource. A fisherman may know that excess fishing will eventually put him out of business but there may be little he can do to prevent others fishing and today he needs to earn an income. Some societies can put in place centralised structures to manage common resources but others cannot. At present, the society called The Earth is not known for its efficient centralised decision-making.
Failure to Anticipate: Societies may not realise exactly how much damage they are doing to their environment or what its long-term consequences will be. Up until the point where Easter Island's deforestation was well advanced, there was probably a limited realisation among the population of the damage being done. Once the population began to shrink and the tribes turned against each other (there's some evidence of cannibalism during this period) a commonly negotiated solution to cut down fewer trees to preserve the environment was unlikely. Similarly today, the future effects of climate change are unpredictable while the costs of dealing with it must be paid up front.
Failure to Perceive, Until Too Late: Diamond notes that environmental change
often occurs at such a slow pace that people fail to notice it and plan to deal with it.
The Easter Islanders of 1500 probably couldn't remember (and certainly had no written
record of) their island being covered in palm trees. The islander who eventually cut
down the last tree probably had little idea that these trees had once been the mainstay
of the local economy. Similarly, global climate change has occurred at such a slow pace
that, despite the mountain of scientific evidence that it is real, many simply choose to
deny it.
Appendix: Programme For Easter Island Simulation
Figures 16.8 and 16.9 were produced using the programme below. The programme is written for the econometric package RATS but a programme of this sort could be written for lots of different software packages.
allocate 10000
set d = 0.075
set r = 0.075
set gamma = 2
set theta = 0.1
set s = 1
set n = 0.0001
set h = 0
do k = 2,10000
comp s(k) = s(k-1) + r(k)*s(k-1)*(1-s(k-1) ) - h(k-1)
comp h(k) = gamma(k)*s(k-1)*n(k-1)
comp n(k) = (1-d(k))*n(k-1) + theta(k)*h(k)
end do k
graph 1
# s 1 500
graph 1
# n 1 500