5ssmn932 Lecture7 2021 Collated Online
Dragos Radu
[email protected]
recommended reading for this week: Stock and Watson chapter 12.1
corruption and GDP growth (recap from last tutorial)
[diagrams: X and Y; then Z, X and Y]
Y = b0 + b1 · X + u
For an instrumental variable (an “instrument”) Z to be valid, it must
satisfy two conditions:
1 Instrument relevance: corr(Z, X) ≠ 0
2 Instrument exogeneity: corr(Z, u) = 0
Suppose for now that you have such a Z (we’ll discuss how to find
instrumental variables later). How can you use Z to estimate b1?
two stage least squares estimates (2SLS)
Y = b0 + b1 · X + u
TSLS - so named because the results can be obtained directly by two
consecutive OLS regressions:
1 OLS regression of X on Z to get X̂
2 OLS regression of Y on X̂ to get b̂_IV
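The two steps can be sketched on simulated data. This is a minimal illustration in Python, not the course's Stata workflow: the data-generating process, the "true" coefficients and the helper `ols` are all assumptions made for the example. X is built to depend on both Z and the error u, so OLS of Y on X is biased while the two-stage estimate should land close to the assumed b1.

```python
import random

def ols(x, y):
    """Simple OLS of y on x with an intercept; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope

random.seed(0)
n = 5000
b0, b1 = 1.0, 2.0  # assumed "true" coefficients, chosen only for illustration

z = [random.gauss(0, 1) for _ in range(n)]  # instrument
u = [random.gauss(0, 1) for _ in range(n)]  # error term, drawn independently of z
# X depends on Z (relevance) and on u (which makes X endogenous)
x = [0.8 * zi + 0.5 * ui + random.gauss(0, 1) for zi, ui in zip(z, u)]
y = [b0 + b1 * xi + ui for xi, ui in zip(x, u)]

_, b1_ols = ols(x, y)                        # naive OLS of Y on X
p0_hat, p1_hat = ols(z, x)                   # stage 1: OLS of X on Z
x_hat = [p0_hat + p1_hat * zi for zi in z]   # fitted values of X
_, b1_tsls = ols(x_hat, y)                   # stage 2: OLS of Y on fitted X

print(f"OLS slope:  {b1_ols:.3f}")
print(f"TSLS slope: {b1_tsls:.3f}")
```

In this setup the OLS slope is pushed above the assumed b1 = 2 because X and u are positively correlated, while the two-stage slope stays near 2.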
we introduce additional control variables to eliminate potential sources of bias
the red area of overlap between the three variables represents the variation in GDP that is explained by institutions and the control variables
the remaining blue area represents the variation in GDP uniquely explained by institutional variation, i.e. the true effect of institutions on economic performance
institutions and economic performance
problem:
what if relevant control variables are not available or are unobservable?
what if higher GDP also leads to better institutions? (reverse causality)
then there is no way to distinguish between the red and the blue areas, i.e. we cannot identify the true effect of institutions on economic performance
institutions and economic performance
we need an instrumental variable for institutions which should have two properties:
• it must be independent of the error term (no overlap with yellow or red areas)
• it must be a determinant of institutions (large overlap with orange and blue areas)
do good institutions cause economic prosperity?
Y = b0 + b1 · X + u
where do good institutions come from?
Acemoglu, Johnson, Robinson (AJR)
Y = b0 + b1 · X + u
Y : economic performance
X : institutions
Z : settler mortality rates (instrument)
first stage: we regress X on Z and predict X̂, i.e. brown + green
second stage: we regress Y on X̂ to produce an estimate of b1
institutions and economic performance
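. use iv_tutorial8.dta
. * first stage: regress institutions (prot) on settler mortality (logmort)
. reg prot logmort
. * store the first-stage fitted values of prot
. predict prothat
. * second stage: regress economic performance (lgdp) on the fitted values
. reg lgdp prothat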
first stage: institutions and settler mortality
Acemoglu, Johnson, Robinson (AJR)
------------------------------------------------------------------------------
prot | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
logmort | -.6213181 .1273148 -4.88 0.000 -.8758166 -.3668195
_cons | 9.400169 .6116454 15.37 0.000 8.177507 10.62283
------------------------------------------------------------------------------
. predict prothat
(option xb assumed; fitted values)
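Instrument relevance shows up in the first stage as a slope on logmort that is clearly different from zero. As a small worked check, the reported t-statistic is the coefficient divided by its standard error; the sketch below uses only the two numbers printed in the output above, and the variable names are made up for the example.

```python
# Numbers copied from the first-stage Stata output above
coef = -0.6213181     # coefficient on logmort
std_err = 0.1273148   # its standard error

t_stat = coef / std_err   # t-statistic for H0: coefficient = 0
print(round(t_stat, 2))   # matches the t column in the output (-4.88)
```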
second stage results, TSLS
------------------------------------------------------------------------------
lgdp | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
prothat | .9170798 .1255658 7.30 0.000 .6660774 1.168082
_cons | 2.086889 .8237977 2.53 0.014 .4401407 3.733637
------------------------------------------------------------------------------
OLS results of lgdp on prot, for comparison
------------------------------------------------------------------------------
lgdp | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
prot | .522107 .061185 8.53 0.000 .3997999 .6444142
_cons | 4.660383 .4085062 11.41 0.000 3.843791 5.476976
------------------------------------------------------------------------------
IV in Stata
. ivreg lgdp (prot=logmort), first
First-stage regressions
------------------------------------------------------------------------------
prot | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
logmort | -.6213181 .1273148 -4.88 0.000 -.8758166 -.3668195
_cons | 9.400169 .6116454 15.37 0.000 8.177507 10.62283
------------------------------------------------------------------------------
Instrumental variables (2SLS) regression
Source | SS df MS Number of obs = 64
-------------+------------------------------ F( 1, 62) = 37.29
Model | 15.8432814 1 15.8432814 Prob > F = 0.0000
Residual | 52.7384371 62 .850619954 R-squared = 0.2310
-------------+------------------------------ Adj R-squared = 0.2186
Total | 68.5817185 63 1.08859871 Root MSE = .92229
------------------------------------------------------------------------------
lgdp | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
prot | .9170798 .1501859 6.11 0.000 .6168625 1.217297
_cons | 2.086889 .9853227 2.12 0.038 .1172566 4.056521
------------------------------------------------------------------------------
Instrumented: prot
Instruments: logmort
------------------------------------------------------------------------------
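The IV output reports the same coefficient on prot as the manual second stage (.9170798) but a different standard error (.1501859 against .1255658). A likely reason, sketched below on simulated data, is that the manual second stage computes its residuals from the fitted values X̂, whereas the IV standard error uses residuals computed from the actual X. The data-generating process and the helper functions are assumptions made for this illustration, and the standard errors are the conventional (homoskedasticity-only) ones.

```python
import math
import random

def ols(x, y):
    """Simple OLS of y on x with an intercept; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope

def slope_se(regressor, resid):
    """Conventional slope standard error: sqrt(s^2 / sum((x - mean)^2))."""
    n = len(regressor)
    mean_r = sum(regressor) / n
    s_rr = sum((r - mean_r) ** 2 for r in regressor)
    s2 = sum(e * e for e in resid) / (n - 2)   # residual variance, df = n - 2
    return math.sqrt(s2 / s_rr)

random.seed(1)
n = 2000
z = [random.gauss(0, 1) for _ in range(n)]   # instrument
u = [random.gauss(0, 1) for _ in range(n)]   # error term, independent of z
x = [0.8 * zi + 0.5 * ui + random.gauss(0, 1) for zi, ui in zip(z, u)]
y = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]   # assumed b0 = 1, b1 = 2

p0_hat, p1_hat = ols(z, x)                   # first stage
x_hat = [p0_hat + p1_hat * zi for zi in z]
c0, c1 = ols(x_hat, y)                       # second stage: same slope as IV

# Manual second stage: residuals use the fitted values x_hat
resid_manual = [yi - c0 - c1 * xh for yi, xh in zip(y, x_hat)]
# IV: residuals use the actual x with the same coefficients
resid_iv = [yi - c0 - c1 * xi for yi, xi in zip(y, x)]

se_manual = slope_se(x_hat, resid_manual)
se_iv = slope_se(x_hat, resid_iv)
print(f"slope: {c1:.3f}  manual SE: {se_manual:.4f}  IV SE: {se_iv:.4f}")
```

The slope is identical in both cases; only the residual variance, and hence the standard error, changes. Whether the manual standard error comes out too large or too small depends on the data, so the direction in this simulation need not match the AJR output.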
conclusions from AJR, 2001
• derivation of b_IV
• explain the math behind the intuition
• further IV examples
Y = b0 + b1 · X + u
if Z is an instrument for X, we can write:
cov(Z, Y) = cov(Z, b0 + b1 · X + u)
          = cov(Z, b0) + cov(Z, b1 · X) + cov(Z, u)
          = 0 + cov(Z, b1 · X) + 0
          = b1 · cov(Z, X)
so that
b1 = cov(Z, Y) / cov(Z, X)
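In a sample, the population covariances are replaced by their sample counterparts. The sketch below writes this out and shows why it coincides with the two-stage procedure. The notation is introduced here for illustration and is not taken from the slides: s_{ZY}, s_{ZX} and s_Z^2 are sample covariances and the sample variance of Z, and \hat{p}_0, \hat{p}_1 are the first-stage OLS coefficients.

```latex
\begin{align*}
\hat{b}_{IV} &= \frac{s_{ZY}}{s_{ZX}} \\[4pt]
\text{first stage:}\quad
\hat{X}_i &= \hat{p}_0 + \hat{p}_1 Z_i,
\qquad \hat{p}_1 = \frac{s_{ZX}}{s_Z^2} \\[4pt]
\text{second-stage slope:}\quad
\frac{s_{\hat{X}Y}}{s_{\hat{X}}^2}
 &= \frac{\hat{p}_1\, s_{ZY}}{\hat{p}_1^2\, s_Z^2}
  = \frac{s_{ZY}}{\hat{p}_1\, s_Z^2}
  = \frac{s_{ZY}}{s_{ZX}}
\end{align*}
```

The second line uses the fact that \hat{X} is a linear function of Z, so its covariance with Y is \hat{p}_1 s_{ZY} and its variance is \hat{p}_1^2 s_Z^2.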
the variance of b_IV
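For reference, a sketch of the standard large-sample result for the case of one regressor and one instrument, as given in Stock and Watson, section 12.1. The symbols \mu_Z, \sigma_u^2, \sigma_X^2 and n are standard notation assumed here, and the second line holds only under the additional assumption of homoskedastic errors.

```latex
\begin{align*}
\hat{b}_{IV} &\overset{\text{approx.}}{\sim} N\!\left(b_1,\ \sigma^2_{\hat{b}_{IV}}\right),
\qquad
\sigma^2_{\hat{b}_{IV}}
 = \frac{1}{n}\,
   \frac{\operatorname{var}\!\left[(Z - \mu_Z)\,u\right]}
        {\left[\operatorname{cov}(Z, X)\right]^2} \\[6pt]
\text{if } \operatorname{var}(u \mid Z) = \sigma_u^2:\quad
\sigma^2_{\hat{b}_{IV}}
 &= \frac{\sigma_u^2}{n\,\sigma_X^2\,\left[\operatorname{corr}(Z, X)\right]^2}
\end{align*}
```

The weaker the correlation between Z and X, the larger the variance of the estimator, which is one reason instrument relevance matters in practice.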
X = p0 + p1 · Z + e
Y = a0 + a1 · Z + µ
yields:
Y = b0 + b1 · X + u
where:
b1 = a1 / p1 = b_IV = (causal effect of Z on Y) / (causal effect of Z on X)
Interpretation: An exogenous change in X of p1 units is associated with a
change in Y of a1 units, so the effect on Y of an exogenous unit change
in X is b1 = a1 / p1.
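The step from the two equations in Z to b1 = a1 / p1 can be spelled out by solving the X equation for Z and substituting into the Y equation. This is a sketch of the algebra, writing \mu for the error in the equation of Y on Z; it requires p1 ≠ 0, which is the instrument relevance condition.

```latex
\begin{align*}
Z &= \frac{X - p_0 - e}{p_1}
   \qquad (p_1 \neq 0 \text{ by instrument relevance}) \\[4pt]
Y &= a_0 + a_1 \cdot \frac{X - p_0 - e}{p_1} + \mu \\[4pt]
  &= \underbrace{\left(a_0 - \frac{a_1}{p_1}\,p_0\right)}_{b_0}
   + \underbrace{\frac{a_1}{p_1}}_{b_1} \cdot X
   + \underbrace{\left(\mu - \frac{a_1}{p_1}\,e\right)}_{u}
\end{align*}
```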
additional intuition
b_IV = (causal effect of Z on Y) / (causal effect of Z on X)
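With one instrument and one regressor, the IV estimate equals the ratio of the two OLS slopes on Z. The Stata output earlier in these slides reports the first-stage slope and the IV estimate, but not the regression of lgdp on logmort, so the sketch below only backs out the slope that this ratio implies. It is a derived number for illustration, not a reported result, and the variable names are made up for the example.

```python
# Numbers copied from the Stata output earlier in the slides
p1_hat = -0.6213181   # first stage: coefficient on logmort (effect of Z on X)
b_iv = 0.9170798      # IV estimate: coefficient on prot

# b_IV = a1 / p1, so the implied effect of Z on Y is a1 = b_IV * p1
a1_implied = b_iv * p1_hat
print(round(a1_implied, 4))   # roughly -0.57

# Dividing the implied effect on Y by the effect on X recovers b_IV
ratio = a1_implied / p1_hat
print(round(ratio, 7))
```

Read this way, higher settler mortality lowers both institutional quality and log GDP, and the IV estimate is the ratio of the two effects.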
further IV examples
ln(Q_i^butter) = b0 + b1 · ln(P_i^butter) + u
[diagrams: STR and TestSCR]