MIT Microeconomics 14.32 Final Review
$$y_i = \beta_0 + x_{1i}\beta_1 + x_{2i}\beta_2 + x_{3i}\beta_3 + \varepsilon_i \qquad (1)$$
Then, $\hat\beta_1 = \frac{\sum_i \tilde{x}_{1i} y_i}{\sum_i \tilde{x}_{1i}^2}$, where $\tilde{x}_{1i}$ are the residuals from regressing $x_{1i}$ on $x_{2i}$ and $x_{3i}$.
Interpretation: $\beta_1$ is the effect of $x_{1i}$ holding $x_{2i}$ and $x_{3i}$ fixed.
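The residual-regression formula can be checked numerically. The following is a minimal sketch on simulated data; the data-generating process, coefficient values, and variable names are illustrative assumptions, not part of the course material.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
x1 = 0.5 * x2 - 0.3 * x3 + rng.normal(size=n)   # x1 is correlated with the controls
y = 1.0 + 2.0 * x1 - 1.0 * x2 + 0.5 * x3 + rng.normal(size=n)

# Long regression: y on a constant, x1, x2, x3.
X_long = np.column_stack([np.ones(n), x1, x2, x3])
beta_long = np.linalg.lstsq(X_long, y, rcond=None)[0]

# Residuals from regressing x1 on a constant, x2 and x3.
W = np.column_stack([np.ones(n), x2, x3])
x1_tilde = x1 - W @ np.linalg.lstsq(W, x1, rcond=None)[0]

# Regression-anatomy formula: sum(x1_tilde * y) / sum(x1_tilde ** 2).
beta1_anatomy = (x1_tilde @ y) / (x1_tilde @ x1_tilde)

print(beta_long[1], beta1_anatomy)  # the two numbers agree up to rounding
```

The agreement is an algebraic identity in the sample, so it does not depend on the particular seed or coefficients chosen here.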
Comparison of short and long regression: the short regression is
$$y_i = \beta^s_0 + x_{1i}\beta^s_1 + u_i$$
Then
$$\beta^s_1 = \beta_1 + \beta_2 \frac{Cov(x_1, x_2)}{V(x_1)} + \beta_3 \frac{Cov(x_1, x_3)}{V(x_1)}$$
Short = long when (1) the omitted variables have coefficients of zero or (2) the omitted variables are uncorrelated with the included variables!
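The short-versus-long relationship also holds exactly in a sample when population moments are replaced by sample covariances and variances. A small sketch, assuming simulated data with made-up coefficients:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
x1 = 0.6 * x2 + 0.4 * x3 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 1.5 * x2 - 0.7 * x3 + rng.normal(size=n)

def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(n)
b_long = ols(np.column_stack([ones, x1, x2, x3]), y)   # long regression
b_short = ols(np.column_stack([ones, x1]), y)          # short regression

# Sample analogue of the omitted variable bias formula.
var_x1 = np.var(x1, ddof=1)
cov12 = np.cov(x1, x2)[0, 1]
cov13 = np.cov(x1, x3)[0, 1]
ovb_prediction = b_long[1] + b_long[2] * cov12 / var_x1 + b_long[3] * cov13 / var_x1

print(b_short[1], ovb_prediction)  # equal up to rounding
```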
Interpreting dummy variables and interactions
Example: $X_1$ = years of schooling, $X_2 = 1(\text{sex} = \text{female})$
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_{12} X_1 X_2 + \varepsilon$$
This regression approximates the CEF (the model isn't saturated when one of the interaction variables is continuous).
The (female-male) difference in intercepts is $\beta_2$.
The (female-male) difference in schooling slopes is $\beta_{12}$.
This parameterization produces results identical to those from separate regressions on $X_1$ by sex.
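The equivalence between the interacted model and separate regressions by sex can be verified on simulated data. This is a sketch under assumed values; the schooling range, coefficients, and names such as `fit_line` are hypothetical. It compares point estimates only.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400
female = rng.integers(0, 2, size=n).astype(float)      # X2: 1 if female
school = rng.integers(8, 21, size=n).astype(float)     # X1: years of schooling
y = 1.0 + 0.10 * school - 0.20 * female + 0.05 * school * female + rng.normal(size=n)

# Pooled regression with the dummy and the interaction.
X = np.column_stack([np.ones(n), school, female, school * female])
b0, b1, b2, b12 = np.linalg.lstsq(X, y, rcond=None)[0]

def fit_line(mask):
    """Intercept and slope from regressing y on schooling within one group."""
    Xg = np.column_stack([np.ones(mask.sum()), school[mask]])
    return np.linalg.lstsq(Xg, y[mask], rcond=None)[0]

male_int, male_slope = fit_line(female == 0)
fem_int, fem_slope = fit_line(female == 1)

print(b2, fem_int - male_int)       # difference in intercepts
print(b12, fem_slope - male_slope)  # difference in schooling slopes
```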
F-tests
$$F = \frac{(RSS_r - RSS_{ur})/q}{RSS_{ur}/(n - k - 1)} \sim F(q,\, n - k - 1)$$
where RSS is the residual sum of squares. The ur and r subscripts denote the unrestricted and restricted models, $q$ is the number of restrictions, $n$ is the number of observations, and $k$ is the number of regressors (slope parameters, excluding the constant) in the unrestricted model.
Note: Requires homoskedasticity and (in small samples) normally distributed residuals; otherwise a robust formula is required!
e.g. to test $H_0: \beta_2 = \beta_3 = 0$ in (1), $RSS_r$ is the residual sum of squares from the regression of $y$ on just $x_1$. $RSS_{ur}$ is the residual sum of squares from (1), $q = 2$, and $k = 3$.
An F-statistic testing only one restriction is just the square of the corresponding t-statistic!
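A short sketch of the F-statistic computed from restricted and unrestricted residual sums of squares, together with a check that a single-restriction F equals the squared t-statistic. The simulated data and helper names are assumptions for illustration, and the t-statistic uses the conventional homoskedastic variance formula.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
x1, x2, x3 = rng.normal(size=(3, n))
y = 1.0 + 2.0 * x1 + 0.3 * x2 + 0.1 * x3 + rng.normal(size=n)
ones = np.ones(n)

def rss(X, y):
    """Residual sum of squares from OLS of y on X."""
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return resid @ resid

X_ur = np.column_stack([ones, x1, x2, x3])   # unrestricted model (1), k = 3
k = 3
rss_ur = rss(X_ur, y)

# H0: beta_2 = beta_3 = 0, so the restricted model keeps only x1 (q = 2).
q = 2
rss_r = rss(np.column_stack([ones, x1]), y)
F = ((rss_r - rss_ur) / q) / (rss_ur / (n - k - 1))

# Single restriction H0: beta_3 = 0, compared with the t-statistic on x3.
rss_r1 = rss(np.column_stack([ones, x1, x2]), y)
F_single = (rss_r1 - rss_ur) / (rss_ur / (n - k - 1))

b = np.linalg.lstsq(X_ur, y, rcond=None)[0]
sigma2 = rss_ur / (n - k - 1)                     # homoskedastic error variance
V = sigma2 * np.linalg.inv(X_ur.T @ X_ur)         # conventional OLS covariance
t3 = b[3] / np.sqrt(V[3, 3])

print(F, F_single, t3 ** 2)  # F_single and t3**2 coincide
```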
Differences-in-differences: have some observations where a policy change took place, others where it did not. Have data on an outcome of interest before and after the change in both types of places. Estimate the effect of the policy as the difference in the difference in mean outcomes before and after the change across places with and without the policy change. The key assumption is that the only thing that caused a systematic difference in changes across places is the policy. Regression version:
$$y_{it} = \beta_0 + \beta_1 Post_t + \beta_2 Treatment_i + \beta_3 (Post_t \times Treatment_i) + \varepsilon_{it}$$
The diff-in-diff estimate is $\beta_3$.
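In the two-group, two-period case the interaction coefficient equals the difference in differences of the four cell means. The sketch below checks this on simulated data; the treatment effect of 2.0 and all names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1000
post = rng.integers(0, 2, size=n).astype(float)    # 1 after the policy change
treat = rng.integers(0, 2, size=n).astype(float)   # 1 in places that changed policy
y = 1.0 + 0.5 * post + 0.3 * treat + 2.0 * post * treat + rng.normal(size=n)

# Regression version: constant, Post, Treatment, Post x Treatment.
X = np.column_stack([np.ones(n), post, treat, post * treat])
b = np.linalg.lstsq(X, y, rcond=None)[0]

def cell_mean(p, t):
    """Mean outcome in the (post = p, treat = t) cell."""
    return y[(post == p) & (treat == t)].mean()

# (after - before) for treated places minus (after - before) for control places.
dd = (cell_mean(1, 1) - cell_mean(0, 1)) - (cell_mean(1, 0) - cell_mean(0, 0))

print(b[3], dd)  # identical up to rounding
```

The match is exact because the regression is saturated in the two dummies; it would no longer be exact once other covariates are added.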
Extended version (state-level example):
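$$y_{st} = \beta_0 + \gamma_s + \lambda_t + X_{st}\delta + \beta_{DD}\, Policy_{st} + \varepsilon_{st}$$
$\gamma_s$ are state fixed effects, $\lambda_t$ are year fixed effects, $X_{st}$ allows for state-specific trends (e.g. demographic changes), and the coefficient on $Policy_{st}$, $\beta_{DD}$, provides our differences-in-differences-style estimate.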
2 Asymptopia
$S_n$ converges in probability to $c$ ($plim\, S_n = c$) if $\lim_{n \to \infty} P(|S_n - c| > \epsilon) = 0$ for all $\epsilon > 0$.
Under random sampling alone, we can show $plim\, \hat\beta = \beta$ (i.e. $\hat\beta$ is a consistent estimator of $\beta$).
$\hat\beta$ is asymptotically normally distributed by the CLT:
$$AV(\hat\beta) = \frac{E[(X_i - \mu_X)^2 \varepsilon_i^2]}{n(\sigma_x^2)^2}$$
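A sample analogue of this asymptotic variance replaces the expectation with an average over squared OLS residuals. The sketch below, on simulated heteroskedastic data (an assumed design), shows that the result coincides with the slope entry of the usual heteroskedasticity-robust sandwich matrix without a degrees-of-freedom correction.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 2000
x = rng.normal(size=n)
eps = rng.normal(size=n) * (0.5 + np.abs(x))   # error variance depends on x
y = 1.0 + 2.0 * x + eps

X = np.column_stack([np.ones(n), x])
b = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ b                                   # OLS residuals

# Sample analogue of E[(X - mu)^2 eps^2] / (n * (sigma_x^2)^2).
xd = x - x.mean()
av_hat = np.sum(xd ** 2 * e ** 2) / np.sum(xd ** 2) ** 2

# Sandwich form: (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}.
XtX_inv = np.linalg.inv(X.T @ X)
meat = X.T @ (X * (e ** 2)[:, None])
V_robust = XtX_inv @ meat @ XtX_inv

print(np.sqrt(av_hat), np.sqrt(V_robust[1, 1]))  # same robust standard error
```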
3 Problems with Residuals and GLS
Heteroskedasticity:
$V(\varepsilon|x) = E[\varepsilon^2|x]$ varies as a function of $x$.
Usual OLS standard errors are incorrect and OLS is not efficient, but estimates are still consistent.
Use heteroskedasticity-robust standard errors.
If you know the functional form of the heteroskedasticity, you can do FGLS.
e.g. suppose we know that $V(\varepsilon|x) = \gamma_0 + \gamma_1 x^2$. Then: (1) estimate the original model by OLS, (2) regress $\hat\varepsilon_i^2 = \gamma_0 + \gamma_1 x_i^2$ to estimate $\gamma$, (3) form $\hat V(\varepsilon|x) = \hat\gamma_0 + \hat\gamma_1 x_i^2$, (4) reweight observations by $\hat V(\varepsilon|x)^{-1/2}$, (5) run OLS on the re-weighted data (this is WLS, a special case of GLS).
FGLS is asymptotically efficient (has lowest asymptotic variance) when the variance function is correctly specified.
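The five FGLS steps for heteroskedasticity can be sketched as follows. The simulated variance function, the clipping of fitted variances (a practical safeguard that is not part of the steps as stated), and all names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 2000
x = rng.uniform(1.0, 3.0, size=n)
eps = rng.normal(size=n) * np.sqrt(0.5 + 1.0 * x ** 2)   # V(eps|x) = 0.5 + x^2
y = 1.0 + 2.0 * x + eps
X = np.column_stack([np.ones(n), x])

# (1) Estimate the original model by OLS and keep the residuals.
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ b_ols

# (2) Regress squared residuals on a constant and x^2.
Z = np.column_stack([np.ones(n), x ** 2])
g_hat = np.linalg.lstsq(Z, e ** 2, rcond=None)[0]

# (3) Form fitted variances; clip so the weights are well defined (assumed safeguard).
v_hat = np.clip(Z @ g_hat, 1e-8, None)

# (4) Reweight each observation by v_hat^{-1/2}.
w = v_hat ** -0.5

# (5) OLS on the reweighted data (WLS).
b_fgls = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)[0]

print(b_ols, b_fgls)
```

Reweighting every column, including the constant, is what makes step (5) equivalent to weighted least squares with weights $1/\hat V$.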
Serial correlation:
$E[\varepsilon_t \varepsilon_s] \neq 0$ for some $s \neq t$
Usual OLS standard errors incorrect, but estimates still consistent (again, OLS is not BLUE)
Intuition: Correlated observations provide less information than independent observations because they do not contain entirely fresh evidence on the relationship in question.
Use HAC (aka Newey-West) standard errors
Do FGLS if willing to take a stance on the form of serial correlation:
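e.g. for AR(1), where we assume the form of serial correlation is $\varepsilon_t = \rho\varepsilon_{t-1} + u_t$ and assume $u_t$ is homoskedastic and serially uncorrelated. Then: (1) estimate the model using OLS, (2) regress $\hat\varepsilon_t = \rho\hat\varepsilon_{t-1} + u_t$ to get $\hat\rho$, (3) quasi-difference the data: $\tilde y_t = y_t - \hat\rho y_{t-1}$, $\tilde x_t = x_t - \hat\rho x_{t-1}$, (4) run OLS on the quasi-differenced data.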
FGLS is asymptotically efficient (has lowest asymptotic variance)
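The AR(1) steps can be sketched in a few lines. The series length, the true value $\rho = 0.6$, and the variable names are assumptions; the first observation is simply dropped when quasi-differencing.

```python
import numpy as np

rng = np.random.default_rng(7)
T = 500
rho = 0.6
u = rng.normal(size=T)
eps = np.zeros(T)
for t in range(1, T):
    eps[t] = rho * eps[t - 1] + u[t]      # AR(1) errors
x = rng.normal(size=T)
y = 1.0 + 2.0 * x + eps
X = np.column_stack([np.ones(T), x])

# (1) OLS on the original data.
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ b_ols

# (2) Regress e_t on e_{t-1} (no constant) to estimate rho.
rho_hat = (e[1:] @ e[:-1]) / (e[:-1] @ e[:-1])

# (3) Quasi-difference y and every column of X, including the constant,
#     which becomes (1 - rho_hat); the first observation is lost.
y_star = y[1:] - rho_hat * y[:-1]
X_star = X[1:] - rho_hat * X[:-1]

# (4) OLS on the quasi-differenced data.
b_fgls = np.linalg.lstsq(X_star, y_star, rcond=None)[0]

print(rho_hat, b_ols, b_fgls)
```

Because the constant column is transformed along with the regressors, the first element of `b_fgls` is still an estimate of the original intercept.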
4 Instrumental Variables
Motivating example: omitted variable bias in OLS
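Model: $y_i = \beta x_i + \varepsilon_i$ (and for simpler formulas, let $\bar x = \bar y = 0$), then
$$\hat\beta = \frac{\sum_i x_i y_i}{\sum_i x_i^2} = \frac{\sum_i x_i(\beta x_i + \varepsilon_i)}{\sum_i x_i^2}$$
$$plim\, \hat\beta = \beta + \frac{plim\, \frac{1}{n}\sum_i x_i \varepsilon_i}{plim\, \frac{1}{n}\sum_i x_i^2} = \beta + \frac{Cov(x, \varepsilon)}{V(x)}$$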
Usual omitted variable bias formula: if $\varepsilon_i = \gamma a_i + u_i$ with $E[x_i u_i] = 0$, then
$$plim\, \hat\beta = \beta + \frac{Cov(x, \varepsilon)}{V(x)} = \beta + \gamma\frac{Cov(x, a)}{V(x)}$$
$$\hat\beta_{IV} = \frac{s_{ZY}/s_Z^2}{s_{ZX}/s_Z^2}$$
(reduced form estimates divided by first stage estimates)
Given a binary instrument $z$, we can construct an IV estimate as the ratio of differences in means (this is the Wald estimator!)
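With a binary instrument, the covariance-ratio form of IV and the Wald ratio of mean differences give the same number. A sketch on simulated data, where the unobserved confounder `a` and all coefficients are assumptions chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(8)
n = 5000
z = rng.integers(0, 2, size=n).astype(float)   # binary instrument
a = rng.normal(size=n)                          # unobserved confounder
x = 0.8 * z + a + rng.normal(size=n)            # endogenous regressor
y = 1.0 + 2.0 * x + 1.5 * a + rng.normal(size=n)

# IV as reduced form over first stage; the s_Z^2 terms cancel in the ratio.
beta_iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

# Wald estimator: ratio of differences in means across instrument values.
wald = (y[z == 1].mean() - y[z == 0].mean()) / (x[z == 1].mean() - x[z == 0].mean())

# OLS slope for comparison; it absorbs the correlation between x and a.
beta_ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

print(beta_iv, wald, beta_ols)
```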
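2SLS: example model
$$y = \alpha + \beta x + \gamma w + \varepsilon$$
where $E[x\varepsilon] \neq 0$ but $E[w\varepsilon] = 0$ and we have $z$ as an instrument.
First stage: Regress $x$ on $z$ and $w$, form fitted values, $\hat x$.
Second stage: Regress $y$ on $\hat x$ and $w$. The estimated coefficient on $\hat x$ is $\hat\beta_{2SLS}$.
In the bivariate model, $y = \alpha + \beta x + \varepsilon$, with one instrument, $\hat\beta_{2SLS} = \frac{\sum_i (z_i - \bar z) y_i}{\sum_i (z_i - \bar z) x_i}$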
Examples: QOB instrument to estimate economic return to schooling, draft lottery, charter
school lotteries
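The two stages can be run by hand with ordinary least squares. The sketch below uses simulated data (an assumed design with one instrument `z`, one exogenous control `w`, and an unobserved confounder `a`) and compares the manual two-step estimate with the closed-form just-identified IV estimator.

```python
import numpy as np

rng = np.random.default_rng(9)
n = 5000
w = rng.normal(size=n)            # exogenous control
z = rng.normal(size=n)            # instrument
a = rng.normal(size=n)            # unobserved confounder
x = 0.5 + 0.8 * z + 0.3 * w + a + rng.normal(size=n)
y = 1.0 + 2.0 * x + 0.5 * w + 1.5 * a + rng.normal(size=n)
ones = np.ones(n)

# First stage: regress x on a constant, z and w; form fitted values.
Zmat = np.column_stack([ones, z, w])
x_hat = Zmat @ np.linalg.lstsq(Zmat, x, rcond=None)[0]

# Second stage: regress y on a constant, x_hat and w.
b_2sls = np.linalg.lstsq(np.column_stack([ones, x_hat, w]), y, rcond=None)[0]

# Closed form for the just-identified case: (Z'X)^{-1} Z'y.
Xmat = np.column_stack([ones, x, w])
b_iv = np.linalg.solve(Zmat.T @ Xmat, Zmat.T @ y)

print(b_2sls[1], b_iv[1])  # coefficient on x from both routes
```

The point estimates agree, but standard errors read off a manual second stage are not valid, because they are based on residuals computed with $\hat x$ rather than $x$.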
Overidentification:
More instruments than endogenous variables. e.g. $x_i$ scalar, and have $z_{1i}$, $z_{2i}$.
Can test overidentifying restrictions
Ex.: Run 2SLS and then regress the residuals on the instruments to test whether the instruments estimate the same causal parameter (i.e. the coefficients are not significantly different from 0).
Can combine the estimates that we would get from using just one or the other instrument to get a more efficient estimate (this is what 2SLS (and 3SLS) does automatically)
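One common way to carry out the residual regression described above is a Sargan-style statistic: $n$ times the $R^2$ from regressing the 2SLS residuals on all instruments, compared with a chi-squared distribution whose degrees of freedom equal the number of overidentifying restrictions. The sketch below assumes homoskedastic errors and uses simulated data with two valid instruments; it is an illustration of the idea, not necessarily the exact procedure used in the course.

```python
import numpy as np

rng = np.random.default_rng(10)
n = 2000
z1 = rng.normal(size=n)
z2 = rng.normal(size=n)
a = rng.normal(size=n)                               # unobserved confounder
x = 0.7 * z1 + 0.5 * z2 + a + rng.normal(size=n)
y = 1.0 + 2.0 * x + a + rng.normal(size=n)
ones = np.ones(n)

# 2SLS with two instruments for one endogenous regressor.
Z = np.column_stack([ones, z1, z2])
x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
b = np.linalg.lstsq(np.column_stack([ones, x_hat]), y, rcond=None)[0]

# 2SLS residuals use the actual x, not the fitted values.
resid = y - np.column_stack([ones, x]) @ b

# Regress the residuals on the instruments and compute R^2.
fitted = Z @ np.linalg.lstsq(Z, resid, rcond=None)[0]
r2 = 1.0 - np.sum((resid - fitted) ** 2) / np.sum((resid - resid.mean()) ** 2)

# One overidentifying restriction here (2 instruments - 1 endogenous variable),
# so n * R^2 would be compared with a chi-squared(1) critical value.
sargan = n * r2
print(sargan)
```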
5 Simultaneous Equations
When prices and quantities are determined by solving two (or more) equations simultaneously, OLS
estimates are inconsistent for supply and demand elasticities; this is called simultaneous equations bias
Suppose we have jointly determined variables, $q_t$ and $p_t$:
$$q^d_t(p) = \alpha_0 + \alpha_1 p_t + \alpha_2 z_t + \varepsilon^d_t$$
$$q^s_t(p) = \beta_0 + \beta_1 p_t + \beta_2 x_t + \varepsilon^s_t$$
Identification: when can we estimate $\alpha$ and $\beta$?
Let's start by writing down reduced form equations for $q_t$ and $p_t$ (i.e. equations in which the regressors are uncorrelated with the errors):
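$$p = \frac{\beta_0 - \alpha_0}{\alpha_1 - \beta_1} - \frac{\alpha_2}{\alpha_1 - \beta_1} z + \frac{\beta_2}{\alpha_1 - \beta_1} x + \frac{\varepsilon^s - \varepsilon^d}{\alpha_1 - \beta_1} = \pi_{10} + \pi_{11} z + \pi_{12} x + \nu_1$$
$$q = \frac{\alpha_1\beta_0 - \alpha_0\beta_1}{\alpha_1 - \beta_1} - \frac{\alpha_2\beta_1}{\alpha_1 - \beta_1} z + \frac{\alpha_1\beta_2}{\alpha_1 - \beta_1} x + \frac{\alpha_1\varepsilon^s - \beta_1\varepsilon^d}{\alpha_1 - \beta_1} = \pi_{20} + \pi_{21} z + \pi_{22} x + \nu_2$$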
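The algebra behind the reduced form is market clearing, $q^d_t = q^s_t$. A sketch of the intermediate steps, assuming $\alpha$ labels the demand parameters and $\beta$ the supply parameters (the symbol names are an assumption, since the original notation is not fully recoverable) and that $\alpha_1 \neq \beta_1$:

$$\begin{aligned}
\alpha_0 + \alpha_1 p_t + \alpha_2 z_t + \varepsilon^d_t &= \beta_0 + \beta_1 p_t + \beta_2 x_t + \varepsilon^s_t \\
(\alpha_1 - \beta_1)\, p_t &= (\beta_0 - \alpha_0) - \alpha_2 z_t + \beta_2 x_t + (\varepsilon^s_t - \varepsilon^d_t) \\
q_t &= \beta_0 + \beta_1 p_t + \beta_2 x_t + \varepsilon^s_t
\end{aligned}$$

Dividing the second line by $(\alpha_1 - \beta_1)$ gives the reduced form for $p_t$, and substituting that expression into the third line gives the reduced form for $q_t$. In both, the right-hand side contains only $z_t$, $x_t$ and a combination of the two structural errors.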
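As a sidenote, suppose we had run an OLS regression of $q_t$ on $p_t$:
$$\hat\alpha_1 = \frac{\sum_t p_t q_t}{\sum_t p_t^2} = \alpha_1 + \frac{\sum_t p_t \varepsilon^d_t}{\sum_t p_t^2}$$
Since $\frac{\sum_t p_t \varepsilon^d_t}{\sum_t p_t^2} \neq 0$ based on our reduced form equation for $p_t$ (which shows that $p_t$ depends on $\varepsilon^d_t$), our estimate of $\alpha_1$ is biased.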
However, we can run OLS on these reduced form equations and solve for the structural parameters (the