
MIT (14.32) Spring 2014
Ben Feigenberg
14.32 Final Review
1 Multivariate Regression
Multivariate regression mechanics:
Formula for multivariate regression coefficients: consider

y_i = β_0 + x_1i·β_1 + x_2i·β_2 + x_3i·β_3 + ε_i    (1)

Then β̂_1 = Σ x̃_1i·y_i / Σ x̃_1i², where x̃_1i are the residuals from regressing x_1i on x_2i and x_3i.
Interpretation: β_1 is the effect of x_1i holding x_2i and x_3i fixed.
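A minimal numpy sketch (not from the original notes; the data and variable names are simulated for illustration) checking that the partialling-out formula reproduces the multivariate coefficient:

# Sketch: verifying beta_1 = sum(x1_tilde * y) / sum(x1_tilde^2) on simulated data.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x2, x3 = rng.normal(size=n), rng.normal(size=n)
x1 = 0.5 * x2 - 0.3 * x3 + rng.normal(size=n)     # x1 correlated with x2, x3
y = 1.0 + 2.0 * x1 - 1.0 * x2 + 0.5 * x3 + rng.normal(size=n)

# Long regression: y on a constant, x1, x2, x3
X = np.column_stack([np.ones(n), x1, x2, x3])
beta = np.linalg.lstsq(X, y, rcond=None)[0]

# Partialling out: residualize x1 on (1, x2, x3), then regress y on that residual
W = np.column_stack([np.ones(n), x2, x3])
x1_tilde = x1 - W @ np.linalg.lstsq(W, x1, rcond=None)[0]
beta1_fwl = (x1_tilde @ y) / (x1_tilde @ x1_tilde)

print(beta[1], beta1_fwl)   # the two estimates of beta_1 coincide (up to floating point)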
Comparison of short and long regressions: the short regression is

y_i = β_0^s + x_1i·β_1^s + u_i

Then β_1^s = β_1 + β_2·Cov(x_1, x_2)/V(x_1) + β_3·Cov(x_1, x_3)/V(x_1).
Short = long when (1) the omitted variables have coefficients of zero or (2) the omitted variables are uncorrelated with the included variables!
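A quick numpy check of the short-vs-long (omitted variable bias) decomposition on simulated data; all coefficient values and names below are illustrative assumptions:

# Sketch: beta_1^s = beta_1 + beta_2 * d2 + beta_3 * d3, where d_k is the slope
# from regressing the omitted x_k on x_1 (the sample analog of Cov(x1, xk)/V(x1)).
import numpy as np

rng = np.random.default_rng(1)
n = 2000
x2, x3 = rng.normal(size=n), rng.normal(size=n)
x1 = 0.6 * x2 + 0.4 * x3 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 3.0 * x2 - 1.0 * x3 + rng.normal(size=n)

long_X = np.column_stack([np.ones(n), x1, x2, x3])
b0, b1, b2, b3 = np.linalg.lstsq(long_X, y, rcond=None)[0]

short_X = np.column_stack([np.ones(n), x1])
b1_short = np.linalg.lstsq(short_X, y, rcond=None)[0][1]

# Auxiliary regressions of each omitted variable on (1, x1)
d2 = np.linalg.lstsq(short_X, x2, rcond=None)[0][1]
d3 = np.linalg.lstsq(short_X, x3, rcond=None)[0][1]
print(b1_short, b1 + b2 * d2 + b3 * d3)   # identical up to floating point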
Interpreting dummy variables and interactions
Example: X_1 = years of schooling, X_2 = 1(sex = female)

Y = β_0 + β_1·X_1 + β_2·X_2 + β_12·X_1·X_2 + ε

This regression approximates the CEF (the model isn't saturated when one of the interacted variables is continuous).
The (female − male) difference in intercepts is β_2.
The (female − male) difference in schooling slopes is β_12.
This parameterization produces results identical to those from separate regressions on X_1 by sex.
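A small simulated-data sketch (illustrative, not from the notes) confirming that the interacted regression matches separate by-sex regressions:

# Sketch: interaction coefficients equal the differences in by-group intercepts/slopes.
import numpy as np

rng = np.random.default_rng(2)
n = 1000
school = rng.uniform(8, 18, size=n)          # X1: years of schooling
female = rng.binomial(1, 0.5, size=n)        # X2: 1(sex = female)
y = 1.0 + 0.08 * school + 0.3 * female - 0.02 * school * female + rng.normal(scale=0.5, size=n)

X = np.column_stack([np.ones(n), school, female, school * female])
b0, b1, b2, b12 = np.linalg.lstsq(X, y, rcond=None)[0]

# Separate regressions of y on (1, schooling) by sex
def fit(mask):
    Z = np.column_stack([np.ones(mask.sum()), school[mask]])
    return np.linalg.lstsq(Z, y[mask], rcond=None)[0]

a_m, s_m = fit(female == 0)      # male intercept and slope
a_f, s_f = fit(female == 1)      # female intercept and slope
print(b2, a_f - a_m)             # difference in intercepts
print(b12, s_f - s_m)            # difference in schooling slopes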
F-tests

F = [(RSS_r − RSS_ur)/q] / [RSS_ur/(n − k − 1)] ~ F(q, n − k − 1)

where RSS is the residual sum of squares, the ur and r subscripts denote the unrestricted and restricted models, q is the number of restrictions, n is the number of observations, and k is the number of parameters in the unrestricted model.
Note: this requires homoskedasticity and (in small samples) normal residuals; otherwise the robust formula is required!
e.g. to test H_0: β_2 = β_3 = 0 in (1), RSS_r is the residual sum of squares from the regression of y on just x_1, RSS_ur is the residual sum of squares from (1), q = 2, and k = 3.
An F-statistic testing only one restriction is just the square of the corresponding t-statistic!
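A sketch of this F-test computed from RSS_r and RSS_ur on simulated data; scipy is assumed only for the F-distribution p-value:

# Sketch: F test of H0: beta_2 = beta_3 = 0 in model (1).
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 500
x1, x2, x3 = rng.normal(size=(3, n))
y = 1.0 + 2.0 * x1 + 0.2 * x2 + 0.1 * x3 + rng.normal(size=n)

def rss(X, y):
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b
    return e @ e

X_ur = np.column_stack([np.ones(n), x1, x2, x3])   # unrestricted: k = 3 slopes
X_r = np.column_stack([np.ones(n), x1])            # restricted: y on just x1
q, k = 2, 3
F = ((rss(X_r, y) - rss(X_ur, y)) / q) / (rss(X_ur, y) / (n - k - 1))
p = stats.f.sf(F, q, n - k - 1)                    # p-value from F(q, n-k-1)
print(F, p)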
Differences-in-differences: we have some observations where a policy change took place and others where it did not, and data on an outcome of interest before and after the change in both types of places. Estimate the effect of the policy as the difference in the difference in mean outcomes before and after the change across places with and without the policy change. The key assumption is that the only thing that caused a systematic difference in changes across places is the policy. Regression version (see the sketch below):

y_it = β_0 + β_1·Post_t + β_2·Treatment_i + β_3·(Post_t × Treatment_i) + ε_it

The diff-in-diff estimate is β_3.
Extended version (state-level example):

y_st = β_0 + γ_s + δ_t + X_st·λ + β_DD·Policy_st + ε_st

γ_s are state fixed effects, δ_t are year fixed effects, X_st allows for state-specific trends (e.g. demographic changes), and the coefficient on Policy_st provides our differences-in-differences-style estimate.
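A minimal sketch of the 2x2 diff-in-diff on simulated data, showing that the interaction coefficient equals the difference in differences of group means (all numbers are illustrative):

# Sketch: basic 2x2 differences-in-differences.
import numpy as np

rng = np.random.default_rng(4)
n = 4000
treatment = rng.binomial(1, 0.5, size=n)   # place with the policy change
post = rng.binomial(1, 0.5, size=n)        # observation after the change
effect = 1.5
y = 2.0 + 0.5 * post + 1.0 * treatment + effect * post * treatment + rng.normal(size=n)

# Regression version: coefficient on Post x Treatment is the DD estimate
X = np.column_stack([np.ones(n), post, treatment, post * treatment])
beta = np.linalg.lstsq(X, y, rcond=None)[0]

# Equivalent difference in differences of group means
means = {(t, p): y[(treatment == t) & (post == p)].mean() for t in (0, 1) for p in (0, 1)}
dd = (means[1, 1] - means[1, 0]) - (means[0, 1] - means[0, 0])
print(beta[3], dd)   # both recover the effect (here 1.5) up to sampling noise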
2 Asymptopia
S_n converges in probability to c (plim S_n = c) if lim_{n→∞} P(|S_n − c| > ε) = 0 for all ε > 0.
Under random sampling alone, we can show plim β̂ = β (i.e. β̂ is a consistent estimator of β).
β̂ is asymptotically normally distributed by the CLT, with asymptotic variance

AV(β̂) = E[(X_i − X̄)²·ε_i²] / (n·(σ_x²)²)
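A simulation sketch (illustrative assumptions throughout) of consistency and of the robust asymptotic variance formula above:

# Sketch: the bivariate OLS slope settles down as n grows, and its Monte Carlo
# variance matches E[(X - E[X])^2 * e^2] / (n * Var(X)^2).
import numpy as np

rng = np.random.default_rng(5)
beta = 2.0

def slope(n):
    x = rng.normal(size=n)                           # x has mean zero here
    e = rng.normal(size=n) * (1 + 0.5 * np.abs(x))   # heteroskedastic errors
    y = beta * x + e
    return np.sum(x * y) / np.sum(x * x)

# plim beta_hat = beta: the estimate concentrates around 2.0 as n grows
print([round(slope(n), 3) for n in (100, 10_000, 1_000_000)])

# Check the asymptotic variance formula by simulation at n = 500
n, reps = 500, 2000
draws = np.array([slope(n) for _ in range(reps)])
x = rng.normal(size=200_000)
e = rng.normal(size=200_000) * (1 + 0.5 * np.abs(x))
av = np.mean(x**2 * e**2) / (n * np.var(x) ** 2)
print(draws.var(), av)   # Monte Carlo variance vs. asymptotic formula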
3 Problems with Residuals and GLS
Heteroskedasticity: V(ε|x) = E[ε²|x] varies as a function of x.
Usual OLS standard errors are incorrect and OLS is not efficient, but the estimates are still consistent.
Use heteroskedasticity-robust standard errors.
If we know the functional form of the heteroskedasticity, we can do FGLS:
e.g. suppose we know that V(ε|x) = δ_0 + δ_1·x². Then: (1) estimate the original model by OLS, (2) regress ε̂_i² = δ_0 + δ_1·x_i² to estimate δ, (3) form V̂(ε|x) = δ̂_0 + δ̂_1·x_i², (4) reweight observations by V̂(ε|x)^(−1/2), (5) run OLS on the re-weighted data (this is WLS, a special case of GLS); see the sketch below.
FGLS is asymptotically efficient (has the lowest asymptotic variance).
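A sketch of the FGLS/WLS steps above under the assumed variance function V(ε|x) = δ_0 + δ_1·x², with simulated data:

# Sketch of the five-step FGLS/WLS recipe (true d0 = 1, d1 = 2 in the simulation).
import numpy as np

rng = np.random.default_rng(6)
n = 2000
x = rng.normal(size=n)
e = rng.normal(size=n) * np.sqrt(1.0 + 2.0 * x**2)
y = 1.0 + 2.0 * x + e

X = np.column_stack([np.ones(n), x])

# (1) OLS on the original model
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ b_ols

# (2)-(3) regress squared residuals on x^2 to estimate the variance function
Z = np.column_stack([np.ones(n), x**2])
d = np.linalg.lstsq(Z, resid**2, rcond=None)[0]
v_hat = np.clip(Z @ d, 1e-6, None)        # fitted V(e|x), kept positive

# (4)-(5) reweight by v_hat^(-1/2) and run OLS on the re-weighted data (WLS)
w = 1.0 / np.sqrt(v_hat)
b_wls = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)[0]
print(b_ols, b_wls)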
Serial correlation: E[ε_t·ε_s] ≠ 0 for some s ≠ t.
Usual OLS standard errors are incorrect, but the estimates are still consistent (again, OLS is not BLUE).
Intuition: correlated observations provide less information than independent observations because they do not contain entirely fresh evidence on the relationship in question.
Use HAC (aka Newey-West) standard errors.
Do FGLS if willing to take a stance on the form of the serial correlation:
e.g. for AR(1), we assume the form of the serial correlation is ε_t = ρ·ε_{t−1} + u_t, with u_t homoskedastic and serially uncorrelated. Then: (1) estimate the model by OLS, (2) regress ε̂_t = ρ·ε̂_{t−1} + u_t to estimate ρ, (3) quasi-difference the data: ỹ_t = y_t − ρ̂·y_{t−1}, x̃_t = x_t − ρ̂·x_{t−1}, (4) run OLS on the quasi-differenced data (see the sketch below).
FGLS is asymptotically efficient (has the lowest asymptotic variance).
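A sketch of the AR(1) FGLS recipe above (Cochrane-Orcutt-style quasi-differencing) on simulated data:

# Sketch: estimate rho from OLS residuals, then run OLS on quasi-differenced data.
import numpy as np

rng = np.random.default_rng(7)
T, rho = 600, 0.7
x = rng.normal(size=T)
e = np.zeros(T)
for t in range(1, T):                      # AR(1) errors: e_t = rho*e_{t-1} + u_t
    e[t] = rho * e[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + e

X = np.column_stack([np.ones(T), x])

# (1) OLS, (2) regress e_hat_t on e_hat_{t-1} to estimate rho
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
ehat = y - X @ b_ols
rho_hat = np.sum(ehat[1:] * ehat[:-1]) / np.sum(ehat[:-1] ** 2)

# (3) quasi-difference the data, (4) OLS on the quasi-differenced data
y_qd = y[1:] - rho_hat * y[:-1]
x_qd = x[1:] - rho_hat * x[:-1]
const_qd = np.ones(T - 1) * (1 - rho_hat)   # the intercept is quasi-differenced too
b_fgls = np.linalg.lstsq(np.column_stack([const_qd, x_qd]), y_qd, rcond=None)[0]
print(b_ols, b_fgls)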
4 Instrumental Variables
Motivating example: omitted variable bias in OLS
Model: y_i = β·x_i + ε_i (and for simpler formulas, let x̄ = ȳ = 0); then

β̂ = Σ x_i·y_i / Σ x_i² = Σ x_i·(x_i·β + ε_i) / Σ x_i²

plim β̂ = β + [plim (1/n)·Σ x_i·ε_i] / [plim (1/n)·Σ x_i²] = β + Cov(x, ε)/V(x)

This is the usual omitted variable bias formula; if ε_i = γ·a_i + u_i with E[x_i·u_i] = 0, then

plim β̂ = β + γ·Cov(x, a)/V(x) = β + γ·δ

where δ is the coefficient from regressing a on x.
Need to use IV when we want to estimate a causal effect. A good way to think about causal effects is to imagine the ideal randomized experiment that you would perform to answer your question. Then think about the data you have. If you can tell stories about why OLS might not estimate the experimental effect, then you need an instrument.
Consistency of IV: need z such that E[z·ε] = 0 (exogeneity) and E[z·x] ≠ 0 (relevance).

β̂_IV = (s_ZY / s_Z²) / (s_ZX / s_Z²)   (reduced form estimate divided by first stage estimate)

Given a binary instrument z, we can construct an IV estimate as the ratio of differences in means (this is the Wald estimator!).
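A simulated-data sketch of the Wald estimator with a binary instrument; the endogeneity here comes from an illustrative omitted "ability" term:

# Sketch: with binary z, IV is the ratio of differences in means (the Wald estimator).
import numpy as np

rng = np.random.default_rng(8)
n = 5000
ability = rng.normal(size=n)
z = rng.binomial(1, 0.5, size=n)                          # binary instrument
x = 1.0 + 0.8 * z + 0.5 * ability + rng.normal(size=n)    # x endogenous via ability
y = 2.0 * x + 1.5 * ability + rng.normal(size=n)          # true causal effect = 2

wald = (y[z == 1].mean() - y[z == 0].mean()) / (x[z == 1].mean() - x[z == 0].mean())

# Same thing written as Cov(z, y)/Cov(z, x)
iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)   # biased upward by the omitted 'ability'
print(wald, iv, ols)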
2SLS: example model

y = α + β·x + γ·w + ε

where E[x·ε] ≠ 0 but E[w·ε] = 0, and we have z as an instrument.
First stage: regress x on z and w, form fitted values x̂.
Second stage: regress y on x̂ and w. The estimated coefficient on x̂ is β̂_2SLS.
In the bivariate model y = α + β·x + ε, with one instrument,

β̂_2SLS = Σ (z_i − z̄)·y_i / Σ (z_i − z̄)·x_i

Examples: QOB (quarter of birth) instrument to estimate the economic return to schooling, the draft lottery, charter school lotteries.
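A numpy sketch of 2SLS by hand for the example model above (simulated data; the unobservable u creating the endogeneity is an illustrative assumption):

# Sketch: 2SLS for y = a + b*x + c*w + e with instrument z for x (true b = 2).
import numpy as np

rng = np.random.default_rng(9)
n = 5000
w = rng.normal(size=n)
z = rng.normal(size=n)
u = rng.normal(size=n)                            # unobservable driving the endogeneity
x = 0.5 * z + 0.4 * w + u + rng.normal(size=n)
y = 1.0 + 2.0 * x - 1.0 * w + 1.5 * u + rng.normal(size=n)

# First stage: regress x on (1, z, w) and form fitted values x_hat
W1 = np.column_stack([np.ones(n), z, w])
x_hat = W1 @ np.linalg.lstsq(W1, x, rcond=None)[0]

# Second stage: regress y on (1, x_hat, w); coefficient on x_hat is b_2SLS
W2 = np.column_stack([np.ones(n), x_hat, w])
b_2sls = np.linalg.lstsq(W2, y, rcond=None)[0]

# OLS for comparison (biased because x is correlated with the error through u)
b_ols = np.linalg.lstsq(np.column_stack([np.ones(n), x, w]), y, rcond=None)[0]
print(b_2sls[1], b_ols[1])

(One caveat: the standard errors from the manual second stage are not the correct 2SLS standard errors.)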
Overidentification:
More instruments than endogenous variables, e.g. x_i is scalar and we have z_1i, z_2i.
Can test the overidentifying restrictions (see the sketch below):
Ex./ Run 2SLS and then regress the residuals on the instruments to test whether the instruments estimate the same causal parameter (i.e. the coefficients are not significantly different from 0).
Can combine the estimates that we would get from just using one or the other instrument to get a more efficient estimate (this is what 2SLS (and 3SLS) does automatically).
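A sketch of one common version of the test described above (a Sargan-style n·R² statistic), on simulated data where both instruments are valid by construction:

# Sketch: overidentification test with two instruments for one endogenous regressor.
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)
n = 5000
u = rng.normal(size=n)
z1, z2 = rng.normal(size=n), rng.normal(size=n)
x = 0.6 * z1 + 0.4 * z2 + u + rng.normal(size=n)
y = 1.0 + 2.0 * x + 1.5 * u + rng.normal(size=n)   # both instruments valid here

# 2SLS using both instruments
Z = np.column_stack([np.ones(n), z1, z2])
x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
b = np.linalg.lstsq(np.column_stack([np.ones(n), x_hat]), y, rcond=None)[0]
resid = y - np.column_stack([np.ones(n), x]) @ b   # residuals use actual x, not x_hat

# Regress the 2SLS residuals on the instruments; under H0 they should be unrelated
g = np.linalg.lstsq(Z, resid, rcond=None)[0]
r2 = 1 - np.sum((resid - Z @ g) ** 2) / np.sum((resid - resid.mean()) ** 2)
stat = n * r2
print(stat, stats.chi2.sf(stat, df=1))   # df = #instruments - #endogenous = 1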
5 Simultaneous Equations
When prices and quantities are determined by solving two (or more) equations simultaneously, OLS estimates are inconsistent for the supply and demand elasticities; this is called simultaneous equations bias.
Suppose we have jointly determined variables q_t and p_t:

q_t^d(p) = α_0 + α_1·p_t + α_2·z_t + ε_t^d
q_t^s(p) = β_0 + β_1·p_t + β_2·x_t + ε_t^s
Identification: when can we estimate the α's and β's?
Let's start by writing down the reduced form equations for q_t and p_t (i.e. equations in which the regressors are uncorrelated with the errors):

p_t = (β_0 − α_0 − α_2·z_t + β_2·x_t + ε_t^s − ε_t^d) / (α_1 − β_1) = π_10 + π_11·z_t + π_12·x_t + u_1t

q_t = (α_1·β_0 − α_0·β_1 − α_2·β_1·z_t + α_1·β_2·x_t + α_1·ε_t^s − β_1·ε_t^d) / (α_1 − β_1) = π_20 + π_21·z_t + π_22·x_t + u_2t
As a sidenote, suppose we had run an OLS regression of q_t on p_t:

α̂_1 = Σ p_t·q_t / Σ p_t² = α_1 + Σ p_t·ε_t^d / Σ p_t²

Since Σ p_t·ε_t^d / Σ p_t² ≠ 0 based on our reduced form equation for p_t, our estimate of α_1 is biased.
However, we can run OLS on these reduced form equations and solve for the structural parameters (the α's and β's). This is Indirect Least Squares! For example,

π̂_21 / π̂_11 = β̂_1,   π̂_12·(α̂_1 − β̂_1) = β̂_2,   etc.

In the simultaneous equations framework, an equation is identified by the exclusion of exogenous variables that are included in other equations.
An equation is overidentified if there are more excluded exogenous variables than endogenous variables, just-identified if there are as many excluded exogenous variables as endogenous variables, and under-identified if there are fewer excluded exogenous variables than endogenous variables.
Note the link between simultaneous equations and IV. Using ILS, note that α̂_1 = π̂_22/π̂_12, but we can re-write this ratio as the sample analog of C(q, x)/C(p, x), which is just the ratio of the reduced form to the first stage! (See the simulation sketch below.)
If the system is overidentified, 3SLS, which is a version of GLS that takes into account the correlation between u_1 and u_2, is asymptotically more efficient than 2SLS.
For just-identified equations, ILS = 2SLS = 3SLS.
For over-identified equations, 2SLS is a weighted combination of the ILS estimates, constructed by regressing p_t on both potential instruments and then using the fitted values as the single instrument.
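A simulation sketch of the ILS/IV link above: solving the simulated supply-and-demand system for equilibrium p and q, the ratio C(q, x)/C(p, x) recovers the demand slope while OLS does not (all parameter values are illustrative):

# Sketch: the supply shifter x instruments for price in the demand equation.
import numpy as np

rng = np.random.default_rng(11)
n = 100_000
a0, a1, a2 = 10.0, -1.0, 1.0      # demand: q = a0 + a1*p + a2*z + e_d
b0, b1, b2 = 2.0, 1.0, 1.0        # supply: q = b0 + b1*p + b2*x + e_s
z, x = rng.normal(size=n), rng.normal(size=n)
e_d, e_s = rng.normal(size=n), rng.normal(size=n)

# Solve the two equations for the equilibrium (reduced form) p and q
p = (b0 - a0 - a2 * z + b2 * x + e_s - e_d) / (a1 - b1)
q = a0 + a1 * p + a2 * z + e_d

ols = np.cov(p, q)[0, 1] / np.var(p, ddof=1)     # biased for a1
ils = np.cov(q, x)[0, 1] / np.cov(p, x)[0, 1]    # ratio of reduced form to first stage
print(ols, ils)   # ils is close to a1 = -1, ols is not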
6 Regression Discontinuity
Basic model of sharp RD: treatment status D_i is a deterministic and discontinuous function of a covariate x_i (so D_i = 1 if x_i ≥ x_0 and D_i = 0 if x_i < x_0).

E[Y_0i | x_i] = α + β·x_i
Y_1i = Y_0i + ρ

and so we can estimate the causal effect of crossing the cutoff with the following regression:

Y_i = α + β·x_i + ρ·D_i + η_i
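A minimal sharp-RD sketch matching the regression above (simulated running variable, illustrative effect size):

# Sketch: sharp RD with a linear CEF and cutoff at x0 = 0.
import numpy as np

rng = np.random.default_rng(12)
n = 5000
x = rng.uniform(-1, 1, size=n)          # running variable
D = (x >= 0).astype(float)              # deterministic treatment assignment
rho = 2.0                               # treatment effect at the cutoff
y = 1.0 + 0.5 * x + rho * D + rng.normal(scale=0.3, size=n)

X = np.column_stack([np.ones(n), x, D])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
print(beta[2])   # estimate of rho, the jump in the CEF at the cutoff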
Fuzzy RD: the jump in the probability that D_i = 1 at the cutoff is less than one. To estimate the causal impact of D_i on Y_i, we can make use of the following ratio in a neighborhood of width δ around the cutoff:

( E[Y_i | x_0 < x_i < x_0 + δ] − E[Y_i | x_0 − δ < x_i < x_0] ) / ( E[D_i | x_0 < x_i < x_0 + δ] − E[D_i | x_0 − δ < x_i < x_0] )
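A sketch of the fuzzy-RD ratio above on simulated data, comparing means in a small neighborhood on each side of the cutoff (the bandwidth δ and all parameters are illustrative):

# Sketch: fuzzy RD as a ratio of jumps in Y and in D near the cutoff.
import numpy as np

rng = np.random.default_rng(13)
n = 200_000
x = rng.uniform(-1, 1, size=n)                       # running variable, cutoff x0 = 0
p_treat = np.where(x >= 0, 0.8, 0.2)                 # jump in P(D=1) is 0.6 < 1
D = rng.binomial(1, p_treat)
effect = 2.0
y = 1.0 + 0.5 * x + effect * D + rng.normal(size=n)

delta = 0.05                                          # neighborhood around the cutoff
above = (x >= 0) & (x < delta)
below = (x < 0) & (x >= -delta)
fuzzy_rd = (y[above].mean() - y[below].mean()) / (D[above].mean() - D[below].mean())
print(fuzzy_rd)   # close to the true effect of 2.0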