Lec05: Regression Asymptotics
OLS Asymptotics
Dr. Henry Kankwamba
Outline
1 Motivation
2 Consistency
3 Asymptotic normality
4 Large sample inference
• Wooldridge chapter 5
• Stock and Watson chapter 18
• Review of asymptotics
Section 1
Motivation
• Our six regression assumptions:
  MLR.1 (linear model)
  MLR.2 (independence): {(x1,i, x2,i, yi)}, i = 1, ..., n, is an independent random sample
  MLR.3 (rank condition): no multicollinearity: no xj,i is constant and there is no exact linear relationship among the xj,i
  MLR.4 (exogeneity): E[εi | x1,i, ..., xk,i] = 0
  MLR.5 (homoskedasticity): Var(εi | X) = σε²
  MLR.6 (normality): εi | X ∼ N(0, σε²)
• These assumptions, especially MLR.6 (and to a lesser extent MLR.1 and MLR.4), are often implausible
• Requiring OLS only to be consistent instead of unbiased will let us relax MLR.1 and MLR.4
• We will use the central limit theorem to relax assumption MLR.6 and still perform inference (t-tests and F-tests)
• Idea: use the limit of the distribution of an estimator as n → ∞ to approximate its finite-sample distribution
• Notation:
  • Sequence of samples of increasing size n: Sn = {(y1, x1), ..., (yn, xn)}
  • Estimator θ̂ for each sample (implicitly depends on n)
Section 2
Consistency
A law of large numbers gives conditions such that ȳ →p E[Y]. For this course, you do not need to worry about the conditions needed to make a law of large numbers hold; you can just always assume that ȳ →p E[Y]. However, in case you're curious, the remainder of this paragraph goes into more detail. The simplest law of large numbers (called Khinchine's law of large numbers) says that if the yi are i.i.d. with E[Y] finite, then ȳ →p E[Y].
The assumption that the yi are i.i.d. can be relaxed if more assumptions are made about the moments of yi. One such relaxation is Chebyshev's law of large numbers. It says that if the yi are independent (but not necessarily identically distributed), E[yi] = µi < ∞ for all i, and limn→∞ (1/n²) ∑ᵢ Var(yi) = 0, then plim(ȳn − (1/n) ∑ᵢ µi) = 0. In the next lecture, when we deal with heteroskedasticity, we will use this law of large numbers. There are also versions of the law of large numbers for when the yi are not independent.
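A quick way to see a law of large numbers at work is by simulation. The sketch below is illustrative (the sample sizes and seed are assumptions, not from the lecture): it draws U(0,1) samples of increasing size, for which E[Y] = 0.5, and shows ȳ approaching 0.5.

## LLN illustration: sample means of U(0,1) draws approach E[Y] = 0.5
## (sketch; sample sizes chosen arbitrarily)
set.seed(42)
N <- c(10, 100, 1000, 10000, 100000)
ybar <- sapply(N, function(n) mean(runif(n)))
print(cbind(n = N, ybar = ybar, error = abs(ybar - 0.5)))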
• Similarly,
  (1/n) ∑ᵢ (xi − x̄)² →p Var(x)  and  (1/n) ∑ᵢ (xi − x̄)(yi − ȳ) →p Cov(x, y)
so β̂1 →p Cov(x, y)/Var(x) = β1 and β̂0 →p E[y] − β1 E[x] = β0
• Thus, OLS consistently estimates the population regression under very weak assumptions
• We only need to assume (1/n) ∑ᵢ xi →p E[x], (1/n) ∑ᵢ yi →p E[y], (1/n) ∑ᵢ xi² →p E[x²], and (1/n) ∑ᵢ xi yi →p E[xy]. There are multiple versions of the law of large numbers that would make this true. The details of LLNs are not important for this course, so we will be slightly imprecise and say that this is true assuming xi and yi have finite second moments and are not too dependent.

Theorem
Assume yi, xi,1, ..., xi,k have finite second moments and observations are not too dependent. Then OLS consistently estimates the population regression of y on x1, ..., xk.
Consistency of population regression

yi = β0 + β1 x1,i + · · · + βk xk,i + εi

Code
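The code from the original slides is not preserved here; the following is a minimal sketch of a consistency simulation under assumed parameter values (β0 = 1, β1 = 2, non-normal errors): the OLS slope approaches the population slope as n grows.

## Consistency sketch (assumed setup): OLS slope estimates approach
## the population slope beta1 = 2 as the sample size grows
set.seed(1)
ns <- c(10, 100, 1000, 10000)
bhat <- sapply(ns, function(n) {
  x <- rnorm(n)
  y <- 1 + 2*x + (runif(n)*2 - 1)  ## U(-1,1) errors: normality not needed
  coef(lm(y ~ x))["x"]
})
print(cbind(n = ns, beta1hat = bhat))  ## beta1hat converges toward 2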
• Sometimes we want the regression
  yi = β0 + β1 x1,i + β2 x2,i + εi
to be more than the population regression:
  • Causal effect: want the slope to be the causal effect of x on y
  • Economic model: e.g. a production function
Section 3
Asymptotic normality
• Let θ̂ have CDF Fn and let W be a random variable with CDF F
• θ̂ converges in distribution to W, written θ̂ →d W, if limn→∞ Fn(x) = F(x) for all x where F is continuous
• Central limit theorem: let {y1, ..., yn} be i.i.d. with mean µ and variance σ²; then Zn = √n (ȳn − µ) converges in distribution to a N(0, σ²) random variable
• As with the LLN, the i.i.d. condition can be relaxed if additional moment conditions are added; we will not worry too much about the exact assumptions needed
• For non-i.i.d. data, if E[yi] = µ for all i and v = limn→∞ E[((1/√n) ∑ᵢ (yi − µ))²] exists (and some technical conditions are met), then √n (ȳn − µ) →d N(0, v)
• Properties:
• If θ̂ →d W, then g(θ̂) →d g(W) for continuous g (continuous mapping theorem (CMT))
• Slutsky's theorem: if θ̂ →d W and ζ̂ →p c, then (i) θ̂ + ζ̂ →d W + c, (ii) θ̂ζ̂ →d cW, and (iii) θ̂/ζ̂ →d W/c (provided c ≠ 0)
The slides' CLT simulation in R follows. Its opening lines are missing from the extract, so the package loading and the values of N and simulations below are assumed reconstructions.

## CLT simulation: distribution of sqrt(n)*(xbar - mu) for U(0,1) data,
## where mu = 0.5
library(reshape2)  ## for melt() (assumed; needed below)
library(ggplot2)
N <- c(5, 25, 100, 1000)  ## sample sizes (assumed values)
simulations <- 1000       ## number of replications (assumed value)
means <- matrix(NA_real_, nrow = simulations, ncol = length(N))
for (i in 1:length(N)) {
  n <- N[i]
  dat <- matrix(runif(simulations*n),
                nrow = simulations, ncol = n)
  means[, i] <- (apply(dat, 1, mean) - 0.5)*sqrt(n)
}

## Plotting
df <- data.frame(means)
df <- melt(df)
cltPlot <- ggplot(data = df, aes(x = value, fill = variable)) +
  geom_histogram(alpha = 0.2, position = "identity") +
  scale_x_continuous(name = expression(sqrt(n)*(bar(x) - mu))) +
  scale_fill_brewer(type = "div", palette = "RdYlGn",
                    name = "N", labels = N)
cltPlot
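Slutsky's theorem is what justifies studentizing: replacing σ with a consistent estimate does not change the limiting distribution. A minimal sketch, with an assumed setup:

## Slutsky illustration: sqrt(n)*(ybar - mu)/sd(y) is approximately N(0,1)
## even though sigma is estimated (sd(y) ->p sigma)
set.seed(2)
n <- 500
t.stats <- replicate(2000, {
  y <- runif(n)  ## mu = 0.5
  sqrt(n)*(mean(y) - 0.5)/sd(y)
})
c(mean = mean(t.stats), var = var(t.stats))  ## approximately 0 and 1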
• Then,
\[
\begin{aligned}
\sqrt{n}(\hat\beta_1 - \beta_1) &= \sqrt{n}\left( \frac{\frac{1}{n}\sum_{i=1}^n (x_i - \bar{x}) y_i}{\frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})^2} - \beta_1 \right) \\
&= \sqrt{n}\left( \frac{\frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})(\beta_0 + \beta_1 x_i + \varepsilon_i)}{\frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})^2} - \beta_1 \right) \\
&= \frac{\sqrt{n}\,\frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})\varepsilon_i}{\frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})^2}
\end{aligned}
\]
• Already showed that (1/n) ∑ᵢ (xi − x̄)² →p Var(x)
• Need to apply the CLT to √n (1/n) ∑ᵢ (xi − x̄)εi
• E[(xi − x̄)εi] = 0
• With homoskedasticity,
\[
\begin{aligned}
\mathrm{Var}\left((x_i - \bar{x})\varepsilon_i\right) &= E\left[\mathrm{Var}\left((x_i - \bar{x})\varepsilon_i \mid x\right)\right] + \mathrm{Var}\big(\underbrace{E[(x_i - \bar{x})\varepsilon_i \mid x]}_{=0}\big) \\
&= E\left[(x_i - \bar{x})^2 \sigma_\varepsilon^2\right] \\
&\approx \mathrm{Var}(x)\,\sigma_\varepsilon^2
\end{aligned}
\]
so
\[
\frac{1}{\sqrt{n}} \sum_{i=1}^n (x_i - \bar{x})\varepsilon_i \xrightarrow{d} N\left(0, \mathrm{Var}(x)\,\sigma_\varepsilon^2\right)
\]
• By Slutsky's theorem,
\[
\sqrt{n}(\hat\beta_1 - \beta_1) = \frac{\frac{1}{\sqrt{n}}\sum_{i=1}^n (x_i - \bar{x})\varepsilon_i}{\frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})^2} \xrightarrow{d} N\left(0, \frac{\sigma_\varepsilon^2}{\mathrm{Var}(x)}\right)
\]
or equivalently,
\[
\frac{\hat\beta_1 - \beta_1}{\sqrt{\frac{\sigma_\varepsilon^2}{n\,\mathrm{Var}(x)}}} \xrightarrow{d} N(0, 1)
\]
• Again by Slutsky's theorem, we can replace σε² and Var(x) by consistent estimators, and
\[
\frac{\hat\beta_1 - \beta_1}{\sqrt{\frac{\hat\sigma_\varepsilon^2}{\sum_{i=1}^n (x_i - \bar{x})^2}}} \xrightarrow{d} N(0, 1)
\]
Theorem
Assume MLR.1-3, MLR.5, and MLR.4′: E[εi xi,j] = 0 for all j. Then OLS is asymptotically normal with
\[
\sqrt{n}\begin{pmatrix} \hat\beta_0 - \beta_0 \\ \vdots \\ \hat\beta_k - \beta_k \end{pmatrix} \xrightarrow{d} N(0, \Sigma)
\]
and in particular
\[
\frac{\hat\beta_j - \beta_j}{\sqrt{\frac{\hat\sigma_\varepsilon^2}{\sum_{i=1}^n \tilde{x}_{ji}^2}}} \xrightarrow{d} N(0, 1)
\]
where the x̃ji are the residuals from regressing xj on the other regressors.
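To see the theorem at work, one can simulate t-statistics with non-normal errors and check that their distribution is close to N(0,1). A minimal sketch, with assumed parameter values:

## Asymptotic normality sketch: t-statistics for beta1 with U(-1,1)
## (non-normal) errors are approximately N(0,1) in large samples
set.seed(3)
n <- 1000
t.stats <- replicate(2000, {
  x <- rnorm(n)
  y <- 1 + 2*x + (runif(n)*2 - 1)  ## MLR.6 violated
  fit <- summary(lm(y ~ x))
  (fit$coefficients["x", "Estimate"] - 2) /
    fit$coefficients["x", "Std. Error"]
})
c(mean = mean(t.stats), var = var(t.stats))  ## approximately 0 and 1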
Section 4
Large sample inference

yi = β0 + β1 x1,i + · · · + βk xk,i + εi
• t-statistic: by the theorem above, t = (β̂j − βj)/ŝe(β̂j) →d N(0, 1) under H0
• p-value:
  p = P(|t| ≥ |t̂|) = 2Φ(−|t̂|)
where t̂ is the observed value of the t-statistic and Φ is the standard normal CDF
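A minimal sketch of computing this asymptotic p-value in R (names and data-generating values are illustrative; note that summary(lm(...)) reports p-values from the t distribution instead):

## Asymptotic p-value for H0: slope = 0, using the normal distribution
set.seed(4)
x <- rnorm(200)
y <- 1 + 0.2*x + rnorm(200)
fit <- summary(lm(y ~ x))
t.hat <- fit$coefficients["x", "t value"]
c(t.hat = t.hat, p.asymptotic = 2*pnorm(-abs(t.hat)))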
• Without MLR.6, the finite-sample rejection rate is not exactly the nominal level:
  P(reject H0 if it is true) ≠ α
but asymptotically,
\[
P\left(\Phi^{-1}(0.025) \leq \frac{\hat\beta_j - \beta_j}{\sqrt{\frac{\hat\sigma_\varepsilon^2}{\sum_{i=1}^n \tilde{x}_{ji}^2}}} \leq \Phi^{-1}(0.975)\right) \to 0.95
\]
so
\[
P\left(\hat\beta_j + \sqrt{\tfrac{\hat\sigma_\varepsilon^2}{\sum_{i=1}^n \tilde{x}_{ji}^2}}\,\Phi^{-1}(0.025) \leq \beta_j \leq \hat\beta_j + \sqrt{\tfrac{\hat\sigma_\varepsilon^2}{\sum_{i=1}^n \tilde{x}_{ji}^2}}\,\Phi^{-1}(0.975)\right) \to 0.95
\]
i.e. confidence intervals built from normal critical values have asymptotically correct coverage
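A minimal sketch of computing such an asymptotic interval by hand (illustrative names and values; R's confint() uses t critical values instead):

## Asymptotic 95% CI for the slope: beta_hat + qnorm(c(.025, .975))*se
set.seed(5)
x <- rnorm(200)
y <- 1 + 0.2*x + rnorm(200)
fit <- summary(lm(y ~ x))
b <- fit$coefficients["x", "Estimate"]
se <- fit$coefficients["x", "Std. Error"]
c(lower = b + qnorm(0.025)*se, upper = b + qnorm(0.975)*se)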
• F-test: to test, e.g., H0: β2 = β3 = 0 in yi = β0 + β1 x1,i + β2 x2,i + β3 x3,i + εi, compute
\[
\hat{F} = \frac{(SSR_r - SSR_{ur})/q}{SSR_{ur}/(n - k - 1)}
\]
where
• SSRr = sum of squared residuals from the restricted model, i.e. regressing yi on just x1,i
• SSRur = sum of squared residuals from the unrestricted model, i.e. regressing yi on x1,i, x2,i, and x3,i
• q = the number of restrictions (here q = 2)
• Asymptotic p-value: under H0, q·F̂ →d χ²(q), so
\[
p = P(F \geq \hat{F}) = 1 - F_{\chi^2(q)}(q\hat{F})
\]

## Simulate data where beta2 = beta3 = 0 and compute the F-test (LR form)
rm(list = ls())
library(lmtest)  ## for lrtest() and waldtest()

k <- 3
n <- 1000

beta <- matrix(c(1, 1, 0, 0), ncol = 1)
x <- matrix(rnorm(n*k), nrow = n, ncol = k)
e <- runif(n)*2 - 1  ## U(-1,1)
y <- cbind(1, x) %*% beta + e

## LR form of F-test
df <- data.frame(y, x)
unrestricted <- lm(y ~ X1 + X2 + X3, data = df)
restricted <- lm(y ~ X1, data = df)
F <- (sum(restricted$residuals^2) -
      sum(unrestricted$residuals^2)) / 2 /
  (sum(unrestricted$residuals^2)/(n - k - 1))
p <- 1 - pf(F, 2, n - k - 1)  ## F-distribution p-value; chi^2 form is asymptotically equivalent
## Wald form
Fw <- 0.5 * coef(unrestricted)[c("X2", "X3")] %*%
  solve(vcov(unrestricted)[c("X2", "X3"),
                           c("X2", "X3")]) %*%
  coef(unrestricted)[c("X2", "X3")]
pw <- 1 - pf(Fw, 2, n - k - 1)
## Should have F == Fw and p == pw

## automated Wald test
waldtest(unrestricted, restricted, test = "F")