
Problem Set #4. (Handed out 25th October. Due 1st November.)

1. Let X̂ = X + V . Suppose X and X̂ both take values in {0, 1}. Is it possible for E[V ] = 0 and
Cov[X, V ] = 0? Conclude that one cannot have classical measurement error in a binary variable.

2. (Linear Probability Model) Let (Y, X, U ) be a random vector such that

Y = X ′β + U .

Suppose Y takes values in {0, 1} and that E[Y |X] = X ′ β. Is it reasonable to assume that Var[U |X]
does not depend on X? Explain briefly.

3. Let X ∼ N (0, 1) and define Y = γX + X². Let

Y = β0 + β1 X + U ,

where BLP(Y |(1, X)) = β0 + β1 X. Show that


ρ² = 1 − Var[U ]/Var[Y ] = γ²/(2 + γ²) ,

which approaches one as γ² → ∞. (Remember that R² and R̄² can be thought of as estimates of ρ².)
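A quick simulation can sanity-check this population R². The sketch below (plain numpy; γ = 2 and the sample size are arbitrary illustrative choices) regresses Y on (1, X) by matrix operations and compares the sample R² with γ²/(2 + γ²).

```python
import numpy as np

rng = np.random.default_rng(0)
n, gamma = 200_000, 2.0              # gamma = 2 is an arbitrary illustrative value

x = rng.standard_normal(n)           # X ~ N(0, 1)
y = gamma * x + x**2                 # Y = gamma*X + X^2

X = np.column_stack([np.ones(n), x])             # regressors (1, X)
beta = np.linalg.solve(X.T @ X, X.T @ y)         # sample BLP (OLS) coefficients
u = y - X @ beta                                 # residuals

r2_sample = 1 - u.var() / y.var()
r2_theory = gamma**2 / (2 + gamma**2)
print(r2_sample, r2_theory)          # both roughly 2/3 for gamma = 2
```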

4. Let (Y, X) ∼ P , where Y takes values in R and X takes values in Rk+1 . Suppose E[XX ′ ] and E[XY ]
both exist and that there is no perfect collinearity in X. Let (Y1 , X1 ), . . . , (Yn , Xn ) be an i.i.d. sequence
of random vectors with distribution P . Let A be a (k + 1) × (k + 1) invertible matrix and define W = AX.

(a) Show that there is no perfect collinearity in W .


(b) Let BLP(Y |X) = X ′ β and BLP(Y |W ) = W ′ γ. How are β and γ related?
(c) For 1 ≤ i ≤ n, define Wi = AXi . Suppose you estimate β using OLS regression of Yi on Xi and
γ using OLS regression of Yi on Wi . How are your estimators of β and γ related? How are the
variances of these estimators related?

5. (Generalized Least Squares) Let (Y1 , X1 ), . . . , (Yn , Xn ) be an i.i.d. sequence of random vectors where Yi
takes values in R and Xi takes values in Rk+1 . Suppose that E[Yi |Xi ] = Xi′ β and Var[Yi |Xi ] = σ²(Xi ),
where σ²(·) is known and σ²(Xi ) > 0 for all 1 ≤ i ≤ n. Define Y = (Y1 , . . . , Yn )′ , X = (X1 , . . . , Xn )′ , and
D = diag(σ²(X1 ), . . . , σ²(Xn )). Consider an estimator of β of the form

β̃n = A′ Y ,

where A = A(X1 , . . . , Xn ).

(a) Show that E[β̃n |X1 , . . . , Xn ] = β if and only if A′ X = I.


(b) Show that Var[β̃n |X1 , . . . , Xn ] = A′ DA.
(c) Suppose that the columns of X are linearly independent. Show that X′ D−1 X is invertible.
(d) Consider the generalized least squares (GLS) estimator obtained by setting A′ = (X′ D−1 X)−1 X′ D−1 .
Evaluate Var[β̃n |X1 , . . . , Xn ] for this choice of A. Show that this choice of A satisfies A′ X = I.
(e) Show that (among all A for which E[β̃n |X1 , . . . , Xn ] = β) the “best” choice of A is given by the
A corresponding to the GLS estimator. Here, “best” is to be interpreted as in the statement
of the Gauss-Markov Theorem in class.
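For reference, here is a minimal numerical sketch of the GLS estimator defined in part (d), assuming a made-up, illustrative skedastic function σ²(x) = 1 + x² and illustrative coefficients; it also shows the equivalent weighted-least-squares form obtained by rescaling each observation by 1/σ(Xi).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000
x1 = rng.standard_normal(n)
X = np.column_stack([np.ones(n), x1])            # Xi = (1, X1i)'
sigma2 = 1.0 + x1**2                             # illustrative known skedastic function
beta = np.array([1.0, 2.0])                      # illustrative true coefficients
Y = X @ beta + np.sqrt(sigma2) * rng.standard_normal(n)

# GLS: beta_tilde = (X' D^{-1} X)^{-1} X' D^{-1} Y with D = diag(sigma2(X1), ..., sigma2(Xn))
d_inv = 1.0 / sigma2
beta_gls = np.linalg.solve(X.T @ (d_inv[:, None] * X), X.T @ (d_inv * Y))

# Equivalent form: rescale each observation by 1/sigma(Xi), then run OLS
w = 1.0 / np.sqrt(sigma2)
Xw, Yw = w[:, None] * X, w * Y
beta_wls = np.linalg.solve(Xw.T @ Xw, Xw.T @ Yw)
print(beta_gls, beta_wls)                        # identical up to rounding
```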

6. (Ridge Regression) Let (Y1 , X1 ), . . . , (Yn , Xn ) be an i.i.d. sequence of random vectors. Suppose E[Xi Xi′ ]
and E[Xi Yi ] exist. Suppose further that there is no perfect collinearity in Xi . Hence, E[Xi Xi′ ] is
invertible.

(a) Does it also follow that

(1/n) Σ_{1≤i≤n} Xi Xi′

is invertible? Explain briefly.


(b) For any λn > 0, show that

(1/n) Σ_{1≤i≤n} (Xi Xi′ + λn I)

is invertible.
(c) Suppose λn → 0 as n → ∞. Find the limit in probability of

β̃n = [ (1/n) Σ_{1≤i≤n} (Xi Xi′ + λn I) ]⁻¹ [ (1/n) Σ_{1≤i≤n} Xi Yi ] .
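The following small sketch computes the ridge estimator of part (c) on simulated data (the design and the penalty λn = 1/n are illustrative choices); with a penalty this small, the ridge and OLS coefficients are nearly indistinguishable.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
x = rng.standard_normal(n)
X = np.column_stack([np.ones(n), x])
Y = X @ np.array([1.0, -0.5]) + rng.standard_normal(n)   # illustrative data

Sxx = X.T @ X / n                    # (1/n) sum_i Xi Xi'
Sxy = X.T @ Y / n                    # (1/n) sum_i Xi Yi

lam = 1.0 / n                        # an illustrative vanishing penalty, lambda_n = 1/n
beta_ridge = np.linalg.solve(Sxx + lam * np.eye(2), Sxy)
beta_ols = np.linalg.solve(Sxx, Sxy)
print(beta_ridge, beta_ols)          # nearly identical once lambda_n is small
```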

7. Suppose √n(β̂n − β) →d N (0, Ω) as n → ∞. Let f : Rk+1 → R be continuously differentiable at β with
nonzero derivative f ′ (β). Let Ω̂n be a consistent estimate of Ω. Suppose that Ω is non-singular.

(a) Derive the limiting distribution of √n(f (β̂n ) − f (β)).
(b) Construct a test of
H0 : f (β) ≤ 0 versus H1 : f (β) > 0

at level α. Show that the test is consistent in level.


(c) Construct a confidence region, Cn , of level 1 − α for f (β). Show that

P {f (β) ∈ Cn } → 1 − α

as n → ∞.

8. Let (Y1 , X1 , Z1 ), . . . , (Yn , Xn , Zn ) be an i.i.d. sequence of random variables. Define

Ui = Yi − BLP(Yi |Wi ) ,

where
Wi = (1, Xi , Zi )′ .

Suppose Yi takes values in {0, 1} and that E[Yi |Wi ] = Wi′ β, where β = (β0 , β1 , β2 )′ .

(a) Is Ui correlated with Wi ? Is Ui mean independent of Wi ? Explain briefly.


(b) What is Var[Ui |Wi ]? Is it reasonable to assume that Ui is homoskedastic? Explain briefly.
(c) Write down an expression for β̂n , the OLS estimator of β. Is β̂n unbiased for β? Explain briefly.
(d) Describe in detail how you would test whether β2 (the coefficient on Zi ) is equal to zero or not
at level α ∈ (0, 1).
(e) Suppose you omitted Zi from the regression. Under what assumptions would your estimate of
the coefficient on Xi obtained in this way be consistent for β1 ?

(f) Can you propose a “more efficient” estimator of β? (Hint: Remember Gauss-Markov!)

9. You are interested in the effect of a binary treatment on an outcome of interest. To this end, you
collect an i.i.d. sample of n individuals and assign them to the treatment or control group with equal
probability. Let Yi denote the outcome of the ith individual. It is assumed that E[Yi²] < ∞. Let Di
denote the treatment status of the ith individual, where Di = 1 if the ith individual is treated and
Di = 0 if the ith individual is not treated. You assume the following model for Yi :

Yi = αi + βi Di ,

where (αi , βi ) are independent of Di . Note that αi and βi are random variables, and βi is the effect of
the treatment on the outcome. Furthermore, the effect is “heterogeneous” in the sense that βi differs
across individuals.

(a) Explain why it might be reasonable to assume that (αi , βi ) is independent of Di .


(b) Consider the linear regression
Yi = α + βDi + ϵi ,

which you interpret as the best linear predictor of Yi given Di . Express α and β in terms of the
distribution of (αi , βi ). In particular, show that β = E[βi ].
(c) For α ∈ (0, 1), construct Cn such that

P {β ∈ Cn } → 1 − α .
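For intuition about this heterogeneous-effects model, the sketch below simulates it with illustrative distributions for (αi , βi ) and a fair-coin treatment assignment; the BLP slope of Yi on Di comes out close to E[βi ], consistent with the claim in part (b).

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000
alpha_i = rng.normal(1.0, 1.0, n)    # illustrative heterogeneous intercepts
beta_i = rng.normal(2.0, 0.5, n)     # illustrative heterogeneous effects, E[beta_i] = 2
D = rng.integers(0, 2, n)            # coin-flip treatment, independent of (alpha_i, beta_i)
Y = alpha_i + beta_i * D

# BLP slope of Y on D (equals the difference in group means for binary D)
slope = np.cov(D, Y, ddof=0)[0, 1] / D.var()
print(slope, beta_i.mean())          # both close to E[beta_i] = 2
```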

10. Download the dataset ps4.csv from Canvas. Please code up your solutions in Matlab, R or Python
and include a pdf of your code with your submission. The point of this problem is to get you to work
through linear regression by example, so please do not use any regression packages to implement your
solutions. Restrict yourself to just matrix operations. (Of course, you may want to use packages to
check your answers!) Please include your answers in the write-up that you are handing in.
Consider the regression:
Y = β0 + β1 X1 + β2 X2 + ϵ

Interpret this regression as the best linear predictor of Y given X1 and X2 .

(a) Compute the OLS estimate β̂n of β.


(b) Compute the (heteroskedasticity-robust) estimator of the covariance matrix of β̂n .
(c) Suppose we wanted to test whether β1 = β2 = 1. Compute a relevant test statistic and a p-value.
(d) Suppose we wanted to test whether (β1 − β2 )2 = 0. Can you use the nonlinear test suggested in
class? If so, do it. If not, provide another test.
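As a template for the required matrix operations (not a substitute for running them on ps4.csv), here is a sketch on synthetic data that computes the OLS coefficients, a heteroskedasticity-robust (sandwich) covariance estimate, and a Wald statistic for H0 : β1 = β2 = 1; scipy is used only for the chi-squared cdf when forming the p-value.

```python
import numpy as np
from scipy.stats import chi2         # used only for the chi-squared cdf

# Synthetic stand-in for ps4.csv, purely to illustrate the matrix algebra
rng = np.random.default_rng(4)
n = 400
x1, x2 = rng.standard_normal(n), rng.standard_normal(n)
Y = 1.0 + 1.0 * x1 + 1.0 * x2 + (1 + np.abs(x1)) * rng.standard_normal(n)
X = np.column_stack([np.ones(n), x1, x2])

# (a) OLS by matrix operations
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ (X.T @ Y)

# (b) Heteroskedasticity-robust (sandwich) covariance estimate of beta_hat
u_hat = Y - X @ beta_hat
meat = X.T @ (u_hat[:, None] ** 2 * X)
V_hat = XtX_inv @ meat @ XtX_inv

# (c) Wald statistic and p-value for H0: beta1 = beta2 = 1, written as R beta = c
R = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
c = np.array([1.0, 1.0])
diff = R @ beta_hat - c
W = diff @ np.linalg.solve(R @ V_hat @ R.T, diff)
p_value = 1 - chi2.cdf(W, df=2)
print(beta_hat, W, p_value)
```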

11. (Constrained Least Squares and a Lagrange Multiplier Test) Let (Y, X, U ) satisfy

Y = X ′β + U ,

where Y and U take values in R, X = (1, X1 , . . . , Xk )′ takes values in Rk+1 , and β = (β0 , . . . , βk )′ . Suppose
E[XU ] = 0, E[XX ′ ] < ∞, Var[XU ] is non-singular, and there is no perfect collinearity in X. Suppose
further that Rβ = c, where R is a p × (k + 1) matrix such that the rows of R are linearly independent.

Let (Y1 , X1 ), . . . , (Yn , Xn ) be an i.i.d. sample from the distribution of (Y, X). Define the constrained
least squares (CLS) estimator of β, β̃n , as the solution to
min_{b∈Rk+1 : Rb=c} (1/n) Σ_{1≤i≤n} (Yi − Xi′ b)² .

(a) Consider the Lagrangian

L(b, λ) = (1/(2n)) Σ_{1≤i≤n} (Yi − Xi′ b)² + λ′ (Rb − c) .

(The 1/2 out front just makes the algebra work out a bit more nicely.) Compute ∂L(b, λ)/∂b and
∂L(b, λ)/∂λ. Let β̃n and λ̃n be such that these two derivatives are equal to zero.
(b) Show that

λ̃n = [ R ( (1/n) Σ_{1≤i≤n} Xi Xi′ )⁻¹ R′ ]⁻¹ ( Rβ̂n − c ) ,

where β̂n is the OLS estimator of β. (A numerical check of this identity is sketched after this problem.)


(c) Show that λ̃n → 0 in probability when Rβ = c.

(d) Derive the asymptotic distribution of √n λ̃n when Rβ = c.
(e) Use the preceding exercises to suggest a test based on the distance of λ̃n to zero for H0 : Rβ = c
versus H1 : Rβ ̸= c at level α. Show that your test is consistent in level.
(f) How does your test compare with the Wald test studied in class?
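The sketch below (synthetic data, a single illustrative constraint β1 = β2) solves the first-order conditions of the Lagrangian as one linear system and checks numerically that the resulting multiplier matches the closed-form expression for λ̃n displayed in part (b).

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1_000
x = rng.standard_normal((n, 2))
X = np.column_stack([np.ones(n), x])             # Xi = (1, X1i, X2i)'
Y = X @ np.array([1.0, 0.5, 0.5]) + rng.standard_normal(n)

R = np.array([[0.0, 1.0, -1.0]])                 # illustrative constraint: beta1 - beta2 = 0
c = np.array([0.0])

Sxx = X.T @ X / n                                # (1/n) sum_i Xi Xi'
Sxy = X.T @ Y / n                                # (1/n) sum_i Xi Yi
beta_ols = np.linalg.solve(Sxx, Sxy)

# First-order conditions of the Lagrangian as one linear (KKT) system:
#   Sxx b + R' lam = Sxy   and   R b = c
K = np.block([[Sxx, R.T], [R, np.zeros((1, 1))]])
sol = np.linalg.solve(K, np.concatenate([Sxy, c]))
beta_cls, lam_tilde = sol[:3], sol[3:]

# Closed form from part (b): lam = (R Sxx^{-1} R')^{-1} (R beta_ols - c)
lam_formula = np.linalg.solve(R @ np.linalg.solve(Sxx, R.T), R @ beta_ols - c)
print(lam_tilde, lam_formula)                    # should agree
```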

12. (Frisch Bounds) Let (Y, X, U ) satisfy

Y = β0 + β1 X + U ,

where Y , X and U all take values in R. Suppose E[XU ] = E[U ] = 0. Suppose X is unobserved, but
X̂ = X + V is observed, where E[V ] = E[XV ] = E[U V ] = 0.

(a) Show that

Var[X̂] = Var[X] + Var[V ] ,
Var[Y ] = β1² Var[X] + Var[U ] ,
Cov[X̂, Y ] = β1 Var[X] .

(b) If β1 ≥ 0, show that


Cov[X̂, Y ] / Var[X̂] ≤ β1 ≤ Var[Y ] / Cov[X̂, Y ] .
Interpret the upper and lower bounds in terms of coefficients from a regression.
(c) Derive an analogous result for the case when β1 ≤ 0.

(Generalizations of this result are developed in Klepper and Leamer (1984).)
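A small simulation (illustrative variances, β1 = 1.5) makes the part (b) bounds concrete: the lower bound is the slope from regressing Y on X̂, and the upper bound is the reciprocal of the slope from regressing X̂ on Y.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 100_000
beta0, beta1 = 0.0, 1.5                          # beta1 >= 0, as in part (b)
X = rng.standard_normal(n)
U = rng.standard_normal(n)
V = rng.standard_normal(n)                       # measurement error
Y = beta0 + beta1 * X + U
X_hat = X + V                                    # observed, error-ridden regressor

cov_xy = np.mean((X_hat - X_hat.mean()) * (Y - Y.mean()))
lower = cov_xy / X_hat.var()                     # slope from regressing Y on X_hat
upper = Y.var() / cov_xy                         # 1 / slope from regressing X_hat on Y
print(lower, beta1, upper)                       # lower <= 1.5 <= upper
```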
