
Homework 7

1. Download these data:

http://stat.cmu.edu/~larry/=stat401/sports.txt

These data are from the R package alr4, where the dataset is called ais; a complete
description of the variables can be found in that package's documentation. These are
data on 202 athletes. The goal is to predict lean body mass (LBM) from the other
variables. For this question, you should ignore the variables Label and Sport. (A
minimal R sketch covering parts (a)-(h) appears after the problem, for reference.)
(a) Use the pairs command to plot the data. Comment on any patterns that you see.
(b) Use a linear regression model to predict LBM from Sex, Ht, Wt, RCC, WCC, Hc,
Hg, Ferr, BMI, Bfat. Comment on your residual plots.
(c) Summarize your fitted model.
(d) Find the eigenvalues of the design matrix.
(e) Construct a 90 percent confidence rectangle for all the coefficients in the model
(except the intercept).
(f) Let’s try a smaller model. Fit a linear regression to predict LBM from Sex, Ht, Wt
and RCC. Summarize the fitted model.
(g) Construct and plot a 95 percent confidence ellipsoid for Ht and Wt.
(h) Construct an F test to compare the two models that you fit. Summarize and
interpret the result of the test.
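
For reference, here is a minimal R sketch of one way to work through parts (a)-(h). It
assumes sports.txt is whitespace-delimited with a header row and the variable names used
above, that Sex is coded numerically (otherwise convert it or drop it before pairs()),
that the car package is available for the ellipse in (g), and that (d) is read as the
eigenvalues of X^T X and (e) as a Bonferroni rectangle; these readings are assumptions,
not necessarily the intended ones.

# Sketch for Problem 1 (a)-(h); file format and some readings are assumptions.
sports <- read.table("sports.txt", header = TRUE)
dat <- subset(sports, select = -c(Label, Sport))   # ignore Label and Sport

# (a) scatterplot matrix
pairs(dat)

# (b) full model and residual plot
fit.full <- lm(LBM ~ Sex + Ht + Wt + RCC + WCC + Hc + Hg + Ferr + BMI + Bfat,
               data = dat)
plot(fitted(fit.full), resid(fit.full))
abline(h = 0)

# (c) summary of the fitted model
summary(fit.full)

# (d) eigenvalues, computed here for X^T X where X is the design matrix
X <- model.matrix(fit.full)
eigen(crossprod(X))$values

# (e) 90 percent Bonferroni confidence rectangle for the non-intercept coefficients
p <- length(coef(fit.full)) - 1
confint(fit.full, parm = 2:(p + 1), level = 1 - 0.10 / p)

# (f) smaller model
fit.small <- lm(LBM ~ Sex + Ht + Wt + RCC, data = dat)
summary(fit.small)

# (g) 95 percent confidence ellipsoid for (Ht, Wt), via car::confidenceEllipse
library(car)
idx <- which(names(coef(fit.small)) %in% c("Ht", "Wt"))
confidenceEllipse(fit.small, which.coef = idx, levels = 0.95)

# (h) F test comparing the two nested models
anova(fit.small, fit.full)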
2. Let X denote the design matrix for some regression problem. Let vj denote the jth
column of X. In other words X = [v1 v2 · · · vq]. Suppose that the columns are
orthogonal. In other words, vj^T vk = 0 when j ≠ k.
(a) Show that X^T X is non-singular (invertible) as long as ||vj|| > 0 for all j.
(b) Find an explicit expression for (X^T X)⁻¹.
(c) Now suppose we replace one of the columns of X with a vector of all zeroes. For
example, we set: v1 = (0, 0, . . . , 0). Prove or disprove the following statement: the
least squares solution is still well-defined (in other words, it exists and is unique).
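
A quick numerical illustration (not a proof, and using made-up vectors) can help you see
what X^T X looks like when the columns are orthogonal, and what changes in part (c):

# Small made-up design matrix with mutually orthogonal columns
v1 <- c(1, 1, 0, 0)
v2 <- c(1, -1, 0, 0)
v3 <- c(0, 0, 2, 0)
X <- cbind(v1, v2, v3)
crossprod(X)            # X^T X: inspect its structure
solve(crossprod(X))     # its inverse

# Part (c): replace the first column with all zeroes
X0 <- X
X0[, 1] <- 0
crossprod(X0)           # what happens to X^T X now?
# solve(crossprod(X0))  # try uncommenting this to see whether the inverse exists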

3. Consider the ridge regression estimator β̂λ.


(a) Show that β̂λ → 0 as λ → ∞.
(b) Consider the following modified ridge estimator:

β̂λ = λ(X^T X + λI)⁻¹ X^T Y.


What does this estimator converge to as λ → ∞?
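
If you want to check your algebra numerically, the sketch below evaluates both estimators
on made-up data for a sequence of increasing λ values, assuming the standard ridge
definition β̂λ = (X^T X + λI)⁻¹ X^T Y (consistent with the modified form above):

# Made-up data for a numerical check of the limits as lambda grows
set.seed(1)
n <- 50; q <- 3
X <- matrix(rnorm(n * q), n, q)
Y <- X %*% c(1, -2, 3) + rnorm(n)

# standard ridge estimator and the modified estimator from part (b)
ridge     <- function(lambda) solve(crossprod(X) + lambda * diag(q), crossprod(X, Y))
ridge_mod <- function(lambda) lambda * ridge(lambda)

for (lambda in c(1, 1e2, 1e4, 1e6)) {
  cat("lambda =", lambda, "\n")
  cat("  standard ridge:", round(ridge(lambda), 4), "\n")
  cat("  modified ridge:", round(ridge_mod(lambda), 4), "\n")
}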
