Assignment 2

This document provides instructions and questions for Assignment 2, which is due on February 14, 2024. It includes three questions related to analyzing relationships between variables using regression models. Question 1 involves estimating a regression model relating academic costs per student and enrollment for universities. Question 2 examines relationships between wages and education using a dataset on individuals. Question 3 considers a regression model relating wages and education, estimating various regression diagnostics.


Assignment 2 Template

Kimberly Boswell

2024-02-01

This assignment is based on material from Weeks 3, 4, and 5 and is due Wednesday, February 14, 2024. If you’ve opted
to work in a group, submit ONE document per group. (Consider using a different color for your answers.)

QUESTION 1
3.8 (from the text)
Using 2011 data on 141 U.S. public research universities, we examine the relationship between cost per
student and full-time university enrollment. Let ACA = real academic cost per student (thousands of
dollars), and let FTESTU = full-time student enrollment (thousands of students). The least squares fitted
relation is

$\widehat{ACA} = 14.656 + 0.266\,FTESTU$

(a) For the regression, the 95% interval estimate for the intercept is [10.602, 18.710]. Calculate the standard
error of the estimated intercept.
(b) From the regression output, the standard error for the slope coefficient is 0.081. Test the null hypothesis
that the true slope, β2, is 0.25 (or less) against the alternative that the true slope is greater than 0.25
using the 10% level of significance. Show all steps of this hypothesis test, including the null and
alternative hypotheses, and state your conclusion.
(c) On the regression output, the automatically provided p-value for the estimated slope is 0.001. What
is the meaning of this value? Use a sketch to illustrate your answer.
(d) A member of the board of supervisors states that ACA should fall if we admit more students. Using
the estimated equation and the information in parts (a)–(c), test the null hypothesis that the slope
parameter β2 is zero, or positive, against the alternative hypothesis that it is negative. Use the 5% level
of significance. Show all steps of this hypothesis test, including the null and alternative hypotheses,
and state your conclusion. Is there any statistical support for the board member’s conjecture?
(e) In 2011, Louisiana State University (LSU) had a full-time student enrollment of 27,950. Based on the
estimated equation, the least squares estimate of E(ACA | FTESTU = 27.950) is 22.079, with standard
error 0.964. The actual value of ACA for LSU that year was 21.403. Would you say that this value is
surprising or not surprising? Explain.
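
The arithmetic behind parts (a) and (b) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two critical values are approximate t-table entries for 139 degrees of freedom (N - 2 = 141 - 2) and are not given in the question, so they should be confirmed with statistical software.

```python
# Sketch for Question 1, parts (a) and (b).
# ASSUMPTION: approximate t-table critical values for 139 degrees of freedom.
T_CRIT_975 = 1.977  # approx. 97.5th percentile of t(139), for a 95% interval
T_CRIT_90 = 1.288   # approx. 90th percentile of t(139), for a 10% right-tail test

# (a) A 95% interval is b1 +/- t_c * se(b1), so se(b1) = width / (2 * t_c).
lower, upper = 10.602, 18.710
se_intercept = (upper - lower) / (2 * T_CRIT_975)

# (b) Right-tail test of H0: beta2 <= 0.25 against H1: beta2 > 0.25.
b2, se_b2, beta2_null = 0.266, 0.081, 0.25
t_stat = (b2 - beta2_null) / se_b2
reject_h0 = t_stat > T_CRIT_90  # reject only if t falls in the right tail

print(f"se(b1) = {se_intercept:.3f}")
print(f"t = {t_stat:.3f}, reject H0 at the 10% level: {reject_h0}")
```

The same pattern applies to part (d), with the null value set to zero and the rejection region moved to the left tail.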

QUESTION 2
Using the cps5_small dataset,

(a) If we estimate the regression WAGE = β1 + β2 EDUC + e for individuals living in a metropolitan area,
where METRO = 1, is there a statistically significant positive relationship between expected wages
and education at the 1% level for these individuals? Clearly state the test statistic used, the rejection
region, and the test p-value. What do you conclude? How much of an effect is there and what does it
mean?
(b) Estimate the elasticity of expected WAGE with respect to EDUC, evaluated at the sample means.
Construct a 95% interval estimate for the elasticity, treating the sample means as if they are given
(not random) numbers. What is the interpretation of the interval?
(c) Test the null hypothesis that the elasticity, calculated in part (b), is one against the alternative that
the elasticity is not one. Use the 1% level of significance. Clearly state the test statistic used, the
rejection region, and the test p-value. What do you conclude?
(d) Estimate the quadratic regression WAGE = α1 + α2 EDUC² + e and discuss the results. Estimate the
marginal effect of another year of education on wage for a person with 12 years of education and for a
person with 16 years of education. Compare these values to the estimated marginal effect of education
from the linear model.
(e) Plot the fitted linear model from (a) and the fitted values from the quadratic model from part (d) in
the same graph with the scatterplot of WAGE and EDUC. Which model appears to fit the data better?
Does your R² align with this deduction?
(f) Estimate the expected wage, E(WAGE | EDUC) = b1 + b2 EDUC, for an individual with 16 years of
education. Construct a 95% interval estimate of the expected wage. Describe your interval estimate
to a general audience.
(g) Locate individuals in the sample with 16 years of education. Calculate the sample mean (average) of
their wages. Is the sample average of the wages for individuals with EDUC = 16 compatible with the
result in part (f)? Explain.
(h) Construct a residual histogram of the regression in (a) and carry out the Jarque–Bera test for normality.
Is it more important for the variables EDUC and WAGE to be normally distributed, or that the random
error e be normally distributed? Explain your reasoning.
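
The main calculations in this question (least squares estimates, the elasticity at the means, the quadratic marginal effect, and the Jarque–Bera statistic) can be sketched in plain Python. This is a minimal illustration, not the required workflow: the helper names `simple_ols` and `jarque_bera` are hypothetical, and the small data set below is made up for demonstration and is NOT the cps5_small sample.

```python
import math

def simple_ols(x, y):
    """Least squares fit of y = b1 + b2*x + e (hypothetical helper)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b2 = sxy / sxx
    b1 = y_bar - b2 * x_bar
    resid = [yi - b1 - b2 * xi for xi, yi in zip(x, y)]
    sigma2_hat = sum(e ** 2 for e in resid) / (n - 2)  # error variance estimate
    se_b2 = math.sqrt(sigma2_hat / sxx)
    return {"b1": b1, "b2": b2, "se_b2": se_b2,
            "x_bar": x_bar, "y_bar": y_bar, "resid": resid}

def jarque_bera(resid):
    """JB = N/6 * (S^2 + (K - 3)^2 / 4), with S = skewness and K = kurtosis."""
    n = len(resid)
    mean = sum(resid) / n
    m2 = sum((e - mean) ** 2 for e in resid) / n
    m3 = sum((e - mean) ** 3 for e in resid) / n
    m4 = sum((e - mean) ** 4 for e in resid) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)

# ASSUMPTION: made-up illustrative data, not the cps5_small sample.
educ = [10, 12, 12, 14, 16, 16, 18, 20]
wage = [11.0, 15.5, 13.0, 19.0, 24.5, 21.0, 30.0, 33.5]

# Linear model, as in parts (a) to (c).
linear = simple_ols(educ, wage)
# Elasticity at the means: b2 * mean(EDUC) / mean(WAGE). Treating the means as
# fixed numbers, its standard error is se(b2) scaled by the same ratio.
ratio = linear["x_bar"] / linear["y_bar"]
elasticity = linear["b2"] * ratio
se_elasticity = linear["se_b2"] * ratio

# Quadratic model, as in part (d): regress WAGE on EDUC squared.
quad = simple_ols([e ** 2 for e in educ], wage)
# The marginal effect is the derivative d(WAGE)/d(EDUC) = 2 * a2 * EDUC.
me_12 = 2 * quad["b2"] * 12
me_16 = 2 * quad["b2"] * 16

# Part (h): under normality JB is approximately chi-square with 2 degrees of
# freedom, so the 5% critical value is about 5.99.
jb = jarque_bera(linear["resid"])

print(f"elasticity = {elasticity:.3f} (se {se_elasticity:.3f})")
print(f"marginal effects: {me_12:.3f} at EDUC=12, {me_16:.3f} at EDUC=16")
print(f"JB = {jb:.3f}")
```

With the real data, restrict the sample to METRO = 1 for part (a) before fitting, and use t critical values with N - 2 degrees of freedom for the interval and tests.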

QUESTION 3
Consider the regression model WAGE = β1 + β2 EDUC + e. WAGE is hourly wage rate in U.S. 2013 dollars.
EDUC is years of education attainment, or schooling. The model is estimated using individuals from an
urban area.

$\widehat{WAGE} = -10.76 + 2.46965\,EDUC$,   N = 986
(se)   (2.27)   (0.16)

(a) The sample standard deviation of WAGE is 15.96 and the sum of squared residuals from the regression
above is 199,705.37. Compute R².
(b) Using the answer to (a), what is the correlation between WAGE and EDUC? [Hint: What is the
correlation between WAGE and the fitted value $\widehat{WAGE}$?]
(c) The sample mean and variance of EDUC are 14.315 and 8.555, respectively. Calculate the leverage of
observations with EDUC = 5, 16, and 21. Should any of the values be considered large?
(d) Omitting the ninth observation, a person with 21 years of education and wage rate $30.76, and re-
estimating the model we find σ̂ = 14.25 and an estimated slope of 2.470095. Calculate DFBETAS for
this observation. Should it be considered large?
(e) For the ninth observation, used in part (d), DFFITS = -0.0571607. Is this value large? The leverage
value for this observation was found in part (c). How much does the fitted value for this observation
change when this observation is deleted from the sample?
(f) For the ninth observation, used in parts (d) and (e), the least squares residual is -10.18368. Calculate
the studentized residual. Should it be considered large?
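
The formulas behind parts (a) to (c) can be sketched as follows, using only the summary numbers given above. Two points are assumptions rather than part of the question: the reported variance and standard deviation are taken to use the N - 1 divisor, and the 2K/N cutoff for leverage is one common rule of thumb (some texts use 3K/N).

```python
import math

# Summary statistics quoted in Question 3.
N = 986
sd_wage = 15.96      # sample standard deviation of WAGE
sse = 199705.37      # sum of squared residuals
educ_mean = 14.315   # sample mean of EDUC
educ_var = 8.555     # sample variance of EDUC

# (a) R^2 = 1 - SSE/SST, where SST = (N - 1) * s_y^2.
# ASSUMPTION: the sample standard deviation uses the N - 1 divisor.
sst = (N - 1) * sd_wage ** 2
r2 = 1 - sse / sst

# (b) In a simple regression R^2 equals the squared correlation between WAGE
# and EDUC; the correlation takes the sign of the slope, which is positive.
corr = math.sqrt(r2)

# (c) Leverage in a simple regression:
# h_i = 1/N + (x_i - x_bar)^2 / sum_j (x_j - x_bar)^2.
sxx = (N - 1) * educ_var

def leverage(educ):
    return 1 / N + (educ - educ_mean) ** 2 / sxx

# ASSUMPTION: rule-of-thumb cutoff 2K/N with K = 2 estimated coefficients.
threshold = 2 * 2 / N

print(f"R^2 = {r2:.4f}, correlation = {corr:.4f}")
for educ in (5, 16, 21):
    h = leverage(educ)
    print(f"EDUC = {educ}: leverage = {h:.5f}, above cutoff: {h > threshold}")
```

The leverage for EDUC = 21 computed here is the value that parts (e) and (f) refer to for the ninth observation.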
