Exam1Fall2022


STAT 51200-FALL 2022

Midterm Exam

Name:

This examination is open-book and open-notes. Please read the problems carefully and organize
your work in a reasonably neat and coherent manner, in the space provided. Be sure that
all your answers are complete. You are responsible for upholding IUPUI’s standard for
academic integrity. This includes working independently on this examination.
• Recall the notation:
\[
SS_{xx} = \sum_{i=1}^{n} (x_i - \bar{x}_n)^2, \quad
SS_{xy} = \sum_{i=1}^{n} (x_i - \bar{x}_n)(y_i - \bar{y}_n), \quad
SS_{yy} = \sum_{i=1}^{n} (y_i - \bar{y}_n)^2 .
\]

1. A regression analysis was performed for some data on the Height (Y, in inches)
against the AGE (X, in days) of tomato plants. The researcher was told to convert
the Height measurements from centimeters to inches. This means dividing all the
Y values by the constant c = 2.54. How will the following calculated quantities be
changed? Be specific.

b1

b0

SSE

SSR

R²

t* (for testing H0: β1 = 0)
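
One way to check an answer to Problem 1 numerically is to fit the same simple linear regression before and after dividing Y by c = 2.54 and compare the two fits. The sketch below is only an illustration: the ages, heights, seed, and variable names are made up and are not taken from the exam.

```r
# Hypothetical data: simulated ages (days) and heights (cm); not from the exam.
set.seed(512)
age       <- seq(10, 60, by = 5)
height_cm <- 2 + 1.5 * age + rnorm(length(age), sd = 3)
c_const   <- 2.54
height_in <- height_cm / c_const

fit_cm <- lm(height_cm ~ age)
fit_in <- lm(height_in ~ age)

# Sums of squares for each fit
sse_cm <- sum(resid(fit_cm)^2)
sse_in <- sum(resid(fit_in)^2)
ssr_cm <- sum((fitted(fit_cm) - mean(height_cm))^2)
ssr_in <- sum((fitted(fit_in) - mean(height_in))^2)

# Compare the centimeter fit with the inch fit
coef(fit_cm) / coef(fit_in)        # b0 and b1: ratio equals c
sse_cm / sse_in                    # SSE: ratio equals c^2
ssr_cm / ssr_in                    # SSR: ratio equals c^2
r2_diff <- summary(fit_cm)$r.squared - summary(fit_in)$r.squared
t_diff  <- coef(summary(fit_cm))["age", "t value"] -
           coef(summary(fit_in))["age", "t value"]
c(r2_diff = r2_diff, t_diff = t_diff)   # both essentially zero
```

The comparison suggests that b0 and b1 scale by 1/c, SSE and SSR scale by 1/c², and R² and t* are unchanged; a full answer still needs the algebraic argument.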

2. Use the definition of SSR to show that, for the SLR model (with one explanatory variable),
a) SSR = b1² SSxx

b) E(MSR) = σ² + β1² SSxx
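
A possible outline of the argument for Problem 2, assuming the usual SLR facts that are not restated on this exam: the fitted values are ŷi = b0 + b1 xi with b0 = ȳ − b1 x̄, SSR = Σ(ŷi − ȳ)² has one degree of freedom, E(b1) = β1, and Var(b1) = σ²/SSxx.

```latex
% a) Substitute b_0 = \bar{y} - b_1 \bar{x} into the fitted values
\hat{y}_i - \bar{y} = b_0 + b_1 x_i - \bar{y} = b_1 (x_i - \bar{x}),
\qquad
SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2
    = b_1^2 \sum_{i=1}^{n} (x_i - \bar{x})^2
    = b_1^2 \, SS_{xx}.

% b) MSR = SSR / 1, and E(b_1^2) = Var(b_1) + [E(b_1)]^2
E(MSR) = SS_{xx} \, E(b_1^2)
       = SS_{xx} \left( \frac{\sigma^2}{SS_{xx}} + \beta_1^2 \right)
       = \sigma^2 + \beta_1^2 \, SS_{xx}.
```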

3. In an SLR model, based on appropriate data from n = 23 observations, you are
given the calculated value of the coefficient of determination R² = 0.35. Conduct
a formal test, using α = 0.01, of the hypothesis H0: β1 = 0 against the alternative
Ha: β1 ≠ 0. Interpret your findings.
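
In SLR the F statistic can be written in terms of R² alone: since SSR = R² SSTO and SSE = (1 − R²) SSTO, F* = MSR/MSE = (n − 2) R² / (1 − R²). A minimal sketch of the computation for Problem 3 follows; the variable names are illustrative.

```r
n     <- 23
R2    <- 0.35
alpha <- 0.01

# F* = (n - 2) R^2 / (1 - R^2) = 21 * 0.35 / 0.65, about 11.31
F_star <- (n - 2) * R2 / (1 - R2)

# Critical value and p-value from the F(1, n - 2) distribution
F_crit <- qf(1 - alpha, df1 = 1, df2 = n - 2)
p_val  <- pf(F_star, df1 = 1, df2 = n - 2, lower.tail = FALSE)

c(F_star = F_star, F_crit = F_crit, p_value = p_val)
F_star > F_crit   # TRUE means H0 is rejected at level alpha
```

The equivalent two-sided t test uses t* = ±sqrt(F*) with n − 2 = 21 degrees of freedom.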

4. Indicate for each of the following statements, concerning the SLR model, whether it
is True or False. Give a full justification for your answer.

a) [T, F] The least squares regression line always passes through the point (x̄, ȳ).

b) [T, F] Σ ei = 0.

c) [T, F] A high value of the coefficient of determination, R² = SSR/SSTO, always
indicates that the estimated linear regression line is a good fit to the data.

d) [T, F] The least squares estimator b1 for the regression parameter β1 is biased,
unless the error terms εi are normally distributed.

e) [T, F] The estimator Ŷh (for E(Yh) at a given X = xh) is more precise (on
the average) if the value of xh is close to the average value x̄ of the independent
variable X.

5. The data below show the sale price Y (in $1,000s) and the square footage X (in 100s of
square feet) for a sample of n = 52 single-family homes in a small town in Indiana. The sale
price is to be treated here as the dependent variable and the SLR model is assumed to be
appropriate.

i      1     2     3     4    ...    52
Yi   65.9  67.4  71.6  93.1   ...  185.0
Xi    9.5  12.0  11.8  10.0   ...   40.0

In addition, preliminary calculations yield the following quantities:

x̄ = 20, ȳ = 100, SSxx = 16,000, SSxy = 32,000, SSyy = 384,000
a) Calculate the LS estimates b0 and b1 of the regression coefficients. What is the
fitted regression line?

b) What is the fitted selling price for house #4 on the list? Is it undervalued or
overvalued by this regression model?

c) Complete the ANOVA table below and conduct a formal F test of H0: β1 = 0
against Ha: β1 ≠ 0. Use α = 0.01. State your conclusions.

Source SS d.f. MS F

Total:

d) Conduct a formal test of the hypothesis H0: β1 = 2.25 against Ha: β1 > 2.25,
using α = 0.05. Discuss your findings.

e) Obtain a 95% prediction interval for the selling price of a house with
1,000 square feet.
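
The quantities asked for in Problem 5 follow from the summary statistics alone. The sketch below works them out with the standard SLR formulas (b1 = SSxy/SSxx, b0 = ȳ − b1 x̄, SSR = b1² SSxx, SSTO = SSyy). The variable names are illustrative, and it assumes that a 1,000 square-foot house corresponds to X = 10 in the units used here.

```r
n    <- 52
xbar <- 20;    ybar <- 100
SSxx <- 16000; SSxy <- 32000; SSyy <- 384000

# a) Least squares estimates
b1 <- SSxy / SSxx              # 2
b0 <- ybar - b1 * xbar         # 60

# b) House #4 has X = 10.0 and Y = 93.1
yhat4  <- b0 + b1 * 10.0       # 80
resid4 <- 93.1 - yhat4         # observed minus fitted

# c) ANOVA quantities
SSR    <- b1^2 * SSxx          # 64000, 1 d.f.
SSE    <- SSyy - SSR           # 320000, n - 2 = 50 d.f.
MSE    <- SSE / (n - 2)        # 6400
F_star <- SSR / MSE            # 10
F_crit <- qf(0.99, 1, n - 2)

# d) One-sided t test of H0: beta1 = 2.25 vs Ha: beta1 > 2.25
se_b1  <- sqrt(MSE / SSxx)
t_star <- (b1 - 2.25) / se_b1
t_crit <- qt(0.95, n - 2)

# e) 95% prediction interval at xh = 10 (assumed to mean 1,000 sq. ft.)
xh      <- 10
se_pred <- sqrt(MSE * (1 + 1 / n + (xh - xbar)^2 / SSxx))
pred_int <- (b0 + b1 * xh) + c(-1, 1) * qt(0.975, n - 2) * se_pred
pred_int
```

Since b1 = 2 is below the hypothesized 2.25, t_star in part d) is negative, so the one-sided test cannot reject H0.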

6. The sample data below show, for a random sample of n = 15 land transactions in
the same tax district, the Assessed Value of the land for property-tax
purposes (AV, in $1000) and the recent Selling Price of the land in a public auction
(SP, in $1000); see below.

[Figure: scatter plot of SP (vertical axis, ticks from 20 to 35) against AV (horizontal axis, ticks from 10 to 16).]

The statistician is interested in predicting (at a 95% 'probability' level) the selling
price of a land parcel with an assessed value of $13,500 that she plans to place in a
public auction. To that end, she is considering three possible models to fit the data:
I) Model I: Yi = β0 + β1 xi + εi,
II) Model II: Yi = β1 xi + εi,
III) Model III: log(Yi) = β0 + β1 xi + εi.
a) What will be her predicted selling price under each model?

b) Which of the models do you think she should use, and why? Give full justification
for your recommendation.

7. Problem 6 Revisited. Recall that the data consist of a random sample of n = 15
land transactions in the same tax district: the Assessed Value of the land
for property-tax purposes (AV, in $1000) and the recent Selling Price of the land
in a public auction (SP, in $1000); see below.
The statistician is interested in predicting (at a 95% 'probability' level) the selling
price of a land parcel with an assessed value of $13,500 that she plans to place in a
public auction. To that end, she is considering three possible models to fit the data:
I) Model I: Yi = β0 + β1 xi + εi,
II) Model II: Yi = β1 xi + εi,
III) Model III: log(Yi) = β0 + β1 xi + εi.
• Redo the problem at home, giving answers to the questions below. When you are done,
please upload your solution to Canvas by 12:00 am (midnight) on Sunday, October
16, 2022, as a single PDF file with the name: FirstName LastName 512.pdf
a) What will be her prediction interval for the selling price under each model?

b) Which of the three models do you think she should use, and why? Give full
justification for your recommendation.

c) Is there a different model that you would like to recommend for her to use? If
so, which one and why?

Data and output for Problems 6 and 7

> taxes0 <- read.table('taxes.txt', header=TRUE, sep=","); taxes0
AV SP
1 13.9 28.6
2 16.0 34.7
3 10.3 21.0
4 11.8 25.5
5 16.7 36.8
6 12.5 24.0
7 10.0 19.1
8 11.4 22.5
9 13.9 28.3
10 12.2 25.0
11 15.4 31.1
12 14.8 29.6
13 14.9 35.1
14 12.9 30.0
15 15.8 36.2
> apply(taxes0, 2, mean)
AV SP
13.5 28.5
> apply(taxes0, 2, var)
AV SP
4.442857 31.611429

I) Model I: Yi = β0 + β1 xi + εi
> taxes1.lm<-lm(SP ~ AV); summary(taxes1.lm)
Call:
lm(formula = SP ~ AV)
Residuals:
Min 1Q Median 3Q Max
-2.2291 -1.0667 -0.1959 0.9770 3.0417
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -5.8121 3.0650 -1.896 0.0804 .
AV 2.5416 0.2245 11.322 4.19e-08 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.771 on 13 degrees of freedom
Multiple R-squared: 0.9079, Adjusted R-squared: 0.9008
F-statistic: 128.2 on 1 and 13 DF, p-value: 4.187e-08

II) Model II: Yi = β1 xi + εi
> taxes2.lm<-lm(SP ~ -1+AV); summary(taxes2.lm)
Call:
lm(formula = SP ~ -1 + AV)
Residuals:
Min 1Q Median 3Q Max
-2.5086 -1.6172 -0.8724 1.0767 3.5017
Coefficients:
Estimate Std. Error t value Pr(>|t|)
AV 2.12069 0.03646 58.17 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.928 on 14 degrees of freedom
Multiple R-squared: 0.9959, Adjusted R-squared: 0.9956
F-statistic: 3384 on 1 and 14 DF, p-value: < 2.2e-16

III) Model III: log(Yi) = β0 + β1 xi + εi.
> taxes3.lm<-lm(log(SP) ~ AV); summary(taxes3.lm)
Call:
lm(formula = log(SP) ~ AV)
Residuals:
Min 1Q Median 3Q Max
-0.06916 -0.04170 -0.01500 0.02730 0.12555
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.085146 0.107023 19.48 5.28e-11 ***
AV 0.092287 0.007839 11.77 2.63e-08 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.06182 on 13 degrees of freedom
Multiple R-squared: 0.9142, Adjusted R-squared: 0.9077
F-statistic: 138.6 on 1 and 13 DF, p-value: 2.627e-08
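
The printed coefficients give point predictions at AV = 13.5 of approximately 28.50 (Model I), 28.63 (Model II), and exp(2.085146 + 0.092287 × 13.5) ≈ 27.97 (Model III). The sketch below shows one way to obtain the corresponding 95% prediction intervals with `predict()`. It retypes the data from the listing above instead of reading `taxes.txt`, and the names `new_parcel`, `fit1`, `fit2`, `fit3`, and `pred_table` are illustrative. For Model III the interval is computed on the log scale and then exponentiated, which is one common choice and not necessarily the one intended by the exam.

```r
# Data retyped from the listing above (AV and SP in $1000)
taxes0 <- data.frame(
  AV = c(13.9, 16.0, 10.3, 11.8, 16.7, 12.5, 10.0, 11.4,
         13.9, 12.2, 15.4, 14.8, 14.9, 12.9, 15.8),
  SP = c(28.6, 34.7, 21.0, 25.5, 36.8, 24.0, 19.1, 22.5,
         28.3, 25.0, 31.1, 29.6, 35.1, 30.0, 36.2)
)
new_parcel <- data.frame(AV = 13.5)   # assessed value of $13,500

fit1 <- lm(SP ~ AV,      data = taxes0)   # Model I
fit2 <- lm(SP ~ -1 + AV, data = taxes0)   # Model II (no intercept)
fit3 <- lm(log(SP) ~ AV, data = taxes0)   # Model III (log response)

pi1 <- predict(fit1, new_parcel, interval = "prediction", level = 0.95)
pi2 <- predict(fit2, new_parcel, interval = "prediction", level = 0.95)
# Back-transform the log-scale interval to the SP scale
pi3 <- exp(predict(fit3, new_parcel, interval = "prediction", level = 0.95))

pred_table <- rbind(pi1, pi2, pi3)
rownames(pred_table) <- c("Model I", "Model II", "Model III")
pred_table   # columns: fit, lwr, upr (in $1000)
```

Residual plots for each fit, for example `plot(fit1, which = 1)`, would help with the model comparison asked for in parts b) and c).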
