Midterm 2022 Sol

Uploaded by

vanessalaucode


DO NOT OPEN THIS EXAM UNTIL INSTRUCTED TO

Name: Student #:

STAT 404 Midterm Exam

September–December, 2022
Instructor: Jiahua Chen
Total marks: 68.

• Put your name and student ID on the upper-right corner of every sheet.

• Correct answers are usually short. Answer questions in brief but complete sentences.
For example, if we ask: Calculate $SS_{trt}$, a satisfactory answer is:
The treatment sum of squares is given by
$$SS_{trt} = \sum_{i=1}^{k} n_i (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2 = 4 \times (5 - 3.2)^2 + 6 \times (2 - 3.2)^2 = 21.6.$$

An unsatisfactory answer is:

21.6.

• Use R for simple calculations such as the sample mean and sample variance (as in the
assignments). Answers obtained using one-line R functions will not be accepted.

• Save the R code you used in a .doc, .docx, .rtf, or .txt file. Include comments
describing which question the code block is used for. Leave sufficient space between code
for different questions. Submit your code to Canvas when instructed to.

• Unless otherwise specified,

1. assume common notations and model assumptions;

2. use the conventional 5% level for tests, hypothesis for two-sided alternatives, and
95% confidence level.

1. [6] List the three principles of design of experiments we discussed in STAT 404.
Explain each principle in 1–2 complete sentences.

Answer.

(a) Randomization: prevents the effect of lurking factors, or assigns treatments to experimental units at random (either is fine).

(b) Replication: improves the precision of estimating the treatment effects, or repeats the same treatment on several experimental units.

(c) Blocking: removes the effect of a factor that is not of interest, or groups similar experimental units to compare different treatments under similar conditions.

2. [8] The standard two-sample t-test is formulated under strict model assumptions.

(a) [4] Name two of the model assumptions. Describe each assumption in one
sentence.

Answer. Any two of the following (or other relevant assumptions) are acceptable:

• Independence: all observations are independent of each other.

• Normality: all observations have normal distributions.

• Identical means: the two populations have the same mean.

• Identical variances: the two populations have the same variance.

• Identically distributed: all observations in the same sample have the same
distribution.

(b) [4] We recommend the Welch test when two populations have different variances.
Yet, we commented that this test is (1) mathematically invalid but (2) statistically
acceptable. Explain these two points.

Answer.

• The test is mathematically invalid because the test statistic for Welch's test does not have a t-distribution.

• The test is statistically acceptable (and widely recommended) because the distribution of Welch's test statistic is well approximated by the recommended t-distribution, which leads to null rejection probabilities close to the nominal level.
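
To make the two points concrete, here is a minimal sketch of the Welch statistic and its approximate degrees of freedom computed by hand. The two samples are hypothetical (they are not exam data), and the Welch-Satterthwaite formula is assumed to be the "recommended t-distribution" referred to above.

```r
# Hypothetical samples (NOT from the exam), chosen to have visibly different spreads.
x <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.3)
y <- c(3.2, 7.9, 4.4, 9.1, 6.5, 2.8, 8.3, 5.0)

n1 <- length(x); n2 <- length(y)
v1 <- var(x) / n1   # estimated variance of mean(x)
v2 <- var(y) / n2   # estimated variance of mean(y)

# Welch statistic: the denominator does not pool the two variances.
t_welch <- (mean(x) - mean(y)) / sqrt(v1 + v2)

# Welch-Satterthwaite approximate degrees of freedom (usually not an integer).
df_welch <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

# Two-sided p-value from the approximating t-distribution.
p_welch <- 2 * pt(abs(t_welch), df = df_welch, lower.tail = FALSE)

cat("t =", t_welch, " df =", df_welch, " p =", p_welch, "\n")
```

The non-integer degrees of freedom are a reminder that the t-distribution is only an approximation here.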

3. [20] A linear regression model assumes that the response values in a study can be expressed as

$$y_i = x_i^\top \beta + \epsilon_i \quad \text{for } i = 1, 2, \ldots, n,$$

where $x_i^\top \beta$ is the expected value of $y_i$ and the $\epsilon_i$'s are iid $N(0, \sigma^2)$ random variables.

Use the R commands provided in the file “Midterm2022.txt” on the Canvas main
page to load the data. This file also provides a few lines of code to save time.

The dataset contains

• x1 : the assignment mark,
• x2 : the midterm mark, and
• y: the final exam mark
of n = 39 students in some course. Regard x1 and x2 as predictors and y as the
response variable.

Note: in this case, x = (x0 = 1, x1 , x2 )⊤ and β = (β0 , β1 , β2 )⊤ .

(a) [4] Obtain the least squares estimator β̂ of β.

Answer. The LSE of β is

$$\hat{\beta} = (X^\top X)^{-1} X^\top y = (11.9091, 0.4910, 0.4028).$$
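
As an illustration of the matrix formula, the following sketch computes the least squares estimate without calling a one-line fitting function. The six rows of marks are hypothetical stand-ins, since the contents of "Midterm2022.txt" are not reproduced here.

```r
# Hypothetical marks for 6 students (NOT the exam's Midterm2022.txt data).
x1 <- c(70, 85, 60, 90, 75, 80)   # assignment mark
x2 <- c(65, 80, 55, 95, 70, 60)   # midterm mark
y  <- c(68, 84, 58, 93, 71, 72)   # final exam mark

X <- cbind(1, x1, x2)             # design matrix with intercept column x0 = 1

# Solve the normal equations (X'X) b = X'y instead of forming the inverse explicitly.
beta_hat <- solve(crossprod(X), crossprod(X, y))
print(drop(beta_hat))
```

Solving the linear system gives the same estimate as multiplying by the explicit inverse and is generally the more stable choice numerically.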

(b) [4] Estimate the error variance $\sigma^2$ (use the method given in class).

Answer. The error variance is estimated as

$$\hat{\sigma}^2 = \sum_{i=1}^{39} (y_i - \hat{y}_i)^2 / (39 - 3) = 54.59.$$

(c) [4] Estimate the variance matrix of β̂.

Answer. The variance matrix of β̂ is estimated as

$$\mathrm{Var}(\hat{\beta}) = \hat{\sigma}^2 (X^\top X)^{-1} =
\begin{pmatrix}
86.8395 & -0.7029 & -0.4398 \\
-0.7029 & 0.0131 & -0.0041 \\
-0.4398 & -0.0041 & 0.0104
\end{pmatrix}.$$
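
The error variance and the variance matrix can be computed by hand in the same way. This is a sketch on hypothetical marks (not the exam data), so the printed numbers will not match the values above.

```r
# Hypothetical marks for 6 students (NOT the exam data).
x1 <- c(70, 85, 60, 90, 75, 80)
x2 <- c(65, 80, 55, 95, 70, 60)
y  <- c(68, 84, 58, 93, 71, 72)
X  <- cbind(1, x1, x2)
n  <- nrow(X); p <- ncol(X)

XtX_inv  <- solve(crossprod(X))           # (X'X)^{-1}
beta_hat <- XtX_inv %*% crossprod(X, y)   # least squares estimate
res      <- y - X %*% beta_hat            # residuals y - y_hat

sigma2_hat <- sum(res^2) / (n - p)        # residual sum of squares over (n - 3)
V_hat      <- sigma2_hat * XtX_inv        # estimated variance matrix of beta_hat

print(sigma2_hat)
print(V_hat)
```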

(d) [4] Estimate the variance of β̂2 − β̂1 (both LS estimators).

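Answer. The variance of $\hat{\beta}_2 - \hat{\beta}_1$ is estimated to be

$$\begin{aligned}
\mathrm{Var}(\hat{\beta}_2 - \hat{\beta}_1)
&= \hat{\sigma}^2 \left\{ (X^\top X)^{-1}_{22} + (X^\top X)^{-1}_{33} - 2 (X^\top X)^{-1}_{23} \right\} \\
&= 0.0131 + 0.0104 - 2(-0.0041) \\
&= 0.0317.
\end{aligned}$$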

(e) [4] Construct a two-sided, non-simultaneous 95% CI for β1 − β2 (the difference between the regression coefficients for the assignment and midterm marks).
Hint: remember the general recipe for constructing CIs.

Answer. The 95% CI for $\beta_1 - \beta_2$ is given by

$$(\hat{\beta}_1 - \hat{\beta}_2) \pm \mathtt{qt}(0.975, 39 - 3) \sqrt{\mathrm{Var}(\hat{\beta}_1 - \hat{\beta}_2)} = 0.0882 \pm 0.361 = [-0.273, 0.449].$$
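
The arithmetic in parts (d) and (e) can be reproduced from the rounded estimates reported in parts (a) and (c). The sketch below writes the difference as a contrast vector `a`; that framing is an illustrative choice and is not taken from the exam.

```r
# Estimates and variance matrix as reported (rounded) in parts (a) and (c).
beta_hat <- c(11.9091, 0.4910, 0.4028)
V_hat <- matrix(c(86.8395, -0.7029, -0.4398,
                  -0.7029,  0.0131, -0.0041,
                  -0.4398, -0.0041,  0.0104), nrow = 3, byrow = TRUE)

a <- c(0, 1, -1)                          # contrast picking out beta_1 - beta_2
est      <- sum(a * beta_hat)             # 0.4910 - 0.4028
var_diff <- drop(t(a) %*% V_hat %*% a)    # V22 + V33 - 2 * V23
half     <- qt(0.975, df = 39 - 3) * sqrt(var_diff)

cat("variance of the difference:", var_diff, "\n")
cat("95% CI:", est - half, est + half, "\n")
```

Because the variance is a quadratic form in `a`, the sign of the contrast does not matter: β̂2 − β̂1 and β̂1 − β̂2 have the same variance.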

4. [26] Consider a hypothetical one-way layout comparing k = 6 treatments under

standard model assumptions and notations.

Use the R commands provided in the file “Midterm2022.txt” on the Canvas main
page to load the data. This file also provides a few lines of code to save time.

(a) [4] Compute the treatment sum of squares SStrt .

Answer. The mean responses of these treatments are

1.5675, 1.7450, 2.0150, 1.3800, 1.5475, 1.6375

and the grand mean is 1.64875. We find

$$SS_{trt} = \sum_{i=1}^{6} 4 (\bar{y}_i - \bar{y})^2 = 0.93.$$
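
The value can be checked directly from the six treatment means listed above, using 4 observations per treatment as in the formula. A short sketch:

```r
# Treatment means as reported above; each treatment has 4 observations.
ybar_i <- c(1.5675, 1.7450, 2.0150, 1.3800, 1.5475, 1.6375)
n_i    <- 4

# The plain average of the means equals the grand mean only because the design is balanced.
grand_mean <- mean(ybar_i)
ss_trt     <- sum(n_i * (ybar_i - grand_mean)^2)

cat("grand mean:", grand_mean, "\n")
cat("SS_trt:", round(ss_trt, 4), "\n")
```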

(b) [4] Compute the error (or residual) sum of squares SSerr .

Answer. Let $s_i^2$ be the sample variance for treatment $i$. We compute it as

$$SS_{err} = 3 \sum_{i=1}^{6} s_i^2 = 0.0822.$$
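
The factor 3 is $n_i - 1$ for treatments with 4 observations each. The sketch below checks, on a small hypothetical layout (3 treatments, not the exam data), that this shortcut agrees with the direct definition of the error sum of squares.

```r
# Hypothetical balanced layout: 3 treatments, 4 replicates each (NOT the exam data).
y   <- c(1.2, 1.5, 1.1, 1.4,   2.0, 2.3, 1.9, 2.2,   1.6, 1.8, 1.7, 1.5)
trt <- rep(1:3, each = 4)

s2_i   <- tapply(y, trt, var)     # sample variance within each treatment
ss_err <- (4 - 1) * sum(s2_i)     # (n_i - 1) * s_i^2, summed over treatments

# Direct definition: squared deviations from each observation's own treatment mean.
ss_err_direct <- sum((y - ave(y, trt))^2)

cat(ss_err, ss_err_direct, "\n")
```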

(c) [4] Complete the ANOVA table.

Answer.

Source      DF   SS       MSS       F
Treatment    5   0.9304   0.1861    40.7367
Error       18   0.0822   0.00457
Total       23   1.0127
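
The entries of the table follow from the two sums of squares and their degrees of freedom. The sketch below rebuilds them from the rounded values in parts (a) and (b), so the F ratio differs slightly in the later decimals from the 40.7367 shown, which presumably comes from unrounded inputs.

```r
# Sums of squares as reported (rounded) in parts (a) and (b); k = 6 treatments, N = 24 runs.
ss_trt <- 0.9304; ss_err <- 0.0822
k <- 6; N <- 24

df_trt <- k - 1; df_err <- N - k
ms_trt <- ss_trt / df_trt
ms_err <- ss_err / df_err
f_obs  <- ms_trt / ms_err

anova_tab <- data.frame(
  Source = c("Treatment", "Error", "Total"),
  DF     = c(df_trt, df_err, N - 1),
  SS     = c(ss_trt, ss_err, ss_trt + ss_err),
  MSS    = c(ms_trt, ms_err, NA),
  F      = c(f_obs, NA, NA)
)
print(anova_tab)
```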

(d) [4] Test the hypothesis that all treatment means are equal at the 10% level.
State the null and alternative hypotheses, the test statistic and its reference distri-
bution, and your conclusions.

Answer. The null hypothesis is that all treatment means are equal, i.e.,

H0 : τ1 = · · · = τ6 .

The alternative is that at least two of them are not equal, i.e.,

H1 : τi ̸= τj for some i ̸= j .

The test statistic (and its value) is

$$F_{obs} = \frac{MSS_{trt}}{MSS_{err}} = \frac{0.1861}{0.0046} = 40.73.$$

The reference distribution is F with degrees of freedom (5, 18). The p-value of the test is

$$p = 1 - \mathtt{pf}(40.73, 5, 18) = 3.38 \times 10^{-9},$$

which is below the nominal level 0.1. We reject the null hypothesis and conclude that at least two means are not equal.
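
A short sketch of the test decision in R, using the observed statistic from the text. The critical-value comparison is added for illustration and is equivalent to the p-value comparison.

```r
f_obs <- 40.73    # observed statistic from the text
alpha <- 0.10     # level requested in part (d)

# Upper-tail probability of F(5, 18); lower.tail = FALSE avoids computing 1 - (nearly 1).
p_value <- pf(f_obs, df1 = 5, df2 = 18, lower.tail = FALSE)

# Rejection threshold at the 10% level.
f_crit <- qf(1 - alpha, df1 = 5, df2 = 18)

cat("p-value:", p_value, "\n")
cat("critical value:", f_crit, " reject H0:", f_obs > f_crit, "\n")
```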

(e) [4] Estimate the 6 treatment effects and the error variance (i.e., τ̂j and σ̂ 2 ). Use
complete sentences.

Answer. The estimated effects were calculated in a previous question and are ($\hat{\tau}_i$ : $i = 1, 2, \ldots, 6$)

−0.081, 0.096, 0.366, −0.269, −0.101, −0.011 .

The estimated error variance is

$$\hat{\sigma}^2 = s^2 = MSS_{err} = 0.00457.$$

(f ) [6] Construct simultaneous 90% CIs for the mean differences using Tukey’s
method. Pretend that you are computing all simultaneous CIs but show only the
first 3 (1 vs 2 ; 1 vs 3 ; 2 vs 3 ) in writing.

Answer. The mean differences are estimated as

(ȳ1 − ȳ2 , ȳ1 − ȳ3 , ȳ2 − ȳ3 ) = (−0.1775, −0.4475, −0.2700) .

The estimated error standard deviation is

$$s = \sqrt{MSS_{err}} = 0.0676.$$

Tukey's 90% quantile is given by

q = qtukey(0.9, 6, 18) = 3.9836 .

We have

$$\frac{q s}{\sqrt{2}} \sqrt{\frac{1}{4} + \frac{1}{4}} = 0.1347.$$
Hence, the 90% simultaneous CIs are

1 vs 2 : (−0.312, −0.043) ,

1 vs 3 : (−0.582, −0.313) ,

2 vs 3 : (−0.405, −0.135) .
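
The three intervals can be reproduced from the reported means and error mean square. This is a sketch for the first three treatments only; the object names are illustrative.

```r
ybar   <- c(1.5675, 1.7450, 2.0150)   # means of treatments 1 to 3, as reported in part (a)
ms_err <- 0.00457                     # error mean square from the ANOVA table
n_i    <- 4; k <- 6; df_err <- 18

# Studentized range quantile and the common half-width for a balanced design.
q_crit <- qtukey(0.90, nmeans = k, df = df_err)
half   <- q_crit / sqrt(2) * sqrt(ms_err) * sqrt(1 / n_i + 1 / n_i)

idx   <- rbind(c(1, 2), c(1, 3), c(2, 3))      # pairs to compare
diffs <- ybar[idx[, 1]] - ybar[idx[, 2]]
ci    <- cbind(lower = diffs - half, upper = diffs + half)
rownames(ci) <- c("1 vs 2", "1 vs 3", "2 vs 3")
print(round(ci, 3))
```

None of the three intervals contains 0, which is consistent with rejecting the equal-means hypothesis in part (d).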

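5. [8] The two-sample problem is a special case of the one-way layout. You may find the following formulas helpful for this question:

$$SS_{tot} = \sum_{j=1}^{n_1} (y_{1j} - \bar{y}_{\cdot\cdot})^2 + \sum_{j=1}^{n_2} (y_{2j} - \bar{y}_{\cdot\cdot})^2,$$

$$SS_{trt} = n_1 (\bar{y}_{1\cdot} - \bar{y}_{\cdot\cdot})^2 + n_2 (\bar{y}_{2\cdot} - \bar{y}_{\cdot\cdot})^2,$$

$$SS_{err} = \sum_{j=1}^{n_1} (y_{1j} - \bar{y}_{1\cdot})^2 + \sum_{j=1}^{n_2} (y_{2j} - \bar{y}_{2\cdot})^2.$$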

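(a) [4] Suppose $\mu_1 \neq \mu_2$. Compute $E[(\bar{y}_{1\cdot} - \bar{y}_{2\cdot})^2]$.

Answer. By independence and the relationship $E[X^2] = \mathrm{Var}(X) + (E[X])^2$, we have

$$E\left[(\bar{y}_{1\cdot} - \bar{y}_{2\cdot})^2\right] = \frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2} + (\mu_1 - \mu_2)^2.$$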
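
As a sanity check on the formula, the following Monte Carlo sketch compares a simulated average of the squared difference with the closed-form value. All parameter values are hypothetical, and normal data with a common variance are assumed.

```r
set.seed(404)
mu1 <- 1; mu2 <- 3; sigma <- 2    # hypothetical parameter values
n1 <- 5; n2 <- 8; reps <- 1e5

# Each row is one simulated sample; rowMeans gives the sample means.
ybar1 <- rowMeans(matrix(rnorm(reps * n1, mu1, sigma), nrow = reps))
ybar2 <- rowMeans(matrix(rnorm(reps * n2, mu2, sigma), nrow = reps))

simulated   <- mean((ybar1 - ybar2)^2)
theoretical <- sigma^2 / n1 + sigma^2 / n2 + (mu1 - mu2)^2

cat("simulated:", simulated, " formula:", theoretical, "\n")
```

The two numbers should agree up to simulation error, which shrinks as `reps` grows.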

(b) [4] Prove the formula for the decomposition of the sum of squares:

SStot = SStrt + SSerr .

Remark: while this is not a bonus question, do not start this problem unless you
have extra time.

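Answer. Using the well-known fact $\sum_i x_i^2 = \sum_i (x_i - \bar{x})^2 + n \bar{x}^2$, we get

$$\begin{aligned}
\sum_{j=1}^{n_1} (y_{1j} - \bar{y}_{\cdot\cdot})^2
&= \sum_{j=1}^{n_1} \{ (y_{1j} - \bar{y}_{1\cdot}) + (\bar{y}_{1\cdot} - \bar{y}_{\cdot\cdot}) \}^2 \\
&= \sum_{j=1}^{n_1} (y_{1j} - \bar{y}_{1\cdot})^2 + n_1 (\bar{y}_{1\cdot} - \bar{y}_{\cdot\cdot})^2.
\end{aligned}$$

For the same reason, we have

$$\sum_{j=1}^{n_2} (y_{2j} - \bar{y}_{\cdot\cdot})^2 = \sum_{j=1}^{n_2} (y_{2j} - \bar{y}_{2\cdot})^2 + n_2 (\bar{y}_{2\cdot} - \bar{y}_{\cdot\cdot})^2.$$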

We get the desired identity by summing up the two sides. This completes the proof.
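
A numerical check of the identity on two small hypothetical samples of unequal size (a sketch, not part of the proof):

```r
# Hypothetical unbalanced samples (n1 = 4, n2 = 5).
y1 <- c(1, 2, 3, 4)
y2 <- c(2, 4, 6, 8, 10)
grand <- mean(c(y1, y2))          # grand mean over all 9 observations

ss_tot <- sum((y1 - grand)^2) + sum((y2 - grand)^2)
ss_trt <- length(y1) * (mean(y1) - grand)^2 + length(y2) * (mean(y2) - grand)^2
ss_err <- sum((y1 - mean(y1))^2) + sum((y2 - mean(y2))^2)

cat(ss_tot, "=", ss_trt, "+", ss_err, "\n")
```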
