2024spring 340 Final

The document outlines the structure and rules for the SP24 STAT340 Final exam, including sections for multiple choice and short answer questions. It covers various statistical concepts such as regression analysis, probability, and model selection. Students are required to show work for computations and adhere to specific instructions for answering questions.

Uploaded by

Margita Kon-Popovska


SP24 STAT340 Final

MC1-3 (/6) MC4-7 (/8) MC8-10 (/6) SA1 (/4) SA2 (/4) SA3 (/4) SA4 (/4) SA5 (/4) Total (/40)

First (given) name:

Write here: ______________________________________________________________________

Last (family) name:

Write here: ______________________________________________________________________

Lecture section:

Circle one: Bi’s section Brian’s section Yongyi’s section

Rules:
You must show work for all computations (unless otherwise specified) to receive full credit.
You do NOT need to simplify any expressions you write down.
Note that some of the multiple choice questions are choose ONE and some are choose ALL that apply; please pay attention to
the instructions and select the appropriate number of responses!
Multiple Choice (2 pts each)
NOTE: For each multiple choice question below, choose ONE means there is EXACTLY one right answer, choose ALL
means there is AT LEAST one right answer.

MC1
Let X̄ and S² be the mean and variance of a sample Xi drawn from some distribution X. The expression θ̂ = X̄/S² would
make a good estimator for the parameter of the distribution if X followed which of the following distributions? Choose ALL
that apply!

a. Geometric
b. Poisson
c. Exponential
d. Binomial (with known n)
e. None of the above

MC2
X1, …, Xn are an independent random sample from a population. As sample size increases, which of the following
statistics will tend to decrease? Choose ALL that apply!

a. S²
b. X(1), the sample minimum
c. SE(X̄)
d. Width of a 95% confidence interval for μ
e. The width of the sample range

MC3
Let A, B be events for some random variable X. Suppose that A ⊂ B, in other words A is a subset of B (every outcome in
A is also in B) but A ≠ B. The Venn diagram would look something like this:

[Venn diagram: region A drawn entirely inside region B]

Which of the following is DEFINITELY true? Choose ONE!

a. P(A | B) > P(B | A)

b. P(B | A) > P(A | B)
c. P(A | B) = P(A)
d. P(A | B) = P(B)
e. None of the above

MC4
Which of the following is MOST problematic for a multiple regression model with response Y and predictors X1, X2?
Choose ONE!

a. Non-normality in Y
b. Non-normality in X1 or X2
c. Correlation between Y and X1 or between Y and X2
d. Correlation between X1 and X2
e. High variance in X1 or X2
MC5
A regression model is fit to a data set predicting Y from three predictors X1, X2 and X3. The following residual diagnostic
plots are produced afterwards:

[Figure: residual diagnostic plots, not reproduced here]

According to the plots above, which of the following assumptions of linear regression show evidence of NOT being
satisfied? Choose ALL that apply!

a. Normality of errors
b. Zero-mean of errors
c. Homoscedasticity (constant variance) of errors
d. Independence of errors
e. ALL the assumptions above show evidence of NOT being satisfied

MC6
The following is the output of a multiple linear regression fit. Let α = 0.05 as usual. Which of the following statements are
true? Choose ALL that apply!

##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)    6.532      3.273   1.996   0.0499 *
## x1             2.030      1.090   1.862   0.0668 .
## x2             1.722      1.124   1.532   0.1300
##
## Residual standard error: 10.24 on 69 degrees of freedom
## Multiple R-squared: 0.08399, Adjusted R-squared: 0.05744
## F-statistic: 3.163 on 2 and 69 DF, p-value: 0.04848

a. The slope of predictor x1, β1, is significantly different from 0

b. The slope of predictor x2, β2, is significantly different from 0
c. The intercept β0 is significantly different from 0
d. The overall model is significantly better than a null model
e. None of the above

MC7
In a logistic regression model, if the fitted probability Ŷi > T we will predict the classification as 1. If we increase T, what
will DEFINITELY be true? Choose ALL that apply!

a. Type I error rate will increase

b. Type II error rate will increase
c. Power of the test will increase
d. AIC of the model will increase
e. R² will increase
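
As a rough illustration of how the threshold T trades one kind of error for the other, the sketch below counts false positives and false negatives at two thresholds. The fitted probabilities and labels are made-up values, not data from the exam, and `error_rates` is a hypothetical helper name.

```python
def error_rates(probs, labels, threshold):
    """Return (false positive rate, false negative rate), predicting 1 when prob > threshold."""
    preds = [1 if p > threshold else 0 for p in probs]
    fp = sum(1 for yhat, y in zip(preds, labels) if yhat == 1 and y == 0)
    fn = sum(1 for yhat, y in zip(preds, labels) if yhat == 0 and y == 1)
    return fp / labels.count(0), fn / labels.count(1)

# Hypothetical fitted probabilities and true classes (assumed for illustration only)
probs = [0.05, 0.20, 0.35, 0.45, 0.55, 0.60, 0.75, 0.90]
labels = [0, 0, 1, 0, 0, 1, 1, 1]

fpr_low, fnr_low = error_rates(probs, labels, 0.3)    # lower threshold T
fpr_high, fnr_high = error_rates(probs, labels, 0.7)  # higher threshold T
print(fpr_low, fnr_low, fpr_high, fnr_high)
```

Raising T can only remove predicted 1s, so in this sketch the false positive rate cannot go up and the false negative rate cannot go down; whether either one strictly changes depends on the data.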
MC8
The following figure shows 4 different binary-response datasets (X1, Y1), (X2, Y2), (X3, Y3), (X4, Y4) with the same sample
size fitted with 4 different logistic regressions.

[Figure: four logistic regression fits, not reproduced here]

Answer each of the following with either Y1, Y2, Y3, Y4, or N/S for EITHER Not enough information to determine OR too
Similar to tell based on the plot alone.

a. Which fit shows the most significant relationship? _______

b. Which fit shows the least significant relationship? _______
c. Which fit gives the maximum value for β̂1? _______
d. Which fit gives the maximum value for β̂0? _______

MC9
Which of the following are valid ways of performing model selection if you wish to compare multiple different possible
models? Choose ALL that apply!

a. Try to minimize the RSS (i.e. the loss function)

b. Try to maximize the R²
c. Try to minimize the RSE (i.e. the residual standard error)
d. Try to maximize the correlation coefficient
e. Try to minimize the AIC (i.e. the Akaike information criterion)

MC10
Which of the following techniques AUTOMATICALLY does variable selection for you? Choose ALL that apply!

a. Stepwise model fitting

b. K-fold cross validation
c. Ridge regression
d. LASSO regression
e. None of the above
Short Answer (4 pts each)
SA1
A simple soil test can reveal the presence of lead. Land is considered contaminated if the lead level is > 100 ppm. We assume that the land is not
contaminated unless the soil test shows evidence of lead. The soil test has a power of 90% and a type 1 error rate of 5%.
Suppose that in this region of Wisconsin there is a 1% chance that land is contaminated with lead.

a. Write out an expression (plug in the necessary numbers, but do NOT simplify) for the probability that the land is
contaminated given a positive test result.
b. Write out an expression for the probability that the land is contaminated given a negative test result.
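
One way to organize this kind of calculation is Bayes' rule with the null hypothesis "not contaminated", so that power is P(positive | contaminated) and the type 1 error rate is P(positive | not contaminated). The sketch below is an illustration under that reading, not an official solution; `posterior` is a hypothetical helper name.

```python
def posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Bayes' rule: P(H | evidence) from the prior and the two likelihoods."""
    numerator = p_evidence_given_h * prior
    return numerator / (numerator + p_evidence_given_not_h * (1 - prior))

prior = 0.01   # P(contaminated), from the problem statement
power = 0.90   # P(positive | contaminated)
alpha = 0.05   # P(positive | not contaminated), the type 1 error rate

# Positive result: likelihoods are power and alpha
p_contaminated_given_pos = posterior(prior, power, alpha)
# Negative result: likelihoods are the complements, 1 - power and 1 - alpha
p_contaminated_given_neg = posterior(prior, 1 - power, 1 - alpha)
print(p_contaminated_given_pos, p_contaminated_given_neg)
```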
SA2
On a recent data collection trip you were sampling soil pH in a plot of land that you are considering purchasing. You would
like to estimate the mean soil pH, and model soil pH as a normally distributed random variable. You collect the following
data (summarized in a histogram)

After returning home, you realize the measurement tool was calibrated incorrectly and only recorded a maximum
measurement of 7.3! Any soil pH above 7.3 was simply recorded as 7.3. You would like to still be able to use the data but of
the 205 data measurements there are 15 “7.3”s and you know those are not reliable.

You are still comfortable assuming that (ignoring the calibration error) the rest of the soil pH measurements can be modeled
as being normally distributed, but you have to consider the following questions:

a. What would be the problem with using X̄ to estimate μ?

b. What other point estimator could you use for μ? Justify your answer.
c. What would be the problem with using s, the sample standard deviation, to estimate σ?
d. What other estimator could be used to estimate σ? Hint: You can use the Empirical Rule to find at least one, but there
are other acceptable responses
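
To see the effect of capping numerically, the sketch below builds an idealized normal "sample" from evenly spaced quantiles, caps it at 7.3, and compares a few estimators. The values μ = 6.5 and σ = 0.55 are assumptions chosen so that roughly 7% of readings exceed the cap (similar to 15 of 205); they are not taken from the exam. The σ estimator shown (median minus the 16th percentile) is just one Empirical Rule option among several acceptable ones.

```python
from statistics import NormalDist, mean, median, stdev

# Hypothetical true parameters (assumed for illustration only)
MU, SIGMA, CAP, N = 6.5, 0.55, 7.3, 205

# Idealized sample: evenly spaced quantiles of the assumed normal distribution
true_dist = NormalDist(MU, SIGMA)
raw = [true_dist.inv_cdf((i + 0.5) / N) for i in range(N)]
recorded = [min(x, CAP) for x in raw]  # the tool records nothing above the cap

mean_hat = mean(recorded)      # pulled down, because large values were shrunk to the cap
median_hat = median(recorded)  # unaffected, because only the upper tail was capped
sd_hat = stdev(recorded)       # shrunk, because the upper tail was compressed

# Empirical Rule: about 16% of a normal population lies below mu - sigma,
# so the gap between the median and the 16th percentile is roughly sigma.
p16 = sorted(recorded)[int(0.16 * N)]
sigma_hat = median_hat - p16
print(mean_hat, median_hat, sd_hat, sigma_hat)
```

Both the median and the lower-tail spread use only the part of the data that the calibration error did not touch, which is the reason they are candidates here.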
SA3
Consider a simple linear regression of two variables Yi and Xi . Suppose you decide to standardize your data before fitting
the linear model:

$$Y_i^Z = \frac{Y_i - \bar{Y}}{S_Y} \quad \text{and} \quad X_i^Z = \frac{X_i - \bar{X}}{S_X}$$

This is basically analogous to applying the z-score formula, where we simply subtract out the sample mean and divide by
the sample standard deviation. Note that this is a linear transformation of each variable, and results in $X_i^Z$ and $Y_i^Z$ both having
mean 0 and standard deviation 1. Then, the standardized regression model is

$$Y_i^Z = \beta_0 + \beta_1 X_i^Z + \epsilon_i$$

For each of the following, decide if the statement is TRUE or FALSE and explain why. Note: you MUST give sufficient
justification for full credit!

a. β̂0 = 0
b. The standard error of the residuals would stay the same compared to the unstandardized model
c. β̂1 = r_xy
d. R² may decrease due to standardizing data
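
A quick numerical check can help build intuition for these statements. The sketch below fits a least-squares line before and after standardizing a small made-up data set (the x and y values are assumptions, not exam data) and compares intercept, slope, sample correlation, and residual standard error. The helper names `ols`, `standardize`, `correlation` and `residual_se` are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def ols(x, y):
    """Least-squares fit of y on x; returns (intercept, slope)."""
    xb, yb = mean(x), mean(y)
    sxy = sum((a - xb) * (b - yb) for a, b in zip(x, y))
    sxx = sum((a - xb) ** 2 for a in x)
    slope = sxy / sxx
    return yb - slope * xb, slope

def standardize(v):
    """Subtract the sample mean and divide by the sample standard deviation."""
    m, s = mean(v), stdev(v)
    return [(a - m) / s for a in v]

def correlation(x, y):
    """Sample correlation coefficient r."""
    xb, yb = mean(x), mean(y)
    sxy = sum((a - xb) * (b - yb) for a, b in zip(x, y))
    return sxy / sqrt(sum((a - xb) ** 2 for a in x) * sum((b - yb) ** 2 for b in y))

def residual_se(x, y):
    """Residual standard error with n - 2 degrees of freedom."""
    b0, b1 = ols(x, y)
    rss = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    return sqrt(rss / (len(x) - 2))

# Hypothetical data (assumed for illustration only)
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.7]
xz, yz = standardize(x), standardize(y)

b0_z, b1_z = ols(xz, yz)
r_raw, r_std = correlation(x, y), correlation(xz, yz)
print(b0_z, b1_z, r_raw, r_std, residual_se(x, y), residual_se(xz, yz))
```

Since R² in simple linear regression equals r², comparing `r_raw` with `r_std` also shows what standardizing does to R² in this example.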
SA4
The following output shows the result of a logistic regression fit for the Titanic passengers dataset, where we fit the
probability of survival for a given passenger on the predictors Fare, Age, Sex, and an interaction term (Fare and Age are
numeric variables, while Sex is a categorical variable that can be “Female” or “Male”)

## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.315636   0.319388   0.988  0.32303
## Fare         0.012480   0.002708   4.608 4.06e-06 ***
## Age          0.012747   0.010832   1.177  0.23925
## Sexmale     -1.306316   0.413719  -3.157  0.00159 **
## Age:Sexmale -0.037645   0.013740  -2.740  0.00615 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

a. What’s the most statistically significant predictor (not counting the intercept)? Interpret it in context.
b. Write down an expression for a 95% confidence interval for the Sex coefficient and interpret it in context (you can
keep it as a log odds).
c. Holding all else constant, for a female passenger, by what factor would the ODDS of survival change if the
passenger’s age was increased by 10 years?
d. What would you predict to be the PROBABILITY of survival for a 35-year-old male passenger who paid $8 for fare?
Note: You do NOT need to simplify any expressions! Also, you may use intermediate variables if you wish!
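
The arithmetic behind parts b through d can be sketched from the coefficient table as follows. This is an illustrative setup, not an official solution: it assumes the usual normal critical value 1.96 for a 95% interval and that Female is the baseline level of Sex, so the female age effect is the Age coefficient alone.

```python
import math

# Estimates copied from the output above
coef = {
    "Intercept": 0.315636,
    "Fare": 0.012480,
    "Age": 0.012747,
    "Sexmale": -1.306316,
    "Age:Sexmale": -0.037645,
}
se_sexmale = 0.413719
z_crit = 1.96  # assumed normal critical value for a 95% interval

# b. 95% confidence interval for the Sexmale coefficient (log-odds scale)
ci_sexmale = (coef["Sexmale"] - z_crit * se_sexmale,
              coef["Sexmale"] + z_crit * se_sexmale)

# c. Multiplicative change in odds for a female passenger who is 10 years older
odds_factor_female_10y = math.exp(10 * coef["Age"])

# d. Predicted survival probability for a 35-year-old male who paid $8
fare, age, male = 8, 35, 1
eta = (coef["Intercept"] + coef["Fare"] * fare + coef["Age"] * age
       + coef["Sexmale"] * male + coef["Age:Sexmale"] * age * male)
p_survive = 1 / (1 + math.exp(-eta))
print(ci_sexmale, odds_factor_female_10y, eta, p_survive)
```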
SA5
Suppose k ≤ n (where n is the sample size). For each of the following, decide if the statement is TRUE or FALSE and
explain why. Note: you MUST give sufficient justification for full credit!

a. The goal of cross-validation is to estimate the training error.

b. Repeated application of leave-one-out cross-validation will always produce the same estimation of error.
c. For any k, repeated application of k-fold cross-validation will always produce the same estimation of error.
d. Leave-one-out cross-validation is equivalent to k-fold cross validation when k is equal to the total number of
observations.
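
To experiment with these statements, the sketch below implements k-fold cross-validation for a deliberately simple model that predicts the training mean. The data values and the `kfold_mse` helper are assumptions for illustration. Folds are formed by shuffling the indices with a seed, so different seeds give different partitions unless every fold holds exactly one observation.

```python
import random
from statistics import mean

def kfold_mse(y, k, seed):
    """Cross-validated mean squared error of a mean-only model with k random folds."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)          # random assignment of points to folds
    folds = [idx[i::k] for i in range(k)]     # k roughly equal-sized folds
    sq_errors = []
    for fold in folds:
        held_out = set(fold)
        train = [y[i] for i in idx if i not in held_out]
        pred = mean(train)                    # "fit" on the training folds only
        sq_errors.extend((y[i] - pred) ** 2 for i in fold)
    return mean(sq_errors)

# Hypothetical response values (assumed for illustration only)
y = [3.1, 4.7, 2.2, 5.9, 4.1, 3.8, 6.4, 2.9, 5.0, 4.4]

# k = n is leave-one-out: every partition consists of the same n singletons
loocv_seed1 = kfold_mse(y, len(y), seed=1)
loocv_seed2 = kfold_mse(y, len(y), seed=2)

# k < n: the partition, and hence the estimate, can depend on the shuffle
five_fold_seed1 = kfold_mse(y, 5, seed=1)
five_fold_seed2 = kfold_mse(y, 5, seed=2)
print(loocv_seed1, loocv_seed2, five_fold_seed1, five_fold_seed2)
```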
