
Chapter 9 Multiple Linear Regression


Multiple Linear Regression

Pham Thi Viet Huong


Contents
• An example of multiple linear regression
• Matrix Approach to Regression
• Sampling Distribution
• Significance of Regression
• Nested Models
• Simulation

Pham Thi Viet Huong - VNU-IS 2


Opening
• It is rarely the case that a dataset will have a single predictor variable
• It is also rarely the case that a response variable will only depend on a
single variable
• In this chapter, we will extend our current linear model to allow a
response to depend on multiple predictors



After this chapter, you will be able to:
• Construct and interpret linear regression models with more than one
predictor
• Understand how regression models are derived using matrices
• Create interval estimates and perform hypothesis tests for multiple
regression parameters
• Formulate and interpret interval estimates for the mean response under
various conditions
• Compare nested models using an ANOVA F-Test



An example of multiple linear regression
• Consider the dataset auto-mpg from the UCI Machine Learning Repository

• We will focus on using 2 variables, wt and year, as predictor variables


• We would like to model the fuel
efficiency (mpg) of a car as a
function of its weight (wt) and
model year (year)
• We define the linear model $Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \varepsilon_i$, where $\varepsilon_i \sim N(0, \sigma^2)$

With two predictors, the fitted model is a plane rather than a line: find the plane that best fits the data.

Interpretation of the coefficients


- $\beta_0$ is simply the mean miles per gallon when all the predictors are 0
- $\beta_1$ is the average change in miles per gallon for a 1-pound increase in weight for a car of a
given model year
- $\beta_2$ is the average change in miles per gallon for a 1-year increase in model year for a car of a
given weight
Matrix Approach to Regression
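• Consider the model $Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_{p-1} x_{i(p-1)} + \varepsilon_i$, $i = 1, 2, \ldots, n$, where $\varepsilon_i \sim N(0, \sigma^2)$
• Matrix approach: stack the $n$ equations as $y = X\beta + \varepsilon$, where $X$ is the $n \times p$ matrix whose first column is all ones and whose remaining columns hold the predictors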

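• We can estimate $\beta$ by minimizing $f(\beta_0, \beta_1, \ldots, \beta_{p-1}) = \sum_{i=1}^{n} \left(y_i - (\beta_0 + \beta_1 x_{i1} + \cdots + \beta_{p-1} x_{i(p-1)})\right)^2$

• After taking the $p$ partial derivatives and setting each to zero, we obtain the $p$ normal equations

• Or we can write them as $X^T X \beta = X^T y$, hence $\hat{\beta} = (X^T X)^{-1} X^T y$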
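
The closed-form solution of the normal equations can be checked numerically. The following is a minimal sketch in plain Python, not the R workflow used in the course: the data are made up (they are not the auto-mpg values), and the helper names `transpose`, `matmul`, `inverse`, and `fit_ols` are illustrative assumptions.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def fit_ols(X, y):
    """Return beta_hat = (X^T X)^{-1} X^T y for a design matrix X (list of rows)."""
    Xt = transpose(X)
    C = inverse(matmul(Xt, X))                                     # (X^T X)^{-1}
    Xty = [[sum(x * yi for x, yi in zip(col, y))] for col in Xt]   # X^T y as a column
    return [row[0] for row in matmul(C, Xty)]

# Made-up design: a column of ones followed by two predictors.
X = [[1, 1, 4], [1, 2, 1], [1, 3, 5], [1, 4, 2], [1, 5, 6], [1, 6, 3]]
true_beta = [2.0, -3.0, 0.5]
# Noise-free response, so least squares should recover true_beta.
y = [sum(b * x for b, x in zip(true_beta, row)) for row in X]

beta_hat = fit_ols(X, y)
print(beta_hat)
```

Forming the explicit inverse mirrors the formula on the slide; numerical libraries usually solve the linear system directly instead, which is more stable when predictors are nearly collinear.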



Sampling Distribution
• Single Parameter Tests
• Confidence Intervals
• Confidence Intervals for Mean Response
• Prediction Intervals
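
Each of the items above rests on the sampling distribution of the least-squares estimator. As a reference sketch, the standard result under the model assumption $\varepsilon_i \sim N(0, \sigma^2)$ (independent errors) is given below; the symbols $C$ and $s_e$ are notation assumed here.

```latex
\hat{\beta} \sim N\!\left(\beta,\ \sigma^2 (X^T X)^{-1}\right),
\qquad C = (X^T X)^{-1},
\qquad s_e^2 = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - p}

\frac{\hat{\beta}_j - \beta_j}{s_e \sqrt{C_{jj}}} \sim t_{n-p}
```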



Single Parameter Tests
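• Test for a single $\beta_j$: $H_0: \beta_j = 0$ vs. $H_1: \beta_j \neq 0$
• The test statistic takes the form $TS = \frac{EST - HYP}{SE} = \frac{\hat{\beta}_j - 0}{s_e \sqrt{C_{jj}}}$, where $s_e$ is the residual standard error and $C = (X^T X)^{-1}$; under $H_0$ it follows a $t$ distribution with $n - p$ degrees of freedom

• Example: recall our model for mpg, $Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \varepsilon_i$
• $x_{i1}$: the weight of the $i$th car
• $x_{i2}$: the model year of the $i$th car
• Perform the test $H_0: \beta_1 = 0$ vs. $H_1: \beta_1 \neq 0$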
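
To make the test statistic concrete, here is a small sketch in plain Python that computes $\hat{\beta}_j$, its standard error $s_e \sqrt{C_{jj}}$, and the resulting $t$ statistic for every coefficient. The four-point data set is a made-up toy example (not the auto-mpg data), and the function names are illustrative assumptions. The p-value is not computed because the Python standard library has no $t$ distribution.

```python
import math

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def t_statistics(X, y):
    """Return (beta_hat, standard errors, t statistics) for H0: beta_j = 0."""
    n, p = len(X), len(X[0])
    Xt = transpose(X)
    C = inverse(matmul(Xt, X))                                     # C = (X^T X)^{-1}
    Xty = [[sum(x * yi for x, yi in zip(col, y))] for col in Xt]
    beta_hat = [row[0] for row in matmul(C, Xty)]
    residuals = [yi - sum(b * x for b, x in zip(beta_hat, row))
                 for row, yi in zip(X, y)]
    s_e = math.sqrt(sum(r * r for r in residuals) / (n - p))       # residual std. error
    std_errors = [s_e * math.sqrt(C[j][j]) for j in range(p)]
    t_stats = [b / se for b, se in zip(beta_hat, std_errors)]
    return beta_hat, std_errors, t_stats

# Toy data: intercept column plus two predictors coded as -1 / +1.
X = [[1, -1, -1], [1, -1, 1], [1, 1, -1], [1, 1, 1]]
y = [1.0, 2.0, 4.0, 7.0]

beta_hat, std_errors, t_stats = t_statistics(X, y)
print("beta_hat:", beta_hat)
print("t statistics:", t_stats)
```

The entry `t_stats[1]` is the statistic for $H_0: \beta_1 = 0$; it would be compared against a $t$ distribution with $n - p$ degrees of freedom.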


The result of this test is in the second row (the wt row) of the coefficient table in the R output.

By hypothesizing that $\beta_1 = 0$, the null and alternative specify 2 models:

$H_0: Y = \beta_0 + \beta_2 x_2 + \varepsilon$
$H_1: Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$

We are testing whether there is a relationship between weight and fuel efficiency, given
that a term for year is in the model.



Confidence Intervals
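
As a sketch, the standard $100(1 - \alpha)\%$ confidence interval for a single coefficient $\beta_j$ takes the following form, assuming the notation $C = (X^T X)^{-1}$ and residual standard error $s_e$ with $n - p$ degrees of freedom:

```latex
\hat{\beta}_j \pm t_{\alpha/2,\, n-p} \cdot s_e \sqrt{C_{jj}}
```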



Confidence Intervals for Mean Response
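• The mean response is $E[Y \mid X = x]$
• In SLR, the mean of $Y$ depends on only a single value $x$
• In multiple regression, the mean of $Y$ depends on the value of each
predictor
• We define the vector $x_0 = [1, x_{01}, x_{02}, \ldots, x_{0(p-1)}]^T$

• Then the estimated mean response is $\hat{y}(x_0) = x_0^T \hat{\beta}$
• Standard error: $SE[\hat{y}(x_0)] = s_e \sqrt{x_0^T (X^T X)^{-1} x_0}$, where $s_e$ is the residual standard error

• Hence, the confidence interval for the mean response is $\hat{y}(x_0) \pm t_{\alpha/2,\, n-p} \cdot s_e \sqrt{x_0^T (X^T X)^{-1} x_0}$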

EXAMPLE
• We add 2 new cars: one weighing 3500 pounds from model year 76, and one weighing 5000 pounds from model year 81



Caution: hidden extrapolation
• Both additional cars are within the observed range of each variable taken separately. How to check?
(check the ranges of the 2 variables: wt and year)

• Viewed jointly in the (wt, year) plane, one of the new cars is within the
observed values, while the other is
noticeably outside of the observed values
=> hidden extrapolation
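
Range checks on each variable cannot detect this problem, so a joint check is needed. One common diagnostic, which is not taken from these slides, compares the leverage of a new point, $h_{00} = x_0^T (X^T X)^{-1} x_0$, with the largest leverage among the observed points; a larger value suggests the new point lies outside the region covered by the data. The sketch below uses made-up data (not auto-mpg) and illustrative helper names.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def leverage(x0, C):
    """Quadratic form x0^T C x0, where x0 includes the leading 1."""
    k = len(x0)
    return sum(x0[i] * C[i][j] * x0[j] for i in range(k) for j in range(k))

# Made-up observations of two positively related predictors.
observed = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (1, 2), (2, 1), (4, 5), (5, 4)]
X = [[1, a, b] for a, b in observed]
C = inverse(matmul(transpose(X), X))
h_max = max(leverage(row, C) for row in X)

# Both new points are inside the marginal range [1, 5] of each predictor.
new_points = {"A": (3, 3), "B": (1, 5)}
in_marginal_range = {}
h_new = {}
extrapolating = {}
for name, point in new_points.items():
    in_marginal_range[name] = all(
        min(obs[k] for obs in observed) <= point[k] <= max(obs[k] for obs in observed)
        for k in range(2)
    )
    h_new[name] = leverage([1, *point], C)
    extrapolating[name] = h_new[name] > h_max
    print(name, in_marginal_range[name], round(h_new[name], 4), extrapolating[name])
```

Point B passes both per-variable range checks but has a leverage far above that of any observed point, which is the situation the slide calls hidden extrapolation.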



Prediction Intervals
• When creating prediction intervals, we need to account for the
additional variability of an observation about its mean
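
As a sketch of how that additional variability enters, the standard prediction interval for a new observation at $x_0 = [1, x_{01}, \ldots, x_{0(p-1)}]^T$ adds 1 under the square root of the mean-response standard error. The notation $\hat{y}(x_0) = x_0^T \hat{\beta}$ and $s_e$ (residual standard error) is assumed here.

```latex
\hat{y}(x_0) \pm t_{\alpha/2,\, n-p} \cdot s_e \sqrt{1 + x_0^T (X^T X)^{-1} x_0}
```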



Significance of Regression
• The decomposition of variation that we had seen in SLR still holds for
MLR

• In R, $R^2 = 0.8082$. Interpretation: 80.82% of the
observed variation in miles per
gallon is explained by the linear
relationship with the two predictor
variables, weight and year
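
For reference, the decomposition of variation carried over from SLR and the resulting $R^2$ can be written as follows. The labels SST, SSE, and SSReg are conventional names assumed here.

```latex
\underbrace{\sum_{i=1}^{n} (y_i - \bar{y})^2}_{\text{SST}}
= \underbrace{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}_{\text{SSE}}
+ \underbrace{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}_{\text{SSReg}},
\qquad
R^2 = \frac{\text{SSReg}}{\text{SST}}
```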



Significance of Regression Test
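• The hypotheses: $H_0: \beta_1 = \beta_2 = \cdots = \beta_{p-1} = 0$
$H_1$: at least one of $\beta_j \neq 0$, $j = 1, 2, \ldots, p-1$

• Use F-test
• ANOVA table: splits the total variation into a regression part with $p - 1$ degrees of freedom and an error part with $n - p$ degrees of freedom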

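• The F statistic is $F = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 / (p - 1)}{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 / (n - p)}$, where $\hat{y}_i$ are the fitted values and $\bar{y}$ is the sample mean of the response

• The p-value is calculated as $P(F_{p-1,\, n-p} > F)$

• Reject the null hypothesis for large values of F. A large F corresponds
to a large portion of the variance being explained by the regression.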



Significance of Regression Test in R
• 2 steps:
• Specify the 2 models in R (the null, intercept-only model and the full model) and save the results in different variables
• Use anova() to compare the 2 models
• Back to our example

F = 815.55 and the p-value is
extremely low
=> Reject the null hypothesis.
=> The regression is significant
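
The slides carry out these two steps with R's `anova()`. The same F statistic can be sketched by hand: fit a null model and a full model, then compare their residual sums of squares. The plain-Python sketch below uses a made-up toy data set (not auto-mpg, so it does not reproduce F = 815.55), and the function names are illustrative assumptions. The p-value is omitted because the Python standard library has no F distribution.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def residual_ss(X, y):
    """Residual sum of squares of the least-squares fit of y on the columns of X."""
    Xt = transpose(X)
    C = inverse(matmul(Xt, X))
    Xty = [[sum(x * yi for x, yi in zip(col, y))] for col in Xt]
    beta_hat = [row[0] for row in matmul(C, Xty)]
    fitted = [sum(b * x for b, x in zip(beta_hat, row)) for row in X]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

def f_statistic(X_null, X_full, y):
    """F statistic and degrees of freedom for a null model nested in a full model."""
    n, q, p = len(y), len(X_null[0]), len(X_full[0])
    sse_null, sse_full = residual_ss(X_null, y), residual_ss(X_full, y)
    f = ((sse_null - sse_full) / (p - q)) / (sse_full / (n - p))
    return f, p - q, n - p

# Toy data: the full model has an intercept and two predictors,
# the null model has the intercept only (significance of regression).
y = [1.0, 2.0, 4.0, 7.0]
X_full = [[1, -1, -1], [1, -1, 1], [1, 1, -1], [1, 1, 1]]
X_null = [[1] for _ in y]

f_value, df_num, df_den = f_statistic(X_null, X_full, y)
print(f_value, df_num, df_den)
```

With an intercept-only null model, the numerator degrees of freedom are $p - 1$ and the statistic coincides with the significance-of-regression F; passing a larger null model gives the general nested-model comparison.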

• We can also verify the sums of squares and degrees of freedom



Nested Models
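• Nested models: 2 models, where one model is “nested” inside the other,
meaning the smaller model contains a subset of the predictors of the larger
model.
• Consider the full model $Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_{p-1} x_{i(p-1)} + \varepsilon_i$
• The fitted values of this model are denoted $\hat{y}_{1i}$
• Let the null model be $Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_{q-1} x_{i(q-1)} + \varepsilon_i$,
where $q < p$
• The fitted values of this model are denoted $\hat{y}_{0i}$
• The difference between the 2 models can be codified by the null hypothesis of a
test: $H_0: \beta_q = \beta_{q+1} = \cdots = \beta_{p-1} = 0$

(the $\beta$ from the full model that are not in the null model are zero)
• The resulting model, which is nested, is the null model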
Nested Models Test
ANOVA table
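
As a sketch of what the ANOVA table leads to, the standard F statistic for comparing a null model with $q$ parameters against a full model with $p$ parameters is shown below, with $\hat{y}_{1i}$ and $\hat{y}_{0i}$ the fitted values of the full and null models. The difference between the models carries $p - q$ degrees of freedom and the full-model error carries $n - p$.

```latex
F = \frac{\sum_{i=1}^{n} (\hat{y}_{1i} - \hat{y}_{0i})^2 / (p - q)}
         {\sum_{i=1}^{n} (y_i - \hat{y}_{1i})^2 / (n - p)}
\sim F_{p-q,\, n-p} \quad \text{under } H_0
```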



Nested Models – Example
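- Dataset autompg
- Consider 2 different models: the full model with predictors cyl, disp, hp, wt, acc, and year, and the null model with only wt and year

- The null hypothesis $H_0: \beta_{cyl} = \beta_{disp} = \beta_{hp} = \beta_{acc} = 0$


- The alternative is simply that at least one of the $\beta_j$ from the null hypothesis is not 0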


The F statistic is 0.533 and the p-value is large, so we fail to reject the null hypothesis. Hence, none of
cyl, disp, hp, and acc is significant with wt and year already in the model



Simulation
Self-learning
