Linear Regression Analysis: Module - II
MODULE – II
Lecture - 5
Consider the linear regression model in centered form

$$y_i = \beta_0^* + \beta_1 (x_i - \bar{x}) + \varepsilon_i,$$

where $\beta_0^* = \beta_0 + \beta_1 \bar{x}$. The least squares estimators of $\beta_0^*$ and $\beta_1$ are $b_0^* = \bar{y}$ and $b_1 = \dfrac{s_{xy}}{s_{xx}}$, respectively.
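For reference, these estimators follow directly from minimizing the residual sum of squares in the centered model; a brief sketch:

$$\frac{\partial}{\partial \beta_0^*}\sum_{i=1}^{n}\left[y_i - \beta_0^* - \beta_1(x_i-\bar{x})\right]^2 = 0 \;\Rightarrow\; b_0^* = \bar{y},$$

since $\sum_{i=1}^{n}(x_i - \bar{x}) = 0$, and

$$\frac{\partial}{\partial \beta_1}\sum_{i=1}^{n}\left[y_i - \beta_0^* - \beta_1(x_i-\bar{x})\right]^2 = 0 \;\Rightarrow\; b_1 = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2} = \frac{s_{xy}}{s_{xx}}.$$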
Using the results

$$E(b_0^*) = \beta_0^*, \qquad E(b_1) = \beta_1, \qquad \operatorname{Var}(b_0^*) = \frac{\sigma^2}{n}, \qquad \operatorname{Var}(b_1) = \frac{\sigma^2}{s_{xx}},$$

we have, when $\sigma^2$ is known,

$$\frac{b_0^* - \beta_0^*}{\sqrt{\sigma^2/n}} \sim N(0,1)$$

and

$$\frac{b_1 - \beta_1}{\sqrt{\sigma^2/s_{xx}}} \sim N(0,1).$$
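For illustration, when $\sigma^2$ is known these pivotal quantities yield the standard $100(1-\alpha)\%$ confidence interval for $\beta_1$,

$$b_1 - z_{\alpha/2}\sqrt{\frac{\sigma^2}{s_{xx}}} \;\le\; \beta_1 \;\le\; b_1 + z_{\alpha/2}\sqrt{\frac{\sigma^2}{s_{xx}}},$$

where $z_{\alpha/2}$ is the upper $\alpha/2$ point of $N(0,1)$; the interval for $\beta_0^*$ follows analogously using $\operatorname{Var}(b_0^*) = \sigma^2/n$.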
Moreover, both statistics are independently distributed. Thus

$$\left(\frac{b_0^* - \beta_0^*}{\sqrt{\sigma^2/n}}\right)^2 \sim \chi_1^2$$

and

$$\left(\frac{b_1 - \beta_1}{\sqrt{\sigma^2/s_{xx}}}\right)^2 \sim \chi_1^2$$

are also independently distributed because $b_0^*$ and $b_1$ are independently distributed. Consequently, the sum of these two statistics is distributed as

$$\frac{n(b_0^* - \beta_0^*)^2}{\sigma^2} + \frac{s_{xx}(b_1 - \beta_1)^2}{\sigma^2} \sim \chi_2^2.$$
Since

$$\frac{SS_{res}}{\sigma^2} \sim \chi_{n-2}^2$$

independently of $b_0^*$ and $b_1$, the ratio of these two $\chi^2$ statistics, each divided by its degrees of freedom, follows an $F$ distribution. Substituting $b_0^* = b_0 + b_1 \bar{x}$ and $\beta_0^* = \beta_0 + \beta_1 \bar{x}$, we get

$$\frac{n-2}{2}\,\frac{Q_f}{SS_{res}} \sim F_{2,\,n-2},$$
where

$$Q_f = n(b_0 - \beta_0)^2 + 2\sum_{i=1}^{n} x_i\,(b_0 - \beta_0)(b_1 - \beta_1) + \sum_{i=1}^{n} x_i^2\,(b_1 - \beta_1)^2.$$
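To see where $Q_f$ comes from, note that the $\chi_2^2$ statistic above equals $\left[n(b_0^* - \beta_0^*)^2 + s_{xx}(b_1 - \beta_1)^2\right]/\sigma^2$; substituting $b_0^* - \beta_0^* = (b_0 - \beta_0) + \bar{x}(b_1 - \beta_1)$ and using $n\bar{x} = \sum_{i} x_i$ and $n\bar{x}^2 + s_{xx} = \sum_{i} x_i^2$ gives

$$n\left[(b_0 - \beta_0) + \bar{x}(b_1 - \beta_1)\right]^2 + s_{xx}(b_1 - \beta_1)^2 = n(b_0 - \beta_0)^2 + 2\sum_{i=1}^{n} x_i\,(b_0 - \beta_0)(b_1 - \beta_1) + \sum_{i=1}^{n} x_i^2\,(b_1 - \beta_1)^2 = Q_f.$$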
Since

$$P\left[\frac{n-2}{2}\,\frac{Q_f}{SS_{res}} \le F_{2,\,n-2;\,\alpha}\right] = 1 - \alpha$$

holds true for all values of $\beta_0$ and $\beta_1$, the $100(1-\alpha)\%$ confidence region for $\beta_0$ and $\beta_1$ is

$$\frac{n-2}{2}\,\frac{Q_f}{SS_{res}} \le F_{2,\,n-2;\,\alpha}.$$
This confidence region is an ellipse which contains $\beta_0$ and $\beta_1$ simultaneously with probability $100(1-\alpha)\%$.
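As a minimal numerical sketch of how this region can be used in practice (the data below are hypothetical, and NumPy/SciPy are assumed to be available), the following checks whether a candidate pair $(\beta_0, \beta_1)$ falls inside the joint ellipse:

```python
import numpy as np
from scipy import stats

# Hypothetical data for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
n = len(x)

# Least squares estimates b0 and b1.
sxx = np.sum((x - x.mean()) ** 2)
sxy = np.sum((x - x.mean()) * (y - y.mean()))
b1 = sxy / sxx
b0 = y.mean() - b1 * x.mean()

# Residual sum of squares SS_res.
ss_res = np.sum((y - (b0 + b1 * x)) ** 2)

def in_joint_region(beta0, beta1, alpha=0.05):
    """True if (beta0, beta1) lies inside the 100(1 - alpha)% joint ellipse."""
    qf = (n * (b0 - beta0) ** 2
          + 2.0 * np.sum(x) * (b0 - beta0) * (b1 - beta1)
          + np.sum(x ** 2) * (b1 - beta1) ** 2)
    return (n - 2) / 2.0 * qf / ss_res <= stats.f.ppf(1 - alpha, 2, n - 2)

print(in_joint_region(0.0, 2.0))
```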
Analysis of variance
The technique of analysis of variance is usually used for testing hypotheses about the equality of more than one parameter, such as population means or slope parameters. It is more meaningful in the case of the multiple regression model, where there is more than one slope parameter. The technique is discussed and illustrated here to introduce the basic concepts and fundamentals that will be used in the next module to develop the analysis of variance for the multiple linear regression model, where there are more than two explanatory variables.
A test statistic for testing H 0 : β1 = 0 can also be formulated using the analysis of variance technique as follows.
Consider

$$\sum_{i=1}^{n}(y_i - \hat{y}_i)^2 = \sum_{i=1}^{n}\left[(y_i - \bar{y}) - (\hat{y}_i - \bar{y})\right]^2 = \sum_{i=1}^{n}(y_i - \bar{y})^2 + \sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2 - 2\sum_{i=1}^{n}(y_i - \bar{y})(\hat{y}_i - \bar{y}).$$
Further, consider

$$\sum_{i=1}^{n}(y_i - \bar{y})(\hat{y}_i - \bar{y}) = \sum_{i=1}^{n}(y_i - \bar{y})\, b_1 (x_i - \bar{x}) = b_1^2 \sum_{i=1}^{n}(x_i - \bar{x})^2 = \sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2,$$

using $\hat{y}_i - \bar{y} = b_1(x_i - \bar{x})$ and $\sum_{i=1}^{n}(y_i - \bar{y})(x_i - \bar{x}) = s_{xy} = b_1 s_{xx}$.
Thus we have

$$\sum_{i=1}^{n}(y_i - \bar{y})^2 = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2 + \sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2.$$
The term $\sum_{i=1}^{n}(y_i - \bar{y})^2$ is called the sum of squares about the mean, or the corrected sum of squares of $y$ (i.e., $SS_{corrected}$), or the total sum of squares, denoted by $s_{yy}$.
The term $\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$ describes the deviation of the observations from the predicted values, viz., the residual sum of squares

$$SS_{res} = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2,$$
n
whereas the term ∑ ( yˆ − y )
i =1
i
2
describes the proportion of variability explained by regression,
n
=
SS reg ∑ ( yˆ − y ) .
i =1
i
2
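A quick numerical check of this decomposition (hypothetical data; NumPy assumed):

```python
import numpy as np

# Hypothetical data for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.8, 4.1, 5.9, 8.2, 9.8])

# Fitted values from the least squares line.
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

s_yy = np.sum((y - y.mean()) ** 2)        # total (corrected) sum of squares
ss_res = np.sum((y - y_hat) ** 2)         # residual sum of squares
ss_reg = np.sum((y_hat - y.mean()) ** 2)  # regression sum of squares

# s_yy = SS_res + SS_reg should hold up to floating point rounding.
print(np.isclose(s_yy, ss_res + ss_reg))
```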
If all observations $y_i$ lie on a straight line, then $\sum_{i=1}^{n}(y_i - \hat{y}_i)^2 = 0$, and thus $SS_{corrected} = SS_{reg}$.
Note that $SS_{reg}$ is completely determined by $b_1$ and so has only one degree of freedom. The total sum of squares $s_{yy} = \sum_{i=1}^{n}(y_i - \bar{y})^2$ has $(n-1)$ degrees of freedom due to the constraint $\sum_{i=1}^{n}(y_i - \bar{y}) = 0$, and $SS_{res}$ has $(n-2)$ degrees of freedom.
If the errors are normally distributed, then under $H_0: \beta_1 = 0$ the sums of squares $SS_{reg}$ and $SS_{res}$, each divided by $\sigma^2$, are independently distributed as $\chi^2_{df}$ with their respective degrees of freedom.
The mean square due to regression is

$$MS_{reg} = \frac{SS_{reg}}{1}$$

and the mean square due to residuals is

$$MSE = \frac{SS_{res}}{n-2}.$$
The test statistic for testing $H_0: \beta_1 = 0$ is

$$F_0 = \frac{MS_{reg}}{MSE}.$$
If $H_0: \beta_1 = 0$ is true, then $MS_{reg}$ and $MSE$ are independently distributed, and thus $F_0 \sim F_{1,\,n-2}$. The decision rule is to reject $H_0$ at level $\alpha$ whenever $F_0 > F_{1,\,n-2;\,\alpha}$.
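A minimal sketch of the resulting ANOVA $F$ test (hypothetical data; NumPy and SciPy assumed):

```python
import numpy as np
from scipy import stats

# Hypothetical data for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
y = np.array([2.3, 3.1, 4.9, 6.2, 6.8, 8.9, 9.6])
n = len(x)

# Least squares fit and fitted values.
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

ms_reg = np.sum((y_hat - y.mean()) ** 2) / 1.0  # MS_reg, 1 degree of freedom
mse = np.sum((y - y_hat) ** 2) / (n - 2)        # MSE, n - 2 degrees of freedom

f0 = ms_reg / mse
p_value = stats.f.sf(f0, 1, n - 2)  # P(F_{1, n-2} > f0)
print(f0, p_value)  # reject H0: beta1 = 0 when p_value < alpha
```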
Some other forms of $SS_{reg}$, $SS_{res}$ and $s_{yy}$ can be derived as follows. The sample correlation coefficient may be written as

$$r_{xy} = \frac{s_{xy}}{\sqrt{s_{xx}\, s_{yy}}}.$$

Moreover, we have

$$b_1 = \frac{s_{xy}}{s_{xx}} = r_{xy} \sqrt{\frac{s_{yy}}{s_{xx}}}.$$
Then

$$SS_{res} = \sum_{i=1}^{n}\left[(y_i - \bar{y}) - b_1 (x_i - \bar{x})\right]^2 = s_{yy} - b_1^2\, s_{xx} = s_{yy} - \frac{(s_{xy})^2}{s_{xx}}.$$
Also, $SS_{corrected} = s_{yy}$, and

$$SS_{reg} = s_{yy} - SS_{res} = \frac{(s_{xy})^2}{s_{xx}} = b_1^2\, s_{xx} = b_1 s_{xy}.$$
It can be noted that a fitted model can be said to be good when the residuals are small. Since $SS_{res}$ is based on the residuals, a measure of the quality of the fitted model can be based on $SS_{res}$. When an intercept term is present in the model, a measure of goodness of fit of the model is given by

$$R^2 = 1 - \frac{SS_{res}}{s_{yy}} = \frac{SS_{reg}}{s_{yy}}.$$
This is known as the coefficient of determination. The measure is based on how much of the variation in the $y$'s, as measured by $s_{yy}$, is explained by $SS_{reg}$, and how much of the unexplained part is contained in $SS_{res}$. The ratio $SS_{reg}/s_{yy}$ describes the proportion of variability explained by the regression relative to the total variability of $y$, while the ratio $SS_{res}/s_{yy}$ describes the proportion of variability not explained by the regression.
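It also follows from the identities above that, in the simple linear regression model, the coefficient of determination is the square of the sample correlation coefficient: combining $SS_{reg} = (s_{xy})^2/s_{xx}$ with $r_{xy} = s_{xy}/\sqrt{s_{xx}\,s_{yy}}$ gives

$$R^2 = \frac{SS_{reg}}{s_{yy}} = \frac{(s_{xy})^2}{s_{xx}\, s_{yy}} = r_{xy}^2.$$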