
Linear Regression Analysis

Lecture 3
Simple Linear Regression Model
Least squares estimates of the regression parameters
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n}(X_i - \bar{X})^2}$$

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$$
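For readers who want to compute these in code, here is a minimal Python sketch of the two formulas above; the function name least_squares_fit is ours, chosen for illustration:

```python
def least_squares_fit(x, y):
    """Least squares estimates (b0, b1) for the simple linear model Y = b0 + b1*X.

    Implements b1 = S_xy / S_xx and b0 = Ybar - b1 * Xbar from the slide above.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))  # S_xy
    s_xx = sum((xi - x_bar) ** 2 for xi in x)                        # S_xx
    b1 = s_xy / s_xx           # slope estimate
    b0 = y_bar - b1 * x_bar    # intercept estimate
    return b0, b1
```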
Example 1
Consider the following data on height (inches) and weight (lbs) of 10 individuals.
a. Identify the dependent and the independent variable.
b. Draw a scatter diagram for the given data. Interpret.
c. Fit a simple linear regression model using the least squares method to estimate the parameters. Show the calculations and interpret your results.

Height 63 64 66 69 69 71 71 72 73 75
Weight 127 121 142 157 162 156 169 165 181 208
Solution
a. Height is the independent variable and Weight is the dependent variable.
b. [Scatter diagram: Weight (lbs) plotted against Height (inches). The points rise steadily from left to right, indicating a positive, approximately linear relationship between height and weight.]
c.
Ht (X)   Wt (Y)   X²      Y²       XY
63       127      3969    16129    8001
64       121      4096    14641    7744
66       142      4356    20164    9372
69       157      4761    24649    10833
69       162      4761    26244    11178
71       156      5041    24336    11076
71       169      5041    28561    11999
72       165      5184    27225    11880
73       181      5329    32761    13213
75       208      5625    43264    15600
Sum      693      1588    48163    257974   110896
1 = 6.137581
0 = - 266.534

𝑌 = 0 + 1 𝑋= - 266.534 + 6.137581 X

• Interpretation: the slope 6.137581 means that each additional inch of height is associated with an estimated increase of about 6.14 lbs in mean weight. The intercept -266.534 is the fitted weight at height zero, which lies far outside the observed data range and has no practical meaning here.
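As a quick check (a sketch assuming NumPy is available; np.polyfit with degree 1 fits a least-squares line), the estimates can be reproduced from the solution-table data:

```python
import numpy as np

# Height/weight data from the solution table
height = [63, 64, 66, 69, 69, 71, 71, 72, 73, 75]
weight = [127, 121, 142, 157, 162, 156, 169, 165, 181, 208]

# np.polyfit(x, y, 1) returns (slope, intercept) of the least-squares line
b1, b0 = np.polyfit(height, weight, 1)
print(f"Y-hat = {b0:.3f} + {b1:.6f} X")   # Y-hat = -266.534 + 6.137581 X
```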
Example 2
Consider the following data on Temperature and yield of 15 plots.
a. Identify the dependent and the independent variable.
b. Draw a scatter diagram for the given data. Interpret.
c. Fit a simple linear regression model using the least squares method to estimate the parameters. Show the calculations and interpret your results.
i    Temp (X)   Yield (Y)
1    50         3.30
2    50         2.80
3    50         2.90
4    70         2.30
5    70         2.60
6    70         2.10
7    80         2.50
8    80         2.90
9    80         2.40
10   90         3.00
11   90         3.10
12   90         2.80
13   100        3.30
14   100        3.50
15   100        3.00
Solution:
a. Temperature is the independent variable and Yield is the dependent variable.

b. [Scatter diagram: Yield plotted against Temperature. Yield falls as temperature rises from 50 to 70 and then increases again up to 100, so the pattern is curved (roughly U-shaped) rather than strictly linear.]
i     Temp (X)   Yield (Y)   X²         Y²       XY
1     50         3.30        2500.00    10.89    165.00
2     50         2.80        2500.00    7.84     140.00
3     50         2.90        2500.00    8.41     145.00
4     70         2.30        4900.00    5.29     161.00
5     70         2.60        4900.00    6.76     182.00
6     70         2.10        4900.00    4.41     147.00
7     80         2.50        6400.00    6.25     200.00
8     80         2.90        6400.00    8.41     232.00
9     80         2.40        6400.00    5.76     192.00
10    90         3.00        8100.00    9.00     270.00
11    90         3.10        8100.00    9.61     279.00
12    90         2.80        8100.00    7.84     252.00
13    100        3.30        10000.00   10.89    330.00
14    100        3.50        10000.00   12.25    350.00
15    100        3.00        10000.00   9.00     300.00
Sum   1170       42.5        95700      122.61   3345
b. 1 = 0.006757

0 = 2.306306

𝑌 = 0 + 1 𝑋= 2.306306 + 0.006757 X
[Scatter diagram of Yield against Temperature with the fitted line $\hat{Y} = 2.306306 + 0.006757\,X$ overlaid; the nearly flat line passes through the middle of the curved point cloud.]
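The same check as in Example 1 (a sketch assuming NumPy is available) reproduces these estimates:

```python
import numpy as np

# Temperature/yield data from the table above
temp = [50, 50, 50, 70, 70, 70, 80, 80, 80, 90, 90, 90, 100, 100, 100]
yld  = [3.3, 2.8, 2.9, 2.3, 2.6, 2.1, 2.5, 2.9, 2.4, 3.0, 3.1, 2.8, 3.3, 3.5, 3.0]

b1, b0 = np.polyfit(temp, yld, 1)
print(f"Y-hat = {b0:.6f} + {b1:.6f} X")   # Y-hat = 2.306306 + 0.006757 X
```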
Mean and variance of the OLS estimators
• The OLS estimator of $\beta_1$ is

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n}(X_i - \bar{X})^2} = \frac{\sum_{i=1}^{n} Y_i(X_i - \bar{X})}{\sum_{i=1}^{n}(X_i - \bar{X})^2} - \bar{Y}\,\frac{\sum_{i=1}^{n}(X_i - \bar{X})}{\sum_{i=1}^{n}(X_i - \bar{X})^2}$$

$$= \frac{\sum_{i=1}^{n} Y_i(X_i - \bar{X})}{\sum_{i=1}^{n}(X_i - \bar{X})^2} = \sum_{i=1}^{n} k_i Y_i, \qquad \text{where } k_i = \frac{X_i - \bar{X}}{\sum_{i=1}^{n}(X_i - \bar{X})^2}.$$

Note that

$$\sum_{i=1}^{n} k_i = \frac{\sum_{i=1}^{n}(X_i - \bar{X})}{\sum_{i=1}^{n}(X_i - \bar{X})^2} = 0 \qquad \text{and} \qquad \sum_{i=1}^{n} k_i X_i = \frac{\sum_{i=1}^{n} X_i(X_i - \bar{X})}{\sum_{i=1}^{n}(X_i - \bar{X})^2} = 1.$$

Hence

$$E(\hat{\beta}_1) = \sum_{i=1}^{n} k_i\,E(Y_i) = \sum_{i=1}^{n} k_i\,E(\beta_0 + \beta_1 X_i + \epsilon_i) = \beta_0\sum_{i=1}^{n} k_i + \beta_1\sum_{i=1}^{n} k_i X_i + \sum_{i=1}^{n} k_i\,E(\epsilon_i) = \beta_1.$$
Variance of the estimators $\hat{\beta}_0$ and $\hat{\beta}_1$

• Using the assumption that the $Y_i$ are independently distributed, the variance of $\hat{\beta}_1$ is

$$V(\hat{\beta}_1) = \sum_{i=1}^{n} k_i^2\,V(Y_i) + \sum_{i=1}^{n}\sum_{j \neq i} k_i k_j\,\mathrm{cov}(Y_i, Y_j) \qquad \left[\text{since } \hat{\beta}_1 = \sum_{i=1}^{n} k_i Y_i\right]$$

$$= \sum_{i=1}^{n} k_i^2\,V(\beta_0 + \beta_1 X_i + \epsilon_i) + 0$$

$$= \sigma^2 \sum_{i=1}^{n} \left[\frac{X_i - \bar{X}}{\sum_{i=1}^{n}(X_i - \bar{X})^2}\right]^2 = \sigma^2\,\frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{\left[\sum_{i=1}^{n}(X_i - \bar{X})^2\right]^2} = \frac{\sigma^2}{\sum_{i=1}^{n}(X_i - \bar{X})^2} = \frac{\sigma^2}{S_{xx}}$$
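A Monte Carlo sketch can confirm $V(\hat{\beta}_1) = \sigma^2/S_{xx}$; the parameter values and design points are again our illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
beta0, beta1, sigma = 2.0, 0.5, 1.0        # illustrative true values
x = np.linspace(50, 100, 15)
s_xx = ((x - x.mean()) ** 2).sum()

slopes = [np.polyfit(x, beta0 + beta1 * x + rng.normal(0, sigma, x.size), 1)[0]
          for _ in range(20000)]

print(np.var(slopes))       # empirical variance of beta1-hat
print(sigma ** 2 / s_xx)    # theoretical sigma^2 / S_xx: the two agree closely
```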
The OLS estimator of $\beta_0$ is

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1\bar{X}$$

Then

$$E(\hat{\beta}_0) = E(\bar{Y} - \hat{\beta}_1\bar{X}) = E(\bar{Y}) - \bar{X}\,E(\hat{\beta}_1) = E(\beta_0 + \beta_1\bar{X} + \bar{\epsilon}) - \beta_1\bar{X} = \beta_0$$
Variance of $\hat{\beta}_0$

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1\bar{X}$$

Then

$$V(\hat{\beta}_0) = V(\bar{Y} - \hat{\beta}_1\bar{X}) = V(\bar{Y}) + V(\hat{\beta}_1\bar{X}) - 2\,\mathrm{cov}(\bar{Y}, \hat{\beta}_1\bar{X})$$

$$= V(\beta_0 + \beta_1\bar{X} + \bar{\epsilon}) + \bar{X}^2\,V(\hat{\beta}_1) - 2\,\mathrm{cov}(\beta_0 + \beta_1\bar{X} + \bar{\epsilon},\ \hat{\beta}_1\bar{X})$$

$$= V(\bar{\epsilon}) + \bar{X}^2\,V(\hat{\beta}_1) - 2\,\mathrm{cov}(\bar{\epsilon}, \hat{\beta}_1\bar{X}) = \frac{\sigma^2}{n} + \bar{X}^2\,\frac{\sigma^2}{S_{xx}} - 2\,\mathrm{cov}(\bar{\epsilon}, \hat{\beta}_1\bar{X})$$

Note that

$$Y_i - \bar{Y} = (\beta_0 + \beta_1 X_i + \epsilon_i) - (\beta_0 + \beta_1\bar{X} + \bar{\epsilon}) = \beta_1(X_i - \bar{X}) + (\epsilon_i - \bar{\epsilon}),$$

so

$$\hat{\beta}_1 = \beta_1 + \frac{\sum_{i=1}^{n}(X_i - \bar{X})(\epsilon_i - \bar{\epsilon})}{\sum_{i=1}^{n}(X_i - \bar{X})^2},$$

and since $\bar{\epsilon}\sum_{i=1}^{n}(X_i - \bar{X}) = 0$,

$$\hat{\beta}_1 - \beta_1 = \frac{\sum_{i=1}^{n}(X_i - \bar{X})\epsilon_i}{\sum_{i=1}^{n}(X_i - \bar{X})^2}.$$

Now

$$\mathrm{cov}(\bar{\epsilon}, \hat{\beta}_1\bar{X}) = E\{[\bar{\epsilon} - E(\bar{\epsilon})][\hat{\beta}_1\bar{X} - E(\hat{\beta}_1\bar{X})]\} = \bar{X}\,E[\bar{\epsilon}\,(\hat{\beta}_1 - \beta_1)]$$

$$= \bar{X}\,E\left[\frac{1}{n}\sum_{j=1}^{n}\epsilon_j \cdot \frac{\sum_{i=1}^{n}(X_i - \bar{X})\epsilon_i}{\sum_{i=1}^{n}(X_i - \bar{X})^2}\right]$$

$$= \frac{\bar{X}}{n\sum_{i=1}^{n}(X_i - \bar{X})^2}\left[\sum_{i=1}^{n}(X_i - \bar{X})\,E(\epsilon_i^2) + \sum_{i \neq j}(X_i - \bar{X})\,E(\epsilon_i\epsilon_j)\right]$$

$$= \frac{\bar{X}}{n\sum_{i=1}^{n}(X_i - \bar{X})^2}\left[\sigma^2\sum_{i=1}^{n}(X_i - \bar{X}) + 0\right] = 0.$$

So

$$V(\hat{\beta}_0) = \frac{\sigma^2}{n} + \bar{X}^2\,\frac{\sigma^2}{S_{xx}} = \sigma^2\left(\frac{1}{n} + \frac{\bar{X}^2}{S_{xx}}\right).$$

Covariance of the two estimators:

$$\mathrm{cov}(\hat{\beta}_0, \hat{\beta}_1) = \mathrm{cov}(\bar{Y} - \hat{\beta}_1\bar{X}, \hat{\beta}_1) = \mathrm{cov}(\bar{Y}, \hat{\beta}_1) - \bar{X}\,V(\hat{\beta}_1) = 0 - \frac{\bar{X}\sigma^2}{S_{xx}} = -\frac{\bar{X}\sigma^2}{S_{xx}}$$

$\mathrm{cov}(\bar{Y}, \hat{\beta}_1) = 0$ is shown in the derivation of $V(\hat{\beta}_0)$.
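Both $V(\hat{\beta}_0) = \sigma^2(1/n + \bar{X}^2/S_{xx})$ and $\mathrm{cov}(\hat{\beta}_0, \hat{\beta}_1) = -\bar{X}\sigma^2/S_{xx}$ can be checked the same way (a sketch with our illustrative parameter values):

```python
import numpy as np

rng = np.random.default_rng(2)
beta0, beta1, sigma = 2.0, 0.5, 1.0        # illustrative true values
x = np.linspace(50, 100, 15)
n, x_bar = x.size, x.mean()
s_xx = ((x - x_bar) ** 2).sum()

b0s, b1s = [], []
for _ in range(20000):
    y = beta0 + beta1 * x + rng.normal(0, sigma, n)
    b1, b0 = np.polyfit(x, y, 1)
    b0s.append(b0)
    b1s.append(b1)

print(np.var(b0s), sigma**2 * (1/n + x_bar**2 / s_xx))    # V(beta0-hat)
print(np.cov(b0s, b1s)[0, 1], -x_bar * sigma**2 / s_xx)   # cov(beta0-hat, beta1-hat)
```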
So:

• $\hat{\beta}_0$ is an unbiased estimator of $\beta_0$.

• $\hat{\beta}_1$ is an unbiased estimator of $\beta_1$.

The OLS estimators $\hat{\beta}_0$ and $\hat{\beta}_1$ possess the minimum variance in the class of linear unbiased estimators and are therefore termed BLUE (best linear unbiased estimators). This result is the Gauss-Markov theorem.
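The Gauss-Markov property can be illustrated by comparing the OLS slope with another linear unbiased estimator of $\beta_1$, for example the endpoint slope $(Y_n - Y_1)/(X_n - X_1)$: in simulation both are unbiased, but OLS has the smaller variance. The parameter values below are again our illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(3)
beta0, beta1, sigma = 2.0, 0.5, 1.0        # illustrative true values
x = np.linspace(50, 100, 15)

ols, endpoint = [], []
for _ in range(20000):
    y = beta0 + beta1 * x + rng.normal(0, sigma, x.size)
    ols.append(np.polyfit(x, y, 1)[0])                  # OLS slope
    endpoint.append((y[-1] - y[0]) / (x[-1] - x[0]))    # another linear unbiased estimator

print(np.mean(ols), np.mean(endpoint))   # both near 0.5: unbiased
print(np.var(ols), np.var(endpoint))     # OLS variance is the smaller one (BLUE)
```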
