1 Simple Linear Regression
Reference: Montgomery, Douglas C. “Chapter 2: Simple Linear Regression.” In Introduction to Linear Regression Analysis. John Wiley & Sons.
The simple linear regression model is a model with a single regressor 𝑥 that
has a straight-line relationship with a response 𝑦:
y = β₀ + β₁x + ε

where E(ε) = 0, Var(ε) = σ², and the errors are uncorrelated, i.e., Cov(εᵢ, εⱼ) = 0 for i ≠ j.
The regressor x is controlled by the data analyst and measured with negligible error, while the response y is a random variable. At each possible value of x,
the mean of the distribution of y is

E(y|x) = E(β₀ + β₁x + ε) = β₀ + β₁x + E(ε) = β₀ + β₁x

and the variance is

Var(y|x) = Var(β₀ + β₁x + ε) = Var(ε) = σ².
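As a quick illustration (not from the text), the conditional mean and variance can be checked by simulating the model at a fixed x; the values of β₀, β₁, σ, and x below are arbitrary choices.

```python
# Simulate y = beta0 + beta1*x + eps at a fixed x and check that the
# sample mean and variance of y approach beta0 + beta1*x and sigma^2.
# beta0, beta1, sigma, and x are arbitrary illustrative values.
import random

random.seed(0)
beta0, beta1, sigma = 1.0, 0.5, 2.0
x = 10.0
ys = [beta0 + beta1 * x + random.gauss(0, sigma) for _ in range(100_000)]
mean_y = sum(ys) / len(ys)
var_y = sum((y - mean_y) ** 2 for y in ys) / len(ys)
print(mean_y, var_y)  # close to E(y|x) = 6.0 and Var(y|x) = 4.0
```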
Example 1.1.1 A study investigated whether the average number of tweets (or messages) per hour prior to a movie’s release on Twitter.com could be used to forecast the opening weekend box office revenues of movies. The two variables were measured for a sample of 23 movies.
Movie   Tweet Rate   Revenue (millions)
1       1365.8       142
2       1212.8       77
3       581.5        61
4       310.1        32
5       455          31
6       290          30
⋮       ⋮            ⋮
23      2.75         0.3
Suggest a linear regression model for the investigation. What is the practical meaning of the regression coefficients?
A simple linear regression model for the investigation is

y = β₀ + β₁x + ε

where y is the opening weekend revenue and x is the tweet rate.
• β₁ is the slope: the average revenue increase of a movie when the tweet rate increases by 1 tweet per hour.
• β₀ is the intercept: the average revenue of a movie whose tweet rate is zero.
The least-squares estimators of β₀ and β₁ are

β̂₀ = ȳ − β̂₁x̄   (y-intercept)

and

β̂₁ = (Σᵢ xᵢyᵢ − n x̄ȳ) / (Σᵢ xᵢ² − n x̄²)   (slope).

The fitted simple linear regression model is then

ŷ = β̂₀ + β̂₁x.
The difference between the observed value yᵢ and the corresponding fitted value ŷᵢ is a residual:

eᵢ = yᵢ − ŷᵢ = yᵢ − (β̂₀ + β̂₁xᵢ).
The least-squares criterion is

S(β₀, β₁) = Σᵢ (yᵢ − β₀ − β₁xᵢ)².

The least-squares estimators of β₀ and β₁, say β̂₀ and β̂₁, must satisfy

∂S/∂β₀ = −2 Σᵢ (yᵢ − β̂₀ − β̂₁xᵢ) = 0
∂S/∂β₁ = −2 Σᵢ (yᵢ − β̂₀ − β̂₁xᵢ)xᵢ = 0

which simplify to

nβ̂₀ + β̂₁ Σᵢ xᵢ = Σᵢ yᵢ
β̂₀ Σᵢ xᵢ + β̂₁ Σᵢ xᵢ² = Σᵢ yᵢxᵢ

The above equations are called the least-squares normal equations. The solution to the normal equations is

β̂₀ = ȳ − β̂₁x̄

β̂₁ = (Σᵢ yᵢxᵢ − n ȳx̄) / (Σᵢ xᵢ² − n x̄²) = Σᵢ (yᵢ − ȳ)(xᵢ − x̄) / Σᵢ (xᵢ − x̄)²
Example 1.2.2 A study investigated whether the average number of tweets (or messages) per hour prior to a movie’s release on Twitter.com could be used to forecast the opening weekend box office revenues of movies. The two variables were measured for a sample of n = 23 movies. Consider the simple linear regression model:

y = β₀ + β₁x + ε

where y = weekend box office revenues and x = the average number of tweets per hour. Below are some quantities based on the sample:
Σᵢ xᵢ = 6980.65   Σᵢ yᵢ = 576.3   Σᵢ xᵢyᵢ = 396603.2   Σᵢ xᵢ² = 4933199

Derive the least-squares estimates of β₀ and β₁.

β̂₁ = (Σᵢ yᵢxᵢ − n ȳx̄) / (Σᵢ xᵢ² − n x̄²)
   = (396603.2 − 23(576.3/23)(6980.65/23)) / (4933199 − 23(6980.65/23)²)
   = 0.07876722

β̂₀ = ȳ − β̂₁x̄ = 576.3/23 − (0.07876722)(6980.65/23) = 1.150157
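The least-squares arithmetic of Example 1.2.2 can be verified with a short Python sketch using only the summary statistics quoted above:

```python
# Least-squares estimates from the summary statistics of Example 1.2.2.
n = 23
sum_x = 6980.65     # sum of tweet rates
sum_y = 576.3       # sum of revenues (millions)
sum_xy = 396603.2   # sum of x_i * y_i
sum_x2 = 4933199.0  # sum of x_i^2

xbar, ybar = sum_x / n, sum_y / n
b1 = (sum_xy - n * xbar * ybar) / (sum_x2 - n * xbar ** 2)  # slope
b0 = ybar - b1 * xbar                                       # intercept
print(b1, b0)  # about 0.0788 and 1.1502, matching the text
```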
Part of the R summary output for the fitted model is shown below:

Residuals:
Min 1Q Median 3Q Max
-36.751 -2.302 2.468 5.083 33.270
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.150154 3.676108 0.313 0.757
x 0.078767 0.007938 9.923 2.22e-09 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The estimator β̂₁ is a linear combination of the observations:

β̂₁ = Σᵢ (yᵢ − ȳ)(xᵢ − x̄) / Σᵢ (xᵢ − x̄)² = Σᵢ cᵢyᵢ

where cᵢ = (xᵢ − x̄) / Σⱼ (xⱼ − x̄)².

The estimator β̂₁ is unbiased:

E(β̂₁) = Σᵢ cᵢ E(yᵢ)
      = Σᵢ cᵢ E(β₀ + β₁xᵢ + εᵢ)
      = Σᵢ cᵢ (β₀ + β₁xᵢ)                           since E(εᵢ) = 0
      = β₀ Σᵢ cᵢ + β₁ Σᵢ cᵢxᵢ
      = β₁ Σᵢ (xᵢ − x̄)xᵢ / Σᵢ (xᵢ − x̄)²             since Σᵢ cᵢ = 0
      = β₁ Σᵢ (xᵢ − x̄)(xᵢ − x̄) / Σᵢ (xᵢ − x̄)²       since x̄ Σᵢ (xᵢ − x̄) = 0
      = β₁
Var(β̂₁) = Var(Σᵢ cᵢyᵢ)
        = Σᵢ cᵢ² Var(yᵢ) + Σᵢ Σⱼ≠ᵢ cᵢcⱼ Cov(yᵢ, yⱼ)
        = Σᵢ cᵢ² Var(β₀ + β₁xᵢ + εᵢ)                since the yᵢ are uncorrelated
        = Σᵢ cᵢ² Var(εᵢ)
        = σ² Σᵢ (xᵢ − x̄)² / (Σᵢ (xᵢ − x̄)²)²
        = σ² / Σᵢ (xᵢ − x̄)²
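A minimal Monte Carlo sketch of the two results just derived, E(β̂₁) = β₁ and Var(β̂₁) = σ²/Σᵢ(xᵢ − x̄)²; the regressor values and parameters below are arbitrary choices, not from the text:

```python
# Repeatedly simulate the model and fit the slope by least squares;
# the average fitted slope should be near beta1 and its variance near
# sigma^2 / Sxx.
import random

random.seed(1)
x = [float(i) for i in range(1, 21)]  # fixed regressor values
beta0, beta1, sigma = 2.0, 0.5, 1.0
xbar = sum(x) / len(x)
Sxx = sum((xi - xbar) ** 2 for xi in x)

def fit_slope():
    y = [beta0 + beta1 * xi + random.gauss(0, sigma) for xi in x]
    ybar = sum(y) / len(y)
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / Sxx

slopes = [fit_slope() for _ in range(20_000)]
mean_b1 = sum(slopes) / len(slopes)
var_b1 = sum((b - mean_b1) ** 2 for b in slopes) / len(slopes)
print(mean_b1, var_b1, sigma ** 2 / Sxx)
```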
The fitted regression line ŷ = β̂₀ + β̂₁x has the following properties:
o Σᵢ (yᵢ − ŷᵢ) = Σᵢ eᵢ = 0
o Σᵢ yᵢ = Σᵢ ŷᵢ
o The line passes through the centroid of the data: ȳ = β̂₀ + β̂₁x̄
o Σᵢ xᵢeᵢ = 0
o Σᵢ ŷᵢeᵢ = 0
The estimator β̂₀ is also a linear combination of the observations:

β̂₀ = ȳ − β̂₁x̄
   = (1/n) Σᵢ yᵢ − x̄ Σᵢ cᵢyᵢ
   = Σᵢ (1/n − cᵢx̄) yᵢ
   = Σᵢ dᵢyᵢ

where

dᵢ = 1/n − (xᵢ − x̄)x̄ / Σⱼ (xⱼ − x̄)².
Proofs of the properties:
a) Σᵢ (yᵢ − ŷᵢ) = Σᵢ eᵢ = 0
b) ȳ = β̂₀ + β̂₁x̄
c) Σᵢ xᵢeᵢ = 0
d) Σᵢ ŷᵢeᵢ = 0

(a) Σᵢ (yᵢ − ŷᵢ) = Σᵢ (yᵢ − β̂₀ − β̂₁xᵢ) = (−1/2) ∂S/∂β₀ evaluated at (β̂₀, β̂₁), which is 0 by the first normal equation.

(b) β̂₀ = ȳ − β̂₁x̄ ⇒ β̂₀ + β̂₁x̄ = ȳ.

(c) Σᵢ xᵢeᵢ = Σᵢ xᵢ(yᵢ − ŷᵢ) = Σᵢ xᵢ(yᵢ − β̂₀ − β̂₁xᵢ) = (−1/2) ∂S/∂β₁ evaluated at (β̂₀, β̂₁), which is 0 by the second normal equation.

(d) Σᵢ ŷᵢeᵢ = Σᵢ (β̂₀ + β̂₁xᵢ)eᵢ = β̂₀ Σᵢ eᵢ + β̂₁ Σᵢ xᵢeᵢ = 0 by (a) and (c).
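Properties (a)-(d) can be confirmed numerically on a small made-up data set (illustrative values, not from the text):

```python
# Fit least squares to a tiny data set and confirm the residual
# identities (a)-(d) hold up to floating-point error.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
Sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]
e = [yi - yh for yi, yh in zip(y, yhat)]

a = sum(e)                                   # (a) sum of residuals
b = b0 + b1 * xbar - ybar                    # (b) line passes through (xbar, ybar)
c = sum(xi * ei for xi, ei in zip(x, e))     # (c) sum of x_i * e_i
d = sum(yh * ei for yh, ei in zip(yhat, e))  # (d) sum of yhat_i * e_i
print(a, b, c, d)  # all essentially zero
```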
1.4 ESTIMATION OF 𝝈𝟐

An unbiased estimator of σ² is

σ̂² = Σᵢ eᵢ² / (n − 2) = SS_Res / (n − 2) = MS_Res
SS_Res = Σᵢ (yᵢ − ŷᵢ)²
       = Σᵢ (yᵢ − β̂₀ − β̂₁xᵢ)²
       = Σᵢ (yᵢ − ȳ + β̂₁x̄ − β̂₁xᵢ)²                              substituting β̂₀ = ȳ − β̂₁x̄
       = Σᵢ (yᵢ − ȳ)² + 2β̂₁ Σᵢ (yᵢ − ȳ)(x̄ − xᵢ) + β̂₁² Σᵢ (x̄ − xᵢ)²
       = Σᵢ (yᵢ − ȳ)² − 2β̂₁ S_xy + β̂₁² S_xx
       = SS_T − β̂₁ S_xy                                         since β̂₁ = S_xy / S_xx

where SS_T = Σᵢ (yᵢ − ȳ)² is the total sum of squares, S_xy = Σᵢ (yᵢ − ȳ)(xᵢ − x̄), and S_xx = Σᵢ (xᵢ − x̄)².
Example 1.4.1 Estimate the variance σ² of the error term of the model proposed in Example 1.1.2. Below are some quantities based on the sample:

Σᵢ xᵢ = 6980.65   Σᵢ yᵢ = 576.3   Σᵢ xᵢyᵢ = 396603.2   Σᵢ yᵢ² = 35626.09
σ̂² = SS_Res / (n − 2)
   = (SS_T − β̂₁ S_xy) / (n − 2)
   = (Σᵢ yᵢ² − nȳ² − β̂₁(Σᵢ yᵢxᵢ − n x̄ȳ)) / (23 − 2)
   = (35626.09 − (576.3)²/23 − (0.07876722)(396603.2 − (6980.65)(576.3)/23)) / (23 − 2)
   = 177.3297
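The computation in Example 1.4.1 can be reproduced from the quoted summary statistics:

```python
# Estimate of sigma^2 from the summary statistics of Example 1.4.1.
n = 23
sum_x, sum_y = 6980.65, 576.3
sum_xy, sum_y2 = 396603.2, 35626.09
b1 = 0.07876722                    # slope estimate from Example 1.2.2

SS_T = sum_y2 - sum_y ** 2 / n     # total sum of squares
Sxy = sum_xy - sum_x * sum_y / n
SS_Res = SS_T - b1 * Sxy           # residual sum of squares
sigma2_hat = SS_Res / (n - 2)      # MS_Res, unbiased for sigma^2
print(SS_T, SS_Res, sigma2_hat)
```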
The error terms, εᵢ’s, are normally and independently distributed with mean 0 and variance σ², abbreviated NID(0, σ²).
The appropriate hypotheses for testing that the intercept equals a constant, say c₀, are

H₀: β₀ = c₀,   H₁: β₀ ≠ c₀

The test statistic is

t₀ = (β̂₀ − c₀) / se(β̂₀)

where se(β̂₀) = √(MS_Res(1/n + x̄²/S_xx)). Under H₀, t₀ follows a t distribution with n − 2 degrees of freedom, so H₀ is rejected at significance level α if |t₀| > t_(α/2, n−2).
The appropriate hypotheses for testing that the slope equals a constant, say c₁, are

H₀: β₁ = c₁,   H₁: β₁ ≠ c₁

The test statistic is

t₀ = (β̂₁ − c₁) / se(β̂₁)

where se(β̂₁) = √(MS_Res/S_xx). Again, H₀ is rejected at significance level α if |t₀| > t_(α/2, n−2).
The test about the slope is related to the test of the significance of regression: failing to reject H₀: β₁ = 0 is equivalent to saying that there is no linear relationship between y and x.
The total sum of squares can be partitioned:

Σᵢ (yᵢ − ȳ)² = Σᵢ (yᵢ − ŷᵢ + ŷᵢ − ȳ)²
            = Σᵢ (yᵢ − ŷᵢ)² + 2 Σᵢ (yᵢ − ŷᵢ)(ŷᵢ − ȳ) + Σᵢ (ŷᵢ − ȳ)²
            = Σᵢ (yᵢ − ŷᵢ)² + 2 Σᵢ ŷᵢ(yᵢ − ŷᵢ) − 2ȳ Σᵢ (yᵢ − ŷᵢ) + Σᵢ (ŷᵢ − ȳ)²
            = Σᵢ (yᵢ − ŷᵢ)² + 2 Σᵢ ŷᵢeᵢ − 2ȳ Σᵢ (yᵢ − ŷᵢ) + Σᵢ (ŷᵢ − ȳ)²
            = Σᵢ (yᵢ − ŷᵢ)² + Σᵢ (ŷᵢ − ȳ)²

since Σᵢ ŷᵢeᵢ = 0 and Σᵢ ŷᵢ = Σᵢ yᵢ.

We usually write the identity as

SS_T = SS_R + SS_Res

where SS_T = Σᵢ (yᵢ − ȳ)², SS_R = Σᵢ (ŷᵢ − ȳ)², and SS_Res = Σᵢ (yᵢ − ŷᵢ)².
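The identity can be checked numerically on a small made-up data set (illustrative values, not from the text):

```python
# Fit least squares to a small illustrative data set and verify the
# partition SS_T = SS_R + SS_Res.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.5, 3.8, 5.4, 5.9]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / sum((a - xbar) ** 2 for a in x)
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * a for a in x]

SS_T = sum((v - ybar) ** 2 for v in y)               # total sum of squares
SS_R = sum((h - ybar) ** 2 for h in yhat)            # regression sum of squares
SS_Res = sum((v - h) ** 2 for v, h in zip(y, yhat))  # residual sum of squares
print(SS_T, SS_R + SS_Res)  # equal up to rounding
```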
Under H₀: β₁ = 0, the ratio F* = (SS_R/1) / (SS_Res/(n − 2)) = MS_R/MS_Res follows an F distribution with (1, n − 2) degrees of freedom, so the significance of regression can be tested by comparing F* with F_(α, 1, n−2).
Example 1.5.1 Test for the significance of regression of the model proposed in Example 1.1.2. Use a 5% level of significance.
H₀: β₁ = 0,   H₁: β₁ ≠ 0
(a zero slope means no linear relation between y and x)

Approach 1 (t test):

t₀ = β̂₁ / se(β̂₁) = 0.07876722 / √(177.3297 / (4933199 − 23(6980.65/23)²)) = 9.92333

Since |t₀| = 9.92333 > t_(0.025, 21) ≈ 2.080, we reject H₀ at the 5% level.
Approach 2 (F test):

SS_T = Σᵢ yᵢ² − nȳ² = 35626.09 − (576.3)²/23 = 21186.02

SS_R = β̂₁ S_xy = (0.07876722)(396603.2 − (6980.65)(576.3)/23) = 17462.09

Source of     Sum of     Degrees of     Mean Square   F*
Variation     Squares    Freedom        (SS/df)
Regression    17462.09   1              17462.09      98.47228
Residual      3723.93    23 − 2 = 21    177.33
Total         21186.02   22

F* = (SS_R/df_R) / (SS_Res/df_Res) = MS_R / MS_Res = 98.47228

Since F* = 98.47228 > F_(0.05, 1, 21) ≈ 4.32, we again reject H₀. Note that F* = t₀², since 9.92333² ≈ 98.47228.
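Both test statistics in Example 1.5.1 can be recomputed from the quantities quoted in the text:

```python
import math

# t test and F test for significance of regression (Example 1.5.1),
# using only quantities quoted in the text.
n = 23
b1 = 0.07876722
Sxx = 4933199.0 - 6980.65 ** 2 / n
MS_Res = 177.3297                  # estimate of sigma^2 from Example 1.4.1
SS_R = 17462.09                    # regression sum of squares

se_b1 = math.sqrt(MS_Res / Sxx)    # standard error of the slope
t0 = b1 / se_b1                    # t statistic for H0: beta1 = 0
F0 = (SS_R / 1) / MS_Res           # F statistic with (1, n - 2) df
print(t0, F0)  # F0 equals t0 squared
```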
A 100(1 − α) percent confidence interval on the intercept β₀ is

(β̂₀ − t_(α/2, n−2) se(β̂₀),  β̂₀ + t_(α/2, n−2) se(β̂₀))

and a 100(1 − α) percent confidence interval on the slope β₁ is

(β̂₁ − t_(α/2, n−2) se(β̂₁),  β̂₁ + t_(α/2, n−2) se(β̂₁))

The formulas of the confidence intervals are based on the fact that

(β̂₀ − β₀)/se(β̂₀) ~ t_(n−2)   and   (β̂₁ − β₁)/se(β̂₁) ~ t_(n−2).
The mean response at x = x₀, E(y|x₀) = μ_(y|x₀) = β₀ + β₁x₀, is estimated by

μ̂_(y|x₀) = β̂₀ + β̂₁x₀.

A 100(1 − α) percent confidence interval on μ_(y|x₀) is

(μ̂_(y|x₀) − t_(α/2, n−2) √(MS_Res(1/n + (x₀ − x̄)²/S_xx)),  μ̂_(y|x₀) + t_(α/2, n−2) √(MS_Res(1/n + (x₀ − x̄)²/S_xx)))
The result is based on the fact that μ̂_(y|x₀) is a linear combination of the yᵢ’s and thus a normally distributed random variable; the variance of μ̂_(y|x₀) is

Var(μ̂_(y|x₀)) = Var(β̂₀ + β̂₁x₀)
             = Var(ȳ − β̂₁x̄ + β̂₁x₀)
             = Var(ȳ + β̂₁(x₀ − x̄))
             = Var(ȳ) + Var(β̂₁(x₀ − x̄)) + 2 Cov(ȳ, β̂₁(x₀ − x̄))
             = Var(ȳ) + Var(β̂₁(x₀ − x̄))                        since Cov(ȳ, β̂₁) = 0
             = Var(Σᵢ (β₀ + β₁xᵢ + εᵢ)/n) + (x₀ − x̄)² Var(β̂₁)
             = Var(Σᵢ εᵢ/n) + (x₀ − x̄)²σ²/S_xx
             = σ²/n + (x₀ − x̄)²σ²/S_xx
             = σ²(1/n + (x₀ − x̄)²/S_xx)
To predict the future observation y₀ at x = x₀, we consider the 100(1 − α) percent prediction interval on a future observation at x₀:
(ŷ₀ − t_(α/2, n−2) √(MS_Res(1 + 1/n + (x₀ − x̄)²/S_xx)),  ŷ₀ + t_(α/2, n−2) √(MS_Res(1 + 1/n + (x₀ − x̄)²/S_xx)))

where ŷ₀ = β̂₀ + β̂₁x₀.
The interval follows from

(y₀ − ŷ₀)/√(V̂ar(y₀ − ŷ₀)) = [(y₀ − ŷ₀)/√(Var(y₀ − ŷ₀))] / √(V̂ar(y₀ − ŷ₀)/Var(y₀ − ŷ₀))
                          = [(y₀ − ŷ₀)/√(Var(y₀ − ŷ₀))] / √(MS_Res/σ²) ~ t_(n−2)

where

Var(y₀ − ŷ₀) = Var(y₀) + Var(−ŷ₀) + 2 Cov(y₀, −ŷ₀)
            = Var(y₀) + Var(ŷ₀)                                since y₀ is independent of ŷ₀
            = σ² + σ²/n + (x₀ − x̄)²σ²/S_xx
            = σ²(1 + 1/n + (x₀ − x̄)²/S_xx)

and

V̂ar(y₀ − ŷ₀) = MS_Res(1 + 1/n + (x₀ − x̄)²/S_xx).

Hence, we have

P(−t_(α/2, n−2) ≤ (y₀ − ŷ₀)/√(V̂ar(y₀ − ŷ₀)) ≤ t_(α/2, n−2)) = 1 − α

P(ŷ₀ − t_(α/2, n−2) √(V̂ar(y₀ − ŷ₀)) ≤ y₀ ≤ ŷ₀ + t_(α/2, n−2) √(V̂ar(y₀ − ŷ₀))) = 1 − α
Similarly, a 100(1 − α) percent prediction interval on the mean of m future observations at x = x₀ is

(ŷ₀ − t_(α/2, n−2) √(MS_Res(1/m + 1/n + (x₀ − x̄)²/S_xx)),  ŷ₀ + t_(α/2, n−2) √(MS_Res(1/m + 1/n + (x₀ − x̄)²/S_xx)))
Example 1.6.1 Construct a 95% confidence interval on μ_(y|x₀) and a 95% prediction interval on a future value of a movie with tweet rate x₀ = 100.
ŷ₀ = μ̂_(y|x₀) = β̂₀ + β̂₁x₀ = 1.150157 + (0.07876722)(100) = 9.026879

Confidence interval on the mean response:

(μ̂_(y|x₀) − t_(0.025, 21) √(MS_Res(1/n + (x₀ − x̄)²/S_xx)),  μ̂_(y|x₀) + t_(0.025, 21) √(MS_Res(1/n + (x₀ − x̄)²/S_xx)))
= [2.345137, 15.708621]

Prediction interval:

(ŷ₀ − t_(0.025, 21) √(MS_Res(1 + 1/n + (x₀ − x̄)²/S_xx)),  ŷ₀ + t_(0.025, 21) √(MS_Res(1 + 1/n + (x₀ − x̄)²/S_xx)))
= [−19.46604, 37.51980]
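The two intervals of Example 1.6.1 can be recomputed from the quoted quantities; the quantile t_(0.025, 21) ≈ 2.07961 is taken from standard t tables:

```python
import math

# Confidence and prediction intervals at x0 = 100 (Example 1.6.1).
n = 23
b0, b1 = 1.150157, 0.07876722
MS_Res = 177.3297
xbar = 6980.65 / n
Sxx = 4933199.0 - 6980.65 ** 2 / n
t_crit = 2.07961                   # t quantile for 0.025 tail, 21 df
x0 = 100.0

y0_hat = b0 + b1 * x0              # estimated mean response at x0
h = 1 / n + (x0 - xbar) ** 2 / Sxx
ci = (y0_hat - t_crit * math.sqrt(MS_Res * h),
      y0_hat + t_crit * math.sqrt(MS_Res * h))        # CI on mean response
pi = (y0_hat - t_crit * math.sqrt(MS_Res * (1 + h)),
      y0_hat + t_crit * math.sqrt(MS_Res * (1 + h)))  # prediction interval
print(ci, pi)
```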
The coefficient of determination is

R² = SS_R / SS_T = 1 − SS_Res / SS_T
Example 1.7.1 Calculate the proportion of variation in the movie revenue explained by
the Tweet rate.
R² = SS_R / SS_T = 17462.09 / 21186.02 = 0.8242 = 82.42%
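The ratio in Example 1.7.1 checks out directly:

```python
# Coefficient of determination for Example 1.7.1.
SS_R, SS_T = 17462.09, 21186.02
R2 = SS_R / SS_T
print(R2)  # about 0.8242: 82.42% of revenue variation is explained
```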