
Math 170S

Lecture 6

Hubeyb Gurdogan

August 8, 2024

References

Zimmerman
Chapter 6.5

Regression

▶ In general, we are interested in predicting a future (latent) random variable Y corresponding to a realized (known) variable x.
▶ Example: Let x represent the average age in a country, and Y be the annual fatality rate of influenza.
▶ For now we concentrate on estimating the conditional mean E[Y | x] (a standard predictor for Y given x).
▶ To estimate it, we observe random variables Y1, Y2, ..., Yn for x1, x2, ..., xn independently, obtaining a sample of n pairs of known numbers (x1, y1), (x2, y2), ..., (xn, yn).
▶ These pairs are then used to estimate the conditional mean E[Y | x].

Regression Assumptions
The relationship between Y and x is assumed to be

Y = µ(x) + ϵ,

where:

▶ µ(x) is a deterministic function of x.
▶ ϵ is a Normal random variable with mean 0 and variance σ², i.e. ϵ ∼ N(0, σ²).

Remark: The function µ(x) can take different forms, which give the methods their names (see the sketch after this list):

▶ Linear regression model: µ(x) = α + βx
▶ Polynomial regression model: µ(x) = α + βx + γx²
▶ A non-linear regression model: µ(x) = αx^β
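To make these three forms concrete, here is a minimal Python sketch; the parameter values α, β, γ are hypothetical, chosen only for illustration:

```python
# Hypothetical parameter values, purely for illustration.
alpha, beta, gamma = 1.0, 0.5, -0.02

def mu_linear(x):
    # Linear regression model: mu(x) = alpha + beta * x
    return alpha + beta * x

def mu_polynomial(x):
    # Polynomial regression model: mu(x) = alpha + beta * x + gamma * x^2
    return alpha + beta * x + gamma * x ** 2

def mu_nonlinear(x):
    # A non-linear regression model: mu(x) = alpha * x^beta
    return alpha * x ** beta
```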

Linear Regression Problem

We now consider the linear regression problem, where

µ(x) = α + βx.

For a random sample Y1, Y2, ..., Yn we have

Yi = α + βxi + εi,    εi ∼ N(0, σ²).

Note that this implies Y1, ..., Yn have the following distribution:

Yi ∼ N(α + βxi, σ²)

(Warning: α corresponds to the α1 in the course book.)
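A minimal simulation sketch of this model, assuming numpy is available; the parameter values and design points below are hypothetical, not taken from the course:

```python
import numpy as np

rng = np.random.default_rng(0)

alpha, beta, sigma = 30.0, 0.75, 4.5                 # hypothetical true parameters
x = np.linspace(50.0, 90.0, 10)                      # fixed, known design points

eps = rng.normal(loc=0.0, scale=sigma, size=x.size)  # eps_i ~ N(0, sigma^2)
y = alpha + beta * x + eps                           # Y_i ~ N(alpha + beta * x_i, sigma^2)
```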

Maximum Likelihood Estimators of α, β, σ²
We can use maximum likelihood estimation to obtain estimates of α, β, σ². The likelihood function is:

\[
L(\alpha, \beta, \sigma^2)
  = \prod_{i=1}^{n} \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\!\left( -\frac{(y_i - (\alpha + \beta x_i))^2}{2\sigma^2} \right)
  = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\!\left( -\frac{\sum_{i=1}^{n} (y_i - (\alpha + \beta x_i))^2}{2\sigma^2} \right)
\]

The maximum likelihood estimators α̂, β̂, σ̂² of α, β, σ² are the values that maximize L(·):

\[
(\hat{\alpha}, \hat{\beta}, \hat{\sigma}^2)
  = \operatorname*{arg\,max}_{\alpha,\,\beta,\,\sigma^2}
    \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\!\left( -\frac{\sum_{i=1}^{n} (y_i - (\alpha + \beta x_i))^2}{2\sigma^2} \right)
\]
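One way to see this definition in action is to maximize the log-likelihood numerically, before using the closed-form solution on the next slide. A sketch assuming numpy and scipy, applied to the student-score data from the worked example later in this lecture:

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(params, x, y):
    # Negative log-likelihood of Y_i ~ N(alpha + beta * x_i, sigma^2).
    # We optimize log(sigma) so that sigma stays positive.
    alpha, beta, log_sigma = params
    sigma2 = np.exp(2.0 * log_sigma)
    resid = y - (alpha + beta * x)
    n = y.size
    return 0.5 * n * np.log(2.0 * np.pi * sigma2) + np.sum(resid ** 2) / (2.0 * sigma2)

# Midterm (x) and final (y) scores from the worked example later in the lecture.
x = np.array([70, 74, 72, 68, 58, 54, 82, 64, 80, 61], dtype=float)
y = np.array([77, 94, 88, 80, 71, 76, 88, 80, 90, 69], dtype=float)

res = minimize(neg_log_likelihood, x0=np.array([0.0, 1.0, 1.0]), args=(x, y))
alpha_hat, beta_hat = res.x[0], res.x[1]
sigma2_hat = np.exp(2.0 * res.x[2])
print(alpha_hat, beta_hat, sigma2_hat)  # should approach the closed-form MLEs
```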

Maximum Likelihood Estimators of α, β, σ²
The maximum likelihood estimates of α, β, σ² are given by the following (a short code sketch follows the list):

▶ \( \hat{\beta} = \dfrac{\sum_{i=1}^{n} (y_i - \bar{y})(x_i - \bar{x})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \dfrac{\sum_{i=1}^{n} y_i (x_i - \bar{x})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \)

▶ \( \hat{\alpha} = \bar{y} - \hat{\beta} \bar{x} \)

▶ \( \hat{\sigma}^2 = \dfrac{1}{n} \sum_{i=1}^{n} \left[ y_i - \hat{\alpha} - \hat{\beta} x_i \right]^2 \)
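A direct numpy transcription of the three estimators above (a sketch, assuming numpy is available):

```python
import numpy as np

def linreg_mle(x, y):
    """MLEs of alpha, beta, sigma^2 via the deviation-from-the-mean formulas."""
    x_bar, y_bar = x.mean(), y.mean()
    beta_hat = np.sum((y - y_bar) * (x - x_bar)) / np.sum((x - x_bar) ** 2)
    alpha_hat = y_bar - beta_hat * x_bar
    # The MLE divides by n; the common unbiased variance estimate divides by n - 2.
    sigma2_hat = np.mean((y - alpha_hat - beta_hat * x) ** 2)
    return alpha_hat, beta_hat, sigma2_hat
```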

Calculating α̂, β̂, σ̂² from these expressions is not hard, but it can be time-consuming. Therefore, we provide the following equivalent formulas for α̂, β̂, σ̂².

Formulas

\[
\hat{\beta} = \frac{E - \frac{AB}{n}}{C - \frac{A^2}{n}}; \qquad
\hat{\alpha} = \frac{B}{n} - \hat{\beta}\,\frac{A}{n}; \qquad
\hat{\sigma}^2 = \frac{D}{n} - \left(\frac{B}{n}\right)^2 - \hat{\beta}\,\frac{E}{n} + \hat{\beta}\,\frac{AB}{n^2},
\]

where

\[
A := \sum_{i=1}^{n} x_i, \quad
B := \sum_{i=1}^{n} y_i, \quad
C := \sum_{i=1}^{n} x_i^2, \quad
D := \sum_{i=1}^{n} y_i^2, \quad
E := \sum_{i=1}^{n} x_i y_i.
\]
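These formulas translate directly into a hand-computation-style routine; a sketch in plain Python (the function name linreg_mle_sums is ours, not from the course):

```python
def linreg_mle_sums(x, y):
    """MLEs of alpha, beta, sigma^2 from the running sums A, B, C, D, E."""
    n = len(x)
    A, B = sum(x), sum(y)
    C = sum(v * v for v in x)
    D = sum(v * v for v in y)
    E = sum(a * b for a, b in zip(x, y))
    beta_hat = (E - A * B / n) / (C - A ** 2 / n)
    alpha_hat = B / n - beta_hat * A / n
    sigma2_hat = D / n - (B / n) ** 2 - beta_hat * E / n + beta_hat * A * B / n ** 2
    return alpha_hat, beta_hat, sigma2_hat
```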

Calculations

Let x1, . . . , xn be the midterm scores of 10 students in a fictional statistics class:

70 74 72 68 58 54 82 64 80 61.

Let y1, . . . , yn be the final scores of the same 10 students:

77 94 88 80 71 76 88 80 90 69.

Calculations
The key values A, B, C, D, E are given by

\[
A = \sum_{i=1}^{n} x_i = 70 + 74 + 72 + 68 + 58 + 54 + 82 + 64 + 80 + 61 = 683;
\]
\[
B = \sum_{i=1}^{n} y_i = 77 + 94 + 88 + 80 + 71 + 76 + 88 + 80 + 90 + 69 = 813;
\]
\[
C = \sum_{i=1}^{n} x_i^2 = 70^2 + 74^2 + 72^2 + 68^2 + 58^2 + 54^2 + 82^2 + 64^2 + 80^2 + 61^2 = 47{,}405;
\]
\[
D = \sum_{i=1}^{n} y_i^2 = 77^2 + 94^2 + 88^2 + 80^2 + 71^2 + 76^2 + 88^2 + 80^2 + 90^2 + 69^2 = 66{,}731;
\]
Calculations

\[
E = \sum_{i=1}^{n} x_i y_i = (70)(77) + (74)(94) + (72)(88) + (68)(80) + (58)(71) + (54)(76) + (82)(88) + (64)(80) + (80)(90) + (61)(69) = 56{,}089.
\]

The MLEs are then given by

\[
\hat{\beta} = \frac{E - \frac{AB}{n}}{C - \frac{A^2}{n}} = 0.742; \qquad
\hat{\alpha} = \frac{B}{n} - \hat{\beta}\,\frac{A}{n} = 30.6214; \qquad
\hat{\sigma}^2 = \frac{D}{n} - \left(\frac{B}{n}\right)^2 - \hat{\beta}\,\frac{E}{n} + \hat{\beta}\,\frac{AB}{n^2} = 21.77638.
\]

(Here α̂ and σ̂² are evaluated with β̂ rounded to 0.742; with the unrounded β̂ ≈ 0.74210, one gets α̂ ≈ 30.6147 and σ̂² ≈ 21.7709.)
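Reusing the linreg_mle_sums sketch from the Formulas slide reproduces these numbers:

```python
x = [70, 74, 72, 68, 58, 54, 82, 64, 80, 61]
y = [77, 94, 88, 80, 71, 76, 88, 80, 90, 69]

alpha_hat, beta_hat, sigma2_hat = linreg_mle_sums(x, y)
print(round(beta_hat, 4))    # 0.7421
print(round(alpha_hat, 4))   # 30.6147 (the slide's 30.6214 uses beta_hat rounded to 0.742)
print(round(sigma2_hat, 4))  # 21.7709 (likewise, 21.77638 uses the rounded beta_hat)
```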

Residuals

Once we have derived the maximum likelihood estimates of α, β, σ², we can compute the predicted outcome ŷi for each i = 1, ..., n:

ŷi = α̂ + β̂xi

Residuals: the differences between the observed values yi and the predicted values ŷi are called residuals, and they provide information about how well the model fits the data.

In general, we have:

▶ Large residuals ⇒ the model is a bad fit.
▶ Small residuals ⇒ the model is a good fit.

In practice, the average of the squared residuals, σ̂², is used as a measure of fit.
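A sketch of the residual computation for the worked example, assuming numpy; the MLEs are plugged in as rounded constants:

```python
import numpy as np

x = np.array([70, 74, 72, 68, 58, 54, 82, 64, 80, 61], dtype=float)
y = np.array([77, 94, 88, 80, 71, 76, 88, 80, 90, 69], dtype=float)

alpha_hat, beta_hat = 30.6147, 0.7421   # MLEs from the worked example (rounded)
y_hat = alpha_hat + beta_hat * x        # predicted final scores
residuals = y - y_hat                   # observed minus predicted
print(np.mean(residuals ** 2))          # average squared residual, roughly sigma2_hat ≈ 21.77
```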

Scatter Plot, Regression Line

[Figure: scatter plot of the 10 (midterm, final) score pairs together with the fitted regression line ŷ = α̂ + β̂x.]
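A matplotlib sketch of the plot this slide presumably showed; matplotlib and numpy are assumed, and the fitted coefficients come from the worked example:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.array([70, 74, 72, 68, 58, 54, 82, 64, 80, 61], dtype=float)
y = np.array([77, 94, 88, 80, 71, 76, 88, 80, 90, 69], dtype=float)

plt.scatter(x, y, label="(midterm, final) pairs")
grid = np.linspace(x.min(), x.max(), 100)
plt.plot(grid, 30.6147 + 0.7421 * grid, color="red", label="fitted regression line")
plt.xlabel("midterm score")
plt.ylabel("final score")
plt.legend()
plt.show()
```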
