Lec2 ASE
Karim Nchare
November 2020
Functional relations
y = f (x)
Linear Model
▶ A model is linear if it can be written as:
y = β0 + β1 x
▶ What about a model like y = log(γ0 x^γ1)?
The linearity assumption
▶ The linearity assumption is less restrictive than it appears: since
log(γ0 x^γ1) = log(γ0) + γ1 log(x),
the model y = log(γ0 x^γ1) can be rewritten as y = β0 + β1 z with
β0 = log(γ0)
β1 = γ1
z = log(x)
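A minimal numerical sketch of this reparametrization, on simulated data with illustrative parameter values (γ0 = 2, γ1 = 0.5): regressing y on z = log(x) recovers the nonlinear model's parameters.

```python
import numpy as np

# Simulate data from y = log(gamma0 * x**gamma1) with gamma0 = 2, gamma1 = 0.5
rng = np.random.default_rng(0)
gamma0, gamma1 = 2.0, 0.5
x = rng.uniform(1, 10, size=200)
y = np.log(gamma0 * x**gamma1)

# Linearize: y = beta0 + beta1*z with z = log(x)
z = np.log(x)
beta1 = np.cov(z, y, bias=True)[0, 1] / np.var(z)  # sample cov(z, y) / var(z)
beta0 = y.mean() - beta1 * z.mean()

# Recover the original parameters: beta0 = log(gamma0), beta1 = gamma1
print(np.exp(beta0), beta1)   # ~2.0, ~0.5
```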
Approximating nonlinear models
y = f(x)
ỹ = β0 + β1 x ≈ y = f(x)
ε = y − ỹ
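A short sketch of such an approximation, with f = √x chosen purely for illustration: the OLS line ỹ = β0 + β1 x is fitted and the approximation error ε = y − ỹ inspected.

```python
import numpy as np

# Approximate a nonlinear y = f(x) by the OLS line y~ = beta0 + beta1*x;
# f and the x-range are illustrative choices, not from the lecture.
f = np.sqrt
x = np.linspace(1, 9, 100)
y = f(x)

beta1 = np.cov(x, y, bias=True)[0, 1] / np.var(x)
beta0 = y.mean() - beta1 * x.mean()
y_tilde = beta0 + beta1 * x

eps = y - y_tilde            # approximation error, eps = y - y~
print(eps.mean())            # ~0: OLS errors average to zero
print(np.abs(eps).max())     # worst-case gap of the linear approximation
```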
Multivariate regressions
▶ The value of the response variable may be a function of many regressors
y = f(x1, x2, ..., xk)
y = β0 + β1 x1 + β2 x2 + · · · + βk xk
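A minimal sketch of fitting this multivariate model on simulated data (coefficients and sample size are illustrative), using numpy's least-squares solver:

```python
import numpy as np

# Simulated data from y = 1 + 2*x1 - 3*x2 + noise, with k = 2 regressors
rng = np.random.default_rng(1)
n = 500
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 3.0 * x2 + rng.normal(scale=0.5, size=n)

# Design matrix with a column of ones for the intercept beta0
X = np.column_stack([np.ones(n), x1, x2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)   # ~[1.0, 2.0, -3.0]
```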
Unobserved variables
y = β0 + β1 x1 + β2 x2 + · · · + βk xk
▶ If only x1 is observed, we can only fit
ỹ = β0 + β1 x1
ε = y − ỹ = β2 x2 + · · · + βk xk
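A small simulation of this point (all names and values illustrative): when only x1 enters the fitted line, the error ε = y − ỹ is exactly the omitted part β2 x2 + · · · + βk xk.

```python
import numpy as np

# y depends on x1 and an unobserved x2; the fit uses x1 only
rng = np.random.default_rng(2)
n = 1000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)          # unobserved regressor, independent of x1 here
beta0, beta1, beta2 = 1.0, 2.0, 0.5
y = beta0 + beta1 * x1 + beta2 * x2

y_tilde = beta0 + beta1 * x1     # true beta0, beta1 used for illustration
eps = y - y_tilde
print(np.allclose(eps, beta2 * x2))   # True: the error is exactly beta2*x2
```

Here x2 is drawn independently of x1; if the omitted regressor were correlated with x1, an OLS fit of y on x1 alone would also bias the estimate of β1.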
Stochastic regression
y = β0 + β1 x + ε
yi = β0 + β1 x1i + · · · + βk xki + εi
Predictions and Residuals
ei = yi − ŷi
▶ Contest Game:
  ▶ If you guess the weight of a participant within 10 lb of the actual weight, you get paid $2.
  ▶ Otherwise you pay him or her $3.
▶ You could use height (observable) to estimate the weight
WEIGHTi = β0 + β1 HEIGHTi + εi
ŴEIGHTi = 103.4 + 6.38 HEIGHTi
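A small sketch of playing the contest with the fitted line. The payoff rule and the coefficients come from the slide; the participants' data, and the reading of HEIGHT as inches above five feet (suggested by x̄ = 10 later in the lecture), are assumptions.

```python
def predict_weight(height):
    # Fitted line from the slide: WEIGHT-hat = 103.4 + 6.38 * HEIGHT
    return 103.4 + 6.38 * height

def payoff(height, actual_weight):
    # Contest rule: within 10 lb of the actual weight -> win $2, else lose $3
    guess = predict_weight(height)
    return 2.0 if abs(guess - actual_weight) <= 10 else -3.0

# Hypothetical participants: (height in inches above 5 ft, actual weight in lb)
participants = [(9, 165), (12, 185), (5, 120)]
for h, w in participants:
    print(h, w, predict_weight(h), payoff(h, w))
```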
Example: height and weight (figures: predictions, observations, residuals)
Estimating linear models
yi = β0 + β1 xi + εi
▶ The OLS estimates are
β̂1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)²
β̂0 = ȳ − β̂1 x̄
▶ Notice that β̂1 looks like a sample analogue of cov(x, y)/var(x)
▶ The OLS estimates guarantee that Σ êi = 0
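A direct implementation of these two formulas (simulated data), checking that the residuals sum to zero:

```python
import numpy as np

def ols_simple(x, y):
    """OLS for y = beta0 + beta1*x, using the slide's formulas."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    beta1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
    beta0 = y.mean() - beta1 * x.mean()
    return beta0, beta1

# Quick check on simulated data
rng = np.random.default_rng(3)
x = rng.uniform(0, 20, 50)
y = 105 + 6.4 * x + rng.normal(scale=8, size=50)
b0, b1 = ols_simple(x, y)
e_hat = y - (b0 + b1 * x)
print(b0, b1, e_hat.sum())   # sum of residuals ~ 0 (up to float error)
```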
Example: height and weight, Computing OLS
β̂1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² = 590.2 / 92.55 ≈ 6.38
β̂0 = ȳ − β̂1 x̄ = 169 − 6.38 × 10 ≈ 105.22
ŷi = 105.22 + 6.38 xi
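Reproducing the slide's arithmetic (the sums 590.2 and 92.55 and the means ȳ = 169, x̄ = 10 are taken from the slide, not recomputed):

```python
# Plug the slide's sums and means into the OLS formulas
beta1 = 590.2 / 92.55
beta0 = 169 - beta1 * 10
print(round(beta1, 2), round(beta0, 2))   # ~6.38 and ~105.2, as on the slide
```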
Example: geography of trade
Example: military service and income
Example: income vs. fecundity
Example: public debt vs. growth
The need for an intercept
▶ Omitting β0 forces the regression line through the origin, which can distort the slope estimate
yi = β1 xi + εi
yi = β0 + β1 x1i + · · · + βk xki + εi
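A quick simulated illustration of the point: when the true β0 is nonzero, forcing the line through the origin distorts the slope.

```python
import numpy as np

# True model has a large intercept: y = 105 + 6.4*x + noise
rng = np.random.default_rng(4)
x = rng.uniform(0, 20, 200)
y = 105 + 6.4 * x + rng.normal(scale=8, size=200)

# With intercept: regress y on [1, x]
X = np.column_stack([np.ones_like(x), x])
b_with, *_ = np.linalg.lstsq(X, y, rcond=None)

# Without intercept: the slope collapses to sum(x*y)/sum(x^2)
b_no = np.sum(x * y) / np.sum(x * x)
print(b_with, b_no)   # slope ~6.4 with intercept; inflated slope without
```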
Example: financial aid, OLS
Estimated OLS model (ignoring GENDER):
F̂INAIDi = 15897 − 0.34 PARENTi
Including GENDER and its interaction with PARENT:
y = β0 + β1 x1 + β2 x2 + β3 x1 x2 + ε
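A sketch of how such an interaction model could be set up for the financial-aid example. The data, the GENDER effect, and the interaction coefficient below are all made up; only 15897 and −0.34 come from the slide.

```python
import numpy as np

# Hypothetical design: x1 = PARENT (parental contribution), x2 = GENDER dummy;
# the x1*x2 column lets the PARENT slope differ by gender.
rng = np.random.default_rng(5)
n = 300
parent = rng.uniform(0, 50000, n)
gender = rng.integers(0, 2, n)
y = 15897 - 0.34 * parent - 0.05 * parent * gender + rng.normal(scale=500, size=n)

X = np.column_stack([np.ones(n), parent, gender, parent * gender])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)   # ~[15897, -0.34, ~0, -0.05]
```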
Explained and unexplained variation
yi − E(y) = β0 + β1 xi + εi − β0 − β1 E(x)
          = β1 (xi − E(x)) + εi
             explained      unexplained
In the sample, with means x̄ and ȳ:
yi − ȳ = β0 + β1 xi + εi − β0 − β1 x̄
       = β1 (xi − x̄) + εi
          explained    unexplained
SST = Σ(yi − ȳ)² = Σ(yi − ŷi + ŷi − ȳ)²
    = Σ(yi − ŷi)² + 2 Σ(yi − ŷi)(ŷi − ȳ) + Σ(ŷi − ȳ)²
    = Σ(yi − ŷi)² + Σ(ŷi − ȳ)²   (the cross term vanishes since Σ êi = 0 and Σ êi xi = 0)
    = Sum of Squares Residual + Sum of Squares Explained
    = SSR + SSE
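A numerical check of this decomposition on simulated data:

```python
import numpy as np

# Verify SST = SSR + SSE for an OLS fit
rng = np.random.default_rng(6)
x = rng.uniform(0, 20, 100)
y = 105 + 6.4 * x + rng.normal(scale=8, size=100)

beta1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
beta0 = y.mean() - beta1 * x.mean()
y_hat = beta0 + beta1 * x

SST = np.sum((y - y.mean())**2)
SSR = np.sum((y - y_hat)**2)          # residual (unexplained)
SSE = np.sum((y_hat - y.mean())**2)   # explained
print(np.isclose(SST, SSR + SSE))     # True: the cross term vanishes
```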
Goodness of fit: R²
▶ We have decomposed the total variation (SST) into the explained variation (SSE) and the unexplained or residual variation (SSR)
▶ R² measures how much of the variation of y can be explained by the variation of x according to the estimated model:
R² = SSE/SST = (SST − SSR)/SST = 1 − SSR/SST
▶ The higher the R², the closer the model is to the data, and since 0 ≤ SSR ≤ SST we know that 0 ≤ R² ≤ 1.
▶ It does not measure:
  ▶ how linear/tight the relation between x and y is (correlation)
  ▶ the inclination of the estimated line (slope coefficient)
  ▶ the strength of the causal relation between x and y
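A simulated illustration of the second caveat: rescaling x changes the slope by a factor of 100 but leaves R² unchanged.

```python
import numpy as np

def r_squared(x, y):
    # Simple-regression R^2 = 1 - SSR/SST, using the slide's OLS formulas
    beta1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
    beta0 = y.mean() - beta1 * x.mean()
    y_hat = beta0 + beta1 * x
    return 1 - np.sum((y - y_hat)**2) / np.sum((y - y.mean())**2)

rng = np.random.default_rng(7)
x = rng.uniform(0, 20, 100)
y = 105 + 6.4 * x + rng.normal(scale=8, size=100)

# Identical R^2 even though the fitted slope differs by a factor of 100
print(r_squared(x, y), r_squared(x / 100, y))
```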
Examples
Example: height and weight, Computing OLS
Adding more regressors
▶ R² never decreases when regressors are added; the adjusted R̄² corrects for the number of estimated coefficients K:
R̄² = 1 − (SSR/(n − K)) / (SST/(n − 1))
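A sketch comparing R² and R̄² when an irrelevant regressor is added (simulated data; K here counts all estimated coefficients, including the intercept):

```python
import numpy as np

def fit_r2(X, y):
    # OLS fit, then R^2 and adjusted R-bar^2 with K = number of coefficients
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    SSR = np.sum((y - X @ beta)**2)
    SST = np.sum((y - y.mean())**2)
    n, K = X.shape
    return 1 - SSR / SST, 1 - (SSR / (n - K)) / (SST / (n - 1))

rng = np.random.default_rng(8)
n = 60
x1 = rng.uniform(0, 20, n)
y = 105 + 6.4 * x1 + rng.normal(scale=8, size=n)

X1 = np.column_stack([np.ones(n), x1])
X2 = np.column_stack([X1, rng.normal(size=n)])   # add an irrelevant regressor
print(fit_r2(X1, y))
print(fit_r2(X2, y))   # R^2 weakly rises; R-bar^2 penalizes the extra coefficient
```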