3. OLS Estimation

Econometrics

José A. Ferreira Machado and João Valle e Azevedo
Nova School of Business and Economics
Fall Semester, 2024-2025

Contents: OLS estimates; The Gauss-Markov assumptions; Least squares estimation; Properties of OLS estimators; The Gauss-Markov theorem


The Least Squares Criterion

Given a model,

y = β0 + β1 x1 + β2 x2 + ... + βk xk + u

for instance,

Earnings_i = β0 + β1 Education_i + β2 Experience_i + β3 Tenure_i + u_i

and given data (y_i, x_i1, x_i2, ..., x_ik) for i = 1, ..., n, we want to find the best "guesses" of the unknown β's.

The Ordinary Least Squares (OLS) criterion
OLS fits a line through the sample points such that the Sum of Squared Residuals (SSR) is as small as possible (hence the term "least squares"):

SSR = ∑_{i=1}^{n} û_i² = ∑_{i=1}^{n} (y_i − ŷ_i)²

with

ŷ_i = β̂0 + β̂1 x_i1 + β̂2 x_i2 + ... + β̂k x_ik

(the fitted value).
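A minimal numerical sketch of the criterion, on simulated data (all names and numbers are illustrative): `np.linalg.lstsq` returns the coefficients that minimize the SSR, and any perturbed candidate line yields a larger SSR.

```python
import numpy as np

# Illustrative simulated data: n = 6 observations, one regressor (k = 1)
rng = np.random.default_rng(0)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = 2.0 + 0.5 * x + rng.normal(0.0, 0.3, size=6)

X = np.column_stack([np.ones_like(x), x])         # design matrix: intercept plus x
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares coefficients

def ssr(b):
    """Sum of Squared Residuals for candidate coefficients b."""
    u_hat = y - X @ b
    return float(u_hat @ u_hat)

ssr_ols = ssr(beta_hat)
ssr_other = ssr(beta_hat + np.array([0.1, -0.05]))  # any other line does worse
```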
Graph of fitted values and residuals (k = 1)

Figure: Sample regression line ŷ = β̂0 + β̂1 x, sample data points (x_i, y_i) and the associated estimated error terms, the residuals û_i.
Obtaining the OLS estimates

Minimize the Sum of Squared Residuals,

SSR = ∑_{i=1}^{n} û_i² = ∑_{i=1}^{n} (y_i − β̂0 − β̂1 x_i1 − β̂2 x_i2 − ... − β̂k x_ik)²

Take derivatives with respect to each β̂_j and equate them to zero (why?):

∑_{i=1}^{n} (y_i − β̂0 − β̂1 x_i1 − ... − β̂k x_ik) = ∑_{i=1}^{n} û_i = 0
∑_{i=1}^{n} x_i1 (y_i − β̂0 − β̂1 x_i1 − ... − β̂k x_ik) = ∑_{i=1}^{n} x_i1 û_i = 0
∑_{i=1}^{n} x_i2 (y_i − β̂0 − β̂1 x_i1 − ... − β̂k x_ik) = ∑_{i=1}^{n} x_i2 û_i = 0
...
∑_{i=1}^{n} x_ik (y_i − β̂0 − β̂1 x_i1 − ... − β̂k x_ik) = ∑_{i=1}^{n} x_ik û_i = 0
Algebraic Properties

1. The sample average of the residuals is zero, so that ȳ = ŷ̄;
2. The sample covariance between each independent variable and the OLS residuals is zero; consequently, Cov(ŷ, û) = 0;
3. The point (x̄1, x̄2, ..., x̄k, ȳ) is always on the OLS regression line:

ȳ = β̂0 + β̂1 x̄1 + ... + β̂k x̄k
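These three properties follow from the first-order conditions and can be verified numerically on simulated data (a sketch; names and numbers are illustrative):

```python
import numpy as np

# Illustrative simulated sample with two regressors
rng = np.random.default_rng(1)
n = 200
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat
u_hat = y - y_hat

mean_resid = u_hat.mean()                        # property 1: ≈ 0, so ȳ = ŷ̄
cov_x1_u = np.cov(x1, u_hat, bias=True)[0, 1]    # property 2: ≈ 0
cov_yhat_u = np.cov(y_hat, u_hat, bias=True)[0, 1]
# property 3: the regression line evaluated at the regressor means gives ȳ
line_at_means = beta_hat @ np.array([1.0, x1.mean(), x2.mean()])
```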


Goodness-of-fit

Total Sum of Squares (SST) decomposition:

SST = ∑_{i=1}^{n} (y_i − ȳ)² = ∑_{i=1}^{n} û_i² + ∑_{i=1}^{n} (ŷ_i − ȳ)² = SSR + SSE

Coefficient of determination:

R² = SSE / SST = 1 − SSR / SST

Interpretation of R²
Proportion of the sample variation of the endogenous variable explained by the OLS regression line.

0 ≤ R² ≤ 1;
R² = corr²(y, ŷ);
R² can never decrease when another independent variable is added to a regression, and it usually increases.
Because R² usually increases with the number of independent variables, it is not a good measure for comparing models.
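A quick check on simulated data (illustrative sketch) that the decomposition holds and that the three expressions for R² agree:

```python
import numpy as np

# Illustrative simulated sample
rng = np.random.default_rng(2)
n = 100
x = rng.normal(size=n)
y = 3.0 + 1.0 * x + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat
u_hat = y - y_hat

sst = ((y - y.mean()) ** 2).sum()     # total sum of squares
ssr_ = (u_hat ** 2).sum()             # sum of squared residuals
sse = ((y_hat - y.mean()) ** 2).sum() # explained sum of squares

r2_a = sse / sst                          # R² = SSE/SST
r2_b = 1.0 - ssr_ / sst                   # R² = 1 − SSR/SST
r2_c = np.corrcoef(y, y_hat)[0, 1] ** 2   # R² = corr²(y, ŷ)
```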
Partialling-out interpretation of multiple regression

Let us focus on the case of two explanatory variables (besides the constant), so k = 2. We estimate

ŷ = β̂0 + β̂1 x1 + β̂2 x2

It can be shown that

β̂1 = (∑_{i=1}^{n} r̂_i1 y_i) / (∑_{i=1}^{n} r̂_i1²)

where r̂_i1 are the residuals obtained when we estimate the regression

x̂1 = γ̂0 + γ̂2 x2

Notice that β̂1 is the simple linear regression estimator, with r̂_i1 in place of the original regressor x1 (note that the average of the residuals is always 0).
Therefore, the estimated effect of x1 on y equals the (simple-regression) estimated effect of the "part" of x1 that is not explained by x2.
Partialling-out interpretation of multiple regression: Example

Let us estimate the effect of education on wages, taking into account also the effect of experience.

Start by regressing education on experience (even if it seems silly...), storing the residuals of this regression.

Independent Variable       Coefficient Estimate   Standard Error
Intercept                  110.916                0.075
Labor Market Experience    -0.114                 0.003
n = 11064; R² = 0.172

Figure: Dependent Variable: Education
Now regress wages on the previously stored residuals.

Independent Variable               Coefficient Estimate   Standard Error
Intercept                          657.893                3.524
Residual from the Education reg.   66.404                 0.866
n = 11064; R² = 0.347

Figure: Dependent Variable: Wages

What if we regress wages on education and experience directly?

Independent Variable                 Coefficient Estimate   Standard Error
Intercept                            -111.128               11.906
Education (in years)                 66.4041                0.863
Labor Market Experience (in years)   11.672                 0.301
n = 11064; R² = 0.351

Figure: Dependent Variable: Wages

The estimated coefficient of education is the same as the coefficient from the previous regression (that controls for experience)!
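The two-step procedure can be replicated on simulated data (a sketch; the coefficients below are hypothetical stand-ins, not the slides' real wage data). The coefficient from the residual regression matches the multiple-regression coefficient exactly.

```python
import numpy as np

# Hypothetical stand-in for the wage example (illustrative numbers)
rng = np.random.default_rng(3)
n = 500
exper = rng.uniform(0, 30, size=n)
educ = 14.0 - 0.1 * exper + rng.normal(size=n)   # education correlated with experience
wage = 100.0 + 60.0 * educ + 10.0 * exper + rng.normal(scale=50.0, size=n)

# Step 1: regress education on experience, keep the residuals r̂_i1
Z = np.column_stack([np.ones(n), exper])
gamma_hat, *_ = np.linalg.lstsq(Z, educ, rcond=None)
r1 = educ - Z @ gamma_hat

# Step 2: simple regression of wage on the residuals
beta1_fwl = (r1 @ wage) / (r1 @ r1)

# Direct multiple regression of wage on education and experience
X = np.column_stack([np.ones(n), educ, exper])
beta_hat, *_ = np.linalg.lstsq(X, wage, rcond=None)
beta1_direct = beta_hat[1]                        # same number as beta1_fwl
```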
Model in Matrix Form

To solve the messy first order conditions of the OLS estimates, it is easier to put the model in matrix form. The model for the n observations (n > k) of y and the regressors is

y = Xβ + u

where y = (y1, y2, ..., yn)′ and u = (u1, u2, ..., un)′ are n × 1 vectors, β = (β0, β1, ..., βk)′ is the (k+1) × 1 vector of parameters, and X is the n × (k+1) matrix whose i-th row is (1, x_i1, x_i2, ..., x_ik).
OLS estimators: general expression

Recall the first order conditions to minimize the SSR,

∑_{i=1}^{n} (y_i − β̂0 − β̂1 x_i1 − ... − β̂k x_ik) = ∑_{i=1}^{n} û_i = 0
∑_{i=1}^{n} x_i1 (y_i − β̂0 − β̂1 x_i1 − ... − β̂k x_ik) = ∑_{i=1}^{n} x_i1 û_i = 0
(...)
∑_{i=1}^{n} x_ik (y_i − β̂0 − β̂1 x_i1 − ... − β̂k x_ik) = ∑_{i=1}^{n} x_ik û_i = 0

In matrix form,

X′û = 0 ⇔ X′(y − Xβ̂) = 0

or

(X′X)β̂ = X′y

If (X′X)⁻¹ exists, we may therefore solve for the OLS estimator of β:

β̂ = (X′X)⁻¹X′y
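A sketch of this expression on simulated data (illustrative numbers); in practice one solves the normal equations with `np.linalg.solve` rather than forming the inverse explicitly, which is both faster and numerically safer.

```python
import numpy as np

# Illustrative simulated design with k = 3 regressors plus an intercept
rng = np.random.default_rng(4)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta_true = np.array([1.0, 0.5, -2.0, 0.0])
y = X @ beta_true + rng.normal(size=n)

# Solve the normal equations (X′X)β̂ = X′y directly
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Same answer via the textbook formula β̂ = (X′X)⁻¹X′y
beta_inv = np.linalg.inv(X.T @ X) @ X.T @ y
```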
Model Assumptions

Assumption MLR.1 (Linearity in Parameters)

y = β0 + β1 x1 + β2 x2 + ... + βk xk + u

Assumption MLR.2 (Random Sampling)
Random sample of size n, {(x_i1, x_i2, ..., x_ik, y_i): i = 1, 2, ..., n}, satisfying the equation in MLR.1.
Recall: random sample
{Z1, ..., Zn} is a random sample of size n from a population with CDF F_Z(z) if the r.v.'s Z_i are independent and have the same distribution F_Z(z). The Z_i's are then said to be independent and identically distributed (i.i.d.).

Examples
- Data on income and consumption of 1000 Portuguese families randomly selected from the IRS declarations.
- Sample of 50 students in the Econometrics class with an odd student ID number, observing (age, gender, GPA, grade in statistics).

An implication of random sampling
If {Z1, ..., Zn} is a random sample from a population with mean µ and variance σ², then

E(Z_i) = µ,  Var(Z_i) = σ²,  Cov(Z_i, Z_j) = 0 (i ≠ j)


Model Assumptions

Assumption MLR.3 (No Perfect Collinearity)
In the sample, none of the independent variables is constant, and there are no exact linear relationships among the independent variables.

This implies that X has full column rank (= k + 1; notice, n ≥ k + 1), so (X′X)⁻¹ exists and, thus, the OLS estimator of β exists.
None of the independent variables is a multiple of another, and none has perfect correlation with a linear combination of the others.
A violation would mean that some variable was redundant: we could not "identify" the parameters.
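A small illustration (hypothetical data) of what goes wrong under perfect collinearity: when one regressor is an exact linear function of another, X loses full column rank, (X′X) is singular, and β̂ = (X′X)⁻¹X′y does not exist.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 30
x1 = rng.normal(size=n)
x2 = 3.0 * x1 - 1.0                    # exact linear function of x1: perfect collinearity

X_bad = np.column_stack([np.ones(n), x1, x2])
rank_bad = np.linalg.matrix_rank(X_bad)    # below full column rank (k + 1 = 3)
det_bad = np.linalg.det(X_bad.T @ X_bad)   # (X′X) is singular: determinant ≈ 0

# Breaking the exact relation restores identification
x2_ok = x2 + rng.normal(scale=0.1, size=n)
X_ok = np.column_stack([np.ones(n), x1, x2_ok])
rank_ok = np.linalg.matrix_rank(X_ok)      # full column rank
```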
Model Assumptions

Assumption MLR.4 (Zero Conditional Mean)
The error u has an expected value of zero given any values of the independent variables:

E(u | x1, x2, ..., xk) = 0

(implying E(u) = 0).

Under MLR.4, the regressors are said to be strictly exogenous.
Desired Properties of Estimators

Recall: let θ̂ be an estimator of an unknown parameter θ.

Unbiased estimator
θ̂ is unbiased for θ if E(θ̂) = θ for all θ ∈ Θ.
The bias of an estimator is Bias(θ̂) = E(θ̂) − θ.

Efficiency
Let θ̂1 and θ̂2 be two estimators of θ. θ̂1 is more efficient than θ̂2 (better) if

E[(θ̂1 − θ)²] ≤ E[(θ̂2 − θ)²] for all θ ∈ Θ

E[(θ̂ − θ)²] is called the Mean Squared Error (MSE) of the estimator. (How does it compare with the variance?)
Unbiasedness of OLS

Theorem
Under assumptions MLR.1 to MLR.4, the OLS estimator is unbiased for β:

E(β̂_j) = β_j,  j = 0, 1, 2, ..., k

Proof:
Notice that y = Xβ + u. Then,

β̂ = (X′X)⁻¹X′y
  = (X′X)⁻¹X′(Xβ + u)
  = (X′X)⁻¹(X′X)β + (X′X)⁻¹X′u
  = β + (X′X)⁻¹X′u

So, conditional on X, and using the fact that E(u | x1, x2, ..., xk) = 0 together with the fact that the sample is random, we can conclude that

E(β̂ | X) = β + (X′X)⁻¹X′E(u | X)
          = β + (X′X)⁻¹X′ · 0
          = β
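Unbiasedness can be illustrated by simulation, holding X fixed (we condition on X) and redrawing u many times (a sketch with illustrative numbers): the average of β̂ across samples is close to the true β.

```python
import numpy as np

# Monte Carlo check of unbiasedness: average β̂ over many samples ≈ β
rng = np.random.default_rng(6)
n, reps = 40, 5000
beta_true = np.array([1.0, 2.0, -0.5])

x1 = rng.normal(size=n)                     # fixed design: we condition on X
x2 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])
A = np.linalg.inv(X.T @ X) @ X.T            # β̂ = A y

draws = np.empty((reps, 3))
for r in range(reps):
    u = rng.normal(size=n)                  # E(u | X) = 0 holds by construction
    y = X @ beta_true + u
    draws[r] = A @ y

beta_bar = draws.mean(axis=0)               # ≈ beta_true
```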

Unbiasedness of OLS: Irrelevant variables

Suppose we specify

y = β0 + β1 x1 + β2 x2 + u

satisfying MLR.1 to MLR.4. However, in the population β2 = 0; that is, after controlling for x1, x2 has no effect on y. In other words, the analyst has included an irrelevant exogenous variable and fits

ŷ = β̂0 + β̂1 x1 + β̂2 x2

Since E(β̂_j) = β_j for every value of the parameter, E(β̂1) = β1 and E(β̂2) = 0.
The unbiasedness of OLS is unaffected by including irrelevant variables.
Unbiasedness of OLS: Omitted Variables Bias

What if we exclude a variable from our specification that does belong? For instance, we want to estimate

Wage = β0 + β1 Education + β2 Ability + u

but, since Ability is not observed, we try to estimate the effect of education on wages (β̃1) by fitting

Wage = β0 + β1 Education + v

where v = β2 Ability + u.

E(Wage | Edu) = β0 + β1 Edu + E(v | Edu) = β0 + β1 Edu + β2 E(Ability | Edu)

E(β̃1) = ∂E(Wage | Edu)/∂Edu = β1 + β2 ∂E(Ability | Edu)/∂Edu

So:
- If β2 = 0 there is no bias (but then Ability was irrelevant);
- If Ability is uncorrelated with Education, there is no bias either;
- But if the omitted variable is correlated with the regressor, OLS will be biased.
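A simulation sketch of the bias formula (all coefficients below are hypothetical, not the slides' data): when the omitted variable has population slope δ1 on the included regressor, the short regression converges to β1 + β2 δ1 rather than β1.

```python
import numpy as np

# Omitted variable bias: E(β̃1) = β1 + β2 · δ1, where δ1 is the population
# slope from regressing the omitted variable on the included one
rng = np.random.default_rng(7)
n = 200_000
beta0, beta1, beta2 = 1.0, 0.8, 0.5          # illustrative "wage equation" coefficients

educ = rng.normal(12.0, 2.0, size=n)
ability = 0.3 * educ + rng.normal(size=n)    # correlated with education (δ1 = 0.3)
wage = beta0 + beta1 * educ + beta2 * ability + rng.normal(size=n)

# Short regression omitting ability
Xs = np.column_stack([np.ones(n), educ])
beta_tilde = np.linalg.lstsq(Xs, wage, rcond=None)[0][1]

expected = beta1 + beta2 * 0.3               # biased limit: 0.8 + 0.5 · 0.3 = 0.95
```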
Variance of the OLS Estimators

Assumption MLR.5 (Homoskedasticity)
The variance of the errors u_i is the same given any value of the regressors:

Var(u_i | x11, ..., xnk) = σ²

Figure: Heteroskedasticity (conditional densities f(y|x) around the population regression line E(y|x) = β0 + β1 x, with spread changing across x).
Parenthesis: Variance of a random vector

Definition
Let w = [w1 ... wp]′ be a vector of p random variables and µ_w the corresponding vector of means: µ_w = [E(w1) ... E(wp)]′. The variance-covariance matrix of w, V(w), is the p × p matrix defined by

V(w) = E[(w − µ_w)(w − µ_w)′]

That is, a matrix whose (i, j)-th element is given by

[V(w)]_ij = Var(w_i) if i = j,  Cov(w_i, w_j) if i ≠ j

Theorem
Let z = b + Aw, with A an m × p matrix of constants and b a conformable vector of constants. Then,

V(z) = A V(w) A′
Variance of the OLS Estimators

Theorem
Under assumptions MLR.1 to MLR.5,

Var(β̂ | X) = σ²(X′X)⁻¹

Var(β̂ | X) is the (k+1) × (k+1) matrix with Var(β̂_j | X) on the diagonal and Cov(β̂_i, β̂_j | X) = Cov(β̂_j, β̂_i | X) in the off-diagonal positions.

Derivation:

Var(β̂ | X) = Var[(X′X)⁻¹X′u | X]
            = (X′X)⁻¹X′ Var(u | X) X(X′X)⁻¹

(since X′X is symmetric, i.e. equal to its transpose, its inverse is also symmetric)

            = (X′X)⁻¹X′(σ²I_n)X(X′X)⁻¹

(since random sampling and homoskedasticity imply Var(u | X) = σ²I_n)

            = σ²(X′X)⁻¹X′X(X′X)⁻¹
            = σ²(X′X)⁻¹
Understanding the Variance of OLS estimators

Var(β̂_j) = σ² / (SST_j (1 − R_j²)),  j = 0, 1, ..., k

with SST_j = ∑_{i=1}^{n} (x_ij − x̄_j)² and R_j² the coefficient of determination from regressing x_j on all the other regressors.

So, what hides behind big variances?
- Strong linear relations among the independent variables are harmful: a larger R_j² implies a larger variance for the estimators (near multicollinearity). If we keep adding variables, R_j² will always increase; if it gets close to 1 we are in trouble...
- If the new "irrelevant" variables are uncorrelated with the already included regressors, then the variance remains unchanged.
- The error variance: a larger σ² implies a larger variance of the OLS estimators.
- The total sample variation: a larger SST_j implies a smaller variance for the estimators (SST_j increases with the sample size, so in large samples we should not be too worried).
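The scalar formula and the matrix formula are two views of the same object. A sketch verifying numerically that σ² / (SST_j (1 − R_j²)) matches the j-th diagonal entry of σ²(X′X)⁻¹ for the slope coefficients (the true σ² is taken as known here, purely for the check; all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(9)
n, sigma2 = 100, 1.5
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)           # deliberately correlated regressors
X = np.column_stack([np.ones(n), x1, x2])

var_matrix = sigma2 * np.linalg.inv(X.T @ X) # matrix formula σ²(X′X)⁻¹

def var_formula(j):
    """σ² / (SST_j (1 − R_j²)) for slope j, via the auxiliary regression."""
    xj = X[:, j]
    others = np.delete(X, j, axis=1)         # all other columns, incl. the constant
    fitted = others @ np.linalg.lstsq(others, xj, rcond=None)[0]
    sst_j = ((xj - xj.mean()) ** 2).sum()
    r2_j = ((fitted - xj.mean()) ** 2).sum() / sst_j
    return sigma2 / (sst_j * (1.0 - r2_j))

var_beta1 = var_formula(1)                   # equals var_matrix[1, 1]
var_beta2 = var_formula(2)                   # equals var_matrix[2, 2]
```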
Estimating the Error Variance

In practice, the error variance σ² is unknown; we must estimate it:

σ̂² = ∑_{i=1}^{n} û_i² / (n − k − 1) ≡ SSR / df

where df (degrees of freedom) is the number of observations minus the number of estimated parameters. One can show that this is an unbiased estimator of the error variance.

The standard error of the regression is given by σ̂ = √σ̂².
The OLS Estimator is BLUE

Theorem (Gauss-Markov)
Under assumptions MLR.1 through MLR.5, the OLS estimators are BLUE: the Best Linear Unbiased Estimator.

No other linear and unbiased estimator β̃ = A′y has a smaller variance than OLS. Here the variances are matrices: we are saying that

Var(β̃ | X) − Var(β̂ | X)

is a positive semi-definite matrix (which implies that each individual OLS parameter estimator has a smaller variance than any other linear unbiased estimator of that parameter).
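A numerical sketch of the matrix inequality: a weighted fit β̃ = (X′WX)⁻¹X′Wy with arbitrary positive weights W (a hypothetical alternative estimator, not from the slides) is linear in y and unbiased, yet under homoskedasticity its variance exceeds the OLS variance in the positive semi-definite sense.

```python
import numpy as np

rng = np.random.default_rng(10)
n, sigma2 = 50, 1.0
X = np.column_stack([np.ones(n), rng.normal(size=n)])
W = np.diag(rng.uniform(0.5, 2.0, size=n))   # arbitrary positive weights

var_ols = sigma2 * np.linalg.inv(X.T @ X)    # Var(β̂ | X) = σ²(X′X)⁻¹

G = np.linalg.inv(X.T @ W @ X) @ X.T @ W     # β̃ = G y, and G X = I (unbiased)
var_alt = sigma2 * G @ G.T                   # Var(β̃ | X) = σ² G G′

diff = var_alt - var_ols
eigenvalues = np.linalg.eigvalsh(diff)       # all ≥ 0: positive semi-definite
```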
