
Today's Lesson 1

Basic Econometrics (1): Ordinary Least Squares
I Representations of Classical Linear Regression Equations and the Classical Assumptions
I.1 Four Representations of Classical Linear Regression Equations
There are four conventional ways to write the classical linear regression equation, as follows.
(i) Scalar form
$$y_t = \beta_1 + x_{2t}\beta_2 + x_{3t}\beta_3 + \varepsilon_t, \qquad \varepsilon_t \sim iid(0, \sigma^2)$$
(ii) Vector form for each observation
$$y_t = \begin{pmatrix} 1 & x_{2t} & x_{3t} \end{pmatrix}\begin{pmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \end{pmatrix} + \varepsilon_t = x_t'\beta + \varepsilon_t, \qquad \varepsilon_t \sim iid(0, \sigma^2)$$
where $x_t = \begin{pmatrix} 1 & x_{2t} & x_{3t} \end{pmatrix}'$ and $\beta = \begin{pmatrix} \beta_1 & \beta_2 & \beta_3 \end{pmatrix}'$.
(iii) Vector form for each variable
$$\underbrace{\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_T \end{pmatrix}}_{Y} = \underbrace{\begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}}_{X_1}\beta_1 + \underbrace{\begin{pmatrix} x_{21} \\ x_{22} \\ \vdots \\ x_{2T} \end{pmatrix}}_{X_2}\beta_2 + \underbrace{\begin{pmatrix} x_{31} \\ x_{32} \\ \vdots \\ x_{3T} \end{pmatrix}}_{X_3}\beta_3 + \underbrace{\begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_T \end{pmatrix}}_{\varepsilon}, \qquad \varepsilon \sim (0, \sigma^2 I_T)$$
(iv) Matrix form
$$\underbrace{\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_T \end{pmatrix}}_{Y} = \underbrace{\begin{pmatrix} 1 & x_{21} & x_{31} \\ 1 & x_{22} & x_{32} \\ \vdots & \vdots & \vdots \\ 1 & x_{2T} & x_{3T} \end{pmatrix}}_{X}\underbrace{\begin{pmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \end{pmatrix}}_{\beta} + \underbrace{\begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_T \end{pmatrix}}_{\varepsilon}$$
that is, $Y = X\beta + \varepsilon$.
I.2 The Classical Assumptions
The classical assumptions are assumptions about the explanatory variables and the
stochastic error terms.
A.1 $X$ is strictly exogenous and has full column rank (no multicollinearity).
A.2 The disturbances are mutually independent and the variance is constant at each sample point, which can be combined in the single statement $\varepsilon \mid X \sim (0, \sigma^2 I_T)$.
In addition to these, we make two further assumptions: (i) $X$ is nonstochastic, and (ii) the error terms are normally distributed, i.e. $\varepsilon \sim N(0, \sigma^2 I_T)$. These assumptions may look very restrictive, but the properties of the estimators derived under the stronger assumptions are extremely useful for understanding the large-sample properties of the estimators under the classical assumptions.
II OLS Estimator
The OLS estimator $\hat\beta$ minimizes the residual sum of squares $u'u$, where $u = Y - Xb$. Namely,
$$\hat\beta = \arg\min_{b}\; RSS = u'u.$$
Then,
$$RSS = u'u = (Y - Xb)'(Y - Xb) = Y'Y - b'X'Y - Y'Xb + b'X'Xb = Y'Y - 2b'X'Y + b'X'Xb,$$
since the transpose of a scalar is the scalar itself and thus $b'X'Y = Y'Xb$. The first-order conditions are
$$\frac{\partial RSS}{\partial b} = -2X'Y + 2X'Xb = 0, \qquad (X'X)b = X'Y,$$
which gives the OLS estimator
$$\hat\beta = (X'X)^{-1}X'Y.$$
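As a quick numerical illustration (not part of the original notes), the closed form $\hat\beta = (X'X)^{-1}X'Y$ can be checked against a library least-squares solver. The sketch below uses NumPy with simulated data; the names T, sigma, beta_true, X, Y are chosen for illustration only.

```python
import numpy as np

# Simulated data for the three-regressor model above:
# y_t = beta_1 + x_{2t} beta_2 + x_{3t} beta_3 + eps_t,  eps_t ~ iid N(0, sigma^2)
rng = np.random.default_rng(0)
T, sigma = 200, 1.5
beta_true = np.array([1.0, 2.0, -0.5])
X = np.column_stack([np.ones(T), rng.normal(size=T), rng.normal(size=T)])
eps = rng.normal(scale=sigma, size=T)
Y = X @ beta_true + eps

# OLS via the normal equations (X'X) b = X'Y; solving the linear system
# is numerically preferable to forming (X'X)^{-1} explicitly.
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# Cross-check against NumPy's least-squares solver.
beta_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)
assert np.allclose(beta_hat, beta_lstsq)
print(beta_hat)
```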
Remark 1 Let $M_X = I_T - X(X'X)^{-1}X'$. It can easily be seen that $M_X Y$ is the vector of residuals when $Y$ is regressed on $X$: $\hat\varepsilon = M_X Y = M_X(X\beta + \varepsilon) = M_X\varepsilon$. Moreover, $M_X$ is a symmetric ($M_X' = M_X$) and idempotent ($M_X M_X = M_X$) matrix with the properties that $M_X X = 0$ and, consequently, $M_X\hat\varepsilon = M_X Y = \hat\varepsilon$.
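As a quick numerical check (illustrative, not from the original notes), the stated properties of $M_X$ are easy to verify for any full-column-rank $X$:

```python
import numpy as np

rng = np.random.default_rng(2)
T = 50
X = np.column_stack([np.ones(T), rng.normal(size=T), rng.normal(size=T)])

# M_X = I_T - X(X'X)^{-1}X'
M = np.eye(T) - X @ np.linalg.inv(X.T @ X) @ X.T

assert np.allclose(M, M.T)       # symmetric: M_X' = M_X
assert np.allclose(M @ M, M)     # idempotent: M_X M_X = M_X
assert np.allclose(M @ X, 0.0)   # annihilates the regressors: M_X X = 0
```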
Remark 2 The trace of an $n \times n$ square matrix $G$, denoted by $tr(G)$, is defined to be the sum of the diagonal elements of $G$. Then, by definition, $tr(c) = c$ for any scalar $c$.
Remark 3 $tr(ABC) = tr(BCA) = tr(CAB)$; the trace is invariant under cyclic permutations.
Remark 4 (Estimator of $\sigma^2$) Letting $k$ denote the number of regressors, $\hat\sigma^2 = \hat\varepsilon'\hat\varepsilon/(T-k)$ is an unbiased estimator of $\sigma^2$ since
$$\begin{aligned}
E[\hat\varepsilon'\hat\varepsilon] &= E[\varepsilon' M_X' M_X \varepsilon] \\
&= E[\varepsilon' M_X \varepsilon] && \text{since } M_X \text{ is idempotent and symmetric} \\
&= E[tr(\varepsilon' M_X \varepsilon)] && \text{by Remark 2} \\
&= E[tr(\varepsilon\varepsilon' M_X)] && \text{by Remark 3} \\
&= \sigma^2\, tr(M_X) \\
&= \sigma^2\, tr(I_T) - \sigma^2\, tr\!\left(X(X'X)^{-1}X'\right) \\
&= \sigma^2 T - \sigma^2\, tr\!\left((X'X)^{-1}X'X\right) \\
&= \sigma^2 (T - k),
\end{aligned}$$
and hence we have
$$E\!\left[\frac{\hat\varepsilon'\hat\varepsilon}{T-k}\right] = \sigma^2.$$
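Continuing the simulated example above (a sketch, not part of the original notes), the unbiasedness of $\hat\sigma^2 = \hat\varepsilon'\hat\varepsilon/(T-k)$ can be illustrated by a small Monte Carlo: averaged over repeated samples, the estimate should be close to the true $\sigma^2$.

```python
import numpy as np

rng = np.random.default_rng(1)
T, k, sigma = 200, 3, 1.5
beta_true = np.array([1.0, 2.0, -0.5])
X = np.column_stack([np.ones(T), rng.normal(size=T), rng.normal(size=T)])

sigma2_hats = []
for _ in range(2000):
    eps = rng.normal(scale=sigma, size=T)
    Y = X @ beta_true + eps
    beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
    resid = Y - X @ beta_hat                      # eps_hat = M_X Y
    sigma2_hats.append(resid @ resid / (T - k))   # divide by T - k, not T

print(np.mean(sigma2_hats), sigma**2)  # the Monte Carlo mean should be close to sigma^2
```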
By making the additional assumption that $\varepsilon$ is normally distributed, we have
$$\hat\beta = (X'X)^{-1}X'Y = (X'X)^{-1}X'(X\beta + \varepsilon) = \beta + (X'X)^{-1}X'\varepsilon \sim N\!\left(\beta,\; \sigma^2 (X'X)^{-1}\right).$$
For each individual regression coefficient,
$$\hat\beta_i \sim N\!\left(\beta_i,\; \sigma^2 \left[(X'X)^{-1}\right]_{ii}\right),$$
where $\sigma^2 \left[(X'X)^{-1}\right]_{ii}$ is the $(i,i)$ element of $\sigma^2 (X'X)^{-1}$. In general, however, $\sigma^2$ is unknown, and thus the estimated variance of $\hat\beta$ is
$$\widehat{Var}\!\left(\hat\beta\right) = \hat\sigma^2 (X'X)^{-1}.$$
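A minimal sketch of these formulas in code (the function name ols_with_se is illustrative, not from the notes): it returns $\hat\beta$, the square roots of the diagonal elements of $\hat\sigma^2 (X'X)^{-1}$ as standard errors, and $\hat\sigma^2$ itself.

```python
import numpy as np

def ols_with_se(Y, X):
    """OLS coefficients, their standard errors, and the unbiased sigma^2 estimate."""
    T, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ Y
    resid = Y - X @ beta_hat
    sigma2_hat = resid @ resid / (T - k)
    cov_beta = sigma2_hat * XtX_inv      # estimated Var(beta_hat) = sigma2_hat (X'X)^{-1}
    se = np.sqrt(np.diag(cov_beta))      # square roots of the (i, i) elements
    return beta_hat, se, sigma2_hat
```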
II.1 Gauss-Markov Theorem
The OLS estimator is a linear combination of the dependent variable and hence a linear combination of the error terms. As seen above, it belongs to the class of linear unbiased estimators. Its distinguishing feature is that its sampling variance is the minimum that can be achieved by any linear unbiased estimator under the classical assumptions. The Gauss-Markov Theorem is the fundamental least-squares theorem. It states that, conditional on the classical assumptions, no other linear unbiased estimator of $\beta$ can have a smaller sampling variance than that of the least-squares estimator. The OLS estimator is therefore said to be BLUE (the best linear unbiased estimator).
III Measures of Goodness of Fit
The vector of the dependent variable $Y$ can be decomposed into the part explained by the regressors and the unexplained part:
$$Y = \hat Y + \hat\varepsilon \qquad \text{where } \hat Y = X\hat\beta.$$
Remark 5 $\hat Y$ and $\hat\varepsilon$ are orthogonal because
$$\hat Y'\hat\varepsilon = \hat\beta' X' \hat\varepsilon = \hat\beta' X' M_X \varepsilon = 0 \quad \text{since } X' M_X = 0.$$
Then it follows from Remark 5 that if $\bar Y$ is the sample mean of the dependent variable, then
$$\underbrace{Y'Y - T\bar Y^2}_{TSS} = \underbrace{\hat Y'\hat Y - T\bar Y^2}_{ESS} + \underbrace{\hat\varepsilon'\hat\varepsilon}_{RSS}.$$
The coefficient of multiple correlation $R$ is defined as the positive square root of
$$R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS}.$$
$R^2$ is the proportion of the total variation of $Y$ explained by the linear combination of the regressors. This value increases whenever an additional regressor is included, even if the added regressor is irrelevant to the dependent variable. The adjusted $R^2$, denoted by $\bar R^2$, may instead decrease with the addition of variables of low explanatory power:
$$\bar R^2 = 1 - \frac{RSS}{TSS}\cdot\frac{T-1}{T-k}.$$
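A minimal sketch of these fit measures in code, using the same quantities ($TSS = Y'Y - T\bar Y^2$, $RSS = \hat\varepsilon'\hat\varepsilon$, $T$ observations, $k$ regressors including the constant); the function name goodness_of_fit is illustrative.

```python
import numpy as np

def goodness_of_fit(Y, X, beta_hat):
    """R^2 and adjusted R^2 from TSS = Y'Y - T*Ybar^2 and RSS = eps_hat'eps_hat."""
    T, k = X.shape
    resid = Y - X @ beta_hat
    rss = resid @ resid
    tss = Y @ Y - T * Y.mean() ** 2
    r2 = 1.0 - rss / tss
    r2_adj = 1.0 - (rss / tss) * (T - 1) / (T - k)
    return r2, r2_adj
```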
Two frequently used criteria for comparing the fit of various specifications are the Schwarz criterion
$$SC = \ln\frac{\hat\varepsilon'\hat\varepsilon}{T} + \frac{k}{T}\ln T$$
and the Akaike information criterion
$$AIC = \ln\frac{\hat\varepsilon'\hat\varepsilon}{T} + \frac{2k}{T}.$$
It is important to note that both criteria favor a model with a smaller sum of squared residuals, but each adds a penalty for model complexity, measured by the number of regressors. Most statistical software packages report these criteria routinely.
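A sketch of the two criteria exactly in the per-observation log form written above (other texts and software packages may scale or define them differently); the function name info_criteria is illustrative.

```python
import numpy as np

def info_criteria(resid, k):
    """SC = ln(e'e/T) + (k/T) ln T and AIC = ln(e'e/T) + 2k/T, as written above."""
    T = resid.shape[0]
    rss = resid @ resid
    sc = np.log(rss / T) + (k / T) * np.log(T)
    aic = np.log(rss / T) + 2 * k / T
    return sc, aic
```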