Lecture 2
Jürgen Meinecke
Lecture 2 of 12
Research School of Economics, Australian National University
1 / 42
Roadmap
2 / 42
Given a bunch of random variables $X_1, \dots, X_K, Y$, we wanted to
express $Y$ as a linear combination of $X_1, \dots, X_K$
A fancy way of saying the same thing:
We want to project $Y$ onto the subspace spanned by $X_1, \dots, X_K$
That projection is labeled $P_{\mathrm{sp}(X_1,\dots,X_K)} Y$ or $\hat{Y}$
Instead of $P_{\mathrm{sp}(X_1,\dots,X_K)}$, we may simply write $P_X$,
where $X := (X_1, \dots, X_K)'$
(Aside: the $X_i$ can enter non-linearly, for example $X_2 := X_1^2$)
3 / 42
Viewing $X_1, \dots, X_K, Y$ as elements of a Hilbert space, we
learned the generic characterization using the inner product:
Using the orthonormal basis $\tilde{X}_1, \dots, \tilde{X}_K$
(such that $\mathrm{sp}(\tilde{X}_1, \dots, \tilde{X}_K) = \mathrm{sp}(X)$)
$$\hat{Y} = P_X Y = \sum_{i=1}^{K} \langle \tilde{X}_i, Y \rangle \tilde{X}_i = \sum_{i=1}^{K} E(\tilde{X}_i \cdot Y)\, \tilde{X}_i = \sum_{i=1}^{K} \beta_i^* X_i$$
4 / 42
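To make the orthonormal-basis formula concrete, here is a minimal numerical sketch of my own (not from the slides; all names and numbers are illustrative). It simulates a large sample, uses the sample average as a stand-in for the inner product $\langle A, B \rangle = E(AB)$, builds an orthonormal basis by Gram–Schmidt, and checks that $\sum_i \langle \tilde{X}_i, Y \rangle \tilde{X}_i$ coincides with $X'\beta^*$ where $\beta^* = (E(XX'))^{-1} E(XY)$.

import numpy as np

rng = np.random.default_rng(0)
N, K = 200_000, 3

# simulate regressors (X1 is a constant) and an outcome Y
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
Y = 1.0 + 0.5 * X[:, 1] - 2.0 * X[:, 2] + rng.normal(size=N)

def inner(a, b):
    return np.mean(a * b)          # sample analogue of <A, B> = E(AB)

# Gram-Schmidt: orthonormal basis spanning the same space as X1, ..., XK
Xtilde = []
for k in range(K):
    v = X[:, k].copy()
    for q in Xtilde:
        v -= inner(q, v) * q       # remove the part already spanned
    Xtilde.append(v / np.sqrt(inner(v, v)))

# projection via the orthonormal basis: sum_i <Xtilde_i, Y> Xtilde_i
Y_hat_basis = sum(inner(q, Y) * q for q in Xtilde)

# projection via beta* = (E(XX'))^{-1} E(XY), expectations -> sample means
beta_star = np.linalg.solve(X.T @ X / N, X.T @ Y / N)
Y_hat_beta = X @ beta_star

print(np.max(np.abs(Y_hat_basis - Y_hat_beta)))   # ~0 up to floating point

Both routes compute the same orthogonal projection, so the difference is numerical noise only.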
For general $K$, we use matrices to express $\beta^* := (\beta_1^*, \dots, \beta_K^*)'$
Let $X := (X_1, X_2, \dots, X_K)'$ be a $K \times 1$ vector
$$\hat{Y} = P_X Y = \sum_{i=1}^{K} \beta_i^* X_i =: X'\beta^*,$$
where $\beta^* := \left(E(XX')\right)^{-1} E(XY)$ is a $K \times 1$ vector
Linear algebra detour: for generic column vectors
$$x := \begin{pmatrix} x_1 \\ \vdots \\ x_K \end{pmatrix}, \quad y := \begin{pmatrix} y_1 \\ \vdots \\ y_K \end{pmatrix}, \qquad \sum_{i=1}^{K} x_i y_i = x'y = y'x,$$
5 / 42
When X1 = 1, β∗ can be expressed via covariances
Corollary
When $X = (1, X_2, \dots, X_K)'$, then the projection coefficients are
$$(\beta_2^*, \dots, \beta_K^*)' = \Sigma_{XX}^{-1} \Sigma_{XY}$$
$$\beta_1^* = EY - \beta_2^* EX_2 - \dots - \beta_K^* EX_K,$$
where
$$\Sigma_{XX} := \begin{pmatrix} \sigma_2^2 & \sigma_{23} & \dots & \sigma_{2K} \\ \sigma_{32} & \sigma_3^2 & \dots & \sigma_{3K} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{K2} & \sigma_{K3} & \dots & \sigma_K^2 \end{pmatrix}, \qquad \Sigma_{XY} := \begin{pmatrix} \mathrm{Cov}(X_2, Y) \\ \mathrm{Cov}(X_3, Y) \\ \vdots \\ \mathrm{Cov}(X_K, Y) \end{pmatrix}$$
6 / 42
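A quick numerical check of the corollary, as a sketch of my own (simulated data, illustrative coefficients): the slope coefficients from the covariance formula should match the ones from $\beta^* = (E(XX'))^{-1} E(XY)$, with expectations replaced by sample averages.

import numpy as np

rng = np.random.default_rng(1)
N = 500_000

X2 = rng.normal(size=N)
X3 = 0.6 * X2 + rng.normal(size=N)            # correlated regressors
Y = 2.0 + 1.5 * X2 - 0.7 * X3 + rng.normal(size=N)
X = np.column_stack([np.ones(N), X2, X3])     # X1 = 1 (intercept)

# direct formula: beta* = (E(XX'))^{-1} E(XY), sample analogues
beta_direct = np.linalg.solve(X.T @ X / N, X.T @ Y / N)

# covariance formula for the slopes, then back out the intercept
Sigma_XX = np.cov(np.column_stack([X2, X3]), rowvar=False)
Sigma_XY = np.array([np.cov(X2, Y)[0, 1], np.cov(X3, Y)[0, 1]])
slopes = np.linalg.solve(Sigma_XX, Sigma_XY)
intercept = Y.mean() - slopes @ np.array([X2.mean(), X3.mean()])

print(beta_direct)            # approx [2.0, 1.5, -0.7]
print(intercept, slopes)      # same numbers (up to the 1/(N-1) covariance divisor)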
Proof that the projection error is orthogonal to the space spanned by $X$:
$$\begin{aligned} E(X(Y - P_X Y)) &= E\left(X(Y - X' \cdot E(XX')^{-1} E(XY))\right) \\ &= E\left(XY - XX' \cdot E(XX')^{-1} E(XY)\right) \\ &= E(XY) - E(XX') E(XX')^{-1} E(XY) \\ &= 0 \end{aligned}$$
Defining $u := Y - P_X Y$, we can therefore write $Y = X'\beta^* + u$ where $E(Xu) = 0$
An undergraduate course in econometrics (or regression
analysis) typically starts with this linear model
7 / 42
Using the linear projection representation
$$Y = X'\beta^* + u,$$
where $E(uX) = 0$ and $\beta^* = \left(E(XX')\right)^{-1} E(XY)$, and where
(i) $X_1, \dots, X_K, Y \in L^2$
(ii) $E(XX') > 0$ (positive definite)
(aka, no perfect multicollinearity)
Once you learn that $E(Xu) = 0$ you know that $\beta^*$ must be the
projection coefficient
You have learned that it exists and is unique
It is important to understand that the definition of the linear
projection model is not restrictive
In particular, $E(uX) = 0$ is not an assumption, it is definitional
To drive home this point, suppose I claim
$$Y = X'\theta + w$$
9 / 42
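To illustrate why $E(uX) = 0$ is definitional rather than restrictive, here is a small simulation of my own (illustrative names and numbers): the true relationship between $Y$ and $X_2$ is deliberately nonlinear, yet once $u := Y - X'\beta^*$ is formed with $\beta^* = (E(XX'))^{-1} E(XY)$ (sample analogues below), the projection error is uncorrelated with every regressor by construction.

import numpy as np

rng = np.random.default_rng(2)
N = 1_000_000

X2 = rng.uniform(-2, 2, size=N)
Y = np.exp(X2) + rng.normal(size=N)        # true relationship is nonlinear
X = np.column_stack([np.ones(N), X2])      # we still project on (1, X2) only

# beta* = (E(XX'))^{-1} E(XY), expectations replaced by sample means
beta_star = np.linalg.solve(X.T @ X / N, X.T @ Y / N)
u = Y - X @ beta_star                      # projection error, by definition

print(beta_star)       # best linear approximation to exp(X2)
print(X.T @ u / N)     # E(Xu) ~ 0 for every regressor, by construction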
We accept and understand now that the unique projection
coefficient exists
Let’s say we’re interested in knowing the value of β∗
We just learned that $(\beta_2^*, \dots, \beta_K^*)' = \Sigma_{XX}^{-1} \Sigma_{XY}$
Do we know the objects on the rhs?
These are population variances and covariances
We don’t know these, therefore we don’t know β∗
How else could we quantify β∗ ?
10 / 42
Roadmap
11 / 42
Let’s indulge ourselves and take a short detour to think about
estimation in an abstract way
This subsection is based on Stachurski, A Primer in Econometric
Theory, chapters 8.1 and 8.2
We’re dealing with a random variable Z with distribution P
We’re interested in a feature of P
Definition (Feature)
Let $Z \in L^2$ and $P \in \mathcal{P}$ where $\mathcal{P}$ is a class of distributions on $Z$.
A feature of $P$ is an object of the form
$$\gamma(P) \quad \text{for some } \gamma : \mathcal{P} \to S$$
12 / 42
For some reason we are interested in γ(P)
If we knew P then we may be able to derive γ(P)
Example: $P$ is standard normal and $\gamma(P) = \int z \, P(dz) = 0$
(mean of the standard normal distribution)
But we typically don’t know P
If all we’re interested in is γ(P) then we may not need to know
P (unless the feature we’re interested in is P itself)
Instead, we use a random sample to make an inference about a
feature of P
13 / 42
Definition (Random Sample)
The random variables Z1 , . . . , ZN are called a random sample
of size N from the population P if Z1 , . . . , ZN are mutually
independent and all have probability distribution P.
14 / 42
Definition (Statistic)
A statistic is any function $g : \times_{i=1}^{N} \mathbb{R}^K \to S$ that maps the
sample $\{Z_1, \dots, Z_N\}$ into $S$.
15 / 42
A statistic becomes an estimator when linked to a feature γ(P)
Definition (Estimator)
An estimator γ̂ is a statistic used to infer some feature γ(P) of
an unknown distribution P.
16 / 42
Earlier example: P is the standard normal distribution
(but let’s pretend we don’t know this, as is usually the case)
So Z ∼ N (0, 1)
And we're interested in $EZ$ so we set $\gamma(P) = EZ = \int z \, P(dz)$
We have available a random sample {Z1 , . . . , ZN }
Each Zi ∼ N (0, 1), but we don’t know this
But we do know: all Zi are iid
So they must all have the same mean EZi
What would be an estimator for EZ?
Aside: there are infinitely many
What would be a good estimator for EZi ?
(perhaps not so many anymore)
17 / 42
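As a small illustration of my own (not from the slides), here are a few of the infinitely many statistics one could offer as estimators for $EZ$ on one simulated sample; the next subsection gives a principled way to pick one.

import numpy as np

rng = np.random.default_rng(3)
Z = rng.normal(size=500)            # pretend we don't know the distribution

# all of these are statistics; all could be offered as estimators for EZ
candidates = {
    "sample mean": Z.mean(),
    "sample median": np.median(Z),
    "first observation": Z[0],
    "mean of first 10": Z[:10].mean(),
}
for name, value in candidates.items():
    print(f"{name:18s} {value: .3f}")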
Analogy Principle
18 / 42
Definition (Empirical Distribution)
The empirical distribution PN of the sample {Z1 , . . . , ZN } is
the discrete distribution that puts equal probability 1/N on
each sample point Zi , i = 1, . . . , N.
19 / 42
We wanted to estimate $\gamma(P) := \int z \, P(dz)$
According to the analogy principle, we should use $\int z \, P_N(dz)$
By definition, the empirical distribution is discrete, therefore
$$\int z \, P_N(dz) = \sum_{i=1}^{N} Z_i / N =: \bar{Z}_N$$
20 / 42
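As a plug-in sketch of my own (illustrative distribution and feature): the empirical distribution puts mass $1/N$ on each observation, so any feature written as an integral against $P$ is estimated by the same integral against $P_N$, which is simply a sample average.

import numpy as np

rng = np.random.default_rng(4)
Z = rng.exponential(scale=2.0, size=10_000)   # pretend P is unknown; true EZ = 2

# integrating z against P_N: every sample point carries probability 1/N
plugin_mean = np.sum(Z * (1 / len(Z)))        # identical to Z.mean()

# the analogy principle applies to other features too, e.g. P(Z <= 1)
plugin_cdf_at_1 = np.mean(Z <= 1.0)           # empirical CDF evaluated at 1

print(plugin_mean, plugin_cdf_at_1)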
Roadmap
21 / 42
Recall the linear projection representation
$$Y = X'\beta^* + u,$$
22 / 42
Given the random sample $(X_i, Y_i)$, $i = 1, \dots, N$, we can write the
linear projection representation as
$$Y_i = X_i'\beta^* + u_i,$$
23 / 42
Combining findings from last lecture and assignment 1:
$$\beta^* = \operatorname*{argmin}_{b \in \mathbb{R}^K} E\left(Y - X'b\right)^2 = \operatorname*{argmin}_{b \in \mathbb{R}^K} E\left(Y_i - X_i'b\right)^2 \qquad (1)$$
24 / 42
If we define β∗ like so:
$$\beta^* := \operatorname*{argmin}_{b \in \mathbb{R}^K} E\left(Y_i - X_i'b\right)^2,$$
then the analogy principle suggests the estimator that minimizes the sample analogue
$$\hat{\beta}_{OLS} := \operatorname*{argmin}_{b \in \mathbb{R}^K} \frac{1}{N} \sum_{i=1}^{N} \left(Y_i - X_i'b\right)^2$$
25 / 42
When you solve this you get
$$\hat{\beta}_{OLS} = \left(\frac{1}{N} \sum_{i=1}^{N} X_i X_i'\right)^{-1} \left(\frac{1}{N} \sum_{i=1}^{N} X_i Y_i\right)$$
26 / 42
The second way of defining an estimator for $\beta^*$ is via
$$\beta^* = \left(E(X_i X_i')\right)^{-1} E(X_i Y_i)$$
Replacing the population expectations with their sample averages again yields $\hat{\beta}_{OLS}$
28 / 42
The OLS estimator does have a compact matrix representation
Let $X := (X_1, X_2, \dots, X_N)'$ be the $N \times K$ matrix collecting all $X_i$
Let $Y := (Y_1, Y_2, \dots, Y_N)'$ be the $N \times 1$ vector collecting all $Y_i$
Then $\sum_i X_i X_i' = X'X$ and $\sum_i X_i Y_i = X'Y$
The well-known matrix representation of $\hat{\beta}_{OLS}$ follows:
$$\hat{\beta}_{OLS} = (X'X)^{-1} X'Y$$
29 / 42
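A minimal sketch of my own (simulated data, illustrative coefficients) confirming that the moment form and the matrix form of $\hat{\beta}_{OLS}$ are the same calculation, and that both agree with a standard least-squares routine.

import numpy as np

rng = np.random.default_rng(5)
N, K = 1_000, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
Y = X @ np.array([1.0, 0.5, -2.0]) + rng.normal(size=N)

# moment form: (1/N sum X_i X_i')^{-1} (1/N sum X_i Y_i)
Sxx = sum(np.outer(x, x) for x in X) / N
Sxy = sum(x * y for x, y in zip(X, Y)) / N
beta_moment = np.linalg.solve(Sxx, Sxy)

# matrix form: (X'X)^{-1} X'Y
beta_matrix = np.linalg.solve(X.T @ X, X.T @ Y)

# library routine for comparison
beta_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)

print(beta_moment, beta_matrix, beta_lstsq, sep="\n")   # all three agree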
Now let’s turn to the question: How good is β̂OLS ?
What is goodness?
In the next few weeks we’ll consider things such as
• bias
• variance (small sample and large sample)
• consistency
• distribution (large sample)
30 / 42
Roadmap
31 / 42
Definition (Convergence in Probability)
A sequence of random variables Z1 , Z2 , . . . converges in
probability to a random variable $Z$ if for all $\varepsilon > 0$,
$$\lim_{N \to \infty} P(|Z_N - Z| > \varepsilon) = 0.$$
We write $Z_N \xrightarrow{p} Z$ and say that $Z$ is the probability limit (plim)
of $Z_N$.
32 / 42
Definition (Bounded in Probability)
A sequence of random variables Z1 , Z2 , . . . is bounded in
probability if for all $\varepsilon > 0$, there exists $b_\varepsilon \in \mathbb{R}$ and an integer
$N_\varepsilon$ such that
$$P(|Z_N| \geq b_\varepsilon) < \varepsilon \quad \text{for all } N \geq N_\varepsilon$$
We write $Z_N = O_p(1)$.
33 / 42
Lemma
If $Z_N = c + o_p(1)$ for some $c \in \mathbb{R}$, then $Z_N = O_p(1)$.
Proposition
Let $W_N = o_p(1)$, $X_N = o_p(1)$, $Y_N = O_p(1)$, and $Z_N = O_p(1)$.
Then $W_N + X_N = o_p(1)$, $Y_N + Z_N = O_p(1)$, $Y_N Z_N = O_p(1)$, and $W_N Y_N = o_p(1)$.
34 / 42
We’ve got a few more tricks up our sleeves
Theorem (Slutsky Theorem)
If $Z_N = c + o_p(1)$ and $g(\cdot)$ is continuous at $c$ then
$g(Z_N) = g(c) + o_p(1)$.
35 / 42
Theorem (Weak Law of Large Numbers (WLLN))
Let Z1 , Z2 , . . . be independent and identically distributed random
variables with $EZ_i = \mu_Z$ and $\mathrm{Var}\, Z_i = \sigma_Z^2 < \infty$.
Define $\bar{Z}_N := \sum_{i=1}^{N} Z_i / N$. Then
$$\bar{Z}_N - \mu_Z \xrightarrow{p} 0.$$
In words:
sample mean converges in probability to population mean
Proving the WLLN is easy, using Chebyshev’s inequality
36 / 42
Lemma (Chebyshev's Inequality)
Let $Z$ be a random variable with $EZ^2 < \infty$ and let $g(\cdot)$ be a
nonnegative function. Then for any $c > 0$
$$P\left(g(Z) \geq c\right) \leq \frac{E(g(Z))}{c}.$$
Apply the lemma to $\bar{Z}_N$ with $g(z) = (z - \mu_Z)^2$ and $c = \varepsilon^2$:
$$P\left(|\bar{Z}_N - \mu_Z| > \varepsilon\right) \leq \frac{E(\bar{Z}_N - \mu_Z)^2}{\varepsilon^2} = \frac{\mathrm{Var}\, \bar{Z}_N}{\varepsilon^2} = \frac{\sigma_Z^2}{N \varepsilon^2},$$
which converges to zero as $N \to \infty$
We have used the fact $\mathrm{Var}\, \bar{Z}_N = \sigma_Z^2 / N$
37 / 42
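A Monte Carlo sketch of my own (illustrative parameter values): it approximates $P(|\bar{Z}_N - \mu_Z| > \varepsilon)$ by simulation for growing $N$ and compares it with the Chebyshev bound $\sigma_Z^2 / (N \varepsilon^2)$ from the proof; the bound is crude (it can exceed 1) but both go to zero.

import numpy as np

rng = np.random.default_rng(6)
mu, sigma, eps, reps = 1.0, 2.0, 0.1, 1_000

for N in [10, 100, 1_000, 10_000]:
    Z = rng.normal(mu, sigma, size=(reps, N))
    Zbar = Z.mean(axis=1)
    prob = np.mean(np.abs(Zbar - mu) > eps)    # Monte Carlo estimate
    bound = sigma**2 / (N * eps**2)            # Chebyshev bound from the proof
    print(f"N={N:6d}  P(|Zbar - mu| > eps) ~ {prob:.4f}   bound = {bound:8.3f}")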
This takes us back to the analogy principle
Remember earlier:
We wanted to estimate the feature $\gamma(P) := EZ = \int z \, P(dz)$
According to the analogy principle, we should use $\int z \, P_N(dz)$
This led to the estimator $\hat{\gamma} = \sum_{i=1}^{N} Z_i / N$
Immediately by the WLLN: $\hat{\gamma} \xrightarrow{p} \gamma(P)$
Definition (Consistency of an Estimator)
An estimator $\hat{\gamma}$ for $\gamma := \gamma(P)$ is called consistent if $\hat{\gamma} \xrightarrow{p} \gamma$.
38 / 42
Roadmap
39 / 42
Let’s first show that the OLS estimator is consistent
Recall the result $\hat{\beta}_{OLS} := \left(\sum_{i=1}^{N} X_i X_i'\right)^{-1} \sum_{i=1}^{N} X_i Y_i$
Using $Y_i = X_i'\beta^* + u_i$,
$$\hat{\beta}_{OLS} = \beta^* + \left(\frac{1}{N} \sum_{i=1}^{N} X_i X_i'\right)^{-1} \left(\frac{1}{N} \sum_{i=1}^{N} X_i u_i\right).$$
By the WLLN
$$\frac{1}{N} \sum_{i=1}^{N} X_i X_i' = E(X_i X_i') + o_p(1)$$
40 / 42
For the other factor on the rhs:
$$\frac{1}{N} \sum_{i=1}^{N} X_i u_i = E(X_i u_i) + o_p(1) = 0 + o_p(1) = o_p(1)$$
By the Slutsky Theorem (matrix inversion is continuous at $E(X_i X_i')$, which is invertible),
$$\left(\frac{1}{N} \sum_{i=1}^{N} X_i X_i'\right)^{-1} = \left(E(X_i X_i')\right)^{-1} + o_p(1) = O_p(1)$$
It follows
$$\hat{\beta}_{OLS} = \beta^* + O_p(1) \cdot o_p(1) = \beta^* + o_p(1)$$
41 / 42
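A simulation sketch of my own (illustrative coefficients and error distribution) showing the consistency result in action: $\hat{\beta}_{OLS}$ gets closer to $\beta^*$ as the sample size grows.

import numpy as np

rng = np.random.default_rng(7)
beta_star = np.array([1.0, 0.5, -2.0])

def ols(N):
    X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
    u = rng.normal(size=N)                  # E(X_i u_i) = 0 by construction
    Y = X @ beta_star + u
    return np.linalg.solve(X.T @ X, X.T @ Y)

for N in [50, 500, 5_000, 50_000]:
    err = np.max(np.abs(ols(N) - beta_star))
    print(f"N={N:6d}  max |beta_hat - beta*| = {err:.4f}")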
But what is the distribution of β̂OLS ?
• that’s a tricky one
• $\hat{\beta}_{OLS} = \beta^* + (X'X)^{-1} X'u$, what's the distribution of the
second term on the rhs?
• short answer: we have no idea
• there’s some suspicion that β̂OLS may have an exact normal
distribution if u is normally distributed
• but we don’t know what the distribution of u is
42 / 42