The model: Yi = β0 + β1·Xi + ui, i = 1, ..., n
• β1 = slope of the population regression line

The three OLS assumptions:
1. The error term has conditional mean zero given Xi: E(ui|Xi) = 0.
2. (Xi, Yi), i = 1, ..., n are independent identically distributed draws.
3. Large outliers are unlikely.

Estimator: the OLS estimator β̂1.

Sampling distribution of β̂1: Since the population regression line is E(Y|X) = β0 + β1·X and E(ui|Xi) = 0, we need to also assume that:
• (Xi, Yi), i = 1, ..., n are independent identically distributed draws.
• large outliers are unlikely.
These are our second and third OLS assumptions.
sampling distribution of β̂1

The estimates of β follow a probability distribution:
• an estimator is a formula (e.g. OLS) for how to compute β̂1
• we use a sample of data to obtain an estimate: a value of β̂1 computed by the formula for a given sample
• each different sample from the same population will produce a different estimate of β
• the probability distribution of these β̂ values across different samples is called the sampling distribution of β̂

An estimator β̂ is unbiased if its sampling distribution has as its expected value the true value of β: E(β̂) = β.

• we need to discuss the properties of this sampling distribution of β̂ based on the single sample that we have
• but remember that the sampling distribution refers to different values of β̂ across different samples, not just within one
• these β̂s are usually assumed to have a normal distribution because the error term is normally distributed

What do you think now about the differences between these models? Hint: Do our OLS assumptions hold in all these models?
Lecture 3
Part I: The Sampling Distribution of OLS Estimators

Outline of lecture 3, part 1:
• recap: sampling distribution of X̄
• the probability framework of linear regression
• sampling distribution of β̂1
Random variable X with unknown population mean μX, with X ~ N(μX, σ²) and probability density function

f(X) = (1 / (σ·√(2π))) · exp(−(1/2)·((X − μX)/σ)²)

[Figure: probability density function of X, horizontal axis marked from μ − 4σ to μ + 4σ]
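The density formula above is easy to verify numerically. A minimal sketch (the mean and standard deviation values are illustrative assumptions, not from the lecture) comparing the formula term by term with scipy's implementation:

```python
import numpy as np
from scipy.stats import norm

def normal_pdf(x, mu, sigma):
    """The density above: f(X) = 1/(sigma*sqrt(2*pi)) * exp(-((X - mu)/sigma)**2 / 2)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

# Illustrative values: mu_X = 10, sigma = 2 (assumed)
mu, sigma = 10.0, 2.0
x = np.linspace(mu - 4 * sigma, mu + 4 * sigma, 17)   # the axis range in the figure
max_diff = np.abs(normal_pdf(x, mu, sigma) - norm.pdf(x, loc=mu, scale=sigma)).max()
```

The agreement (max_diff essentially zero) confirms that the written formula matches the standard normal density.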
THE DOUBLE STRUCTURE OF A SAMPLED RANDOM VARIABLE

Random variable X with unknown population mean μX.

Sample of n observations X1, X2, ..., Xn: potential distributions.
Estimator: X̄ = (1/n)·(X1 + ... + Xn)

Actual sample of n observations x1, x2, ..., xn: realization.
Estimate: x̄ = (1/n)·(x1 + ... + xn)

The actual number that we obtain, given the realization {x1, …, xn}, is known as our estimate.

Sampling distribution of X̄
[Figure: simulated distribution of X̄ for n = 1, based on 10 million samples]
[Figure: simulated sampling distributions of X̄ for n = 1, 10, 25, 100 (10 million samples each); the distribution concentrates around μX as n increases]
Thus we have demonstrated that the variance of the sample mean is equal to the variance of
X divided by n, a result with which you will be familiar from your statistics course.
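This Var(X̄) = Var(X)/n result is straightforward to reproduce by simulation. A sketch with assumed population parameters (μX = 10, σ² = 4) and a modest replication count rather than the 10 million samples of the figures:

```python
import numpy as np

rng = np.random.default_rng(0)
mu_X, sigma2 = 10.0, 4.0      # assumed population mean and variance of X
n_reps = 50_000               # number of repeated samples

simulated_var = {}
for n in (1, 10, 25, 100):
    # each row is one sample of size n; one estimate X-bar per sample
    samples = rng.normal(mu_X, np.sqrt(sigma2), size=(n_reps, n))
    simulated_var[n] = samples.mean(axis=1).var()
    # simulated_var[n] should be close to sigma2 / n
```

For each n the simulated variance of X̄ sits close to σ²/n, shrinking by the same factor as the sample size grows.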
Probability framework for OLS and the sampling distribution of β̂1

Yi = β0 + β1·Xi + ui
Ȳ = β0 + β1·X̄ + ū

Therefore:

Yi − Ȳ = β1·(Xi − X̄) + (ui − ū)

Substituting into the OLS formula (all sums run over i = 1, ..., n):

β̂1 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²
   = Σ(Xi − X̄)·[β1·(Xi − X̄) + (ui − ū)] / Σ(Xi − X̄)²

so that

β̂1 − β1 = Σ(Xi − X̄)(ui − ū) / Σ(Xi − X̄)² = Σ(Xi − X̄)·ui / Σ(Xi − X̄)²

and the variance of β̂1 is approximately

σ²(β̂1) = σ²u / (n·σ²X)
Conclusion:
The sampling variability of the estimated regression coefficients will be:
• higher, the larger the variability of the unobserved factors (σ²u), and
• lower, the higher the variation in the explanatory variable X (σ²X).
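The variance formula can be checked by simulating the sampling distribution of β̂1 directly. All parameter values below are illustrative assumptions; with σ²u = 1, σ²X = 4 and n = 100 the formula predicts Var(β̂1) ≈ 1/400 = 0.0025:

```python
import numpy as np

rng = np.random.default_rng(1)
beta0, beta1 = 2.0, 0.5        # assumed true coefficients
sigma_u2, sigma_X2 = 1.0, 4.0  # assumed error and regressor variances
n, n_reps = 100, 50_000        # sample size and number of repeated samples

slopes = np.empty(n_reps)
for r in range(n_reps):
    X = rng.normal(0.0, np.sqrt(sigma_X2), size=n)
    u = rng.normal(0.0, np.sqrt(sigma_u2), size=n)
    Y = beta0 + beta1 * X + u
    # OLS slope: sum (Xi - Xbar)(Yi - Ybar) / sum (Xi - Xbar)^2
    dX = X - X.mean()
    slopes[r] = (dX * (Y - Y.mean())).sum() / (dX ** 2).sum()

mean_slope = slopes.mean()     # close to beta1 (unbiasedness)
var_slope = slopes.var()       # close to sigma_u2 / (n * sigma_X2)
```

The simulated variance also illustrates the conclusion above: raising σ²u inflates it, while raising σ²X or n shrinks it.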
What comes next: in the next part (Part III) we discuss properties of the variance of the regression coefficient (the variance of β̂1 and the variance of X).

Dragos Radu
[email protected]
TESTING A HYPOTHESIS RELATING TO THE POPULATION MEAN

Example
Null hypothesis: H0: μ = 10
Alternative hypothesis: H1: μ ≠ 10

[Figure: distribution of X̄ under H0, horizontal axis marked from 6 to 14]

Suppose that we have a sample of data for the example model and the sample mean X̄ is 9. Would this be evidence against the null hypothesis μ = 10?
[Figure: two-sided test at the 5% level, with rejection regions of 2.5% in each tail of the distribution of X̄]
For the population mean:
• Estimator: X̄
• Null hypothesis: H0: μ = μ0
• Alternative hypothesis: H1: μ ≠ μ0
• Test statistic: t = (X̄ − μ0) / s.e.(X̄)
• Reject H0 if t > tcrit or t < −tcrit
• Degrees of freedom: n − 1

For a regression coefficient:
• Estimator: β̂1 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²
• Null hypothesis: H0: β1 = β1⁰
• Alternative hypothesis: H1: β1 ≠ β1⁰
• Test statistic: t = (β̂1 − β1⁰) / s.e.(β̂1)
• Degrees of freedom: n − k = n − 2

There is one important difference. When locating the critical value of t, one must take account of the number of degrees of freedom. For random variable X, this is n − 1, where n is the number of observations in the sample. For regression it is n − k, with k the number of coefficients we estimate in the regression, in this case two: intercept and slope.

Example: p̂ = 1.21 + 0.82w, with standard errors (0.05) and (0.10).
Null hypothesis: H0: β1 = 1.0
Alternative hypothesis: H1: β1 ≠ 1.0

t = (β̂1 − β1⁰) / s.e.(β̂1) = (0.82 − 1.00) / 0.10 = −1.80

With n = 20 we have 18 degrees of freedom and tcrit,5% = 2.101. The critical value of t with 18 degrees of freedom is 2.101 at the 5% level. The absolute value of the t statistic is less than this, so we do not reject the null hypothesis.
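The arithmetic of this example, with the critical value taken from the t distribution rather than read from a table (the numbers are the ones quoted above):

```python
from scipy.stats import t

b1_hat, se_b1 = 0.82, 0.10   # estimated slope and its standard error
b1_null = 1.0                # H0: beta1 = 1.0
n, k = 20, 2                 # observations; estimated coefficients (intercept, slope)

t_stat = (b1_hat - b1_null) / se_b1   # -1.80
df = n - k                            # 18 degrees of freedom
t_crit = t.ppf(0.975, df)             # two-sided 5% critical value, about 2.101

reject_H0 = abs(t_stat) > t_crit      # False: we do not reject H0
```

Since |−1.80| < 2.101, the test does not reject H0: β1 = 1.0 at the 5% level.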
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

[Figure: two-sided test with 2.5% rejection regions beyond ±1.96 standard deviations; one-sided test with a single 5% rejection region beyond 1.65 standard deviations]

If you use a two-sided 5% significance test, your estimate must be 1.96 standard deviations above or below 0 if you are to reject H0. However, if you can justify the use of a one-sided test, for example with H1: β1 > 0, your estimate has to be only 1.65 standard deviations above 0.
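The 1.96 and 1.65 thresholds quoted here are the large-sample (standard normal) critical values for 5% two-sided and one-sided tests; a quick check:

```python
from scipy.stats import norm

alpha = 0.05
two_sided_crit = norm.ppf(1 - alpha / 2)   # 2.5% in each tail: about 1.96
one_sided_crit = norm.ppf(1 - alpha)       # 5% in one tail: about 1.645
```

In small samples one would use the t distribution with n − k degrees of freedom instead, as in the example above.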
[Figure: distribution of β̂1 centred on the true value b1, with the 5% one-sided rejection region beyond 1.65 standard deviations from 0]

Suppose that Y is genuinely determined by X and that the true (unknown) coefficient is b1, as shown. Suppose that we have a sample of observations and calculate the estimated slope coefficient, β̂1. If it is as shown in the diagram, what do we conclude when we test H0?
Lecture 3
Part III: Heteroskedasticity

• part 3 of lecture 3: properties of var(u|X)
• before that: revise hypothesis tests for regression coefficients (intuition and Stata output)
• Gauss-Markov conditions
• heteroskedasticity
• Gauss-Markov theorem
SLR.1: y = β0 + β1·x + u
SLR.2: random sampling from the population
SLR.3: some sample variation in the xi
SLR.4: E(u|x) = 0
SLR.1–SLR.4 ⟹ E(β̂0) = β0, E(β̂1) = β1

Under conditions SLR.1–SLR.4, the OLS coefficients are unbiased.
• the estimated coefficients may be smaller or larger, depending on the sample that is the result of a random draw
• however, on average, they will be equal to the values that characterise the true relationship between y and x in the population
• on average means: if sampling was repeated, i.e. if drawing the random sample and doing the estimation was repeated many times
• in a given sample, estimates may differ considerably from true values
SLR.1: y = β0 + β1·x + u
SLR.2: random sampling from the population
SLR.3: some sample variation in the xi
SLR.4: E(u|x) = 0
SLR.5: Var(u|x) = Var(u) = σ²
• under these assumptions the OLS estimator has the smallest variance among all linear unbiased estimators and is therefore BLUE (Best Linear Unbiased Estimator). This is the Gauss-Markov theorem.
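The theorem can be illustrated by simulation: compare OLS with another linear unbiased estimator of the slope, here the "endpoint" estimator (yn − y1)/(xn − x1). All design choices below (the fixed x grid, error variance, replication count) are assumptions for the sketch; both estimators centre on the true slope, but OLS has the smaller sampling variance:

```python
import numpy as np

rng = np.random.default_rng(2)
beta0, beta1, sigma = 1.0, 0.5, 1.0   # assumed true parameters
x = np.linspace(0.0, 10.0, 21)        # fixed design, so SLR.3 holds
n_reps = 50_000

ols = np.empty(n_reps)
endpoint = np.empty(n_reps)
for r in range(n_reps):
    y = beta0 + beta1 * x + rng.normal(0.0, sigma, size=x.size)
    dx = x - x.mean()
    ols[r] = (dx * (y - y.mean())).sum() / (dx ** 2).sum()
    endpoint[r] = (y[-1] - y[0]) / (x[-1] - x[0])

# both estimators are unbiased (means close to beta1 = 0.5),
# but Var(OLS) < Var(endpoint), as the Gauss-Markov theorem predicts
```

The endpoint estimator throws away all the interior observations, which is exactly why its variance is larger; Gauss-Markov says no linear unbiased estimator can beat OLS under SLR.1–SLR.5.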
Precision of the OLS coefficient: the OLS estimator is BLUE.