QM 2 Linear Regression

This document summarizes a lecture on linear regression analysis. It covers estimating a linear regression model for the effect of one variable (X) on another (Y), with examples using age and gender (female) as independent variables and risk aversion (crra) as the dependent variable; the estimated slopes are the marginal effects of the independent variables on crra. Exercises extend the models to student and young status as independent variables.


L2. Linear Regression
• Topics
• Linear regression model
• Testing estimated coefficients in the model
• Notes:
• One-sided tests
• Heteroskedastic error terms
1. Linear regression model
• We want to estimate the causal effect of one variable X on
another variable Y.

• Simple linear model to start with: Y = β0 + β1 X


• Y and X are continuous variables
• Y is a linear function of X
• Estimate the effect of a unit change in X on Y

• The slope of the population regression line is the effect of
a unit change in X on Y.
Statistical inference
Statistical inference about the causal effect of X on Y entails:

• Estimation: What line should we draw in an (X,Y) diagram to best
estimate the population slope?

• Hypothesis testing: How do we test whether the estimated slope is
zero, or some other value?

• Confidence intervals: With what precision can we say that the
true population slope lies within a certain interval?
The linear regression model
Yi = β0 + β1Xi + ui, i = 1,…, n observations

We have n observations, (Xi, Yi), i = 1,…, n.

X is the independent variable or regressor
Y is the dependent variable
β0 = intercept
β1 = slope
ui = the regression error

The regression error consists of:

• omitted factors other than X that influence the variation in Y
• errors in the measurement of Y and/or X.
[Figure: Density plot of the distribution of CRRA in Denmark (Risk_Field.dta); crra on the horizontal axis (−2 to 4), density on the vertical axis.]

[Figure: Scatterplot of CRRA against age (20 to 80 years).]
The OLS estimator
How can we estimate the unknown parameters β0 and β1 in
the linear model?

One method is ordinary least squares (OLS). The OLS estimators
of β0 and β1 solve

min over (b0, b1) of Σi [Yi − (b0 + b1Xi)]², i = 1,…, n

Notation:
• β0 and β1 refer to the true population parameters
• β̂0 and β̂1 refer to the estimators of the population parameters
• b0 and b1 refer to estimated values of the population parameters
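As an aside not on the original slides: with a single regressor, β̂1 equals the sample covariance of X and Y divided by the sample variance of X, so the slope can be checked by hand. A minimal Stata sketch, assuming the lecture's Risk_Field.dta variables:

* Sketch: recover the OLS slope by hand as cov(age, crra)/var(age)
quietly correlate crra age, covariance   // stores r(cov_12), r(Var_1), r(Var_2)
display "slope by hand: " r(cov_12)/r(Var_2)
quietly regress crra age                 // compare with the OLS estimate
display "slope by OLS:  " _b[age]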
Least squares assumption (i)

For any given value of X, the mean of u is zero: E(u | X=x) = 0.

[Figure: Scatterplot of CRRA against age (20 to 80 years) with the fitted regression line; legend: crra, Fitted values.]

Least squares assumption (ii)
(Xi,Yi), i =1,…,n, are independently and identically distributed (i.i.d).

This arises automatically if the entity (individual, firm, district, etc.)
is randomly selected from a large population:
• The entities are selected from the same population, so (Xi, Yi)
are identically distributed for all i = 1,…, n.
• The entities are randomly selected, so the values of (X, Y) for
different entities are independently distributed.
Least squares assumption (iii)
Large outliers in X and/or Y are rare.

• A large outlier is an extreme value of X or Y
• A large outlier can strongly influence the results if the number
of observations is relatively small
• Check your data and evaluate any extreme values
• On a technical level, if X and Y are bounded, then they have
finite fourth moments.
Linear regression in Stata
• Please open Lecture_2.do in Stata
• Data: Risk_Field.dta
• (see Andersen et al. [2010] for further documentation)
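The contents of Lecture_2.do are not reproduced here, so the following is only a sketch of how such a do-file might begin:

* Sketch: load the data and inspect the key variables
clear all
use "Risk_Field.dta", clear    // field data, Andersen et al. (2010)
describe crra age female       // inspect variable definitions
summarize crra age female      // descriptive statistics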
Example with linear regression model
• Linear regression model with age as only independent variable

. regress crra age

Source | SS df MS Number of obs = 842
-------------+---------------------------------- F(1, 840) = 36.45
Model | 20.6550559 1 20.6550559 Prob > F = 0.0000
Residual | 475.985953 840 .566649944 R-squared = 0.0416
-------------+---------------------------------- Adj R-squared = 0.0404
Total | 496.641009 841 .590536277 Root MSE = .75276

------------------------------------------------------------------------------
crra | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age | -.0109613 .0018155 -6.04 0.000 -.0145248 -.0073978
_cons | 1.14752 .0857724 13.38 0.000 .9791665 1.315873
------------------------------------------------------------------------------

• Predicted value of crra = 1.148 − 0.011 ∙ age
• crra is negatively related to age
• Marginal effect of age on crra: d(crra)/d(age) = −0.011
• A one-year increase in age reduces crra by 0.011
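To make the fitted line concrete, the predictions can be evaluated at particular ages with Stata's stored coefficients _b[_cons] and _b[age]; the ages chosen below are illustrative:

quietly regress crra age
display "predicted crra at age 20: " _b[_cons] + 20*_b[age]   // about 0.93
display "predicted crra at age 30: " _b[_cons] + 30*_b[age]   // about 0.82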
Example with linear regression model
• Linear regression model with female as only independent variable

. regress crra female

Source | SS df MS Number of obs = 846
-------------+---------------------------------- F(1, 844) = 1.24
Model | .730429713 1 .730429713 Prob > F = 0.2654
Residual | 496.358966 844 .58810304 R-squared = 0.0015
-------------+---------------------------------- Adj R-squared = 0.0003
Total | 497.089396 845 .588271474 Root MSE = .76688

------------------------------------------------------------------------------
crra | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | .0587697 .052734 1.11 0.265 -.0447355 .1622749
_cons | .6227924 .0374645 16.62 0.000 .5492579 .6963268
------------------------------------------------------------------------------

• Predicted value of crra = 0.623 + 0.059 ∙ female
• Predicted value for men = 0.623, and for women = 0.623 + 0.059 = 0.682
• Marginal effect of female: d(crra)/d(female) = 0.059
• Same as the difference in predicted values between women and men
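Both the group predictions and the marginal effect can be recovered from the stored coefficients; a short sketch using standard Stata syntax:

quietly regress crra female
display "predicted crra, men:   " _b[_cons]               // 0.623
display "predicted crra, women: " _b[_cons] + _b[female]  // 0.682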
Exercises
• Estimate a linear regression model with student as the only
independent variable
• Are students more or less risk averse than the rest of the
population?

• Repeat the same exercise with young as the only
independent variable
Run model with student on right hand side
. regress crra student

Source | SS df MS Number of obs = 846
-------------+---------------------------------- F(1, 844) = 24.01
Model | 13.7521132 1 13.7521132 Prob > F = 0.0000
Residual | 483.337283 844 .572674506 R-squared = 0.0277
-------------+---------------------------------- Adj R-squared = 0.0265
Total | 497.089396 845 .588271474 Root MSE = .75675

------------------------------------------------------------------------------
crra | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
student | .435723 .088916 4.90 0.000 .2612007 .6102454
_cons | .611252 .0273426 22.36 0.000 .5575845 .6649194
------------------------------------------------------------------------------
Predicted crra values
• crra(students) = 0.611 + 0.436 = 1.047
• crra(non-students) = 0.611
Run model with young on right hand side
. regress crra young

Source | SS df MS Number of obs = 846
-------------+---------------------------------- F(1, 844) = 27.36
Model | 15.6100626 1 15.6100626 Prob > F = 0.0000
Residual | 481.479333 844 .570473144 R-squared = 0.0314
-------------+---------------------------------- Adj R-squared = 0.0303
Total | 497.089396 845 .588271474 Root MSE = .7553

------------------------------------------------------------------------------
crra | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
young | .3585009 .0685339 5.23 0.000 .223984 .4930178
_cons | .5901624 .0285679 20.66 0.000 .5340898 .6462349
------------------------------------------------------------------------------
Predicted crra values
• crra(young) = 0.590 + 0.359 = 0.949
• crra(older) = 0.590
2. Hypothesis testing
• Sampling distribution

• The OLS estimator β̂1 is computed from a sample of n observations from
a population. A different sample of n observations gives a different
value of β̂1. This is the source of the "sampling error" of β̂1.
• Suppose you repeatedly take samples of n observations from a large
population.
• The sampling distribution is the probability distribution of the
estimates across repeated samples of n observations from the same population.

• We want to:
• quantify the sampling error associated with β̂1
• test hypotheses such as β1 = 0
• construct a confidence interval for β1
Distribution of the sample average
• Sampling distribution of sample mean
• Yi are i.i.d. from the population
• Sample mean: Ȳ = (Y1 + Y2 + … + Yn)/n = (1/n) Σi Yi
• Variance: var(Yi) = E[(Yi − μY)²] = σY²

• Mean of sampling distribution: E(Ȳ) = μY
• Variance of sampling distribution: var(Ȳ) = σȲ² = σY²/n
• Standard deviation of sampling distribution: sd(Ȳ) = σȲ = σY/√n

• Large sample approximation
• Central limit theorem: the sampling distribution of Ȳ is approximately
normal when n is large.
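As a complement not on the original slides, the sampling distribution can also be simulated. The sketch below draws 1,000 samples of n = 100 from the N(200, 50) population used in the figure that follows:

* Sketch: simulate the sampling distribution of the sample mean
clear all
set seed 12345
capture program drop drawmean
program define drawmean, rclass
    clear
    set obs 100                        // sample size n = 100
    generate y = 200 + 50*rnormal()    // population distribution ~ N(200, 50)
    summarize y
    return scalar ybar = r(mean)
end
simulate ybar = r(ybar), reps(1000) nodots: drawmean
summarize ybar    // mean close to 200, sd close to 50/sqrt(100) = 5
histogram ybar    // approximately normal, as the central limit theorem predicts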
Population and sampling distributions
[Figure: Two density plots. Top: population distribution of the random variable Y, 1 million observations, ~N(200, 50). Bottom: sampling distribution of the sample mean (Y_bar) for samples of size n = 100, ~N(200, 5).]
Sampling distribution of OLS estimator
• We want to estimate the population value of β1 using data from a sample of
n observations

• What is E(β̂1)?
• If E(β̂1) = β1, then the mean (expected value) of the sampling distribution equals the
population value, and OLS is unbiased.

• What is var(β̂1)?
• We need to know the variance (standard deviation) of the sampling distribution for
hypothesis tests. The standard deviation of the sampling distribution of the estimator is
called the standard error.

• Sampling distribution and sample size
• The probability distribution of sample means (sampling distribution) can take any shape
for small samples
• Central limit theorem: the sampling distribution is normally distributed for large
samples.
Null and alternative hypotheses
We want to test statistical hypotheses, such as β1 = 0, using our sample,
and reach a tentative conclusion about whether the hypothesis is rejected
or not rejected.

General setup
The null hypothesis assigns a specific value to the population
parameter β1.

Null hypothesis and two-sided alternative:
H0: β1 = β1,0 vs. H1: β1 ≠ β1,0
where β1,0 is the value under the null hypothesis.

Null hypothesis and one-sided alternative:
H0: β1 = β1,0 vs. H1: β1 > β1,0
General approach
• In general:

t = (estimator − hypothesized value) / (standard error of the estimator)

where the standard error of the estimator is the standard deviation
of the probability distribution for the estimator.

• To test β1:

t = (β̂1 − β1,0) / SE(β̂1)

where SE(β̂1) is the standard error of the estimator β̂1.


[Figure: Density of the standard normal distribution; z-values from −4 to 4.]

[Figure: Transformation of normal distributions to the standard normal; z-values from −5 to 10.]

[Figure: Density of the Student t-distribution with a large degree of freedom; t-values from −4 to 4.]
Test-statistic: student t-distribution
ˆ1  1,0
• Construct the t-statistic t =
ˆ 2ˆ
1

• Reject the null hypothesis in a two-sided test at the 5% significance level
if |t| > 1.96, i.e. if the estimate β̂1 is more than 1.96 standard errors
away from β1,0.

• p-value: the probability, under the null hypothesis, of finding values of t
more extreme than the observed t-value.

• You reject the null hypothesis at the 5% significance level if the
p-value < 0.05.

• This procedure relies on the large-n approximation that the t-statistic is
normally distributed; typically n = 50 is large enough for the approximation
to be good.
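A sketch of this arithmetic in Stata, using the stored results _b[] and _se[] and the standard normal CDF normal(); the age regression from the earlier examples is assumed:

quietly regress crra age
local t = _b[age]/_se[age]
display "t-statistic: " `t'                            // -6.04
display "two-sided p-value: " 2*(1 - normal(abs(`t'))) // well below 0.001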
Confidence intervals for β1

The 95% confidence interval is, equivalently:

• The set of values for the population slope that cannot be rejected at
the 5% significance level;
• A set-valued function of the data (an interval that is a function of the
data) that contains the true parameter value 95% of the time in
repeated samples.
 
The t-statistic for β1 is N(0,1) in large samples. Hence a 95%
confidence interval for β1 can be constructed as:

95% confidence interval for β1 = {β̂1 ± 1.96 × SE(β̂1)}
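The same interval can be computed by hand after a regression; a sketch using the age model from the examples:

quietly regress crra age
display "95% CI for age: " _b[age] - 1.96*_se[age] " to " _b[age] + 1.96*_se[age]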


Example with continuous independent variable
• Linear regression model with age as only independent variable

. regress crra age

------------------------------------------------------------------------------
crra | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age | -.0109613 .0018155 -6.04 0.000 -.0145248 -.0073978
_cons | 1.14752 .0857724 13.38 0.000 .9791665 1.315873
------------------------------------------------------------------------------

• t-value = (−0.011 − 0) / 0.0018 = −6.04
• H0: β1 = 0 and H1: β1 ≠ 0
• Two-sided test of β1 = 0: critical t-value = ±1.96 at 5% level
• p-value: pr(t-value < −6.04) + pr(t-value > 6.04) < 0.001
• 95% confidence interval of β1: − 0.011 ± 1.96 ∙ 0.0018
Example with discrete independent variable
• Linear regression model with female as only independent variable

. regress crra female

------------------------------------------------------------------------------
crra | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | .0587697 .052734 1.11 0.265 -.0447355 .1622749
_cons | .6227924 .0374645 16.62 0.000 .5492579 .6963268
------------------------------------------------------------------------------

• t-value = (0.059 − 0) / 0.0527 = 1.11
• H0: β1 = 0 and H1: β1 ≠ 0
• Two-sided test of β1 = 0: critical t-value = ±1.96
• p-value: pr(t-value < −1.11) + pr(t-value > 1.11) = 0.265
• 95% confidence interval of β1: 0.059 ± 1.96 ∙ 0.0527
Exercises
• Estimate a linear regression model with student as the only
independent variable
• Is the estimated coefficient on student significantly different from 0?
• What is the 95% confidence interval with respect to the coefficient
on student?

• Repeat the same exercise with young as the only
independent variable
Run model with student on right hand side
. regress crra student

Source | SS df MS Number of obs = 846
-------------+---------------------------------- F(1, 844) = 24.01
Model | 13.7521132 1 13.7521132 Prob > F = 0.0000
Residual | 483.337283 844 .572674506 R-squared = 0.0277
-------------+---------------------------------- Adj R-squared = 0.0265
Total | 497.089396 845 .588271474 Root MSE = .75675

------------------------------------------------------------------------------
crra | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
student | .435723 .088916 4.90 0.000 .2612007 .6102454
_cons | .611252 .0273426 22.36 0.000 .5575845 .6649194
------------------------------------------------------------------------------
T-test and 95% confidence interval
• t-value = (0.436 − 0) / 0.089 = 4.90
• H0: β1 = 0 and H1: β1 ≠ 0
• Two-sided test of β1 = 0: critical t-value = ±1.96 at 5% level
• 95% confidence interval of β1: 0.436 ± 1.96 ∙ 0.089
Run model with young on right hand side
. regress crra young

Source | SS df MS Number of obs = 846
-------------+---------------------------------- F(1, 844) = 27.36
Model | 15.6100626 1 15.6100626 Prob > F = 0.0000
Residual | 481.479333 844 .570473144 R-squared = 0.0314
-------------+---------------------------------- Adj R-squared = 0.0303
Total | 497.089396 845 .588271474 Root MSE = .7553

------------------------------------------------------------------------------
crra | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
young | .3585009 .0685339 5.23 0.000 .223984 .4930178
_cons | .5901624 .0285679 20.66 0.000 .5340898 .6462349
------------------------------------------------------------------------------
T-test and 95% confidence interval
• t-value = (0.359 − 0) / 0.069 = 5.23
• H0: β1 = 0 and H1: β1 ≠ 0
• Two-sided test of β1 = 0: critical t-value = ±1.96 at 5% level
• 95% confidence interval of β1: 0.359 ± 1.96 ∙ 0.069
Summary
• Learning outcomes
• Execute linear regression models in Stata
• Understand the three underlying least squares assumptions
• Understand the meaning of the sampling distribution
• Undertake one- and two-sided tests of estimated parameters
• Construct confidence intervals of estimated parameters
• Understand the meaning of homo- and heteroskedasticity
Extra exercises
• Use the RA_Lab.dta dataset (see Andersen et al. [2010])
A. Run a linear regression of crra on female
• What is the predicted crra value for women and men?
• What is the marginal effect of female on crra?
• Are men significantly more risk averse than women?
• What is the 95% confidence interval of the estimated crra value for
men and women?

B. Run a linear regression of crra on age
• What is the marginal effect of age on crra?
• Is the coefficient for age significantly different from 0?
• What is the 95% confidence interval for the estimated coefficient on
age?
• What is the predicted crra value for a 20-year old and a 30-year old?
Extra exercises
C. Generate a new binary variable age_25 (= 1 if age >= 25, and = 0
if age < 25). Run a new linear regression of crra on age_25
• What is the marginal effect of age_25 on crra?
• Is the estimated coefficient on age_25 significantly different
from 0?
• What is the 95% confidence interval for the estimated
coefficient on age_25?
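One possible way to set up exercise C in Stata (note that Stata treats missing values as larger than any number, so the comparison needs a guard):

generate age_25 = (age >= 25)
replace age_25 = . if missing(age)   // guard: a missing age would otherwise code as 1
regress crra age_25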
Note: one-sided tests
• Linear regression model with age as only independent variable

. regress crra age

------------------------------------------------------------------------------
crra | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age | -.0109613 .0018155 -6.04 0.000 -.0145248 -.0073978
_cons | 1.14752 .0857724 13.38 0.000 .9791665 1.315873
------------------------------------------------------------------------------

• H0: β1 = 0
• H1: β1 > 0
• Critical t-value = 1.64 at 5% level
• p-value: pr(t-value > −6.04) = 1 − pr(t-value < −6.04) ≈ 1
• H1: β1 < 0
• Critical t-value = −1.64 at 5% level
• p-value: pr(Z < −6.04) < 0.0001
One-sided tests of discrete independent variable
• Linear regression model with female as only independent variable

. regress crra female

------------------------------------------------------------------------------
crra | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | .0587697 .052734 1.11 0.265 -.0447355 .1622749
_cons | .6227924 .0374645 16.62 0.000 .5492579 .6963268
------------------------------------------------------------------------------

• H0: β1 = 0
• H1: β1 > 0
• Critical t-value = 1.64 at 5% level
• p-value: pr(t-value > 1.11) = 1 − pr(t-value < 1.11) = 1 − 0.867 = 0.133
• H1: β1 < 0
• Critical t-value = −1.64 at 5% level
• p-value: pr(Z < 1.11) = 0.8665
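These one-sided p-values can be reproduced from the cumulative standard normal, normal(), with the t-value from the output above:

display "Pr(Z > 1.11) = " 1 - normal(1.11)   // .1335, one-sided p for H1: β1 > 0
display "Pr(Z < 1.11) = " normal(1.11)       // .8665, one-sided p for H1: β1 < 0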
z-distribution
Standard normal distribution:
• Referred to as the z-distribution
• Cumulative standard normal distribution in Table 1 of S&W
• z-test: testing the mean of a population against a null
hypothesis (hypothesized value)
• One and two-tailed tests
• Can use the z-distribution for any n if the population distribution is
normal and the standard deviation is known
t-distribution
Student t-distribution:
• Referred to as the t-distribution
• Standard deviation of population distribution is not known
• t-test: testing the mean of a population against a null
hypothesis (hypothesized value)
• One and two-tailed tests
• t-distribution is similar to the z-distribution when n is large
• Can use the t-test for small samples (n < 30) if the population
distribution is approximately normal
• Critical values for two-sided and one-sided tests using the t-
distribution in Table 2 of S&W
Note: Heteroskedasticity
• Homoskedasticity
• var(u|X=x) is constant – the variance of the conditional
distribution of u given X does not depend on X

• Heteroskedasticity
• var(u|X=x) is not constant – the variance of the conditional
distribution of u given X depends on X
Example of heteroskedasticity

[Figure: Scatterplots illustrating heteroskedasticity: the spread of the errors varies with X.]
We implicitly allow for heteroskedasticity
Recall the three least squares assumptions:
1. E(u|X=x) = 0
2. (Xi,Yi), i=1,…,n, are i.i.d.
3. Large outliers are rare

Hetero- and homoskedasticity concern var(u | X=x). Because we
have not explicitly assumed homoskedastic errors, we have
implicitly allowed for heteroskedasticity.
What happens if the errors are homoskedastic?
• You can prove that OLS is the most efficient linear unbiased estimator
if the errors are homoskedastic, a result called the Gauss-Markov theorem

• The formula for the variance of β̂1 and the OLS standard error
simplifies under homoskedasticity:

var(β̂1) = var[(Xi − μX)ui] / (n·(σX²)²)   (general formula)

var(β̂1) = σu² / (n·σX²)   (simplification if u is homoskedastic)

• Note: var(β̂1) is inversely proportional to var(X): more variation in
X means more information about β̂1. If there is no variation in X,
then we cannot estimate the effect of X on Y.
The bottom line:
• If the errors are either homoskedastic or heteroskedastic and
you use heteroskedastic-robust standard errors, you are OK
• If the errors are heteroskedastic and you use the
homoskedasticity-only formula for standard errors, your
standard errors will be wrong (the homoskedasticity-only
estimator of the variance of β̂1 is inconsistent if there is
heteroskedasticity).
• The two formulas coincide (when n is large) in the special case
of homoskedasticity
• So, you should always use heteroskedasticity-robust standard
errors in linear regression.
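In Stata, this means adding the vce(robust) option to the regressions used throughout this lecture, for example:

regress crra age, vce(robust)   // heteroskedasticity-robust standard errors
regress crra age                // default: homoskedasticity-only standard errors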
Index (page references to S&W)
• Variance, p. 61
• Mean (expected value), p. 62
• Covariance, pp. 70-71
• Normal distribution, pp. 75-79
• Sampling distribution, pp. 83-84
• Two-sided alternative hypothesis, p. 109
• p-value, pp. 109-111
• t-statistic, pp. 113-114
