
CHAPTER 6: LINEAR REGRESSION WITH MULTIPLE REGRESSORS

Outline
1. Omitted variable bias
2. Using regression to estimate causal effects
3. Multiple regression and OLS
4. Measures of fit
5. Sampling distribution of the OLS estimator with multiple regressors
6. Control variables


Omitted Variable Bias

In the class size example, $\beta_1$ is the causal effect on test scores of a change in the student-teacher ratio ($STR$) by one student per teacher.

When $\beta_1$ is a causal effect, the first least squares assumption must hold: $E(u_i \mid X_i) = 0$.

The error $u_i$ arises because of factors or variables that influence $Y$ but are not included in the regression function. There are always omitted variables!

If the omission of those variables results in $E(u_i \mid X_i) \neq 0$, then the OLS estimator will be biased.

The bias in the OLS estimator that occurs as a result of an omitted factor, or variable, is called omitted variable bias.


For omitted variable bias to occur, the omitted variable, $Z$, must satisfy two conditions:

1. $Z$ is a determinant of $Y$ (i.e. $Z$ is part of $u$); and

2. $Z$ is correlated with the regressor $X$ (i.e. $\mathrm{corr}(Z, X) \neq 0$)

Both conditions must hold for the omission of $Z$ to result in omitted variable bias.

In the test score example:

1. English language ability (whether the student has English as a second language) plausibly affects standardized test scores: $Z$ is a determinant of $Y$
2. Immigrant communities tend to be less affluent and thus have smaller school budgets and higher $STR$: $Z$ is correlated with $X$

Accordingly, $\hat{\beta}_1$ is biased. What is the direction of this bias?

• What does common sense suggest?
• If common sense fails you, there is a formula …


A formula for omitted variable bias: recall the equation

$$\hat{\beta}_1 - \beta_1 = \frac{\sum_{i=1}^{n}(X_i - \bar{X})u_i}{\sum_{i=1}^{n}(X_i - \bar{X})^2} = \frac{\frac{1}{n}\sum_{i=1}^{n}\nu_i}{\frac{n-1}{n}\,s_X^2}$$

where $\nu_i = (X_i - \bar{X})u_i \approx (X_i - \mu_X)u_i$.

Under Least Squares Assumption #1: $E[(X_i - \mu_X)u_i] = \mathrm{cov}(X_i, u_i) = 0$.

But what if $E[(X_i - \mu_X)u_i] = \mathrm{cov}(X_i, u_i) \neq 0$?


Let $\beta_1$ be the causal effect. Under LSA #2 and LSA #3 (that is, even if LSA #1 does not hold),

$$\hat{\beta}_1 - \beta_1 = \frac{\frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})u_i}{\frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2} \;\xrightarrow{\;p\;}\; \frac{\sigma_{Xu}}{\sigma_X^2} = \frac{\sigma_u}{\sigma_X}\times\frac{\sigma_{Xu}}{\sigma_X \sigma_u} = \rho_{Xu}\,\frac{\sigma_u}{\sigma_X}$$

where $\rho_{Xu} = \mathrm{corr}(X, u)$. If LSA #1 is correct, then $\rho_{Xu} = 0$. If not, we have

$$\hat{\beta}_1 \;\xrightarrow{\;p\;}\; \beta_1 + \rho_{Xu}\,\frac{\sigma_u}{\sigma_X}$$

→ the OLS estimator $\hat{\beta}_1$ is biased and is not consistent. The strength and direction of the bias are determined by $\rho_{Xu}$, the correlation between the error term and the regressor.
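To make the bias formula concrete, here is a small simulation sketch (the data-generating process and all numbers are invented for illustration): it generates data in which the error is correlated with the regressor and checks that the OLS slope is pulled away from the true $\beta_1$ by approximately $\rho_{Xu}\,\sigma_u/\sigma_X$.

# Simulation sketch of omitted variable bias (illustrative values only)
set.seed(1)
n <- 100000
z <- rnorm(n)                      # omitted variable
x <- 0.7 * z + rnorm(n)            # regressor, correlated with z
u <- 2.0 * z + rnorm(n)            # error term contains z, so corr(x, u) != 0
y <- 1 + 0.5 * x + u               # true beta1 = 0.5

coef(lm(y ~ x))["x"]               # OLS slope: well above 0.5 (biased)
0.5 + cor(x, u) * sd(u) / sd(x)    # true beta1 plus the bias term: approximately the same number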


In our example, let us think about the bias induced by omitting the share of English learners (PctEL). When the estimated regression model does not include $PctEL$ as a regressor, the true model is

$$Y_i = \beta_0 + \beta_1 STR_i + \beta_2 PctEL_i + u_i$$

Because $STR$ and $PctEL$ are correlated, we have $\rho_{STR,\,PctEL} \neq 0$.

Omitting $PctEL$ leads to a negatively biased estimate $\hat{\beta}_1$. As a consequence, we expect $\hat{\beta}_1$, the coefficient on $STR$, to be too large in absolute value. Put differently, the OLS estimate of $\beta_1$ suggests that small classes improve test scores, but the effect of small classes is overestimated because it also captures the effect of having fewer English learners.


We can look at this another way. Districts with few ESL students: (1) do better on standardized tests and (2) have smaller classes (bigger budgets). So ignoring the effect of having many ESL students would overstate the class size effect. Is this going on in the data?

• Districts with fewer English learners have higher test scores
• Districts with fewer English learners have smaller class sizes
• Among districts with comparable percentages of English learners, the effect of class size is small (while the overall test score gap between small- and large-class districts is 7.4 points)


Using regression to estimate causal effects

The test score / STR / fraction of English learners example shows that, if an omitted variable satisfies the two conditions for omitted variable bias, then the OLS estimator in the regression omitting that variable is biased and inconsistent. So, even if $n$ is large, $\hat{\beta}_1$ will not be close to $\beta_1$.

In this example, we are clearly interested in the causal effect: what do we expect to
happen to test scores if the superintendent reduces the class size?

What precisely is a causal effect?


• Causality is a complex concept
• In this course, we take a practical approach to causality: A causal effect is defined to
be the effect measured in an ideal randomized controlled experiment


Ideal Randomized Controlled Experiment

• Ideal: subjects all follow the treatment protocol – perfect compliance, no errors in
reporting, etc.
• Randomized: subjects from the population of interest are randomly assigned to a
treatment or control group (so there are no confounding factors)
• Controlled: having a control group permits measuring the differential effect of the
treatment
• Experiment: the treatment is assigned as part of the experiment: the subjects have
no choice, so there is no reverse causality in which subjects choose the treatment
they think will work best


In our class size example:

Imagine an ideal randomized controlled experiment for measuring the effect on test scores of reducing $STR$.
• In that experiment, students would be randomly assigned to classes, which would have different sizes
• Because they are randomly assigned, all student characteristics (and thus $u_i$) would be distributed independently of $STR_i$
• Thus $E(u_i \mid STR_i) = 0$. That is, LSA #1 holds in a randomized controlled experiment.

How does our observational data differ from this ideal?

• The treatment is not randomly assigned
• Consider the percent English learners, $Z$, in the district. It plausibly satisfies the two criteria for omitted variable bias:
  o A determinant of $Y$; and
  o Correlated with the regressor $X$
→ Thus the control and treatment groups differ in a systematic way, so $\mathrm{corr}(Z, X) \neq 0$


Randomization implies that any differences between treatment and control groups are
random – not systematically related to the treatment.

We can eliminate the difference in percent English learners between large (control) and
small (treatment) class groups by examining the effect of class size among districts with
the same percent of English learners.
- If the only systematic difference between the large and small class size groups is the
percent English learners, then we are back to the randomized controlled experiment
- This is one way to control for the effect of percent English learners when estimating
the effect of 𝑆𝑇𝑅


Let’s return to omitted variable bias.

Three ways to overcome omitted variable bias:


1. Run a randomized controlled experiment in which treatment ($STR$) is randomly assigned; then the percent of English learners is still a determinant of test scores, but it is uncorrelated with $STR$ (→ this solution is rarely feasible).
2. Adopt the “cross tabulation” approach with finer gradations of $STR$ and percent of English learners, so we control for the percent of English learners (→ but we will soon run out of data, and what about other determinants like family income and parental education?).
3. Use a regression in which the omitted variable is no longer omitted: include the percent of English learners as an additional regressor in a multiple regression.


The Population Multiple Regression Model

Consider the case of two regressors:

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i$$

• $Y$ is the dependent variable
• $X_1$, $X_2$ are the two independent variables (regressors)
• $(Y_i, X_{1i}, X_{2i})$ denote the $i$th observation on $Y$, $X_1$, $X_2$
• $\beta_0$ = unknown population intercept = predicted value of $Y$ when $X_1 = X_2 = 0$
• $\beta_1$ = effect on $Y$ of a change in $X_1$, holding $X_2$ constant = $\partial Y / \partial X_1$, holding $X_2$ constant
• $\beta_2$ = effect on $Y$ of a change in $X_2$, holding $X_1$ constant = $\partial Y / \partial X_2$, holding $X_1$ constant
• $u_i$ = the regression error (omitted factors)


With two regressors, the OLS estimator solves:

$$\min_{b_0, b_1, b_2} \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_{1i} - b_2 X_{2i})^2$$

• the OLS estimator minimizes the average squared difference between the actual values of $Y_i$ and the prediction based on the estimated line
• this minimization problem is solved using calculus
• this yields the OLS estimators of $\beta_0$, $\beta_1$, $\beta_2$
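As a sketch of what “solved using calculus” delivers, the same coefficients can also be obtained by minimizing the sum of squared residuals numerically; the data below are simulated, so all names and values are purely illustrative.

# Minimize the sum of squared residuals numerically and compare with lm()
set.seed(1)
n  <- 200
x1 <- rnorm(n); x2 <- rnorm(n)
y  <- 5 + 2 * x1 - 3 * x2 + rnorm(n)

ssr <- function(b) sum((y - b[1] - b[2] * x1 - b[3] * x2)^2)   # objective: SSR(b0, b1, b2)
optim(c(0, 0, 0), ssr)$par                                     # numerical minimizer
coef(lm(y ~ x1 + x2))                                          # OLS: very close to the same values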


Example using Test Score data

Regression of test scores on $STR$: $\widehat{TestScore} = 698.9 - 2.28 \times STR$

Include the percent English learners: $\widehat{TestScore} = 686.0 - 1.10 \times STR - 0.65 \times PctEL$

• What happens to the coefficient on $STR$?
• Note: $\mathrm{corr}(STR, PctEL) = 0.10$

> model<-lm(testscr~str+el_pct,data=caschool)
> coeftest(model, vcov = vcovHC(model, type = "HC1"))

t test of coefficients:

Estimate Std. Error t value Pr(>|t|)


(Intercept) 686.032249 8.728224 78.5993 < 2e-16 ***
str -1.101296 0.432847 -2.5443 0.01131 *
el_pct -0.649777 0.031032 -20.9391 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


Measures of Fit for Multiple Regression

Actual = predicted + residual ($Y_i = \hat{Y}_i + \hat{u}_i$)

$SER$ = standard deviation of $\hat{u}_i$ (with degrees-of-freedom correction)
$RMSE$ = standard deviation of $\hat{u}_i$ (without degrees-of-freedom correction)
$R^2$ = fraction of the variance of $Y$ explained by the $X$s
$\bar{R}^2$ = adjusted $R^2$ = $R^2$ with a degrees-of-freedom correction that adjusts for estimation uncertainty ($\bar{R}^2 < R^2$)


As in regression with a single regressor, the $SER$ and $RMSE$ are measures of the spread of the $Y_i$ around the regression line:

$$SER = \sqrt{\frac{1}{n-k-1}\sum_{i=1}^{n}\hat{u}_i^2}$$

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\hat{u}_i^2}$$


The $R^2$ is the fraction of the variance explained – same definition as in regression with a single regressor:

$$R^2 = \frac{ESS}{TSS} = 1 - \frac{SSR}{TSS}$$

where $ESS = \sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2$, $SSR = \sum_{i=1}^{n}\hat{u}_i^2$, and $TSS = \sum_{i=1}^{n}(Y_i - \bar{Y})^2$.

• The $R^2$ always increases when you add another regressor – a bit of a problem for a measure of fit
• The $\bar{R}^2$ corrects this problem by penalizing you for including another regressor – the $\bar{R}^2$ does not necessarily increase when you add another regressor

$$\bar{R}^2 = 1 - \frac{n-1}{n-k-1}\,\frac{SSR}{TSS}$$

• $\bar{R}^2 < R^2$; however, if $n$ is large, the two will be very close
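As a check on these formulas, here is a sketch that computes $SER$, $RMSE$, $R^2$, and $\bar{R}^2$ by hand for the model fitted earlier; it assumes the `model` object and the `caschool` data from the regression above are still in memory.

u_hat <- residuals(model)                         # model <- lm(testscr ~ str + el_pct, data = caschool)
n <- length(u_hat)
k <- length(coef(model)) - 1                      # number of regressors, excluding the intercept

SER  <- sqrt(sum(u_hat^2) / (n - k - 1))
RMSE <- sqrt(sum(u_hat^2) / n)

TSS <- sum((caschool$testscr - mean(caschool$testscr))^2)
SSR <- sum(u_hat^2)
R2     <- 1 - SSR / TSS
R2_adj <- 1 - ((n - 1) / (n - k - 1)) * (SSR / TSS)

c(SER = SER, RMSE = RMSE, R2 = R2, R2_adj = R2_adj)
# Compare with summary(model)$sigma, summary(model)$r.squared, summary(model)$adj.r.squared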


The Least Squares Assumptions for Causal Inference in Multiple Regression:

Let $\beta_1, \beta_2, \ldots, \beta_k$ be causal effects.

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + u_i, \quad i = 1, \ldots, n$$

1. The conditional distribution of $u$ given the $X$s has mean zero, i.e. $E(u_i \mid X_{1i} = x_1, \ldots, X_{ki} = x_k) = 0$
2. $(X_{1i}, \ldots, X_{ki}, Y_i)$, $i = 1, \ldots, n$, are i.i.d.
3. Large outliers are unlikely
4. There is no perfect multicollinearity


Assumption #1: The conditional mean of $u$ given the $X$s is zero

$$E(u_i \mid X_{1i} = x_1, \ldots, X_{ki} = x_k) = 0$$

• This has the same interpretation as in regression with a single regressor
• Failure of this condition leads to omitted variable bias; specifically, omitted variable bias arises if an omitted variable:
  o Belongs in the equation (so, is in $u_i$); and
  o Is correlated with an included $X$
• The best solution, if possible, is to include the omitted variable in the regression
• A second, related solution is to include a variable that controls for the omitted variable (discussed shortly)

Assumption #2: $(X_{1i}, \ldots, X_{ki}, Y_i)$, $i = 1, \ldots, n$, are i.i.d.

This is satisfied automatically if the data are collected by simple random sampling.

Assumption #3: Large outliers are rare

This is the same assumption as we had before for a single regressor. As in the case of a single regressor, OLS can be sensitive to large outliers, so you need to check your data (scatterplots) to make sure there are no crazy values (typos or coding errors).

Assumption #4: There is no perfect multicollinearity
Perfect multicollinearity is when one of the regressors is an exact linear function of the
other regressors

Example: Suppose you accidentally include 𝑆𝑇𝑅 twice


> model<-lm(testscr~str+str+el_pct,data=caschool)
> coeftest(model, vcov = vcovHC(model, type = "HC1"))

t test of coefficients:

Estimate Std. Error t value Pr(>|t|)


(Intercept) 686.032249 8.728224 78.5993 < 2e-16 ***
str -1.101296 0.432847 -2.5443 0.01131 *
el_pct -0.649777 0.031032 -20.9391 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

→ R automatically drops one of the $STR$s


The Sampling Distribution of the Least Squares Estimator

Under the four Least Squares Assumptions:
• The sampling distribution of $\hat{\beta}_1$ has mean $\beta_1$
• $\mathrm{var}(\hat{\beta}_1)$ is inversely proportional to $n$
• Other than its mean and variance, the exact (finite-$n$) distribution of $\hat{\beta}_1$ is very complicated; but for large $n$:
  o $\hat{\beta}_1$ is consistent: $\hat{\beta}_1 \xrightarrow{\;p\;} \beta_1$ (law of large numbers)
  o $\dfrac{\hat{\beta}_1 - E(\hat{\beta}_1)}{\sqrt{\mathrm{var}(\hat{\beta}_1)}}$ is approximately distributed as $N(0,1)$ (CLT)
  o these statements hold for $\hat{\beta}_1, \ldots, \hat{\beta}_k$

Conceptually, there is nothing new here.
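A small Monte Carlo sketch (simulated data, invented coefficients) illustrates these large-sample results: across repeated samples, $\hat{\beta}_1$ centers on $\beta_1$ and its standardized distribution looks approximately normal.

# Monte Carlo sketch of the sampling distribution of beta1-hat (illustrative values only)
set.seed(42)
R <- 2000; n <- 500
beta1_hat <- replicate(R, {
  x1 <- rnorm(n); x2 <- 0.5 * x1 + rnorm(n)       # two correlated regressors
  y  <- 1 + 2 * x1 + 1.5 * x2 + rnorm(n)          # true beta1 = 2
  coef(lm(y ~ x1 + x2))["x1"]
})
mean(beta1_hat)                                    # close to 2
hist(scale(beta1_hat), freq = FALSE, main = "Standardized beta1-hat")
curve(dnorm(x), add = TRUE)                        # approximately N(0, 1)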


Back to Multicollinearity: Perfect and Imperfect

Perfect multicollinearity is when one of the regressors is an exact linear function of the other regressors.

Some examples:
• the example from before – you include $STR$ twice
• regress test scores on a constant, $D$, and $B$, where $D_i = 1$ if $STR_i \le 20$ and $D_i = 0$ otherwise, and $B_i = 1$ if $STR_i > 20$ and $B_i = 0$ otherwise. Then $B_i = 1 - D_i$ (an exact linear function), so there is perfect multicollinearity. This is an example of the dummy variable trap.


The Dummy Variable Trap:


Suppose you have a set of multiple binary (dummy) variables which are mutually
exclusive and exhaustive – that is, there are multiple categories and every observation
falls in one, and only one, category. If you include all these dummy variables and a
constant, you will have perfect multicollinearity – this is sometimes called the dummy
variable trap.

• Solutions to the dummy variable trap:


§ Omit one of the groups or
§ Omit the intercept
• What are the implications for the interpretation of the coefficients?
• If you have perfect multicollinearity, your statistical software will let you know –
either by crashing or giving an error message or by dropping one of the variables
arbitrarily


Let’s look at an example:
Suppose we would like to relate wages to gender. Suppose we create two dummy variables:
$Male_i = 1$ if the individual is male, 0 if the individual is female;
$Female_i = 1$ if the individual is female, 0 if the individual is male

Regression models relating wages to gender:

(i) $Wage_i = \beta_0 + \beta_1 Male_i + u_i$
For women, $Male_i = 0$: $Wage_i = \beta_0 + u_i$
- $\beta_0$ is the population mean wage for women
- $\beta_0 + \beta_1$ is the population mean wage for men

(ii) $Wage_i = \gamma_0 + \gamma_1 Female_i + v_i$
For men, $Female_i = 0$: $Wage_i = \gamma_0 + v_i$
- $\gamma_0$ is the population mean wage for men
- $\gamma_0 + \gamma_1$ is the population mean wage for women

Relationship among the coefficients: $\beta_0 = \gamma_0 + \gamma_1$ and $\beta_0 + \beta_1 = \gamma_0$



(iii) $Wage_i = \delta_0 + \delta_1 Male_i + \delta_2 Female_i + \varepsilon_i$
For men, $Male_i = 1$, $Female_i = 0$: $Wage_i = \delta_0 + \delta_1 + \varepsilon_i$
For women, $Female_i = 1$, $Male_i = 0$: $Wage_i = \delta_0 + \delta_2 + \varepsilon_i$
- $\delta_0 + \delta_1$ is the population mean wage for men
- $\delta_0 + \delta_2$ is the population mean wage for women
→ $\delta_0$ cannot be estimated
→ the model suffers from perfect multicollinearity – you are including the same information twice

(iv) $Wage_i = \alpha_0 Male_i + \alpha_1 Female_i + \eta_i$
For men, $Male_i = 1$, $Female_i = 0$: $Wage_i = \alpha_0 + \eta_i$
For women, $Female_i = 1$, $Male_i = 0$: $Wage_i = \alpha_1 + \eta_i$
- $\alpha_0$ is the population mean wage for men
- $\alpha_1$ is the population mean wage for women

Relationship among the coefficients: $\beta_0 = \gamma_0 + \gamma_1 = \alpha_1$ and $\beta_0 + \beta_1 = \gamma_0 = \alpha_0$
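A sketch of models (i)-(iv) with simulated wage data (the variable names and the population means of 20 and 25 are invented) shows how R handles the dummy variable trap and how the two workarounds recover the group means:

set.seed(7)
n <- 1000
male   <- rbinom(n, 1, 0.5)
female <- 1 - male
wage   <- 20 + 5 * male + rnorm(n)      # population mean wage: 20 for women, 25 for men

lm(wage ~ male)                          # (i): intercept ~ mean for women, slope ~ difference
lm(wage ~ female)                        # (ii): intercept ~ mean for men
lm(wage ~ male + female)                 # (iii): perfect multicollinearity; R drops one dummy (NA)
lm(wage ~ male + female - 1)             # (iv): no intercept; coefficients are the two group means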


Imperfect multicollinearity occurs when two or more regressors are very highly correlated.

Imperfect multicollinearity implies that one or more of the regression coefficients will be imprecisely estimated.

• The idea: the coefficient on $X_1$ is the effect of $X_1$ holding $X_2$ constant; but if $X_1$ and $X_2$ are highly correlated, there is very little variation in $X_1$ once $X_2$ is held constant – so the data don’t contain much information about what happens when $X_1$ changes but $X_2$ doesn’t. This means that the variance of the OLS estimator of the coefficient on $X_1$ will be large.
• Imperfect multicollinearity results in large standard errors for one or more of the OLS coefficients
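A simulation sketch (all numbers invented) makes the point about imprecision: as the correlation between $X_1$ and $X_2$ rises, the standard error of the coefficient on $X_1$ grows.

set.seed(123)
n <- 200
se_x1 <- sapply(c(0, 0.90, 0.99), function(rho) {
  x1 <- rnorm(n)
  x2 <- rho * x1 + sqrt(1 - rho^2) * rnorm(n)     # corr(x1, x2) is approximately rho
  y  <- 1 + 2 * x1 + 2 * x2 + rnorm(n)
  summary(lm(y ~ x1 + x2))$coefficients["x1", "Std. Error"]
})
names(se_x1) <- c("rho = 0", "rho = 0.90", "rho = 0.99")
se_x1                                             # the standard error on x1 rises sharply with rho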


Control Variables and Conditional Mean Independence

We want to get an unbiased estimate of the effect on test scores of changing class size, holding constant factors outside the school committee’s control – such as outside learning opportunities (museums, etc.), parental involvement in education (reading with mom at home), etc.

If we could run an experiment, we would randomly assign students (and teachers) to different sized classes. Then $STR_i$ would be independent of all the other factors that go into $u_i$, so $E(u_i \mid STR_i) = 0$ and the OLS slope estimator in the regression of test scores on $STR$ would be an unbiased estimator of the desired causal effect.

But with observational data, $u_i$ depends on additional factors (parental involvement, knowledge of English, access in the community to learning opportunities outside school, etc.).
• If you can observe those factors (e.g. PctEL), then include them in the regression
• But usually you can’t observe all these omitted causal factors (e.g. parental involvement in homework). In this case, you can include control variables which are correlated with these omitted causal factors, but which themselves are not causal.


A control variable $W$ is a regressor included to hold constant factors that, if neglected, could lead the estimated causal effect of interest to suffer from omitted variable bias.

For our test scores example:

$$\widehat{TestScore} = 700.2 - 1.00\,STR - 0.122\,PctEL - 0.547\,LchPct, \qquad \bar{R}^2 = 0.773$$
(standard errors: 5.6, 0.27, 0.033, 0.024)

$PctEL$ = percent English learners in the school district
$LchPct$ = percent of students receiving a free/subsidized lunch (only students from low-income families are eligible)

• $STR$ is the variable of interest
• $PctEL$ probably has a direct causal effect (school is tougher if you are learning English). But it is also a control variable: immigrant communities tend to be less affluent and often have fewer outside learning opportunities, and $PctEL$ is correlated with those omitted causal variables. $PctEL$ is both a possible causal variable and a control variable.
• $LchPct$ might have a causal effect (eating lunch helps learning); it is also correlated with, and controls for, income-related outside learning opportunities. $LchPct$ is both a possible causal variable and a control variable.
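For reference, here is a sketch of how this regression could be refit in R with the caschool data used earlier; the free/subsidized-lunch variable is assumed here to be called `meal_pct`, so adjust the name if your copy of the data labels it differently.

# Sketch: refit the three-regressor model (variable name meal_pct is an assumption)
model3 <- lm(testscr ~ str + el_pct + meal_pct, data = caschool)
coeftest(model3, vcov = vcovHC(model3, type = "HC1"))   # heteroskedasticity-robust standard errors
summary(model3)$adj.r.squared                            # adjusted R^2, roughly 0.77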

Three interchangeable statements about what makes for an effective control variable:

1. An effective control variable is one which, when included in the regression, makes
the error term uncorrelated with the variable of interest.
2. Holding constant the control variable(s), the variable of interest is as if randomly
assigned.
3. Among individuals/observations with the same value of the control variable(s), the
variable of interest is uncorrelated with the omitted determinants of 𝑌

Control variables need not be causal and their coefficients generally do not have a causal
interpretation.


The math of control variables: conditional mean independence

• Because a control variable is correlated with an omitted causal factor, LSA #1 does not hold. For example, $LchPct$ is correlated with unmeasured determinants of test scores such as outside learning opportunities, so its coefficient is subject to omitted variable bias. But the fact that $LchPct$ is correlated with these omitted variables is precisely what makes it a good control variable!
• If LSA #1 does not hold, then what does?
• We need a mathematical condition for what makes an effective control variable. This condition is conditional mean independence: given the control variable, the mean of $u_i$ doesn’t depend on the variable of interest.

Let $X_i$ denote the variable of interest and $W_i$ denote the control variable(s). $W$ is an effective control variable if conditional mean independence holds:

$$E(u_i \mid X_i, W_i) = E(u_i \mid W_i) \quad \text{(conditional mean independence)}$$

If $W$ is a control variable, then conditional mean independence replaces LSA #1 – it is the version of LSA #1 that is relevant for control variables.


Consider the regression model

$$Y = \beta_0 + \beta_1 X + \beta_2 W + u$$

where $X$ is the variable of interest, $\beta_1$ is its causal effect, and $W$ is an effective control variable so that conditional mean independence holds:

$$E(u_i \mid X_i, W_i) = E(u_i \mid W_i)$$

In addition, suppose that LSA #2, #3, and #4 hold. Then:

1. $\beta_1$ has a causal interpretation
2. $\hat{\beta}_1$ is unbiased
3. The coefficient on the control variable, $\hat{\beta}_2$, does not in general estimate a causal effect
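A simulation sketch of these three results (the data-generating process is invented): $W$ is correlated with an omitted causal factor $Z$, and $X$ depends only on $W$, so conditional mean independence holds. Including $W$ recovers the causal effect of $X$, while the coefficient on $W$ absorbs the effect of $Z$ and is not causal.

set.seed(99)
n <- 5000
w <- rnorm(n)                        # control variable (think LchPct)
z <- 0.8 * w + rnorm(n)              # omitted causal factor, correlated with w
x <- -1.0 * w + rnorm(n)             # variable of interest; given w, x is as-if randomly assigned
y <- 10 + 2 * x + 3 * z + rnorm(n)   # true causal effect of x is 2; z ends up in the error term

coef(lm(y ~ x))["x"]                  # omitting w: the slope on x is biased (well below 2 here)
coef(lm(y ~ x + w))                   # including w: slope on x is close to 2, but the coefficient
                                      # on w is about 2.4, which is not a causal effect of w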


Under conditional mean independence:

1. $\beta_1$ has a causal interpretation.

The math: the expected change in $Y$ resulting from a change in $X$, holding (a single) $W$ constant, is:

$$E(Y \mid X = x + \Delta x, W = w) - E(Y \mid X = x, W = w)$$
$$= \beta_0 + \beta_1 (x + \Delta x) + \beta_2 w + E(u \mid X = x + \Delta x, W = w) - [\beta_0 + \beta_1 x + \beta_2 w + E(u \mid X = x, W = w)]$$
$$= \beta_1 \Delta x + [E(u \mid X = x + \Delta x, W = w) - E(u \mid X = x, W = w)] = \beta_1 \Delta x$$

where the final equality follows from conditional mean independence:

$$E(u \mid X = x + \Delta x, W = w) = E(u \mid X = x, W = w) = E(u \mid W = w)$$

2. $\hat{\beta}_1$ is unbiased.


3. $\hat{\beta}_2$ does not in general estimate a causal effect.

The math:
Consider the regression model

$$Y = \beta_0 + \beta_1 X + \beta_2 W + u$$

where $u$ satisfies the conditional mean independence assumption and where $\beta_1$ and $\beta_2$ are causal effects.

Suppose that $E(u \mid W) = \gamma_0 + \gamma_2 W$ (that is, $E(u \mid W)$ is linear in $W$). Then under conditional mean independence,

$$E(u \mid X, W) = E(u \mid W) = \gamma_0 + \gamma_2 W \quad (*)$$

Let
$$v = u - E(u \mid X, W) \quad (**)$$

so that $E(v \mid X, W) = 0$. Combining $(*)$ and $(**)$ yields

$$u = E(u \mid X, W) + v = \gamma_0 + \gamma_2 W + v, \quad \text{where } E(v \mid X, W) = 0$$

We now substitute $u = \gamma_0 + \gamma_2 W + v$ into the regression:

$$Y = \beta_0 + \beta_1 X + \beta_2 W + u$$
$$= \beta_0 + \beta_1 X + \beta_2 W + \gamma_0 + \gamma_2 W + v$$
$$= (\beta_0 + \gamma_0) + \beta_1 X + (\beta_2 + \gamma_2) W + v$$
$$= \delta_0 + \beta_1 X + \delta_2 W + v, \quad \text{where } \delta_0 = \beta_0 + \gamma_0,\ \delta_2 = \beta_2 + \gamma_2$$

Since $E(v \mid X, W) = 0$, the OLS estimators of $\delta_0$, $\beta_1$, and $\delta_2$ are unbiased.

Notice that $E(\hat{\beta}_1) = \beta_1$ and $E(\hat{\beta}_2) = \delta_2 = \beta_2 + \gamma_2$.

→ $\hat{\beta}_1$ is an unbiased estimator of the causal effect $\beta_1$, but $\hat{\beta}_2$ is not an unbiased estimator of $\beta_2$.

Under conditional mean independence,

$$E(\hat{\beta}_1) = \beta_1, \quad \text{and} \quad E(\hat{\beta}_2) = \delta_2 = \beta_2 + \gamma_2 \neq \beta_2$$


In summary, if $W$ is such that conditional mean independence is satisfied, then:

• The OLS estimator of the effect of interest, $\hat{\beta}_1$, is unbiased
• The OLS estimator of the coefficient on the control variable, $\hat{\beta}_2$, does not have a causal interpretation. The reason is that the control variable is correlated with omitted variables in the error term, so that $\hat{\beta}_2$ is subject to omitted variable bias

Coming up next: Hypothesis Testing and Confidence Intervals
