
Lecture 6

Introduction:
Multiple Regression - Further Applications

Dragos Radu

5SSMN932: Introduction to Econometrics


outline lecture 6

• interactions between independent variables


• interactions between binary variables
• interactions between a binary and a continuous variable
• interactions between two continuous variables
• summary of nonlinear regression models of test scores

Recommended readings:
Stock and Watson, chapter 8
test scores: language learning and small classes

What if students who are still learning English benefit in a different way
from one-on-one or small-group instruction?
• perhaps smaller classes help more if there are many English learners,
who need individual attention
• in that case, $\Delta TestScore / \Delta STR$ might depend on $PctEL$
• or more generally: $\Delta Y / \Delta X_1$ might depend on $X_2$
• how do we model such “interactions” between $X_1$ and $X_2$?
three types of interactions

$Y = \beta_0 + \beta_1 \cdot X_1 + \beta_2 \cdot X_2 + \beta_3 \cdot (X_1 \times X_2) + u$
We consider three cases:
• both $X_1$ and $X_2$ are binary
• $X_1$ is continuous and $X_2$ is binary
• both $X_1$ and $X_2$ are continuous
our variables

outcome of interest:
testscr   average of reading and math scores on achievement test

variable of interest (treatment):
str       student-teacher ratio (number of students per teacher)

control variables:
el_pct    percent of English learners
expn_stu  expenditures per student ($)
avginc    district average income (in $1000s)
example: TestScr , STR and English learners

we define two new binary variables corresponding to high/low
student-to-teacher ratios and high/low percentages of English learners:

$HiSTR = \begin{cases} 1, & \text{if } STR \geq 20 \\ 0, & \text{if } STR < 20 \end{cases}$ and $HiEL = \begin{cases} 1, & \text{if } PctEL \geq 10 \\ 0, & \text{if } PctEL < 10 \end{cases}$

How can we allow the effect of being in a small class to depend on the
percentage of English learners?
We can do this by regressing test scores on these two dummies
and the interaction between them.
regression with two dummies, no interaction
we first generate the two dummies:
. gen histr=str>=20
. gen hiel=el_pct>=10
we could then run the following regression:
. reg testscr histr hiel

Source | SS df MS Number of obs = 420


-------------+---------------------------------- F(2, 417) = 86.64
Model | 44653.9085 2 22326.9542 Prob > F = 0.0000
Residual | 107455.685 417 257.687494 R-squared = 0.2936
-------------+---------------------------------- Adj R-squared = 0.2902
Total | 152109.594 419 363.030056 Root MSE = 16.053
------------------------------------------------------------------------------
testscr | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
histr | -3.586749 1.610329 -2.23 0.026 -6.752124 -.4213748
hiel | -19.71858 1.601846 -12.31 0.000 -22.86728 -16.56988
_cons | 664.725 1.200639 553.64 0.000 662.365 667.0851
------------------------------------------------------------------------------
But this assumes that the effect of changing the class size doesn’t depend on the % of English learners.
Can you explain why?
when the effect of HiSTR doesn’t depend on HiEL

We can see why the previous regression (just on the two dummies)
assumes that the effect of class size is the same in districts with high and
low % of English learners using the equation:

$\widehat{TestScr} = \beta_0 + \beta_1 \cdot HiSTR + \beta_2 \cdot HiEL$
your turn: regression with two interacted binary variables
To allow the effect of HiSTR to depend on HiEL we include the interaction term in the
regression. We can do this directly in Stata using the ## operator:
. reg testscr histr##hiel

------------------------------------------------------------------------------
testscr | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.histr | -1.907842 2.233654 -0.85 0.394 -6.298497 2.482813
1.hiel | -18.16295 2.150084 -8.45 0.000 -22.38933 -13.93656
|
histr#hiel |
1 1 | -3.494335 3.22244 -1.08 0.279 -9.82863 2.83996
|
_cons | 664.1433 1.314807 505.13 0.000 661.5588 666.7278
------------------------------------------------------------------------------

The binary operator histr##hiel tells Stata to do three things:
• to include the dummy histr in the regression
• to include the dummy hiel in the regression
• to include the interaction histr × hiel in the regression
your turn: regression with two interacted binary variables
TestScr , STR and English learners

Can you relate these coefficients to the following table of group means?
Can you fill in the rest of the cells using the regression results?

          Low STR   High STR
Low EL    664.1
High EL
what comes next?

• part 1 lecture 6: interaction between two binary (dummy) variables


• part 2 lecture 6: interaction between binary and continuous variables
• part 3 lecture 6: interaction between two continuous variables
Lecture 6
Part I: Interactions between binary variables



outline lecture 6 part I

• interaction between two dummy variables


• interpretation and examples
when the effect of HiSTR doesn’t depend on HiEL

$\widehat{TestScr} = \beta_0 + \beta_1 \cdot HiSTR + \beta_2 \cdot HiEL$

• no matter what the % of English learners is, the effect of changing
from small to large classes is $\beta_1$
• $\beta_1$ is the effect of changing $HiSTR = 0$ to $HiSTR = 1$
• in this specification, this effect doesn’t depend on the value of $HiEL$
when the effect of HiSTR does depend on HiEL

• to allow the effect of changing HiSTR to depend on HiEL we include
the interaction term $HiSTR \times HiEL$ as a regressor:

$TestScr_i = \beta_0 + \beta_1 \cdot HiSTR_i + \beta_2 \cdot HiEL_i + \beta_3 \cdot (HiSTR_i \times HiEL_i) + u_i$

$E(TestScr_i \mid HiSTR_i = 0, HiEL) = \beta_0 + \beta_2 \cdot HiEL$
$E(TestScr_i \mid HiSTR_i = 1, HiEL) = \beta_0 + \beta_1 + \beta_2 \cdot HiEL + \beta_3 \cdot HiEL$
$\Delta TestScr = \beta_1 + \beta_3 \cdot HiEL$

• the effect of HiSTR depends on HiEL (what we wanted)
• $\beta_3$ = increment to the effect of HiSTR when HiEL = 1
regression with two interacted binary variables
To allow the effect of HiSTR to depend on HiEL we include the interaction term in the
regression. We can do this directly in Stata using the ## operator:
. reg testscr histr##hiel

------------------------------------------------------------------------------
testscr | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.histr | -1.907842 2.233654 -0.85 0.394 -6.298497 2.482813
1.hiel | -18.16295 2.150084 -8.45 0.000 -22.38933 -13.93656
|
histr#hiel |
1 1 | -3.494335 3.22244 -1.08 0.279 -9.82863 2.83996
|
_cons | 664.1433 1.314807 505.13 0.000 661.5588 666.7278
------------------------------------------------------------------------------

The binary operator histr##hiel tells Stata to do three things:
• to include the dummy histr in the regression
• to include the dummy hiel in the regression
• to include the interaction histr × hiel in the regression
regression with two interacted binary variables
TestScr , STR and English learners

Can you relate these coefficients to the following table of group means?

          Low STR   High STR
Low EL    664.1     662.2
High EL   645.9     640.5

• “effect” of HiSTR if HiEL = 0 is −1.9
• “effect” of HiSTR if HiEL = 1 is −1.9 − 3.5 = −5.4
• reducing class size appears more effective in districts with many English learners, although the interaction term is not statistically significant (p = 0.279)
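The link between the coefficients and the four group means can be checked by hand. The following is a minimal sketch in Python (not part of the original slides) that plugs the coefficients from the `reg testscr histr##hiel` output into the interaction model; the names `cell_mean`, `means`, `effect_low_el` and `effect_high_el` are illustrative. The cell means it produces agree with the table up to about 0.1 point of rounding.

```python
# Coefficients copied from the `reg testscr histr##hiel` output.
b0 = 664.1433    # _cons: mean for low STR, low EL
b1 = -1.907842   # 1.histr
b2 = -18.16295   # 1.hiel
b3 = -3.494335   # histr#hiel interaction


def cell_mean(histr, hiel):
    """Predicted mean test score for one (HiSTR, HiEL) cell."""
    return b0 + b1 * histr + b2 * hiel + b3 * histr * hiel


# One predicted mean per cell of the 2x2 table, keyed by (HiSTR, HiEL).
means = {(s, e): cell_mean(s, e) for s in (0, 1) for e in (0, 1)}

# Effect of moving from low to high STR within each EL group.
effect_low_el = means[(1, 0)] - means[(0, 0)]    # equals b1
effect_high_el = means[(1, 1)] - means[(0, 1)]   # equals b1 + b3

for (s, e), m in sorted(means.items()):
    print(f"HiSTR={s} HiEL={e}: {m:.1f}")
print(f"effect of HiSTR when HiEL=0: {effect_low_el:.1f}")
print(f"effect of HiSTR when HiEL=1: {effect_high_el:.1f}")
```

With only two dummies and their interaction the model is saturated, so each fitted value is simply the sample mean of its cell.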
Stata operators for indicator variables

Stata provides four operators for creating indicator variables from categorical variables
and for interacting categorical and continuous variables:
Operator Description
----------------------------------------------------------
i. operator to specify indicators (categories)
c. operator to treat a variable as continuous
# binary operator to specify interactions
## binary operator to specify factorial interactions
-----------------------------------------------------------
to see how prefixes and binary interaction operators work in Stata use:
help fvvarlist
what comes next?

• part 2 lecture 6: interaction between binary and continuous variables


• part 3 lecture 6: interaction between two continuous variables
Lecture 6
Part II:
Interactions between a binary and a continuous variable



outline lecture 6 part II

• interaction between a binary and a continuous variable


• interpretation and examples:
• test scores and class size (HiEL as binary)
• wage regression - gender differences on the labour market
dummy in multiple regression
$wage = \beta_0 + \delta_0 \cdot female + \beta_1 \cdot exper$

$\delta_0 = E(wage \mid female, exper_0) - E(wage \mid male, exper_0)$


. reg wage female exper

Source | SS df MS Number of obs = 750


-------------+------------------------------ F( 2, 747) = 59.45
Model | 2890.1896 2 1445.0948 Prob > F = 0.0000
Residual | 18158.7608 747 24.3089168 R-squared = 0.1373
-------------+------------------------------ Adj R-squared = 0.1350
Total | 21048.9504 749 28.1027376 Root MSE = 4.9304
------------------------------------------------------------------------------
wage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | -2.987036 .3780069 -7.90 0.000 -3.729118 -2.244954
exper | .3330561 .0696329 4.78 0.000 .1963566 .4697555
_cons | 8.642637 .8171341 10.58 0.000 7.038484 10.24679
------------------------------------------------------------------------------

we impose a common slope on exper for men and women ($\beta_1 = .333$ in this example);
only the intercepts are allowed to differ.
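To see what the common-slope restriction implies, the sketch below (Python, illustrative only; `predicted_wage` and `gaps` are names introduced here) evaluates the fitted equation from the `reg wage female exper` output at a few experience levels. The female-male difference is the same at every level and equals the coefficient on female.

```python
# Coefficients copied from the `reg wage female exper` output.
b0 = 8.642637     # _cons
d0 = -2.987036    # female
b1 = 0.3330561    # exper


def predicted_wage(female, exper):
    """Fitted wage from the model with an intercept shift only."""
    return b0 + d0 * female + b1 * exper


# Female-minus-male gap at several experience levels (chosen arbitrarily).
gaps = [predicted_wage(1, x) - predicted_wage(0, x) for x in (0, 5, 10)]
for x, g in zip((0, 5, 10), gaps):
    print(f"exper={x}: gap = {g:.2f}")
```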
intercept shift
graph of $wage = \beta_0 + \delta_0 \cdot female + \beta_1 \cdot exper$ for $\delta_0 < 0$
[Figure: predicted wage against exper (0 to 14) for men and women; two parallel lines with slope .333, the women's line lying 2.99 below the men's.]

regressions with dummy and continuous variables
when the effect of STR does depend on HiEL

• we allow the effect of changing STR to depend on HiEL by including
the interaction term $STR \times HiEL$ as a regressor:

$TestScr_i = \beta_0 + \beta_1 \cdot STR_i + \beta_2 \cdot HiEL_i + \beta_3 \cdot (STR_i \times HiEL_i) + u_i$

for a change in STR by $\Delta STR$:

$TestScr = \beta_0 + \beta_1 \cdot STR + \beta_2 \cdot HiEL + \beta_3 \cdot (STR \times HiEL)$
$TestScr + \Delta TestScr = \beta_0 + \beta_1 \cdot (STR + \Delta STR) + \beta_2 \cdot HiEL + \beta_3 \cdot (STR + \Delta STR) \times HiEL$
$\Delta TestScr = \beta_1 \cdot \Delta STR + \beta_3 \cdot HiEL \cdot \Delta STR$
$\Delta TestScr / \Delta STR = \beta_1 + \beta_3 \cdot HiEL$

• the effect of STR depends on HiEL (what we wanted)
• $\beta_3$ = increment to the effect of STR when HiEL = 1
allowing for different slopes

$TestScore = \beta_0 + \beta_1 \cdot STR + \beta_2 \cdot HiEL + \beta_3 \cdot HiEL \cdot STR + u$

we can compare the slopes and intercepts by rewriting the model:

$TestScore = (\beta_0 + \beta_2 \cdot HiEL) + (\beta_1 + \beta_3 \cdot HiEL) \cdot STR + u$

                                      Intercept             Slope
HiEL=0 (Low % EL)                     $\beta_0$             $\beta_1$
HiEL=1 (High % EL)                    $\beta_0 + \beta_2$   $\beta_1 + \beta_3$
Difference (High % EL) − (Low % EL)   $\beta_2$             $\beta_3$
interaction between a binary and a continuous variable
we allow the effect of STR to depend on HiEL by including their interaction in the regression:
. reg testscr c.str##hiel

Source | SS df MS Number of obs = 420


-------------+---------------------------------- F(3, 416) = 62.40
Model | 47205.8516 3 15735.2839 Prob > F = 0.0000
Residual | 104903.742 416 252.172457 R-squared = 0.3103
-------------+---------------------------------- Adj R-squared = 0.3054
Total | 152109.594 419 363.030056 Root MSE = 15.88

------------------------------------------------------------------------------
testscr | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
str | -.9684601 .539787 -1.79 0.074 -2.02951 .0925899
1.hiel | 5.639141 16.71767 0.34 0.736 -27.2225 38.50078
|
hiel#c.str |
1 | -1.276613 .8440608 -1.51 0.131 -2.935769 .3825425
|
_cons | 682.2458 10.51094 64.91 0.000 661.5847 702.907
------------------------------------------------------------------------------
The binary operator c.str##hiel tells Stata to treat str as a continuous variable and
hiel as binary.
interacted continuous and binary variables
TestScr , STR and English learners

• when HiEL = 0:
$\widehat{TestScr} = 682.2 - 0.97 \cdot STR$
• when HiEL = 1:
$\widehat{TestScr} = 682.2 - 0.97 \cdot STR + 5.6 - 1.28 \cdot STR = 687.8 - 2.25 \cdot STR$
• two regression lines, one for each HiEL group
• class size reduction is estimated to have a larger effect when the
percent of English learners is large
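The two regression lines can be recovered directly from the `reg testscr c.str##hiel` output. The sketch below (Python, illustrative; `line_for_group` and the two-student reduction are assumptions made for the example) computes the intercept and slope for each group and the predicted gain from cutting STR by 2. With unrounded coefficients the HiEL = 1 intercept is about 687.9, while the 687.8 above comes from adding the rounded values 682.2 and 5.6.

```python
# Coefficients copied from the `reg testscr c.str##hiel` output.
b0 = 682.2458     # _cons
b1 = -0.9684601   # str
b2 = 5.639141     # 1.hiel
b3 = -1.276613    # hiel#c.str


def line_for_group(hiel):
    """Return (intercept, slope) of the fitted line for one HiEL group."""
    return b0 + b2 * hiel, b1 + b3 * hiel


for hiel in (0, 1):
    intercept, slope = line_for_group(hiel)
    # Predicted change in test score from reducing STR by 2 students per teacher.
    gain = slope * (-2)
    print(f"HiEL={hiel}: intercept={intercept:.1f}, slope={slope:.2f}, "
          f"gain from cutting STR by 2 = {gain:.2f}")
```

Neither the interaction coefficient nor the coefficient on hiel is individually significant in this output, so the difference between the two lines is estimated imprecisely.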
back to our wage regression
$lwage = \beta_0 + \delta_0 \cdot female + \beta_1 \cdot exper + \delta_1 \cdot female \cdot exper + u$

. gen femexper = female*exper

. reg lwage female exper femexper

Source | SS df MS Number of obs = 750


-------------+------------------------------ F( 3, 746) = 54.51
Model | 32.4164273 3 10.8054758 Prob > F = 0.0000
Residual | 147.879402 746 .198229762 R-squared = 0.1798
-------------+------------------------------ Adj R-squared = 0.1765
Total | 180.295829 749 .240715393 Root MSE = .44523

------------------------------------------------------------------------------
lwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | -.5184152 .1469194 -3.53 0.000 -.8068399 -.2299905
exper | .0283287 .0111194 2.55 0.011 .0064997 .0501577
femexper | .0233771 .0134822 1.73 0.083 -.0030905 .0498447
_cons | 2.097866 .1258909 16.66 0.000 1.850724 2.345009
------------------------------------------------------------------------------
allowing for different slopes in the wage regression

$\widehat{lwage} = \underset{(.126)}{2.098} - \underset{(.147)}{.518} \cdot female + \underset{(.0111)}{.0283} \cdot exper + \underset{(.0135)}{.0234} \cdot female \cdot exper$
$n = 750, \quad R^2 = .180$

• the intercept for men is 2.098 and the slope is .0283, i.e. about 2.8% for
each year of experience.
• the intercept for women is 2.098 − .518 = 1.58 and the slope is
.0283 + .0234 = .0517, i.e. about 5.2% for each year of experience.
• the interaction term is marginally statistically significant, with p-value
= .083 (significant at the 10% level but not at the 5% level).
allowing for different slopes in the wage regression
$lwage = \beta_0 + \delta_0 \cdot female + \beta_1 \cdot exper + \delta_1 \cdot female \cdot exper + u$
[Figure: predicted lwage against exper (0 to 14) for male and female; the male line has slope .0283 and the female line has slope .0517; the vertical difference is .518 at exper = 0 and .190 at exper = 14.]
interpretation

$\widehat{lwage} = \underset{(.126)}{2.098} - \underset{(.147)}{.518} \cdot female + \underset{(.0111)}{.0283} \cdot exper + \underset{(.0135)}{.0234} \cdot female \cdot exper$
$n = 750, \quad R^2 = .180$

must use care to interpret the coefficient on female when female · exper is
included: at any level of experience, the predicted difference in lwage
between females and males is

$-.518 + .0234 \cdot exper$

the coefficient on female, −.518, is the predicted difference in lwage
between a woman and a man when exper = 0.
this is not an especially interesting subset of the population.
(only about 1.5% of the sample has exper less than 3, and no one can
have zero years.)
interpretation

more interesting is the gap at around the mean, say exper = 10:

$-.518 + .0234(10) = -.284$

or about 28.4% less for women; the gap never fully closes (largest amount
of experience in the sample = 13 56 years)
• we can centre the variable by replacing female · exper with
female · (exper − 10)
• the coefficient on female becomes the difference at 10 years of exper
• 10 is close to the mean value of experience in the sample.
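The gap at any experience level follows from the two gender coefficients alone. The sketch below (Python, illustrative; `log_wage_gap` and `break_even_exper` are names introduced here) evaluates the gap at 10 years and finds the experience level at which the fitted gap would reach zero. That break-even point, roughly 22 years, is an extrapolation beyond the experience range shown in the plot and is shown only to explain why the gap does not close within the sample.

```python
# Coefficients copied from the lwage regression with the interaction term.
d0 = -0.5184152   # coefficient on female
d1 = 0.0233771    # coefficient on femexper (female * exper)


def log_wage_gap(exper):
    """Predicted female-minus-male difference in lwage at a given exper."""
    return d0 + d1 * exper


gap_at_10 = log_wage_gap(10)

# Experience level at which the fitted gap would equal zero: d0 + d1 * x = 0.
break_even_exper = -d0 / d1

print(f"gap at exper=10: {gap_at_10:.3f}")
print(f"fitted gap reaches zero at exper = {break_even_exper:.1f}")
```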
results after centring

. gen femexper_10 = female*(exper - 10)

. reg lwage female exper femexper_10

Source | SS df MS Number of obs = 750


-------------+------------------------------ F( 3, 746) = 54.51
Model | 32.4164273 3 10.8054758 Prob > F = 0.0000
Residual | 147.879402 746 .198229762 R-squared = 0.1798
-------------+------------------------------ Adj R-squared = 0.1765
Total | 180.295829 749 .240715393 Root MSE = .44523

------------------------------------------------------------------------------
lwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | -.2846441 .0350777 -8.11 0.000 -.3535068 -.2157814
exper | .0283287 .0111194 2.55 0.011 .0064997 .0501577
femexper_10 | .0233771 .0134822 1.73 0.083 -.0030905 .0498447
_cons | 2.097866 .1258909 16.66 0.000 1.850724 2.345009
------------------------------------------------------------------------------
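Why only the coefficient on female changes can be seen by rewriting the interaction term. The derivation below is a short sketch using the symbols of the wage equation; the numerical check uses the coefficients from the uncentred regression and reproduces the centred coefficient on female up to rounding.

```latex
\begin{align*}
\delta_0\, female + \delta_1\, female \cdot exper
  &= \delta_0\, female + \delta_1\, female \cdot (exper - 10) + 10\,\delta_1\, female \\
  &= (\delta_0 + 10\,\delta_1)\, female + \delta_1\, female \cdot (exper - 10) \\[4pt]
\delta_0 + 10\,\delta_1 &= -.5184152 + 10\,(.0233771) = -.2846442
\end{align*}
```

The slope on exper, the interaction coefficient, the intercept and the fit statistics are unchanged because centring only re-parameterises the same model.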
what comes next?

• part 3 lecture 6: interaction between two continuous variables


Lecture 6
Part III: Interactions between two continuous variables



outline lecture 6 part III

• interaction between two continuous variables


• summary of nonlinear effects of class size on test scores
when the effect of STR does depend on PctEL
interaction between two continuous variables

• we allow the effect of changing STR to depend on PctEL by including
the interaction term $STR \times PctEL$ as a regressor:

$TestScr_i = \beta_0 + \beta_1 \cdot STR_i + \beta_2 \cdot PctEL_i + \beta_3 \cdot (STR_i \times PctEL_i) + u_i$

for a change in STR by $\Delta STR$:

$TestScr = \beta_0 + \beta_1 \cdot STR + \beta_2 \cdot PctEL + \beta_3 \cdot (STR \times PctEL)$
$TestScr + \Delta TestScr = \beta_0 + \beta_1 \cdot (STR + \Delta STR) + \beta_2 \cdot PctEL + \beta_3 \cdot (STR + \Delta STR) \times PctEL$
$\Delta TestScr = \beta_1 \cdot \Delta STR + \beta_3 \cdot PctEL \cdot \Delta STR$
$\Delta TestScr / \Delta STR = \beta_1 + \beta_3 \cdot PctEL$

• the effect of STR depends on PctEL (what we wanted)
• $\beta_3$ = change in the effect of STR for a one-unit increase in PctEL
interaction between two continuous variables

To allow the effect of STR to depend on PctEL we include the interaction term in the
regression. We can do this directly in Stata using the ## operator:
. reg testscr c.str##c.el_pct, r

Linear regression Number of obs = 420


F(3, 416) = 155.05
Prob > F = 0.0000
R-squared = 0.4264
Root MSE = 14.482
--------------------------------------------------------------------------------
| Robust
testscr | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------------+----------------------------------------------------------------
str | -1.117018 .5875135 -1.90 0.058 -2.271884 .037847
el_pct | -.6729114 .3741231 -1.80 0.073 -1.408319 .062496
|
c.str#c.el_pct | .0011618 .0185357 0.06 0.950 -.0352736 .0375971
|
_cons | 686.3385 11.75935 58.37 0.000 663.2234 709.4537
--------------------------------------------------------------------------------
The binary operator c.str##c.el_pct tells Stata to treat both str and el_pct as continuous
variables.
interaction between two continuous variables
TestScr , STR and English learners

• the estimated effect of class size reduction is nonlinear because the
size of the effect itself depends on PctEL:
$\Delta TestScr / \Delta STR = -1.12 + .0012 \cdot PctEL$
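To get a feel for how little the slope moves with PctEL, the sketch below (Python, illustrative; `str_effect` and the three PctEL values are chosen here for the example) evaluates the estimated effect of STR at a few levels of PctEL using the coefficients from the `reg testscr c.str##c.el_pct, r` output.

```python
# Coefficients copied from the `reg testscr c.str##c.el_pct, r` output.
b_str = -1.117018      # str
b_inter = 0.0011618    # c.str#c.el_pct


def str_effect(pct_el):
    """Estimated change in test score per unit of STR at a given PctEL."""
    return b_str + b_inter * pct_el


# PctEL values picked arbitrarily to span a range of districts.
for pct_el in (0, 10, 20):
    print(f"PctEL={pct_el}: effect of STR = {str_effect(pct_el):.3f}")
```

The interaction coefficient has a t-statistic of 0.06 in that output, so the variation across PctEL is both small and statistically indistinguishable from zero.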
summary: nonlinear effects on test score of the STR

nonlinear specifications let us examine more nuanced questions about the
test score-STR relation, such as:
1. are there nonlinear effects of class size reduction on test scores?
(does a reduction from 35 to 30 have the same effect as a reduction from
20 to 15?)
2. are there nonlinear interactions between PctEL and STR?
(are small classes more effective when there are many English
learners?)
nonlinearities: different effects for different STR

• estimate linear and nonlinear functions of STR, holding constant
relevant demographic variables
• PctEL
• income (remember the nonlinear test score-income relation!)
• LunchPCT (fraction on free/subsidised lunch)
• see whether adding the nonlinear terms makes an “economically
important” quantitative difference (“economic” or “real-world”
importance is different from statistical significance)
• test whether the nonlinear terms are significant
interactions between PctEL and STR

• estimate linear and nonlinear functions of STR, interacted with PctEL
• if the specification is nonlinear (with $STR$, $STR^2$, $STR^3$), then you
need to add interactions with all the terms so that the entire
functional form can be different, depending on the level of PctEL
• we will use a binary-continuous interaction specification by adding
$HiEL \times STR$, $HiEL \times STR^2$, and $HiEL \times STR^3$
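Interacting the dummy with every power of STR can be sketched as follows. This Python helper is illustrative only (the function name and the ordering of the terms are assumptions); it builds the regressors for one district so that the whole cubic is allowed to differ between the two HiEL groups.

```python
def interacted_cubic_terms(str_value, hiel):
    """Regressors for a cubic in STR fully interacted with the dummy HiEL.

    Order: STR, STR^2, STR^3, HiEL, HiEL*STR, HiEL*STR^2, HiEL*STR^3.
    """
    powers = [str_value, str_value ** 2, str_value ** 3]
    interactions = [hiel * p for p in powers]
    return powers + [hiel] + interactions


low_el_row = interacted_cubic_terms(20, 0)
high_el_row = interacted_cubic_terms(20, 1)
print(low_el_row)
print(high_el_row)
```

When HiEL = 0 all interaction terms vanish, so the coefficients on the interactions measure how the cubic for high-EL districts departs from the cubic for low-EL districts.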
nonlinear regression models of test scores
[Table: nonlinear regression models of test scores (Tab 8.3)]
regression functions of test scores on class size

the cubic regressions from columns (5) and (7) are very similar
regression functions of test scores on class size

the two lines have similar shapes and slopes for most districts (17 < STR < 23)
conclusions

• using functions of the independent variables such as $\ln(X)$ or
$X_1 \times X_2$ allows recasting a large family of nonlinear regression
functions as multiple regression
• estimation and inference proceed in the same way as in the linear
multiple regression model
• interpretation of the coefficients is model-specific, but the general rule
is to compute effects by comparing different cases (different values of
the original X’s)
• many nonlinear specifications are possible, so you must use judgment:
• what nonlinear effect do you want to analyse?
• what makes sense in your application?
what comes next?

• lecture 7 : Instrumental Variables


• meanwhile: coursework questions will be uploaded 10am on Nov 13
