Dougherty, Introduction to Econometrics, 5th edition
Chapter 3: Multiple Regression Analysis
The regression coefficients are derived using the same least squares principle used in
simple regression analysis. The fitted value of Y in observation i depends on our choice of
b1, b2, and b3.
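With two explanatory variables, the fitted value in observation i is

\hat{Y}_i = b_1 + b_2 X_{2i} + b_3 X_{3i}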
MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE
The residual \hat{u}_i in observation i is the difference between the actual and fitted values of Y.
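That is,

\hat{u}_i = Y_i - \hat{Y}_i = Y_i - b_1 - b_2 X_{2i} - b_3 X_{3i}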
RSS = \sum_i \hat{u}_i^2 = \sum_i \left( Y_i - b_1 - b_2 X_{2i} - b_3 X_{3i} \right)^2
We define RSS, the sum of the squares of the residuals, and choose b1, b2, and b3 so as to
minimize it.
Expanding the quadratic, we have

RSS = \sum Y_i^2 + n b_1^2 + b_2^2 \sum X_{2i}^2 + b_3^2 \sum X_{3i}^2 - 2 b_1 \sum Y_i - 2 b_2 \sum X_{2i} Y_i - 2 b_3 \sum X_{3i} Y_i + 2 b_1 b_2 \sum X_{2i} + 2 b_1 b_3 \sum X_{3i} + 2 b_2 b_3 \sum X_{2i} X_{3i}
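Setting the partial derivatives of RSS with respect to b_1, b_2, and b_3 equal to zero gives the first-order conditions for a minimum:

\frac{\partial RSS}{\partial b_1} = -2 \sum \left( Y_i - b_1 - b_2 X_{2i} - b_3 X_{3i} \right) = 0
\frac{\partial RSS}{\partial b_2} = -2 \sum X_{2i} \left( Y_i - b_1 - b_2 X_{2i} - b_3 X_{3i} \right) = 0
\frac{\partial RSS}{\partial b_3} = -2 \sum X_{3i} \left( Y_i - b_1 - b_2 X_{2i} - b_3 X_{3i} \right) = 0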
\hat{\beta}_2 = \frac{\sum \left( Y_i - \bar{Y} \right)\left( X_{2i} - \bar{X}_2 \right) \sum \left( X_{3i} - \bar{X}_3 \right)^2 - \sum \left( Y_i - \bar{Y} \right)\left( X_{3i} - \bar{X}_3 \right) \sum \left( X_{2i} - \bar{X}_2 \right)\left( X_{3i} - \bar{X}_3 \right)}{\sum \left( X_{2i} - \bar{X}_2 \right)^2 \sum \left( X_{3i} - \bar{X}_3 \right)^2 - \left( \sum \left( X_{2i} - \bar{X}_2 \right)\left( X_{3i} - \bar{X}_3 \right) \right)^2}
We thus obtain three equations in three unknowns. Solving these equations, we obtain expressions for the specific values that satisfy the OLS criterion. (The expression for \hat{\beta}_3 is the same as that for \hat{\beta}_2, with the subscripts 2 and 3 interchanged everywhere.)
However, the expressions for the slope coefficients are considerably more complex than
that for the slope coefficient in simple regression analysis.
For the general case when there are many explanatory variables, ordinary algebra is
inadequate. It is necessary to switch to matrix algebra.
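For reference, and as a standard result not derived in these slides: writing the model as Y = X\beta + u, where X is the n \times k matrix of observations on the explanatory variables (with a column of ones for the intercept), the least squares criterion RSS = (Y - Xb)'(Y - Xb) is minimized by

\hat{\beta} = (X'X)^{-1} X'Y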
\widehat{EARNINGS} = -14.67 + 1.88 \, S + 0.98 \, EXP
Here is the fitted equation for the wage equation, estimated using Data Set 21.
It indicates that hourly earnings increase by $1.88 for every extra year of schooling and by
$0.98 for every extra year of work experience.
Literally, the intercept indicates that an individual who had no schooling or work experience
would have hourly earnings of –$14.67.
Obviously, this is impossible. The lowest value of S in the sample was 8. We have obtained
a nonsense estimate because we have extrapolated too far from the data range.
GRAPHING A RELATIONSHIP IN A MULTIPLE REGRESSION MODEL
\widehat{EARNINGS} = -14.67 + 1.88 \, S + 0.98 \, EXP
The equation above shows the result of regressing EARNINGS, hourly earnings in dollars, on S, years of schooling, and EXP, years of work experience.
[Figure: scatter diagram of hourly earnings ($, 0 to 120) against years of schooling (highest grade completed, 0 to 20), with the simple regression line]
Suppose that you were particularly interested in the relationship between EARNINGS and S
and wished to represent it graphically, using the sample data.
. cor S EXP
(obs=500)

        |      S    EXP
--------+------------------
      S |  1.0000
    EXP | -0.5836  1.0000
Schooling is negatively correlated with work experience. The plot fails to take account of
this, and as a consequence the regression line underestimates the impact of schooling on
earnings.
We will investigate the distortion mathematically when we come to omitted variable bias.
To eliminate the distortion, you purge both EARNINGS and S of their components related to
EXP and then draw a scatter diagram using the purged variables.
We start by regressing EARNINGS on EXP. The residuals from this regression are the part of EARNINGS that is not related to EXP. In Stata, the 'predict' command with the 'resid' option saves the residuals from the most recent regression. We name them EEARN.
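In command form (a sketch of this first purging step, using the variable name in the text):

. reg EARNINGS EXP
. predict EEARN, resid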
. reg S EXP
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 257.18
Model | 1278.43322 1 1278.43322 Prob > F = 0.0000
Residual | 2475.58878 498 4.9710618 R-squared = 0.3406
-----------+------------------------------ Adj R-squared = 0.3392
Total | 3754.022 499 7.52309018 Root MSE = 2.2296
----------------------------------------------------------------------------
S | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
EXP | -.5473191 .0341292 -16.04 0.000 -.6143741 -.4802641
_cons | 18.39324 .241494 76.16 0.000 17.91877 18.86771
----------------------------------------------------------------------------
We do the same with S. We regress it on EXP and save the residuals as ES.
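Following the regression of S on EXP shown above, the residuals are again saved with the 'predict' command (a sketch):

. predict ES, resid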
[Figure: scatter diagram of EEARN (vertical axis, -40 to 80) against ES (horizontal axis, -10 to 6), with the trend line from the residuals regression shown as a solid line and the regression line that did not control for EXP as a dashed line]
Now we plot EEARN on ES and the scatter is a faithful representation of the relationship,
both in terms of the slope of the trend line (the solid line) and in terms of the variation about
that line.
As you would expect, the trend line is steeper than that in the scatter diagram which did not control for EXP (reproduced here as the dashed line).
. reg EEARN ES
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 70.56
Model | 8727.05507 1 8727.05507 Prob > F = 0.0000
Residual | 61593.5414 498 123.68181 R-squared = 0.1241
-----------+------------------------------ Adj R-squared = 0.1223
Total | 70320.5965 499 140.923039 Root MSE = 11.121
----------------------------------------------------------------------------
EEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
ES | 1.877563 .2235186 8.40 0.000 1.438408 2.316719
_cons | -1.32e-08 .4973566 -0.00 1.000 -.977176 .977176
----------------------------------------------------------------------------
A mathematical proof that the technique works requires matrix algebra. We will content ourselves with verifying that the estimate of the slope coefficient is the same as in the multiple regression.
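In Stata, the comparison amounts to running

. reg EARNINGS S EXP
. reg EEARN ES

and comparing the coefficient of S in the first with the coefficient of ES in the second: 1.877563 in the residuals regression above, matching the coefficient of S, 1.88, in the fitted equation at the start of this section to two decimal places.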
Finally, a small and not very important technical point. You may have noticed that the standard error and t statistic do not quite match those in the multiple regression. The reason for this is that the number of degrees of freedom is overstated by 1 in the residuals regression.
That regression has not made allowance for the fact that we have already used up 1 degree
of freedom in removing EXP from the model.
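The size of the effect can be gauged directly (a back-of-the-envelope check, assuming the residual sum of squares is the same in the residuals regression as in the multiple regression, which is what the purging argument implies): the residuals regression divides RSS by n - 2 = 498 degrees of freedom where the multiple regression divides by n - 3 = 497, so its standard errors are understated by a factor of \sqrt{497 / 498} \approx 0.999, a negligible amount here.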
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS
\sigma^2_{\hat{\beta}_2} = \frac{\sigma_u^2}{\sum \left( X_{2i} - \bar{X}_2 \right)^2} \times \frac{1}{1 - r^2_{X_2 , X_3}} = \frac{\sigma_u^2}{n \, \mathrm{MSD}(X_2)} \times \frac{1}{1 - r^2_{X_2 , X_3}}
This sequence investigates the variances and standard errors of the slope coefficients in a
model with two explanatory variables.
The expression for the variance of \hat{\beta}_2 is shown above. The expression for the variance of \hat{\beta}_3 is the same, with the subscripts 2 and 3 interchanged.
The first factor in the expression is identical to that for the variance of the slope coefficient
in a simple regression model.
The variance of \hat{\beta}_2 depends on the variance of the disturbance term, the number of observations, and the mean square deviation of X_2, for exactly the same reasons as in a simple regression model.
The higher is the correlation between the explanatory variables, positive or negative, the
greater will be the variance.
This is easy to understand intuitively. The greater the correlation, the harder it is to
discriminate between the effects of the explanatory variables on Y, and the less accurate
will be the regression estimates.
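To put the final factor into perspective: if r_{X_2,X_3} = 0.5, the variance is multiplied by 1 / (1 - 0.25) \approx 1.33; if r_{X_2,X_3} = 0.9, it is multiplied by 1 / (1 - 0.81) \approx 5.26; and as the correlation approaches 1 or -1, the factor increases without limit.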
Note that the variance expression above is valid only for a model with two explanatory
variables. When there are more than two, the expression becomes much more complex
and it is sensible to switch to matrix algebra.
\mathrm{s.d.}\left( \hat{\beta}_2 \right) = \sqrt{ \frac{\sigma_u^2}{\sum \left( X_{2i} - \bar{X}_2 \right)^2} \times \frac{1}{1 - r^2_{X_2 , X_3}} }
The standard deviation of the distribution of \hat{\beta}_2 is of course given by the square root of its variance.
With the exception of the variance of u, we can calculate the components of the standard
deviation from the sample data.
E\left( \frac{1}{n} \sum_i \hat{u}_i^2 \right) = \frac{n - k}{n} \, \sigma_u^2
The variance of u has to be estimated. The mean square of the residuals provides a consistent estimator, but in a finite sample it is biased downwards by a factor (n – k) / n, where k is the number of parameters.
\hat{\sigma}_u^2 = \frac{1}{n - k} \sum_i \hat{u}_i^2
Obviously, we can obtain an unbiased estimator by dividing the sum of the squares of the residuals by n – k instead of n. We denote this unbiased estimator \hat{\sigma}_u^2.
\mathrm{s.e.}\left( \hat{\beta}_2 \right) = \sqrt{ \frac{\hat{\sigma}_u^2}{\sum \left( X_{2i} - \bar{X}_2 \right)^2} \times \frac{1}{1 - r^2_{X_2 , X_3}} } = \sqrt{ \frac{\hat{\sigma}_u^2}{n \, \mathrm{MSD}(X_2)} \times \frac{1}{1 - r^2_{X_2 , X_3}} }
Thus the estimate of the standard deviation of the probability distribution of \hat{\beta}_2, known as the standard error of \hat{\beta}_2 for short, is given by the expression above.
We will use this expression to analyze why the standard error of S is larger for the union
subsample than for the non-union subsample in wage equation regressions using Data Set
21.
To select a subsample in Stata, you add an ‘if’ statement to a command. The COLLBARG
variable is equal to 1 for respondents whose rates of pay are determined by collective
bargaining, and it is 0 for the others.
Note that in tests for equality, Stata requires the = sign to be duplicated.
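In command form, the two subsample regressions are (a sketch, using the variables named in the text):

. reg EARNINGS S EXP if COLLBARG==1
. reg EARNINGS S EXP if COLLBARG==0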
For the union subsample, the standard error of S is 0.5365. In the case of the non-union subsample, the standard error of S is 0.2439, less than half as large.
\mathrm{s.e.}\left( \hat{\beta}_2 \right) = \hat{\sigma}_u \times \sqrt{\frac{1}{n}} \times \sqrt{\frac{1}{\mathrm{MSD}(X_2)}} \times \sqrt{\frac{1}{1 - r^2_{X_2 , X_3}}}
We will explain the difference by looking at the components of the standard error. It is convenient to start by rearranging the expression for the standard error as the product of four factors, as shown above.
\hat{\sigma}_u^2 = \frac{1}{n - k} \, RSS
For the union subsample, RSS / (n – k) is equal to 108.908. To obtain \hat{\sigma}_u, we take the square root. This is 10.436.
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS
Similarly, in the case of the non-union subsample, ˆ u is the square root of 123.325, which is
11.105. We also note that the number of observations in that subsample is 425.
We calculate the mean square deviation of S for the two subsamples from the sample data.
The correlation coefficients for S and EXP are –0.5866 and –0.5796 for the union and non-
union subsamples, respectively. (Note that "cor" is the Stata command for computing
correlations.)
[Table: for each subsample (union and non-union), the components \hat{\sigma}_u, n, MSD(S), and r_{S,EXP}, together with the four corresponding factors and the resulting standard error of the coefficient of S]
These entries complete the top half of the table. We will now look at the impact of each item on the standard error, using the four-factor expression above.
The \hat{\sigma}_u components need no modification. \hat{\sigma}_u is a little larger for the non-union subsample (11.105, as against 10.436), and so has an adverse effect on the standard error.
The number of observations is much larger for the non-union subsample, so the second
factor is much smaller than that for the union subsample.
Perhaps surprisingly, the variance in schooling is similar for the two subsamples.
The correlation between schooling and work experience is also similar for the two
subsamples. Note that the sign of the correlation makes no difference since it is squared.
Multiplying the four factors together, we obtain the standard errors. (The discrepancy in
the last digit of the union standard error has been caused by rounding error.)
We see that the reason that the standard error is smaller for the non-union subsample is that there are far more observations than in the union subsample. Otherwise the standard errors would have been about the same.
The goodness of fit, as measured by \hat{\sigma}_u, is slightly inferior for the non-union subsample, and this has a marginal offsetting effect. The other two factors are very similar for the two subsamples.
Copyright Christopher Dougherty 2016.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.
2016.04.28