Dougherty, Introduction to Econometrics, 5th edition
Chapter 3: Multiple Regression Analysis
The regression coefficients are derived using the same least squares principle used in
simple regression analysis. The fitted value of Y in observation i depends on our choice of
b1, b2, and b3.
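With two explanatory variables, the fitted value in observation i is

\hat{Y}_i = b_1 + b_2 X_{2i} + b_3 X_{3i}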
MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE
The residual \hat{u}_i in observation i is the difference between the actual and fitted values of Y.
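That is,

\hat{u}_i = Y_i - \hat{Y}_i = Y_i - b_1 - b_2 X_{2i} - b_3 X_{3i}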
RSS = \sum_i \hat{u}_i^2 = \sum_i \left( Y_i - b_1 - b_2 X_{2i} - b_3 X_{3i} \right)^2
We define RSS, the sum of the squares of the residuals, and choose b1, b2, and b3 so as to
minimize it.
Expanding the quadratic, we have

RSS = \sum Y_i^2 + n b_1^2 + b_2^2 \sum X_{2i}^2 + b_3^2 \sum X_{3i}^2 - 2 b_1 \sum Y_i - 2 b_2 \sum X_{2i} Y_i - 2 b_3 \sum X_{3i} Y_i + 2 b_1 b_2 \sum X_{2i} + 2 b_1 b_3 \sum X_{3i} + 2 b_2 b_3 \sum X_{2i} X_{3i}
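Setting the partial derivatives of RSS with respect to b_1, b_2, and b_3 equal to zero gives the first-order conditions for a minimum:

\frac{\partial RSS}{\partial b_1} = -2 \sum \left( Y_i - b_1 - b_2 X_{2i} - b_3 X_{3i} \right) = 0
\frac{\partial RSS}{\partial b_2} = -2 \sum X_{2i} \left( Y_i - b_1 - b_2 X_{2i} - b_3 X_{3i} \right) = 0
\frac{\partial RSS}{\partial b_3} = -2 \sum X_{3i} \left( Y_i - b_1 - b_2 X_{2i} - b_3 X_{3i} \right) = 0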
\hat{\beta}_2 = \frac{\sum \left( Y_i - \bar{Y} \right)\left( X_{2i} - \bar{X}_2 \right) \sum \left( X_{3i} - \bar{X}_3 \right)^2 - \sum \left( Y_i - \bar{Y} \right)\left( X_{3i} - \bar{X}_3 \right) \sum \left( X_{2i} - \bar{X}_2 \right)\left( X_{3i} - \bar{X}_3 \right)}{\sum \left( X_{2i} - \bar{X}_2 \right)^2 \sum \left( X_{3i} - \bar{X}_3 \right)^2 - \left( \sum \left( X_{2i} - \bar{X}_2 \right)\left( X_{3i} - \bar{X}_3 \right) \right)^2}
We thus obtain three equations in three unknowns. Solving these equations, we obtain expressions for the specific values that satisfy the OLS criterion. (The expression for \hat{\beta}_3 is the same as that for \hat{\beta}_2, with the subscripts 2 and 3 interchanged everywhere.)
However, the expressions for the slope coefficients are considerably more complex than
that for the slope coefficient in simple regression analysis.
For the general case when there are many explanatory variables, ordinary algebra is
inadequate. It is necessary to switch to matrix algebra.
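For reference, and as a standard result not derived in these slides: writing the model as Y = X\beta + u, where X is the n \times k matrix of observations on the explanatory variables (with a column of ones for the intercept), the least squares criterion RSS = (Y - Xb)'(Y - Xb) is minimized by

\hat{\beta} = (X'X)^{-1} X'Y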
\widehat{EARNINGS} = -14.67 + 1.88 \, S + 0.98 \, EXP
Here is the fitted equation for the wage equation, estimated using Data Set 21.
It indicates that hourly earnings increase by $1.88 for every extra year of schooling and by
$0.98 for every extra year of work experience.
Literally, the intercept indicates that an individual who had no schooling or work experience
would have hourly earnings of –$14.67.
Obviously, this is impossible. The lowest value of S in the sample was 8. We have obtained
a nonsense estimate because we have extrapolated too far from the data range.
GRAPHING A RELATIONSHIP IN A MULTIPLE REGRESSION MODEL
\widehat{EARNINGS} = -14.67 + 1.88 \, S + 0.98 \, EXP
The equation above shows the result of regressing EARNINGS, hourly earnings in dollars, on S, years of schooling, and EXP, years of work experience.
[Figure: scatter diagram of hourly earnings ($, 0 to 120) against years of schooling (highest grade completed, 0 to 20), with the simple regression line]
Suppose that you were particularly interested in the relationship between EARNINGS and S
and wished to represent it graphically, using the sample data.
. cor S EXP
(obs=500)

        |      S    EXP
--------+------------------
      S |  1.0000
    EXP | -0.5836  1.0000
Schooling is negatively correlated with work experience. The plot fails to take account of
this, and as a consequence the regression line underestimates the impact of schooling on
earnings.
We will investigate the distortion mathematically when we come to omitted variable bias.
To eliminate the distortion, you purge both EARNINGS and S of their components related to
EXP and then draw a scatter diagram using the purged variables.
We start by regressing EARNINGS on EXP. The residuals from this regression are the part of EARNINGS that is not related to EXP. In Stata, the 'predict' command with the 'resid' option saves the residuals from the most recent regression. We name them EEARN.
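In command form (a sketch of this first purging step, using the variable name in the text):

. reg EARNINGS EXP
. predict EEARN, resid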
. reg S EXP
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 257.18
Model | 1278.43322 1 1278.43322 Prob > F = 0.0000
Residual | 2475.58878 498 4.9710618 R-squared = 0.3406
-----------+------------------------------ Adj R-squared = 0.3392
Total | 3754.022 499 7.52309018 Root MSE = 2.2296
----------------------------------------------------------------------------
S | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
EXP | -.5473191 .0341292 -16.04 0.000 -.6143741 -.4802641
_cons | 18.39324 .241494 76.16 0.000 17.91877 18.86771
----------------------------------------------------------------------------
We do the same with S. We regress it on EXP and save the residuals as ES.
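Following the regression of S on EXP shown above, the residuals are again saved with the 'predict' command (a sketch):

. predict ES, resid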
[Figure: scatter diagram of EEARN (vertical axis, -40 to 80) against ES (horizontal axis, -10 to 6), with the trend line from the residuals regression shown as a solid line and the regression line that did not control for EXP as a dashed line]
Now we plot EEARN on ES and the scatter is a faithful representation of the relationship,
both in terms of the slope of the trend line (the solid line) and in terms of the variation about
that line.
As you would expect, the trend line is steeper than that in the scatter diagram which did not control for EXP (reproduced here as the dashed line).
. reg EEARN ES
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 70.56
Model | 8727.05507 1 8727.05507 Prob > F = 0.0000
Residual | 61593.5414 498 123.68181 R-squared = 0.1241
-----------+------------------------------ Adj R-squared = 0.1223
Total | 70320.5965 499 140.923039 Root MSE = 11.121
----------------------------------------------------------------------------
EEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
ES | 1.877563 .2235186 8.40 0.000 1.438408 2.316719
_cons | -1.32e-08 .4973566 -0.00 1.000 -.977176 .977176
----------------------------------------------------------------------------
A mathematical proof that the technique works requires matrix algebra. We will content ourselves with verifying that the estimate of the slope coefficient is the same as in the multiple regression.
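In Stata, the comparison amounts to running

. reg EARNINGS S EXP
. reg EEARN ES

and comparing the coefficient of S in the first with the coefficient of ES in the second: 1.877563 in the residuals regression above, matching the coefficient of S, 1.88, in the fitted equation at the start of this section to two decimal places.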
Finally, a small and not very important technical point. You may have noticed that the standard error and t statistic do not quite match those in the multiple regression. The reason for this is that the number of degrees of freedom is overstated by 1 in the residuals regression.
That regression has not made allowance for the fact that we have already used up 1 degree
of freedom in removing EXP from the model.
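The size of the effect can be gauged directly (a back-of-the-envelope check, assuming the residual sum of squares is the same in the residuals regression as in the multiple regression, which is what the purging argument implies): the residuals regression divides RSS by n - 2 = 498 degrees of freedom where the multiple regression divides by n - 3 = 497, so its standard errors are understated by a factor of \sqrt{497 / 498} \approx 0.999, a negligible amount here.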
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS
\sigma^2_{\hat{\beta}_2} = \frac{\sigma_u^2}{\sum \left( X_{2i} - \bar{X}_2 \right)^2} \times \frac{1}{1 - r^2_{X_2 , X_3}} = \frac{\sigma_u^2}{n \, \mathrm{MSD}(X_2)} \times \frac{1}{1 - r^2_{X_2 , X_3}}
This sequence investigates the variances and standard errors of the slope coefficients in a
model with two explanatory variables.
The expression for the variance of \hat{\beta}_2 is shown above. The expression for the variance of \hat{\beta}_3 is the same, with the subscripts 2 and 3 interchanged.
The first factor in the expression is identical to that for the variance of the slope coefficient
in a simple regression model.
The variance of \hat{\beta}_2 depends on the variance of the disturbance term, the number of observations, and the mean square deviation of X_2, for exactly the same reasons as in a simple regression model.
The higher is the correlation between the explanatory variables, positive or negative, the
greater will be the variance.
This is easy to understand intuitively. The greater the correlation, the harder it is to
discriminate between the effects of the explanatory variables on Y, and the less accurate
will be the regression estimates.
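To put the final factor into perspective: if r_{X_2,X_3} = 0.5, the variance is multiplied by 1 / (1 - 0.25) \approx 1.33; if r_{X_2,X_3} = 0.9, it is multiplied by 1 / (1 - 0.81) \approx 5.26; and as the correlation approaches 1 or -1, the factor increases without limit.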
Note that the variance expression above is valid only for a model with two explanatory
variables. When there are more than two, the expression becomes much more complex
and it is sensible to switch to matrix algebra.
\mathrm{s.d.}\left( \hat{\beta}_2 \right) = \sqrt{ \frac{\sigma_u^2}{\sum \left( X_{2i} - \bar{X}_2 \right)^2} \times \frac{1}{1 - r^2_{X_2 , X_3}} }
The standard deviation of the distribution of \hat{\beta}_2 is of course given by the square root of its variance.
With the exception of the variance of u, we can calculate the components of the standard
deviation from the sample data.
E\left( \frac{1}{n} \sum_i \hat{u}_i^2 \right) = \frac{n - k}{n} \, \sigma_u^2
The variance of u has to be estimated. The mean square of the residuals provides a consistent estimator, but in a finite sample it is biased downwards by a factor (n – k) / n, where k is the number of parameters.
\hat{\sigma}_u^2 = \frac{1}{n - k} \sum_i \hat{u}_i^2
Obviously, we can obtain an unbiased estimator by dividing the sum of the squares of the residuals by n – k instead of n. We denote this unbiased estimator \hat{\sigma}_u^2.
\mathrm{s.e.}\left( \hat{\beta}_2 \right) = \sqrt{ \frac{\hat{\sigma}_u^2}{\sum \left( X_{2i} - \bar{X}_2 \right)^2} \times \frac{1}{1 - r^2_{X_2 , X_3}} } = \sqrt{ \frac{\hat{\sigma}_u^2}{n \, \mathrm{MSD}(X_2)} \times \frac{1}{1 - r^2_{X_2 , X_3}} }
Thus the estimate of the standard deviation of the probability distribution of \hat{\beta}_2, known as the standard error of \hat{\beta}_2 for short, is given by the expression above.
We will use this expression to analyze why the standard error of S is larger for the union
subsample than for the non-union subsample in wage equation regressions using Data Set
21.
To select a subsample in Stata, you add an ‘if’ statement to a command. The COLLBARG
variable is equal to 1 for respondents whose rates of pay are determined by collective
bargaining, and it is 0 for the others.
Note that in tests for equality, Stata requires the = sign to be duplicated.
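In command form, the two subsample regressions are (a sketch, using the variables named in the text):

. reg EARNINGS S EXP if COLLBARG==1
. reg EARNINGS S EXP if COLLBARG==0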
For the union subsample, the standard error of S is 0.5365. In the case of the non-union subsample, the standard error of S is 0.2439, less than half as large.
\mathrm{s.e.}\left( \hat{\beta}_2 \right) = \hat{\sigma}_u \times \sqrt{\frac{1}{n}} \times \sqrt{\frac{1}{\mathrm{MSD}(X_2)}} \times \sqrt{\frac{1}{1 - r^2_{X_2 , X_3}}}
We will explain the difference by looking at the components of the standard error. It is convenient to start by rearranging the expression for the standard error as the product of four factors, as shown above.
\hat{\sigma}_u^2 = \frac{1}{n - k} \, RSS
For the union subsample, RSS / (n – k) is equal to 108.908. To obtain \hat{\sigma}_u, we take the square root. This is 10.436.
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS
Similarly, in the case of the non-union subsample, ˆ u is the square root of 123.325, which is
11.105. We also note that the number of observations in that subsample is 425.
We calculate the mean square deviation of S for the two subsamples from the sample data.
The correlation coefficients for S and EXP are –0.5866 and –0.5796 for the union and non-
union subsamples, respectively. (Note that "cor" is the Stata command for computing
correlations.)
[Table: for each subsample (union and non-union), the components \hat{\sigma}_u, n, MSD(S), and r_{S,EXP}, together with the four corresponding factors and the resulting standard error of the coefficient of S]
These entries complete the top half of the table. We will now look at the impact of each item on the standard error, using the four-factor expression above.
The \hat{\sigma}_u components need no modification. \hat{\sigma}_u is a little larger for the non-union subsample (11.105, as against 10.436), and so has an adverse effect on the standard error.
The number of observations is much larger for the non-union subsample, so the second
factor is much smaller than that for the union subsample.
Perhaps surprisingly, the variance in schooling is similar for the two subsamples.
The correlation between schooling and work experience is also similar for the two
subsamples. Note that the sign of the correlation makes no difference since it is squared.
Multiplying the four factors together, we obtain the standard errors. (The discrepancy in
the last digit of the union standard error has been caused by rounding error.)
We see that the reason that the standard error is smaller for the non-union subsample is that there are far more observations than in the union subsample. Otherwise the standard errors would have been about the same.
The goodness of fit, as measured by \hat{\sigma}_u, is slightly inferior for the non-union subsample, and this has a marginal offsetting effect. The other two factors are very similar for the two subsamples.
Copyright Christopher Dougherty 2016.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.
2016.04.28