
Econometrics Method

ECN 5221
Lecture 03
Hypothesis Testing

Tamat Sarmidi
Centre for Sustainable and Inclusive Development Studies
Universiti Kebangsaan Malaysia
[email protected] or +60192881653

Tamat Sarmidi 2022 1


Study Objectives

• Introduction
– The Simple Linear Model
– Least Squares Regression
• Hypothesis Testing
– Goodness of Fit
– Coefficient Test

Tamat Sarmidi 2022 2


Dougherty

Introduction to Econometrics,
5th edition
Chapter 2: Properties of the
Regression Coefficients and
Hypothesis Testing

© Christopher Dougherty, 2016. All rights reserved.


TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

                        Review chapter          Regression model

Model                   X, unknown m, s²

Estimator               X̄

This sequence describes the testing of hypotheses relating to regression coefficients. It is concerned only with procedures, not with theory.

1
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

                        Review chapter          Regression model

Model                   X, unknown m, s²

Estimator               X̄

Hypothesis testing forms a major part of the foundation of econometrics and it is essential
to have a clear understanding of the theory.

2
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

                        Review chapter          Regression model

Model                   X, unknown m, s²

Estimator               X̄

The theory, discussed in sections R.9 to R.11 of the Review chapter, is non-trivial and
requires careful study. This sequence is purely mechanical and is not in any way a
substitute.
3
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

                        Review chapter          Regression model

Model                   X, unknown m, s²

Estimator               X̄

If you do not understand, for example, the trade-off between the size (significance level) and
the power of a test, you should study the material in those sections before looking at this
sequence.
4
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

                        Review chapter          Regression model

Model                   X, unknown m, s²

Estimator               X̄

In our standard example in the Review chapter, we had a random variable X with unknown population mean m and variance s². Given a sample of data, we used the sample mean X̄ as an estimator of m.
5
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

                        Review chapter          Regression model

Model                   X, unknown m, s²        Y = b1 + b2X + u

Estimator               X̄                       b̂2 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²

In the context of the regression model, we have unknown parameters b1 and b2 and we have
derived estimators b̂ 1 and b̂ 2 for them. In what follows, we shall focus on b2 and its
estimator b̂ 2 .
6
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

                        Review chapter          Regression model

Model                   X, unknown m, s²        Y = b1 + b2X + u

Estimator               X̄                       b̂2 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²

Null hypothesis         H0: m = m0              H0: b2 = b2⁰
Alternative hypothesis  H1: m ≠ m0              H1: b2 ≠ b2⁰

In the case of the random variable X, our standard null hypothesis was that m was equal to some specific value m0. In the case of the regression model, our null hypothesis is that b2 is equal to some specific value b2⁰.

7
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

                        Review chapter          Regression model

Model                   X, unknown m, s²        Y = b1 + b2X + u

Estimator               X̄                       b̂2 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²

Null hypothesis         H0: m = m0              H0: b2 = b2⁰
Alternative hypothesis  H1: m ≠ m0              H1: b2 ≠ b2⁰

Test statistic          t = (X̄ − m0) / s.e.(X̄)  t = (b̂2 − b2⁰) / s.e.(b̂2)

For both the population mean m of the random variable X and the regression coefficient b2,
the test statistic is a t statistic.

8
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

                        Review chapter          Regression model

Model                   X, unknown m, s²        Y = b1 + b2X + u

Estimator               X̄                       b̂2 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²

Null hypothesis         H0: m = m0              H0: b2 = b2⁰
Alternative hypothesis  H1: m ≠ m0              H1: b2 ≠ b2⁰

Test statistic          t = (X̄ − m0) / s.e.(X̄)  t = (b̂2 − b2⁰) / s.e.(b̂2)

In both cases, it is defined as the difference between the estimated coefficient and its
hypothesized value, divided by the standard error of the coefficient.

9
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

                        Review chapter          Regression model

Model                   X, unknown m, s²        Y = b1 + b2X + u

Estimator               X̄                       b̂2 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²

Null hypothesis         H0: m = m0              H0: b2 = b2⁰
Alternative hypothesis  H1: m ≠ m0              H1: b2 ≠ b2⁰

Test statistic          t = (X̄ − m0) / s.e.(X̄)  t = (b̂2 − b2⁰) / s.e.(b̂2)

Reject H0 if            |t| > tcrit             |t| > tcrit

We reject the null hypothesis if the absolute value is greater than the critical value of t,
given the chosen significance level.

10
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

                        Review chapter          Regression model

Model                   X, unknown m, s²        Y = b1 + b2X + u

Estimator               X̄                       b̂2 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²

Null hypothesis         H0: m = m0              H0: b2 = b2⁰
Alternative hypothesis  H1: m ≠ m0              H1: b2 ≠ b2⁰

Test statistic          t = (X̄ − m0) / s.e.(X̄)  t = (b̂2 − b2⁰) / s.e.(b̂2)

Reject H0 if            |t| > tcrit             |t| > tcrit

Degrees of freedom      n − 1                   n − k = n − 2

There is one important difference. When locating the critical value of t, one must take
account of the number of degrees of freedom. In the case of the random variable X, this is
n – 1, where n is the number of observations in the sample.
11
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

                        Review chapter          Regression model

Model                   X, unknown m, s²        Y = b1 + b2X + u

Estimator               X̄                       b̂2 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²

Null hypothesis         H0: m = m0              H0: b2 = b2⁰
Alternative hypothesis  H1: m ≠ m0              H1: b2 ≠ b2⁰

Test statistic          t = (X̄ − m0) / s.e.(X̄)  t = (b̂2 − b2⁰) / s.e.(b̂2)

Reject H0 if            |t| > tcrit             |t| > tcrit

Degrees of freedom      n − 1                   n − k = n − 2

In the case of the regression model, the number of degrees of freedom is n – k, where n is
the number of observations in the sample and k is the number of parameters (b
coefficients). For the simple regression model above, it is n – 2.
12
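To make the mechanics concrete, here is a minimal Stata sketch that simulates a sample, fits the simple regression, and carries out the test by hand (the simulated data, seed, and hypothesized value b2⁰ = 1 are illustrative assumptions, not part of the slides):

clear
set obs 20
set seed 12345
generate X = 10*runiform()
generate Y = 2 + 1*X + rnormal(0, 1)    // assumed true values: b1 = 2, b2 = 1
regress Y X
display (_b[X] - 1) / _se[X]            // t statistic for H0: b2 = 1
display invttail(18, 0.025)             // two-sided 5% critical value, n - 2 = 18 df

We reject H0 only if the absolute value of the first number exceeds the second.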
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

Example: p = b1 + b2w + u
Null hypothesis: H0: b2 = 1.0
Alternative hypothesis: H1: b2 ≠ 1.0

As an illustration, we will consider a model relating price inflation to wage inflation. p is the
percentage annual rate of growth of prices and w is the percentage annual rate of growth of
wages.
13
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

Example: p = b1 + b2w + u
Null hypothesis: H0: b2 = 1.0
Alternative hypothesis: H1: b2 ≠ 1.0

We will test the hypothesis that the rate of price inflation is equal to the rate of wage
inflation. The null hypothesis is therefore H0: b2 = 1.0. (We should also test b1 = 0.)

14
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

Example: p = b1 + b2w + u
Null hypothesis: H0: b2 = 1.0
Alternative hypothesis: H1: b2 ≠ 1.0

p̂ = 1.21 + 0.82w
    (0.05) (0.10)

Suppose that the regression result is as shown (standard errors in parentheses). Our
actual estimate of the slope coefficient is only 0.82. We will check whether we should reject
the null hypothesis.
15
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

Example: p = b1 + b2w + u
Null hypothesis: H0: b2 = 1.0
Alternative hypothesis: H1: b2 ≠ 1.0

p̂ = 1.21 + 0.82w
    (0.05) (0.10)

t = (b̂2 − b2⁰) / s.e.(b̂2) = (0.82 − 1.00) / 0.10 = −1.80

We compute the t statistic by subtracting the hypothetical true value from the sample
estimate and dividing by the standard error. It comes to –1.80.

16
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

Example: p = b1 + b2w + u
Null hypothesis: H0: b2 = 1.0
Alternative hypothesis: H1: b2 ≠ 1.0

p̂ = 1.21 + 0.82w
    (0.05) (0.10)

t = (b̂2 − b2⁰) / s.e.(b̂2) = (0.82 − 1.00) / 0.10 = −1.80

n = 20    degrees of freedom = 18    tcrit,5% = 2.101

There are 20 observations in the sample. We have estimated 2 parameters, so there are 18
degrees of freedom.
17
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

Example: p = b1 + b2w + u
Null hypothesis: H0: b2 = 1.0
Alternative hypothesis: H1: b2 ≠ 1.0

p̂ = 1.21 + 0.82w
    (0.05) (0.10)

t = (b̂2 − b2⁰) / s.e.(b̂2) = (0.82 − 1.00) / 0.10 = −1.80

n = 20    degrees of freedom = 18    tcrit,5% = 2.101

The critical value of t with 18 degrees of freedom is 2.101 at the 5% level. The absolute
value of the t statistic is less than this, so we do not reject the null hypothesis.

18
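The arithmetic of this example can be checked directly in Stata (a sketch; the estimate 0.82, standard error 0.10, and 18 degrees of freedom come from the slides):

display (0.82 - 1.00) / 0.10    // t = -1.80
display invttail(18, 0.025)     // = 2.101, the two-sided 5% critical value

Since |−1.80| < 2.101, we do not reject the null hypothesis.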
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

Model Y = b1 + b2X + u

In practice it is unusual to have a feeling for the actual value of the coefficients. Very often
the objective of the analysis is to demonstrate that Y is influenced by X, without having any
specific prior notion of the actual coefficients of the relationship.
19
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

Model Y = b1 + b2X + u
Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 ≠ 0

In this case it is usual to define b2 = 0 as the null hypothesis. In words, the null hypothesis
is that X does not influence Y. We then try to demonstrate that the null hypothesis is false.

20
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

Model Y = b1 + b2X + u
Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 ≠ 0

t = (b̂2 − b2⁰) / s.e.(b̂2) = b̂2 / s.e.(b̂2)

For the null hypothesis b2 = 0, the t statistic reduces to the estimate of the coefficient
divided by its standard error.

21
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

Model Y = b1 + b2X + u
Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 ≠ 0

t = (b̂2 − b2⁰) / s.e.(b̂2) = b̂2 / s.e.(b̂2)

This ratio is commonly called the t statistic for the coefficient and it is automatically printed
out as part of the regression results. To perform the test for a given significance level, we
compare the t statistic directly with the critical value of t for that significance level.
22
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------
Here is the output from the earnings function fitted in a previous slideshow, with the t
statistics highlighted.

23
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

You can see that the t statistic for the coefficient of S is enormous. We would reject the null
hypothesis that schooling does not affect earnings at the 1% significance level (critical
value about 2.59).
24
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

In this case we could go further and reject the null hypothesis that schooling does not affect
earnings at the 0.1% significance level.

25
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

The advantage of reporting rejection at the 0.1% level, instead of the 1% level, is that the
risk of mistakenly rejecting the null hypothesis of no effect is now only 0.1% instead of 1%.
The result is therefore even more convincing.
26
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

We have seen that the intercept does not have any plausible meaning, so it does not make
sense to perform a t test on it.

27
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

The next column in the output gives what are known as the p values for each coefficient. This is the probability of obtaining a t statistic at least as large, in absolute terms, as a matter of chance, if the null hypothesis H0: b = 0 is true.
28
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------


If you reject the null hypothesis H0: b = 0, this is the probability that you are making a
mistake and making a Type I error. It therefore gives the significance level at which the null
hypothesis would just be rejected.
29
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

If p = 0.05, the null hypothesis could just be rejected at the 5% level. If it were 0.01, it could
just be rejected at the 1% level. If it were 0.001, it could just be rejected at the 0.1% level. This is
assuming that you are using two-sided tests.
30
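The printed p values can be reproduced from the t statistics and the degrees of freedom (a sketch; 6.82, 0.27, and 498 come from the output above):

display 2*ttail(498, 6.82)    // two-sided p value for S: effectively 0.000
display 2*ttail(498, 0.27)    // for _cons: about 0.79 (0.785 with the unrounded t)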
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

In the present case p = 0 to three decimal places for the coefficient of S. This means that we
can reject the null hypothesis H0: b2 = 0 at the 0.1% level, without having to refer to the table
of critical values of t. (Testing the intercept does not make sense in this regression.)
31
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

The use of p values is a more informative approach to reporting the results of tests. It is widely used in the medical literature.

32
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

However, in economics, standard practice is to report results referring to 5% and 1% significance levels, and sometimes to the 0.1% level (when one can reject at that level).

33
Copyright Christopher Dougherty 2016.

These slideshows may be downloaded by anyone, anywhere for personal use.


Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.

The content of this slideshow comes from Section 2.6 of C. Dougherty,


Introduction to Econometrics, fifth edition 2016, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oxfordtextbooks.co.uk/orc/dougherty5e/.

Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2016.04.19
Dougherty

Introduction to Econometrics,
5th edition
Chapter 2: Properties of the
Regression Coefficients and
Hypothesis Testing

© Christopher Dougherty, 2016. All rights reserved.


ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

True model:   Y = b1 + b2X + u        Fitted model:   Ŷ = b̂1 + b̂2X

Null hypothesis         H0: b2 = b2⁰
Alternative hypothesis  H1: b2 ≠ b2⁰

Test statistic          t = (b̂2 − b2⁰) / s.e.(b̂2)

Reject H0 if |t| > tcrit

In the previous sequence, we were performing what are described as two-sided t tests.
These are appropriate when we have no information about the alternative hypothesis.

1
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

True model:   Y = b1 + b2X + u        Fitted model:   Ŷ = b̂1 + b̂2X

Null hypothesis         H0: b2 = b2⁰
Alternative hypothesis  H1: b2 ≠ b2⁰

Test statistic          t = (b̂2 − b2⁰) / s.e.(b̂2)

Reject H0 if |t| > tcrit

Under the null, the coefficient is hypothesized to be a certain value. Under the alternative
hypothesis, the coefficient could be any value other than that specified by the null. It could
be higher or it could be lower.
2
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

True model:   Y = b1 + b2X + u        Fitted model:   Ŷ = b̂1 + b̂2X

Null hypothesis         H0: b2 = b2⁰
Alternative hypothesis  H1: b2 > b2⁰

Test statistic          t = (b̂2 − b2⁰) / s.e.(b̂2)

Reject H0 if t > tcrit

However, sometimes we are in a position to say that, if the null hypothesis is not true, the coefficient cannot be lower than that specified by it. We re-write the alternative hypothesis as shown and perform a one-sided test.
3
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

True model:   Y = b1 + b2X + u        Fitted model:   Ŷ = b̂1 + b̂2X

Null hypothesis         H0: b2 = b2⁰
Alternative hypothesis  H1: b2 < b2⁰

Test statistic          t = (b̂2 − b2⁰) / s.e.(b̂2)

Reject H0 if t < −tcrit

On other occasions, we might be in a position to assert that, if the null hypothesis is not
true, the coefficient cannot be greater than the value specified by it. The modified alternative hypothesis for this case is shown.
4
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

True model:   Y = b1 + b2X + u        Fitted model:   Ŷ = b̂1 + b̂2X

Null hypothesis         H0: b2 = b2⁰
Alternative hypothesis  H1: b2 > b2⁰ (or H1: b2 < b2⁰)

Test statistic          t = (b̂2 − b2⁰) / s.e.(b̂2)

Reject H0 if t > tcrit (or t < −tcrit)

The theory behind one-sided tests, in particular, the gain in the trade-off between the size
(significance level) and power of a test, is non-trivial and an understanding requires a
careful study of section R.13 of the Review chapter.
5
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

True model:   Y = b1 + b2X + u        Fitted model:   Ŷ = b̂1 + b̂2X

Null hypothesis         H0: b2 = b2⁰
Alternative hypothesis  H1: b2 > b2⁰ (or H1: b2 < b2⁰)

Test statistic          t = (b̂2 − b2⁰) / s.e.(b̂2)

Reject H0 if t > tcrit (or t < −tcrit)

This sequence assumes a good understanding of that material.

6
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

Example: p = b1 + b2w + u
Null hypothesis: H0: b2 = 1.0
Alternative hypothesis: H1: b2 ≠ 1.0

p̂ = 1.21 + 0.82w
    (0.05) (0.10)

t = (b̂2 − b2⁰) / s.e.(b̂2) = (0.82 − 1.00) / 0.10 = −1.80

n = 20    degrees of freedom = 18    tcrit,5% = 2.101 (two-sided test)

Returning to the price inflation/wage inflation model, we saw that we could not reject the
null hypothesis b2 = 1, even at the 5% significance level. That was using a two-sided test.

7
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

Example: p = b1 + b2w + u
Null hypothesis: H0: b2 = 1.0
Alternative hypothesis: H1: b2 ≠ 1.0

p̂ = 1.21 + 0.82w
    (0.05) (0.10)

t = (b̂2 − b2⁰) / s.e.(b̂2) = (0.82 − 1.00) / 0.10 = −1.80

n = 20    degrees of freedom = 18    tcrit,5% = 2.101 (two-sided test)

However, in practice, improvements in productivity may cause the rate of cost inflation, and
hence that of price inflation, to be lower than that of wage inflation.

8
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

Example: p = b1 + b2w + u
Null hypothesis: H0: b2 = 1.0
Alternative hypothesis: H1: b2 ≠ 1.0

p̂ = 1.21 + 0.82w
    (0.05) (0.10)

t = (b̂2 − b2⁰) / s.e.(b̂2) = (0.82 − 1.00) / 0.10 = −1.80

n = 20    degrees of freedom = 18    tcrit,5% = 2.101 (two-sided test)

Certainly, improvements in productivity will not cause price inflation to be greater than
wage inflation and so in this case we are justified in ruling out b2 > 1. We are left with H0: b2
= 1 and H1: b2 < 1.
9
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

Example: p = b1 + b2w + u
Null hypothesis: H0: b2 = 1.0
Alternative hypothesis: H1: b2 < 1.0

p̂ = 1.21 + 0.82w
    (0.05) (0.10)

t = (b̂2 − b2⁰) / s.e.(b̂2) = (0.82 − 1.00) / 0.10 = −1.80

n = 20    degrees of freedom = 18    tcrit,5% = 1.734 (one-sided test)

Thus we can perform a one-sided test, for which the critical value of t with 18 degrees of freedom at the 5% significance level is 1.734. Now we can reject the null hypothesis and conclude that price inflation is significantly lower than wage inflation, at the 5% significance level.
10
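The one-sided version of the test can be checked the same way (a sketch; the numbers come from the slides):

display (0.82 - 1.00) / 0.10    // t = -1.80
display invttail(18, 0.05)      // = 1.734, the one-sided 5% critical value

Since −1.80 < −1.734, the t statistic now falls in the rejection region.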
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

Model Y = b1 + b2X + u
Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 ≠ 0

Now we will consider the special, but very common, case H0: b2 = 0.

11
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

Model Y = b1 + b2X + u
Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 ≠ 0

It occurs when you wish to demonstrate that a variable X influences another variable Y. You
set up the null hypothesis that X has no effect (b2 = 0) and try to reject H0.

12
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 ≠ 0

[Figure: probability density function of b̂2 under H0. Two-sided 5% test: the rejection regions lie beyond ±1.96 sd, with 2.5% in each tail.]

The figure shows the distribution of b̂ 2, conditional on H0: b2 = 0 being true. For simplicity,
we initially assume that we know the standard deviation.

13
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 ≠ 0

[Figure: probability density function of b̂2 under H0. Two-sided 5% test: the rejection regions lie beyond ±1.96 sd, with 2.5% in each tail.]

If you use a two-sided 5% significance test, your estimate must be 1.96 standard deviations
above or below 0 if you are to reject H0.

14
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 > 0

[Figure: probability density function of b̂2 under H0. One-sided 5% test: the rejection region lies above 1.65 sd, with 5% in the upper tail.]

However, if you can justify the use of a one-sided test, for example with H1: b2 > 0, your estimate has to be only 1.65 standard deviations above 0.

15
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 > 0

[Figure: probability density function of b̂2 under H0. One-sided 5% test: the rejection region lies above 1.65 sd, with 5% in the upper tail.]

This makes it easier to reject H0 and thereby demonstrate that Y really is influenced by X
(assuming that your model is correctly specified).

16
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 > 0

[Figure: probability density function of b̂2 under H0, one-sided 5% test with rejection region above 1.65 sd; the true value b2 is marked to the right of the critical point.]

Suppose that Y is genuinely determined by X and that the true (unknown) coefficient is b2,
as shown.

17
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 > 0

[Figure: probability density function of b̂2 under H0, one-sided 5% test with rejection region above 1.65 sd; the true value b2 is marked to the right of the critical point.]

Suppose that we have a sample of observations and calculate the estimated slope
coefficient, b̂ 2. If it is as shown in the diagram, what do we conclude when we test H0?

18
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 > 0

[Figure: probability density function of b̂2 under H0, one-sided 5% test with rejection region above 1.65 sd; the true value b2 is marked to the right of the critical point.]

The answer is that b̂ 2 lies in the rejection region. It makes no difference whether we
perform a two-sided test or a one-sided test. We come to the correct conclusion.

19
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 > 0

[Figure: probability density function of b̂2 under H0, one-sided 5% test with rejection region above 1.65 sd; the true value b2 is marked to the right of the critical point.]

What do we conclude if b̂2 is as shown? We fail to reject H0, irrespective of whether we perform a two-sided test or a one-sided test. We would make a Type II error in either case.

20
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 > 0

[Figure: probability density function of b̂2 under H0, one-sided 5% test with rejection region above 1.65 sd; the true value b2 is marked to the right of the critical point.]

What do we conclude if b̂ 2 is as shown here? In the case of a two-sided test, b̂ 2 is not in the
rejection region. We are unable to reject H0.

21
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 > 0

[Figure: probability density function of b̂2 under H0, one-sided 5% test with rejection region above 1.65 sd; the true value b2 is marked to the right of the critical point.]

This means that we are unable to demonstrate that X has a significant effect on Y. This is
disappointing, because we were hoping to demonstrate that X is a determinant of Y.

22
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 > 0

[Figure: probability density function of b̂2 under H0, one-sided 5% test with rejection region above 1.65 sd; the true value b2 is marked to the right of the critical point.]

However, if we are in a position to perform a one-sided test, b̂ 2 does lie in the rejection
region and so we have demonstrated that X has a significant effect on Y (at the 5%
significance level, of course).
23
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 > 0

[Figure: probability density function of b̂2 under H0, one-sided 5% test with rejection region above 1.65 sd; the true value b2 is marked to the right of the critical point.]

Thus we get a positive finding that we could not get with a two-sided test.

24
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 > 0

[Figure: distribution of b̂2 centred on the true value b2. With a two-sided 5% test the rejection region starts at 1.96 sd, and the shaded (blue) area under the true curve to its left is the probability of a Type II error.]

To put this reasoning more formally, the power of a one-sided test is greater than that of a
two-sided test. The blue area shows the probability of making a Type II error using a two-
sided test. It is the area under the true curve to the left of the rejection region.
25
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 > 0

[Figure: the same diagram for a one-sided 5% test: the rejection region starts at 1.65 sd, and the shaded (red) Type II error area to its left is smaller.]

The red area shows the probability of making a Type II error using a one-sided test. It is
smaller. Since the power of a test is (1 – probability of making a Type II error when H0 is
false), the power of a one-sided test is greater than that of a two-sided test.
26
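These two Type II error probabilities are easy to compute under the simplifying assumptions of the diagrams (standard deviation known, so the normal distribution applies; the assumed true value of b2, 3 standard deviations above zero, is purely illustrative):

display normal(1.96 - 3) - normal(-1.96 - 3)    // two-sided 5% test: about 0.149
display normal(1.645 - 3)                       // one-sided 5% test: about 0.088

The one-sided probability is smaller, so the one-sided test has the greater power.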
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 > 0

[Figure: probability density function of b̂2 under H0. One-sided 5% test: the rejection region lies above 1.65 sd, with 5% in the upper tail.]

In all of this, we have assumed that we knew the standard deviation of the distribution of b̂ 2.
In practice, of course, the standard deviation has to be estimated as the standard error, and
the t distribution is the relevant distribution. However, the logic is exactly the same.
27
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 > 0

[Figure: probability density function of b̂2 under H0. One-sided 5% test: the rejection region lies above 1.65 sd, with 5% in the upper tail.]

At any given significance level, the critical value of t for a one-sided test is lower than that
for a two-sided test.

28
ONE-SIDED t TESTS OF HYPOTHESES RELATING TO REGRESSION COEFFICIENTS

Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 > 0

[Figure: probability density function of b̂2 under H0, one-sided 5% test with rejection region above 1.65 sd; the true value b2 is marked to the right of the critical point.]

Hence, if H0 is false, the risk of not rejecting it, thereby making a Type II error, is smaller, and
so the power of a one-sided test is greater than that of a two-sided test.

29
Copyright Christopher Dougherty 2016.

These slideshows may be downloaded by anyone, anywhere for personal use.


Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.

The content of this slideshow comes from Section 2.6 of C. Dougherty,


Introduction to Econometrics, fifth edition 2016, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oxfordtextbooks.co.uk/orc/dougherty5e/.

Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2016.04.19
Dougherty

Introduction to Econometrics,
5th edition
Chapter 2: Properties of the
Regression Coefficients and
Hypothesis Testing

© Christopher Dougherty, 2016. All rights reserved.


CONFIDENCE INTERVALS FOR REGRESSION COEFFICIENTS

Model Y = b1 + b2X + u
Null hypothesis: H0: b2 = b2⁰
Alternative hypothesis: H1: b2 ≠ b2⁰

Confidence intervals were treated at length in the Review chapter and their application to
regression analysis presents no problems. We will not repeat the graphical explanation.
We will just provide the mathematical derivation in the context of a regression.
1
CONFIDENCE INTERVALS FOR REGRESSION COEFFICIENTS

Model Y = b1 + b2X + u
Null hypothesis: H0: b2 = b2⁰
Alternative hypothesis: H1: b2 ≠ b2⁰

Reject H0 if (b̂2 − b2⁰) / s.e.(b̂2) > tcrit or (b̂2 − b2⁰) / s.e.(b̂2) < −tcrit
From the initial discussion in this section, we saw that, given the theoretical model Y = b1 +
b2X + u and a fitted model, the regression coefficient b̂ 2 and the hypothetical value of b2 are
incompatible if either of the inequalities shown is valid.
2
CONFIDENCE INTERVALS FOR REGRESSION COEFFICIENTS

Model Y = b1 + b2X + u
Null hypothesis: H0: b2 = b2⁰
Alternative hypothesis: H1: b2 ≠ b2⁰

Reject H0 if (b̂2 − b2⁰) / s.e.(b̂2) > tcrit or (b̂2 − b2⁰) / s.e.(b̂2) < −tcrit

Reject H0 if b̂2 − b2⁰ > s.e.(b̂2) × tcrit or b̂2 − b2⁰ < −s.e.(b̂2) × tcrit

Multiplying through by the standard error of b̂ 2 , the conditions for rejecting H0 can be
written as shown.

3
CONFIDENCE INTERVALS FOR REGRESSION COEFFICIENTS

Model Y = b1 + b2X + u
Null hypothesis: H0: b2 = b2⁰
Alternative hypothesis: H1: b2 ≠ b2⁰

Reject H0 if (b̂2 − b2⁰) / s.e.(b̂2) > tcrit or (b̂2 − b2⁰) / s.e.(b̂2) < −tcrit

Reject H0 if b̂2 − b2⁰ > s.e.(b̂2) × tcrit or b̂2 − b2⁰ < −s.e.(b̂2) × tcrit

Reject H0 if b̂2 − s.e.(b̂2) × tcrit > b2⁰ or b̂2 + s.e.(b̂2) × tcrit < b2⁰

The inequalities may then be re-arranged as shown.

4
CONFIDENCE INTERVALS FOR REGRESSION COEFFICIENTS

Model Y = b1 + b2X + u
Null hypothesis: H0: b2 = b2⁰
Alternative hypothesis: H1: b2 ≠ b2⁰

Reject H0 if (b̂2 − b2⁰) / s.e.(b̂2) > tcrit or (b̂2 − b2⁰) / s.e.(b̂2) < −tcrit

Reject H0 if b̂2 − b2⁰ > s.e.(b̂2) × tcrit or b̂2 − b2⁰ < −s.e.(b̂2) × tcrit

Reject H0 if b̂2 − s.e.(b̂2) × tcrit > b2⁰ or b̂2 + s.e.(b̂2) × tcrit < b2⁰

Do not reject H0 if b̂2 − s.e.(b̂2) × tcrit ≤ b2 ≤ b̂2 + s.e.(b̂2) × tcrit
We can then obtain the confidence interval for b2, being the set of all values that would not
be rejected, given the sample estimate b̂ 2 . To make it operational, we need to select a
significance level and determine the corresponding critical value of t.
5
CONFIDENCE INTERVALS FOR REGRESSION COEFFICIENTS

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------

EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

b̂2 − s.e.(b̂2) × tcrit ≤ b2 ≤ b̂2 + s.e.(b̂2) × tcrit

For an example of the construction of a confidence interval, we will return to the wage
equation fitted earlier. We will construct a 95% confidence interval for b2.

6
CONFIDENCE INTERVALS FOR REGRESSION COEFFICIENTS

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

b̂2 − s.e.(b̂2) × tcrit ≤ b2 ≤ b̂2 + s.e.(b̂2) × tcrit

1.266 − 0.185 × 1.965 ≤ b2 ≤ 1.266 + 0.185 × 1.965

The point estimate b̂2 is 1.266 and its standard error is 0.185.

7
CONFIDENCE INTERVALS FOR REGRESSION COEFFICIENTS

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

b̂2 − s.e.(b̂2) × tcrit ≤ b2 ≤ b̂2 + s.e.(b̂2) × tcrit

1.266 − 0.185 × 1.965 ≤ b2 ≤ 1.266 + 0.185 × 1.965

The critical value of t at the 5% significance level with 498 degrees of freedom is 1.965.

8
CONFIDENCE INTERVALS FOR REGRESSION COEFFICIENTS

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

b̂2 − s.e.(b̂2) × tcrit ≤ b2 ≤ b̂2 + s.e.(b̂2) × tcrit

1.266 − 0.185 × 1.965 ≤ b2 ≤ 1.266 + 0.185 × 1.965
0.902 ≤ b2 ≤ 1.630

Hence we establish that the confidence interval is from 0.902 to 1.630. Stata actually
computes the 95% confidence interval as part of its default output, 0.901 to 1.630. The
discrepancy in the lower limit is due to rounding error in the calculations we have made.
9
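The interval is straightforward to reproduce from the output (a sketch; the coefficient, standard error, and 498 degrees of freedom come from the regression above):

display invttail(498, 0.025)                        // t_crit = 1.9647
display 1.265712 - 0.1854782*invttail(498, 0.025)   // lower limit, about 0.9013
display 1.265712 + 0.1854782*invttail(498, 0.025)   // upper limit, about 1.6301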
Copyright Christopher Dougherty 2016.

These slideshows may be downloaded by anyone, anywhere for personal use.


Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.

The content of this slideshow comes from Section 2.6 of C. Dougherty,


Introduction to Econometrics, fifth edition 2016, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oxfordtextbooks.co.uk/orc/dougherty5e/.

Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2016.04.19
Dougherty

Introduction to Econometrics,
5th edition
Chapter 2: Properties of the
Regression Coefficients and
Hypothesis Testing

© Christopher Dougherty, 2016. All rights reserved.


F TEST OF GOODNESS OF FIT

Model Y = b1 + b2X + u

Σ(Yi − Ȳ)² = Σ(Ŷi − Ȳ)² + Σûi²        TSS = ESS + RSS

In an earlier sequence it was demonstrated that the sum of the squared deviations of Y about its sample mean (TSS: total sum of squares) could be decomposed into the sum of the squared deviations of the fitted values about the mean (ESS: explained sum of squares) and the sum of the squares of the residuals (RSS: residual sum of squares).
1
F TEST OF GOODNESS OF FIT

Model Y = b1 + b2X + u

Σ(Yi − Ȳ)² = Σ(Ŷi − Ȳ)² + Σûi²        TSS = ESS + RSS

R² = ESS / TSS = Σ(Ŷi − Ȳ)² / Σ(Yi − Ȳ)²
R2, the usual measure of goodness of fit, was then defined to be the ratio of the explained
sum of squares to the total sum of squares.

2
F TEST OF GOODNESS OF FIT

Model Y = b1 + b2X + u

Σ(Yi − Ȳ)² = Σ(Ŷi − Ȳ)² + Σûi²        TSS = ESS + RSS

R² = ESS / TSS = Σ(Ŷi − Ȳ)² / Σ(Yi − Ȳ)²

The null hypothesis that we are going to test is that the model has no explanatory power.

3
F TEST OF GOODNESS OF FIT

Model Y = b1 + b2X + u
Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 ≠ 0

Σ(Yi − Ȳ)² = Σ(Ŷi − Ȳ)² + Σûi²        TSS = ESS + RSS

R² = ESS / TSS = Σ(Ŷi − Ȳ)² / Σ(Yi − Ȳ)²

Since X is the only explanatory variable at the moment, the null hypothesis is that Y is not determined by X. Mathematically, we have H0: b2 = 0.

4
F TEST OF GOODNESS OF FIT

Model Y = b1 + b2X + u
Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 ≠ 0

Σ(Yi − Ȳ)² = Σ(Ŷi − Ȳ)² + Σûi²        TSS = ESS + RSS

R² = ESS / TSS = Σ(Ŷi − Ȳ)² / Σ(Yi − Ȳ)²

F(k − 1, n − k) = [ESS / (k − 1)] / [RSS / (n − k)] = [(ESS/TSS) / (k − 1)] / [(RSS/TSS) / (n − k)] = [R² / (k − 1)] / [(1 − R²) / (n − k)]
Hypotheses concerning goodness of fit are tested via the F statistic, defined as shown. k is
the number of parameters in the regression equation, which at present is just 2.

5
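The F statistic reported for the earnings regression can be recovered from its R² in exactly this way (a sketch; R² = 0.0855, n = 500, and k = 2 come from the earlier output):

display (0.0855/1) / ((1 - 0.0855)/498)   // = 46.56; the output shows 46.57
display Ftail(1, 498, 46.57)              // Prob > F: effectively 0.0000

The small gap between 46.56 and 46.57 is due to the rounding of R² in the printed output.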
F TEST OF GOODNESS OF FIT

Model Y = b1 + b2X + u
Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 ≠ 0

Σ(Yi − Ȳ)² = Σ(Ŷi − Ȳ)² + Σûi²        TSS = ESS + RSS

R² = ESS / TSS = Σ(Ŷi − Ȳ)² / Σ(Yi − Ȳ)²

F(k − 1, n − k) = [ESS / (k − 1)] / [RSS / (n − k)] = [(ESS/TSS) / (k − 1)] / [(RSS/TSS) / (n − k)] = [R² / (k − 1)] / [(1 − R²) / (n − k)]
n – k is, as with the t statistic, the number of degrees of freedom (number of observations
less the number of parameters estimated). For simple regression analysis, it is n – 2.

6
F TEST OF GOODNESS OF FIT

Model Y = b1 + b2X + u
Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 ≠ 0

Σ(Yi − Ȳ)² = Σ(Ŷi − Ȳ)² + Σûi²        TSS = ESS + RSS

R² = ESS / TSS = Σ(Ŷi − Ȳ)² / Σ(Yi − Ȳ)²

F(k − 1, n − k) = [ESS / (k − 1)] / [RSS / (n − k)] = [(ESS/TSS) / (k − 1)] / [(RSS/TSS) / (n − k)] = [R² / (k − 1)] / [(1 − R²) / (n − k)]
The F statistic may alternatively be written in terms of R2. First divide the numerator and
denominator by TSS.

7
F TEST OF GOODNESS OF FIT

Model Y = b1 + b2X + u
Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 ≠ 0

Σ(Yi − Ȳ)² = Σ(Ŷi − Ȳ)² + Σûi²        TSS = ESS + RSS

R² = ESS / TSS = Σ(Ŷi − Ȳ)² / Σ(Yi − Ȳ)²

F(k − 1, n − k) = [ESS / (k − 1)] / [RSS / (n − k)] = [(ESS/TSS) / (k − 1)] / [(RSS/TSS) / (n − k)] = [R² / (k − 1)] / [(1 − R²) / (n − k)]
We can now rewrite the F statistic as shown. The R2 in the numerator comes straight from
the definition of R2.

8
F TEST OF GOODNESS OF FIT

RSS / TSS = (TSS − ESS) / TSS = 1 − ESS/TSS = 1 − R²

Σ(Yi − Ȳ)² = Σ(Ŷi − Ȳ)² + Σûi²        TSS = ESS + RSS

R² = ESS / TSS = Σ(Ŷi − Ȳ)² / Σ(Yi − Ȳ)²

F(k − 1, n − k) = [ESS / (k − 1)] / [RSS / (n − k)] = [(ESS/TSS) / (k − 1)] / [(RSS/TSS) / (n − k)] = [R² / (k − 1)] / [(1 − R²) / (n − k)]
It is easily demonstrated that RSS/TSS is equal to 1 – R2.

9
F TEST OF GOODNESS OF FIT

Model Y = b1 + b2X + u
Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 ≠ 0

Σ(Yi − Ȳ)² = Σ(Ŷi − Ȳ)² + Σûi²        TSS = ESS + RSS

R² = ESS / TSS = Σ(Ŷi − Ȳ)² / Σ(Yi − Ȳ)²

F(k − 1, n − k) = [ESS / (k − 1)] / [RSS / (n − k)] = [(ESS/TSS) / (k − 1)] / [(RSS/TSS) / (n − k)] = [R² / (k − 1)] / [(1 − R²) / (n − k)]
F is a monotonically increasing function of R2. As R2 increases, the numerator increases
and the denominator decreases, so for both of these reasons F increases.

10
F TEST OF GOODNESS OF FIT

F(k − 1, n − k) = [R² / (k − 1)] / [(1 − R²) / (n − k)]

F(1, 18) = R² / [(1 − R²) / 18]

[Figure: F(1, 18) plotted against R². F rises monotonically from 0 and increases without limit as R² approaches 1.]

Here is F plotted as a function of R2 for the case where there is 1 explanatory variable and
20 observations. Since k = 2, n – k = 18.
11
F TEST OF GOODNESS OF FIT

F(k − 1, n − k) = [R² / (k − 1)] / [(1 − R²) / (n − k)]

F(1, 18) = R² / [(1 − R²) / 18]

[Figure: the same plot of F(1, 18) against R², with the 5% critical value 4.41 marked by a dashed line.]

If the null hypothesis is true, F is a random variable and follows the F distribution with (1, 18) degrees of freedom.

12
F TEST OF GOODNESS OF FIT

F(k − 1, n − k) = [R² / (k − 1)] / [(1 − R²) / (n − k)]

F(1, 18) = R² / [(1 − R²) / 18]

[Figure: F(1, 18) plotted against R². F rises monotonically from 0 and increases without limit as R² approaches 1.]

There will be some critical value which it will exceed, as a matter of chance, only 5 percent
of the time. If we are performing a 5 percent significance test, we will reject H0 if the F
statistic is greater than this critical value.
13
F TEST OF GOODNESS OF FIT

F(k − 1, n − k) = [R² / (k − 1)] / [(1 − R²) / (n − k)]

F(1, 18) = R² / [(1 − R²) / 18]

[Figure: the plot of F(1, 18) against R², with the 5% critical value 4.41 marked.]

In the case of an F test, the critical value depends on the number of explanatory variables
as well as the number of degrees of freedom. When there is one explanatory variable and
18 degrees of freedom, the critical value of F at the 5 percent significance level is 4.41.
14
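Critical values of F can be obtained directly rather than from a table (a sketch):

display invFtail(1, 18, 0.05)    // = 4.41
display invFtail(1, 18, 0.01)    // = 8.29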
F TEST OF GOODNESS OF FIT

F(k − 1, n − k) = [R² / (k − 1)] / [(1 − R²) / (n − k)]

F(1, 18) = R² / [(1 − R²) / 18]

Fcrit,5%(1, 18) = 4.41

[Figure: the plot of F(1, 18) against R²; F reaches the critical value 4.41 at R² = 0.20.]

For one explanatory variable and 18 degrees of freedom, F = 4.41 when R2 = 0.20.

15
F TEST OF GOODNESS OF FIT

F(k − 1, n − k) = [R² / (k − 1)] / [(1 − R²) / (n − k)]

F(1, 18) = R² / [(1 − R²) / 18]

Fcrit,5%(1, 18) = 4.41

[Figure: the plot of F(1, 18) against R²; F reaches the critical value 4.41 at R² = 0.20.]

If R2 is higher than 0.20, F will be higher than 4.41, and we will reject the null hypothesis at
the 5 percent level.

16
F TEST OF GOODNESS OF FIT

F(k − 1, n − k) = [R² / (k − 1)] / [(1 − R²) / (n − k)]

F(1, 18) = R² / [(1 − R²) / 18]

Fcrit,1%(1, 18) = 8.29

[Figure: the plot of F(1, 18) against R²; F reaches the critical value 8.29 at R² = 0.32.]

If we were performing a 1 percent test, with one explanatory variable and 18 degrees of freedom, the critical value of F would be 8.29. F = 8.29 when R² = 0.32.

17
F TEST OF GOODNESS OF FIT

F(k − 1, n − k) = [R² / (k − 1)] / [(1 − R²) / (n − k)]

F(1, 18) = R² / [(1 − R²) / 18]

Fcrit,1%(1, 18) = 8.29

[Figure: the plot of F(1, 18) against R²; F reaches the critical value 8.29 at R² = 0.32.]

If R2 is higher than 0.32, F will be higher than 8.29, and we will reject the null hypothesis at
the 1 percent level.

18
F TEST OF GOODNESS OF FIT

F(k − 1, n − k) = [R² / (k − 1)] / [(1 − R²) / (n − k)]

F(1, 18) = R² / [(1 − R²) / 18]

Fcrit,1%(1, 18) = 8.29

[Figure: the plot of F(1, 18) against R²; F reaches the critical value 8.29 at R² = 0.32.]

Why do we perform the test indirectly, through F, instead of directly through R2? After all, it
would be easy to compute the critical values of R2 from those for F.

19
F TEST OF GOODNESS OF FIT

F(k − 1, n − k) = [R² / (k − 1)] / [(1 − R²) / (n − k)]

F(1, 18) = R² / [(1 − R²) / 18]

Fcrit,1%(1, 18) = 8.29

[Figure: the plot of F(1, 18) against R²; F reaches the critical value 8.29 at R² = 0.32.]

The reason is that an F test can be used for several tests of analysis of variance. Rather
than have a specialized table for each test, it is more convenient to have just one.

20
F TEST OF GOODNESS OF FIT

Model Y = b1 + b2X + u
Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 ≠ 0

Σ(Yi − Ȳ)² = Σ(Ŷi − Ȳ)² + Σûi²        TSS = ESS + RSS

R² = ESS / TSS = Σ(Ŷi − Ȳ)² / Σ(Yi − Ȳ)²

F(k − 1, n − k) = [ESS / (k − 1)] / [RSS / (n − k)] = [(ESS/TSS) / (k − 1)] / [(RSS/TSS) / (n − k)] = [R² / (k − 1)] / [(1 − R²) / (n − k)]
Note that, for simple regression analysis, the null and alternative hypotheses are
mathematically exactly the same as for a two-tailed t test. Could the F test come to a
different conclusion from the t test?
21
F TEST OF GOODNESS OF FIT

Model Y = b1 + b2X + u
Null hypothesis: H0: b2 = 0
Alternative hypothesis: H1: b2 ≠ 0

Σ(Yi − Ȳ)² = Σ(Ŷi − Ȳ)² + Σûi²        TSS = ESS + RSS

R² = ESS / TSS = Σ(Ŷi − Ȳ)² / Σ(Yi − Ȳ)²

F(k − 1, n − k) = [ESS / (k − 1)] / [RSS / (n − k)] = [(ESS/TSS) / (k − 1)] / [(RSS/TSS) / (n − k)] = [R² / (k − 1)] / [(1 − R²) / (n − k)]
The answer, of course, is no. We will demonstrate that, for simple regression analysis, the F
statistic is the square of the t statistic.

22
F TEST OF GOODNESS OF FIT

Demonstration that F = t²

F = ESS / [RSS / (n − 2)] = Σ(Ŷi − Ȳ)² / [Σûi² / (n − 2)]

  = Σ((b̂1 + b̂2Xi) − (b̂1 + b̂2X̄))² / ŝu²  =  b̂2² Σ(Xi − X̄)² / ŝu²

  = b̂2² / [ŝu² / Σ(Xi − X̄)²]  =  b̂2² / (s.e.(b̂2))²  =  t²

We start by replacing ESS and RSS by their mathematical expressions.

23
F TEST OF GOODNESS OF FIT

Demonstration that F = t²

F = ESS / [RSS / (n − 2)] = Σ(Ŷi − Ȳ)² / [Σûi² / (n − 2)]

  = Σ((b̂1 + b̂2Xi) − (b̂1 + b̂2X̄))² / ŝu²  =  b̂2² Σ(Xi − X̄)² / ŝu²

  = b̂2² / [ŝu² / Σ(Xi − X̄)²]  =  b̂2² / (s.e.(b̂2))²  =  t²

The denominator is the expression for ŝu², the estimator of su², for the simple regression model. We expand the numerator using the expression for the fitted relationship.

24
F TEST OF GOODNESS OF FIT

Demonstration that F = t²

F = ESS / [RSS / (n − 2)] = Σ(Ŷi − Ȳ)² / [Σûi² / (n − 2)]

  = Σ((b̂1 + b̂2Xi) − (b̂1 + b̂2X̄))² / ŝu²  =  b̂2² Σ(Xi − X̄)² / ŝu²

  = b̂2² / [ŝu² / Σ(Xi − X̄)²]  =  b̂2² / (s.e.(b̂2))²  =  t²

The b̂ 1 terms in the numerator cancel. The rest of the numerator can be grouped as shown.

25
F TEST OF GOODNESS OF FIT

Demonstration that F = t²

$$ F = \frac{ESS}{RSS/(n-2)} = \frac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum \hat{u}_i^2/(n-2)} = \frac{\sum \big( (\hat{\beta}_1 + \hat{\beta}_2 X_i) - (\hat{\beta}_1 + \hat{\beta}_2 \bar{X}) \big)^2}{\hat{\sigma}_u^2} $$

$$ = \frac{\hat{\beta}_2^2 \sum (X_i - \bar{X})^2}{\hat{\sigma}_u^2} = \frac{\hat{\beta}_2^2}{\hat{\sigma}_u^2 \big/ \sum (X_i - \bar{X})^2} = \frac{\hat{\beta}_2^2}{\big(\mathrm{s.e.}(\hat{\beta}_2)\big)^2} = t^2 $$

We take the β̂₂² term out of the summation as a factor.
26
F TEST OF GOODNESS OF FIT

Demonstration that F = t²

$$ F = \frac{ESS}{RSS/(n-2)} = \frac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum \hat{u}_i^2/(n-2)} = \frac{\sum \big( (\hat{\beta}_1 + \hat{\beta}_2 X_i) - (\hat{\beta}_1 + \hat{\beta}_2 \bar{X}) \big)^2}{\hat{\sigma}_u^2} $$

$$ = \frac{\hat{\beta}_2^2 \sum (X_i - \bar{X})^2}{\hat{\sigma}_u^2} = \frac{\hat{\beta}_2^2}{\hat{\sigma}_u^2 \big/ \sum (X_i - \bar{X})^2} = \frac{\hat{\beta}_2^2}{\big(\mathrm{s.e.}(\hat{\beta}_2)\big)^2} = t^2 $$

We move the term involving X to the denominator.

27
F TEST OF GOODNESS OF FIT

Demonstration that F = t²

$$ F = \frac{ESS}{RSS/(n-2)} = \frac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum \hat{u}_i^2/(n-2)} = \frac{\sum \big( (\hat{\beta}_1 + \hat{\beta}_2 X_i) - (\hat{\beta}_1 + \hat{\beta}_2 \bar{X}) \big)^2}{\hat{\sigma}_u^2} $$

$$ = \frac{\hat{\beta}_2^2 \sum (X_i - \bar{X})^2}{\hat{\sigma}_u^2} = \frac{\hat{\beta}_2^2}{\hat{\sigma}_u^2 \big/ \sum (X_i - \bar{X})^2} = \frac{\hat{\beta}_2^2}{\big(\mathrm{s.e.}(\hat{\beta}_2)\big)^2} = t^2 $$

The denominator is the square of the standard error of β̂₂.

28
F TEST OF GOODNESS OF FIT

Demonstration that F = t²

$$ F = \frac{ESS}{RSS/(n-2)} = \frac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum \hat{u}_i^2/(n-2)} = \frac{\sum \big( (\hat{\beta}_1 + \hat{\beta}_2 X_i) - (\hat{\beta}_1 + \hat{\beta}_2 \bar{X}) \big)^2}{\hat{\sigma}_u^2} $$

$$ = \frac{\hat{\beta}_2^2 \sum (X_i - \bar{X})^2}{\hat{\sigma}_u^2} = \frac{\hat{\beta}_2^2}{\hat{\sigma}_u^2 \big/ \sum (X_i - \bar{X})^2} = \frac{\hat{\beta}_2^2}{\big(\mathrm{s.e.}(\hat{\beta}_2)\big)^2} = t^2 $$

Hence we obtain β̂₂² divided by the square of the standard error of β̂₂. This is the t statistic,
squared.

29
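A numerical sketch confirming the algebra: with simulated data (again, all data-generating values are illustrative assumptions), the F statistic computed from the sums of squares equals the squared t statistic on the slope.

```python
import numpy as np

# Simulated sample (all data-generating values are illustrative assumptions)
rng = np.random.default_rng(1)
n = 20
X = rng.uniform(0, 10, n)
Y = 2.0 + 0.5 * X + rng.normal(0, 2, n)

b2 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b1 = Y.mean() - b2 * X.mean()
u_hat = Y - (b1 + b2 * X)

sigma2_hat = np.sum(u_hat ** 2) / (n - 2)        # RSS/(n-2), the estimator of var(u)
se_b2 = np.sqrt(sigma2_hat / np.sum((X - X.mean()) ** 2))
t_stat = b2 / se_b2                              # t statistic for H0: beta2 = 0

ESS = np.sum(((b1 + b2 * X) - Y.mean()) ** 2)
F = ESS / sigma2_hat                             # F = ESS / (RSS/(n-2))

assert np.isclose(F, t_stat ** 2)                # F is the square of t
print(F, t_stat ** 2)
```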
F TEST OF GOODNESS OF FIT

Demonstration that F = t²

$$ F = \frac{ESS}{RSS/(n-2)} = \frac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum \hat{u}_i^2/(n-2)} = \frac{\sum \big( (\hat{\beta}_1 + \hat{\beta}_2 X_i) - (\hat{\beta}_1 + \hat{\beta}_2 \bar{X}) \big)^2}{\hat{\sigma}_u^2} $$

$$ = \frac{\hat{\beta}_2^2 \sum (X_i - \bar{X})^2}{\hat{\sigma}_u^2} = \frac{\hat{\beta}_2^2}{\hat{\sigma}_u^2 \big/ \sum (X_i - \bar{X})^2} = \frac{\hat{\beta}_2^2}{\big(\mathrm{s.e.}(\hat{\beta}_2)\big)^2} = t^2 $$

It can also be shown that the critical value of F, at any significance level, is equal to the
square of the critical value of t. We will not attempt to prove this.

30
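Although the proof is omitted, the claim is easy to check numerically, since the square of a t(m) random variable has an F(1, m) distribution. A minimal sketch with SciPy:

```python
import numpy as np
from scipy.stats import f, t

alpha, df = 0.01, 18

F_crit = f.ppf(1 - alpha, dfn=1, dfd=df)    # critical value of F(1, 18)
t_crit = t.ppf(1 - alpha / 2, df)           # two-tailed critical value of t(18)

assert np.isclose(F_crit, t_crit ** 2)      # approximately 8.29 in both cases
print(F_crit, t_crit ** 2)
```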
F TEST OF GOODNESS OF FIT

Demonstration that F = t²

$$ F = \frac{ESS}{RSS/(n-2)} = \frac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum \hat{u}_i^2/(n-2)} = \frac{\sum \big( (\hat{\beta}_1 + \hat{\beta}_2 X_i) - (\hat{\beta}_1 + \hat{\beta}_2 \bar{X}) \big)^2}{\hat{\sigma}_u^2} $$

$$ = \frac{\hat{\beta}_2^2 \sum (X_i - \bar{X})^2}{\hat{\sigma}_u^2} = \frac{\hat{\beta}_2^2}{\hat{\sigma}_u^2 \big/ \sum (X_i - \bar{X})^2} = \frac{\hat{\beta}_2^2}{\big(\mathrm{s.e.}(\hat{\beta}_2)\big)^2} = t^2 $$

Since the F test is equivalent to a two-sided t test in the simple regression model, there is
no point in performing both tests. In fact, if justified, a one-sided t test would be better than
either because it is more powerful (lower risk of Type II error if H0 is false).
31
F TEST OF GOODNESS OF FIT

Demonstration that F = t²

$$ F = \frac{ESS}{RSS/(n-2)} = \frac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum \hat{u}_i^2/(n-2)} = \frac{\sum \big( (\hat{\beta}_1 + \hat{\beta}_2 X_i) - (\hat{\beta}_1 + \hat{\beta}_2 \bar{X}) \big)^2}{\hat{\sigma}_u^2} $$

$$ = \frac{\hat{\beta}_2^2 \sum (X_i - \bar{X})^2}{\hat{\sigma}_u^2} = \frac{\hat{\beta}_2^2}{\hat{\sigma}_u^2 \big/ \sum (X_i - \bar{X})^2} = \frac{\hat{\beta}_2^2}{\big(\mathrm{s.e.}(\hat{\beta}_2)\big)^2} = t^2 $$

The F test will have its own role to play when we come to multiple regression analysis.

32
F TEST OF GOODNESS OF FIT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

Here is the output for the regression of hourly earnings on years of schooling for the
sample of 500 respondents from the National Longitudinal Survey of Youth 1997–.

33
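For readers working in Python rather than Stata, a comparable summary could be produced with statsmodels. The variable names EARNINGS and S are taken from the output above, but the file name in the loading step is a hypothetical placeholder: the NLSY97 extract itself is not distributed with these slides.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical loading step: "nlsy97_extract.csv" is a placeholder name, assumed
# to contain the EARNINGS and S variables used in the slides.
df = pd.read_csv("nlsy97_extract.csv")

results = smf.ols("EARNINGS ~ S", data=df).fit()
print(results.summary())                          # reports F, R-squared, t statistics, etc.
print(results.fvalue, results.tvalues["S"] ** 2)  # F equals the squared t on S
```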
F TEST OF GOODNESS OF FIT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

$$ F(1,\,n-2) = \frac{ESS}{RSS/(n-2)} = \frac{6014}{64315/(500-2)} = \frac{6014}{129.15} = 46.57 $$

We shall check that the F statistic has been calculated correctly. The explained sum of
squares (described in Stata as the model sum of squares) is 6014.

34
F TEST OF GOODNESS OF FIT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

$$ F(1,\,n-2) = \frac{ESS}{RSS/(n-2)} = \frac{6014}{64315/(500-2)} = \frac{6014}{129.15} = 46.57 $$

The residual sum of squares is 64315.

35
F TEST OF GOODNESS OF FIT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

$$ F(1,\,n-2) = \frac{ESS}{RSS/(n-2)} = \frac{6014}{64315/(500-2)} = \frac{6014}{129.15} = 46.57 $$

The number of degrees of freedom is 500 – 2 = 498.

36
F TEST OF GOODNESS OF FIT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

$$ F(1,\,n-2) = \frac{ESS}{RSS/(n-2)} = \frac{6014}{64315/(500-2)} = \frac{6014}{129.15} = 46.57 $$

The denominator of the expression for F is therefore 129.15. Note that this is an estimate of
σᵤ². Its square root, denoted in Stata by Root MSE, is an estimate of the standard deviation
of u.
37
F TEST OF GOODNESS OF FIT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

$$ F(1,\,n-2) = \frac{ESS}{RSS/(n-2)} = \frac{6014}{64315/(500-2)} = \frac{6014}{129.15} = 46.57 $$

Our calculation of F agrees with that in the Stata output.

38
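The arithmetic of the last few slides can be reproduced directly from the figures reported in the output; a minimal sketch (values copied from the Stata table above):

```python
# Figures copied from the Stata output above
ESS = 6014.04474        # Model sum of squares
RSS = 64314.9215        # Residual sum of squares
n = 500

sigma2_hat = RSS / (n - 2)       # 129.146..., the Residual MS
F = ESS / sigma2_hat             # 46.57, matching F(1, 498) in the output
root_mse = sigma2_hat ** 0.5     # 11.364, the Root MSE in the output

print(F, sigma2_hat, root_mse)
```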
F TEST OF GOODNESS OF FIT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

$$ F(1,\,n-2) = \frac{R^2}{(1-R^2)/(n-2)} = \frac{0.0855}{(1-0.0855)/(500-2)} = 46.56 $$

We will also check the F statistic using the expression for it in terms of R². We see again
that it agrees, apart from rounding error.

39
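The same check, computed from R² (a one-line sketch using the figures above):

```python
R2 = 0.0855                      # R-squared from the Stata output
n = 500

F = R2 / ((1 - R2) / (n - 2))    # approximately 46.56
print(F)
```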
F TEST OF GOODNESS OF FIT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

We will also check the relationship between the F statistic and the t statistic for the slope
coefficient.

40
F TEST OF GOODNESS OF FIT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

$$ 6.82^2 = 46.51 $$

Obviously, this is correct as well, apart from rounding error.

41
F TEST OF GOODNESS OF FIT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

$$ F_{\text{crit},\,0.1\%}(1,500) = 10.96 \qquad t_{\text{crit},\,0.1\%}(500) = 3.31 \qquad 10.96 = 3.31^2 $$

And the critical value of F is the square of the critical value of t. (We are using the values
for 500 degrees of freedom because those for 498 do not appear in the table.)

42
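As a sketch, both tabulated critical values can be reproduced with SciPy, using 500 degrees of freedom as in the slide:

```python
import numpy as np
from scipy.stats import f, t

df = 500
F_crit = f.ppf(1 - 0.001, dfn=1, dfd=df)   # approximately 10.96
t_crit = t.ppf(1 - 0.001 / 2, df)          # approximately 3.31 (two-tailed 0.1%)

assert np.isclose(F_crit, t_crit ** 2)     # the F critical value is the t critical value squared
print(F_crit, t_crit ** 2)
```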
F TEST OF GOODNESS OF FIT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

$$ F_{\text{crit},\,0.1\%}(1,500) = 10.96 \qquad t_{\text{crit},\,0.1\%}(500) = 3.31 \qquad 10.96 = 3.31^2 $$

The relationship is shown for the 0.1% significance level, but obviously it is also true for any
other significance level.

43
Copyright Christopher Dougherty 2016.

These slideshows may be downloaded by anyone, anywhere for personal use.


Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.

The content of this slideshow comes from Section 2.7 of C. Dougherty,


Introduction to Econometrics, fifth edition 2016, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oxfordtextbooks.co.uk/orc/dougherty5e/

Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2016.04.20
