UNIT 14 TESTS FOR SPECIFICATION ERROR
Structure
14.1 Introduction
14.2 Objectives
14.3 Tests for Identifying the Most Efficient Model
14.3.1 The R² Test and Adjusted R² Test
14.1 INTRODUCTION
In the previous Unit we highlighted the consequences of specification errors.
There are three types of specification error: inclusion of an irrelevant
variable, exclusion of a relevant variable, and use of an incorrect functional form. When
the econometric model is not specified correctly, the coefficient estimates, the
confidence intervals, and the hypothesis tests can be misleading and, in some cases, inconsistent. In
view of this, econometric models should be correctly specified.
Dr. Sahba Fatima, Independent Researcher, Lucknow.
Econometric theory suggests certain criteria and test statistics. On the basis of
these criteria we select the most appropriate econometric model. We describe
some of these criteria below.
14.2 OBJECTIVES
After going through this Unit, you should be in a position to
identify econometric models that are not specified correctly;
take remedial measures for correcting the specification error; and
evaluate the performance of competing models.
model with a higher R² is preferred. You should, however, keep in mind that a very
high R² may indicate the presence of multicollinearity in the model. If the R² is high
but the t-ratios of the coefficients are not statistically significant, you should check
for multicollinearity. The R² is calculated on the basis of the sample data.
Thus, only the explanatory variables included in the model are considered in the estimation
of R². Variation in the dependent variable that is due to variables left out of the model
is not captured by R².
The R² never decreases, and generally increases, when more explanatory variables are added.
Thus, we are tempted to add more explanatory variables to increase the
explanatory power of the model. If we add irrelevant explanatory variables to a
model, the estimators remain unbiased, but there is an increase in the variance of the
estimators. This makes forecasts and analysis on the basis of such models
unreliable.
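In order to overcome this difficulty, we use the ‘adjusted R²’. It is denoted by $\bar{R}^2$
and defined as follows:
$$\bar{R}^2 = 1 - \frac{RSS/(n-k)}{TSS/(n-1)} = 1 - (1 - R^2)\,\frac{n-1}{n-k} \qquad \ldots (14.4)$$
where RSS is the residual sum of squares, TSS is the total sum of squares, n is the
number of observations, and k is the number of parameters in the model (including the intercept).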
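As a rough numerical illustration of equation (14.4), the sketch below assumes the standard form adjusted R² = 1 − (1 − R²)(n − 1)/(n − k), with n observations and k estimated parameters including the intercept. The function name and all figures are hypothetical, chosen only to show that a small rise in R² from an extra regressor can still lower the adjusted R².

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared from R-squared, n observations and k parameters.

    Assumes k counts the intercept, so the residual degrees of freedom are n - k.
    """
    if n <= k:
        raise ValueError("need more observations than parameters")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k)


# Hypothetical figures: adding a fifth parameter raises R2 from 0.900 to 0.901.
small_model = adjusted_r2(0.900, n=30, k=4)
large_model = adjusted_r2(0.901, n=30, k=5)

print(round(small_model, 4))  # 0.8885
print(round(large_model, 4))  # 0.8852
```

In this made-up case the larger model has the higher R² but the lower adjusted R², so the extra regressor does not pay for the degree of freedom it uses.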
$$\ln AIC = \frac{2k}{n} + \ln\left(\frac{RSS}{n}\right) \qquad \ldots (14.6)$$
where ln AIC is the natural log of AIC, RSS is the residual sum of squares, n is the
number of observations, k is the number of estimated parameters, and 2k/n is the penalty factor.
Remember that the model with a lower value of ln AIC is considered to be better.
Thus, when we compare two models by using the AIC criterion, the model with the
lower value of AIC has the better specification. The logic is simple. A model is
rewarded for reducing the residual sum of squares, but it is penalised, through the
factor 2k/n, for every additional parameter it uses to do so.
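The comparison described above can be sketched in a few lines, assuming equation (14.6) has the standard form ln AIC = 2k/n + ln(RSS/n). The two models and their RSS values are hypothetical; only the decision rule (the lower ln AIC is preferred) comes from the text.

```python
import math


def ln_aic(rss, n, k):
    """ln AIC = 2k/n + ln(RSS/n), with 2k/n acting as the penalty term."""
    return 2.0 * k / n + math.log(rss / n)


# Hypothetical models fitted to the same n = 50 observations.
n = 50
ln_aic_a = ln_aic(rss=120.0, n=n, k=3)  # fewer parameters, slightly larger RSS
ln_aic_b = ln_aic(rss=118.0, n=n, k=5)  # more parameters, slightly smaller RSS

preferred = "A" if ln_aic_a < ln_aic_b else "B"
print(round(ln_aic_a, 4), round(ln_aic_b, 4), preferred)
```

Here model B fits marginally better, but the gain in RSS is too small to offset the penalty for two extra parameters, so model A is preferred.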
14.3.3 Schwarz Information Criterion
The Schwarz Information Criterion (SIC) also relies on the RSS, like the AIC
mentioned above. This method is also popular for assessing whether an
econometric model is correctly specified. The SIC is defined as follows:
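$$SIC = n^{k/n}\,\frac{\sum \hat{u}_i^2}{n} = n^{k/n}\,\frac{RSS}{n} \qquad \ldots (14.7)$$
$$\ln SIC = \frac{k}{n}\ln n + \ln\left(\frac{RSS}{n}\right) \qquad \ldots (14.8)$$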
where [(k/n) ln n] is the penalty factor. Note that the SIC imposes a
harsher penalty for the inclusion of additional explanatory variables than the AIC
whenever ln n > 2, that is, for samples of eight or more observations.
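The two penalty factors can be compared directly. The sketch below assumes the penalty in ln AIC is 2k/n and the penalty in ln SIC is (k/n) ln n, as in equations (14.6) and (14.8); the function names and sample sizes are illustrative only.

```python
import math


def aic_penalty(n, k):
    """Penalty term in ln AIC: 2k/n."""
    return 2.0 * k / n


def sic_penalty(n, k):
    """Penalty term in ln SIC: (k/n) ln n."""
    return (k / n) * math.log(n)


def ln_sic(rss, n, k):
    """ln SIC = (k/n) ln n + ln(RSS/n)."""
    return sic_penalty(n, k) + math.log(rss / n)


# The SIC penalty exceeds the AIC penalty exactly when ln n > 2.
for n in (7, 8, 50):
    print(n, sic_penalty(n, k=3) > aic_penalty(n, k=3))
```

The loop prints False for n = 7 and True for n = 8 and n = 50, since ln 7 is just below 2 and ln 8 is just above it.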
14.3.4 Mallows’ $C_p$ Criterion
When we do not include all the relevant variables in a model, the estimators are
biased. Mallows’ Cp criterion evaluates such bias to find out whether there
is a significant deviation from the unbiased estimators. Thus, Mallows’ Cp
criterion helps us in selecting the best among competing econometric models.
If some of the explanatory variables are dropped from a model, the residual sum
of squares (RSS) generally increases. Let us assume that the true model
has k regressors. For this model, $\hat{\sigma}^2$ is the estimator of the true $\sigma^2$. Now, suppose we
retain only p of the k regressors (p ≤ k). The residual sum of squares obtained from the
truncated model is $RSS_p$. Mallows’ Cp criterion is based on the following
formula:
$$C_p = \frac{RSS_p}{\hat{\sigma}^2} - (n - 2p) \qquad \ldots (14.9)$$
where n is the number of observations.
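A small worked sketch of equation (14.9) follows, treating p as the number of parameters kept in the truncated model (an assumption about the notation). Two further conventions are assumed rather than taken from the text: that $\hat{\sigma}^2$ is computed from the full model as RSS/(n − k), and that a truncated model with little bias has Cp close to p. All numbers are hypothetical.

```python
def mallows_cp(rss_p, sigma2_hat, n, p):
    """Mallows' Cp = RSS_p / sigma2_hat - (n - 2p)."""
    return rss_p / sigma2_hat - (n - 2 * p)


# Hypothetical full model: n = 40 observations, k = 6 parameters, RSS = 68.
n, k, rss_full = 40, 6, 68.0
sigma2_hat = rss_full / (n - k)  # 68 / 34 = 2.0

cp_full = mallows_cp(rss_full, sigma2_hat, n, k)  # equals k by construction
cp_four = mallows_cp(74.0, sigma2_hat, n, 4)      # 37 - 32 = 5.0, close to p = 4
cp_three = mallows_cp(110.0, sigma2_hat, n, 3)    # 55 - 34 = 21.0, far above p = 3

print(cp_full, cp_four, cp_three)  # 6.0 5.0 21.0
```

Under the assumed rule of thumb, the four-parameter model looks acceptable, while the three-parameter model appears to have dropped a relevant variable.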
you should go by the theoretical appropriateness of including or excluding a
variable. In order to have a correctly specified model, a thorough understanding
of the theoretical concepts and the related literature is necessary. Also, the model
that we fit will only be as good as the data that we have collected. If the data
collected do not suffer from, say, multicollinearity or autocorrelation, we are
likely to have a more robust model.
As mentioned earlier, the criteria for selecting an appropriate model rest primarily
on the theory behind it and the strength of the collected data. Many a time,
we observe a certain relationship between two variables. Such a relationship,
however, may be superficial or spurious. Let us take an example. At a traffic light,
cars stop when the signal is red. It does not mean that cars cannot move when
there is a red light in front of them. It also does not mean that the traffic light has some
damaging effect on moving cars. The reason is the observance of traffic rules. If
we ignore the traffic rules and go by observation only, our reasoning will be
wrong. The dependent variable and the independent variable may both be
affected by another variable. In such cases the relationship is confounded.
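The idea of a confounded relationship can be illustrated with simulated data. In the sketch below, which is not part of the original Unit, a hypothetical third variable z drives both x and y; the two are correlated even though neither affects the other, and the correlation all but disappears once z is controlled for through the first-order partial correlation. The variable names, sample size and seed are arbitrary choices.

```python
import math
import random


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_correlation(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = correlation(x, y), correlation(x, z), correlation(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


random.seed(0)  # fixed seed so the simulated data are reproducible
z = [random.gauss(0, 1) for _ in range(5000)]  # common cause
x = [zi + random.gauss(0, 1) for zi in z]      # driven by z, not by y
y = [zi + random.gauss(0, 1) for zi in z]      # driven by z, not by x

print(round(correlation(x, y), 2))             # clearly positive
print(round(partial_correlation(x, y, z), 2))  # close to zero
```

A regression of y on x alone would therefore report a relationship that is entirely due to the omitted variable z.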
You should note one more issue regarding the selection of econometric models.
Different test criteria may suggest different models. For example, suppose economic logic
suggests that there could be two possible econometric models (say, model A and
model B) for a particular issue. You may come across a situation in which the R²
test suggests model A and the AIC suggests model B. In such situations you
should carry out a number of tests and only then choose the best model.
Adjusted R², Mallows’ Cp, p-values, etc. may point to different regression
equations without giving much clarity to the econometrician. Thus, we conclude that
none of the methods for model selection listed above is adequate by itself.
There is no substitute for a theoretical understanding of the related literature,
accurately collected data, a practical understanding of the problem, and common
sense while specifying an econometric model. We will discuss model selection
criteria further in the course BECC 142: Applied Econometrics.
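The possibility that different criteria point to different models is easy to reproduce. The sketch below assumes the standard forms ln AIC = 2k/n + ln(RSS/n) and ln SIC = (k/n) ln n + ln(RSS/n); models A and B and their RSS values are hypothetical.

```python
import math


def ln_aic(rss, n, k):
    """ln AIC = 2k/n + ln(RSS/n)."""
    return 2.0 * k / n + math.log(rss / n)


def ln_sic(rss, n, k):
    """ln SIC = (k/n) ln n + ln(RSS/n)."""
    return (k / n) * math.log(n) + math.log(rss / n)


# Hypothetical models fitted to the same n = 50 observations: (RSS, k).
n = 50
models = {"A": (120.0, 3), "B": (108.0, 5)}

best_by_aic = min(models, key=lambda m: ln_aic(models[m][0], n, models[m][1]))
best_by_sic = min(models, key=lambda m: ln_sic(models[m][0], n, models[m][1]))

print(best_by_aic, best_by_sic)  # B A
```

With these figures the AIC favours the larger model B while the SIC, with its heavier penalty, favours model A, so the final choice has to rest on theory and judgement.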
Check Your Progress 1
1) Explain why the adjusted R² ($\bar{R}^2$) is a better criterion than R² in model specification.
2) Explain how the AIC and SIC (also known as BIC) criteria are applied in the selection of
econometric models.
APPENDIX TABLES
Table A1: Normal Area Table
Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
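The entries of Table A1 can be reproduced from the error function, since the area under the standard normal curve between 0 and z equals 0.5·erf(z/√2). The sketch below is only a cross-check on how the table is read (row gives z to one decimal, column gives the second decimal); the function name is hypothetical.

```python
import math


def area_zero_to_z(z):
    """Area under the standard normal curve between 0 and z (z >= 0)."""
    return 0.5 * math.erf(z / math.sqrt(2.0))


# Row 1.0, column 0.00 of Table A1, and row 1.9, column 0.06.
print(f"{area_zero_to_z(1.00):.4f}")  # 0.3413
print(f"{area_zero_to_z(1.96):.4f}")  # 0.4750
```

Doubling the second value gives 0.95, which is why z = 1.96 is used for a two-sided test at the 5% level of significance.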
Table A2: Critical Values of Chi-squared Distribution
Table A3: Critical Values of t Distribution
Table A4: Critical Values of F Distribution
(5% level of significance)
df2/df1 1 2 3 4 5 6 7 8 9 10
1 161.448 199.500 215.707 224.583 230.162 233.986 236.768 238.883 240.543 241.882
2 18.513 19.000 19.164 19.247 19.296 19.330 19.353 19.371 19.385 19.396
3 10.128 9.552 9.277 9.117 9.014 8.941 8.887 8.845 8.812 8.786
4 7.709 6.944 6.591 6.388 6.256 6.163 6.094 6.041 5.999 5.964
5 6.608 5.786 5.410 5.192 5.050 4.950 4.876 4.818 4.773 4.735
6 5.987 5.143 4.757 4.534 4.387 4.284 4.207 4.147 4.099 4.060
7 5.591 4.737 4.347 4.120 3.972 3.866 3.787 3.726 3.677 3.637
8 5.318 4.459 4.066 3.838 3.688 3.581 3.501 3.438 3.388 3.347
9 5.117 4.257 3.863 3.633 3.482 3.374 3.293 3.230 3.179 3.137
10 4.965 4.103 3.708 3.478 3.326 3.217 3.136 3.072 3.020 2.978
11 4.844 3.982 3.587 3.357 3.204 3.095 3.012 2.948 2.896 2.854
12 4.747 3.885 3.490 3.259 3.106 2.996 2.913 2.849 2.796 2.753
13 4.667 3.806 3.411 3.179 3.025 2.915 2.832 2.767 2.714 2.671
14 4.600 3.739 3.344 3.112 2.958 2.848 2.764 2.699 2.646 2.602
15 4.543 3.682 3.287 3.056 2.901 2.791 2.707 2.641 2.588 2.544
16 4.494 3.634 3.239 3.007 2.852 2.741 2.657 2.591 2.538 2.494
17 4.451 3.592 3.197 2.965 2.810 2.699 2.614 2.548 2.494 2.450
18 4.414 3.555 3.160 2.928 2.773 2.661 2.577 2.510 2.456 2.412
19 4.381 3.522 3.127 2.895 2.740 2.628 2.544 2.477 2.423 2.378
20 4.351 3.493 3.098 2.866 2.711 2.599 2.514 2.447 2.393 2.348
21 4.325 3.467 3.073 2.840 2.685 2.573 2.488 2.421 2.366 2.321
22 4.301 3.443 3.049 2.817 2.661 2.549 2.464 2.397 2.342 2.297
23 4.279 3.422 3.028 2.796 2.640 2.528 2.442 2.375 2.320 2.275
24 4.260 3.403 3.009 2.776 2.621 2.508 2.423 2.355 2.300 2.255
25 4.242 3.385 2.991 2.759 2.603 2.490 2.405 2.337 2.282 2.237
26 4.225 3.369 2.975 2.743 2.587 2.474 2.388 2.321 2.266 2.220
27 4.210 3.354 2.960 2.728 2.572 2.459 2.373 2.305 2.250 2.204
28 4.196 3.340 2.947 2.714 2.558 2.445 2.359 2.291 2.236 2.190
29 4.183 3.328 2.934 2.701 2.545 2.432 2.346 2.278 2.223 2.177
30 4.171 3.316 2.922 2.690 2.534 2.421 2.334 2.266 2.211 2.165
40 4.085 3.232 2.839 2.606 2.450 2.336 2.249 2.180 2.124 2.077
60 4.001 3.150 2.758 2.525 2.368 2.254 2.167 2.097 2.040 1.993
120 3.920 3.072 2.680 2.447 2.290 2.175 2.087 2.016 1.959 1.911
inf 3.842 2.996 2.605 2.372 2.214 2.099 2.010 1.938 1.880 1.831
Table A4: Critical Values of F Distribution (Contd.)
(5% level of significance)
Table A4: Critical Values of F Distribution (contd.)
(1% level of significance)
df2/df1 1 2 3 4 5 6 7 8 9 10
1 4052.181 4999.500 5403.352 5624.583 5763.650 5858.986 5928.356 5981.070 6022.473 6055.847
2 98.503 99.000 99.166 99.249 99.299 99.333 99.356 99.374 99.388 99.399
3 34.116 30.817 29.457 28.710 28.237 27.911 27.672 27.489 27.345 27.229
4 21.198 18.000 16.694 15.977 15.522 15.207 14.976 14.799 14.659 14.546
5 16.258 13.274 12.060 11.392 10.967 10.672 10.456 10.289 10.158 10.051
6 13.745 10.925 9.780 9.148 8.746 8.466 8.260 8.102 7.976 7.874
7 12.246 9.547 8.451 7.847 7.460 7.191 6.993 6.840 6.719 6.620
8 11.259 8.649 7.591 7.006 6.632 6.371 6.178 6.029 5.911 5.814
9 10.561 8.022 6.992 6.422 6.057 5.802 5.613 5.467 5.351 5.257
10 10.044 7.559 6.552 5.994 5.636 5.386 5.200 5.057 4.942 4.849
11 9.646 7.206 6.217 5.668 5.316 5.069 4.886 4.744 4.632 4.539
12 9.330 6.927 5.953 5.412 5.064 4.821 4.640 4.499 4.388 4.296
13 9.074 6.701 5.739 5.205 4.862 4.620 4.441 4.302 4.191 4.100
14 8.862 6.515 5.564 5.035 4.695 4.456 4.278 4.140 4.030 3.939
15 8.683 6.359 5.417 4.893 4.556 4.318 4.142 4.004 3.895 3.805
16 8.531 6.226 5.292 4.773 4.437 4.202 4.026 3.890 3.780 3.691
17 8.400 6.112 5.185 4.669 4.336 4.102 3.927 3.791 3.682 3.593
18 8.285 6.013 5.092 4.579 4.248 4.015 3.841 3.705 3.597 3.508
19 8.185 5.926 5.010 4.500 4.171 3.939 3.765 3.631 3.523 3.434
20 8.096 5.849 4.938 4.431 4.103 3.871 3.699 3.564 3.457 3.368
21 8.017 5.780 4.874 4.369 4.042 3.812 3.640 3.506 3.398 3.310
22 7.945 5.719 4.817 4.313 3.988 3.758 3.587 3.453 3.346 3.258
23 7.881 5.664 4.765 4.264 3.939 3.710 3.539 3.406 3.299 3.211
24 7.823 5.614 4.718 4.218 3.895 3.667 3.496 3.363 3.256 3.168
25 7.770 5.568 4.675 4.177 3.855 3.627 3.457 3.324 3.217 3.129
26 7.721 5.526 4.637 4.140 3.818 3.591 3.421 3.288 3.182 3.094
27 7.677 5.488 4.601 4.106 3.785 3.558 3.388 3.256 3.149 3.062
28 7.636 5.453 4.568 4.074 3.754 3.528 3.358 3.226 3.120 3.032
29 7.598 5.420 4.538 4.045 3.725 3.499 3.330 3.198 3.092 3.005
30 7.562 5.390 4.510 4.018 3.699 3.473 3.304 3.173 3.067 2.979
40 7.314 5.179 4.313 3.828 3.514 3.291 3.124 2.993 2.888 2.801
60 7.077 4.977 4.126 3.649 3.339 3.119 2.953 2.823 2.718 2.632
120 6.851 4.787 3.949 3.480 3.174 2.956 2.792 2.663 2.559 2.472
inf 6.635 4.605 3.782 3.319 3.017 2.802 2.639 2.511 2.407 2.321
Table A4: Critical Values of F Distribution (contd.)
(1% level of significance)
Table A5: Durbin-Watson d-statistic (level of significance = 0.05; k = number of regressors)
GLOSSARY
Association : It refers to the connection or relationship between
variables.
Econometric Model : These are statistical models specifying the
relationships between various economic
quantities.
MWD test : This is a test for selecting the appropriate
functional form for a regression, proposed by
MacKinnon, White and Davidson. The test is hence
known as the MWD test.
Type II Error : The error that occurs when we accept (that is, fail to
reject) a null hypothesis that is actually false. Its
probability is the probability of accepting the null
hypothesis when it is false.