Econometrics Chapter 3
Contents
3.0 Aims and Objectives
3.1 Introduction
3.2 Specification of the Model
3.3 Assumptions
3.4 Estimation
3.5 The Coefficient of Multiple Determination
3.6 Test of Significance in Multiple Regression
3.7 Forecasting Based on Multiple Regression
3.9 Summary
3.10 Answers to Check Your Progress
3.11 References.
3.12 Model Examination Question
3.0 AIMS AND OBJECTIVES
The purpose of this unit is to introduce you to the concept of the multiple linear regression
model and to show how the method of OLS can be extended to estimate the parameters of such
models.
After covering this unit you will be able to:
understand the specification of the multiple linear regression model
estimate the parameters of the model
understand the meaning of partial correlation and the multiple coefficient of determination
undertake tests of significance in multiple regression
make forecasts based on multiple regression
3.1 INTRODUCTION
We have studied the two-variable model extensively in the previous unit. But in economics you
hardly find that one variable is affected by only one explanatory variable. For example, the
demand for a commodity depends on the price of the commodity itself, the prices of
competing or complementary goods, the income of the consumer, the number of consumers in the
market, and so on. Hence the two-variable model is often inadequate in practical work. Therefore, we
need to discuss multiple regression models. Multiple linear regression is concerned
with the relationship between a dependent variable (Y) and two or more explanatory variables
(X1, X2, …, Xn).
3.2 SPECIFICATION OF THE MODEL
Let us start our discussion with the simplest multiple regression model, i.e., a model with two
explanatory variables.
Y = f(X1, X2)
Example: The demand for a commodity may be influenced not only by the price of the commodity
but also by the consumer's income.
Since the theory does not specify the mathematical form of the demand function, we assume
the relationship between Y, X1, and X2 is linear. Hence we may write the three-variable
Population Regression Function (PRF) as follows:
Yi = β0 + β1X1i + β2X2i + Ui
where Y is the quantity demanded
X1 and X2 are the price and income respectively
β0 is the intercept term
β1 is the coefficient of X1 and its expected sign is negative
(remember the law of demand)
β2 is the coefficient of X2 and its expected sign is positive, assuming that the good is a
normal good.
The coefficients β1 and β2 are called the partial regression coefficients. We will discuss the
meaning of these coefficients later.
3.3 ASSUMPTIONS
To complete the specification of our model we need some assumptions about the
random variable U. These assumptions are the same as those already explained for
the two-variable model in unit 2.
Assumptions of the model
1. Zero mean value of Ui
The random variable U has a zero mean value for each Xi
E(Ui/X1i, X2i) = 0 for each i.
2. Homoscedasticity
The variance of each Ui is the same for all the Xi values
Var(Ui) = E(Ui²) = σu²
3. Normality
The values of each Ui are normally distributed
Ui ~ N(0, σu²)
4. No serial correlation (serial independence of the U’s)
The values of Ui (corresponding to Xi) are independent of the values of any other Uj
(corresponding to Xj).
Cov(Ui, Uj) = 0 for i ≠ j
5. Independence of Ui and Xi
Every disturbance term Ui is independent of the explanatory variables. That is, the
covariance between Ui and each X variable is zero.
Cov(Ui, X1i) = Cov (Ui, X2i) = 0
Or E(Ui X1i) = E(Ui X2i) = 0
Here the values of the X's are a set of fixed numbers in all hypothetical samples (refer to the
assumptions of OLS in unit 2).
6. No collinearity between the X variables (No multicollinearity). The explanatory variables
are not perfectly linearly correlated: there is no exact linear relationship between X1 and
X2. (A quick numerical check of this assumption on sample data is sketched below.)
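As a quick illustration (not part of the original text), the following Python sketch checks for perfect collinearity between two regressors by looking at their sample correlation and at the column rank of the data matrix. The arrays x1 and x2 are illustrative values (the fertilizer and rainfall figures used later in the example of section 3.6).

```python
import numpy as np

# Illustrative data: fertilizer (x1) and rainfall (x2) from the example in section 3.6
x1 = np.array([100, 200, 300, 400, 500, 600, 700], dtype=float)
x2 = np.array([10, 20, 10, 30, 20, 20, 30], dtype=float)

# Perfect collinearity means one X is an exact linear function of the other:
# their correlation is +1 or -1, and the matrix [1, X1, X2] loses full column rank.
r = np.corrcoef(x1, x2)[0, 1]
X = np.column_stack([np.ones_like(x1), x1, x2])
full_rank = np.linalg.matrix_rank(X) == X.shape[1]

print(f"corr(X1, X2) = {r:.3f}, full column rank: {full_rank}")
```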
3.4 ESTIMATION
We have specified our model in the previous subsection. We have also stated the assumptions
required in subsection 3.3. Now let us have sample observations on Yi, X1i and X2i and obtain
estimates of the true parameters β0, β1 and β2.
Yi     X1i     X2i
Y1     X11     X21
Y2     X12     X22
Y3     X13     X23
⋮      ⋮       ⋮
Yn     X1n     X2n
The sample regression function (SRF) can be written as
Yi = β̂0 + β̂1X1i + β̂2X2i + êi
where β̂0, β̂1 and β̂2 are estimates of the true parameters β0, β1 and β2, and
êi is the residual term.
Since the true disturbance Ui is unobservable, the estimated regression line is
Ŷi = β̂0 + β̂1X1i + β̂2X2i
As discussed in unit 2, the estimates will be obtained by choosing the values of the unknown
parameters that minimize the sum of squares of the residuals (OLS requires that Σêi² be as
small as possible). Symbolically,
min Σêi² = Σ(Yi − β̂0 − β̂1X1i − β̂2X2i)²
After differentiating, we get the following normal equations:
ΣYi = nβ̂0 + β̂1ΣX1i + β̂2ΣX2i
ΣX1iYi = β̂0ΣX1i + β̂1ΣX1i² + β̂2ΣX1iX2i
ΣX2iYi = β̂0ΣX2i + β̂1ΣX1iX2i + β̂2ΣX2i²
After solving the above normal equations we can obtain values for β̂0, β̂1 and β̂2:
β̂1 = (Σx1iyi·Σx2i² − Σx2iyi·Σx1ix2i) / (Σx1i²·Σx2i² − (Σx1ix2i)²)
β̂2 = (Σx2iyi·Σx1i² − Σx1iyi·Σx1ix2i) / (Σx1i²·Σx2i² − (Σx1ix2i)²)
β̂0 = Ȳ − β̂1X̄1 − β̂2X̄2
Var(β̂0) = σ̂u²·[1/n + (X̄1²·Σx2i² + X̄2²·Σx1i² − 2X̄1X̄2·Σx1ix2i) / (Σx1i²·Σx2i² − (Σx1ix2i)²)]
Var(β̂1) = σ̂u²·Σx2i² / (Σx1i²·Σx2i² − (Σx1ix2i)²)
Var(β̂2) = σ̂u²·Σx1i² / (Σx1i²·Σx2i² − (Σx1ix2i)²)
where σ̂u² = Σêi²/(n − k), k being the total number of parameters that are estimated (in the
above three-variable model, k = 3), and the x's and y's are in deviation form.
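The deviation-form formulas above translate directly into code. The following is a minimal sketch (not part of the original text) for the three-variable model; the function name ols_three_variable and the argument names are illustrative. Passing the yield, fertilizer and rainfall data from the example in section 3.6 to this function should reproduce the estimates obtained there by hand, up to rounding.

```python
import numpy as np

def ols_three_variable(y, x1, x2):
    """OLS for Y = b0 + b1*X1 + b2*X2 + U using the deviation-form formulas above."""
    y, x1, x2 = (np.asarray(v, dtype=float) for v in (y, x1, x2))
    n, k = len(y), 3                                              # k = number of estimated parameters
    yd, x1d, x2d = y - y.mean(), x1 - x1.mean(), x2 - x2.mean()   # deviations from the means
    d = np.sum(x1d**2) * np.sum(x2d**2) - np.sum(x1d * x2d) ** 2  # common denominator
    b1 = (np.sum(x1d * yd) * np.sum(x2d**2) - np.sum(x2d * yd) * np.sum(x1d * x2d)) / d
    b2 = (np.sum(x2d * yd) * np.sum(x1d**2) - np.sum(x1d * yd) * np.sum(x1d * x2d)) / d
    b0 = y.mean() - b1 * x1.mean() - b2 * x2.mean()
    resid = y - (b0 + b1 * x1 + b2 * x2)
    sigma2 = np.sum(resid**2) / (n - k)                           # unbiased estimate of Var(U)
    var_b1 = sigma2 * np.sum(x2d**2) / d
    var_b2 = sigma2 * np.sum(x1d**2) / d
    return b0, b1, b2, var_b1, var_b2
```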
3.5 THE COEFFICIENT OF MULTIPLE DETERMINATION
In unit 2 we saw that the coefficient of determination (r²) measures the goodness of fit of the
regression equation. This notion of r² can be easily extended to regression models containing
more than two variables.
In the three-variable model we would like to know the proportion of the variation in Y
explained by the variables X1 and X2 jointly.
The quantity that gives this information is known as the multiple coefficient of determination.
It is denoted by R², with subscripts indicating the variables whose relationship is being studied.
Example: R²Y.12 shows the percentage of the total variation of Y explained by the
regression plane, that is, by changes in X1 and X2.
R²Y.12 = Σŷi²/Σyi² = 1 − RSS/TSS = 1 − Σêi²/Σyi²
The value of R² lies between 0 and 1. The higher the R², the greater the percentage of the variation
of Y explained by the regression plane, that is, the better the goodness of fit of the regression
plane to the sample observations. The closer R² is to zero, the worse the fit.
The Adjusted R²
Note that as the number of regressors (explanatory variables) increases, the coefficient of
multiple determination will usually increase. To see this, recall the definition of R²:
R² = 1 − RSS/TSS = 1 − Σêi²/Σyi²
Now Σyi² is independent of the number of X variables in the model because it is simply Σ(Yi −
Ȳ)². The residual sum of squares (RSS), Σêi², however, depends on the number of explanatory
variables present in the model. It is clear that as the number of X variables increases, Σêi² is
bound to decrease (at least it will not increase); hence R² will increase. Therefore, in comparing
two regression models with the same dependent variable but differing numbers of X variables,
one should be very wary of choosing the model with the highest R². An explanatory variable
which is not statistically significant may be retained in the model if one looks at R² only.
Therefore, to correct for this defect we adjust R² by taking into account the degrees of freedom,
which clearly decrease as new regressors are introduced into the function:
R̄² = 1 − [Σêi²/(n − k)] / [Σyi²/(n − 1)]
or R̄² = 1 − (1 − R²)·(n − 1)/(n − k)
where k = the number of parameters in the model (including the intercept term)
n = the number of sample observations
R² = the unadjusted multiple coefficient of determination
As the number of explanatory variables increases, the adjusted R² (R̄²) becomes increasingly smaller than the
unadjusted R². The adjusted R² can be negative, although R² is necessarily non-negative;
in that case its value is taken as zero.
If n is large, R̄² and R² will not differ much. But with small samples, if the number of regressors
(X's) is large in relation to the number of sample observations, R̄² will be much smaller than R².
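As a minimal sketch (not from the text), the unadjusted and adjusted R² can be computed from the fitted values of any of the regressions above as follows; y_hat denotes the fitted values Ŷi and k the number of estimated parameters.

```python
import numpy as np

def r_squared(y, y_hat, k):
    """Unadjusted and adjusted R^2; k = number of estimated parameters (incl. intercept)."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    n = len(y)
    rss = np.sum((y - y_hat) ** 2)        # residual sum of squares
    tss = np.sum((y - y.mean()) ** 2)     # total sum of squares
    r2 = 1 - rss / tss
    r2_adj = 1 - (1 - r2) * (n - 1) / (n - k)
    return r2, r2_adj
```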
3.6 TEST OF SIGNIFICANCE IN MULTIPLE REGRESSION
The principle involved in testing multiple regression is identical with that of simple regression.
3.6.1 Hypothesis Testing about Individual Partial Regression Coefficients
We can test whether a particular variable, X1 or X2, is significant or not, holding the other
variable constant. The t test is used to test a hypothesis about any individual partial regression
coefficient. The partial regression coefficient measures the change in the mean value of Y,
E(Y/X1, X2), per unit change in X1, holding X2 constant.
t = β̂i / s(β̂i) ~ t(n − k)      (i = 0, 1, 2, …, k)
This is the observed (or sample) value of the t ratio, which we compare with the theoretical
value of t obtainable from the t table with n − k degrees of freedom.
The theoretical values of t (at the chosen level of significance) are the critical values that define
the critical region in a two-tail test, with n − k degrees of freedom.
Now let us postulate that
H0: βi = 0
H1: βi ≠ 0, or one-sided (βi > 0 or βi < 0)
The null hypothesis states that, holding X2 constant, X1 has no (linear) influence on Y.
If the computed t value exceeds the critical t value at the chosen level of significance, we may
reject the null hypothesis; otherwise, we may accept it (β̂i is not significant at the chosen level of
significance and hence the corresponding regressor does not appear to contribute to the
explanation of the variation in Y).
Look at the following figure.
[Figure: two-tail t test showing the 95% acceptance region around zero and the 2.5% critical (rejection) regions in each tail of the t distribution]
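A minimal sketch (not part of the original text) of this individual t test, using scipy for the critical value; b_hat and se_b stand for the estimate and standard error of one partial regression coefficient.

```python
from scipy import stats

def t_test_coefficient(b_hat, se_b, n, k, alpha=0.05):
    """Two-tail t test of H0: beta_i = 0 against H1: beta_i != 0."""
    t_calc = b_hat / se_b                       # observed t ratio
    t_crit = stats.t.ppf(1 - alpha / 2, n - k)  # critical value with n - k degrees of freedom
    return t_calc, t_crit, abs(t_calc) > t_crit

# Example with the fertilizer coefficient from section 3.6: b1 = 0.0381, s(b1) = 0.0058
print(t_test_coefficient(0.0381, 0.0058, n=7, k=3))   # -> (~6.57, ~2.776, True)
```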
3.6.2 Testing the Overall Significance of the Regression
We can also test the joint hypothesis
H0: β1 = β2 = … = 0 (all the slope coefficients are simultaneously zero)
against the alternative hypothesis
H1: not all β's are zero.
If the null hypothesis is true, then there is no linear relationship between Y and the regressors.
The above joint hypothesis can be tested by the analysis of variance (AOV) technique. The
following table summarizes the idea.

Source of variation            Sum of squares    df       Mean sum of squares
Due to regression (ESS)        Σŷi²              k − 1    Σŷi²/(k − 1)
Due to residuals (RSS)         Σêi²              n − k    Σêi²/(n − k)
Total (total variation)        Σyi²              n − 1
Therefore, to undertake the test, first find the calculated value of F and compare it with the
tabulated F. The calculated value of F can be obtained by using the following formula:
F = [Σŷi²/(k − 1)] / [Σêi²/(n − k)] = [R²/(k − 1)] / [(1 − R²)/(n − k)] ~ F(k − 1, n − k)
[Figure: F distribution with the 5% critical region in the right tail]
When R² = 0, F is zero. The larger the R², the greater the F value. In the limit, when R² = 1, F is
infinite. Thus the F test, which is a measure of the overall significance of the estimated
regression, is also a test of significance of R². Testing the null hypothesis is equivalent to
testing the hypothesis that (the population) R² is zero.
The F test expressed in terms of R² is convenient for computation.
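A minimal sketch (not from the text) of the overall F test computed from R², following the formula above; n is the sample size and k the number of estimated parameters.

```python
from scipy import stats

def f_test_overall(r2, n, k, alpha=0.05):
    """Test H0: all slope coefficients are zero, using F = [R^2/(k-1)] / [(1-R^2)/(n-k)]."""
    f_calc = (r2 / (k - 1)) / ((1 - r2) / (n - k))
    f_crit = stats.f.ppf(1 - alpha, k - 1, n - k)   # F(k-1, n-k) critical value
    return f_calc, f_crit, f_calc > f_crit

# Example with the wheat-yield regression of section 3.6 (R^2 = 0.98, n = 7, k = 3)
print(f_test_overall(0.98, n=7, k=3))   # -> (~98.0, ~6.94, True)
```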
Example: Suppose we have data on wheat yield (Y), amount of rainfall (X2), and amount of
fertilizer applied (X1). It is assumed that the fluctuations in yield can be explained by varying
levels of rainfall and fertilizer.
Table 3.6.1
(1)       (2)          (3)         (4)    (5)    (6)    (7)      (8)     (9)      (10)      (11)
Yield     Fertilizer   Rainfall    yi     x1i    x2i    x1iyi    x2iyi   x1ix2i   x1i²      x2i²
(Y)       (X1)         (X2)
40        100          10          −20    −300   −10    6000     200     3000     90,000    100
50        200          20          −10    −200     0    2000       0        0     40,000      0
50        300          10          −10    −100   −10    1000     100     1000     10,000    100
70        400          30           10       0    10       0     100        0          0    100
65        500          20            5     100     0     500       0        0     10,000      0
65        600          20            5     200     0    1000       0        0     40,000      0
80        700          30           20     300    10    6000     200     3000     90,000    100
420       2800         140           0       0     0   16,500    600     7000    280,000    400
Ȳ = 60    X̄1 = 400     X̄2 = 20    (means; columns 4 to 11 are in deviation form)
1. Find the OLS estimators (i.e., β̂0, β̂1 and β̂2)
Solution: The formulas for β̂1, β̂2 and β̂0 are
β̂1 = (Σx1iyi·Σx2i² − Σx2iyi·Σx1ix2i) / (Σx1i²·Σx2i² − (Σx1ix2i)²)
β̂2 = (Σx2iyi·Σx1i² − Σx1iyi·Σx1ix2i) / (Σx1i²·Σx2i² − (Σx1ix2i)²)
β̂0 = Ȳ − β̂1X̄1 − β̂2X̄2
where the x's and y's are in deviation form.
Now find the deviations of the observations from their mean values (columns 4 to 11 in the
above table).
The next step will be to insert the following values (in deviation form) into the above formulas:
Σx1iyi = 16,500, Σx2i² = 400, Σx2iyi = 600, Σx1ix2i = 7000, Σx1i² = 280,000, Σyi² = 1150
β̂1 = (16,500 × 400 − 600 × 7000) / (280,000 × 400 − 7000²) = 2,400,000 / 63,000,000 = 0.0381
β̂2 = (600 × 280,000 − 16,500 × 7000) / (280,000 × 400 − 7000²) = 52,500,000 / 63,000,000 = 0.833
Now β̂0 = Ȳ − β̂1X̄1 − β̂2X̄2
= 60 − (0.0381)(400) − (0.833)(20)
= 28.1
Hence the estimated function is written as follows:
Ŷ = 28.1 + 0.0381X1 + 0.833X2
2. Find the variances of β̂1 and β̂2
Solution
Var(β̂1) = σ̂u²·Σx2i² / (Σx1i²·Σx2i² − (Σx1ix2i)²),  Var(β̂2) = σ̂u²·Σx1i² / (Σx1i²·Σx2i² − (Σx1ix2i)²)
σ̂u² = Σêi²/(n − k), where êi = Yi − Ŷi
Var(β̂1) = 0.000034
s(β̂1) = √0.000034 = 0.0058
s(β̂2) = 0.1543
R² = (β̂1Σx1iyi + β̂2Σx2iyi) / Σyi²
= (0.0381 × 16,500 + 0.833 × 600) / 1150 = 0.98
Interpretation: 98% of the variation in yield is due to the regression plane (i.e., because of
variation in the amount of fertilizer and rainfall). The model is a good fit.
3. Test the significance of the estimates (α = 0.05)
(a) H0: β1 = 0
H1: β1 ≠ 0
tcalculated = β̂1/s(β̂1) = 0.0381/0.0058 = 6.57
ttabulated = t0.025(n − k) = t0.025(4) = 2.78, which can be found from the statistical table (t distribution)
Decision: Since tcalculated > ttabulated, we reject H0.
That is, β̂1 is statistically significant: the variable X1 (fertilizer) significantly affects yield.
(b) H0: β2 = 0
H1: β2 ≠ 0
α = 0.05
tcalculated = β̂2/s(β̂2) = 0.833/0.1543 = 5.3986
ttabulated = t0.025(4) = 2.78
Decision: Since tcalculated > ttabulated, we reject H0. β̂2 is statistically significant: rainfall (X2)
significantly affects yield.
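The whole worked example can be checked by machine. The following sketch (not part of the original text) re-estimates the wheat-yield equation with NumPy and reproduces, up to rounding, the estimates, standard errors, R² and t ratios computed by hand above.

```python
import numpy as np

y  = np.array([40, 50, 50, 70, 65, 65, 80], dtype=float)          # yield
x1 = np.array([100, 200, 300, 400, 500, 600, 700], dtype=float)   # fertilizer
x2 = np.array([10, 20, 10, 30, 20, 20, 30], dtype=float)          # rainfall

X = np.column_stack([np.ones_like(y), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)          # OLS estimates b0, b1, b2
resid = y - X @ b
n, k = len(y), X.shape[1]
sigma2 = resid @ resid / (n - k)                   # unbiased estimate of Var(U)
cov_b = sigma2 * np.linalg.inv(X.T @ X)            # variance-covariance matrix of the estimates
se_b = np.sqrt(np.diag(cov_b))
r2 = 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

print(b)                 # approx [28.1, 0.0381, 0.833]
print(se_b[1:])          # approx [0.0058, 0.154]
print(r2)                # approx 0.98
print(b[1:] / se_b[1:])  # t ratios, approx [6.6, 5.4]
```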
3.9 SUMMARY
Yi = β0 + β1X1i + β2X2i + Ui
Assumptions of the model
1. Zero mean value of Ui: E(Ui/X1i, X2i) = 0 for each i.
2. Homoscedasticity: Var(Ui) = E(Ui²) = σu²
3. Normality: Ui ~ N(0, σu²)
4. No serial correlation (serial independence of the U's): Cov(Ui, Uj) = 0 for i ≠ j
5. Independence of Ui and Xi: Cov(Ui, X1i) = Cov(Ui, X2i) = 0
6. No collinearity between the X variables (No multicollinearity)
7. Correct specification of the model
Formulas for the parameters
β̂1 = (Σx1iyi·Σx2i² − Σx2iyi·Σx1ix2i) / (Σx1i²·Σx2i² − (Σx1ix2i)²)
β̂2 = (Σx2iyi·Σx1i² − Σx1iyi·Σx1ix2i) / (Σx1i²·Σx2i² − (Σx1ix2i)²)
β̂0 = Ȳ − β̂1X̄1 − β̂2X̄2
Var(β̂0) = σ̂u²·[1/n + (X̄1²·Σx2i² + X̄2²·Σx1i² − 2X̄1X̄2·Σx1ix2i) / (Σx1i²·Σx2i² − (Σx1ix2i)²)]
Var(β̂1) = σ̂u²·Σx2i² / (Σx1i²·Σx2i² − (Σx1ix2i)²)
Var(β̂2) = σ̂u²·Σx1i² / (Σx1i²·Σx2i² − (Σx1ix2i)²)
R² = (β̂1Σx1iyi + β̂2Σx2iyi)/Σyi² = 1 − Σêi²/Σyi²
The Adjusted R²
R̄² = 1 − [Σêi²/(n − k)] / [Σyi²/(n − 1)]  or  R̄² = 1 − (1 − R²)·(n − 1)/(n − k)
The partial regression coefficient measures the change in the mean value of Y, E(Y/X1, X2), per
unit change in X1, holding X2 constant.
Hypothesis Testing about Individual Partial Regression Coefficients
t = β̂i / s(β̂i) ~ t(n − k)      (i = 0, 1, 2, …, k)
H0: βi = 0
H1: βi ≠ 0, or one-sided (βi > 0 or βi < 0)
Decision Rule: If Fcalculated > Ftabulated (F(k − 1, n − k)), reject H0; otherwise you may accept it, where
F = [R²/(k − 1)] / [(1 − R²)/(n − k)]
The F test expressed in terms of R² is convenient for computation.
The 95% confidence interval for Y0 (the forecast value) can be given by making use of
t = (Ŷ0 − Y0) / se(Ŷ0 − Y0) ~ t(n − k)
P(−tα/2 < t < tα/2) = 1 − α
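A minimal sketch (not from the text) of such a 95% forecast interval for an individual value Y0, reusing the design matrix X, coefficient vector b and variance estimate sigma2 from the wheat-yield sketch above; the new regressor values x1_0 and x2_0 are placeholders.

```python
import numpy as np
from scipy import stats

def forecast_interval(x1_0, x2_0, X, b, sigma2, alpha=0.05):
    """Point forecast and (1 - alpha) interval for an individual Y0 at (x1_0, x2_0)."""
    n, k = X.shape
    x0 = np.array([1.0, x1_0, x2_0])
    y0_hat = x0 @ b
    # Forecast-error variance: Var(Y0 - Y0_hat) = sigma2 * (1 + x0' (X'X)^-1 x0)
    se_f = np.sqrt(sigma2 * (1 + x0 @ np.linalg.inv(X.T @ X) @ x0))
    t_crit = stats.t.ppf(1 - alpha / 2, n - k)
    return y0_hat, (y0_hat - t_crit * se_f, y0_hat + t_crit * se_f)
```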
Partial Correlation Coefficients
r12.3 shows whether, holding X3 constant, there is a positive or negative association between Y and X2.
The method of maximum likelihood (ML)
The method of maximum likelihood, as the name indicates, consists in estimating the unknown
parameters in such a manner that the probability of observing the given Y’s is as high (or
maximum) as possible.
The ML estimators of the regression coefficients (the β's) are the same as the OLS estimators.
The ML estimator of σu² is biased: it divides the residual sum of squares by n rather than by n − k.
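A minimal sketch (not from the text) contrasting the two variance estimators; resid stands for any vector of OLS residuals and k for the number of estimated parameters.

```python
import numpy as np

def sigma2_estimates(resid, k):
    """ML divides by n (biased in small samples); OLS divides by n - k (unbiased)."""
    resid = np.asarray(resid, dtype=float)
    n = len(resid)
    return {"ml": np.sum(resid**2) / n, "ols": np.sum(resid**2) / (n - k)}
```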