
UNIT 3: MULTIPLE LINEAR REGRESSION

Contents
3.0 Aims and Objectives
3.1 Introduction
3.2 Specification of the Model
3.3 Assumptions
3.4 Estimation
3.5 The Coefficient of Multiple Determination
3.6 Test of Significance in Multiple Regression
3.7 Forecasting Based on Multiple Regression
3.9 Summary
3.10 Answers to Check Your Progress
3.11 References.
3.12 Model Examination Question

3.0 AIMS AND OBJECTIVES

The purpose of this unit is to introduce you to the concept of the multiple linear regression
model and to show how the method of OLS can be extended to estimate the parameters of such
models.
After covering this unit you will be able to:
- understand the specification of the multiple linear regression model
- estimate the parameters of the model
- understand the meaning of partial correlation and the multiple coefficient of determination
- undertake tests of significance in multiple regression
- make forecasts based on multiple regression

3.1 INTRODUCTION

We studied the two-variable model extensively in the previous unit. But in economics you will
hardly ever find that a variable is affected by only one explanatory variable. For example, the
demand for a commodity depends on the price of the commodity itself, the prices of competing
or complementary goods, the income of the consumer, the number of consumers in the market,
and so on. Hence the two-variable model is often inadequate in practical work, and we need to
discuss multiple regression models. Multiple linear regression is concerned with the
relationship between a dependent variable (Y) and two or more explanatory variables
(X1, X2, …, Xn).

3.2 SPECIFICATION OF THE MODEL

Let us start our discussion with the simplest multiple regression model, i.e., a model with two
explanatory variables.
Y = f(X1, X2)
Example: The demand for a commodity may be influenced not only by the price of the commodity
but also by the consumer's income.
Since the theory does not specify the mathematical form of the demand function, we assume
the relationship between Y, X1, and X2 is linear. Hence we may write the three variable
Population Regression Function (PRF) as follows:
Yi = o + 1X1i + 2X2i +Ui
where Y is the quantity demanded,
X1 and X2 are the price and income respectively,
β0 is the intercept term,
β1 is the coefficient of X1 and its expected sign is negative
(remember the law of demand), and
β2 is the coefficient of X2 and its expected sign is positive, assuming that the good is a
normal good.
The coefficients β1 and β2 are called the partial regression coefficients. We will discuss the
meaning of these coefficients later.
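As an illustration only, the following Python sketch simulates data from a PRF of this form. The parameter values (β0 = 28, β1 = 0.04, β2 = 0.8), the ranges of X1 and X2, and the sample size are arbitrary assumptions chosen purely for the demonstration; they are not taken from the text.

import numpy as np

rng = np.random.default_rng(0)
n = 100                                   # sample size (assumed)
beta0, beta1, beta2 = 28.0, 0.04, 0.8     # assumed "true" parameters

X1 = rng.uniform(100, 700, size=n)        # first explanatory variable (e.g., price)
X2 = rng.uniform(10, 30, size=n)          # second explanatory variable (e.g., income)
U = rng.normal(0, 2.0, size=n)            # disturbance: zero mean, constant variance

Y = beta0 + beta1 * X1 + beta2 * X2 + U   # the population regression function plus error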

3.3 ASSUMPTIONS

To complete the specification of our simple model we need some assumptions about the
random variable U. These assumptions are the same as those assumptions already explained in
the two-variables model in unit 2.
Assumptions of the model
1. Zero mean value of Ui
The random variable U has a zero mean value for each Xi
E(Ui/X1i, X2i) = 0 for each i.
2. Homoscedasticity
The variance of each Ui is the same for all the Xi values
Var(Ui) = E(Ui²) = σ²
3. Normality
The values of each Ui are normally distributed
Ui ~ N(0, σ²)
4. No serial correlation (serial independence of the U’s)
The values of Ui (corresponding to Xi) are independent from the values of any other Uj
(corresponding to Xj).
Cov(Ui, Uj) = 0 for i ≠ j
5. Independence of Ui and Xi
Every disturbance term Ui is independent of the explanatory variables. That is, there is zero
covariance between Ui and each X variable.
Cov(Ui, X1i) = Cov(Ui, X2i) = 0
or E(UiX1i) = E(UiX2i) = 0

Here the values of the X's are a set of fixed numbers in all hypothetical samples (refer to the
assumptions of OLS in unit 2).
6. No collinearity between the X variables (no multicollinearity). The explanatory variables
are not perfectly linearly correlated; there is no exact linear relationship between X1 and
X2.

7. Correct specification of the model


The model has no specification error in that all the important explanatory variables appear
explicitly in the function and the mathematical form is correctly defined.
The rationale for the above assumptions is the same as in unit 2.

3.4 ESTIMATION

We specified our model in the previous subsection and stated the required assumptions in
subsection 3.3. Now suppose we have sample observations on Yi, X1i and X2i and wish to obtain
estimates b0, b1 and b2 of the true parameters β0, β1 and β2.
Yi    X1i    X2i
Y1    X11    X21
Y2    X12    X22
Y3    X13    X23
⋮     ⋮      ⋮
Yn    X1n    X2n
The sample regression function (SRF) can be written as
Yi = b0 + b1X1i + b2X2i + ei
where b0, b1 and b2 are estimates of the true parameters β0, β1 and β2, and ei is the
residual term.
Since Ui is unobservable, the estimated regression line is
Ŷi = b0 + b1X1i + b2X2i
As discussed in unit 2, the estimates are obtained by choosing the values of the unknown
parameters that minimize the sum of squared residuals (OLS requires that Σei² be as small as
possible). Symbolically,

Min Σei² = Σ(Yi − Ŷi)² = Σ(Yi − b0 − b1X1i − b2X2i)²


A necessary condition for a minimum is that the partial derivatives of the above
expression with respect to the unknowns (i.e., b0, b1 and b2) are set to zero.

After differentiating, we get the following normal equations:
ΣYi = n·b0 + b1ΣX1i + b2ΣX2i
ΣX1iYi = b0ΣX1i + b1ΣX1i² + b2ΣX1iX2i
ΣX2iYi = b0ΣX2i + b1ΣX1iX2i + b2ΣX2i²
Solving the above normal equations, we obtain

b1 = (Σx1iyi·Σx2i² − Σx2iyi·Σx1ix2i) / (Σx1i²·Σx2i² − (Σx1ix2i)²)

b2 = (Σx2iyi·Σx1i² − Σx1iyi·Σx1ix2i) / (Σx1i²·Σx2i² − (Σx1ix2i)²)

b0 = Ȳ − b1X̄1 − b2X̄2

where the variables x and y are in deviation form, i.e., yi = Yi − Ȳ, x1i = X1i − X̄1, x2i = X2i − X̄2
Note: The values of the parameter estimates (b0, b1 and b2) can also be obtained by using
other methods (e.g., Cramer's rule).
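The deviation-form formulas above translate directly into code. The following is a minimal Python/numpy sketch; the function name and argument layout are illustrative, not part of the original text, and Y, X1, X2 are assumed to be numpy arrays of equal length.

import numpy as np

def ols_two_regressors(Y, X1, X2):
    # work with deviations from the sample means
    y = Y - Y.mean()
    x1 = X1 - X1.mean()
    x2 = X2 - X2.mean()
    delta = (x1**2).sum() * (x2**2).sum() - (x1 * x2).sum()**2   # common denominator
    b1 = ((x1 * y).sum() * (x2**2).sum() - (x2 * y).sum() * (x1 * x2).sum()) / delta
    b2 = ((x2 * y).sum() * (x1**2).sum() - (x1 * y).sum() * (x1 * x2).sum()) / delta
    b0 = Y.mean() - b1 * X1.mean() - b2 * X2.mean()               # intercept from the means
    return b0, b1, b2

The same estimates can also be obtained by least squares on the full design matrix (e.g., np.linalg.lstsq), which is the usual route when there are more than two regressors.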
The Mean and Variance of the Parameter Estimates
The mean of the estimates of the parameters in the three-variable model is derived in the same
way as in the two-variable model.
The mean of b0, b1 and b2:
E(b0) = β0, E(b1) = β1, E(b2) = β2
The estimates are unbiased estimators of the true parameters of the relationship between Y, X1
and X2: the expected value of each estimate is the true parameter itself.
The variance of b0, b1 and b2:
The variances are obtained by using the following formulae

Var(b0) = σ̂²·[1/n + (X̄1²Σx2i² + X̄2²Σx1i² − 2X̄1X̄2Σx1ix2i) / (Σx1i²Σx2i² − (Σx1ix2i)²)]

Var(b1) = σ̂²·Σx2i² / (Σx1i²Σx2i² − (Σx1ix2i)²)

Var(b2) = σ̂²·Σx1i² / (Σx1i²Σx2i² − (Σx1ix2i)²)
where σ̂² = Σei²/(n − k), k being the total number of parameters that are estimated (in the
above case, the three-variable model, k = 3), and x1 and x2 are in deviation form.
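As a sketch of how these variance formulas might be computed, the function below assumes Y, X1, X2 are numpy arrays and that b0, b1, b2 come from an earlier estimation step (such as the hypothetical ols_two_regressors sketch above).

import numpy as np

def ols_variances(Y, X1, X2, b0, b1, b2):
    n, k = len(Y), 3                                   # k parameters estimated
    e = Y - (b0 + b1 * X1 + b2 * X2)                   # residuals e_i = Y_i - Yhat_i
    sigma2_hat = (e**2).sum() / (n - k)                # estimate of the disturbance variance
    x1 = X1 - X1.mean()
    x2 = X2 - X2.mean()
    delta = (x1**2).sum() * (x2**2).sum() - (x1 * x2).sum()**2
    var_b1 = sigma2_hat * (x2**2).sum() / delta
    var_b2 = sigma2_hat * (x1**2).sum() / delta
    return sigma2_hat, var_b1, var_b2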

3.5 THE MULTIPLE COEFFICIENT OF DETERMINATION

In unit 2 we saw that the coefficient of determination (r²) measures the goodness of fit of the
regression equation. This notion of r² can easily be extended to regression models containing
more than two variables.
In the three-variable model we would like to know the proportion of the variation in Y
explained by the variables X1 and X2 jointly.
The quantity that gives this information is known as the multiple coefficient of determination.
It is denoted by R², with subscripts indicating the variables whose relationship is being studied.

Example: R²Y.X1X2 (the R² of Y on X1 and X2) shows the percentage of the total variation of Y
explained by the regression plane, that is, by changes in X1 and X2.

R² = 1 − RSS/TSS

where: RSS – residual sum of squares
TSS – total sum of squares
Recall that
ŷi = b1x1i + b2x2i (the variables are in deviation form)
yi = ŷi + ei
Σei² = Σ(yi − ŷi)² = Σ(yi − b1x1i − b2x2i)²
or Σei² = Σei·ei = Σei(yi − b1x1i − b2x2i)
= Σeiyi − b1Σeix1i − b2Σeix2i
but Σeix1i = Σeix2i = 0
Hence Σei² = Σeiyi
= Σ(yi − ŷi)yi, since ei = yi − ŷi
= Σyi(yi − b1x1i − b2x2i) = Σyi² − b1Σx1iyi − b2Σx2iyi
By substituting this value of Σei² into the formula for R², we get

R² = 1 − (Σyi² − b1Σx1iyi − b2Σx2iyi)/Σyi²

= (b1Σx1iyi + b2Σx2iyi)/Σyi², where x1i, x2i and yi are in their deviation forms.

The value of R² lies between 0 and 1. The higher the R², the greater the percentage of the
variation in Y explained by the regression plane, that is, the better the goodness of fit of the
regression plane to the sample observations. The closer R² is to zero, the worse the fit.
The Adjusted R²
Note that as the number of regressors (explanatory variables) increases, the coefficient of
multiple determination will usually increase. To see this, recall the definition of R²:

R² = 1 − Σei²/Σyi²

Now yi2 is independent of the number of X variables in the model because it is simply (yi -
)2. The residual sum of squares (RSS), Ui2, however depends on the number of explanatory
variables present in the model. It is clear that as the number of X variables increases, Ui2 is
bound to decrease (at least it will not increase), hence R 2 will increase. Therefore, in comparing
two regression models with the same dependent variable but differing number of X variables,
one should be very wary of choosing the model with the highest R 2. An explanatory variable
which is not statistically significant may be retained in the model if one looks at R 2 only.
Therefore, to correct for this defect we adjust R 2 by taking into account the degrees of freedom,
which clearly decrease as new repressors are introduced in the function

R̄² = 1 − (Σei²/(n − k)) / (Σyi²/(n − 1))

or R̄² = 1 − (1 − R²)·(n − 1)/(n − k)
where k = the number of parameters in the model (including the intercept term)
n = the number of sample observations
R² = the unadjusted multiple coefficient of determination
As the number of explanatory variables increases, the adjusted R² becomes increasingly smaller
than the unadjusted R². The adjusted R² (R̄²) can be negative, although R² is necessarily
non-negative; in that case its value is taken as zero.
If n is large, R̄² and R² will not differ much. But with small samples, if the number of regressors
(X's) is large in relation to the number of sample observations, R̄² will be much smaller than R².
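A minimal Python sketch of both measures follows; the residual array e and the number of estimated parameters k are assumed to come from an earlier estimation step, and the function name is illustrative.

import numpy as np

def r_squared(Y, e, k):
    """Unadjusted and adjusted R-squared from residuals e and k estimated parameters."""
    n = len(Y)
    tss = ((Y - Y.mean())**2).sum()     # total sum of squares
    rss = (e**2).sum()                  # residual sum of squares
    r2 = 1.0 - rss / tss
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - k)
    return r2, r2_adj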

3.6 TEST OF SIGNIFICANCE IN MULTIPLE REGRESSION

The principle involved in testing multiple regression is identical with that of simple regression.

3.6.1 Hypothesis Testing about Individual Partial Regression Coefficients
We can test whether a particular variable, X1 or X2, is significant while holding the other
variable constant. The t test is used to test a hypothesis about any individual partial regression
coefficient. The partial regression coefficient measures the change in the mean value of Y,
E(Y|X1, X2), per unit change in X1, holding X2 constant.

t = bi/S(bi) ~ t(n − k)   (i = 0, 1, 2, …, k)

This is the observed (or sample) value of the t ratio, which we compare with the theoretical
value of t obtainable from the t-table with n – k degrees of freedom.
The theoretical values of t (at the chosen level of significance) are the critical values that define
the critical region in a two-tail test, with n – k degrees of freedom.
Now let us postulate that
H0: β1 = 0
H1: β1 ≠ 0, or one-sided (β1 > 0 or β1 < 0)
The null hypothesis states that, holding X2 constant, X1 has no (linear) influence on Y.
If the computed t value exceeds the critical t value at the chosen level of significance, we may
reject the null hypothesis; otherwise, we may accept it (b1 is not significant at the chosen level
of significance and hence the corresponding regressor does not appear to contribute to the
explanation of the variations in Y).
For example, assume α = 0.05; then the critical value is t0.025 = 2.179 for 12 df.

Fig 3.6.1 The 95% acceptance region for t (12 df, α = 0.05): the acceptance region covers the
central 95% of the t distribution between −2.179 and +2.179, with critical regions of 2.5% in
each tail.


Note that the greater the calculated value of t, the stronger the evidence that bi is significant.
For degrees of freedom higher than 8, the critical value of t (at the 5% level of
significance) for rejection of the null hypothesis is approximately 2.
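The decision rule for an individual coefficient can be sketched as follows in Python (scipy supplies the critical value; the inputs b, se, n and k are placeholders to be filled with the estimate, its standard error, the sample size and the number of parameters).

from scipy import stats

def t_test(b, se, n, k, alpha=0.05):
    """Two-tailed t test of H0: beta_i = 0 against H1: beta_i != 0."""
    t_calc = b / se                                   # observed t ratio
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - k)     # critical value with n - k df
    reject = abs(t_calc) > t_crit
    return t_calc, t_crit, reject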
3.6.2 Testing the Overall Significance of a Regression
This test aims at finding out whether the explanatory variables (X1, X2, …, Xk) jointly have
any significant influence on the dependent variable. The test of the overall significance of the
regression implies testing the null hypothesis

H0: β1 = β2 = … = βk = 0
against the alternative hypothesis
H1: not all β's are zero.
If the null hypothesis is true, then there is no linear relationship between y and the regressors.
The above joint hypothesis can be tested by the analysis of variance (AOV) technique. The
following table summarizes the idea.

Source of variation          Sum of squares (SS)   Degrees of freedom (df)   Mean square (MSS)
Due to regression (ESS)      Σŷi²                  k − 1                     ESS/(k − 1)
Due to residual (RSS)        Σei²                  n − k                     RSS/(n − k)
Total (total variation, TSS) Σyi²                  n − 1
Therefore, to undertake the test, first find the calculated value of F and compare it with the
tabulated F. The calculated value of F can be obtained from the following formula:

F = [ESS/(k − 1)] / [RSS/(n − k)], which follows the F distribution with k − 1 and n − k df,

where k − 1 refers to the degrees of freedom of the numerator,
n − k refers to the degrees of freedom of the denominator, and
k is the number of parameters estimated.
Decision Rule: If Fcalculated > Ftabulated (Fα(k − 1, n − k)), reject H0; otherwise you may accept it,
where Fα(k − 1, n − k) is the critical F value at the α level of significance with (k − 1) numerator
df and (n − k) denominator df.
Note that there is a relationship between the coefficient of determination R² and the F test used
in the analysis of variance:

F = [R²/(k − 1)] / [(1 − R²)/(n − k)]

(Figure: the F distribution, with the 5% rejection region in its upper tail.)
When R² = 0, F is zero. The larger R² is, the greater the F value. In the limit, when R² = 1, F is
infinite. Thus the F test, which is a measure of the overall significance of the estimated
regression, is also a test of the significance of R². Testing the null hypothesis that all slope
coefficients are zero is equivalent to testing the null hypothesis that (the population) R² is zero.
The F test expressed in terms of R² is easy to compute.
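A sketch of the overall-significance test computed from R² (scipy provides the critical value; the argument names are illustrative).

from scipy import stats

def overall_f_test(r2, n, k, alpha=0.05):
    """F test of H0: all slope coefficients are zero, computed from R-squared."""
    f_calc = (r2 / (k - 1)) / ((1 - r2) / (n - k))
    f_crit = stats.f.ppf(1 - alpha, dfn=k - 1, dfd=n - k)   # upper-tail critical value
    return f_calc, f_crit, f_calc > f_crit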

3.6.3 The Confidence Interval for βi

This has been discussed in unit 2. The (1 − α)100% confidence interval for βi is given by
bi ± tα/2·S(bi),   (i = 0, 1, 2, 3, …, k)

Example: Suppose we have data on wheat yield (Y), the amount of fertilizer applied (X1), and the
amount of rainfall (X2). It is assumed that the fluctuations in yield can be explained by varying
levels of fertilizer and rainfall.
Table 3.6.1
        Yield (Y)    Fertilizer (X1)    Rainfall (X2)
          40              100                 10
          50              200                 20
          50              300                 10
          70              400                 30
          65              500                 20
          65              600                 20
          80              700                 30
Sum      420             2800                140
Mean      60              400                 20
(The deviation and product columns yi, x1i, x2i, x1iyi, x2iyi, x1ix2i, x1i², x2i² are computed
from these figures and used below.)
1. Find the OLS estimators (i.e., b0, b1 and b2)
Solution: The formulas for b1, b2 and b0 are
b1 = (Σx1iyi·Σx2i² − Σx2iyi·Σx1ix2i) / (Σx1i²·Σx2i² − (Σx1ix2i)²)
b2 = (Σx2iyi·Σx1i² − Σx1iyi·Σx1ix2i) / (Σx1i²·Σx2i² − (Σx1ix2i)²)
b0 = Ȳ − b1X̄1 − b2X̄2
where the x's and y's are in deviation form.

Now find the deviations of the observations from their mean values (the deviation and product
columns of the table above).
The next step is to insert the following values (in deviation form) into the above formulas:
Σx1iyi = 16,500, Σx2i² = 400, Σx2iyi = 600, Σx1ix2i = 7,000, Σx1i² = 280,000, Σyi² = 1,150
b1 = (16,500 × 400 − 600 × 7,000) / (280,000 × 400 − (7,000)²) = 2,400,000/63,000,000 = 0.0381
b2 = (600 × 280,000 − 16,500 × 7,000) / 63,000,000 = 52,500,000/63,000,000 = 0.833
Now b0 = Ȳ − b1X̄1 − b2X̄2
= 60 − (0.0381)(400) − (0.833)(20)
= 28.1
Hence the estimated function is written as follows:

Ŷ = 28.1 + 0.0381X1 + 0.833X2
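The whole calculation can be reproduced numerically. The following Python sketch uses the data of Table 3.6.1 and should give approximately b0 ≈ 28.1, b1 ≈ 0.0381 and b2 ≈ 0.833 (small differences are rounding).

import numpy as np

Y  = np.array([40, 50, 50, 70, 65, 65, 80], dtype=float)
X1 = np.array([100, 200, 300, 400, 500, 600, 700], dtype=float)   # fertilizer
X2 = np.array([10, 20, 10, 30, 20, 20, 30], dtype=float)          # rainfall

y, x1, x2 = Y - Y.mean(), X1 - X1.mean(), X2 - X2.mean()          # deviation form
delta = (x1**2).sum() * (x2**2).sum() - (x1 * x2).sum()**2
b1 = ((x1 * y).sum() * (x2**2).sum() - (x2 * y).sum() * (x1 * x2).sum()) / delta
b2 = ((x2 * y).sum() * (x1**2).sum() - (x1 * y).sum() * (x1 * x2).sum()) / delta
b0 = Y.mean() - b1 * X1.mean() - b2 * X2.mean()
print(b0, b1, b2)   # roughly 28.10, 0.0381, 0.8333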
2. Find the variances of b1 and b2
Solution

Var(b1) = σ̂²Σx2i²/Δ,   Var(b2) = σ̂²Σx1i²/Δ,   where Δ = Σx1i²Σx2i² − (Σx1ix2i)²

In order to use the above formulas we need to find σ̂² = Σei²/(n − k), where ei = Yi − Ŷi.

Example: Ŷ1 = 28.1 + 0.0381(100) + 0.833(10) = 40.24, so e1 = 40 − 40.24 = −0.24 and
e1² = (0.24)² = 0.0576.
Similarly, e2 = Y2 − Ŷ2, e3 = Y3 − Ŷ3, and so on.
Therefore Σei² = Σ(Yi − Ŷi)² ≈ 21.43

Hence σ̂² = Σei²/(n − k) = 21.43/4 ≈ 5.357

Var(b1) = (5.357)(400)/63,000,000 = 0.000034

S(b1) = √0.000034 = 0.0058

Var(b2) = (5.357)(280,000)/63,000,000 = 0.0238

S(b2) = √0.0238 = 0.1543

3. Find R² (the coefficient of determination)

R² = (b1Σx1iyi + b2Σx2iyi)/Σyi²

= (0.0381 × 16,500 + 0.833 × 600)/1,150 = 1,128.45/1,150 = 0.98
Interpretation: 98% of the variation in yield is explained by the regression plane (i.e., by
variation in the amounts of fertilizer and rainfall). The model is a good fit.

4. Test (a) the null hypothesis H0: β1 = 0 against the alternative hypothesis H1: β1 ≠ 0,
with α = 0.05.

t = b1/S(b1) = 0.0381/0.0058 = 6.5689 (calculated value)

ttabulated = t0.025(4) = 2.78, found from the statistical table (t-distribution)
Decision: Since tcalculated > ttabulated, we reject H0.
That is, b1 is statistically significant: the variable X1 (fertilizer) significantly affects yield.
(b) H0: β2 = 0
H1: β2 ≠ 0
α = 0.05

tcalculated = b2/S(b2) = 0.833/0.1543 = 5.3986

ttabulated = t0.025(4) = 2.78

Decision: Since tcalculated > ttabulated, we reject H0. b2 is statistically significant.

5. Construct the 95% confidence interval for β1

b1 − tα/2(n − k)·S(b1) < β1 < b1 + tα/2(n − k)·S(b1)
0.0381 − t0.025(4)(0.0058) < β1 < 0.0381 + t0.025(4)(0.0058)
0.0381 − 2.78(0.0058) < β1 < 0.0381 + 2.78(0.0058)
0.0219 < β1 < 0.0542
Interpretation: In repeated sampling, intervals constructed in this way will contain the true
population parameter β1 in 95 out of 100 cases; here the interval runs from 0.0219 to 0.0542.
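A short sketch of the same interval in Python (scipy supplies the critical value; b1 and its standard error are taken from the computations above).

from scipy import stats

b1, se_b1, n, k = 0.0381, 0.0058, 7, 3
t_crit = stats.t.ppf(0.975, df=n - k)          # about 2.776 for 4 df
lower, upper = b1 - t_crit * se_b1, b1 + t_crit * se_b1
print(lower, upper)                            # roughly (0.022, 0.054)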
Note: The coefficients of X1 and X2 (b1 and b2) measure partial effects. For example, b1
measures the rate of change of Y with respect to X1 while X2 is held constant.
6. Test the joint significance of the explanatory variables
H0: β1 = β2 = 0
H1: H0 is not true.
The test statistic is the F-statistic:
Fcal = [R²/(k − 1)] / [(1 − R²)/(n − k)] = (0.98/2)/(0.02/4) = 98

Assuming α = 0.05, the critical value F(2, 4) at the 5% level of significance is 6.94.

Decision: we reject H0 since Fcal > Ftab. We conclude that the regression is significant: not all β's
are zero.
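The same decision can be checked numerically with a tiny sketch that reuses the figures above.

from scipy import stats

r2, n, k = 0.98, 7, 3
f_calc = (r2 / (k - 1)) / ((1 - r2) / (n - k))          # = 98
f_crit = stats.f.ppf(0.95, dfn=k - 1, dfd=n - k)        # about 6.94
print(f_calc > f_crit)                                  # True: reject H0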

3.9 SUMMARY

Multiple linear regression Model


The population regression function (PRF):
Yi = β0 + β1X1i + β2X2i + Ui
The sample regression function (SRF):
Yi = b0 + b1X1i + b2X2i + ei
Assumptions of the model
1. Zero mean value of Ui: E(Ui/X1i, X2i) = 0 for each i.
2. Homoscedasticity: Var(Ui) = E(Ui²) = σ²
3. Normality: Ui ~ N(0, σ²)
4. No serial correlation (serial independence of the U's): Cov(Ui, Uj) = 0 for i ≠ j
5. Independence of Ui and Xi: Cov(Ui, X1i) = Cov(Ui, X2i) = 0
6. No collinearity between the X variables (No multicollinearity)
7. Correct specification of the model
Formulas for the parameters
b1 = (Σx1iyi·Σx2i² − Σx2iyi·Σx1ix2i) / (Σx1i²·Σx2i² − (Σx1ix2i)²)
b2 = (Σx2iyi·Σx1i² − Σx1iyi·Σx1ix2i) / (Σx1i²·Σx2i² − (Σx1ix2i)²)
b0 = Ȳ − b1X̄1 − b2X̄2

where the variables x and y are in deviation form.
The mean of b0, b1 and b2:
E(b0) = β0, E(b1) = β1, E(b2) = β2
The variance of b0, b1 and b2:

Var(b0) = σ̂²·[1/n + (X̄1²Σx2i² + X̄2²Σx1i² − 2X̄1X̄2Σx1ix2i) / (Σx1i²Σx2i² − (Σx1ix2i)²)]

Var(b1) = σ̂²·Σx2i² / (Σx1i²Σx2i² − (Σx1ix2i)²)

Var(b2) = σ̂²·Σx1i² / (Σx1i²Σx2i² − (Σx1ix2i)²)

where σ̂² = Σei²/(n − k), k being the total number of parameters that are estimated, and x1 and
x2 are in deviation form.
The multiple coefficient of determination (R²) measures the proportion of the variation in Y
explained by the variables X1 and X2 jointly.

R² = (b1Σx1iyi + b2Σx2iyi)/Σyi² = 1 − Σei²/Σyi²

The Adjusted R²

R̄² = 1 − (Σei²/(n − k)) / (Σyi²/(n − 1))   or   R̄² = 1 − (1 − R²)·(n − 1)/(n − k)

The partial regression coefficient measures the change in the mean value of Y, E(Y|X1, X2), per
unit change in X1, holding X2 constant.
Hypothesis Testing about Individual Partial Regression Coefficients

t = bi/S(bi) ~ t(n − k)   (i = 0, 1, 2, …, k)

H0: βi = 0
H1: βi ≠ 0, or one-sided (βi > 0 or βi < 0)

The test of the overall significance of the regression


H0: β1 = β2 = … = βk = 0
H1: not all β's are zero.
If the null hypothesis is true, then there is no linear relationship between y and the regressors.
Find the calculated value of F and compare it with the F tabulated.

Fcal = [ESS/(k − 1)] / [RSS/(n − k)] follows the F distribution with k − 1 and n − k df.

Decision Rule: If Fcalculated > Ftabulated (Fα(k − 1, n − k)), reject H0; otherwise you may accept it.
The F test expressed in terms of R², which is easy to compute, is
F = [R²/(k − 1)] / [(1 − R²)/(n − k)]

The (1 − α)100% confidence interval for βi is given by

bi ± tα/2·S(bi),   (i = 0, 1, 2, 3, …, k)
Forecasting
Point forecast vs. interval estimation (the forecast value will lie within an interval (a, b)).

The 95% confidence interval for Y0 (the forecast value) can be constructed by making use of
t = (Ŷ0 − Y0)/S(Ŷ0 − Y0) ~ t(n − k)
P(−tα/2 < t < tα/2) = 1 − α
Partial Correlation Coefficients
r12.3 – the partial correlation coefficient: holding X3 constant, it measures the (positive or
negative) association between Y and X2.
The method of maximum likelihood (ML)
The method of maximum likelihood, as the name indicates, consists of estimating the unknown
parameters in such a manner that the probability of observing the given Y's is as high (or
maximum) as possible.
The ML estimators of the β's are the same as the OLS estimators.

The ML estimator of σ² is biased.

