Manual Econometrics

This manual presents solutions to exercises on two-variable regression analysis and some basic concepts. It explains how regression models estimate the relationship between an explanatory variable and a response variable, distinguishes the sample regression function from the population regression function, and notes that regression models can never describe reality perfectly because of the stochastic error term. It also discusses how regression models can be used to predict mean responses and to test economic theories.


CHAPTER 2

TWO VARIABLE REGRESSION ANALYSIS: SOME BASIC IDEAS

2.1 It tells how the mean or average response of the sub-populations of
Y varies with the fixed values of the explanatory variable(s).

2.2 The distinction between the sample regression function and the
population regression function is important, for the former is
an estimator of the latter; in most situations we have a sample of
observations from a given population and we try to learn
something about the population from the given sample.

2.3 A regression model can never be a completely accurate
description of reality. Therefore, there is bound to be some difference
between the actual values of the regressand and its values
estimated from the chosen model. This difference is simply the
stochastic error term, whose various forms are discussed in the
chapter. The residual is the sample counterpart of the stochastic error
term.

2.4 Although we can certainly use the mean value, standard deviation,
and other summary measures to describe the behavior of the
regressand, we are often interested in finding out if there are any
causal forces that affect the regressand. If so, we will be able to
better predict the mean value of the regressand. Also, remember
that econometric models are often developed to test one or more
economic theories.

2.5 A model that is linear in the parameters; it may or may not be
linear in the variables.

2.6 Models (a), (b), (c), and (e) are linear (in the parameters) regression
models. If we let α = ln β1, then model (d) is also linear.

2.7 (a) Taking the natural log, we find that ln Yi = β1 + β2Xi + ui, which
becomes a linear regression model.
(b) The following transformation, known as the logit transformation,
makes this model a linear regression model:
ln[(1 − Yi)/Yi] = β1 + β2Xi + ui
(c) A linear regression model.
(d) A nonlinear regression model.
(e) A nonlinear regression model, as β2 is raised to the third power.

2.8 A model that can be made linear in the parameters is called an
intrinsically linear regression model, as is model (a) above. If β2 is
0.8 in model (d) of Question 2.7, it becomes a linear regression
model, as e^(−0.8(Xi − 2)) can be easily computed.

2.9 (a) Transforming the model as (1/Yi) = β1 + β2Xi makes it a linear
regression model.
(b) Writing the model as (Xi/Yi) = β1 + β2Xi makes it a linear
regression model.
(c) The transformation ln[(1 − Yi)/Yi] = −β1 − β2Xi makes it a
linear regression model.
Note: Thus the original models are intrinsically linear models.
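The linearizing transformations in 2.7(b) and 2.9 are easy to confirm numerically. The sketch below is an illustration, not part of the original solutions; the parameter values β1 = 0.5 and β2 = 2 and the X grid are arbitrary choices. It generates Y from the model of 2.9(c), Yi = 1/(1 + e^(β1 + β2Xi)), and checks that ln[(1 − Yi)/Yi] is exactly linear in Xi:

```python
import math

# Hypothetical parameter values, chosen only for illustration.
b1, b2 = 0.5, 2.0
X = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Model of Question 2.9(c): Y = 1 / (1 + exp(b1 + b2*X))
Y = [1.0 / (1.0 + math.exp(b1 + b2 * x)) for x in X]

# Logit transformation: ln[(1 - Y)/Y] should recover b1 + b2*X exactly.
transformed = [math.log((1.0 - y) / y) for y in Y]

for x, t in zip(X, transformed):
    assert abs(t - (b1 + b2 * x)) < 1e-9  # linear in the parameters

print("logit transform is linear in X")
```

The same check works for the other transformations: apply the stated reparameterization and verify that the result is an exact linear function of the regressor.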

2.10 This scattergram shows that more export-oriented countries on
average have more growth in real wages than less export-oriented
countries. That is why many developing countries have followed
an export-led growth policy. The regression line sketched in the
diagram is a sample regression line, as it is based on a sample
of 50 developing countries.

2.11 According to the well-known Heckscher-Ohlin model of trade,
countries tend to export goods whose production makes intensive
use of their more abundant factors of production. In other words,
this model emphasizes the relation between factor endowments
and comparative advantage.

2.12 This figure shows that the higher the minimum wage, the lower
the per capita GNP, suggesting that minimum wage laws may
not be good for developing countries. But this topic is controversial.
The effect of minimum wages may depend on their effect on
employment, the nature of the industry where they are imposed, and
how strongly the government enforces them.

2.13 It is a sample regression line because it is based on a sample
of 15 years of observations. The scatter points around the regression
line are the actual data points. The difference between the actual
consumption expenditure and that estimated from the regression line
represents the (sample) residual. Besides GDP, factors such as
wealth, the interest rate, etc. might also affect consumption
expenditure.

CHAPTER 3
TWO-VARIABLE REGRESSION MODEL:
THE PROBLEM OF ESTIMATION

3.1 (1) Yi = β1 + β2Xi + ui. Therefore,

E(Yi|Xi) = E[(β1 + β2Xi + ui)|Xi]
         = β1 + β2Xi + E(ui|Xi), since the β's are constants and X
           is nonstochastic,
         = β1 + β2Xi, since E(ui|Xi) is zero by assumption.

(2) Given cov(ui, uj) = 0 for all i ≠ j, then

cov(Yi, Yj) = E{[Yi − E(Yi)][Yj − E(Yj)]}
            = E(ui uj), from the results in (1),
            = E(ui)E(uj), because the error terms are not
              correlated by assumption,
            = 0, since each ui has zero mean by assumption.

(3) Given var(ui|Xi) = σ², var(Yi|Xi) = E[Yi − E(Yi)]² = E(ui²) =
var(ui|Xi) = σ², by assumption.

3.2
      Yi   Xi    yi    xi   xi·yi   xi²
       4    1    -3    -3      9     9
       5    4    -2     0      0     0
       7    5     0     1      0     1
      12    6     5     2     10     4
    ------------------------------------
 sum  28   16     0     0     19    14

Note: Ȳ = 7 and X̄ = 4

Therefore, β̂2 = Σxiyi/Σxi² = 19/14 = 1.357; β̂1 = Ȳ − β̂2X̄ = 1.572
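The hand computation for Question 3.2 can be checked with a few lines of code (a sketch using only the four data points given in the question):

```python
# Data from Question 3.2
Y = [4, 5, 7, 12]
X = [1, 4, 5, 6]
n = len(Y)

Y_bar, X_bar = sum(Y) / n, sum(X) / n   # 7.0 and 4.0
y = [yi - Y_bar for yi in Y]            # deviations from the mean
x = [xi - X_bar for xi in X]

beta2 = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi**2 for xi in x)  # 19/14
beta1 = Y_bar - beta2 * X_bar

print(round(beta2, 3), round(beta1, 3))  # 1.357 1.571 (the manual's 1.572
                                         # comes from rounding beta2 first)
```

Carrying full precision gives β̂1 = 1.5714; the manual's 1.572 results from using the already-rounded slope 1.357 in the intercept formula.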

3.3 The PRF is: Yi = β1 + β2Xi + ui

Situation 1: β1 = 0, β2 = 1, and E(ui) = 0, which gives E(Yi|Xi) = Xi
Situation 2: β1 = 1, β2 = 0, and E(ui) = (Xi − 1), which gives
E(Yi|Xi) = Xi

which is the same as in Situation 1. Therefore, without the
assumption E(ui) = 0, one cannot estimate the parameters, because,
as just shown, one obtains the same conditional distribution of Y
although the assumed parameter values in the two situations are
quite different.

Σxi² = ΣXi² − (ΣXi)²/n = n(n + 1)(2n + 1)/6 − n(n + 1)²/4 = n(n² − 1)/12

and similarly,

Σyi² = n(n² − 1)/12. Then

Σdi² = Σ(Xi − Yi)² = Σ(Xi² + Yi² − 2XiYi)
     = 2n(n + 1)(2n + 1)/6 − 2ΣXiYi

Therefore, ΣXiYi = n(n + 1)(2n + 1)/6 − Σdi²/2    (2)

Since Σxiyi = ΣXiYi − (ΣXi)(ΣYi)/n, using (2), we obtain

Σxiyi = n(n + 1)(2n + 1)/6 − Σdi²/2 − n(n + 1)²/4
      = n(n² − 1)/12 − Σdi²/2    (3)

Now substituting the preceding equations in (1), you will get the answer.

3.9 (a) β̂1 = Ȳ − β̂2X̄ and α̂1 = Ȳ − α̂2x̄ = Ȳ, since Σxi = 0
[Note: xi = (Xi − X̄)]

var(β̂1) = [ΣXi²/(nΣxi²)]σ² and var(α̂1) = σ²/n

Therefore, neither the estimates nor the variances of the two
intercept estimators are the same.

(b) β̂2 = Σxiyi/Σxi² and α̂2 = Σxiyi/Σxi², since xi = (Xi − X̄)

It is easy to verify that var(β̂2) = var(α̂2) = σ²/Σxi²

That is, the estimates and variances of the two slope estimators are
the same.

(c) Model II may be easier to use with large X numbers, although
with high-speed computers this is no longer a problem.

3.10 Since Σxi = Σyi = 0, that is, the sum of the deviations from the mean
value is always zero, the means x̄ and ȳ are also zero. Therefore,
β̂1 = ȳ − β̂2x̄ = 0. The point here is that if both Y and X are
expressed as deviations from their mean values, the regression line
will pass through the origin.

β̂2 = Σ(xi − x̄)(yi − ȳ)/Σ(xi − x̄)² = Σxiyi/Σxi², since the means of
the two variables are zero. This is equation (3.1.6).

3.11 Let Zi = aXi + b and Wi = cYi + d. In deviation form, these become
zi = axi and wi = cyi. By definition,

r_ZW = Σziwi/√(Σzi² Σwi²) = acΣxiyi/[ac√(Σxi² Σyi²)] = r_XY in Eq. (3.5.13)

3.12 (a) True. Let a and c equal −1 and b and d equal 0 in Question 3.11.

(b) False. Again using Question 3.11, it will be negative.

(c) True. Since r_XY = r_YX > 0, and SX and SY (the standard
deviations of X and Y, respectively) are both positive, then from
r_YX = b_YX(SX/SY) and r_XY = b_XY(SY/SX), the slopes b_XY
and b_YX must be positive.

3.13 Let Z = X1 + X2 and W = X2 + X3. In deviation form, we can write
these as z = x1 + x2 and w = x2 + x3. By definition the correlation
between Z and W is:

r_ZW = Σziwi/√(Σzi² Σwi²) = Σ(x1 + x2)(x2 + x3)/√[Σ(x1 + x2)² Σ(x2 + x3)²]

= Σx2²/√[(Σx1² + Σx2²)(Σx2² + Σx3²)], because the X's are
uncorrelated. Note: We have omitted the observation subscript for
convenience.

= σ²/√[(2σ²)(2σ²)] = 1/2, where σ² is the common variance.

The coefficient is not zero because, even though the X's are
individually uncorrelated, the pairwise combinations are not.
As just shown, Σzw = σ², meaning that the covariance between z
and w is some constant other than zero.
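The 1/2 result is easy to confirm by simulation (a sketch, not part of the original solutions; the sample size, seed, and unit variances are arbitrary choices):

```python
import math
import random

random.seed(42)
n = 100_000

# Three mutually uncorrelated series with a common variance.
X1 = [random.gauss(0, 1) for _ in range(n)]
X2 = [random.gauss(0, 1) for _ in range(n)]
X3 = [random.gauss(0, 1) for _ in range(n)]

Z = [a + b for a, b in zip(X1, X2)]   # Z = X1 + X2
W = [b + c for b, c in zip(X2, X3)]   # W = X2 + X3

def corr(u, v):
    """Sample correlation coefficient of two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    du = [ui - mu for ui in u]
    dv = [vi - mv for vi in v]
    num = sum(a * b for a, b in zip(du, dv))
    den = math.sqrt(sum(a * a for a in du) * sum(b * b for b in dv))
    return num / den

r = corr(Z, W)
print(r)  # close to 0.5, as derived above
```

The shared component X2 is what drives the nonzero correlation, exactly as the derivation shows.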

3.14 The residuals and fitted values of Y will not change. Let

Yi = β1 + β2Xi + ui and Yi = α1 + α2Zi + ui, where Zi = 2Xi.

Using the deviation form, we know that

β̂2 = Σxiyi/Σxi², omitting the observation subscript.

α̂2 = Σziyi/Σzi² = 2Σxiyi/(4Σxi²) = (1/2)β̂2

β̂1 = Ȳ − β̂2X̄; α̂1 = Ȳ − α̂2Z̄ = Ȳ − (1/2)β̂2(2X̄) = β̂1 (Note: Z̄ = 2X̄)

That is, the intercept term remains unaffected. As a result, the fitted
Y values and the residuals remain the same even if Xi is multiplied
by 2. The analysis is analogous if a constant is added to Xi.
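A quick numerical check of this invariance (a sketch; the four data points from Question 3.2 are reused purely for illustration):

```python
def ols(Y, X):
    """Return (intercept, slope) from a two-variable OLS fit."""
    n = len(Y)
    xb, yb = sum(X) / n, sum(Y) / n
    b2 = (sum((x - xb) * (y - yb) for x, y in zip(X, Y))
          / sum((x - xb) ** 2 for x in X))
    return yb - b2 * xb, b2

Y = [4, 5, 7, 12]
X = [1, 4, 5, 6]
Z = [2 * x for x in X]          # Z = 2X

b1, b2 = ols(Y, X)
a1, a2 = ols(Y, Z)

assert abs(a2 - b2 / 2) < 1e-12   # slope is halved
assert abs(a1 - b1) < 1e-12       # intercept is unchanged

fitted_X = [b1 + b2 * x for x in X]
fitted_Z = [a1 + a2 * z for z in Z]
assert all(abs(f - g) < 1e-12 for f, g in zip(fitted_X, fitted_Z))
print("fitted values and residuals unchanged")
```

Adding a constant to X can be checked the same way: the slope stays the same, the intercept absorbs the shift, and the fitted values are again unchanged.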

3.15 By definition,

r²_{Y,Ŷ} = (Σyiŷi)²/(Σyi² Σŷi²) = [Σ(ŷi + ûi)ŷi]²/(Σyi² Σŷi²)
         = (Σŷi²)²/(Σyi² Σŷi²), since Σŷiûi = 0

= Σŷi²/Σyi² = Σ(β̂2xi)²/Σyi² = r², using (3.5.6).

3.16 (a) False. The covariance can assume any value; its value depends
on the units of measurement. The correlation coefficient, on the
other hand, is unitless, that is, it is a pure number.

(b) False. See Fig. 3.11(h). Remember that the correlation coefficient
is a measure of the linear relationship between two variables. Hence,
as Fig. 3.11(h) shows, there is a perfect relationship between Y and
X, but that relationship is nonlinear.

(c) True. In deviation form, we have

yi = ŷi + ûi

Therefore, it is obvious that if we regress yi on ŷi, the slope
coefficient will be one and the intercept zero. But a formal proof can
proceed as follows:
If we regress yi on ŷi, we obtain the slope coefficient, say, α̂, as:

α̂ = Σyiŷi/Σŷi² = β̂2Σxiyi/(β̂2²Σxi²) = β̂2²Σxi²/(β̂2²Σxi²) = 1, because

ŷi = β̂2xi and Σxiyi = β̂2Σxi² for the two-variable model. The
intercept in this regression is zero.

3.17 Write the sample regression as: Yi = β̂1 + ûi. By the LS principle,
we want to minimize Σûi² = Σ(Yi − β̂1)². Differentiate this
expression with respect to the only unknown parameter and set the
result to zero, to obtain:

d(Σûi²)/dβ̂1 = 2Σ(Yi − β̂1)(−1) = 0

which on simplification gives β̂1 = Ȳ, that is, the sample mean. And
we know that the variance of the sample mean is σ²/n, where n is
the sample size and σ² is the variance of Y. The RSS is

RSS = Σ(Yi − Ȳ)² = Σyi², and σ̂² = RSS/(n − 1) = Σyi²/(n − 1)

It is worth adding the X variable to the model if it reduces σ̂²
significantly, which it will if X has any influence on Y. In short, in
regression models we hope that the explanatory variable(s) will
better predict Y than simply its mean value. As a matter of fact, this
can be looked at formally. Recall that for the two-variable model we
obtain from (3.5.2):

RSS = TSS − ESS = Σyi² − Σŷi² = Σyi² − β̂2²Σxi²

Therefore, if β̂2 is different from zero, the RSS of the model that
contains at least one regressor will be smaller than that of the model
with no regressor. Of course, if there are more regressors in the
model and their slope coefficients are different from zero, the RSS
will be much smaller than that of the no-regressor model.

Problems

3.18 Taking the difference between the two ranks, we obtain:

d:  -2  1  -1  3  0  -1  -1  -2  1  2
d²:  4  1   1  9  0   1   1   4  1  4 ;  Σd² = 26

Therefore, Spearman's rank correlation coefficient is

rs = 1 − 6Σd²/[n(n² − 1)] = 1 − 6(26)/[10(10² − 1)] = 0.842

Thus there is a high degree of correlation between the student's
midterm and final ranks: the higher the rank on the midterm, the
higher the rank on the final.
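The arithmetic can be verified directly (a sketch using only the rank differences given above):

```python
# Rank differences from Question 3.18
d = [-2, 1, -1, 3, 0, -1, -1, -2, 1, 2]
n = len(d)                               # 10 students

sum_d2 = sum(di**2 for di in d)          # 26
rs = 1 - 6 * sum_d2 / (n * (n**2 - 1))   # Spearman's formula

print(round(rs, 3))  # 0.842
```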

3.19 (a) The slope value of −4.318 suggests that over the period 1980-
1994, for every unit increase in the relative price, on average, the
(GM/$) exchange rate declined by about 4.32 units. That is, the
dollar depreciated because it was getting fewer German marks for
every dollar exchanged. Literally interpreted, the intercept value of
6.682 means that if the relative price ratio were zero, a dollar would
exchange for 6.682 German marks. Of course, this interpretation
is not economically meaningful.

(b) The negative value of the slope coefficient makes perfect
economic sense because if U.S. prices go up faster than German
prices, domestic consumers will switch to German goods, thus
increasing the demand for GM, which will lead to appreciation
of the German mark. This is the essence of the theory of purchasing
power parity (PPP), or the law of one price.

(c) In this case the slope coefficient is expected to be positive, for
the higher the German CPI relative to the U.S. CPI, the higher the
relative inflation rate in Germany, which will lead to appreciation
of the U.S. dollar. Again, this is in the spirit of the PPP.

3.20 (a) The scattergrams are as follows:

[Two scatterplots omitted: in the first the horizontal axis is PRODBUS
(business sector productivity) and in the second it is PRODNFB
(nonfarm business sector productivity); in both, the points rise from
the lower left to the upper right, suggesting a positive relationship.]

CHAPTER 5
TWO-VARIABLE REGRESSION:
INTERVAL ESTIMATION AND HYPOTHESIS TESTING

Questions

5.1 (a) True. The t test is based on variables with a normal distribution.
Since the estimators of β1 and β2 are linear combinations of the
error ui, which is assumed to be normally distributed under CLRM,
these estimators are also normally distributed.

(b) True. So long as E(ui) = 0, the OLS estimators are unbiased.
No probabilistic assumptions are required to establish unbiasedness.

(c) True. In this case Eq. (1) in App. 3A, Sec. 3A.1, will be
absent. This topic is discussed more fully in Chap. 6, Sec. 6.1.

(d) True. The p value is the smallest level of significance at which
the null hypothesis can be rejected. The terms level of significance
and size of the test are synonymous.

(e) True. This follows from Eq. (1) of App. 3A, Sec. 3A.1.

(f) False. All we can say is that the data at hand do not permit
us to reject the null hypothesis.

(g) False. A larger σ² may be counterbalanced by a larger Σxi². It is
only if the latter is held constant that the statement can be true.

(h) False. The conditional mean of a random variable depends on
the values taken by another (conditioning) variable. Only if the
two variables are independent can the conditional and
unconditional means be the same.

(i) True. This is obvious from Eq. (3.1.7).

(j) True. Refer to Eq. (3.5.2). If X has no influence on Y, β̂2 will
be zero, in which case Σyi² = Σûi².

we obtain: r² = t²/(t² + df) = (9.6536)²/[(9.6536)² + 11] ≈ 0.8944
5.4 Verbally, the hypothesis states that there is no correlation between
the two variables. Therefore, if we can show that the covariance
between the two variables is zero, then the correlation must be zero.

5.5 (a) Use the t test to test the hypothesis that the true slope coefficient
is one. That is, obtain:

t = (β̂2 − 1)/se(β̂2) = (1.0598 − 1)/0.0728 = 0.821

For 238 df this t value is not significant even at α = 10%.
The conclusion is that over the sample period, IBM was
not a volatile security.

(b) Since t = 0.7264/0.3001 = 2.4205, the intercept is significant at
the two percent level of significance. But it has little economic
meaning. Literally interpreted, the intercept value of about 0.73
means that even if the market portfolio has zero return, the
security's return is 0.73 percent.
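Both t statistics are simple ratios; a sketch of the computation, using only the estimates and standard errors quoted above:

```python
# (a) H0: beta2 = 1 (slope of the characteristic line)
t_slope = (1.0598 - 1) / 0.0728
print(round(t_slope, 3))      # 0.821 -> not significant at 10% with 238 df

# (b) H0: beta1 = 0 (intercept)
t_intercept = 0.7264 / 0.3001
print(round(t_intercept, 4))  # 2.4205 -> significant at the 2% level
```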
5.6 Under the normality assumption, β̂2 is normally distributed. But
since a normally distributed variable is continuous, we know from
probability theory that the probability that a continuous random
variable takes on a specific value is zero. Therefore, it makes no
difference if the equality is strong or weak.

5.7 Under the hypothesis that β2 = 0, we obtain

t = β̂2/se(β̂2) = β̂2√(Σxi²)/σ̂

because σ̂² = Σûi²/(n − 2) = Σyi²(1 − r²)/(n − 2), from Eq. (3.5.10). Hence

t = β̂2√(Σxi²)√(n − 2)/[√(Σyi²)√(1 − r²)]

But since r² = β̂2²Σxi²/Σyi², from Eq. (3.5.6), we have
r = β̂2√(Σxi²)/√(Σyi²), so that

t = r√(n − 2)/√(1 − r²)

CHAPTER 6
EXTENSIONS OF THE TWO-VARIABLE REGRESSION MODEL

6.1 True. Note that the usual OLS formula to estimate the intercept is
β̂1 = (mean of the regressand) − β̂2(mean of the regressor).
But when Y and X are in deviation form, their mean values are
always zero. Hence in this case the estimated intercept is also zero.

6.2 (a) & (b) In the first equation an intercept term is included.
Since the intercept in the first model is not statistically significant,
say at the 5% level, it may be dropped from the model.

(c) For each model, a one percentage point increase in the monthly
market rate of return led, on average, to about a 0.76 percentage
point increase in the monthly rate of return on Texaco common
stock over the sample period.

(d) As discussed in the chapter, this model represents the
characteristic line of investment theory. In the present case the
model relates the monthly return on Texaco stock to the monthly
return on the market, as represented by a broad market index.

(e) No, the two r² values are not comparable. The r² of the
interceptless model is the raw r².

(f) Since we have a reasonably large sample, we could use the
Jarque-Bera test of normality. The JB statistic for the two models is
about the same, namely, 1.12, and the p value of obtaining such a
JB value is about 0.57. Hence we do not reject the hypothesis that
the error terms follow a normal distribution.

(g) As per Theil's remark discussed in the chapter, if the intercept
term is truly absent from the model, then running the regression
through the origin will give a more efficient estimate of the slope
coefficient, which it does in the present case.

6.3 (a) Since the model is linear in the parameters, it is a linear
regression model.

(b) Define Y* = (1/Y) and X* = (1/X) and do an OLS regression of
Y* on X*.

(c) As X tends to infinity, Y tends to (1/β1).

(d) Perhaps this model may be appropriate to explain low
consumption of a commodity when income is large, such as an
inferior good.

6.4 [Sketch omitted: three log-log curves corresponding to slope = 1,
slope > 1, and slope < 1.]

6.5 For Model I we know that

β̂2 = Σxiyi/Σxi², where x and y are in deviation form.

For Model II, following similar steps, we obtain:

α̂2 = Σxi*yi*/Σxi*² = Σ(xi/Sx)(yi/Sy)/Σ(xi/Sx)²
   = [Σxiyi/(SxSy)]/(Σxi²/Sx²)
   = (Sx/Sy)(Σxiyi/Σxi²) = (Sx/Sy)β̂2

This shows that the slope coefficient is not invariant to the
change of scale.

6.6 (a) We can write the first model as:

ln(w1Yi) = α1 + α2 ln(w2Xi) + ui, that is,
ln w1 + ln Yi = α1 + α2 ln w2 + α2 ln Xi + ui, using properties
of logarithms. Since the w's are constants, collecting terms, we can
simplify this model as:

ln Yi = A + α2 ln Xi + ui
where A = (α1 + α2 ln w2 − ln w1).

Comparing this with the second model, you will see that except
for the intercept terms, the two models are the same. Hence the
estimated slope coefficients in the two models will be the same, the
only difference being in the estimated intercepts.

(b) The r² values of the two models will be the same.

6.7 Equation (6.6.8) is a growth model, whereas (6.6.10) is a linear
trend model. The former gives the relative change in the
regressand, whereas the latter gives the absolute change. For
comparative purposes, the relative change may be more meaningful.

6.8 The null hypothesis is that the true slope coefficient is 0.005. The
alternative hypothesis could be one-sided or two-sided. Suppose we
use the two-sided alternative. The estimated slope value is 0.00743.
Using the t test, we obtain:

t = (0.00743 − 0.005)/0.00017 = 14.294

This t is highly significant. We can therefore reject the null
hypothesis.

6.9 This can be obtained approximately as: 18.5508/3.2514 = 5.7055
percent.

6.10 As discussed in Sec. 6.7 of the text, for most commodities the
Engel model depicted in Fig. 6.6(c) seems appropriate. Therefore,
the second model given in the exercise may be the choice.

6.11 As it stands, the model is not linear in the parameters. But consider
the following "trick." First take the ratio of Y to (1 − Y) and then
take the natural log of the ratio. This transformation makes the
model linear in the parameters. That is, run the following
regression:

ln[Yi/(1 − Yi)] = β1 + β2Xi

This model is known as the logit model, which we will discuss
in the chapter on qualitative dependent variables.

CHAPTER 7
MULTIPLE REGRESSION ANALYSIS: THE PROBLEM OF
ESTIMATION

7.1 The regression results are:

α̂1 = −3.00; α̂2 = 3.50
λ̂1 = 4.00; λ̂3 = −1.357
β̂1 = 2.00; β̂2 = 1.00; β̂3 = −1.00

(a) No. Given that model (3) is the true model, α̂2 is a biased
estimator of β2.

(b) No. λ̂3 is a biased estimator of β3, for the same reason as in (a).

The lesson here is that misspecifying an equation can lead to biased
estimation of the parameters of the true model.

7.2 Using the formulas given in the text, the regression results are as
follows:

Ŷi = 53.1612 + 0.727X2i + 2.736X3i
se =           (0.049)   (0.849)    R² = 0.9988; R̄² = 0.9986
7.3 Omitting the observation subscript i for convenience, recall that

β̂2 = [(Σyx2)(Σx3²) − (Σyx3)(Σx2x3)]/[(Σx2²)(Σx3²) − (Σx2x3)²]

Dividing the numerator and denominator by Σx3², we get

= [(Σyx2) − (Σyx3)(Σx2x3)/(Σx3²)]/[(Σx2²) − (Σx2x3)²/(Σx3²)]

= [(Σyx2) − (Σyx3)b23]/[(Σx2²) − b23(Σx2x3)], using b23 = (Σx2x3)/(Σx3²)

= Σy(x2 − b23x3)/Σx2(x2 − b23x3)

7.4 Since we are told that ui ~ N(0, 4), generate, say, 25 observations
from a normal distribution with these parameters. Most computer
packages do this routinely. From these 25 observations, compute
the sample variance as s² = Σ(Xi − X̄)²/24, where Xi is the observed
value of ui in the sample of 25 observations. Repeat this exercise,
say, 99 more times, for a total of 100 experiments. In all there will
be 100 values of s². Take the average of these 100 s² values. This
average value should be close to σ² = 4. Sometimes you may need
more than 100 samples for the approximation to be good.
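The Monte Carlo experiment described above can be sketched as follows (100 samples of 25 draws each; the seed is an arbitrary choice):

```python
import random

random.seed(0)
sigma2 = 4.0          # true variance of u ~ N(0, 4)
n, reps = 25, 100

s2_values = []
for _ in range(reps):
    u = [random.gauss(0, sigma2 ** 0.5) for _ in range(n)]
    u_bar = sum(u) / n
    s2 = sum((ui - u_bar) ** 2 for ui in u) / (n - 1)   # sample variance
    s2_values.append(s2)

avg_s2 = sum(s2_values) / reps
print(avg_s2)  # should be close to 4; more replications tighten the approximation
```

Note that `random.gauss` takes the standard deviation, not the variance, hence the square root of 4 in the call.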

7.5 From Eq. (7.11.7) in the text, we have

R² = r₁₃² + (1 − r₁₃²)r₁₂.₃²

Therefore,

r₁₂.₃² = (R² − r₁₃²)/(1 − r₁₃²)

This is the coefficient of partial determination; it may be interpreted
as the proportion of the variation in the dependent variable that is
not explained by the explanatory variable X3 but has been explained
by the addition of the explanatory variable X2 to the model.

7.6 The given equation can be written as:

X1 = (−α2/α1)X2 + (−α3/α1)X3, or
X2 = (−α1/α2)X1 + (−α3/α2)X3, or
X3 = (−α1/α3)X1 + (−α2/α3)X2

Therefore, the partial regression coefficients would be as follows:

β12.3 = −(α2/α1); β13.2 = −(α3/α1)
β21.3 = −(α1/α2); β23.1 = −(α3/α2)
β31.2 = −(α1/α3); β32.1 = −(α2/α3)

Recalling Question 3.6, it follows that

r₁₂.₃ = √(β12.3 β21.3) = √[(−α2)(−α1)/((α1)(α2))] = √1 = ±1

7.7 (a) No. An r value cannot exceed 1 in absolute value. Plugging the
given data into Eq. (7.11.2), the reader should verify that
r12.3 = 2.295, which is logically impossible.

(b) Yes. Following the same procedure as in (a), the reader will
find that r12.3 = 0.397, which is possible.

(c) Yes; again it can be shown that r12.3 = 0.880, which is possible.

7.8 If you leave out the years of experience (X3) from the model, the
coefficient of education (X2) will be biased, the nature of the bias
depending on the correlation between X2 and X3. The standard
error, the residual sum of squares, and R² will all be affected as a
result of this omission. This is an instance of the omitted variable
bias.

7.9 The slope coefficients in the double-log model give direct estimates
of the (constant) elasticity of the left-hand-side variable with
respect to the right-hand-side variable. Here:

∂ln Y/∂ln X2 = (∂Y/Y)/(∂X2/X2) = β2, and
∂ln Y/∂ln X3 = (∂Y/Y)/(∂X3/X3) = β3

7.10 (a) & (b) If you multiply X2 by 2, you can verify from Equations
(7.4.7) and (7.4.8) that the slope coefficient of X2 is halved, while
the other coefficients and the fitted values remain unaffected. On
the other hand, if you multiply Y by 2, the slopes as well as the
intercept coefficient and their standard errors are all multiplied by 2.
Always keep in mind the units in which the regressand and
regressors are measured.

7.11 From (7.11.5) we know that

R² = (r₁₂² + r₁₃² − 2r₁₂r₁₃r₂₃)/(1 − r₂₃²)

Therefore, when r₂₃ = 0, that is, when there is no correlation
between variables X2 and X3,

R² = r₁₂² + r₁₃²

that is, the multiple coefficient of determination is the sum of the
coefficients of determination in the regression of Y on X2 and that
of Y on X3.

7.12 (a) Rewrite Model B as:

Yt = β1 + (1 + β2)X2t + β3X3t + ut
   = β1 + β2*X2t + β3X3t + ut, where β2* = (1 + β2)

Therefore, the two models are similar. Yes, the intercepts in the
two models are the same.

(b) The OLS estimates of the slope coefficient of X3 in the two
models will be the same.

(d) No, because the regressands in the two models are different.

7.13 (a) Using OLS, and writing income as Xi, savings as Yi, and
consumption as Zi (so that Yi = Xi − Zi), we obtain:

α̂2 = Σyixi/Σxi² = Σ(xi − zi)xi/Σxi²
   = (Σxi² − Σzixi)/Σxi²
   = 1 − β̂2

That is, the slope in the regression of savings on income (i.e., the
marginal propensity to save) is one minus the slope in the regression
of consumption on income (i.e., the marginal propensity to
consume). Put differently, the sum of the two marginal propensities
is 1, as it should be in view of the identity that total income is equal
to total consumption expenditure plus total savings. Incidentally,
note that α̂1 = −β̂1.

(b) Yes. The RSS for the savings function is

Σ(Yi − α̂1 − α̂2Xi)²

Now substitute (Xi − Zi) for Yi, α̂1 = −β̂1, and α̂2 = (1 − β̂2), and
verify that this equals the RSS of the consumption function.

(c) No, since the two regressands are not the same.

7.14 (a) As discussed in Sec. 6.9, to use the classical normal linear
regression model (CNLRM), we must assume that

ln ui ~ N(0, σ²)

After estimating the Cobb-Douglas model, obtain the residuals and
subject them to a normality test, such as the Jarque-Bera test.

(b) No. As discussed in Sec. 6.9,

ui ~ log-normal[e^(σ²/2), e^(σ²)(e^(σ²) − 1)]

7.15 (a) The normal equations would be:

ΣYiX2i = β̂2ΣX2i² + β̂3ΣX2iX3i
ΣYiX3i = β̂2ΣX2iX3i + β̂3ΣX3i²

(b) No, for the same reason as in the two-variable case.

(c) Yes, these conditions still hold.

(d) It will depend on the underlying theory.

(e) This is a straightforward generalization of the normal
equations given above.

CHAPTER 8
MULTIPLE REGRESSION ANALYSIS:
THE PROBLEM OF INFERENCE

8.1 (a) In the first model, where sales are a linear function of time, the
rate of change of sales (dY/dt) is postulated to be a constant, equal
to β2, regardless of time t. In the second model the rate of change is
not constant because (dY/dt) = α2 + 2α3t, which depends on time t.

(b) The simplest thing to do is plot Y against time. If the resulting
graph looks parabolic, perhaps the quadratic model is appropriate.

(c) This model might be appropriate to depict the earnings profile
of a person. Typically, when someone enters the labor market, the
of a person. Typically, when someone enters the labor market, the
entry-level earnings are low. Over time, because of accumulated
experience, earnings increase, but after a certain age they start
declining.

(d) Look up the web sites of several car manufacturers, or Motor
Magazine, or the American Automobile Association for the data.

8.2

F = [(ESS_new − ESS_old)/NR]/[RSS_new/(n − k)]    (8.5.16)

where NR = number of new regressors. Divide the numerator and
denominator by TSS and recall that R² = ESS/TSS and
(1 − R²) = RSS/TSS. Substituting these expressions into (8.5.16),
you will obtain (8.5.18).

8.3 This is a definitional issue. As noted in the chapter, the unrestricted
regression is known as the long, or new, regression, and the
restricted regression is known as the short regression. The two
differ in the number of regressors included in the models.

8.4 In OLS estimation we minimize the RSS without putting any
restrictions on the estimators. Hence, the RSS in this case represents
the true minimum RSS, or RSS_UR. When restrictions are put on
one or more parameters, one may not obtain the absolute minimum
RSS because of the restrictions imposed. (Students of mathematics
will recall constrained and unconstrained optimization procedures.)
Thus RSS_R > RSS_UR, unless the restrictions are valid, in which
case the two RSS terms will be the same.

Recalling that R² = 1 − RSS/TSS, it follows that

R²_UR = 1 − RSS_UR/TSS ≥ R²_R = 1 − RSS_R/TSS

Note that whether we use the restricted or unrestricted regression,
the TSS remains the same, as it is simply equal to Σ(Yi − Ȳ)².

8.5 (a) Let the coefficient of log K be β* = (β2 + β3 − 1). Test the null
hypothesis that β* = 0, using the usual t test. If there are indeed
constant returns to scale, the t value will be small.

(b) If we define the ratio (Y/K) as the output/capital ratio, a measure
of capital productivity, and the ratio (L/K) as the labor/capital ratio,
then the slope coefficient in this regression gives the mean percent
change in capital productivity for a percent change in the
labor/capital ratio.

(c) Although the analysis is symmetrical, assuming constant returns
to scale, in this case the slope coefficient gives the mean percent
change in labor productivity (Y/L) for a percent change in the
capital/labor ratio (K/L). What distinguishes developed countries
from developing countries is the generally higher capital/labor
ratio in the former.

8.6 Start with equation (8.5.11) and write it as:

F = [(n − k)R²]/[(k − 1)(1 − R²)], which can be rewritten as

F(k − 1)/(n − k) = R²/(1 − R²)

After further algebraic manipulation, we obtain

R² = F(k − 1)/[F(k − 1) + (n − k)], which is the desired result.

For regression (8.2.1), n = 64 and k = 3. Therefore,
F0.05(2, 61) = 3.15, approx. (Note: use the tabled value for 60 df in
place of 61 df.) Putting these values in the preceding R² formula,
we obtain:

R² = 2(3.15)/[2(3.15) + 61] = 6.30/67.30 = 0.0936

This is the critical R² value at the 5% level of significance. Since
the observed R² of 0.7077 in (8.2.1) far exceeds this critical value,
we reject the null hypothesis that the true R² value is zero.
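The F-to-R² conversion derived above is a one-liner (a sketch using n = 64, k = 3, and the tabled F value of 3.15):

```python
def critical_r2(F, k, n):
    """Critical R2 implied by a critical F value: R2 = F(k-1) / [F(k-1) + (n-k)]."""
    return F * (k - 1) / (F * (k - 1) + (n - k))

r2_crit = critical_r2(F=3.15, k=3, n=64)
print(round(r2_crit, 4))  # 0.0936
```

Any observed R² above this threshold leads to rejection of the hypothesis that the true R² is zero at the chosen significance level.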

8.7 Since regression (2) is a restricted form of (1), we can first
calculate the F ratio given in (8.5.18):
