Manual Econometrics
2.2 The distinction between the sample regression function and the
population regression function is important, for the former
is an estimator of the latter; in most situations we have a sample of
observations from a given population and we try to learn
something about the population from the given sample.
2.4 Although we can certainly use the mean value, standard deviation
and other summary measures to describe the behavior of the
regressand, we are often interested in finding out if there are any
causal forces that affect the regressand. If so, we will be able to
better predict the mean value of the regressand. Also, remember
that econometric models are often developed to test one or more
economic theories.
2.6 Models (a), (b), (c) and (e) are linear (in the parameters) regression
models. If we let $\alpha = \ln \beta_1$, then model (d) is also linear.
0.8 in model (d) of Question 2.7, it becomes a linear regression
model, as $e^{-0.8(X_i - 2)}$ can be easily computed.
2.12 This figure shows that the higher the minimum wage, the lower
the per head GNP, thus suggesting that minimum wage laws may
not be good for developing countries. But this topic is controversial.
The effect of minimum wages may depend on their effect on
employment, the nature of the industries where they are imposed, and
how strongly the government enforces them.
CHAPTER 3
TWO-VARIABLE REGRESSION MODEL:
THE PROBLEM OF ESTIMATION
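(2) Given $\operatorname{cov}(u_i, u_j) = 0$ for all $i, j$ ($i \neq j$), then
$$\begin{aligned}
\operatorname{cov}(Y_i, Y_j) &= E\{[Y_i - E(Y_i)][Y_j - E(Y_j)]\} \\
&= E(u_i u_j), \text{ from the results in (1)} \\
&= E(u_i)E(u_j), \text{ because the error terms are not correlated by assumption,} \\
&= 0, \text{ since each } u_i \text{ has zero mean by assumption.}
\end{aligned}$$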
3.2
        Y_i   X_i   y_i   x_i   x_i*y_i   x_i^2
         4     1    -3    -3       9        9
         5     4    -2     0       0        0
         7     5     0     1       0        1
        12     6     5     2      10        4
sum     28    16     0     0      19       14
-------------------------------------------------
Note: $\bar{Y} = 7$ and $\bar{X} = 4$; $y_i = Y_i - \bar{Y}$ and $x_i = X_i - \bar{X}$.
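$\hat{\beta}_2 = \dfrac{\sum x_i y_i}{\sum x_i^2} = \dfrac{19}{14} = 1.357$; $\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X} = 7 - 1.357(4) \approx 1.57$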
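The arithmetic in the table can be reproduced with a few lines of Python. This is only a sketch: the function name `ols_two_variable` is an illustrative choice, not something defined in the text, and it uses the deviation-form formulas for the two-variable model.

```python
# Data taken from the table for 3.2.
Y = [4, 5, 7, 12]
X = [1, 4, 5, 6]


def ols_two_variable(x_vals, y_vals):
    """Two-variable OLS via deviations from the sample means (illustrative helper)."""
    n = len(x_vals)
    x_bar = sum(x_vals) / n
    y_bar = sum(y_vals) / n
    x_dev = [x - x_bar for x in x_vals]
    y_dev = [y - y_bar for y in y_vals]
    sum_xy = sum(a * b for a, b in zip(x_dev, y_dev))  # sum of x_i * y_i
    sum_xx = sum(a * a for a in x_dev)                 # sum of x_i squared
    slope = sum_xy / sum_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope, sum_xy, sum_xx


intercept, slope, sum_xy, sum_xx = ols_two_variable(X, Y)
print(sum_xy, sum_xx)                       # the two column totals used for the slope
print(round(slope, 3), round(intercept, 3))
```

The two column totals (19 and 14) are the only inputs the slope needs, which is why the table is laid out in deviation form.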
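$$\sum x_i^2 = \sum X_i^2 - \frac{(\sum X_i)^2}{n} = \frac{n(n+1)(2n+1)}{6} - \frac{n(n+1)^2}{4} = \frac{n(n^2-1)}{12}$$
and similarly,
$$\sum y_i^2 = \frac{n(n^2-1)}{12}.$$
Then
$$\sum d_i^2 = \sum (X_i - Y_i)^2 = \sum (X_i^2 + Y_i^2 - 2X_iY_i) = \frac{2n(n+1)(2n+1)}{6} - 2\sum X_iY_i$$
Therefore,
$$\sum X_iY_i = \frac{n(n+1)(2n+1)}{6} - \frac{\sum d_i^2}{2} \qquad (2)$$
Since $\sum x_iy_i = \sum X_iY_i - \dfrac{\sum X_i \sum Y_i}{n}$, using (2), we obtain
$$\sum x_iy_i = \frac{n(n+1)(2n+1)}{6} - \frac{\sum d_i^2}{2} - \frac{n(n+1)^2}{4} = \frac{n(n^2-1)}{12} - \frac{\sum d_i^2}{2} \qquad (3)$$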
Now substituting the preceding equations in (1), you will get the answer.
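The result of this derivation, that the ordinary correlation coefficient computed on ranks equals 1 - 6Σd²/(n(n² - 1)), can be checked numerically. The sketch below uses a hypothetical pair of rankings (not data from the text) and two illustrative helper functions.

```python
from math import sqrt


def pearson(a, b):
    """Ordinary correlation coefficient computed from deviations."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    sum_ab = sum((p - a_bar) * (q - b_bar) for p, q in zip(a, b))
    sum_aa = sum((p - a_bar) ** 2 for p in a)
    sum_bb = sum((q - b_bar) ** 2 for q in b)
    return sum_ab / sqrt(sum_aa * sum_bb)


def spearman_from_d(rank_x, rank_y):
    """Rank correlation from the squared rank differences."""
    n = len(rank_x)
    sum_d2 = sum((p - q) ** 2 for p, q in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical rankings of n = 6 items (no ties).
rank_x = [1, 2, 3, 4, 5, 6]
rank_y = [2, 1, 4, 3, 6, 5]

r_pearson = pearson(rank_x, rank_y)
r_spearman = spearman_from_d(rank_x, rank_y)
print(r_pearson, r_spearman)  # the two values agree
```

The agreement relies on the ranks being the integers 1 to n without ties, which is what the sum-of-squares formulas in the derivation assume.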
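3.9 (a) $\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X}$ and $\hat{\alpha}_1 = \bar{Y} - \hat{\alpha}_2 \bar{x} = \bar{Y}$, since $\sum x_i = 0$ [Note: $x_i = (X_i - \bar{X})$].
$$\operatorname{var}(\hat{\beta}_1) = \frac{\sum X_i^2}{n\sum x_i^2}\,\sigma^2 \quad\text{and}\quad \operatorname{var}(\hat{\alpha}_1) = \frac{\sum x_i^2}{n\sum x_i^2}\,\sigma^2 = \frac{\sigma^2}{n}$$
(b) $\hat{\beta}_2 = \dfrac{\sum x_i y_i}{\sum x_i^2}$ and $\hat{\alpha}_2 = \dfrac{\sum x_i y_i}{\sum x_i^2}$, since $x_i = (X_i - \bar{X})$.
It is easy to verify that $\operatorname{var}(\hat{\beta}_2) = \operatorname{var}(\hat{\alpha}_2) = \dfrac{\sigma^2}{\sum x_i^2}$.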
That is, the estimates and variances of the two slope estimators are
the same.
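3.10 Since $\sum x_i = \sum y_i = 0$, that is, the sum of the deviations from the mean
value is zero, $\bar{x} = \bar{y} = 0$. Therefore $\hat{\beta}_1 = \bar{y} - \hat{\beta}_2 \bar{x} = 0$, and
$$\hat{\beta}_2 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{\sum x_i y_i}{\sum x_i^2},$$
since the means of the two variables are zero.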
$$r_2 = \frac{\sum z_i w_i}{\sqrt{\sum z_i^2 \sum w_i^2}} = \frac{ac\sum x_i y_i}{ac\sqrt{\sum x_i^2 \sum y_i^2}} = r_1 \text{ in Eq. (3.5.13)}$$
3.12 (a) True. Let a and c equal -1 and b and d equal 0 in Question 3.11.
(b) False. Again using Question 3.11, it will be negative.
(c) True. Since $r_{xy} = r_{yx} > 0$, $S_x$ and $S_y$ (the standard deviations of X
and Y, respectively) are both positive, and $r_{yx} = \beta_{yx}\dfrac{S_x}{S_y}$ and $r_{xy} = \beta_{xy}\dfrac{S_y}{S_x}$,
then $\beta_{xy}$ and $\beta_{yx}$ must be positive.
$$r = \frac{\sum z_i w_i}{\sqrt{\sum z_i^2 \sum w_i^2}} = \frac{\sum (x_1 + x_2)(x_2 + x_3)}{\sqrt{\sum (x_1 + x_2)^2 \sum (x_2 + x_3)^2}} = \frac{\sum x_2^2}{\sqrt{(\sum x_1^2 + \sum x_2^2)(\sum x_2^2 + \sum x_3^2)}},$$
because the X's are uncorrelated. Note: We have omitted the observation subscript for
convenience.
$$= \frac{\sigma^2}{\sqrt{(2\sigma^2)(2\sigma^2)}} = \frac{1}{2}, \text{ where } \sigma^2 \text{ is the common variance.}$$
The coefficient is not zero because, even though the X's are
individually uncorrelated, the pairwise combinations are not.
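The value of one half can be confirmed with covariance algebra alone, without simulating data. The sketch below assumes three uncorrelated variables with a common variance; the value `sigma2 = 4.0` is an arbitrary illustrative choice, and `quad` is a hypothetical helper that evaluates a'Σb.

```python
from math import sqrt

sigma2 = 4.0  # assumed common variance; any positive value gives the same answer

# Covariance matrix of (X1, X2, X3): uncorrelated, equal variances.
cov = [[sigma2 if i == j else 0.0 for j in range(3)] for i in range(3)]

a = [1, 1, 0]  # weights for X1 + X2
b = [0, 1, 1]  # weights for X2 + X3


def quad(u, matrix, v):
    """Bilinear form u' * matrix * v, i.e. cov(u'X, v'X)."""
    return sum(u[i] * matrix[i][j] * v[j] for i in range(3) for j in range(3))


corr = quad(a, cov, b) / sqrt(quad(a, cov, a) * quad(b, cov, b))
print(corr)
```

Only the shared term X2 contributes to the covariance, while each sum carries two variances, which is where the ratio of one half comes from.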
3.14 The residuals and fitted values of Y will not change. Let
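$Y_i = \beta_1 + \beta_2 X_i + u_i$ and $Y_i = \alpha_1 + \alpha_2 Z_i + u_i$, where $Z_i = 2X_i$.
Using the deviation form, we know that
$$\hat{\beta}_2 = \frac{\sum x_i y_i}{\sum x_i^2}; \qquad \hat{\alpha}_2 = \frac{\sum z_i y_i}{\sum z_i^2} = \frac{2\sum x_i y_i}{4\sum x_i^2} = \frac{1}{2}\hat{\beta}_2$$
$$\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X}; \qquad \hat{\alpha}_1 = \bar{Y} - \hat{\alpha}_2 \bar{Z} = \hat{\beta}_1 \quad (\text{Note: } \bar{Z} = 2\bar{X})$$
That is, the intercept term remains unaffected. As a result, the fitted
Y values and the residuals remain the same even if $X_i$ is multiplied
by 2. The analysis is analogous if a constant is added to $X_i$.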
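A quick numerical check of the rescaling argument is sketched below. The data are hypothetical and the helper `fit` is an illustrative name; it simply applies the two-variable OLS formulas.

```python
def fit(x_vals, y_vals):
    """Two-variable OLS; returns intercept, slope and fitted values."""
    n = len(x_vals)
    x_bar = sum(x_vals) / n
    y_bar = sum(y_vals) / n
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(x_vals, y_vals))
    sum_xx = sum((x - x_bar) ** 2 for x in x_vals)
    slope = sum_xy / sum_xx
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * x for x in x_vals]
    return intercept, slope, fitted


# Hypothetical data, used only to illustrate the algebra.
X = [1.0, 4.0, 5.0, 6.0]
Y = [4.0, 5.0, 7.0, 12.0]
Z = [2 * x for x in X]  # regressor multiplied by 2

b1, b2, fitted_x = fit(X, Y)
a1, a2, fitted_z = fit(Z, Y)

print(b1, a1)   # same intercept
print(b2, a2)   # slope on Z is half the slope on X
```

Because the fitted values coincide, the residuals coincide as well; only the slope changes, to compensate for the change in units.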
3.15 By definition,
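$$r^2_{y\hat{y}} = \frac{(\sum y_i \hat{y}_i)^2}{(\sum y_i^2)(\sum \hat{y}_i^2)} = \frac{[\sum (\hat{y}_i + \hat{u}_i)(\hat{y}_i)]^2}{(\sum y_i^2)(\sum \hat{y}_i^2)} = \frac{\sum \hat{y}_i^2}{\sum y_i^2} = \frac{\sum (\hat{\beta}_2 x_i)^2}{\sum y_i^2} = \frac{\hat{\beta}_2^2 \sum x_i^2}{\sum y_i^2} = r^2,$$
where the third equality uses $\sum \hat{y}_i \hat{u}_i = 0$.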
3.16 (a) False. The covariance can assume any value; its value depends
on the units of measurement. The correlation coefficient, on the
other hand, is unitless, that is, it is a pure number.
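$$\hat{\alpha} = \frac{\sum y_i \hat{y}_i}{\sum \hat{y}_i^2} = \frac{\hat{\beta}\sum x_i y_i}{\hat{\beta}^2 \sum x_i^2} = \frac{\hat{\beta}^2}{\hat{\beta}^2} = 1,$$
because $\hat{y}_i = \hat{\beta} x_i$ and $\sum x_i y_i = \hat{\beta}\sum x_i^2$ for the two-variable model. The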
intercept in this regression is zero.
with the only unknown parameter and set the resulting expression to
zero, to obtain:
$$\frac{d(\sum \hat{u}_i^2)}{d\hat{\beta}_1} = 2\sum (Y_i - \hat{\beta}_1)(-1) = 0$$
which on simplification gives $\hat{\beta}_1 = \bar{Y}$, that is, the sample mean. And
we know that the variance of the sample mean is $\dfrac{\sigma^2}{n}$, where n is the
sample size, and $\sigma^2$ is the variance of Y. The RSS is
Problems
$d^2$: 4 1 1 9 0 1 1 4 1 4; $\sum d^2 = 26$
Therefore, Spearman's rank correlation coefficient is
$$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)} = 1 - \frac{6(26)}{10(10^2 - 1)} = 0.842$$
Thus there is a high degree of correlation between the student's
midterm and final ranks. The higher is the rank on the midterm, the
higher is the rank on the final.
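The rank correlation above takes only a few lines to recompute from the squared rank differences listed in the answer; the variable names are illustrative.

```python
# Squared rank differences as listed in the answer.
d_squared = [4, 1, 1, 9, 0, 1, 1, 4, 1, 4]

n = len(d_squared)        # number of students
sum_d2 = sum(d_squared)   # total of the squared differences

r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
print(sum_d2, round(r_s, 3))  # 26 0.842
```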
3.19 (a) The slope value of -4.318 suggests that over the period 1980-
1994, for every unit increase in the relative price, on average, the
(GM/$) exchange rate declined by about 4.32 units. That is, the
dollar depreciated because it was getting fewer German marks for
every dollar exchanged. Literally interpreted, the intercept value of
6.682 means that if the relative price ratio were zero, a dollar would
exchange for 6.682 German marks. Of course, this interpretation
is not economically meaningful.
[Figure: two scatter plots with PRODBUS and PRODNFB, respectively, on the horizontal axis (40 to 120) and a vertical axis running from 20 to 120.]
CHAPTER 5
TWO-VARIABLE REGRESSION:
INTERVAL ESTIMATION AND HYPOTHESIS TESTING
Questions
5.1 (a) True. The t test is based on variables with a normal distribution.
Since the estimators of $\beta_1$ and $\beta_2$ are linear combinations of the
error $u_i$, which is assumed to be normally distributed under CLRM,
these estimators are also normally distributed.
(c) True. In this case Eq. (1) in App. 3A, Sec. 3A.1, will be
absent. This topic is discussed more fully in Chap. 6, Sec. 6.1.
(e) True. This follows from Eq. (1) of App. 3A, Sec. 3A.1.
(j) False. All we can say is that the data at hand does not permit
us to reject the null hypothesis.
we obtain: $r^2 = \dfrac{(9.6536)^2}{[(9.6536)^2 + 11]} = 0.8944$
5.4 Verbally, the hypothesis states that there is no correlation between
the two variables. Therefore, if we can show that the covariance
between the two variables is zero, then the correlation must be zero.
5.5 (a) Use the t test to test the hypothesis that the true slope coefficient
is one. That is, obtain:
$$t = \frac{\hat{\beta}_2 - 1}{se(\hat{\beta}_2)} = \frac{1.0598 - 1}{0.0728} = 0.821$$
For 238 df this t value is not significant even at $\alpha$ = 10%.
The conclusion is that over the sample period, IBM was
not a volatile security.
(b) Here $t = \dfrac{0.7264}{0.3001} = 2.4205$, which is significant at the two
percent level of significance. But it has little economic meaning.
Literally interpreted, the intercept value of about 0.73 means
that even if the market portfolio has zero return, the security's
return is 0.73 percent.
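Both test statistics in 5.5 follow the same pattern, (estimate minus hypothesized value) divided by the standard error. A minimal sketch, with `t_statistic` as an illustrative helper name and the estimates and standard errors taken from the answer:

```python
def t_statistic(estimate, hypothesized, std_error):
    """t ratio for testing H0: parameter = hypothesized value."""
    return (estimate - hypothesized) / std_error


# (a) slope tested against 1
t_slope = t_statistic(1.0598, 1.0, 0.0728)

# (b) intercept tested against 0
t_intercept = t_statistic(0.7264, 0.0, 0.3001)

print(round(t_slope, 3), round(t_intercept, 4))
```

The sketch stops at the t ratios; comparing them with critical values for 238 df still requires a t table or a statistics package.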
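$$t = \frac{\hat{\beta}_2}{se(\hat{\beta}_2)} = \frac{\hat{\beta}_2\sqrt{\sum x_i^2}}{\hat{\sigma}} = \frac{\hat{\beta}_2\sqrt{\sum x_i^2}}{\sqrt{\dfrac{\sum y_i^2 (1 - r^2)}{(n-2)}}}$$
because $\hat{\sigma}^2 = \dfrac{\sum \hat{u}_i^2}{(n-2)} = \dfrac{\sum y_i^2 (1 - r^2)}{(n-2)}$, from Eq. (3.5.10)
$$= \frac{\hat{\beta}_2\sqrt{\sum x_i^2}\,\sqrt{(n-2)}}{\sqrt{\sum y_i^2}\,\sqrt{(1 - r^2)}}$$
But since $r^2 = \hat{\beta}_2^2\,\dfrac{\sum x_i^2}{\sum y_i^2}$, then $r = \hat{\beta}_2\sqrt{\dfrac{\sum x_i^2}{\sum y_i^2}}$, from Eq. (3.5.6), so that $t = \dfrac{r\sqrt{n-2}}{\sqrt{1 - r^2}}$.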
CHAPTER 6
EXTENSIONS OF THE TWO-VARIABLE REGRESSION MODEL
6.1 True. Note that the usual OLS formula to estimate the intercept is
$\hat{\beta}_1$ = (mean of the regressand - $\hat{\beta}_2$ × mean of the regressor).
But when Y and X are in deviation form, their mean values are
always zero. Hence in this case the estimated intercept is also zero.
6.2 (a) & (b) In the first equation an intercept term is included.
Since the intercept in the first model is not statistically significant,
say at the 5% level, it may be dropped from the model.
(c) For each model, a one percentage point increase in the monthly
market rate of return leads on average to about a 0.76 percentage point
increase in the monthly rate of return on Texaco common stock over
the sample period.
(e) No, the two $r^2$s are not comparable. The $r^2$ of the interceptless
model is the raw $r^2$.
(c) As X tends to infinity, Y tends to $(1/\beta_1)$.
slope< 1
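$$\hat{\alpha}_2 = \frac{\sum x_i^* y_i^*}{\sum x_i^{*2}} = \frac{\sum (x_i / S_x)(y_i / S_y)}{\sum (x_i / S_x)^2} = \frac{\sum (x_i y_i) / S_x S_y}{\sum x_i^2 / S_x^2} = \frac{S_x}{S_y}\,\frac{\sum x_i y_i}{\sum x_i^2} = \frac{S_x}{S_y}\,\hat{\beta}_2$$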
This shows that the slope coefficient is not invariant to the
change of scale.
$= A + \alpha_2 \ln X_i + u_i$
where $A = (\alpha_1 + \alpha_2 \ln w_2 - \ln w_1)$
Comparing this with the second model, you will see that except
for the intercept terms, the two models are the same. Hence the
estimated slope coefficients in the two models will be the same, the
only difference being in the estimated intercepts.
6.8 The null hypothesis is that the true slope coefficient is 0.005. The
alternative hypothesis could be one- or two-sided. Suppose we
use the two-sided alternative. The estimated slope value is 0.00743.
Using the t test, we obtain:
6.11 As it stands, the model is not linear in the parameters. But consider
the following "trick." First take the ratio of Y to (1 - Y) and then take
the natural log of the ratio. This transformation will
make the model linear in the parameters. That is, run
the following regression:
$$\ln \frac{Y_i}{1 - Y_i} = \beta_1 + \beta_2 X_i$$
CHAPTER 7
MULTIPLE REGRESSION ANALYSIS: THE PROBLEM OF
ESTIMATION
7.2 Using the formulas given in the text, the regression results are as
follows:
7.4 Since we are told that $u_i \sim N(0, 4)$, generate, say, 25 observations
from a normal distribution with these parameters. Most computer
packages do this routinely. From these 25 observations, compute
the sample variance as
$$s^2 = \frac{\sum (X_i - \bar{X})^2}{24}$$
where $X_i$ = the observed value of $u_i$ in the
sample of 25 observations. Repeat this exercise, say, 99 more times,
for a total of 100 experiments. In all there will be 100 values of $s^2$.
Take the average of these 100 $s^2$ values. This average value should
be close to $\sigma^2 = 4$. Sometimes you may need more than 100
samples for the approximation to be good.
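The experiment described in 7.4 can be sketched with Python's standard library. The function names and the seed are illustrative assumptions; note that N(0, 4) means a variance of 4, so the standard deviation passed to the generator is 2.

```python
import random


def sample_variance(values):
    """Unbiased sample variance: sum of squared deviations over (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def average_sample_variance(n_experiments=100, n_obs=25, sigma=2.0, seed=12345):
    """Repeat the experiment and average the resulting s^2 values."""
    rng = random.Random(seed)  # fixed seed only so the run is repeatable
    variances = []
    for _ in range(n_experiments):
        draws = [rng.gauss(0.0, sigma) for _ in range(n_obs)]
        variances.append(sample_variance(draws))
    return sum(variances) / n_experiments, variances


avg_s2, all_s2 = average_sample_variance()
print(len(all_s2), avg_s2)  # 100 values of s^2; their average should be near 4
```

Raising `n_experiments` tightens the average around 4, which is the point of the closing remark about needing more than 100 samples.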
7.7 (a) No. An r-value cannot exceed 1 in absolute value. Plugging the
given data in Eq. (7.11.2), the reader can verify that
$r_{12.3} = 2.295$, which is logically impossible.
(b) Yes. Following the same procedure as in (a), the reader will
find that $r_{12.3} = 0.397$, which is possible.
(c) Yes, again it can be shown that $r_{12.3} = 0.880$, which is possible.
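The check in 7.7 uses the standard first-order partial correlation formula. In the sketch below the simple correlations are assumptions: they are not stated in this section and were chosen because they reproduce the three values quoted above, so they should be compared against the exercise before being relied on. The function name is also illustrative.

```python
from math import sqrt


def partial_r12_3(r12, r13, r23):
    """First-order partial correlation between variables 1 and 2, holding 3 fixed."""
    return (r12 - r13 * r23) / sqrt((1 - r13 ** 2) * (1 - r23 ** 2))


# ASSUMED inputs (r12, r13, r23) for parts (a), (b) and (c); not given in this section.
cases = {
    "a": (0.8, -0.2, 0.9),
    "b": (0.6, -0.5, -0.9),
    "c": (0.01, 0.66, -0.7),
}

results = {label: partial_r12_3(*values) for label, values in cases.items()}
for label, value in results.items():
    verdict = "impossible" if abs(value) > 1 else "possible"
    print(label, round(value, 3), verdict)
```

A value above 1 in absolute terms signals that the three simple correlations could not have come from one data set.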
7.8 If you leave out the years of experience ($X_3$) from the model, the
coefficient of education ($X_2$) will be biased, the nature of the bias
depending on the correlation between $X_2$ and $X_3$. The standard error,
the residual sum of squares, and $R^2$ will all be affected as a result
of this omission. This is an instance of the omitted variable bias.
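7.9 The slope coefficients in the double-log models give direct estimates
of the (constant) elasticity of the left-hand side variable with
respect to the right-hand side variable. Here:
$$\frac{\partial \ln Y}{\partial \ln X_2} = \frac{\partial Y / Y}{\partial X_2 / X_2} = \beta_2, \quad\text{and}\quad \frac{\partial \ln Y}{\partial \ln X_3} = \frac{\partial Y / Y}{\partial X_3 / X_3} = \beta_3$$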
7.10 (a) & (b) If you multiply X2 by 2, you can verify from Equations
(7.4.7) and (7.4.8) that the slope coefficient of X2 and its standard error
are halved, while the slope coefficient of X3 and the intercept remain
unaffected. On the other hand, if
you multiply Y by 2, the slopes as well as the intercept coefficients and
their standard errors are all multiplied by 2. Always keep in mind the units
in which the regressand and regressors are measured.
(d) No, because the regressands in the two models are different.
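$$\hat{\alpha}_2 = \frac{\sum z_i x_i}{\sum x_i^2} = \frac{\sum (x_i - y_i) x_i}{\sum x_i^2} = \frac{\sum x_i^2}{\sum x_i^2} - \frac{\sum x_i y_i}{\sum x_i^2} = 1 - \hat{\beta}_2$$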
That is, the slope in the regression of savings on income (i.e., the
marginal propensity to save) is one minus the slope in the regression
of consumption on income (i.e., the marginal propensity to
consume). Put differently, the sum of the two marginal propensities
is 1, as it should be in view of the identity that total income is equal
to total consumption expenditure and total savings. Incidentally,
note that $\hat{\alpha}_1 = -\hat{\beta}_1$.
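(b) Yes. The RSS for the consumption function is
$\sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i)^2$, and the RSS for the savings function is
$\sum (Z_i - \hat{\alpha}_1 - \hat{\alpha}_2 X_i)^2$.
Now substitute $(X_i - Y_i)$ for $Z_i$, $\hat{\alpha}_1 = -\hat{\beta}_1$ and $\hat{\alpha}_2 = (1 - \hat{\beta}_2)$
and verify that the two RSS are the same.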
(c) No, since the two regressands are not the same.
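The relationships between the consumption and savings regressions can be verified numerically. The income and consumption figures below are hypothetical, and `fit` is an illustrative helper; savings is defined by the identity income minus consumption.

```python
def fit(x_vals, y_vals):
    """Two-variable OLS; returns intercept, slope and residual sum of squares."""
    n = len(x_vals)
    x_bar = sum(x_vals) / n
    y_bar = sum(y_vals) / n
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(x_vals, y_vals))
    sum_xx = sum((x - x_bar) ** 2 for x in x_vals)
    slope = sum_xy / sum_xx
    intercept = y_bar - slope * x_bar
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(x_vals, y_vals))
    return intercept, slope, rss


# Hypothetical data, for illustration only.
income = [80.0, 100.0, 120.0, 140.0, 160.0]
consumption = [70.0, 65.0, 90.0, 95.0, 110.0]
savings = [x - y for x, y in zip(income, consumption)]  # identity: Z = X - Y

b1, b2, rss_consumption = fit(income, consumption)
a1, a2, rss_savings = fit(income, savings)

print(a2, 1 - b2)                    # marginal propensities sum to 1
print(a1, -b1)                       # intercepts are equal and opposite
print(rss_consumption, rss_savings)  # identical residual sums of squares
```

The residuals of the two regressions are equal in magnitude and opposite in sign, which is why the RSS values match even though the regressands differ.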
7.14 (a) As discussed in Sec. 6.9, to use the classical normal linear
regression model (CNLRM), we must assume that
$$\ln u_i \sim N(0, \sigma^2)$$
After estimating the Cobb-Douglas model, obtain the
residuals and subject them to a normality test, such as the Jarque-Bera
test.
CHAPTER 8
MULTIPLE REGRESSION ANALYSIS:
THE PROBLEM OF INFERENCE
8.1 (a) In the first model, where sales is a linear function of time, the
rate of change of sales, (dY/dt), is postulated to be a constant, equal
to $\beta_1$, regardless of time t. In the second model the rate of change is
not constant because (dY/dt) = $\alpha_1 + 2\alpha_2 t$, which depends on time t.
8.2
$$F = \frac{(ESS_{new} - ESS_{old}) / NR}{RSS_{new} / (n - k)} \qquad (8.5.16)$$
where NR = number of new regressors. Divide the numerator and
denominator by TSS and recall that $R^2 = \dfrac{ESS}{TSS}$ and $(1 - R^2) = \dfrac{RSS}{TSS}$.
Substituting these expressions into (8.5.16), you will obtain (8.5.18).
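The equivalence of the two forms of the F statistic can be illustrated numerically. The sums of squares, sample size and parameter count below are hypothetical, and the function names are illustrative; the R-squared version is the form obtained after dividing through by TSS.

```python
def f_from_ess(ess_new, ess_old, rss_new, n_new_regressors, n, k):
    """F statistic written in terms of explained and residual sums of squares."""
    return ((ess_new - ess_old) / n_new_regressors) / (rss_new / (n - k))


def f_from_r2(r2_new, r2_old, n_new_regressors, n, k):
    """Same F statistic after dividing numerator and denominator by TSS."""
    return ((r2_new - r2_old) / n_new_regressors) / ((1 - r2_new) / (n - k))


# Hypothetical values, for illustration only.
tss = 200.0
ess_old = 120.0
ess_new = 150.0
rss_new = tss - ess_new
n, k, n_new = 30, 4, 1

f_ess = f_from_ess(ess_new, ess_old, rss_new, n_new, n, k)
f_r2 = f_from_r2(ess_new / tss, ess_old / tss, n_new, n, k)
print(f_ess, f_r2)  # the two forms give the same value
```

The equivalence depends on both regressions sharing the same TSS, that is, the same regressand.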
$$R^2_{UR} = 1 - \frac{RSS_{UR}}{TSS} \geq R^2_{R} = 1 - \frac{RSS_{R}}{TSS}$$
Note that whether we use the restricted or unrestricted regression,
the TSS remains the same, as it is simply equal to $\sum_{i=1}^{n} (Y_i - \bar{Y})^2$.
8.5 (a) Let the coefficient of log K be $\beta^* = (\beta_2 + \beta_3 - 1)$. Test the null
hypothesis that $\beta^* = 0$, using the usual t test. If there are indeed
constant returns to scale, the t value will be small.