Lec Topic6
Fall, 2024
1 / 33
Effects of Data Scaling on OLS Statistics
2 / 33
Example
3 / 33
Chapter 6 Multiple Regression Analysis: Further Issues 187

Example

Table 6.1: Effects of Data Scaling

                        (1)        (2)         (3)
Dependent Variable      bwght      bwghtlbs    bwght

Independent Variables
cigs                    −.4634     −.0289      —
                        (.0916)    (.0057)
packs                   —          —           −9.268
                                               (1.832)
faminc                  .0927      .0058       .0927
                        (.0292)    (.0018)     (.0292)
intercept               116.974    7.3109      116.974
                        (1.049)    (.0656)     (1.049)

Observations            1,388      1,388       1,388
▶ What happens when we measure birth weight in pounds, rather than in
  ounces? Let bwghtlbs = bwght/16.

  The estimates of this equation, obtained using the data in BWGHT.RAW,
  are given in the first column of Table 6.1. Standard errors are listed
  in parentheses. The estimate on cigs says that if a woman smoked 5 more
  cigarettes per day, birth weight is predicted to be about
  .4634(5) = 2.317 ounces less. The t statistic on cigs is −5.06, so the
  variable is very statistically significant.

  Now, suppose that we decide to measure birth weight in pounds, rather
  than in ounces. Let bwghtlbs = bwght/16 be birth weight in pounds. What
  happens to our OLS statistics if we use this as the dependent variable
  in our equation? It is easy to find the effect on the coefficient
  estimates by simple manipulation of equation (6.1): divide the entire
  equation by 16.

▶ All the coefficients, standard errors, CIs, and the standard error of
  the regression (SER) are rescaled by 1/16; the SSR is rescaled by
  (1/16)²; t statistics, F statistics, and the R-squared remain
  unchanged.
4 / 33
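These invariance claims are easy to verify numerically. Below is a minimal sketch using simulated data in place of BWGHT.RAW; the data-generating coefficients are invented for illustration, and OLS is computed by plain least squares.

```python
import numpy as np

# Simulated stand-in for the BWGHT data (variable names from the slide;
# the data-generating coefficients here are invented for illustration).
rng = np.random.default_rng(0)
n = 1388
cigs = rng.poisson(2.0, n).astype(float)
faminc = rng.uniform(5, 65, n)
bwght = 117 - 0.46 * cigs + 0.09 * faminc + rng.normal(0, 20, n)

X = np.column_stack([np.ones(n), cigs, faminc])

def ols(y, X):
    """Return OLS coefficients via least squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

b_oz = ols(bwght, X)          # birth weight measured in ounces
b_lb = ols(bwght / 16.0, X)   # birth weight measured in pounds

# Every coefficient is rescaled by exactly 1/16.
print(np.allclose(b_lb, b_oz / 16.0))  # True
```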
Example
▶ Let us change cigs to packs. In particular, packs = cigs/20.

▶ All the statistics remain unchanged, except that the coefficient on
  the smoking variable, its standard error, and its CI are rescaled
  by 20 (column (3) of Table 6.1).
5 / 33
Beta Coefficients
6 / 33
Beta Coefficients
▶ Simple algebra gives the equation:

    yi − ȳ = β̂1 (xi1 − x̄1 ) + β̂2 (xi2 − x̄2 ) + · · · + β̂k (xik − x̄k ) + ûi

  Then, dividing through by σ̂y and multiplying and dividing each term
  by σ̂j :

    (yi − ȳ)/σ̂y = (σ̂1 /σ̂y )β̂1 [(xi1 − x̄1 )/σ̂1 ] + · · ·
                  + (σ̂k /σ̂y )β̂k [(xik − x̄k )/σ̂k ] + (ûi /σ̂y ),

  so the beta (standardized) coefficient on xj is b̂j = (σ̂j /σ̂y )β̂j .
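The link between an ordinary slope and its beta coefficient can be checked in a few lines; the sketch below uses simulated data (all numbers invented) and a simple one-regressor case.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x = rng.normal(0, 3, n)
y = 2.0 + 0.5 * x + rng.normal(0, 1, n)

# Ordinary slope from a simple regression of y on x.
beta = np.cov(x, y, bias=True)[0, 1] / np.var(x)

# Beta coefficient: the slope after z-scoring both variables.
zx = (x - x.mean()) / x.std()
zy = (y - y.mean()) / y.std()
beta_std = np.cov(zx, zy, bias=True)[0, 1] / np.var(zx)

# The two are linked by b = (sigma_x / sigma_y) * beta, as on the slide.
print(np.isclose(beta_std, (x.std() / y.std()) * beta))  # True
```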
7 / 33
Unit Change in Logarithmic Form
8 / 33
More on Functional Form
9 / 33
More on Using Logarithmic Functional Forms
▶ Logarithmic transformations have the convenient elasticity
interpretation.
▶ Slope coefficients of logged variables are invariant to rescaling.
▶ Taking logs often eliminates/mitigates problems with outliers.
Figure: log y < y and lim y = 0
10 / 33
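The rescaling-invariance of slopes on logged variables can be verified directly; a small sketch with simulated data (all numbers invented):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400
x = rng.uniform(1, 100, n)
y = np.exp(1.0 + 0.8 * np.log(x) + rng.normal(0, 0.2, n))

def slope(lx, ly):
    """Slope from a simple OLS regression of ly on lx."""
    return np.cov(lx, ly, bias=True)[0, 1] / np.var(lx)

b1 = slope(np.log(x), np.log(y))          # x in original units
b2 = slope(np.log(1000 * x), np.log(y))   # x rescaled by 1000

# log(1000 x) = log(1000) + log(x): only the intercept shifts,
# so the slope (the elasticity) is unchanged.
print(np.isclose(b1, b2))  # True
```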
More on Using Logarithmic Functional Forms
11 / 33
Models with Quadratics
▶ Example: Suppose the fitted regression line for the wage equation is
  quadratic in experience, so that

    ∂wage/∂exper = β̂1 + 2β̂2 exper = .298 − 2(.0061) exper.

▶ The first year of experience increases the wage by about $.298, the
  second year by .298 − 2(.0061)(1) = $.286 < $.298.
12 / 33
Models with Quadratics

  The part of the curve to the right of 24 can be ignored. The cost of
  using a quadratic to capture diminishing effects is that the quadratic
  must eventually turn around. If this point is beyond all but a small
  percentage of the people in the sample, then this is not of much
  concern.

Figure 6.1: Quadratic relationship between wage and exper
(the fitted curve rises from wage = 3.73 at exper = 0 to a maximum of 7.37)

▶ x∗ = β̂1 /(2β̂2 ) = .298/(2(.0061)) ≈ 24.4
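The turning-point arithmetic can be reproduced in a few lines, using the fitted coefficients from the slide (intercept 3.73 as in Figure 6.1):

```python
# Fitted wage equation from the slide / Figure 6.1:
# wage^ = 3.73 + .298 exper - .0061 exper^2
b0, b1, b2 = 3.73, 0.298, -0.0061

def marginal_effect(exper):
    """d(wage)/d(exper) = b1 + 2*b2*exper."""
    return b1 + 2 * b2 * exper

# Turning point x* = b1 / (2 |b2|).
x_star = -b1 / (2 * b2)
wage_max = b0 + b1 * x_star + b2 * x_star ** 2

print(round(x_star, 1))              # 24.4
print(round(marginal_effect(1), 3))  # 0.286 -- the second year adds less than .298
print(round(wage_max, 2))            # 7.37 -- the peak wage shown in Figure 6.1
```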
Example: Effects of Pollution on Housing Prices

  ln(price)^ = 13.39 − .902 ln(nox) − .087 ln(dist)
               (.57)   (.115)        (.043)
             − .545 rooms + .062 rooms² − .048 stratio
               (.165)       (.013)        (.006)

  n = 506, R² = .603
14 / 33
Example: Effects of Pollution on Housing Prices
  ∂ln(price)/∂rooms = (∂price/price)/∂rooms = −.545 + 2(.062) rooms

Figure 6.2: log(price) as a quadratic function of rooms

▶ x∗ = .545/(2(.062)) ≈ 4.4

  %Δprice ≈ 100{[−.545 + 2(.062) rooms]}Δrooms
          = (−54.5 + 12.4 rooms)Δrooms.
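The percentage-change formula above can be evaluated directly; a minimal sketch using only the two rooms coefficients from the fitted model:

```python
# Quadratic part of the fitted housing model from the slide:
# ln(price) = ... - .545 rooms + .062 rooms^2 + ...
b1, b2 = -0.545, 0.062

# Turning point of the quadratic.
x_star = -b1 / (2 * b2)

def pct_change_per_room(rooms):
    """Approximate % change in price from one more room: 100(b1 + 2*b2*rooms)."""
    return 100 * (b1 + 2 * b2 * rooms)

print(round(x_star, 1))                  # 4.4
print(round(pct_change_per_room(3), 1))  # -17.3: below x*, an extra room lowers price
print(round(pct_change_per_room(6), 1))  # 19.9: above x*, an extra room raises price
```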
15 / 33
Other Possibilities
  ln(price) = β0 + β1 ln(nox) + β2 ln(nox)² + β3 crime + β4 rooms
            + β5 rooms² + β6 stratio + u,

which implies

  ∂ln(price)/∂ln(nox) = %Δprice/%Δnox = β1 + 2β2 ln(nox).
16 / 33
Other Possibilities

[Figure: total cost (TC), total variable cost (TVC), and total fixed cost (TFC) curves]

Models with Interaction Terms
▶ In the model price = β0 + β1 sqrft + β2 bdrms + β3 sqrft · bdrms + u,

    ∂price/∂bdrms = β2 + β3 sqrft.
▶ The effect of the number of bedrooms depends on the level of
square footage.
▶ Interaction effects complicate the interpretation of parameters: β2 is
  the effect of the number of bedrooms, but only at a square footage of
  zero.
▶ How to avoid this interpretation difficulty?
18 / 33
Models with Interaction Terms
▶ The model
y = β0 + β1 x1 + β2 x2 + β3 x1 x2 + u
can be rewritten as
y = α0 + δ1 x1 + δ2 x2 + β3 (x1 − µ1 )(x2 − µ2 ) + u,
where µ1 = E(x1 ) and µ2 = E(x2 ) are the population means.
19 / 33
Models with Interaction Terms
▶ Now
    ∂y/∂x2 = δ2 + β3 (x1 − µ1 ),
i.e., δ2 is the effect of x2 if all other variables take on their mean
values.
▶ Advantages of reparametrization:
20 / 33
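The reparametrization above is easy to verify numerically; a minimal sketch with simulated data (the data-generating coefficients are invented), using sample means in place of the population means:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000
x1 = rng.uniform(0, 4, n)
x2 = rng.uniform(0, 4, n)
y = 1.0 + 0.5 * x1 + 0.7 * x2 + 0.3 * x1 * x2 + rng.normal(0, 0.5, n)

def ols(y, X):
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b

# Raw parametrization: the coefficient on x2 is the effect at x1 = 0.
X_raw = np.column_stack([np.ones(n), x1, x2, x1 * x2])
b_raw = ols(y, X_raw)

# Centered parametrization: the interaction uses demeaned variables, so
# delta_2 (the coefficient on x2) is the effect of x2 at x1 = mean(x1).
m1, m2 = x1.mean(), x2.mean()
X_c = np.column_stack([np.ones(n), x1, x2, (x1 - m1) * (x2 - m2)])
b_c = ols(y, X_c)

# delta_2 = beta_2 + beta_3 * mean(x1); the interaction term is unchanged.
print(np.isclose(b_c[2], b_raw[2] + b_raw[3] * m1))  # True
print(np.isclose(b_c[3], b_raw[3]))                  # True
```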
More on Goodness-of-Fit and Selection of
Regressors
21 / 33
Adjusted R-Squared
▶ Recall that

    R² = 1 − (SSR/n)/(SST/n) = 1 − σ̃u²/σ̃y² ,

  so R² is an estimator of the population R-squared, ρ² = 1 − σu²/σy² ,
  which is the proportion of the variation in y in the population
  explained by the independent variables.
▶ However, σ̃u2 and σ̃y2 are biased estimators for σu2 and σy2 .
22 / 33
Adjusted R-Squared
▶ Adjusted R-Squared:

    R̄² = 1 − [SSR/(n − k − 1)]/[SST/(n − 1)] = 1 − σ̂u²/σ̂y² ,

  where σ̂u² and σ̂y² are unbiased estimators of σu² and σy² , due to the
  degrees-of-freedom correction.

▶ R̄² imposes a penalty for adding new regressors: adding a regressor
  (k ↑) lowers R̄² unless the new variable's t statistic exceeds one in
  absolute value, |tβ̂ | > 1.
23 / 33
Adjusted R-Squared
Therefore, we have

    R̄² = 1 − (1 − R²)(n − 1)/(n − k − 1) < R²

unless k = 0 or R² = 1.

▶ Note that R̄² even gets negative if R² < k/(n − 1).
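Both the penalty and the possibility of a negative R̄² can be seen in a short simulation; the sketch below regresses pure noise on irrelevant regressors (all data invented):

```python
import numpy as np

def r2_stats(y, X):
    """R^2 and adjusted R^2 for an OLS fit of y on X (X includes a constant)."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    ssr = resid @ resid
    sst = ((y - y.mean()) ** 2).sum()
    n, k = X.shape[0], X.shape[1] - 1  # k slope coefficients
    r2 = 1 - ssr / sst
    r2_adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return r2, r2_adj

rng = np.random.default_rng(4)
n = 30
y = rng.normal(size=n)
X = np.column_stack([np.ones(n)] + [rng.normal(size=n) for _ in range(10)])

r2, r2_adj = r2_stats(y, X)
# With 10 pure-noise regressors, R^2 is positive by construction, but the
# adjusted R^2 is always smaller and goes negative exactly when R^2 < k/(n-1).
print(r2 > r2_adj)                          # True
print((r2_adj < 0) == (r2 < 10 / (n - 1)))  # True: the slide's condition
```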
24 / 33
Relationship Between R̄² and R²

Figure: Relationship Between R̄² and R²
25 / 33
Using Adjusted R² to Choose between Nonnested Models
26 / 33
Comparing Models with Different Dependent Variables
and
  lsalary^ = 4.36 + .275 lsales + .0179 roe
             (0.29)  (.033)      (0.0040)

  n = 209, R² = .282, R̄² = .275, SST = 66.72
27 / 33
Controlling for Too Many Factors in Regression Analysis
28 / 33
Controlling for Too Many Factors in Regression Analysis
29 / 33
Adding Regressors to Reduce the Error Variance
▶ Recall that

    Var(β̂j ) = σ²/[SSTj (1 − Rj²)].
– On the one hand, adding regressors may exacerbate
multicollinearity problems (Rj2 ↑)
– On the other hand, adding regressors reduces the error
variance (σ 2 ↓)
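The trade-off can be illustrated with a simulation; in the sketch below, the added regressor x2 is relevant and (nearly) uncorrelated with x1, so the error-variance effect dominates and the standard error of β̂1 falls (all numbers invented):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 2000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)  # relevant for y, drawn independently of x1
y = 1.0 + 0.8 * x1 + 1.5 * x2 + rng.normal(size=n)

def se_of_b1(y, X):
    """Conventional OLS standard error of the coefficient on column 1."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    sigma2 = resid @ resid / (n - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return np.sqrt(cov[1, 1])

se_short = se_of_b1(y, np.column_stack([np.ones(n), x1]))      # x2 omitted
se_long = se_of_b1(y, np.column_stack([np.ones(n), x1, x2]))   # x2 included

# Adding x2 barely raises R_j^2 (x1 and x2 are independent) but sharply
# reduces the estimated sigma^2, so Var(beta_1-hat) shrinks.
print(se_long < se_short)  # True
```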
30 / 33
Predicting y When ln(y ) is the Dependent Variable
▶ Note that

    ln(y) = β0 + β1 x1 + · · · + βk xk + u
    =⇒ y = exp(β0 + β1 x1 + · · · + βk xk ) exp(u) = m(x) exp(u)

▶ Under the additional assumption that u is independent of
  (x1 , · · · , xk ), we have

    ŷ = m̂(x)α̂0 ,

  where m̂(x) = exp(β̂0 + β̂1 x1 + · · · + β̂k xk ) and
  α̂0 = (1/n) Σⁿᵢ₌₁ exp(ûᵢ ).
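The retransformation above can be sketched on simulated data; the data-generating process below is invented, and α̂0 is the slide's correction factor computed from the residuals:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5000
x = rng.uniform(0, 2, n)
u = rng.normal(0, 0.6, n)  # independent of x, as the slide assumes
y = np.exp(0.5 + 1.2 * x + u)

# OLS of ln(y) on x.
X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
uhat = np.log(y) - X @ b

m_hat = np.exp(X @ b)         # naive retransformation exp(Xb)
alpha0 = np.exp(uhat).mean()  # alpha_0-hat = (1/n) sum exp(uhat_i)
y_hat = m_hat * alpha0

# exp(Xb) alone underpredicts the level of y; scaling by alpha0 fixes this.
print(alpha0 > 1)  # True: E[exp(u)] > 1 when u has mean zero and positive variance
print(abs(y_hat.mean() - y.mean()) < abs(m_hat.mean() - y.mean()))  # True
```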
31 / 33
Comparing R-Squared of a Logged and an Unlogged
Specification
and
  lsalary^ = 4.504 + .163 lsales + .109 lmktval + .0117 ceoten
             (.257)  (.039)       (.050)         (.0053)

  n = 177, R̃² = .318
32 / 33
About R̃ 2
▶ Recall that

    R² = Corr(y, ŷ)²,

  where Corr denotes the sample correlation.

▶ Since ỹ = m̂(x) and ŷ = α̂0 ỹ, and correlation is invariant to
  multiplication by a positive constant,

    Corr(y, ŷ) = Corr(y, α̂0 ỹ) = Corr(y, ỹ)

▶ As a result,

    R̃² = Corr(y, ỹ)²,
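The scale-invariance step can be checked directly; a minimal sketch with simulated data (all numbers invented), where ỹ is the unscaled level prediction from a log regression:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1000
x = rng.uniform(0, 2, n)
y = np.exp(0.5 + 1.0 * x + rng.normal(0, 0.5, n))

X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
y_tilde = np.exp(X @ b)  # m_hat(x), with no smearing factor applied

# R~^2: squared sample correlation between y and the level prediction.
corr = np.corrcoef(y, y_tilde)[0, 1]
r2_tilde = corr ** 2

# Scaling y_tilde by any positive constant leaves the correlation unchanged,
# so R~^2 does not depend on alpha0.
corr_scaled = np.corrcoef(y, 1.2 * y_tilde)[0, 1]
print(np.isclose(corr, corr_scaled))  # True
```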
33 / 33