Lecture 6.2 - Polynomial Regression

Polynomial Regression

Learning Objectives
• What is polynomial regression and when is it appropriate
• Contrast with other non-linear methods
• Interpret coefficients from quadratic models when X is centered and uncentered
• Test and describe the overall effect of predictor variable X
• Test and describe the linear and quadratic effects of X
• Test simple effects of X
• Use "calculus" to describe how the effect of X changes over the range of X
• Understand the conceptual link between polynomial regression and interactions
• Use calculus to describe an interaction effect
• Analyze the polynomial effect of X with an additive covariate
• Write up results and make a figure
How to Handle Non-Linear Effects
Power transformations can make simple monotone relationships more linear (Fig. A). Polynomial regression (or other transformations, e.g., the logit) is often needed for more complex relationships (Figs. B & C).
How to Handle Non-Linear Effects
• Simple monotonic relationships: Power transformations
• Polynomial Regression - Quantitative Variables
• [Polynomial Regression - Categorical Variables]
• Generalized Linear Models (e.g., logistic regression)
Non-Linear Effects in MR/GLM
Multiple regression/GLM is "linear in the regressors."
The predicted score is a linear combination of the regressors (X's) in the model.
Each regressor is multiplied by its coefficient, and these products are added together (plus the intercept/constant):

Y′ = b0 + b1X1 + b2X2 + …
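As a quick illustration (a sketch using R's built-in mtcars data, not the course data), the predicted scores from any lm() fit are literally the model matrix multiplied by the coefficient vector:

# "Linear in the regressors": Y' is the model matrix times the coefficients
m <- lm(mpg ~ wt + hp, data = mtcars)          # any fitted linear model
yhat <- model.matrix(m) %*% coef(m)            # b0 + b1*X1 + b2*X2 for each case
all.equal(as.vector(yhat), unname(fitted(m)))  # TRUE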
Linear vs. Logistic Regression
Multiple Regression:  Y′ = b0 + b1X1
Logistic Regression:  Y′ = e^(b0 + b1X1) / (1 + e^(b0 + b1X1))

Predicted values graphed for X1 = -50 to 50
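A minimal sketch of the two prediction curves, using made-up coefficients (b0 = 0, b1 = 0.2) since the slide does not give them:

x  <- seq(-50, 50, by = 1)
b0 <- 0; b1 <- 0.2
linear   <- b0 + b1 * x                                # unbounded straight line
logistic <- exp(b0 + b1 * x) / (1 + exp(b0 + b1 * x))  # S-curve bounded in (0, 1)
plot(x, logistic, type = "l", xlab = "X1", ylab = "Predicted Y")
abline(0.5, 0, lty = 3)                                # midpoint of the logistic curve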
Polynomial Regression
Y′ = A + BX + CX^2 + DX^3 + … + QX^(N-1)

It is important to distinguish between regressors in the model vs. variables of interest.
In this example there is only one variable of interest. The powers of X act as a structural set of regressors to allow for non-linear relationships between this variable of interest and Y.
However, the model is still linear in the regressors, i.e., the predicted score is a linear combination of the regressors multiplied by their parameter estimates.
Polynomial Regression Order
Y′ = A + BX + CX^2 + DX^3 + … + QX^(N-1)

If you include (N-1) regressors based on X, you will perfectly fit N data points.
The order of the equation is the highest power: (N-1) in this example.
X^(N-1) is the highest order regressor. All other regressors are lower order.
The highest order regressor determines the overall shape of the relationship across the range of -∞ to +∞.
Polynomial Regression Shape
Y′ = A + BX + CX^2 + DX^3 + … + QX^(N-1)

The highest order regressor determines the overall shape of the relationship across the range of -∞ to +∞.

Linear:     Y′ = A + BX                      Zero bends
Quadratic:  Y′ = A + BX + CX^2               One bend
Cubic:      Y′ = A + BX + CX^2 + DX^3        Two bends

There is one less bend than the order of the polynomial model.
Shape and Coefficient Sign
The sign of the coefficient for the highest order regressor determines the direction of the curvature.

Linear:     Y′ = 0 + 1X                 vs.  Y′ = 0 - 1X
Quadratic:  Y′ = 0 + 1X + 1X^2          vs.  Y′ = 0 + 1X - 1X^2
Cubic:      Y′ = 0 + 1X + 1X^2 + 1X^3   vs.  Y′ = 0 + 1X + 1X^2 - 1X^3
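A short sketch of this sign rule with curve(), plotting the quadratic from the slide with a positive vs. a negative highest order coefficient:

curve(0 + 1*x + 1*x^2, from = -5, to = 5, ylab = "Y'")          # opens upward
curve(0 + 1*x - 1*x^2, from = -5, to = 5, add = TRUE, lty = 2)  # opens downward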
How to Determine Order
You can fit order up to N-1, but you won't.
Theory should generally guide the order.
Social science theory rarely predicts higher than cubic (and typically not higher than quadratic).
Sometimes higher order models (quadratic, cubic) are implicated by the distribution of residuals.
Polynomial vs. Power Transformation of X
[Figures: Power transformations of X; Polynomial regression]
Example 1: Predicting Interest in Electives
How does number of electives taken in an area predict
interest in further electives?
varDescribe(dE)
          var   n  mean   sd median trimmed  mad  min   max range  skew kurtosis   se
Electives   1 100  8.17 3.09   8.00    8.15 2.97 1.00 17.00 16.00  0.14    -0.30 0.31
Interest    2 100 18.10 4.75  19.33   18.58 3.62 3.56 26.49 22.92 -0.96     0.74 0.47

cor(dE)
Electives Interest
Electives 1.0000000 0.7488801
Interest 0.7488801 1.0000000

Example 1: Predicting Interest in Electives
scatterplot(dE$Electives, dE$Interest,cex=1.5, lwd=2,xlab = 'Electives',
ylab='Interest', col='black')

Example 1: Predicting Interest in Electives
mLinear = lm(Interest ~ Electives, data=dE)
summary(mLinear)

Coefficients
Estimate SE t-statistic Pr(>|t|)
(Intercept) 8.7177 0.8968 9.721 4.87e-16 ***
Electives 1.1490 0.1027 11.187 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Sum of squared errors (SSE): 980.2, Error df: 98
R-squared: 0.5608
Example 1: Predicting Interest in Electives
modelAssumptions(mLinear, Type='normal')

Example 1: Predicting Interest in Electives
modelAssumptions(mLinear,Type='constant', one.page=FALSE)

Example 1: Predicting Interest in Electives
modelAssumptions(mLinear, Type='linear', one.page=FALSE)

Example 1: Predicting Interest in Electives
ASSESSMENT OF THE LINEAR MODEL ASSUMPTIONS
USING THE GLOBAL TEST ON 4 DEGREES-OF-FREEDOM:
Level of Significance = 0.05

Call:
gvlma(x = model)

Value p-value Decision


Global Stat 26.970200 2.016e-05 Assumptions NOT satisfied!
Skewness 0.005409 9.414e-01 Assumptions acceptable.
Kurtosis 0.475458 4.905e-01 Assumptions acceptable.
Link Function 25.445431 4.551e-07 Assumptions NOT satisfied!
Heteroscedasticity 1.043902 3.069e-01 Assumptions acceptable.

Example 1: Predicting Interest in Electives
mLinear = lm(Interest ~ Electives, data=dE)
summary(mLinear)

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 8.7177 0.8968 9.721 4.87e-16
Electives 1.1490 0.1027 11.187 < 2e-16

Residual standard error: 3.163 on 98 degrees of freedom


Multiple R-squared: 0.5608, Adjusted R-squared: 0.5563
F-statistic: 125.1 on 1 and 98 DF, p-value: < 2.2e-16

What are your concerns about interpreting this model?
The linearity assumption regarding the Electives effect is pretty clearly violated. When this assumption is violated, the parameter estimates are biased. In this case, you can expect that the model is underestimating the true magnitude of the Electives effect. This is a bad model!
How to estimate a polynomial (quadratic) model
dE$Electives2 = dE$Electives * dE$Electives
mQuad = lm(Interest ~ Electives + Electives2, data=dE)
summary(mQuad)
Coefficients
               Estimate     SE t-statistic          Pr(>|t|)
(Intercept)      1.0031 1.5503       0.647             0.519
Electives        3.2743 0.3800       8.617 0.000000000000129 ***
Electives2      -0.1266 0.0220      -5.754 0.000000101764858 ***
---
Sum of squared errors (SSE): 730.8, Error df: 97
R-squared: 0.6726

Equivalently, without creating a new variable:
mQuad = lm(Interest ~ Electives + I(Electives^2), data=dE)
summary(mQuad)
Coefficients
               Estimate     SE t-statistic          Pr(>|t|)
(Intercept)      1.0031 1.5503       0.647             0.519
Electives        3.2743 0.3800       8.617 0.000000000000129 ***
I(Electives^2)  -0.1266 0.0220      -5.754 0.000000101764858 ***
---
Sum of squared errors (SSE): 730.8, Error df: 97
R-squared: 0.6726
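A side note (an alternative workflow, not shown in the slides): R's poly() can build the polynomial regressors for you. With raw = TRUE it reproduces the coefficients above; the default orthogonal polynomials change the individual coefficients but not the overall fit (R-squared).

mQuadRaw  <- lm(Interest ~ poly(Electives, 2, raw = TRUE), data = dE)  # same fit as above
mQuadOrth <- lm(Interest ~ poly(Electives, 2), data = dE)              # orthogonal, uncorrelated regressors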
Compare linear and quadratic models
mQuad = lm(Interest ~ Electives + I(Electives^2), data=dE)
summary(mQuad)
Coefficients
               Estimate     SE t-statistic          Pr(>|t|)
(Intercept)      1.0031 1.5503       0.647             0.519
Electives        3.2743 0.3800       8.617 0.000000000000129 ***
I(Electives^2)  -0.1266 0.0220      -5.754 0.000000101764858 ***
---
Sum of squared errors (SSE): 730.8, Error df: 97
R-squared: 0.6726

mLinear = lm(Interest ~ Electives, data=dE)
summary(mLinear)
Coefficients
             Estimate     SE t-statistic Pr(>|t|)
(Intercept)    8.7177 0.8968       9.721 4.87e-16 ***
Electives      1.1490 0.1027      11.187  < 2e-16 ***
---
Sum of squared errors (SSE): 980.2, Error df: 98
R-squared: 0.5608
Example 1: Predicting Interest in Electives
Linear model:
Interest = 8.7 + 1.1*Electives

Quadratic model:
Interest = 1.0 + 3.3*Electives - 0.1*Electives^2
Example 1: Predicting Interest in Electives
[Figures omitted]
How to test the overall effect of a variable
mQuad = lm(Interest ~ Electives + I(Electives^2), data=dE)
summary(mQuad)
Coefficients
               Estimate     SE t-statistic          Pr(>|t|)
(Intercept)      1.0031 1.5503       0.647             0.519
Electives        3.2743 0.3800       8.617 0.000000000000129 ***
I(Electives^2)  -0.1266 0.0220      -5.754 0.000000101764858 ***
---
Sum of squared errors (SSE): 730.8, Error df: 97
R-squared: 0.6726

How do we get a test of the overall Electives effect in the polynomial (i.e., quadratic) model?
The test of the set of regressors that code for Electives (in this case, Electives and Electives^2) provides the overall test. We can use R^2 here because these are the only two regressors in the model. What if there were other control variables in the model?
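With control variables present, one sketch of the general answer is a model comparison, shown here with a hypothetical covariate Z (not in the actual dE data):

mCompact   <- lm(Interest ~ Z, data = dE)                              # covariates only
mAugmented <- lm(Interest ~ Z + Electives + I(Electives^2), data = dE)
anova(mCompact, mAugmented)   # 2-df F-test of the Electives set, controlling for Z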
Is the quadratic model better?
mQuad = lm(Interest ~ Electives + I(Electives^2), data=dE)
summary(mQuad)
Coefficients:
                Estimate Std. Error t value Pr(>|t|)
(Intercept)       1.0031     1.5503   0.647    0.519
Electives         3.2743     0.3800   8.617 1.29e-13
I(Electives^2)   -0.1266     0.0220  -5.754 1.02e-07

Residual standard error: 2.745 on 97 degrees of freedom
Multiple R-squared: 0.6726, Adjusted R-squared: 0.6658
F-statistic: 99.62 on 2 and 97 DF, p-value: < 2.2e-16

How can you test if the quadratic model is necessary?
Test if the quadratic model fits better than the linear model.
You can compare the augmented model with Electives and Electives^2 to the compact model with only Electives. This is, of course, equivalent to testing if the coefficient for Electives^2 is 0.
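In base R, this comparison can be run directly on the two models fit above (a sketch; the resulting F equals the square of the t for the Electives^2 coefficient because the models differ by one regressor):

anova(mLinear, mQuad)   # F(1, 97) comparing compact (linear) to augmented (quadratic)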
Is a higher order model necessary?
How can you test if a more complex model (e.g., cubic) is needed?
Fit the cubic and compare this augmented cubic model to the quadratic model, or test if the coefficient for the cubic term is 0.

mCubic = lm(Interest ~ Electives + I(Electives^2) + I(Electives^3), data=dE)
summary(mCubic)
Coefficients
                Estimate       SE t-statistic   Pr(>|t|)
(Intercept)    -2.581884 2.485665      -1.039    0.30155
Electives       4.937010 0.982495       5.025 0.00000233 ***
I(Electives^2) -0.342747 0.120023      -2.856    0.00526 **
I(Electives^3)  0.008285 0.004524       1.831    0.07015 .
---
Sum of squared errors (SSE): 706.1, Error df: 96
R-squared: 0.6836
Example 1: Interpreting the Coefficients
Linear model:
Interest = 8.7 + 1.1*Electives

Quadratic model:
Interest = 1.0 + 3.3*Electives - 0.1*Electives^2

In the polynomial model, the interpretation of b0 is unchanged, but its value will likely change. It is the predicted value when Electives = 0.

b1 is the linear effect of Electives at Electives = 0.
Example 1: Predicting Interest in Electives
Quadratic model:
Interest = 1.0 + 3.3*Electives - 0.1*Electives^2

In other words, 3.3 is the slope of the tangent line at Electives = 0.

In polynomial regression, the focus remains on the linear effect, but it quantifies the change in Y for a change in X at a specific point in the X distribution. The higher order terms tell us how the linear effect changes across the distribution of X.
Centering Predictors
How will model change if Electives is centered?
dE$cElectives = scale(dE$Electives,scale=FALSE)
mcQuad = lm(Interest ~ cElectives + I(cElectives^2), data=dE)
modelSummary(mcQuad)
Coefficients
Estimate SE t-statistic Pr(>|t|)
(Intercept) 19.30498 0.34473 56.000 < 2e-16 ***
cElectives 1.20599 0.08969 13.446 < 2e-16 ***
I(cElectives^2) -0.12658 0.02200 -5.754 0.000000102 ***
---

Sum of squared errors (SSE): 730.8, Error df: 97
R-squared: 0.6726

The overall model fit (i.e., the overall effect of Electives) will remain the same, but the coefficients will change.
b for cElectives is the linear effect at the mean of Electives.
Centering Predictors
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 19.30498 0.34473 56.000 < 2e-16
cElectives 1.20599 0.08969 13.446 < 2e-16
I(cElectives^2) -0.12658 0.02200 -5.754 1.02e-07

Details
Regressors in polynomial regression will be correlated unless X is centered and perfectly symmetric (or, for categorical polynomial regression, unless orthogonal coefficients and equal N are used).

Therefore, the current interpretation of the higher order regressors applies only if all lower order regressors are partialled (i.e., included in the model).

The lower order regressors become "simple effects," e.g., the linear effect of X at X = 0.

As with interactions, you can get the simple (linear) effect of the variable at any point along its distribution by re-centering the variable on that raw score and fitting a new model (see the sketch below).
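For example, a minimal sketch of re-centering Electives at a raw score of 10 (an arbitrary value chosen for illustration):

dE$c10Electives <- dE$Electives - 10                              # 0 now corresponds to 10 electives
m10 <- lm(Interest ~ c10Electives + I(c10Electives^2), data = dE)
coef(m10)["c10Electives"]                                         # simple linear effect at Electives = 10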
Simple Effects (Slopes)
You can also derive the formula that describes how the simple slope changes across X using calculus (the partial derivative).

We want ∂Y/∂Xi at a specific value of Xi.

Use these three simple rules:
1. The partial derivative of a sum, with respect to Xi, equals the sum of the partial derivatives of the components of that sum.
2. The partial derivative of aXi^M, with respect to Xi, is aMXi^(M-1), where a can be either a constant or another variable (or some combination of both).
3. The partial derivative of a component of a sum, with respect to Xi, where that component does not contain Xi, is zero.
Simple Effects (Slopes)
1. The partial derivative of a sum, with respect to Xi, equals the sum of the partial derivatives of the components of that sum.
2. The partial derivative of aXi^M, with respect to Xi, is aMXi^(M-1), where a can be either a constant or another variable (or some combination of both).
3. The partial derivative of a component of a sum, with respect to Xi, where that component does not contain Xi, is zero.

Interest = 1.00 + 3.27*Electives - 0.13*Electives^2

What is the formula that describes the magnitude of the Electives effect on Interest (∂Interest/∂Electives) across the range of Electives?

∂Interest/∂Electives = 3.27 - 0.26*Electives
Simple Effects (Slopes)
Interest = 1.0 + 3.27*Electives - 0.13*Electives^2
∂Interest/∂Electives = 3.27 - 0.26*Electives
For Electives = 0:    ∂Interest/∂Electives = 3.27 - 0.26*(0) = 3.3
For Electives = 8.17: ∂Interest/∂Electives = 3.27 - 0.26*(8.17) = 1.2
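The same arithmetic as a small R sketch, pulling the coefficients from the fitted quadratic model:

b <- coef(mQuad)                                 # (Intercept), Electives, I(Electives^2)
simpleSlope <- function(x) b[2] + 2 * b[3] * x   # dInterest/dElectives = b1 + 2*b2*x
simpleSlope(0)                                   # 3.27 at Electives = 0
simpleSlope(8.17)                                # about 1.2 at the mean number of electives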
Conditional effect of Xi
1. The partial derivative of a sum, with respect to Xi, equals the sum of the partial derivatives of the components of that sum.
2. The partial derivative of aXi^M, with respect to Xi, is aMXi^(M-1), where a can be either a constant or another variable (or some combination of both).
3. The partial derivative of a component of a sum, with respect to Xi, where that component does not contain Xi, is zero.

Y = 2 + 3X1

What is the formula that describes the magnitude of the X1 effect on Y (∂Y/∂X1), and why does this make sense?

∂Y/∂X1 = 3. This makes sense because this is a simple linear model, and the effect of X1 is the same across the whole range of X1 values.
Conditional effect of Xi
1. The partial derivative of a sum, with respect to Xi, equals the sum of the partial derivatives of the components of that sum.
2. The partial derivative of aXi^M, with respect to Xi, is aMXi^(M-1), where a can be either a constant or another variable (or some combination of both).
3. The partial derivative of a component of a sum, with respect to Xi, where that component does not contain Xi, is zero.

Y = 7 + 5X1 + 4X2

What is the formula that describes the magnitude of the X1 effect on Y (∂Y/∂X1) across the range of X1 scores, and why does this make sense?

∂Y/∂X1 = 5. This makes sense because this is an additive model, and the effect of X1 is the same across the whole range of X1 and X2 values. Similarly, ∂Y/∂X2 = 4.
Simple Effects (Slopes): Interactions!!!!
1. The partial derivative of a sum, with respect to Xi, equals the sum of the partial derivatives of the components of that sum.
2. The partial derivative of aXi^M, with respect to Xi, is aMXi^(M-1), where a can be either a constant or another variable (or some combination of both).
3. The partial derivative of a component of a sum, with respect to Xi, where that component does not contain Xi, is zero.

BC Intent = -1 + 6*Attitudes + 1*Peer Pressure - 1*ATTxPP

What is the formula that describes the magnitude of the Attitudes effect on BC Intent (∂BC Intent/∂Attitudes)?
∂BC Intent/∂Attitudes = 6 - 1*Peer Pressure
Simple Effects (Slopes): Interactions!!!!
1. The partial derivative of a sum, with respect to Xi, equals the sum of the partial derivatives of the components of that sum.
2. The partial derivative of aXi^M, with respect to Xi, is aMXi^(M-1), where a can be either a constant or another variable (or some combination of both).
3. The partial derivative of a component of a sum, with respect to Xi, where that component does not contain Xi, is zero.

BC Intent = -1 + 6*Attitudes + 1*Peer Pressure - 1*ATTxPP

What is the formula that describes the magnitude of the Peer Pressure effect on BC Intent (∂BC Intent/∂Peer Pressure)?
∂BC Intent/∂Peer Pressure = 1 - 1*Attitudes
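Both interaction-derived simple slopes as a sketch, using the made-up coefficients from these slides:

# BC Intent = -1 + 6*Attitudes + 1*PeerPressure - 1*Attitudes*PeerPressure
slopeAttitudes    <- function(pp)  6 - 1 * pp   # Attitudes effect at a given Peer Pressure
slopePeerPressure <- function(att) 1 - 1 * att  # Peer Pressure effect at a given Attitudes
slopeAttitudes(0)   # 6: Attitudes effect when Peer Pressure = 0
slopeAttitudes(6)   # 0: Attitudes effect vanishes when Peer Pressure = 6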
Sample Results Section
We regressed Interest on regressors that modeled the linear and quadratic effects of Number of Electives. Number of Electives was mean-centered in the primary analyses. We report raw regression coefficients (Bs) and partial eta-squared (ηp²), as appropriate, to quantify effect sizes.

[This paragraph is optional in some cases]
The overall effect of Number of Electives was significant, F(2,97) = 99.62, p < .0001, with Number of Electives accounting for 67.3% of the total variance in Interest.

The linear effect of Electives was significant, B = 1.2, ηp² = .651, t(97) = 13.45, p < .0001, indicating that taking an additional elective was associated with a 1.2-point increase in interest for participants who had already taken an average number of electives. However, the quadratic effect of Electives was also significant, B = -0.1, t(97) = -5.75, p < .0001, indicating that the magnitude of the Electives effect decreased by approximately 0.25 for every additional elective taken.

[This paragraph is optional in some cases]
Formal testing of simple effects of Electives across meaningful values for Electives indicated that the magnitude of the Electives effect was significant for participants who had taken no previous electives, B = 3.3, ηp² = .434, t(97) = 8.62, p < .0001. The effect of Electives was also significant for participants who had taken a low number of electives (i.e., mean - 1 SD; 5.1 electives), B = 2.0, ηp² = .582, t(97) = 11.63, p < .0001, and a high number of electives (i.e., mean + 1 SD; 11.3 electives), B = 0.4, ηp² = .072, t(97) = 2.73, p = .0074.
Multiple Regression with Non-Linear Effects
The non-linear effects of individual variables can be evaluated in models that contain other variables as well.

For example, let's examine the effects of Age and Weekly Training Miles on cross-country skiers' 5K race times. We might expect that there are diminishing returns with increasing weekly mileage, and at some point more miles may even hurt.

Describe the expected relationship between Miles and 5K times.
We would expect a non-linear relationship between Miles and 5K times. The relationship should generally be negative, with an increase in Miles leading to a decrease in 5K times. However, the magnitude of the decrease in 5K Time per additional Mile will not be constant; the magnitude of this effect will decrease across the distribution of Miles.

Why would we include Age as an additional predictor in this model (4 reasons)?
1. To simultaneously study Age effects in the same sample.
2. To use Age as a covariate to increase power to test the Miles effect.
3. To use Age as a covariate to examine the "unique" effect of Miles controlling for Age.
4. To test for an interaction between Age and Miles.
Multiple Regression with Non-Linear Effects
Describe how to test the non-linear (polynomial) additive effect of Miles on 5K Times, controlling for Age.
Estimate two models: compact and augmented.
1. The compact model includes only cAge (centered).
2. The augmented model adds cMiles and cMiles^2 (centered).
3. The test of the coefficient for cMiles^2 indicates whether the effect of training miles on race times changes based on the number of miles skied.
4. The coefficient for cMiles indicates the effect of Miles for someone who skis an average number of miles.
5. Model comparison of the augmented to the compact model provides the test of the overall Miles effect. NOTE: There is no single coefficient to test; you must use the model comparison approach.
6. The test of the coefficient for cAge in the augmented model provides the test of the Age effect (controlling for Miles).
7. You can re-center Miles on other values to determine the linear effect across the range of scores, or quantify the effect from the model formula using the partial derivative.
Multiple Regression with Non-Linear Effects
mC = lm(Time ~ cAge, data=dMiles)
summary(mC)
Coefficients
            Estimate      SE t-statistic   Pr(>|t|)
(Intercept) 23.55225 0.52421      44.929    < 2e-16 ***
cAge         0.21677 0.04562       4.752 0.00000903 ***
---
Sum of squared errors (SSE): 1714.7, Error df: 78
R-squared: 0.2245

mA = lm(Time ~ cAge + cMiles + I(cMiles^2), data=dMiles)
summary(mA)
Coefficients
              Estimate       SE t-statistic     Pr(>|t|)
(Intercept) 22.037345 0.481184      45.798      < 2e-16 ***
cAge         0.168096 0.027779       6.051 0.0000000505 ***
cMiles      -0.256454 0.023236     -11.037      < 2e-16 ***
I(cMiles^2)  0.008027 0.001927       4.165 0.0000814131 ***
---
Sum of squared errors (SSE): 603.4, Error df: 76
R-squared: 0.7271
Multiple Regression with Non-Linear Effects
modelCompare(mC,mA)
SSE (Compact) = 1714.72
SSE (Augmented) = 603.4482
PRE = 0.6480778
F(2,76) = 69.9784, p = 5.820688e-18

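The same comparison in base R (a sketch): anova() gives the F-test, and PRE is just the proportional reduction in SSE.

anova(mC, mA)                    # F(2, 76) for adding cMiles and cMiles^2
sseC <- sum(residuals(mC)^2)     # 1714.72
sseA <- sum(residuals(mA)^2)     # 603.45
(sseC - sseA) / sseC             # PRE = 0.648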
Multiple Regression with Non-Linear Effects
Describe and test the effect of Age on 5K Times, controlling for Miles.
This is obtained from the augmented model. Age has a significant positive effect on 5K race times, controlling for Weekly Miles, b = 0.17, ΔR² = 0.13, t(76) = 6.05, p < .001. For every one-year increase in Age, 5K race times increase by .17 minutes.

modelSummary(mA)
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) 22.037345   0.481184  45.798  < 2e-16
cAge         0.168096   0.027779   6.051 5.05e-08
cMiles      -0.256454   0.023236 -11.037  < 2e-16
I(cMiles^2)  0.008027   0.001927   4.165 8.14e-05

modelEffectSizes(mA)
Coefficients
                   SSR df pEta-sqr dR-sqr
(Intercept) 16654.1790  1   0.9650     NA
cAge          290.7416  1   0.3251 0.1315
cMiles        967.2142  1   0.6158 0.4374
I(cMiles^2)   137.7470  1   0.1858 0.0623

Sum of squared errors (SSE): 603.4
Sum of squared total (SST): 2211.1
Multiple Regression with Non-Linear Effects
Is there evidence that the effect of Miles is quadratic?
This is tested in the augmented model via the coefficient for Miles^2. This coefficient is significant, which indicates that it adds unique variance beyond the linear component. This also means that the size of the linear Miles effect changes based on Miles (note the conceptual link to interactions).

Coefficients
              Estimate       SE t-statistic     Pr(>|t|)
(Intercept) 22.037345 0.481184      45.798      < 2e-16 ***
cAge         0.168096 0.027779       6.051 0.0000000505 ***
cMiles      -0.256454 0.023236     -11.037      < 2e-16 ***
I(cMiles^2)  0.008027 0.001927       4.165 0.0000814131 ***
---
Sum of squared errors (SSE): 603.4, Error df: 76
R-squared: 0.7271
Multiple Regression with Non-Linear Effects
Describe and test the overall effect of Miles, controlling for Age.
This is tested via model comparison of the augmented (Age, Miles, Miles^2) vs. compact (Age only) models. There is a significant overall effect of Miles on 5K times, F(2,76) = 69.98, p < .001, with weekly mileage accounting for 64.8% of the variance in 5K times that remained unexplained after controlling for Age.

modelCompare(mC,mA)
SSE (Compact) = 1714.72
SSE (Augmented) = 603.4482
PRE = 0.6480778
F(2,76) = 69.9784, p = 5.820688e-18
Multiple Regression with Non-Linear Effects
What do you want to report to describe the Miles effect?
1. Size of the overall effect in variance terms and its significance test (last slide)
2. b (ΔR² or partial eta²?) and significance tests for the linear and quadratic terms in the centered model
3. Magnitude (and tests) of simple slopes
4. Overall form of the relationship between Miles and 5K times
Multiple Regression with Non-Linear Effects
Describe and test the "average" linear effect of Miles.
There was a significant negative linear effect of Miles for skiers with average weekly mileage, b = -0.26, ΔR² = 0.44, t(76) = -11.04, p < .001. A one-mile increase for skiers with average weekly mileage is associated with a .26-minute decrease in 5K times.

Coefficients
              Estimate       SE t-statistic     Pr(>|t|)
(Intercept) 22.037345 0.481184      45.798      < 2e-16 ***
cAge         0.168096 0.027779       6.051 0.0000000505 ***
cMiles      -0.256454 0.023236     -11.037      < 2e-16 ***
I(cMiles^2)  0.008027 0.001927       4.165 0.0000814131 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Sum of squared errors (SSE): 603.4, Error df: 76
R-squared: 0.7271
Multiple Regression with Non-Linear Effects
Report the B and significance test for the quadratic effect.
However, there was also a significant quadratic effect of Miles, b = 0.01, ΔR² = 0.06, t(76) = 4.17, p < .001. This indicates that for every one-mile increase in weekly mileage, the magnitude of the (negative) linear mileage effect decreases by about 0.02 (2 × 0.008).

Coefficients
              Estimate       SE t-statistic     Pr(>|t|)
(Intercept) 22.037345 0.481184      45.798      < 2e-16 ***
cAge         0.168096 0.027779       6.051 0.0000000505 ***
cMiles      -0.256454 0.023236     -11.037      < 2e-16 ***
I(cMiles^2)  0.008027 0.001927       4.165 0.0000814131 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Sum of squared errors (SSE): 603.4, Error df: 76
R-squared: 0.7271
Simple Effects (Slopes)
dMiles$hMiles = dMiles$cMiles - sd(dMiles$cMiles)
mHigh = lm(Time ~ cAge + hMiles + I(hMiles^2), data=dMiles)
Coefficients
              Estimate       SE t-statistic     Pr(>|t|)
(Intercept) 20.026014 0.449986      44.504      < 2e-16 ***
cAge         0.168096 0.027779       6.051 0.0000000505 ***
hMiles      -0.034521 0.058311      -0.592        0.556
I(hMiles^2)  0.008027 0.001927       4.165 0.0000814131 ***
---
Sum of squared errors (SSE): 603.4, Error df: 76
R-squared: 0.7271

dMiles$lMiles = dMiles$cMiles + sd(dMiles$cMiles)
mLow = lm(Time ~ cAge + lMiles + I(lMiles^2), data=dMiles)
Coefficients
              Estimate       SE t-statistic         Pr(>|t|)
(Intercept) 27.116838 0.449930      60.269          < 2e-16 ***
cAge         0.168096 0.027779       6.051 0.00000005048546 ***
lMiles      -0.478387 0.057948      -8.256 0.00000000000357 ***
I(lMiles^2)  0.008027 0.001927       4.165 0.00008141310687 ***
---
Sum of squared errors (SSE): 603.4, Error df: 76
R-squared: 0.7271
Simple Effects (Slopes)
Report and test the simple slopes.
As described earlier, for skiers with average weekly mileage, a one-mile increase is associated with a .26-minute decrease in 5K times (p < .001).

For skiers with low weekly mileage (i.e., 1 SD below the mean), a one-mile increase is associated with a .48-minute decrease in 5K times (p < .001).

For skiers with high weekly mileage (i.e., 1 SD above the mean), a one-mile increase is associated with a non-significant .03-minute decrease in 5K times (p = .556).
Simple Effects (Slopes)
mrA = lm(Time ~ cAge + Miles + I(Miles^2), data=dMiles)
summary(mrA)
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) 36.862826   1.550650  23.773  < 2e-16
cAge         0.168096   0.027779   6.051 5.05e-08
Miles       -0.736047   0.117271  -6.276 1.96e-08
I(Miles^2)   0.008027   0.001927   4.165 8.14e-05

Describe the mileage effect as a formula (∂5K Time/∂Miles):

5K Time = 36.9 + 0.17*cAge - 0.74*Miles + 0.01*Miles^2

∂5K Time/∂Miles = -0.74 + 0.02*Miles
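A sketch evaluating this derivative at a few weekly mileages, using the raw-Miles coefficients above:

b <- coef(mrA)                                  # (Intercept), cAge, Miles, I(Miles^2)
dTime <- function(miles) b["Miles"] + 2 * b["I(Miles^2)"] * miles
dTime(c(10, 30, 46))   # effect shrinks toward zero near 46 miles (0.736/0.016)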
Displaying the Mileage Effect
• Scatterplot with prediction line and error bands for mean Age (see the sketch below)
• Could graph multiple lines for different Ages (what would it look like?)
• Could remove the Age variance from the scatterplot points (how?)
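One sketch of the first option, using predict() with a confidence interval and holding Age at its mean (cAge = 0):

newX <- data.frame(cAge = 0,
                   cMiles = seq(min(dMiles$cMiles), max(dMiles$cMiles), length.out = 100))
pred <- predict(mA, newdata = newX, interval = "confidence")
plot(dMiles$cMiles, dMiles$Time, xlab = "Weekly Miles (centered)", ylab = "5K Time")
lines(newX$cMiles, pred[, "fit"], lwd = 2)   # prediction line at mean Age
lines(newX$cMiles, pred[, "lwr"], lty = 2)   # lower 95% confidence band
lines(newX$cMiles, pred[, "upr"], lty = 2)   # upper 95% confidence band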
Multiple Regression with Non-Linear Effects
What would have changed had we (1) not centered Age or (2) not centered Miles?

Not centering Age has no effect on the Age or Miles coefficients, because there are no higher order (interactive or non-linear) effects involving Age.

Not centering Miles will change the Miles coefficient, because Miles is involved in a higher order effect (i.e., the non-linear Miles^2 term). If not centered, the Miles coefficient is the simple effect at Miles = 0.

Not centering Miles has no effect on the Age coefficient, because Age is not in a higher order (interactive) effect with Miles.

The scale of both Age and Miles DOES affect the intercept (b0).
