Lecture 6.2 - Polynomial Regression
Learning Objectives
What is polynomial regression and when is it appropriate
Contrast with other non-linear methods
Interpret coefficients from quadratic models when X is
centered and uncentered
Test and describe overall effect of predictor variable X
Test and describe linear and quadratic effects of X
Test simple effects of X
Non-Linear Effects in MR/GLM
Y′ = b0 + b1X1 + b2X2 + … + bkXk
Linear vs. Logistic Regression
Multiple Regression: Y′ = b0 + b1X1
Logistic Regression: Y′ = e^(b0 + b1X1) / (1 + e^(b0 + b1X1))
Polynomial Regression
Y′ = A + BX + CX^2 + DX^3 + … + QX^(N-1)
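A minimal sketch of how polynomial terms are fit with lm() in R: wrap powers of the predictor in I(), or use poly() for orthogonal polynomial regressors. The names d (data frame), y (outcome), and x (predictor) are placeholders, not objects from this lecture.

m3 = lm(y ~ x + I(x^2) + I(x^3), data=d)   # raw polynomial terms via I()
mo = lm(y ~ poly(x, 3), data=d)            # orthogonal polynomial regressors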
Polynomial Regression Order
Y′ = A + BX + CX^2 + DX^3 + … + QX^(N-1)
Polynomial Regression Shape
Y′ = A + BX + CX^2 + DX^3 + … + QX^(N-1)
How to Determine Order
You can fit a polynomial of order up to N-1, but you won't; choose the lowest order that adequately captures the shape of the relationship (see the sketch below).
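One common way to choose the order empirically is to fit successively higher-order models and test each added term with nested model comparisons. A sketch with placeholder names d, y, and x, using base R's anova(), which parallels the model comparisons later in this lecture:

m1 = lm(y ~ x, data=d)
m2 = lm(y ~ x + I(x^2), data=d)
m3 = lm(y ~ x + I(x^2) + I(x^3), data=d)
anova(m1, m2, m3)   # retain the highest order whose added term is still significant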
Polynomial vs. Power Transformation of X
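The key distinction, sketched with placeholder names d, y, and x: a power transformation replaces X with a single transformed regressor, whereas polynomial regression keeps X and adds higher-order terms as additional regressors.

mPower = lm(y ~ I(x^2), data=d)       # power transformation: one transformed regressor
mPoly  = lm(y ~ x + I(x^2), data=d)   # polynomial regression: linear + quadratic regressors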
Example 1: Predicting Interest in Electives
How does number of electives taken in an area predict
interest in further electives?
varDescribe(dE)
          var   n  mean   sd median trimmed  mad  min   max range  skew kurtosis   se
Electives   1 100  8.17 3.09   8.00    8.15 2.97 1.00 17.00 16.00  0.14    -0.30 0.31
Interest    2 100 18.10 4.75  19.33   18.58 3.62 3.56 26.49 22.92 -0.96     0.74 0.47
cor(dE)
Electives Interest
Electives 1.0000000 0.7488801
Interest 0.7488801 1.0000000
Example 1: Predicting Interest in Electives
scatterplot(dE$Electives, dE$Interest,cex=1.5, lwd=2,xlab = 'Electives',
ylab='Interest', col='black')
Example 1: Predicting Interest in Electives
mLinear = lm(Interest ~ Electives, data=dE)
summary(mLinear)
Coefficients
Estimate SE t-statistic Pr(>|t|)
(Intercept) 8.7177 0.8968 9.721 4.87e-16 ***
Electives 1.1490 0.1027 11.187 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Example 1: Predicting Interest in Electives
modelAssumptions(mLinear, Type='normal')
Example 1: Predicting Interest in Electives
modelAssumptions(mLinear,Type='constant', one.page=FALSE)
Example 1: Predicting Interest in Electives
modelAssumptions(mLinear, Type='linear', one.page=FALSE)
Example 1: Predicting Interest in Electives
ASSESSMENT OF THE LINEAR MODEL ASSUMPTIONS
USING THE GLOBAL TEST ON 4 DEGREES-OF-FREEDOM:
Level of Significance = 0.05
Call:
gvlma(x = model)
Example 1: Predicting Interest in Electives
mLinear = lm(Interest ~ Electives, data=dE)
summary(mLinear)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 8.7177 0.8968 9.721 4.87e-16
Electives 1.1490 0.1027 11.187 < 2e-16
Quadratic model:
Interest = 1.0 + 3.3*Electives - 0.1*Electives^2
How to test overall effect of variable
mQuad = lm(Interest ~ Electives + I(Electives^2), data=dE)
summary(mQuad)
Coefficients
Estimate SE t-statistic Pr(>|t|)
(Intercept) 1.0031 1.5503 0.647 0.519
Electives 3.2743 0.3800 8.617 0.000000000000129 ***
I(Electives^2) -0.1266 0.0220 -5.754 0.000000101764858 ***
---
Cubic model (adds I(Electives^3)):
Coefficients
Estimate SE t-statistic Pr(>|t|)
(Intercept) -2.581884 2.485665 -1.039 0.30155
Electives 4.937010 0.982495 5.025 0.00000233 ***
I(Electives^2) -0.342747 0.120023 -2.856 0.00526 **
I(Electives^3) 0.008285 0.004524 1.831 0.07015 .
---
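The overall (linear + quadratic) effect of Electives can be tested by comparing the quadratic model to a compact model with no Electives terms. A sketch using base R's anova() with the mQuad model fit above; this parallels the modelCompare() calls used later in the lecture.

m0 = lm(Interest ~ 1, data=dE)   # compact model: no Electives regressors
anova(m0, mQuad)                 # 2-df F test of the overall Electives effect (linear + quadratic)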
Quadratic model: Interest = 1.0 + 3.3*Electives - 0.1*Electives^2
In a polynomial model, the interpretation of b0 is unchanged (the predicted value when Electives = 0), but its value will likely change.
In polynomial regression the focus remains on the linear effect, but it quantifies the change in Y for a change in X at a specific point in the X distribution. The higher-order terms tell us how the linear effect changes across the distribution of X.
Centering Predictors
How will model change if Electives is centered?
dE$cElectives = scale(dE$Electives,scale=FALSE)
mcQuad = lm(Interest ~ cElectives + I(cElectives^2), data=dE)
modelSummary(mcQuad)
Coefficients
Estimate SE t-statistic Pr(>|t|)
(Intercept) 19.30498 0.34473 56.000 < 2e-16 ***
cElectives 1.20599 0.08969 13.446 < 2e-16 ***
I(cElectives^2) -0.12658 0.02200 -5.754 0.000000102 ***
---
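A quick check (a sketch assuming mQuad and mcQuad from the previous slides): centering changes the coefficients but not the model itself, so the two fits give identical predictions and residuals.

all.equal(predict(mQuad), predict(mcQuad))           # TRUE: same fitted values
sum(residuals(mQuad)^2); sum(residuals(mcQuad)^2)    # identical SSE for both models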
Details
Regressor variables in polynomial regression will be correlated unless X is centered and perfectly symmetric (or, for categorical polynomial regression, orthogonal coefficients and equal n are used).
We want: ∂Y/∂Xi at a specific value of Xi
What is the formula that describes the magnitude of the Electives effect on Interest (∂Interest/∂Electives) across the range of Electives?
Simple Effects (Slopes)
Interest = 1.0 + 3.27*Electives - 0.13*Electives^2
∂Interest/∂Electives = 3.27 - 0.26*Electives
For Electives = 0: ∂Interest/∂Electives = 3.27 - 0.26*(0) = 3.3
For Electives = 8.17: ∂Interest/∂Electives = 3.27 - 0.26*(8.17) = 1.2
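These simple slopes can also be computed directly from the coefficients of the uncentered model mQuad; a sketch (the helper simpleSlope is an illustrative name, not from the lecture):

b = coef(mQuad)
simpleSlope = function(x) b["Electives"] + 2*b["I(Electives^2)"]*x
simpleSlope(0)      # about 3.27: slope for someone with 0 electives
simpleSlope(8.17)   # about 1.2: slope at the mean number of electives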
Conditional effect of Xi
1. The partial derivative of a sum, with respect to Xi, equals the sum of the partial derivatives of the components of that sum.
2. The partial derivative of aXi^M, with respect to Xi, is aMXi^(M-1), where a can be either a constant or another variable (or some combination of both).
3. The partial derivative of a component of a sum, with respect to Xi, where that component does not contain Xi, is zero.
Y = 2 + 3X1
∂Y/∂X1 = 3. This makes sense because this is a simple linear model and the effect of X1 is the same across the whole range of X1 values.
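Base R can apply these rules symbolically with D(); a small sketch (the names x1, x, b0, b1, b2 are placeholders):

D(expression(2 + 3*x1), "x1")            # 3: the constant slope from this example
D(expression(b0 + b1*x + b2*x^2), "x")   # the quadratic case: b1 + 2*b2*x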
Conditional effect of Xi
1. The partial derivative of a sum, with respect to Xi, equals the sum of the partial derivatives of the components of that sum.
2. The partial derivative of aXi^M, with respect to Xi, is aMXi^(M-1), where a can be either a constant or another variable (or some combination of both).
3. The partial derivative of a component of a sum, with respect to Xi, where that component does not contain Xi, is zero.
Y = 7 + 5X1 + 4X2
What is the formula that describes the magnitude of the X1 effect on Y (∂Y/∂X1) across the range of X1 scores, and why does this make sense?
∂Y/∂X1 = 5. This makes sense because this is an additive model and the effect of X1 is the same across the whole range of X1 and X2 values.
Similarly, ∂Y/∂X2 = 4.
Simple Effects (Slopes): Interactions!!!!
1. The partial derivative of a sum, with respect to Xi, equals the sum of the partial derivatives of the components of that sum.
2. The partial derivative of aXi^M, with respect to Xi, is aMXi^(M-1), where a can be either a constant or another variable (or some combination of both).
3. The partial derivative of a component of a sum, with respect to Xi, where that component does not contain Xi, is zero.
What is the formula that describes the magnitude of the Attitudes effect on BC Intent (∂BC Intent/∂Attitudes)?
∂BC Intent/∂Attitudes = 6 - 1*Peer Pressure
Simple Effects (Slopes): Interactions!!!!
1. The partial derivative of a sum, with respect to Xi, equals the sum of the partial derivatives of the components of that sum.
2. The partial derivative of aXi^M, with respect to Xi, is aMXi^(M-1), where a can be either a constant or another variable (or some combination of both).
3. The partial derivative of a component of a sum, with respect to Xi, where that component does not contain Xi, is zero.
What is the formula that describes the magnitude of the Peer Pressure effect on BC Intent (∂BC Intent/∂Peer Pressure)?
∂BC Intent/∂Peer Pressure = 1 - 1*Attitudes
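The model these two derivatives imply is not shown on these slides; a model consistent with both of them would be BC Intent = b0 + 6*Attitudes + 1*Peer Pressure - 1*Attitudes*Peer Pressure (an assumption for illustration). D() recovers both conditional effects from that assumed form:

D(expression(b0 + 6*att + 1*pp - 1*att*pp), "att")   # ∂BC Intent/∂Attitudes: 6 - 1*pp
D(expression(b0 + 6*att + 1*pp - 1*att*pp), "pp")    # ∂BC Intent/∂Peer Pressure: 1 - 1*att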
Sample Results Section
We regressed Interest on regressors that modeled the linear and quadratic effects of Number of Electives. Number of Electives was mean-centered in the primary analyses. We report raw regression coefficients (Bs) and partial eta² (ηp²), as appropriate, to quantify effect sizes.
The linear effect of Electives was significant, B = 1.2, ηp² = .651, t(97) = 13.45, p < .0001, indicating that taking an additional elective was associated with a 1.2-point increase in interest for participants who had already taken an average number of electives. However, the quadratic effect of Electives was also significant, B = -0.1, t(97) = 5.75, p < .0001, indicating that the magnitude of the Electives effect decreased by .25 for every additional elective taken.
For example, let's examine the effect of Age and Weekly Training Miles on cross-country skiers' 5K race times. We might expect that there are diminishing returns with increasing weekly mileage, and at some point more miles may even hurt.
Describe the expected relationship between Miles and 5K times.
We would expect a non-linear relationship between Miles and 5K times. The relationship should generally be negative, with an increase in miles leading to a decrease in 5K times. However, the magnitude of the decrease in 5K time per mile increase will not be constant; the magnitude of this effect will decrease across the distribution of Miles.
3. The test of the coefficient for cMiles^2 indicates whether the effect of training miles on race times changes based on the number of miles skied.
4. The coefficient for cMiles indicates the effect of miles for someone who skis an average number of miles.
6. The test of the coefficient for cAge in the augmented model provides a test of the Age effect (controlling for Miles).
7. Miles can be re-centered on other values to determine the linear effect across the range of scores. The effect can also be quantified via the model formula using the partial derivative (model set-up sketched below).
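A sketch of the set-up this plan implies, assuming a data frame dMiles with Time, Age, and Miles (the centered cAge and cMiles used on the following slides are created here):

dMiles$cAge = dMiles$Age - mean(dMiles$Age)          # mean-center Age
dMiles$cMiles = dMiles$Miles - mean(dMiles$Miles)    # mean-center Miles
mC = lm(Time ~ cAge, data=dMiles)                            # compact: Age only
mA = lm(Time ~ cAge + cMiles + I(cMiles^2), data=dMiles)     # augmented: adds linear + quadratic Miles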
Multiple Regression with Non-Linear Effects
mC = lm(Time ~ cAge, data=dMiles)
summary(mC)
Coefficients
Estimate SE t-statistic Pr(>|t|)
(Intercept) 23.55225 0.52421 44.929 < 2e-16 ***
cAge 0.21677 0.04562 4.752 0.00000903 ***
---
Multiple Regression with Non-Linear Effects
Describe and test the effect of Age on 5K times, controlling for Miles.
This is obtained from the augmented model. Age has a significant positive effect on 5K race times, controlling for weekly miles, b = 0.17, ΔR² = 0.13, t(76) = 6.05, p < .001. For every one-year increase in Age, 5K race times increase by .17 minutes.
modelSummary(mA)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 22.037345 0.481184 45.798 < 2e-16
cAge 0.168096 0.027779 6.051 5.05e-08
cMiles -0.256454 0.023236 -11.037 < 2e-16
I(cMiles^2) 0.008027 0.001927 4.165 8.14e-05
modelEffectSizes(mA)
Coefficients
SSR df pEta-sqr dR-sqr
(Intercept) 16654.1790 1 0.9650 NA
cAge 290.7416 1 0.3251 0.1315
cMiles 967.2142 1 0.6158 0.4374
I(cMiles^2) 137.7470 1 0.1858 0.0623
Sum of squared errors (SSE): 603.4
Sum of squared total (SST): 2211.1
Multiple Regression with Non-Linear Effects
Is there evidence that the effect of Miles is quadratic?
This is tested in the augmented model via the coefficient for Miles^2. This coefficient is significant, which indicates that it adds unique variance beyond the linear component. This also means that the size of the linear Miles effect changes based on Miles (conceptual link to interactions?).
Coefficients
Estimate SE t-statistic Pr(>|t|)
(Intercept) 22.037345 0.481184 45.798 < 2e-16 ***
cAge 0.168096 0.027779 6.051 0.0000000505 ***
cMiles -0.256454 0.023236 -11.037 < 2e-16 ***
I(cMiles^2) 0.008027 0.001927 4.165 0.0000814131 ***
---
Multiple Regression with Non-Linear Effects
Describe and test the overall effect of Miles, controlling for Age.
This is tested via model comparison for the augmented (Age, Miles, Miles^2) vs. compact (Age only) models. There is a significant overall effect of Miles on 5K times, F(2,76) = 69.98, p < .001, with weekly mileage accounting for 64.8% of the variance in 5K times that remained after controlling for Age.
modelCompare(mC,mA)
SSE (Compact) = 1714.72
SSE (Augmented) = 603.4482
PRE = 0.6480778
F(2,76) = 69.9784, p = 5.820688e-18
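The same comparison can be reproduced with base R (a sketch assuming mC and mA from the preceding slides): anova() gives the F test, and PRE is the proportional reduction in the error sum of squares.

anova(mC, mA)                    # F(2, 76) for the compact vs. augmented comparison
sseC = sum(residuals(mC)^2)
sseA = sum(residuals(mA)^2)
(sseC - sseA)/sseC               # PRE, about 0.648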
Multiple Regression with Non-Linear Effects
What do you want to report to describe the Miles effect?
1. Size of overall effect in variance terms and sig. test (last
slide)
2. b (ΔR² or partial eta²?) and sig. tests for the linear and quadratic terms in the centered model
3. Magnitude (and tests) of simple slopes?
4. Overall form of relationship between Miles and 5K times?
Multiple Regression with Non-Linear Effects
Describe and test the “average” linear effect of miles.
There was a significant negative linear effect of Miles for skiers with average weekly mileage, b = -0.26, ΔR² = 0.44, t(76) = 11.04, p < .001. A one-mile increase in weekly mileage for skiers with average weekly mileage is associated with a .26-minute decrease in 5K times.
Coefficients
Estimate SE t-statistic Pr(>|t|)
(Intercept) 22.037345 0.481184 45.798 < 2e-16 ***
cAge 0.168096 0.027779 6.051 0.0000000505 ***
cMiles -0.256454 0.023236 -11.037 < 2e-16 ***
I(cMiles^2) 0.008027 0.001927 4.165 0.0000814131 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Multiple Regression with Non-Linear Effects
Report B and sig test for quadratic effect.
However, there was also a significant quadratic effect for Miles, b = 0.01, ΔR² = 0.06, t(76) = 4.17, p < .001. This indicates that for every one-mile increase in weekly mileage, the magnitude of the (negative) linear mileage effect decreases.
Coefficients
Estimate SE t-statistic Pr(>|t|)
(Intercept) 22.037345 0.481184 45.798 < 2e-16 ***
cAge 0.168096 0.027779 6.051 0.0000000505 ***
cMiles -0.256454 0.023236 -11.037 < 2e-16 ***
I(cMiles^2) 0.008027 0.001927 4.165 0.0000814131 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Simple Effects (Slopes)
dMiles$hMiles = dMiles$cMiles - sd(dMiles$cMiles)
mHigh= lm(Time ~ cAge + hMiles + I(hMiles^2),data=dMiles)
Coefficients
Estimate SE t-statistic Pr(>|t|)
(Intercept) 20.026014 0.449986 44.504 < 2e-16 ***
cAge 0.168096 0.027779 6.051 0.0000000505 ***
hMiles -0.034521 0.058311 -0.592 0.556
I(hMiles^2) 0.008027 0.001927 4.165 0.0000814131 ***
---
For skiers with low weekly mileage (i.e., 1 SD below the mean), a one-mile increase is associated with a .48-minute decrease in 5K times (p < .001).
For skiers with high weekly mileage (i.e., 1 SD above the mean), a one-mile increase is associated with a non-significant .03-minute decrease in 5K times (p = .556).
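The low-mileage slope quoted above comes from re-centering Miles at 1 SD below the mean; a sketch paralleling the high-mileage model on this slide (lMiles and mLow are illustrative names):

dMiles$lMiles = dMiles$cMiles + sd(dMiles$cMiles)              # 0 now corresponds to 1 SD below the mean
mLow = lm(Time ~ cAge + lMiles + I(lMiles^2), data=dMiles)
summary(mLow)   # the lMiles coefficient is the simple slope for low weekly mileage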
Simple Effects (Slopes)
mrA= lm(Time ~ cAge + Miles + I(Miles^2),data=dMiles)
summary(mrA)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 36.862826 1.550650 23.773 < 2e-16
cAge 0.168096 0.027779 6.051 5.05e-08
Miles -0.736047 0.117271 -6.276 1.96e-08
I(Miles^2) 0.008027 0.001927 4.165 8.14e-05
Displaying the Mileage Effect
• Scatterplot with prediction line and error bands for mean Age (sketched below)
• Could graph multiple lines for different ages (what would it look like?)
• Could remove Age variance from the scatterplot points (how?)
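A sketch of the first bullet, assuming mA and dMiles from the earlier slides: predictions (with confidence bands) across the range of Miles at the mean Age (cAge = 0).

miles = seq(min(dMiles$cMiles), max(dMiles$cMiles), length.out=100)
preds = predict(mA, newdata=data.frame(cAge=0, cMiles=miles), interval='confidence')
plot(dMiles$cMiles, dMiles$Time, xlab='Miles (centered)', ylab='5K Time (minutes)')
lines(miles, preds[,'fit'], lwd=2)    # prediction line at mean Age
lines(miles, preds[,'lwr'], lty=2)    # lower confidence band
lines(miles, preds[,'upr'], lty=2)    # upper confidence band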
Multiple Regression with Non-Linear Effects
What would have changed had we (1) not centered Age or (2) not centered Miles?
Not centering Age has no effect on the Age or Miles coefficients, because there are no higher-order (interactive or non-linear) effects involving Age.
Not centering Miles will change the Miles coefficient because Miles is involved in a higher-order effect (i.e., the non-linear Miles^2 term). If Miles is not centered, its coefficient is the simple effect at Miles = 0.
Not centering Miles has no effect on the Age coefficient because Age is not in a higher-order (interactive) effect with Miles.