Comparing Two Regression Slopes by Means of An ANCOVA
For this example we want to determine how body size (snout-vent length) relates to pelvic canal width in both
male and female alligators (data from the Handbook of Biological Statistics). In this case, sex is a
categorical factor with two levels (male and female), snout-vent length is the regressor (x-var), and
pelvic canal width is the response variable (y-var). The ANCOVA will be used to assess whether the regression of
pelvic canal width on body size is comparable between the sexes.
Interpretation
An ANCOVA can test for differences in slopes and intercepts among regression lines. The two have
different biological interpretations. Differences in intercepts are interpreted as differences in magnitude but
not in the rate of change. If we are measuring sizes and the regression lines have the same slope but cross the
y-axis at different values, the lines are parallel. This means that growth is similar for both groups but one
group is simply larger than the other. A difference in slopes is interpreted as a difference in the rate of change.
In allometric studies, this means that growth rates differ significantly among groups.
Slopes should be tested first, by testing for the interaction between the covariate and the factor. If slopes are
significantly different between groups, then testing for different intercepts is somewhat inconsequential, since
it is very likely that the intercepts differ too (unless both lines go through zero). Additionally, if the interaction
is significant, testing for main effects is meaningless (see The Infamous Type III SS). If the interaction
between the covariate and the factor is not significantly different from zero, then we can assume the slopes
are similar between groups. In this case, we may proceed to test for differences in intercept values among
regression lines.
Performing an ANCOVA
For an ANCOVA our data should have a format very similar to that needed for an Analysis of Variance. We need
a categorical factor with two or more levels (e.g. the sex factor has two levels: male and female), at least one
continuous independent variable (the covariate, x-var) and one dependent or response variable (y-var).
> head(gator)
1 de 7 19/04/13 00:51
R in Ecology and Evolution: Comparing two regression slopes b... http://r-eco-evo.blogspot.com/2011/08/comparing-two-regression...
The preceding code shows the first six lines of the gator object which includes three variables: sex, snout
and pelvic, which hold the sex, snout-vent size and the pelvic canal width of alligators, respectively. The sex
variable is a factor with two levels, while the other two variables are numeric in their type.
We can do an ANCOVA both with the lm() and aov() commands. For this tutorial, we will use the aov()
command due to its simplicity.
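The model call itself is missing from this copy of the post. A minimal sketch of what such a call could look like follows. The formula is inferred from the description of the model (pelvic as response, snout as covariate, sex as factor, plus their interaction), and the data frame is a simulated stand-in with hypothetical values, not the real alligator measurements.

```r
# Simulated stand-in for the gator data frame (hypothetical values only)
set.seed(42)
gator <- data.frame(
  sex   = factor(rep(c("female", "male"), each = 18)),
  snout = runif(36, min = 0.8, max = 1.4)
)
gator$pelvic <- 6.6 * gator$snout +
  ifelse(gator$sex == "male", 0.5, 0) +
  rnorm(36, sd = 0.5)

# Full model: covariate, factor, and the covariate-by-factor interaction
mod1 <- aov(pelvic ~ snout * sex, data = gator)
summary(mod1)
```

The `snout:sex` row of the summary table is the test for a difference in slopes.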
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The previous code shows the ANCOVA model: pelvic is modeled as the dependent variable, with sex as the
factor and snout as the covariate. The summary of the results shows a significant effect of snout and sex, but no
significant interaction. These results suggest that the slope of the regression between snout-vent length and
pelvic width is similar for both males and females.
A second, more parsimonious model should be fit without the interaction to test for a significant difference in
the intercepts.
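A sketch of the reduced model, under the same assumptions as before: the name mod2 comes from the text, the formula is inferred from the description, and the data are simulated placeholders rather than the real measurements.

```r
# Simulated stand-in for the gator data frame (hypothetical values only)
set.seed(42)
gator <- data.frame(
  sex   = factor(rep(c("female", "male"), each = 18)),
  snout = runif(36, min = 0.8, max = 1.4)
)
gator$pelvic <- 6.6 * gator$snout +
  ifelse(gator$sex == "male", 0.5, 0) +
  rnorm(36, sd = 0.5)

# Reduced model: common slope for snout, separate intercepts for each sex
mod2 <- aov(pelvic ~ snout + sex, data = gator)
summary(mod2)
```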
The second model shows that sex has a significant effect on the dependent variable which in this case can be
interpreted as a significant difference in ‘intercepts’ between the regression lines of males and females. We
can compare mod1 and mod2 with the anova() command to assess if removing the interaction significantly
affects the fit of the model:
> anova(mod1,mod2)
Analysis of Variance Table
The anova() command clearly shows that removing the interaction does not significantly affect the fit of the
model (F=0.0129, p=0.91). Therefore, we may conclude that the most parsimonious model is mod2. Biologically
we observe that for alligators, body size has a significant and positive effect on pelvic width and the effect is
similar for males and females. However, we still do not know the slope and intercept values of the individual regression lines.
At this point we are going to fit linear regressions separately for males and females. In most cases, this should
have been performed before the ANCOVA. However, in this example we first tested for differences in the
regression lines and once we were certain of the significant effects we proceeded to fit regression lines.
To accomplish this, we are now going to subset the data frame into two sets, one for males and another for
females. We can do this with the subset() command or with the extraction operator []. We will use both in
the following code for didactic purposes:
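The subsetting code is not preserved in this copy. A minimal sketch of the two approaches follows; the object names males and females are hypothetical, and the data frame is simulated for illustration.

```r
# Simulated stand-in for the gator data frame (hypothetical values only)
set.seed(42)
gator <- data.frame(
  sex   = factor(rep(c("female", "male"), each = 18)),
  snout = runif(36, min = 0.8, max = 1.4)
)
gator$pelvic <- 6.6 * gator$snout +
  ifelse(gator$sex == "male", 0.5, 0) +
  rnorm(36, sd = 0.5)

# Option 1: subset() with a logical condition
males <- subset(gator, sex == "male")

# Option 2: logical indexing with [] (rows selected, all columns kept)
females <- gator[gator$sex == "female", ]
```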
Separate regression lines can also be fitted using the subset argument of the lm() command. However, we will
use separate data frames to simplify the creation of graphs:
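A sketch of both routes, assuming simulated data and hypothetical object names (males, females, reg.m, reg.f, reg.m.alt). The two ways of fitting the male regression give identical coefficients.

```r
# Simulated stand-in for the gator data frame (hypothetical values only)
set.seed(42)
gator <- data.frame(
  sex   = factor(rep(c("female", "male"), each = 18)),
  snout = runif(36, min = 0.8, max = 1.4)
)
gator$pelvic <- 6.6 * gator$snout +
  ifelse(gator$sex == "male", 0.5, 0) +
  rnorm(36, sd = 0.5)

males   <- subset(gator, sex == "male")
females <- subset(gator, sex == "female")

# One regression per sex, each fitted on its own data frame
reg.m <- lm(pelvic ~ snout, data = males)
reg.f <- lm(pelvic ~ snout, data = females)
summary(reg.m)
summary(reg.f)

# Alternative: the subset argument of lm() on the full data frame
reg.m.alt <- lm(pelvic ~ snout, data = gator, subset = sex == "male")
```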
Residuals:
Min 1Q Median 3Q Max
-0.85665 -0.40653 -0.08933 0.04518 1.57408
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.4527 0.9697 0.467 0.647
snout 6.5854 0.8625 7.636 6.85e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residuals:
Min 1Q Median 3Q Max
-0.69961 -0.19364 -0.07634 0.04907 1.15098
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.2199 0.9689 -0.227 0.824
snout 6.7471 0.9574 7.047 5.8e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
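The fitted regression lines indicate that males have a higher intercept (a = 0.4527) than females (a = -0.2199), which,
given the similar slopes, suggests that males have a wider pelvic canal than females of the same snout-vent length.
Note that neither intercept differs significantly from zero on its own (p = 0.647 and p = 0.824); the difference
between the sexes is tested by the ANCOVA, not by these separate fits. We can now plot both regression lines as follows: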
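The plotting code is not preserved in this copy. One possible way to draw such a plot with base graphics is sketched below; the data are simulated, and the object names, symbols and line types are arbitrary choices for illustration.

```r
# Simulated stand-in for the gator data frame (hypothetical values only)
set.seed(42)
gator <- data.frame(
  sex   = factor(rep(c("female", "male"), each = 18)),
  snout = runif(36, min = 0.8, max = 1.4)
)
gator$pelvic <- 6.6 * gator$snout +
  ifelse(gator$sex == "male", 0.5, 0) +
  rnorm(36, sd = 0.5)

males   <- subset(gator, sex == "male")
females <- subset(gator, sex == "female")
reg.m   <- lm(pelvic ~ snout, data = males)
reg.f   <- lm(pelvic ~ snout, data = females)

# Empty plot spanning the full data range, then points and lines per sex
plot(pelvic ~ snout, data = gator, type = "n",
     xlab = "Snout-vent length", ylab = "Pelvic canal width")
points(males$snout, males$pelvic, pch = 16)      # filled circles: males
points(females$snout, females$pelvic, pch = 1)   # open circles: females
abline(reg.m, lty = 1)                           # solid line: males
abline(reg.f, lty = 2)                           # dashed line: females
legend("bottomright", legend = c("Male", "Female"),
       pch = c(16, 1), lty = c(1, 2))
```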
The resulting plot shows the regression lines for males and females on the same plot.
Advanced
We can fit both regression models with a single call to the lm() command using the nested structure of snout
nested within sex (i.e. sex/snout) and removing the single intercept for the model so that separate
intercepts are fit for each equation.
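The call that produces reg.todo uses the formula shown in the summary output (pelvic ~ sex/snout - 1). The sketch below fits it on simulated stand-in data (hypothetical values) and checks that the coefficients match those of a separate regression for males.

```r
# Simulated stand-in for the gator data frame (hypothetical values only)
set.seed(42)
gator <- data.frame(
  sex   = factor(rep(c("female", "male"), each = 18)),
  snout = runif(36, min = 0.8, max = 1.4)
)
gator$pelvic <- 6.6 * gator$snout +
  ifelse(gator$sex == "male", 0.5, 0) +
  rnorm(36, sd = 0.5)

# snout nested within sex, no common intercept:
# one intercept and one slope per sex in a single model
reg.todo <- lm(pelvic ~ sex/snout - 1, data = gator)
coef(reg.todo)

# The male coefficients equal those of a regression fitted on males only
reg.m <- lm(pelvic ~ snout, data = gator, subset = sex == "male")
all.equal(unname(coef(reg.todo)[c("sexmale", "sexmale:snout")]),
          unname(coef(reg.m)))
```

The point estimates agree with the separate fits, but the standard errors differ because the single model pools the residual variance across both sexes.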
> summary(reg.todo)
Call:
lm(formula = pelvic ~ sex/snout - 1, data = gator)
Residuals:
Min 1Q Median 3Q Max
-0.85665 -0.33099 -0.08933 0.05774 1.57408
Coefficients:
Estimate Std. Error t value Pr(>|t|)
sexfemale -0.2199 1.2175 -0.181 0.858
sexmale 0.4527 0.8498 0.533 0.598
sexfemale:snout 6.7471 1.2031 5.608 3.76e-06 ***
sexmale:snout 6.5854 0.7558 8.713 7.73e-10 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.6209 on 31 degrees of freedom
Multiple R-squared: 0.9936, Adjusted R-squared: 0.9928
F-statistic: 1213 on 4 and 31 DF, p-value: < 2.2e-16
Conclusion
The results shown in the previous figure are consistent with our previous findings: pelvic width is positively
related to snout-vent length. The relationship is linear for both males and females. The regression slope is
positive and similar for both males and females (b ≈ 7.07; weighted average), which means that pelvic width
grows faster than snout-vent length. Finally, the regression line of males intercepts the y-axis at a higher
value than that of females, which means that, for a given snout-vent length, males have a wider pelvic canal.
24 comments:
tom said...
in your example, you showed an ancova where the interaction term is not significant (ie best model has
different intercepts but same slope)...yet you have plotted it with separate slopes. How would you plot it
with different intercepts but the same slope? thanks.
8/29/2011 11:17 PM
Campitor said...
Tom,
If you have estimated the common slope, then you can plot a different line for each group with different
intercepts and the same slope using the abline() command
abline(a,b)
where a is the intercept for males or females and b is the common slope.
8/30/2011 9:16 AM
James W said...
An interesting alternative to:
reg.todo <- lm(pelvic~sex/snout - 1, data=gator)
is this formula:
reg.todo <- lm(pelvic~snout*sex, data=gator)
Both models give the same fit (i.e. both fit a separate slope and intercept for M
and F gators) but there are two differences, one superficial, one really interesting. The superficial one is the
way the coefficients are presented. In your example, you get the true coefficients, in the second example
(snout*sex) you'll get the coefficients for the first factor level (females) and then the differences in the coefficients
for the male level. You can just add the male coefficients to the female ones so they're on the same scale.
The interesting thing in the summary of the second model is that the t score and p-value test different
hypotheses than they do in the first model. In the first one they test whether or not each regression has a
nonzero slope or intercept, in the second one, they test whether or not the slopes and intercepts of the two sexes
are significantly different! Nifty, right?
11/04/2011 7:31 PM
Campitor said...