Chapter 13
This chapter discusses the analysis of variance model for two categorical explanatory variables. In
particular, it discusses the case where one of the factors is a blocking variable. Two-way ANOVA
can be analyzed as a regression model with two categorical explanatory variables. Each categorical
variable is represented by a set of indicator variables.
• Each combination of levels from a set of factors comprises a treatment. If only one factor is
present, then the treatments are just the levels of that factor.
Blocking
Suppose we are measuring corn yield in a field experiment for 4 varieties (A, B, C, D). A square
field will be divided up into 16 plots and varieties randomly assigned to plots. Suppose also there is a
moisture gradient running East-West across the field. Random assignment of varieties to plots might,
by chance, end up assigning more of the plots for variety A to the East side of the field than the West
side and vice-versa for variety B. More importantly, if the moisture gradient is ignored in the analysis,
it produces large variation in yields within each variety; if we can adjust for the moisture gradient,
then we can more easily detect any difference between varieties.
• One way to adjust for the moisture gradient is to divide the field into 4 blocks of 4 plots each
from east to west and then randomly assign all 4 treatments within each block. The block factor
is then included in the analysis.
• If treatments are assigned randomly to plots within each block, then this is called a randomized
complete block design. (Note: the “complete” refers to the fact that every treatment appears in
every block).
• We are primarily interested in making inferences about the treatment variable in a randomized
block design. The blocking variable is included simply to make us better able to detect a
treatment effect – we’re not usually interested in testing for a block effect, because we assume
there is one – that’s why we blocked.
• Write the regression model for mean yield with only the main effects of Variety and Block.
How many coefficient parameters are there in the model?
• According to this model, what is the mean yield for each of the 16 Block-Variety combinations?
                                  Treatment differences
           A     B     C     D     A-D    B-D    C-D
Block 1
Block 2
Block 3
Block 4
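As a rough illustration of the main-effects model, the Python sketch below counts its coefficients and fills in a table of cell means. The coefficient values are made up for illustration only (they are not estimates from any data), and the choice of Block 4 and Variety D as reference levels is an assumption.

```python
# Hypothetical coefficients for illustration only; not estimated from data.
# Assumed reference levels: Block 4 and Variety D (their effects are 0).
intercept = 50.0
block_effect = {1: 6.0, 2: 4.0, 3: 1.0, 4: 0.0}             # 3 free parameters
variety_effect = {"A": 5.0, "B": 2.0, "C": -1.0, "D": 0.0}  # 3 free parameters

# Intercept + (blocks - 1) + (varieties - 1) coefficients.
n_params = 1 + (len(block_effect) - 1) + (len(variety_effect) - 1)


def cell_mean(block, variety):
    """Mean yield for one Block-Variety cell under the additive model."""
    return intercept + block_effect[block] + variety_effect[variety]


means = {(b, v): cell_mean(b, v) for b in block_effect for v in variety_effect}

# Treatment differences relative to Variety D, computed within each block.
diffs = {
    b: {v + "-D": means[(b, v)] - means[(b, "D")] for v in "ABC"}
    for b in block_effect
}

print("coefficients:", n_params)
for b in block_effect:
    row = [means[(b, v)] for v in "ABCD"]
    print("Block", b, row, diffs[b])
```

With no interaction term, the differences A-D, B-D and C-D come out the same in every block, which is another way of saying that the profiles for the blocks are parallel.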
What would a plot of mean yield versus variety look like, with each block having a separate line (see
the example on p. 383)? This is sometimes called a “profile plot.”
• What would the regression model be if we included the Block by Variety interaction? How
many coefficient parameters are there in the model?
• According to this model, what is the mean yield for each of the 16 Block-Variety combinations?
                                  Treatment differences
           A     B     C     D     A-D    B-D    C-D
Block 1
Block 2
Block 3
Block 4
• What would the profile plot look like?
• Note: we can’t estimate σ² in the model with Variety by Block interaction because there are no
within-cell replicates. This model has 16 coefficient parameters plus σ² and only 16
observations. Without an estimate of σ², we can’t carry out statistical inferences. Our choices
would be: a) don’t include the interaction, b) include replicates within each Variety by Block
combination. What are the advantages/disadvantages of each?
• Observations within each cell (a “cell” means a particular combination of levels of the two
factors) are independent observations from a normal distribution.
• The cell samples are drawn independently of each other (or there is random assignment to cells).
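The parameter counts in the note on σ² above can be tallied with a short Python sketch. The function name residual_df is a hypothetical helper written for this illustration; it only does the bookkeeping and fits no model.

```python
def residual_df(n_blocks, n_treatments, reps, interaction):
    """Return (observations, coefficients, residual d.f.) for a two-way layout.

    Counts the intercept, the indicator variables for each factor and,
    optionally, their products for the interaction.
    """
    n_obs = n_blocks * n_treatments * reps
    n_coef = 1 + (n_blocks - 1) + (n_treatments - 1)
    if interaction:
        n_coef += (n_blocks - 1) * (n_treatments - 1)
    return n_obs, n_coef, n_obs - n_coef


# Corn example: 4 blocks, 4 varieties, one plot per cell.
print(residual_df(4, 4, 1, interaction=True))   # no d.f. left to estimate sigma^2
print(residual_df(4, 4, 1, interaction=False))  # choice (a): drop the interaction
print(residual_df(4, 4, 2, interaction=True))   # choice (b): 2 replicates per cell
```

Choice (a) frees up residual degrees of freedom by assuming the interaction is zero; choice (b) keeps the interaction but doubles the number of plots.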
Case study 13.1: Intertidal Seaweed Grazers
8 blocks, 6 treatments, 2 replicates per block by treatment combination.
A convenient graphical representation for the data in a two-way classification is the one in Display
13.7 on p. 383, sometimes called a “profile plot.” In SPSS, such a plot can be gotten by:
Graphs…Line…Multiple; choose “Other summary function.” The default function is “Mean” which
is what is desired. You can also obtain the same plot from Graphs…Interactive…Line. This plot
illustrates differences between treatments, between blocks, and treatment by block interaction (if
there is no interaction the profiles are parallel).
The plot below has the blocks in numerical order. The plot on p. 383 has the blocks ordered from
smallest mean to largest. An advantage of the latter plot is that it makes it clear that there is more
variability in the means as the means increase (left to right). This suggests that nonconstant variance
might be a problem. To get a plot with the blocks in a different order in SPSS, we would create a
new variable with values 1 to 8 which indicates the desired order of the blocks (for example, 1 would
be for the block with the smallest mean and 8 for the largest). Then use value labels to indicate that
“1” is really “Block 1” and “8” is really “Block 4.”
[Figure: profile plot of mean Cover (vertical axis, 0 to 75) against Block (horizontal axis, Block 1
through Block 8), with one line per treatment (CONTROL, f, fF, L, Lf, LfF). Dots/lines show means.]
We should also examine the model assumptions through a residual analysis. To fit a two-way model,
you could create all the indicator variables necessary (7 for block and 5 for treatment plus the 35
products for interaction), and use the Regression procedure, but it’s much easier to use
Analyze…General Linear Model…Univariate. Block and Treatment are entered as “Fixed
factors.” Residuals can be saved under “Save.” The default model includes the interaction; the
model is specified under “Model.”
A residual plot (p. 384) confirms the suspicion of nonconstant variance. Since the responses are
percentages between 0 and 100, which can be converted to proportions between 0 and 1, it’s not too
surprising that the variance is not constant, since the variance of a binomial proportion is
p(1 − p)/n, which is not constant and is greatest for p = .5. The logit transformation is often
useful for proportions. If Y is a proportion, then
$$\mathrm{logit}(Y) = \ln\left(\frac{Y}{1 - Y}\right)$$
The quantity Y/(1 − Y) is called the odds, since it represents the odds of an event whose probability
is Y.
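A minimal Python sketch of the transformation, assuming the response is recorded as a percentage strictly between 0 and 100. The function name logit_from_percent is hypothetical, chosen for this illustration.

```python
import math


def logit_from_percent(cover_percent):
    """Convert a percentage to a proportion, then return its logit."""
    p = cover_percent / 100.0  # divide by 100 before taking the logit
    if not 0.0 < p < 1.0:
        # The logit is undefined at exactly 0% or 100%.
        raise ValueError("percentage must be strictly between 0 and 100")
    return math.log(p / (1.0 - p))


for pct in (5, 25, 50, 75, 95):
    print(pct, round(logit_from_percent(pct), 3))
```

The logit is 0 at 50% and symmetric about it, and it stretches values near 0% and 100% where the raw proportions are compressed.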
Remembering to divide Cover by 100 before taking the logit, the profile plot (p. 385) is much
“improved” and there is less evidence of interaction. A residual plot also indicates fewer problems:
Type III sum of squares (the default) indicates that the sum-of-squares for each effect is gotten by
comparing the full model (Treat + Block +Treat*Block) to the model without that effect in it; thus,
• for Treat*Block, compare the full model to the model Treat + Block
• for Treat, compare the full model to the model Block + Treat*Block
• for Block, compare the full model to the model Treat + Treat*Block
The latter two tests really make no sense since a model with Treat*Block, but not Treat, doesn’t make
sense. Therefore, only the test for the Treat*Block interaction makes sense. Unfortunately, some
people naively use these tests to test for the main effects. There are other options:
If the Treat*Block interaction is not significant, leave it out and refit the model Block + Treat. Then
the test of Treat makes sense (we’re not generally interested in the test of Block). We might also
carefully examine the profile plot, to make sure it’s reasonable to leave out Block*Treat even if it’s
not significant (not significant does not necessarily mean it’s zero).
If the Treat*Block interaction is significant, or if we simply want to be conservative and not assume it’s
zero, we can:
1. Test Treat by comparing the model Block to the full model Block+Treat+Block*Treat (this is
not discussed in the text but is advocated by some authors)
2. Realize that an interaction means that the effect of Treat is different in different blocks, and
examine the effect in each block separately. This makes more sense than number 1.
In the Seaweed Grazers example, since the interaction is not significant, and the profile plot indicates
treatment effects that are somewhat consistent across blocks, we might fit the model Block + Treat:
Tests of Between-Subjects Effects
• Note that the SS for Treat and Block have not changed, but that the SS for Error has. It has
become the SS for Error in the full model plus the SS for Treat*Block. The reason the SS for
Block and Treat are unchanged is that the design is balanced (equal sample sizes in all cells).
In unbalanced designs, the SS do change which makes the choice of an appropriate analysis
more important.
• There is very strong evidence (P<.0001) that there is a difference in the mean log regeneration
ratios among the treatments. There is also very strong evidence of a block effect, but that is
of less interest; we expected a block effect; that’s why blocking was used.
• The Block main effect should always be in the model even if it’s not statistically significant.
This is because we believe there is a block effect (that’s why we blocked) even if we don’t
find strong evidence of it in our particular experiment. It’s just as with paired data: we would
always use a paired t-test and wouldn’t ever use the two-sample t even if there didn’t appear
to be differences between the pairs of subjects.
• Pairwise comparisons of means for each factor, along with some specific types of contrasts, can
be obtained in a couple of different ways which are discussed further down.
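The first note above (in a balanced design, dropping the interaction moves its SS into Error and leaves the SS for Block and Treat alone) can be checked numerically. The sketch below uses a small made-up balanced data set, not the Seaweed Grazers data, and computes the sums of squares directly from the marginal and cell means.

```python
from itertools import product
from statistics import mean

# Hypothetical balanced data: data[(block, treat)] = replicate responses.
data = {
    (1, "a"): [4.0, 6.0], (1, "b"): [7.0, 9.0], (1, "c"): [10.0, 12.0],
    (2, "a"): [5.0, 9.0], (2, "b"): [12.0, 12.0], (2, "c"): [13.0, 19.0],
}
blocks = sorted({b for b, _ in data})
treats = sorted({t for _, t in data})
reps = 2  # observations per cell

grand = mean(y for ys in data.values() for y in ys)
block_mean = {b: mean(y for t in treats for y in data[(b, t)]) for b in blocks}
treat_mean = {t: mean(y for b in blocks for y in data[(b, t)]) for t in treats}
cell_mean = {k: mean(ys) for k, ys in data.items()}

# In a balanced design these formulas do not depend on whether the
# interaction is in the model, so SS(Block) and SS(Treat) do not change.
ss_block = reps * len(treats) * sum((block_mean[b] - grand) ** 2 for b in blocks)
ss_treat = reps * len(blocks) * sum((treat_mean[t] - grand) ** 2 for t in treats)
ss_inter = reps * sum(
    (cell_mean[(b, t)] - block_mean[b] - treat_mean[t] + grand) ** 2
    for b, t in product(blocks, treats)
)

# Error SS in the full model: deviations from the cell means.
ss_error_full = sum((y - cell_mean[k]) ** 2 for k, ys in data.items() for y in ys)

# Error SS in the additive model: in a balanced design the fitted value
# is block mean + treatment mean - grand mean.
ss_error_additive = sum(
    (y - (block_mean[b] + treat_mean[t] - grand)) ** 2
    for (b, t), ys in data.items() for y in ys
)

print(ss_error_additive, ss_error_full + ss_inter)
```

The two printed values agree up to rounding error. The shortcut used for the additive fitted values holds only for balanced data, which is why the SS change in unbalanced designs.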
To obtain SE’s for arbitrary linear combinations of cell means, as on p. 389, you must do the work by
hand using the formulas we learned in Chapter 6, pp. 154-7. The key is that the estimate of the
common cell standard deviation σ is √MSE from your final model. In the Seaweed Grazers
example, the final model is BLOCK + TREAT and √MSE = √.3586 = .599 with 83 d.f. This is used
for s_p in the formulas of Chapter 6.
Example: In number 1, p. 389, the text examines the effect of large fish on the regeneration ratio
through the following contrast in the treatment means (why is it a contrast?):
$$\gamma_1 = \frac{\mu_{fF} - \mu_{f}}{2} + \frac{\mu_{LfF} - \mu_{Lf}}{2}$$
This is estimated by the same function of the sample means. Note that the sample means are pooled
over blocks so the sample size for each mean is 16. Therefore,
$$\mathrm{SE}(g) = \sqrt{\mathrm{MSE}}\,\sqrt{\frac{(1/2)^2}{16} + \frac{(-1/2)^2}{16} + \frac{(1/2)^2}{16} + \frac{(-1/2)^2}{16}} = .599\sqrt{\frac{1}{16}} = .1497$$
Thus a 95% confidence interval for the contrast is -.614 ± t_83(.975)(.1497) = -.614 ± 1.989(.1497) =
-.614 ± .298 = -.912 to -.316.
Can you reproduce the estimates and standard errors for the remaining contrasts on p. 389?
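The hand calculation above can be sketched in Python as follows. The MSE, the contrast estimate and the t multiplier are copied from these notes rather than computed from the data, and the variable names are chosen for this illustration.

```python
import math

mse = 0.3586       # MSE from the model BLOCK + TREAT (83 d.f.), as quoted above
t_mult = 1.989     # t_83(.975), as quoted above
estimate = -0.614  # estimate of the contrast, as quoted above

# Contrast coefficients for the treatment means; they sum to zero.
coefs = {"fF": 0.5, "f": -0.5, "LfF": 0.5, "Lf": -0.5}
n_per_mean = 16    # each treatment mean pools 8 blocks x 2 replicates

# SE = sqrt(MSE) * sqrt(sum of c_i^2 / n_i)
se = math.sqrt(mse) * math.sqrt(sum(c ** 2 / n_per_mean for c in coefs.values()))
half_width = t_mult * se
ci = (estimate - half_width, estimate + half_width)

print(round(se, 4), round(ci[0], 3), round(ci[1], 3))
```

For another contrast, only coefs and estimate need to change; the sample size per mean stays at 16 as long as the means are pooled over all blocks.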
• Some pairwise comparisons are also available by choosing Options, putting Treatment under
Display Means for: and checking Compare main effects. However, only “LSD”,
“Bonferroni” and “Sidak” are available; “Tukey” is not. In addition, while this procedure will
give the exact same confidence intervals as Post Hoc (if the same procedure is chosen) for
balanced designs (same number of observations in every cell), it won’t when the design isn’t
balanced. Stick with Post Hoc and don’t mess with this procedure.
• Some other types of contrasts can be obtained through the Contrasts option on the General
Linear Models window. We can define a set of contrasts to be evaluated for each factor. The
choices are limited to the ones listed. Simple gives contrasts that compare each level with a
reference level (either the first or last level). Deviation gives the deviation of the mean of each
level from the overall mean. The deviations are what are displayed as the “Block effect” and
“Treatment effect” in the last column and row of Display 13.12. In the Seaweed experiment for
the Treatment variable, the deviation contrasts have the form
$$\mu_i - \bar{\mu}, \quad \text{where } \bar{\mu} = \frac{\mu_1 + \mu_2 + \dots + \mu_6}{6}$$
is the mean of all the treatment means.
• Note on profile plots: the General Linear Models window also has a Plots.. option which gives
“Profile” plots. However, it doesn’t plot the observed means, but the estimated means under the
fitted model. While it will give the “right” plot when a full factorial model has been selected
and the design is balanced, I recommend using Graphs…Line…Multiple as described earlier.
Pairwise comparisons in the presence of an interaction
• If the Block*Treat interaction is significant and appears important, comparison of treatment
means over all blocks may not be meaningful, since the treatment means are averages over all
blocks. A negative effect in one block could be offset by a positive effect in another block so
that there appears to be no effect when averaging over all blocks.
• The presence of an interaction doesn’t mean that averaging over blocks is necessarily
meaningless. If the treatment effects are in the same direction in all blocks, but simply differ
somewhat in size, then averaging over blocks may still be useful. Also, with large sample sizes,
the interaction may be statistically significant but small in size.
• If an interaction is present and is important, then you should compare means within blocks.
You can use the procedure for arbitrary linear combinations of cell means to do this (Section
13.3.4; also described on a previous page of these notes). Just remember that the sample sizes
are the sample sizes for the cells involved in the comparison (for example, in the Seaweed
experiment, to compare two treatments within block 1, the sample sizes are both 2). You can
see that the resulting SE’s will be much larger than if we could pool across blocks.
• It’s possible that some intermediate model may describe what’s going on with an interaction
present. For example, perhaps the treatment effects are very similar for all blocks but one, so
we might analyze those blocks together.
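To get a feel for how much larger the within-block SE’s are, the sketch below compares only the sample-size factor in the SE of a difference between two treatment means in the Seaweed layout. It is a rough comparison: it assumes the same estimate of σ in both cases, whereas the MSE and its d.f. would come from whichever model is actually fitted. The function name se_factor is hypothetical.

```python
import math


def se_factor(n1, n2):
    """sqrt(1/n1 + 1/n2): multiply by the pooled SD to get the SE of a difference."""
    return math.sqrt(1 / n1 + 1 / n2)


pooled = se_factor(16, 16)  # treatment means pooled over 8 blocks x 2 replicates
within = se_factor(2, 2)    # two cells within a single block, 2 replicates each
ratio = within / pooled

print(round(pooled, 3), round(within, 3), round(ratio, 2))
```

Under that assumption, a within-block comparison has an SE nearly three times that of the pooled comparison.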