MULTIVARIATE BEHAVIORAL RESEARCH, 40(3), 373–400

Copyright © 2005, Lawrence Erlbaum Associates, Inc.

Probing Interactions in Fixed and Multilevel Regression: Inferential and Graphical Techniques
Daniel J. Bauer and Patrick J. Curran
University of North Carolina at Chapel

Many important research hypotheses concern conditional relations in which the ef-
fect of one predictor varies with the value of another. Such relations are commonly
evaluated as multiplicative interactions and can be tested in both fixed- and ran-
dom-effects regression. Often, these interactive effects must be further probed to
fully explicate the nature of the conditional relation. The most common method for
probing interactions is to test simple slopes at specific levels of the predictors. A
more general method is the Johnson-Neyman (J-N) technique. This technique is not
widely used, however, because it is currently limited to categorical by continuous in-
teractions in fixed-effects regression and has yet to be extended to the broader class
of random-effects regression models. The goal of our article is to generalize the J-N
technique to allow for tests of a variety of interactions that arise in both fixed- and
random-effects regression. We review existing methods for probing interactions, ex-
plicate the analytic expressions needed to expand these tests to a wider set of condi-
tions, and demonstrate the advantages of the J-N technique relative to simple slopes
with three empirical examples.

This work was funded in part by fellowship DA06062 awarded to the first author and grant
DA13148 awarded to the second author. We would like to thank R. J. Wirth and the members of the
Carolina Structural Equations Modeling Group for their valuable input throughout this project.
Correspondence concerning this article should be addressed to Daniel J. Bauer, L.L. Thurstone
Psychometric Laboratory, Department of Psychology, University of North Carolina, Chapel Hill,
NC 27599–3270. E-mail: [email protected]

Substantive theories within psychology, education, and many other disciplines in
the social sciences are replete with instances in which the effect of one predictor on
the outcome is hypothesized to vary as a function of a second predictor. We will refer
to the first of these predictors as the focal predictor and the second as the moderator,
although this distinction is strictly theoretical. Hypotheses concerning moderated
relationships can be evaluated in the standard fixed-effects regression
model (e.g., Cohen, Cohen, West, & Aiken, 2003), or they can manifest in more
complicated ways in the multilevel (or random-effects) regression model (e.g.,
Raudenbush & Bryk, 2002). Regardless of the analytic approach, testing the
unique contribution of the product term net of the lower-order main effects provides
an omnibus test of the interaction effect (Baron & Kenny, 1986; Cohen, 1978).
More detailed information on the nature of the moderated relationship must then
be obtained through the use of techniques for probing the interaction term (e.g.,
Aiken & West, 1991).
For the standard fixed-effects regression model, the most popular method for
probing interactions is the “pick-a-point” approach (Rogosa, 1980). This approach
involves plotting and testing the conditional effect of the focal predictor at desig-
nated levels of the moderating variable (e.g., high, medium, and low), where these
conditional effect estimates are commonly referred to as “simple slopes” (Aiken &
West, 1991; Jaccard & Turrisi, 2003). Although widely used in fixed-effects regres-
sion, the consideration of simple slopes in random-effects regression has been lim-
ited. While Hox (2002, pp. 58–63) recommended the plotting of simple slopes to aid
in the interpretation of interactions in multilevel models, to our knowledge this ap-
proach has only been used in a small number of empirical applications and almost al-
ways as a descriptive rather than inferential device (e.g., Bryk & Raudenbush, 1987,
p. 154; Hussong, 2003; Willett, Singer & Martin, 1998, p. 423; Yip & Fuligni, 2002).
While the pick-a-point approach is informative and easy to use, typically only a
small number of specific values of the moderator are selected to evaluate the condi-
tional effect of the focal predictor on the outcome. These values are often selected
somewhat arbitrarily (Rogosa, 1980), and may even reside outside of the range of
the sample data (e.g., one standard deviation above the mean of a skewed predic-
tor). Interest may instead center on how the conditional effect of the focal predictor
changes across the entire range of a continuous moderating variable, rather than
just at a few values.
One such instance is when a grouping variable and continuous variable interact
in an ANCOVA or linear regression model. In the presence of such an interaction,
the group means cannot be compared by simply adjusting for differences on the
continuous covariate. An alternative approach is to evaluate the group mean differ-
ence at each level of the covariate with the goal of determining when the group
mean differences are and are not statistically different. The J-N technique was de-
veloped for this purpose (Johnson & Fay, 1950; Johnson & Neyman, 1936) and in-
volves the use of two closely related procedures for evaluating conditional effects,
the computation of regions of significance and confidence bands (Huitema, 1980).
Whereas regions of significance define the levels of the covariate for which the
group mean difference is significant, confidence bands convey the precision with
which the group mean difference is estimated at each level of the covariate.

There are two key advantages of the J-N technique. First, regions of signifi-
cance provide an inferential test for any possible simple slope of the focal predictor
variable. Second, the confidence bands graphically depict the precision of estima-
tion of the effect of the focal predictor over the full range of the moderator, provid-
ing a range of likely values for the conditional effect that narrows or widens de-
pending on the selected level of the moderator. The point where the confidence
bands are narrowest, also known as the center of accuracy, represents the value of
the moderator at which we have the most confidence estimating the conditional ef-
fect of the focal predictor on the outcome.
Despite these distinct advantages of the J-N technique, the use of regions of sig-
nificance and confidence bands for evaluating interactions remains extremely lim-
ited. One possible reason for this is that the J-N technique has not been generalized
to settings other than the traditional dichotomous by continuous interaction in the
ANCOVA model, even in such recent treatments as Cohen et al. (2003) and
Jaccard and Turrisi (2003). For instance, to our knowledge there are no published
applications in which this technique has been applied to evaluate continuous by
continuous interactions in standard regression models. Further, aside from some
preliminary developments by ourselves (Bauer, Curran, & Bollen, 2001; Curran,
Bauer, & Willoughby, 2004, in press) and Miyazaki (2002), we are aware of no
prior work that has extended these methods in a general way to the random-effects
or multilevel regression model (see Footnote 1). Our goal is to make these extensions here. We first
present our expansion of the J-N technique for the more familiar fixed-effects re-
gression model, then extend the use of this technique to the multilevel regression
model involving random effects. Empirical examples are provided throughout to
illustrate the use of these procedures.

PROBING INTERACTIONS
IN FIXED-EFFECTS REGRESSION

A simple fixed-effects regression model involving two predictors and their interac-
tion can be expressed in scalar notation as

$$y_i = \gamma_0 + \gamma_1 x_{1i} + \gamma_2 x_{2i} + \gamma_3 x_{1i} x_{2i} + \varepsilon_i. \tag{1}$$

The coefficient $\gamma_3$ represents the interaction, defined as the effect of the $x_1 x_2$
product variable on the dependent variable $y$ over and above the additive main effects of
the predictors $x_1$ and $x_2$ ($\gamma_1$ and $\gamma_2$, respectively). The residuals
$\varepsilon$ represent the portion of variability in $y$ that is unexplained by the predictors.

Footnote 1. Since the writing of this manuscript, related work on this topic has been published
by Tate (2004).

For this model, we can define the prediction equation to be

$$\mu_{y|x_1,x_2} = \gamma_0 + \gamma_1 x_1 + \gamma_2 x_2 + \gamma_3 x_1 x_2, \tag{2}$$

where the expected value of $y$ conditioned upon specific values of $x_1$ and $x_2$ is
denoted $\mu_{y|x_1,x_2}$. Finally, we can rearrange Equation 2 to highlight how the effect of
one predictor varies as a function of the other. Treating $x_1$ as the focal predictor and
$x_2$ as the moderator, we have

$$\mu_{y|x_1,x_2} = (\gamma_0 + \gamma_2 x_2) + (\gamma_1 + \gamma_3 x_2) x_1, \tag{3}$$

explicating that the $y$ on $x_1$ regression line varies as a function of $x_2$. We can then
designate the simple intercept and simple slope of this conditional relationship as
$\omega_0$ and $\omega_1$, respectively:

$$\omega_0 = \gamma_0 + \gamma_2 x_2, \tag{4}$$

$$\omega_1 = \gamma_1 + \gamma_3 x_2. \tag{5}$$

One important fact that Equations 4 and 5 highlight is that, in the presence of a
product interaction, the intercept and main effects of the regression model are
scale-dependent. For instance, Equation 5 shows that the main effect of the focal
predictor ($\gamma_1$) is the effect where the moderator is zero ($x_2 = 0$). Because zero is
often outside the logical range of the moderator, it is recommended to mean center or
standardize predictors involved in interactions so that the main effects represent
the effects of each predictor at the mean level of the other (see Jaccard, Turrisi, &
Wan, 1990, and Aiken & West, 1991, who also discuss computational advantages
of mean centering predictors). While this aids in the interpretation of the main effect,
we will typically also want to assess the effect of the focal predictor at levels
of the moderator other than zero to more fully explore the nature of the interaction.
The pick-a-point and J-N techniques provide alternative methods for accomplishing
this goal. For both approaches, the key is to obtain an estimate and standard error
for a parameter (e.g., $\omega_1$) that is a linear composite of other parameters (e.g.,
$\gamma_1$ and $\gamma_3$). We now show how to compute these values for the simple slope of the
focal predictor ($\omega_1$), as it is usually this value that is of most interest, though similar
procedures could also be applied to the simple intercept ($\omega_0$).
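
As a minimal sketch of Equations 4 and 5, the following Python snippet evaluates the simple
intercept and simple slope at chosen values of the moderator. The coefficient values are the
estimates the article reports in Table 1 for its empirical example (mean-centered predictors);
the function and constant names are illustrative assumptions, not part of any published software.

```python
# Coefficient estimates taken from Table 1 of the article (mean-centered predictors).
# The names G0..G3 are illustrative stand-ins for gamma_0..gamma_3.
G0, G1, G2, G3 = 38.07, 0.0373, -0.799, -0.397


def simple_intercept(x2):
    """Equation 4: omega_0 = gamma_0 + gamma_2 * x2."""
    return G0 + G2 * x2


def simple_slope(x2):
    """Equation 5: omega_1 = gamma_1 + gamma_3 * x2."""
    return G1 + G3 * x2


# Evaluate at one sd below the mean, the mean, and one sd above the mean
# of the centered moderator (sd = 1.54 in the empirical example).
for x2 in (-1.54, 0.0, 1.54):
    print(f"x2 = {x2:+.2f}  intercept = {simple_intercept(x2):.3f}  "
          f"slope = {simple_slope(x2):+.4f}")
```

At x2 = 0 the simple slope reduces to the main-effect estimate, which is why centering the
moderator makes that coefficient interpretable as the effect at the moderator's mean.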

Estimating Conditional Effects in Fixed-Effects Regression


In the general case, the fixed-effects regression model may be written as

$$\mathbf{y} = \mathbf{X}\boldsymbol{\gamma} + \boldsymbol{\varepsilon}, \tag{6}$$

where $\mathbf{y}$ is an $n \times 1$ criterion vector, $\mathbf{X}$ is the $n \times p$ design
matrix for the $p$ predictor variables (including a column vector of 1's to define the
intercept), $\boldsymbol{\gamma}$ is a $p \times 1$ vector of fixed regression parameters relating
each predictor to the criterion, and $\boldsymbol{\varepsilon}$ is an $n \times 1$
vector of random residuals. The residuals are assumed to be homoscedastic and
normally distributed [i.e., $\boldsymbol{\varepsilon} \sim N(\mathbf{0}, \sigma^2 \mathbf{I}_n)$].
In the specific case of the two-predictor interaction model given in Equation 1, the matrix
$\mathbf{X}$ would simply consist of a column of 1's, a column of observed values for $x_1$,
a column of observed values for $x_2$, and a column of values for the product of $x_1$ and $x_2$.
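
The layout of the design matrix for the two-predictor interaction model can be sketched as
follows. The data values and the helper name `design_row` are hypothetical and serve only to
show the column order (intercept, x1, x2, product).

```python
def design_row(x1, x2):
    """One row of X for Equation 1: [1, x1, x2, x1 * x2]."""
    return [1.0, x1, x2, x1 * x2]


# Hypothetical observations, used only to illustrate the structure of X.
x1_values = [0.5, -1.2, 2.0]
x2_values = [1.0, 0.3, -0.7]

# X has one row per observation and p = 4 columns.
X = [design_row(x1, x2) for x1, x2 in zip(x1_values, x2_values)]

for row in X:
    print(row)
```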
From a frequentist perspective, the parameter estimates of the model may be
viewed as random variables characterized by a joint sampling distribution. Given
sufficiently large samples, this sampling distribution will obtain a multivariate nor-
mal form. Conditional effect estimates may then be viewed as weighted linear
composites of these random normal variates (Aiken & West, 1991, pp. 25–26;
Morrison, 1990, p. 83):

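$$\hat{\omega} = a_1\hat{\gamma}_0 + a_2\hat{\gamma}_1 + \cdots + a_p\hat{\gamma}_{p-1} = \mathbf{a}'\hat{\boldsymbol{\gamma}}, \tag{7}$$

where $\mathbf{a}$ is a $p \times 1$ column vector containing the $a_1, a_2, \ldots, a_p$ fixed
weights used to form the composite. The variance of the conditional effect is estimated as

$$\widehat{\mathrm{VAR}}(\hat{\omega}) = \widehat{\mathrm{VAR}}(\mathbf{a}'\hat{\boldsymbol{\gamma}}) = \mathbf{a}'\,\widehat{\mathrm{ACOV}}(\hat{\boldsymbol{\gamma}})\,\mathbf{a}, \tag{8}$$

where $\widehat{\mathrm{ACOV}}(\hat{\boldsymbol{\gamma}})$ is the sample estimate of the asymptotic
covariance matrix of the regression coefficient estimates given by the usual formula

$$\widehat{\mathrm{ACOV}}(\hat{\boldsymbol{\gamma}}) = \left[\mathbf{X}'(\hat{\sigma}^2\mathbf{I}_n)^{-1}\mathbf{X}\right]^{-1} = \hat{\sigma}^2(\mathbf{X}'\mathbf{X})^{-1}. \tag{9}$$

The standard error of the estimate is then simply the square root of the variance
estimate obtained from Equation 8.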
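
The composite and its variance in Equations 7 and 8 amount to a dot product and a quadratic
form. The sketch below is one way to compute them in plain Python, assuming the coefficient
estimates and the asymptotic covariance matrix reported in Table 1 of the article (the
published lower triangle is mirrored to fill the symmetric matrix). The function names are
illustrative.

```python
import math

# Estimates (gamma_0 .. gamma_3) and their asymptotic covariance matrix,
# copied from Table 1; the upper triangle mirrors the published lower triangle.
GAMMA = [38.07, 0.0373, -0.799, -0.397]
ACOV = [
    [0.1039, 0.0129, -0.0003, -0.0214],
    [0.0129, 0.0719, -0.0268, -0.0124],
    [-0.0003, -0.0268, 0.0461, 0.0000],
    [-0.0214, -0.0124, 0.0000, 0.0204],
]


def composite(a, gamma=GAMMA):
    """Equation 7: the weighted linear composite a' * gamma."""
    return sum(w * g for w, g in zip(a, gamma))


def composite_variance(a, acov=ACOV):
    """Equation 8: the quadratic form a' * ACOV * a."""
    p = len(a)
    return sum(a[i] * acov[i][j] * a[j] for i in range(p) for j in range(p))


# Weights for the simple slope of x1 when the moderator x2 equals 1.54.
a_slope = [0.0, 1.0, 0.0, 1.54]
estimate = composite(a_slope)
std_error = math.sqrt(composite_variance(a_slope))
print(f"estimate = {estimate:.4f}, standard error = {std_error:.4f}")
```

Any other conditional effect (for example, the simple intercept) can be obtained by changing
only the weight vector.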

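For our simple two-predictor example the vector of regression coefficient estimates is

$$\hat{\boldsymbol{\gamma}}' = (\hat{\gamma}_0 \;\; \hat{\gamma}_1 \;\; \hat{\gamma}_2 \;\; \hat{\gamma}_3). \tag{10}$$

Thus, to obtain a point estimate and standard error for $\omega_1$ as defined in Equation 5,
we would set

$$\mathbf{a}' = (0 \;\; 1 \;\; 0 \;\; x_2), \tag{11}$$

and solve Equations 7 and 8 to obtain the point estimate and variance of the simple
slope where

$$\hat{\omega}_1 = \hat{\gamma}_1 + \hat{\gamma}_3 x_2, \tag{12}$$

$$\widehat{\mathrm{VAR}}(\hat{\omega}_1) = \widehat{\mathrm{VAR}}(\hat{\gamma}_1) + 2x_2\,\widehat{\mathrm{COV}}(\hat{\gamma}_1, \hat{\gamma}_3) + x_2^2\,\widehat{\mathrm{VAR}}(\hat{\gamma}_3). \tag{13}$$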

Equations 12 and 13 represent the two fundamental pieces of information that will
be manipulated to probe the interaction effect.
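
For readers who want the intermediate step, Equation 13 can be recovered by expanding the
quadratic form of Equation 8 with the weight vector of Equation 11. The shorthand
$v_{jk}$ for the elements of the estimated covariance matrix is introduced here only for
this illustration.

```latex
% v_{jk} denotes the (j,k) element of \widehat{ACOV}(\hat{\gamma}), j,k = 0,...,3.
% With a' = (0, 1, 0, x_2), only the rows and columns for \hat{\gamma}_1 and
% \hat{\gamma}_3 receive nonzero weight.
\begin{align*}
\widehat{\mathrm{VAR}}(\hat{\omega}_1)
  &= \mathbf{a}'\,\widehat{\mathrm{ACOV}}(\hat{\boldsymbol{\gamma}})\,\mathbf{a} \\
  &= (1)(1)\,v_{11} + (1)(x_2)\,v_{13} + (x_2)(1)\,v_{31} + (x_2)(x_2)\,v_{33} \\
  &= v_{11} + 2x_2\,v_{13} + x_2^2\,v_{33}
     \qquad \text{(since } v_{13} = v_{31}\text{)} \\
  &= \widehat{\mathrm{VAR}}(\hat{\gamma}_1)
     + 2x_2\,\widehat{\mathrm{COV}}(\hat{\gamma}_1,\hat{\gamma}_3)
     + x_2^2\,\widehat{\mathrm{VAR}}(\hat{\gamma}_3).
\end{align*}
```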
We next briefly review the testing of simple slopes using the pick-a-point ap-
proach, and then proceed to detail how the J-N technique can be applied to more
fully evaluate conditional effects resulting from the interaction of categorical or
continuous predictors.

The Pick-a-Point Approach: Tests of Simple Slopes


A traditional test of simple slopes using the pick-a-point technique consists of
computing conditional effect estimates $\hat{\omega}$ and their variances
$\widehat{\mathrm{VAR}}(\hat{\omega})$ using Equations 7 and 8 for several fixed values of the
moderating variable (e.g., 0 and 1 if the moderator is binary, or high, medium, and low
values if it is continuous). A test of a simple slope is then obtained by forming the
critical ratio of the conditional effect estimate to its standard error at a specified
level of the moderator:

$$t = \frac{\hat{\omega}}{\left[\widehat{\mathrm{VAR}}(\hat{\omega})\right]^{1/2}}. \tag{14}$$

Cohen et al. (2003) and Jaccard and Turrisi (2003) have also suggested constructing
confidence intervals for simple slope estimates rather than relying exclusively
on null hypothesis tests, a point that we will return to and reinforce shortly. In addition,
a graphical depiction of the interaction may also be obtained by computing
both simple intercepts and simple slopes for each of several levels of the moderator
and then plotting the simple regression lines given by these estimates.
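
The pick-a-point test can be sketched in a few lines of Python. The snippet below assumes the
estimates and covariance elements reported in Table 1 of the article and uses a large-sample
normal approximation for the two-tailed p value (an assumption made here to stay within the
standard library; a t distribution with the residual degrees of freedom would be the exact
reference). Function and variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values copied from Table 1 of the article.
G1, G3 = 0.0373, -0.397                      # gamma_1 and gamma_3 estimates
VAR1, COV13, VAR3 = 0.0719, -0.0124, 0.0204  # elements of the ACOV matrix


def simple_slope_test(x2):
    """Return (estimate, standard error, critical ratio, approximate p)."""
    est = G1 + G3 * x2                           # Equation 12
    var = VAR1 + 2 * x2 * COV13 + x2 ** 2 * VAR3 # Equation 13
    se = sqrt(var)
    t = est / se                                 # Equation 14
    p = 2 * (1 - NormalDist().cdf(abs(t)))       # normal approximation
    return est, se, t, p


# Low, medium, and high hyperactivity: the mean and plus/minus one sd (1.54).
for label, x2 in (("low", -1.54), ("medium", 0.0), ("high", 1.54)):
    est, se, t, p = simple_slope_test(x2)
    print(f"{label:>6}: slope = {est:+.4f}, se = {se:.4f}, t = {t:+.3f}, p = {p:.3f}")
```

Under these assumptions only the slope at high hyperactivity has a p value below .05, which
agrees with the result the article reports for its empirical example.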
While this approach has been an extremely useful method for probing interac-
tions in many areas of applied research, it is typically limited to the evaluation of con-
ditional effects at only a small number of values of the moderator. There is not a sys-
tematic way to examine the magnitude of the conditional effects, nor their precision,
across the entire range of the moderator, information that is provided by the more
general J-N technique. For this reason, we now turn to a further explication of the J-N
technique in both fixed-effects and random-effects regression.

The J-N Technique


We distinguish between two related aspects of the J-N technique: The computation
of regions of significance and the plotting of confidence bands for the conditional
effect. To our knowledge, this presentation is the first to expand the use of the J-N
technique beyond the traditional ANCOVA framework. In our view, this expansion
has been impeded by the traditional formulation of the J-N technique from sepa-
rate within-group linear regressions of the outcome on the covariate (see Aiken &
West, 1991, pp. 134–136; Jaccard & Turrisi, 2003, pp. 81–82). Extension of the
technique to continuous by continuous interactions is then nonobvious, given the
lack of discretely defined groups. However, as has been shown elsewhere (Hunka,
1995; Rogosa, 1980, 1981) the traditional J-N technique can be embedded in a
general linear model involving an interaction between the grouping variable and
continuous covariate. It then becomes a relatively simple matter to consider use of
the J-N technique for interactions between continuous variables as well. We thus
motivate the J-N technique from the standpoint of the general fixed-effects regres-
sion model. This presentation has the added benefit of explicating the analytical re-
lation between the J-N technique and the pick-a-point approach which in turn fa-
cilitates our ultimate goal of extending the J-N technique to a broad class of
random-effects regression models.

Regions of significance. The computation of regions of significance is an
alternative to the pick-a-point approach that indicates over what range of the moderator
the effect of the focal predictor is significantly positive, nonsignificant, or
significantly negative. To compute regions of significance, we reverse the unknown
quantity in the critical ratio in Equation 14. Specifically, rather than solve
for the t value that corresponds to the simple slope at the designated level of the
moderator, we select a critical value $t_{crit}$ (e.g., ±1.96 for a two-tailed test at
α = .05 in large samples) and solve for the values of the moderator that return that
critical value:

$$\pm t_{crit} = \frac{\hat{\omega}}{\left[\widehat{\mathrm{VAR}}(\hat{\omega})\right]^{1/2}}. \tag{15}$$


Manipulation of this equation yields

$$t_{crit}^2\,\widehat{\mathrm{VAR}}(\hat{\omega}) - \hat{\omega}^2 = 0. \tag{16}$$

For interactions involving two predictors, the two roots of the moderator that sat-
isfy this equality can be solved by the quadratic formula. These roots demarcate
the boundaries of the regions of significance, indicating the points on the scale of
the moderator at which the effect of the focal predictor passes from significance to
nonsignificance at the selected alpha-level.
For example, let us again consider the conditional effect of x1 in the simple two
predictor model in Equation 1. Substituting in the expressions in Equations 12 and
13 into Equation 16 and collecting terms, we have

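$$a x_2^2 + b x_2 + c = 0, \tag{17}$$

where

$$a = t_{crit}^2\,\widehat{\mathrm{VAR}}(\hat{\gamma}_3) - \hat{\gamma}_3^2, \tag{18}$$

$$b = 2\left[t_{crit}^2\,\widehat{\mathrm{COV}}(\hat{\gamma}_1, \hat{\gamma}_3) - \hat{\gamma}_1\hat{\gamma}_3\right], \tag{19}$$

$$c = t_{crit}^2\,\widehat{\mathrm{VAR}}(\hat{\gamma}_1) - \hat{\gamma}_1^2. \tag{20}$$

The values of $x_2$ satisfying this equality can then be obtained via the quadratic formula

$$x_2 = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \tag{21}$$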

The two values (or roots) of $x_2$ that are returned by this formula demarcate the
boundaries of the regions of significance (see Footnote 2). Note that this holds regardless
of the distribution of the moderator. The moderator may thus be categorical or continuous,
making the traditional application of the J-N technique in the ANCOVA
model a special case of Equation 16.

Footnote 2. Sometimes the conditional effect of the focal predictor will be nonsignificant at
any given level of the moderator despite a significant test of the product interaction term.
Mathematically, this will result in there being no real roots to Equation 21 (i.e., the roots
will be imaginary numbers). The opposite can also occur, where regions of significance and
nonsignificance can be identified despite the lack of a significant interaction term.
Rogosa (1980, 1981) provides a useful discussion of these scenarios.
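
Equations 17 through 21 translate directly into code. The following sketch, which assumes the
estimates and covariance elements from Table 1 of the article and a nonsimultaneous critical
value of 1.96, solves for the boundaries of the regions of significance. The function names
are illustrative, and returning `None` when no real roots exist is a design choice made here
for the case described in Footnote 2.

```python
import math


def jn_roots(g1, g3, var1, cov13, var3, t_crit=1.96):
    """Solve Equation 17 for the moderator values bounding the regions.

    Returns a (lower, upper) tuple, or None when the quadratic has no
    real roots or degenerates (a == 0).
    """
    t2 = t_crit ** 2
    a = t2 * var3 - g3 ** 2            # Equation 18
    b = 2 * (t2 * cov13 - g1 * g3)     # Equation 19
    c = t2 * var1 - g1 ** 2            # Equation 20
    disc = b * b - 4 * a * c
    if a == 0 or disc < 0:
        return None
    root = math.sqrt(disc)
    return tuple(sorted(((-b - root) / (2 * a), (-b + root) / (2 * a))))  # Eq. 21


def slope_is_significant(x2, g1, g3, var1, cov13, var3, t_crit=1.96):
    """Equation 16 as an inequality: significant when t^2 * VAR < omega^2."""
    est = g1 + g3 * x2
    var = var1 + 2 * x2 * cov13 + x2 ** 2 * var3
    return t_crit ** 2 * var - est ** 2 < 0


# Values copied from Table 1 of the article.
table1 = dict(g1=0.0373, g3=-0.397, var1=0.0719, cov13=-0.0124, var3=0.0204)
lower, upper = jn_roots(**table1)
print(f"region boundaries: {lower:.2f} and {upper:.2f}")
```

With the Table 1 values the boundaries agree, to two decimals, with the bounds of -2.32 and
1.49 that the article reports; evaluating `slope_is_significant` on either side of a root shows
which side belongs to the region of significance.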

Confidence bands. One common feature of tests of simple slopes and regions
of significance is that both procedures are based on traditional null hypothesis
testing. However, the usefulness of this approach to statistical inference has
been questioned repeatedly over the decades, culminating in the APA task force report
emphasizing that confidence intervals are much more informative than null
hypothesis tests (Wilkinson & the Task Force on Statistical Inference, 1999). Similarly,
as we now show, confidence bands provide more information than null hypothesis
tests of simple slopes and regions of significance.
The standard formula for a confidence interval (CI) is

$$CI = \hat{\theta} \pm t_{crit}\left[\widehat{\mathrm{VAR}}(\hat{\theta})\right]^{1/2}, \tag{22}$$

where $\hat{\theta}$ represents a given point estimate. In most cases we are interested in a
single effect estimate and so simply compute the confidence interval for this estimate.
However, in the case of conditional effects, both the effect estimate and its standard
error vary as a function of the moderating variable. As such, we cannot plot just
one confidence interval; instead we must plot the confidence interval over the full
range of the moderating variable, and these are known as confidence bands
(Rogosa, 1980, 1981). The general formula for the confidence bands (CB) is then

$$CB_{\hat{\omega}} = \hat{\omega} \pm t_{crit}\left[\widehat{\mathrm{VAR}}(\hat{\omega})\right]^{1/2}. \tag{23}$$

To continue with our simple two-predictor model, if we were interested in computing
confidence bands for the effect of the focal predictor $x_1$ as a function of the
moderator $x_2$, we would substitute in the specific conditional effect estimate of $x_1$
and its variance as given in Equations 12 and 13:

$$CB_{\hat{\omega}_1} = (\hat{\gamma}_1 + \hat{\gamma}_3 x_2) \pm t_{crit}\left[\widehat{\mathrm{VAR}}(\hat{\gamma}_1) + 2x_2\,\widehat{\mathrm{COV}}(\hat{\gamma}_1, \hat{\gamma}_3) + x_2^2\,\widehat{\mathrm{VAR}}(\hat{\gamma}_3)\right]^{1/2}. \tag{24}$$

As is the case with standard confidence intervals, these confidence bands convey
the same information as null hypothesis tests of simple slopes and/or regions of
significance. Specifically, the points where the confidence bands cross zero are the
boundaries of the regions of significance (also indicating which simple slopes are
significant and which are not).
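
A minimal sketch of Equation 24 follows, again assuming the Table 1 estimates from the
article and a nonsimultaneous critical value of 1.96. The grid spans the observed range of
centered hyperactivity reported in the empirical example; the function name and the grid
spacing are illustrative choices.

```python
from math import sqrt

# Values copied from Table 1 of the article.
G1, G3 = 0.0373, -0.397
VAR1, COV13, VAR3 = 0.0719, -0.0124, 0.0204


def confidence_band(x2, t_crit=1.96):
    """Equation 24: (lower, upper) limits for the simple slope at x2."""
    est = G1 + G3 * x2
    half_width = t_crit * sqrt(VAR1 + 2 * x2 * COV13 + x2 ** 2 * VAR3)
    return est - half_width, est + half_width


# Evaluate the band on a grid over the observed moderator range (-1.94 to 3.06).
grid = [-1.94 + 0.5 * i for i in range(11)]
for x2 in grid:
    lo, hi = confidence_band(x2)
    flag = "excludes zero" if lo > 0 or hi < 0 else "includes zero"
    print(f"x2 = {x2:+.2f}: [{lo:+.3f}, {hi:+.3f}]  {flag}")
```

Plotting the lower and upper limits against the grid gives the confidence bands; the moderator
values at which a limit crosses zero coincide with the boundaries of the regions of
significance.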

Additionally, the confidence bands graphically convey our certainty in the con-
ditional effect estimates and how that certainty changes as we progress across the
range of the moderating variable. The confidence bands are narrowest at the center
of accuracy. This point can be obtained by differentiating the variance function for
the conditional effect estimate with respect to the moderating variable and setting
the resulting derivative equal to zero

$$\frac{d\,\widehat{\mathrm{VAR}}(\hat{\omega})}{d x_m} = 0. \tag{25}$$

Solving for values of the moderator, here denoted xm, that satisfy this equality will
produce candidate points for the center of accuracy. For our simple example
model, manipulation of Equation 25 shows that the center of accuracy is the value
of $x_2$ given by the formula

$$x_2 = \frac{-\widehat{\mathrm{COV}}(\hat{\gamma}_1, \hat{\gamma}_3)}{\widehat{\mathrm{VAR}}(\hat{\gamma}_3)}. \tag{26}$$

More complicated variance functions may have more than one candidate point for
the center of accuracy, requiring that these points be compared to determine the
one yielding the smallest variance. Typically, the center of accuracy will be in the
middle of the scale of the moderator, with decreasing precision evident at the ends
of the scale.
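
The step from Equation 25 to Equation 26 can be written out as follows for the two-predictor
model. The numerical check at the end uses the covariance elements reported in Table 1 of the
article.

```latex
% Differentiate the variance function of Equation 13 with respect to x_2.
\begin{align*}
\frac{d}{dx_2}\,\widehat{\mathrm{VAR}}(\hat{\omega}_1)
  &= \frac{d}{dx_2}\left[\widehat{\mathrm{VAR}}(\hat{\gamma}_1)
     + 2x_2\,\widehat{\mathrm{COV}}(\hat{\gamma}_1,\hat{\gamma}_3)
     + x_2^2\,\widehat{\mathrm{VAR}}(\hat{\gamma}_3)\right] \\
  &= 2\,\widehat{\mathrm{COV}}(\hat{\gamma}_1,\hat{\gamma}_3)
     + 2x_2\,\widehat{\mathrm{VAR}}(\hat{\gamma}_3) = 0
  \;\;\Longrightarrow\;\;
  x_2 = \frac{-\widehat{\mathrm{COV}}(\hat{\gamma}_1,\hat{\gamma}_3)}
             {\widehat{\mathrm{VAR}}(\hat{\gamma}_3)}.
\end{align*}
% The second derivative, 2 \widehat{VAR}(\hat{\gamma}_3), is positive,
% so this stationary point is a minimum of the variance.
% Check with Table 1: x_2 = -(-.0124) / .0204 \approx .61.
```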

Multiple Testing: Nonsimultaneous Versus Simultaneous Inference

The regions and bands explicated above, as well as the pick-a-point procedure
described before them, are all predicated on the use of an alpha-level that is only
valid for a single test. The inferences they afford must thus be regarded as
nonsimultaneous. As such, the Type I error rates of these techniques are only
fully accurate for a single selected value of the moderator and the error rate will
otherwise accumulate as additional values of the moderator are considered. If
one wished to simultaneously evaluate the effect of the focal predictor at a small
number of values of the moderator (e.g., high, medium, and low, as in tests of
simple slopes), a family-wise alpha-level could be obtained by using, for in-
stance, the Bonferroni approach (Neter, Kutner, Nachtsheim, & Wasserman,
1996, pp. 157–158). This approach could similarly be applied to the regions and
confidence bands of the J-N technique so that inferences would be valid for a
specific number of values of the moderator. If, with the J-N technique, valid
inferences are sought for all values of the moderator simultaneously, then one can
instead use the critical values derived by Potthoff (1964) to compute simultaneous
regions of significance and confidence bands. In the simple case of the
model considered above, the critical t value in Equations 15 through 24 would
be replaced by $[2F_{crit}(2, N - p)]^{1/2}$ (see also Rogosa, 1980, 1981, and Serber,
1977). If the number of comparisons to be made is not small, this approach will
generally be more powerful than using the Bonferroni approach and has the further
advantage of yielding an accurate Type I error rate for all possible comparisons
(Neter et al., 1996, pp. 157–158).
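
To make the comparison between the Bonferroni and simultaneous critical values concrete, the
sketch below computes both with the Python standard library. Two assumptions are made here
that are not in the article: the Bonferroni value uses a large-sample normal approximation to
the t distribution, and the F critical value uses the closed-form upper quantile that holds
when the numerator degrees of freedom equal 2, since P(F > x) = (1 + 2x/df2)^(-df2/2) in that
case. N = 956 and p = 8 (intercept plus seven predictors) follow the article's empirical
example.

```python
from statistics import NormalDist


def bonferroni_crit(alpha, k):
    """Two-tailed normal critical value with alpha split over k tests."""
    return NormalDist().inv_cdf(1 - alpha / (2 * k))


def f_crit_numerator_df_2(alpha, df2):
    """Upper-alpha quantile of F(2, df2), from its closed-form survival function."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)


def simultaneous_crit(alpha, n, p):
    """Critical value [2 * F_crit(2, N - p)]^(1/2) for simultaneous inference."""
    return (2 * f_crit_numerator_df_2(alpha, n - p)) ** 0.5


alpha, n, p = 0.05, 956, 8
print(f"nonsimultaneous: {bonferroni_crit(alpha, 1):.3f}")
for k in (3, 4, 10):
    print(f"Bonferroni, {k:>2} tests: {bonferroni_crit(alpha, k):.3f}")
print(f"simultaneous: {simultaneous_crit(alpha, n, p):.3f}")
```

Under these assumptions the Bonferroni value is smaller than the simultaneous value for three
comparisons but larger from four comparisons onward, which illustrates why the simultaneous
approach becomes the more powerful choice once the number of comparisons is not small.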
Despite these advantages, the simultaneous critical value of Potthoff (1964)
will necessarily yield narrower regions of significance and broader confidence
bands than the nonsimultaneous critical value, sometimes so much so that the re-
sults may be of little practical use, as noted by Potthoff himself. As a possible rem-
edy to the problem, Potthoff suggested using a higher alpha level with the simulta-
neous approach. Others have suggested alternative forms for the confidence bands
that are valid only over specific (observed) intervals of the moderating variable,
rather than over the entire real line (Aitken, 1973; Gafarian, 1964). Unless other-
wise noted, here we focus specifically on the nonsimultaneous version of the J-N
technique for the principal reasons that this facilitates an understanding of the rela-
tion between the pick-a-point and J-N procedures, no consensus has yet emerged
on the best method for constructing simultaneous regions and confidence bands,
and the simultaneous approach is often impractical. Mindful of the error rate accu-
mulation problem of the nonsimultaneous approach, however, we view the result-
ing regions and bands as primarily of heuristic value.

Empirical Example: The Relation Between Child Math Ability, Delinquency, and Hyperactivity

To demonstrate the application of the J-N technique to probe a continuous by con-
tinuous interaction in a fixed-effects regression model, we considered a sample of
N = 956 children drawn from the 1990 assessment of the Children of the National
Longitudinal Survey of Youth (NLSY). Children ranged in age from 59 to 156
months (mean age of 100 months), ranged in grade from kindergarten to sixth
grade (third grade was the median), 52% were female, and 55% of mothers re-
ported their child to be of minority status (self identified to be non-Caucasian). The
motivating substantive question related to the prediction of math ability as a func-
tion of the joint influence of antisocial and hyperactive behavior. It was hypothe-
sized that there would be a negative relation between child antisocial behavior and
math ability, and that this effect would be potentiated by the additional presence of
child hyperactive behavior.
The regression model used to evaluate this hypothesis was identical to Equation 1,
with the exception that age, grade, sex, and minority status were entered
as additional covariates. Here, $y$ is mathematics ability, measured as the proportion
of 84 items correctly endorsed on the math subsection of the Peabody Individual
Achievement Test ($\bar{y}$ = 37.65, sd = 15.77, range = 5.95 – 82.14), $x_1$ is a
continuous measure of antisocial behavior as reported by the child's mother ($\bar{x}_1$
= 1.34, $sd_{x_1}$ = 1.34, range = 0 – 6), $x_2$ is a continuous measure of hyperactive
behavior as reported by the child's mother ($\bar{x}_2$ = 1.94, $sd_{x_2}$ = 1.54, range = 0 –
5), and larger values of these variables reflect higher levels of child antisocial
and hyperactive behavior. We mean-centered all predictors (including the covariates)
prior to fitting the model using ordinary least squares estimation (see Footnote 3).

TABLE 1
Results From Fixed-Effects Regression of Math Ability on Antisocial
Behavior, Hyperactive Behavior, and the Multiplicative Interaction
Between Antisocial and Hyperactive Behavior

Fixed Effects                                     PE       SE       p
Intercept ($\hat{\gamma}_0$)                      38.07    .322     < .0001
Antisocial ($\hat{\gamma}_1$)                     .0373    .2681    .8893
Hyperactive ($\hat{\gamma}_2$)                    –.799    .2148    .0002
Antisocial by Hyperactive ($\hat{\gamma}_3$)      –.397    .1429    .0055
Age                                               .347     .041     < .0001
Gender                                            –2.09    .592     .0004
Grade                                             3.40     .519     < .0001
Minority status                                   –3.70    .583     < .0001

Asymptotic Variance/Covariance Matrix of Fixed Effects

                    $\hat{\gamma}_0$    $\hat{\gamma}_1$    $\hat{\gamma}_2$    $\hat{\gamma}_3$
$\hat{\gamma}_0$    .1039
$\hat{\gamma}_1$    .0129               .0719
$\hat{\gamma}_2$    –.0003              –.0268              .0461
$\hat{\gamma}_3$    –.0214              –.0124              .000                .0204

Note. All models were fit using SAS PROC REG using the sample of N = 956 children drawn
from the NLSY described in the text. The parameter estimates and standard errors for the main
effects of age, gender, grade, and minority status are shown above, but these have no direct
relevance with respect to testing and probing the interaction term and are thus not included in
the reported ACOV matrix. PE = parameter estimate; SE = standard error.

Together, the predictors accounted for 69% of individual differences in math
ability (see Table 1), F(7, 948) = 299.28, p < .0001, $R^2_{adj}$ = .69. To further probe the
significant interaction, we computed the simple intercepts and simple slopes for
the regression of math ability on antisocial behavior at the mean of hyperactivity
and at one standard deviation above and below the mean, producing the plots in
Figure 1. Only the simple slope of antisocial behavior at high hyperactivity was
statistically significant (p = .045); the relation of antisocial behavior to mathematics
ability was not significantly different from zero at medium and low levels of
hyperactivity. Although these simple regressions substantially aid in the interpretation
of this interactive effect, the J-N technique can provide additional useful
information.

Footnote 3. This centering included the dichotomous measures of minority status and female.
This allowed for the calculation of simple intercepts assessed at the mean of all predictors,
and not assessed within the groups coded as zero on these predictors.
Applying Equations 16 and 23 resulted in 95% (nonsimultaneous) regions of
significance defined by a lower bound of –2.32 and an upper bound of 1.49. As
shown in Figure 2, these regions imply that the regression of math ability on antisocial
behavior is significant and positive at values of hyperactivity less than
–2.32, not significantly different from zero at values of hyperactivity between
–2.32 and 1.49, and significant and negative at values of hyperactivity greater than
1.49. Given that the minimum and maximum values of (mean centered) hyperactivity
were –1.94 and 3.06, respectively, the upper region fell within the observed
range of hyperactivity whereas the lower region did not (hence it will not be interpreted
further). Indeed, 167 of the 956 children (17.5%) had hyperactivity scores
greater than the upper bound of the region. The 95% confidence bands that correspond
to these regions are also presented in Figure 2. The center of accuracy, the
narrowest cross-section of the confidence bands, was determined to be .61, a point
where the effect of antisocial behavior is nonsignificant. For contrast, the simultaneous
regions of significance were also calculated (not shown in Figure 2), with
obtained roots of –5.52 and 2.18. The lower root is well outside the observed range
of the data, and only 7% of the children's hyperactivity scores were observed
above the upper root. The smaller region of significance with the simultaneous approach
is a natural consequence of the trade-off between simultaneous inference and
power, but in this case the region includes such a small portion of the sample that it
is of relatively little practical use.

FIGURE 1 The pick-a-point simple slopes of the regression of math ability on antisocial
behavior at high, medium, and low levels of hyperactive behavior. Note. High, medium, and low
values of hyperactivity are defined as plus and minus 1 sd about the mean (1.54, 0, −1.54). The
slope of the simple regression at high hyperactivity significantly differs from zero, but the
simple slopes at medium and low hyperactivity do not.

FIGURE 2 J-N regions of significance and confidence bands for the conditional relation
between math ability and antisocial behavior as a function of hyperactivity. Note. Dashed vertical
lines reflect regions of significance (−2.32, 1.49) and the dark horizontal line with diamonds
indicates the range of child hyperactivity observed in the sample data (−1.94, 3.06). The
intersections of values of hyperactivity equal to −1.54, 0, and 1.54 with the effect of antisocial
behavior correspond to the three simple slopes presented in Figure 1.

In summary, the J-N technique provides much additional information relative
to the pick-a-point approach. Indeed, the testing of specific simple slopes is also
provided by the J-N technique (e.g., the simple slopes of the high, medium and
low regression lines plotted in Figure 1, and their confidence intervals, simply
reflect three specific points on the abscissa in Figure 2). In addition, these re-
gions convey that the effect of antisocial behavior on mathematics ability is sig-
nificant at any given level of hyperactivity higher than 1.49 units above the mean
and that we have the most confidence estimating this effect for hyperactivity levels
slightly above the mean. All of these results, and the corresponding plots, can
be calculated using a suite of freely available javascript programs accessible at
http://www.quantpsy.org that implement the formulas presented in this paper.
Table 1 provides all the necessary information for the interested reader to repli-
cate our results using this program. We now turn to the extension of the J-N
technique to a broad class of random-effects regression models.

PROBING INTERACTIONS
IN MULTILEVEL REGRESSION

Unlike the standard fixed-effects regression models, interactions can be manifested
in a variety of ways in multilevel models. They may occur within a given
level of the model (e.g., an interaction between two lower-level or two up-
per-level variables) or between levels of the model (e.g., an interaction between
a lower-level variable and an upper-level variable). Moreover, some interactions
will involve predictors whose main effects are random, or even the interaction
may be a random effect, whereas in other cases these effects will be fixed. The
most common of these scenarios is the cross-level interaction with a random
main effect for the lowest level predictor. We use this case to show how the J-N
technique can be extended and applied to multilevel models, but of course these
procedures directly apply to the other types of interactions that can arise in mul-
tilevel models as well.
A cross-level interaction arises when the random coefficient of a lower level
predictor is itself predicted by an upper level predictor. In the simplest case, the
level-1 model can be written as

$$y_{ij} = \beta_{0j} + \beta_{1j} x_{1ij} + \varepsilon_{ij}, \tag{27}$$

where $y_{ij}$ is the outcome variable for individual $i$ in group $j$, $x_{1ij}$ is the single predictor
for individual $i$ in group $j$, $\beta_{0j}$ and $\beta_{1j}$ are the intercept and slope of the regression
of $y$ on $x_1$ within group $j$, respectively, and $\varepsilon_{ij}$ is the random residual for individual
$i$ within group $j$. Given that the intercept and slope coefficients vary
randomly over groups, these can be conceived as random variables to be predicted
by one or more group-level covariates. In the case of a single group-level covariate,
the level-2 model can be written as

$$\beta_{0j} = \gamma_{00} + \gamma_{01} w_{1j} + u_{0j}, \tag{28}$$

$$\beta_{1j} = \gamma_{10} + \gamma_{11} w_{1j} + u_{1j}, \tag{29}$$

where $w_{1j}$ represents the single predictor for group $j$, $\gamma_{00}$ and $\gamma_{10}$ are the fixed intercepts
of the regression of $\beta_{0j}$ and $\beta_{1j}$ on $w_{1j}$, and $\gamma_{01}$ and $\gamma_{11}$ represent the fixed
slopes of the regression of $\beta_{0j}$ and $\beta_{1j}$ on $w_{1j}$, and $u_{0j}$ and $u_{1j}$ reflect the residual
variability in the level-1 intercepts and slopes net of the prediction by $w_{1j}$.
Although there is only a single main effect in the level-1 equation and a single
main effect in each level-2 equation, the cross-level interaction effect is apparent in
the reduced-form equation, obtained by substituting Equations 28 and 29 into
Equation 27 and grouping fixed and random components:

$$y_{ij} = (\gamma_{00} + \gamma_{01} w_{1j} + \gamma_{10} x_{1ij} + \gamma_{11} x_{1ij} w_{1j}) + (u_{0j} + u_{1j} x_{1ij} + \varepsilon_{ij}). \tag{30}$$

This equation illustrates that the main effect regression of the random slopes on
$w_1$ ($\gamma_{11}$) is expressed as a cross-level interaction between $w_1$ and $x_1$ in the reduced-form
equation (i.e., $\gamma_{11} w_{1j} x_{1ij}$). Given the formulation of the level-1 and
level-2 models, we would traditionally view w1 as the moderator of the effect of
x1. However, as always, the conditional effects in the model are symmetrical,
and in the reduced-form model we can see that there is no restriction on which
variable is to be considered the focal predictor or moderator.
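The substitution that yields Equation 30 can be checked numerically. The following is a minimal sketch rather than part of the original analysis: the parameter values, the group sizes, and the function name `simulate` are all hypothetical and chosen only for illustration. Each observation is generated once through the level-1 and level-2 equations and once through the reduced form, and the two results are compared.

```python
import random

def simulate(n_groups=5, n_per_group=4, seed=1):
    """Generate y via Equations 27-29 and via the reduced form (Equation 30)."""
    rng = random.Random(seed)
    g00, g01, g10, g11 = 12.0, 2.0, 1.5, -0.5   # hypothetical fixed effects
    rows = []
    for _ in range(n_groups):
        w = rng.gauss(0, 1)        # group-level predictor w_1j
        u0 = rng.gauss(0, 1.0)     # level-2 residual for the intercept, u_0j
        u1 = rng.gauss(0, 0.5)     # level-2 residual for the slope, u_1j
        b0 = g00 + g01 * w + u0    # Equation 28
        b1 = g10 + g11 * w + u1    # Equation 29
        for _ in range(n_per_group):
            x = rng.gauss(0, 1)    # individual-level predictor x_1ij
            e = rng.gauss(0, 2.0)  # level-1 residual
            y_two_level = b0 + b1 * x + e                    # Equation 27
            fixed_part = g00 + g01 * w + g10 * x + g11 * x * w
            random_part = u0 + u1 * x + e
            y_reduced = fixed_part + random_part             # Equation 30
            rows.append((y_two_level, y_reduced))
    return rows

rows = simulate()
max_gap = max(abs(a - b) for a, b in rows)
print(f"largest discrepancy between the two forms: {max_gap:.2e}")
```

Any discrepancy should be at the level of floating-point rounding, which reflects that the reduced form is only an algebraic rearrangement of the two-level specification.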
To consider the conditional nature of the effects of each predictor in greater de-
tail, we first write the prediction equation for the model. Taking the expectation of
Equation 30 over both individuals and groups yields the prediction equation

$$\mu_{y|x_1,w_1} = \gamma_{00} + \gamma_{10} x_1 + \gamma_{01} w_1 + \gamma_{11} x_1 w_1. \tag{31}$$

The simple slope for the regression of $y$ on $x_1$ as a function of $w_1$ is then

$$\omega_{x_1} = \gamma_{10} + \gamma_{11} w_1. \tag{32}$$

Similarly, given the symmetry of the interaction, the simple slope of the regression
of $y$ on $w_1$ can be written as a function of $x_1$, or

$$\omega_{w_1} = \gamma_{01} + \gamma_{11} x_1. \tag{33}$$

As for the standard regression model, the challenge is then again to obtain an esti-
mate and standard error for a linear composite. Given that this linear composite
consists entirely of fixed effects, we will see that this task is accomplished in much
the same way for the multilevel regression model as it was for the standard regres-
sion model.

Estimating Conditional Effects in Multilevel Models


More formally, the reduced form of the linear multilevel model can be expressed as

$$y_j = X_j \gamma + Z_j u_j + \varepsilon_j, \tag{34}$$

where $y_j$ is the $n_j \times 1$ response vector for group $j = 1, 2, \ldots, J$, $X_j$ is the $n_j \times p$ design
matrix for the $p \times 1$ vector of fixed effects $\gamma$ (where $p$ includes a column vector of
1’s for the intercept), $Z_j$ is the $n_j \times q$ design matrix for the $q \times 1$ vector of random effects
$u_j$, and $\varepsilon_j$ is the $n_j \times 1$ vector of residuals (Laird & Ware, 1982). Importantly, it
is assumed that the random effects and residuals are independent of one another
and are multivariate normally distributed as

$$u_j \sim N(0, T), \tag{35}$$

$$\varepsilon_j \sim N(0, \Sigma_{\varepsilon_j}). \tag{36}$$

Although not requisite, the form of the covariance matrix of the random effects $T$ is
typically unrestricted and the residuals are often constrained to be homoscedastic
and independent (i.e., $\Sigma_{\varepsilon_j} = \sigma^2 I_{n_j}$). Without loss of generality we follow these
conventions here, although all of our developments apply to any structure of $T$ and
$\Sigma_{\varepsilon_j}$ (assuming these are properly identified).
As we demonstrated earlier, once we move to the prediction equation by taking
expectations over both individuals and groups, we are left with conditional effects
that are linear composites of fixed coefficients only. Thus, the estimated condi-
tional effect of x at a given level of w (or vice versa) can be computed from Equa-
tion 7 just as in the case of the standard regression model. Similarly, application of
Equation 8 will yield the variance of this estimate. Where these computations dif-
fer from the standard fixed-effects case is that the asymptotic covariance matrix of
the fixed-effects estimates no longer has the simple form of Equation 9, and instead
is estimated as

$$\mathrm{ACOV}(\hat{\gamma}) = \left( \sum_{j=1}^{J} X_j' \hat{V}_j^{-1} X_j \right)^{-1}, \tag{37}$$

where $\hat{V}_j$ is the model-implied covariance matrix for $y_j$, estimated as

$$\hat{V}_j = Z_j \hat{T} Z_j' + \hat{\sigma}^2 I_{n_j}. \tag{38}$$

Equation 37 would simplify to Equation 9 if $\hat{T}$ (or equivalently $Z$) were a null matrix,
that is, in the absence of random effects.
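As a quick check on Equations 37 and 38, consider a special case that is not treated in the text and is offered here only as an illustrative sketch: a random-intercept model whose fixed part contains only an intercept, so that $X_j = Z_j = 1_{n_j}$ (a column of ones) and $T = \tau_{00}$. Under these assumptions,

$$V_j = \tau_{00} 1_{n_j} 1_{n_j}' + \sigma^2 I_{n_j}, \qquad 1_{n_j}' V_j^{-1} 1_{n_j} = \frac{n_j}{\sigma^2 + n_j \tau_{00}}, \qquad \mathrm{ACOV}(\hat{\gamma}_{00}) = \left( \sum_{j=1}^{J} \frac{n_j}{\sigma^2 + n_j \tau_{00}} \right)^{-1}.$$

Setting $\tau_{00} = 0$ gives $\sigma^2 / N$ with $N = \sum_j n_j$, the familiar sampling variance of a mean in the fixed-effects case. Any $\tau_{00} > 0$ makes each term of the sum smaller and therefore inflates the sampling variance of the fixed effect.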
For example, consider the cross-level interaction in the multilevel regression
model in Equation 30. Arranging the vector of fixed effect parameter estimates as

$$\hat{\gamma}' = (\hat{\gamma}_{00} \;\; \hat{\gamma}_{01} \;\; \hat{\gamma}_{10} \;\; \hat{\gamma}_{11}), \tag{39}$$

the estimate for the conditional fixed effect given in Equation 32 can be obtained
from Equation 7 by defining

$$a' = (0 \;\; 0 \;\; 1 \;\; w_1). \tag{40}$$

This results in the linear composite

$$\hat{\omega}_{x_1} = \hat{\gamma}_{10} + \hat{\gamma}_{11} w_1. \tag{41}$$

Application of Equation 8 provides the estimated variance of this composite

$$\widehat{\mathrm{VAR}}(\hat{\omega}_{x_1}) = \widehat{\mathrm{VAR}}(\hat{\gamma}_{10}) + 2 w_1 \widehat{\mathrm{COV}}(\hat{\gamma}_{10}, \hat{\gamma}_{11}) + w_1^2 \widehat{\mathrm{VAR}}(\hat{\gamma}_{11}). \tag{42}$$

Given this information, extension of the J-N technique to multilevel models is
straightforward, as we now demonstrate.
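The linear-composite computation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' software: it assumes that Equations 7 and 8 (not reproduced in this section) take the usual forms $a'\hat{\gamma}$ and $a'\,\mathrm{ACOV}(\hat{\gamma})\,a$, and the helper name `conditional_effect` and all numerical values are hypothetical.

```python
def conditional_effect(gamma, acov, a):
    """Return the estimate a'gamma and its variance a' ACOV a."""
    k = len(gamma)
    est = sum(a[i] * gamma[i] for i in range(k))
    var = sum(a[i] * acov[i][j] * a[j] for i in range(k) for j in range(k))
    return est, var

# Hypothetical estimates, ordered as in Equation 39: (g00, g01, g10, g11).
gamma = [10.0, 1.0, 2.0, 0.5]
acov = [
    [0.50, 0.02, 0.01, 0.00],
    [0.02, 0.40, 0.00, 0.01],
    [0.01, 0.00, 0.04, -0.01],
    [0.00, 0.01, -0.01, 0.09],
]

w1 = 2.0                    # hypothetical value of the moderator
a = [0.0, 0.0, 1.0, w1]     # Equation 40
est, var = conditional_effect(gamma, acov, a)

# The same quantities written out term by term (Equations 41 and 42).
est_direct = gamma[2] + gamma[3] * w1
var_direct = acov[2][2] + 2 * w1 * acov[2][3] + w1 ** 2 * acov[3][3]

print(f"simple slope at w1 = {w1}: {est:.3f} (variance {var:.3f})")
```

Writing the computation as a general quadratic form means the same helper would apply to other conditional effects simply by changing the vector `a`, for example to `[0, 1, 0, x1]` for the simple slope in Equation 33.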

Evaluating Interactions in Multilevel Models


Aside from requiring a more complex asymptotic covariance matrix to compute the
variance of conditional effect estimates, tests of simple slopes, computation of re-
gions of significance, and the plotting of confidence bands are all accomplished in
the same way for multilevel models as for standard regression models, that is, via
Equations 14, 16, and 23. For instance, to identify the regions of significance for the
simple cross-level interaction in Equation 30, we would simply substitute the ex-
pressions in Equations 41 and 42 into Equation 16. Collecting terms, we have

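$$a w_1^2 + b w_1 + c = 0, \tag{43}$$

where

$$a = t_{crit}^2 \widehat{\mathrm{VAR}}(\hat{\gamma}_{11}) - \hat{\gamma}_{11}^2, \tag{44}$$

$$b = 2 \left[ t_{crit}^2 \widehat{\mathrm{COV}}(\hat{\gamma}_{10}, \hat{\gamma}_{11}) - \hat{\gamma}_{10} \hat{\gamma}_{11} \right], \tag{45}$$

$$c = t_{crit}^2 \widehat{\mathrm{VAR}}(\hat{\gamma}_{10}) - \hat{\gamma}_{10}^2. \tag{46}$$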

As before, the two values of w1 that satisfy this equality represent the boundaries of
the regions of significance and can be obtained via the quadratic formula. Note that
this again holds regardless of whether the moderator is categorical or continuous.
Similarly, substituting the expressions in Equations 41 and 42 into Equation 23,
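we see that the confidence bands are defined by the function

$$\mathrm{CB}_{\hat{\omega}_{x_1}} = (\hat{\gamma}_{10} + \hat{\gamma}_{11} w_1) \pm t_{crit} \left[ \widehat{\mathrm{VAR}}(\hat{\gamma}_{10}) + 2 w_1 \widehat{\mathrm{COV}}(\hat{\gamma}_{10}, \hat{\gamma}_{11}) + w_1^2 \widehat{\mathrm{VAR}}(\hat{\gamma}_{11}) \right]^{1/2}. \tag{47}$$

The center of accuracy can be determined by solving Equation 25, in this case resulting
in the formula

$$w_1 = \frac{-\widehat{\mathrm{COV}}(\hat{\gamma}_{10}, \hat{\gamma}_{11})}{\widehat{\mathrm{VAR}}(\hat{\gamma}_{11})}. \tag{48}$$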

The value of w1 satisfying this equality is the value at which the conditional effect
of x1 is estimated with greatest certainty.
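Equations 43 through 48 can be collected into one small routine. The sketch below is illustrative only: the function name `jn_summary`, its argument order, and the numerical inputs are hypothetical, and the critical value 1.96 is simply the large-sample normal value.

```python
import math

def jn_summary(g10, g11, v10, c1011, v11, tcrit):
    """Boundaries (Eqs. 43-46), band function (Eq. 47), and center (Eq. 48)."""
    a = tcrit**2 * v11 - g11**2
    b = 2 * (tcrit**2 * c1011 - g10 * g11)
    c = tcrit**2 * v10 - g10**2
    disc = b**2 - 4 * a * c
    if a == 0 or disc < 0:
        roots = None   # the quadratic has no pair of real roots in this case
    else:
        r1 = (-b - math.sqrt(disc)) / (2 * a)
        r2 = (-b + math.sqrt(disc)) / (2 * a)
        roots = tuple(sorted((r1, r2)))

    def band(w1):
        """Lower and upper confidence band for the simple slope at w1."""
        est = g10 + g11 * w1
        half = tcrit * math.sqrt(v10 + 2 * w1 * c1011 + w1**2 * v11)
        return est - half, est + half

    center = -c1011 / v11
    return roots, band, center

# Hypothetical estimates: gamma10, gamma11, VAR(g10), COV(g10, g11), VAR(g11).
roots, band, center = jn_summary(2.0, 0.5, 0.04, -0.01, 0.01, tcrit=1.96)
print("boundaries of the regions of significance:", roots)
print("center of accuracy:", center)
print("band at the center:", band(center))
```

When `a` is negative, as with these inputs, the simple slope differs significantly from zero outside the two roots and is nonsignificant between them. At each root one edge of the confidence band touches zero, which ties the regions of significance to the bands.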
Despite the apparent ease with which the J-N technique generalizes to multi-
level models, it is important to note one key complication. Under standard assump-
tions, the test statistics obtained from a standard regression model are exactly t-dis-
tributed with n – p degrees of freedom. Unfortunately, the same is not typically true
for a multilevel model; even under standard assumptions, the test statistics for the
fixed effects are typically only approximately t-distributed (Kacker & Harville,
1984; Schaalje, McBride, & Fellingham, 2002). By implication, while exact tests
of simple slopes, boundaries to the regions of significance, and width of the confi-
dence bands can be obtained in the absence of random effects, when random ef-
fects are present the same computations provide only approximately valid infer-
ences. As such, appropriate caution should be exercised in the interpretation of the
results of these procedures in multilevel models. In particular, one should not place
too much emphasis on the specific values obtained for the boundaries to the re-
gions of significance, as these boundaries would likely change if another method
for approximating the test distribution was selected (e.g., an alternative method for
determining degrees of freedom). For cross-level interactions, we expect that the
results will be most sensitive to this choice when there is a small sample size at the
upper level of the model.
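The sensitivity of the boundaries to the assumed test distribution can be illustrated with a short sketch. Everything here is hypothetical: the estimates are invented, and the critical values are two-sided .05 values hard-coded from standard t tables for a few degrees of freedom, not the output of any particular degrees-of-freedom method.

```python
import math

# Two-sided alpha = .05 critical values from standard t tables, keyed by
# degrees of freedom; float("inf") holds the large-sample normal value.
T_CRIT = {10: 2.228, 30: 2.042, 158: 1.975, float("inf"): 1.960}

# Hypothetical estimates (not taken from the article).
G10, G11 = 2.0, 0.5                   # gamma_10 and gamma_11
V10, C1011, V11 = 0.04, -0.01, 0.01   # VAR(g10), COV(g10, g11), VAR(g11)

def boundaries(tcrit):
    """Roots of Equation 43 for a given critical value."""
    a = tcrit**2 * V11 - G11**2
    b = 2 * (tcrit**2 * C1011 - G10 * G11)
    c = tcrit**2 * V10 - G10**2
    root = math.sqrt(b**2 - 4 * a * c)
    return tuple(sorted(((-b - root) / (2 * a), (-b + root) / (2 * a))))

bounds = {df: boundaries(t) for df, t in T_CRIT.items()}
for df, (lower, upper) in bounds.items():
    print(f"df = {df}: boundaries at {lower:.2f} and {upper:.2f}")
```

With these inputs the interval of nonsignificance widens as the degrees of freedom shrink, which is one way to see why the exact boundary values deserve less weight when the number of upper-level units is small.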
Similarly, while the distinction between nonsimultaneous and simultaneous
versions of the J-N technique also arises in the multilevel model, the test statistic is
not as easily derived as in the fixed-effects regression model. Specifically,
Miyazaki (2002) argued that a critical F would be inappropriate given the more
complex error structure of the multilevel model and has instead advocated the use
of a Wald test that is chi-square distributed when the number of groups J is large.
Given that many applications of multilevel models involve a relatively small num-
ber of groups, the asymptotic nature of this test may provide a serious impediment
to the simultaneous approach as outlined by Miyazaki. For this reason, and those
adumbrated previously, we continue to use the nonsimultaneous version of the J-N
technique here. However, whether one prefers a nonsimultaneous or simultaneous
approach, it is fair to say that more study is needed to determine the optimal
method for testing conditional effects in small samples so as to obtain accurate and
stable estimates of the regions of significance and confidence bands.4
In summary, the principal difficulties associated with applying the J-N tech-
nique to multilevel models are no different than those that impact any multilevel
modeling analysis, regardless of the presence of within-level or cross-level interac-
tive effects. Specifically, the asymptotic covariance matrix of the fixed effects must
take account of the additional variance components in the model and the test distri-
bution for the fixed effects estimates is only approximately known so tests are typi-
cally inexact. Neither of these issues detracts from our ability to apply the J-N tech-
nique to evaluate conditional effects, and much interpretive information is to be
gained from their use, as we now demonstrate with an empirical application.

EMPIRICAL EXAMPLE

To demonstrate the use of J-N regions and bands in practice, we considered data from
the High School and Beyond (HSB) study that was described in detail by Rauden-
bush and Bryk (2002) and Singer (1998). These data consist of a total sample of
7,185 students who are nested within 160 schools. Between 14 and 67 students were
assessed from each school, with a median of 47 students assessed. The
outcome of interest is a child-level measure of math achievement ($\bar{y}$ = 12.75, sd =
6.89). The first predictor was a continuous measure of child socioeconomic status
(SES) which was group mean centered (sd = .66). The second predictor was a dichot-
omous measure of school sector in which a value of 0 reflected a public school and a
value of 1 reflected a private school (49% of schools were private). The final predic-
tor was a continuous measure of disciplinary climate of the school in which higher
values reflected greater disciplinary problems (see Bryk & Thum, 1989, for further
details). School discipline was grand mean centered (sd = .94).
We considered two separate models to demonstrate both a dichotomous by continuous
cross-level interaction and a continuous by continuous cross-level interaction.
Both models considered child SES as the sole level-1 predictor, but the first
included sector as the sole level-2 predictor while the second included school discipline
as the sole level-2 predictor. These models are thus of the form given in
Equations 27 through 30. Although basic, these examples allow us to highlight the
probing of two common types of cross-level interactions, and these methods generalize
to any number of more complex (and more realistic) conditions. To be consistent
with Singer’s (1998) analysis of the same data, both models were estimated
in SAS PROC MIXED with restricted ML estimation and using the “between-within”
method for computing degrees of freedom for tests of the fixed effects estimates
(SAS Institute, 1999). However, the choice of method for computing degrees of
freedom is relatively unimportant here, given the large number of level-2 units.

4 As one anonymous reviewer noted, a likelihood ratio test constitutes a plausible alternative to a t
test of the conditional effect estimate, both for multilevel and standard regression models. A potentially
fruitful direction for future research on this topic may be to generate methods for computing regions of
significance and confidence bands based on likelihood ratio tests. Because likelihood ratio tests are
only asymptotically chi-square distributed, we can anticipate that small sample sizes may also be problematic
for this approach.

Child SES and school sector. The first multilevel model included the continuous
measure of child SES as the sole level-1 predictor and the dichotomous measure
of school sector as the sole level-2 predictor. Random effects were estimated for
the level-1 intercept and slope, and both of these effects were regressed on school
sector. Detailed results are presented in Table 2. All of the fixed effects were significant,
most notably the cross-level interaction between child SES and school sector
reflecting that the magnitude of the relation between child SES and math achievement
varied as a function of school sector.

TABLE 2
Results From HSB Multilevel Models With Categorical by Continuous Interaction

Model 1: Child SES and Sector

                                                        PE      SE      df    p
Random effects
  Residual ($\hat{\sigma}^2$)                        36.71     .63            < .0001
  Intercept ($\hat{\tau}_{00}$)                       6.73     .86            < .0001
  Slope ($\hat{\tau}_{11}$)                            .27     .23            .1228
  Covariance ($\hat{\tau}_{01}$)                      1.05     .34            .0021
Fixed effects
  Intercept ($\hat{\gamma}_{00}$)                    11.39     .29     158    < .0001
  Child SES ($\hat{\gamma}_{10}$)                     2.80     .16    7023    < .0001
  Sector ($\hat{\gamma}_{01}$)                        2.81     .44     158    < .0001
  Child SES by Sector ($\hat{\gamma}_{11}$)          –1.34     .23    7023    < .0001
–2LL                                               46638.6

Asymptotic Variance/Covariance Matrix of Fixed Effects

                       $\hat{\gamma}_{00}$   $\hat{\gamma}_{10}$   $\hat{\gamma}_{01}$   $\hat{\gamma}_{11}$
$\hat{\gamma}_{00}$           .086
$\hat{\gamma}_{10}$           .012                  .024
$\hat{\gamma}_{01}$          –.086                 –.012                  .193
$\hat{\gamma}_{11}$          –.012                 –.024                  .027                  .055

Note. All models were fit using SAS PROC MIXED using the High School and Beyond data described
in the text. PE = parameter estimate; SE = standard error.

To probe this effect, we computed the simple slopes of math achievement on
child SES within each level of sector. Results indicated that there was a signifi-
cantly positive relation between math achievement and child SES for both sectors,
but this relation was significantly stronger for children enrolled in public schools.
Further, children enrolled in private schools reported significantly higher math
scores at lower levels of SES, but the magnitude of this effect diminished with in-
creasing SES (see Figure 3). However, the specific point on child SES at which the
difference between private and public schools becomes nonsignificant is not
known using the pick-a-point approach. To identify this point, we applied the J-N
technique to calculate the regions of significance and associated confidence bands
(see Figure 4).
The boundaries to the regions of significance indicated that math achievement
scores were significantly higher for children enrolled in private schools when child
SES was less than 1.23, the difference between sectors was nonsignificant between
SES values of 1.23 and 3.65, and children in public schools outperformed those in
private schools at values of SES greater than 3.65. Given that the observed values
on SES ranged from –3.65 to 2.86 (indicated in the figure by the darkened portion
of the abscissa), these regions indicated that in the HSB sample, children in private
schools reported significantly higher math achievement scores at any given value
of SES up to 1.89 standard deviations above the mean, but this effect was not significant
at higher values of SES. Although the regions implied that the difference
between private and public schools reversed direction at values of SES greater than
3.65, no cases were observed above this value in the HSB sample. The center of accuracy
for estimating the achievement differences between students of private versus
public schools was found at an SES level of –.49, with diminishing precision as
SES increased or decreased from this value.

FIGURE 3 Plot of simple slopes between math achievement and child SES as a function of
private versus public school. Note. The simple slope between math achievement and child SES
significantly differs from zero within both private and public schools.

FIGURE 4 J-N regions of significance and confidence bands for the conditional relation between
math achievement and child SES as a function of school sector. Note. Dashed vertical
lines reflect regions of significance and dark horizontal line with diamonds indicates the actual
range of child SES observed in sample data.
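The boundaries reported for Model 1 can be approximately recovered from the rounded entries of Table 2. The sketch below treats sector as the focal predictor and child SES as the moderator, so the conditional effect follows Equation 33 and the roles of the coefficients in Equations 44 through 46 are swapped accordingly. The critical value of 1.96 is an assumption (the value implied by the between-within degrees of freedom is not reported), and small discrepancies from the published figures should be expected because the inputs are rounded.

```python
import math

# Estimates from Table 2 (Model 1). Focal predictor: sector; moderator: child SES.
g01, g11 = 2.81, -1.34                  # sector effect, SES-by-sector interaction
v01, c0111, v11 = 0.193, 0.027, 0.055   # VAR(g01), COV(g01, g11), VAR(g11)
tcrit = 1.96                            # assumed large-sample critical value

# Quadratic in SES whose roots bound the regions of significance.
a = tcrit**2 * v11 - g11**2
b = 2 * (tcrit**2 * c0111 - g01 * g11)
c = tcrit**2 * v01 - g01**2
root = math.sqrt(b**2 - 4 * a * c)
lower, upper = sorted(((-b - root) / (2 * a), (-b + root) / (2 * a)))

# Value of SES at which the sector difference is estimated most precisely.
center = -c0111 / v11

print(f"boundaries on child SES: {lower:.2f} and {upper:.2f}")
print(f"center of accuracy: {center:.2f}")
```

Under these assumptions the computed boundaries and center of accuracy land close to the values of 1.23, 3.65, and –.49 given in the text.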

Child SES and school discipline. To demonstrate these procedures with a
continuous by continuous interaction, we estimated a second multilevel model in
which child SES remained the sole level-1 predictor, but the continuous measure
of school discipline was considered as the sole level-2 predictor. Detailed results
are presented in Table 3. The fixed effect for the cross-level interaction between
child SES and school discipline was significant reflecting that the magnitude of the
relation between child SES and math achievement varied across continuous levels
of school discipline.
To probe the nature of this relation, we first used the pick-a-point approach in
which we calculated the simple slopes between math achievement and child SES at
high, medium, and low values of school discipline (defined as plus and minus one
standard deviation around the mean of school discipline; see Figure 5). There was
a significant and positive relation between child SES and math achievement across
all three levels of school discipline, although the magnitude of this relation was
larger at higher levels of school discipline (reflecting greater school discipline
problems). Thus, child SES is a significantly stronger predictor of math achieve-
ment in schools with greater discipline problems.

TABLE 3
Results From HSB Multilevel Models With Continuous by Continuous Interaction

Model 2: Child SES and Disciplinary Climate

                                                        PE      SE      df    p
Random effects
  Residual ($\hat{\sigma}^2$)                        36.69     .63            < .0001
  Intercept ($\hat{\tau}_{00}$)                       6.64     .85            < .0001
  Slope ($\hat{\tau}_{11}$)                            .42     .25            .0459
  Covariance ($\hat{\tau}_{01}$)                       .84     .34            .0149
Fixed effects
  Intercept ($\hat{\gamma}_{00}$)                    12.79     .22     158    < .0001
  Child SES ($\hat{\gamma}_{10}$)                     2.16     .12    7023    < .0001
  Discipline ($\hat{\gamma}_{01}$)                   –1.49     .22     158    < .0001
  Child SES by Discipline ($\hat{\gamma}_{11}$)        .60     .13    7023    < .0001
–2LL                                               46651.6

Asymptotic Variance/Covariance Matrix of Fixed Effects

                       $\hat{\gamma}_{00}$   $\hat{\gamma}_{10}$   $\hat{\gamma}_{01}$   $\hat{\gamma}_{11}$
$\hat{\gamma}_{00}$           .048
$\hat{\gamma}_{10}$           .005                  .015
$\hat{\gamma}_{01}$          –.005                 –.001                  .050
$\hat{\gamma}_{11}$          –.001                 –.002                  .006                  .017

Note. All models were fit using SAS PROC MIXED using the High School and Beyond data described
in the text. PE = parameter estimate; SE = standard error.

As before, the pick-a-point approach shows that there is a significant and posi-
tive relation between math achievement and child SES at high, medium, and low
values of discipline and that the magnitude of this effect decreases with improved
school discipline; however, we do not yet know at what level of school discipline
the relationship between math achievement and SES becomes nonsignificant. Application
of the J-N technique (see Figure 6) shows that the conditional effect of
child SES on math achievement was significantly negative at discipline levels less
than –6.38, nonsignificant at discipline levels between –6.38 and –2.47, and signif-
icantly positive at discipline levels above –2.47. Given that the range of observed
values of discipline was between –2.28 and 2.89, this implies that there is a signifi-
cant and positive relation between math achievement and child SES across all lev-
els of school discipline observed within the HSB sample. The confidence bands
show that the conditional effect of SES was estimated with most precision at a dis-
cipline level of .09, or roughly at the mean level of school discipline, with decreas-
ing precision at higher or lower levels of school discipline. Again, the interested
reader can reproduce these analyses, or conduct similar analyses of their own,
through the freely accessible javascript programs described earlier.
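The pick-a-point results for Model 2 can be sketched from the rounded entries of Table 3. This is an illustration rather than a reproduction of the authors' program: the function name `simple_slope` is hypothetical, and the ratio of estimate to standard error is compared informally with 1.96 rather than with a critical value based on a particular degrees-of-freedom method.

```python
import math

# Fixed effects and sampling (co)variances taken from Table 3 (Model 2).
G10, G11 = 2.16, 0.60                    # child SES, SES-by-discipline
V10, C1011, V11 = 0.015, -0.002, 0.017   # VAR(g10), COV(g10, g11), VAR(g11)
SD_DISCIPLINE = 0.94                     # SD of grand-mean-centered discipline
OBSERVED_MIN = -2.28                     # lowest observed discipline score

def simple_slope(w1):
    """Estimate, standard error, and their ratio for the SES slope at w1."""
    est = G10 + G11 * w1                                  # Equation 41
    se = math.sqrt(V10 + 2 * w1 * C1011 + w1**2 * V11)    # Equation 42
    return est, se, est / se

points = {
    "low": -SD_DISCIPLINE,
    "medium": 0.0,
    "high": SD_DISCIPLINE,
    "observed minimum": OBSERVED_MIN,
}
slopes = {label: simple_slope(w1) for label, w1 in points.items()}
for label, (est, se, ratio) in slopes.items():
    print(f"{label:>16}: slope = {est:.2f}, SE = {se:.2f}, ratio = {ratio:.1f}")
```

Evaluating the slope at the lowest observed discipline score, in addition to the three conventional points, is a direct check on the statement that the effect of child SES remains significantly positive across the observed range.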

FIGURE 5 Plot of simple slopes of the relation between math achievement and child SES as a
function of high, medium and low values of school disciplinary climate. Note. High, medium,
and low values of school disciplinary climate are defined as plus and minus 1 sd about the mean.

FIGURE 6 J-N regions of significance and confidence bands for the conditional relation be-
tween math achievement and child SES across all possible values of school disciplinary cli-
mate. Note. Only the upper boundary of the region is demarcated with the vertical dashed line
given the scaling of the abscissa. Dark horizontal line with diamonds indicates the actual range
of school discipline observed in sample data.

LIMITATIONS AND DIRECTIONS
FOR FUTURE RESEARCH

By highlighting the common analytical basis of the pick-a-point approach and J-N
technique for evaluating conditional effects in models including interactions, we
have shown that the J-N technique can be generalized to interactions involving any
combination of categorical and continuous predictors, and that both the
pick-a-point and J-N techniques can be extended from fixed-effects regression
models to multilevel models with random effects. We believe that these techniques
provide critically important information needed to gain a full understanding about
the complex conditional relations commonly encountered in both the fixed-effects
and multilevel regression models. Although we believe that we have delineated the
use of pick-a-point and J-N approaches in ways not previously considered, there
are of course several limitations to our work.
One limitation of the present research is that we have not considered how the vi-
olation of specific model assumptions might impact on the use of either the
pick-a-point approach or the J-N technique. We can, however, make several tenta-
tive statements on the basis of prior research. First, if the variance components of
the model are not correctly specified (either in the fixed or multilevel cases, e.g.,
through the omission of a random effect, failure to model heteroscedasticity or
autocorrelation, or in assumptions concerning the distributions of the errors) this is
likely to adversely affect the estimation of the asymptotic covariance matrix of the
fixed effects estimates, in turn compromising the inferential tests provided by
pick-a-point or the J-N technique. A second assumption of both the fixed- and mul-
tilevel regression models is that the predictors in the model are nonstochastic,
meaning that they are known and fixed values measured without error. Of course,
this assumption will rarely hold in practice. Fortunately, research by Rogosa
(1977, pp. 94–95) on the J-N technique in fixed-effects regression models shows
that the results of the J-N technique are robust to the use of stochastic predictors. If
the predictors are error-free, the Type I error rate is unaffected, although power di-
minishes. Measurement error predictably causes the region of significance to
shrink (and confidence bands to broaden), yet Type I errors may also occur through
a displacement of the region of significance (see Rogosa, 1977, p. 78, for further
detail). Although clearly speculative, we would expect that similar results would
also obtain for the J-N technique in multilevel models with stochastic predictors.
Future research on the consequences of violating these and other assumptions of
the fixed and multilevel regression models would be useful for ascertaining the ro-
bustness of the pick-a-point approach and J-N technique. Again, given that some
assumptions will likely be violated in any given application, we believe it best to
consider the results of the J-N technique as heuristic rather than exact.
A second limitation of the present research is that we discussed only two-way
interactions between a single pair of predictors. The application of these tech-
niques to more complex interaction patterns may at times be difficult. For instance,
suppose that the focal predictor $x_1$ independently interacts with both $x_2$ and
$x_3$ (i.e., there is a two-way interaction between $x_1$ and $x_2$ and a two-way interaction
between $x_1$ and $x_3$). Then the conditional effect $\omega_1$ is no longer a linear function
of one moderator but is instead described by a plane over the dimensions $x_2$
and x3. The confidence bands then evolve into confidence sheets about the plane
and the regions are correspondingly more complex to derive and interpret. In the
ANCOVA context, Hunka (1995) and Hunka and Leighton (1997) discussed ways
to make the J-N technique tractable when there are multiple covariates that each in-
teract with the grouping variable, but this approach has yet to be generalized to
continuous focal predictors or models involving random effects. Similarly, the J-N
technique may be difficult to apply in models involving three-way interactions,
such as where x1 interacts with the x2x3 product term. For situations such as these
that may arise in the multilevel growth model, Curran et al. (2004) suggested it
may be fruitful to blend the pick-a-point approach and J-N technique. Specifically,
the conditional effect of x1 would be plotted as a function of x2 and examined
through the J-N technique at various selected levels of x3. This approach has partic-
ular appeal if at least one of the moderators is nominal, providing natural levels at
which to assess the conditional effects of the others. Further research may offer ad-
ditional opportunities to explore higher-order interactions in fixed-effects and
multilevel regression models.

REFERENCES

Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Newbury
Park, CA: Sage.
Aitken, M. A. (1973). Fixed-width confidence intervals in linear regression with applications to the John-
son-Neyman technique. British Journal of Mathematical and Statistical Psychology, 26, 261–269.
Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psycholog-
ical research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social
Psychology, 51, 1173–1182.
Bauer, D. J., Curran, P. J., & Bollen, K. A. (2001, June). On the use of confidence bands in latent trajec-
tory models. Paper presented at the meeting of the Psychometric Society, King of Prussia, PA.
Bryk, A. S., & Raudenbush, S. W. (1987). Application of hierarchical linear models to assessing
change. Psychological Bulletin, 101, 147–158.
Bryk, A. S., & Thum, Y. M. (1989). The effects of high school organization on dropping out: An exploratory
investigation. American Educational Research Journal, 26, 353–383.
Cohen, J. (1978). Partialled products are interactions; Partialled powers are curve components. Psycho-
logical Bulletin, 85, 858–866.
Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis
for the behavioral sciences (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Curran, P. J., Bauer, D. J., & Willoughby, M. T. (2004). Testing main effects and interactions in latent
curve analysis. Psychological Methods, 9, 220–237.
Curran, P. J., Bauer, D. J., & Willoughby, M. T. (in press). Testing and probing within-level and between-level
interactions in hierarchical linear models. In C. S. Bergeman & S. M. Boker (Eds.), The
Notre Dame Series on quantitative methodology, Volume 1: Methodological issues in aging re-
search. Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Gafarian, A. V. (1964). Confidence bands in straight line regression. Journal of the American Statistical
Association, 59, 182–213.
Hox, J. (2002). Multilevel analysis: Techniques and applications. Mahwah, NJ: Lawrence Erlbaum As-
sociates, Inc.
Huitema, B. E. (1980). The analysis of covariance and alternatives. New York: Wiley.
Hunka, S. (1995). Identifying regions of significance in ANCOVA problems having non-homogeneous
regressions. British Journal of Mathematical and Statistical Psychology, 48, 161–188.
Hunka, S., & Leighton, J. (1997). Defining Johnson-Neyman regions of significance in the
three-covariate ANCOVA using Mathematica. Journal of Educational and Behavioral Statistics,
22, 361–387.
Hussong, A. M. (2003). Further refining the stress-coping model of alcohol involvement. Addictive Be-
haviors, 28, 1515–1522.
Jaccard, J., & Turrisi, R. (2003). Interaction effects in multiple regression (2nd ed.). Thousand Oaks,
CA: Sage.
Jaccard, J., Turrisi, R., & Wan, C. K. (1990). Interaction effects in multiple regression. Newbury Park,
CA: Sage.
Johnson, P. O., & Fay, L. C. (1950). The Johnson-Neyman technique, its theory and application.
Psychometrika, 15, 349–367.
Johnson, P. O., & Neyman, J. (1936). Tests of certain linear hypotheses and their applications to some
educational problems. Statistical Research Memoirs, 1, 57–93.
Kackar, R. N., & Harville, D. A. (1984). Approximations for standard errors of estimators of fixed and
random effects in mixed linear models. Journal of the American Statistical Association, 79,
853–862.
Laird, N. M., & Ware, J. H. (1982). Random effects models for longitudinal data. Biometrics, 38,
963–974.
Miyazaki, Y. (2002, April). Johnson-Neyman type technique in hierarchical linear model. Paper pre-
sented at the meeting of the American Educational Research Association, New Orleans, LA.
Morrison, D. F. (1990). Multivariate statistical methods. New York: McGraw-Hill.
Neter, J., Kutner, M. H., Nachtsheim, C. J., & Wasserman, W. (1996). Applied linear statistical models
(4th ed.). Boston: McGraw-Hill.
Potthoff, R. F. (1964). On the Johnson-Neyman technique and some extensions thereof. Psychometrika,
29, 241–256.
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis
methods (2nd ed.). Newbury Park, CA: Sage.
Rogosa, D. (1977). Some results for the Johnson-Neyman technique. Dissertation Abstracts Interna-
tional, 38 (09), 5366A. (UMI No. AAT 7802225).
Rogosa, D. (1980). Comparing nonparallel regression lines. Psychological Bulletin, 88, 307–321.
Rogosa, D. (1981). On the relationship between the Johnson-Neyman region of significance and statis-
tical tests of parallel within group regressions. Educational and Psychological Measurement, 41,
73–84.
SAS Institute. (1999). SAS documentation, Version 8. Cary, NC: SAS Publications.
Schaalje, G. B., McBride, J. B., & Fellingham, G. W. (2002). Adequacy of approximations to distribu-
tions of test statistics in complex mixed linear models using SAS Proc MIXED. Journal of Agricul-
tural, Biological, and Environmental Statistics, 7, 512–524.
Seber, G. A. F. (1977). Linear regression analysis. New York: Wiley.
Singer, J. (1998). Using SAS PROC MIXED to fit multilevel models, hierarchical models, and individual
growth models. Journal of Educational and Behavioral Statistics, 23, 323–355.
Tate, R. (2004). Interpreting hierarchical linear and hierarchical generalized models with slopes as out-
comes. The Journal of Experimental Education, 73, 71–95.
Wilkinson, L., & the Task Force on Statistical Inference. (1999). Statistical methods in psychology
journals: Guidelines and explanations. American Psychologist, 54, 594–604.
Willett, J. B., Singer, J. D., & Martin, N. C. (1998). The design and analysis of longitudinal studies of
development and psychopathology in context: Statistical models and methodological recommenda-
tions. Development and Psychopathology, 10, 395–426.
Yip, T., & Fuligni, A. J. (2002). Daily variation in ethnic identity, ethnic behaviors, and psychological
well-being among American adolescents of Chinese descent. Child Development, 73, 1557–1572.

Accepted August 2004
