Fixed and Random Effects
In: B.S. Everitt and D.C. Howell (eds.), Encyclopedia of Statistics in Behavioral Science.
Volume 2, 664-665. Chichester (etc.): Wiley, 2005.
Abstract:
In performing a multilevel analysis, what is a level? And when should a variable have a
random slope? This depends on whether the units in the design should be regarded as
being representative of a population, and on whether the researcher wishes to draw
conclusions primarily about the observed units, or primarily about the population.
Keywords:
fixed effects, random effects, linear model, multilevel analysis, mixed model, population,
dummy variables.
The first decision concerning random effects in specifying a multilevel model is the
choice of the levels of analysis. These levels can be, e.g., individuals, classrooms,
schools, organisations, neighborhoods, etc. Formulated generally, a level is a set of units,
or equivalently a system of categories, or a classification factor in a statistical design. In
statistical terminology, a level in a multilevel analysis is a design factor with random
effects. What does this mean?
The main consideration that qualifies a set of units, e.g., organisations, as a level
in the multilevel sense is that the researcher is not primarily interested in the particular
organisations (units) in the sample, but in the population (the wider set of organisations)
for which the observed units are deemed to be representative. Statistical theory uses the
word ‘exchangeability’, meaning that from the researcher’s point of view any unit in the
population could have taken the place of each unit in the observed sample. (Whether the
sample was drawn randomly according to some probability mechanism is a different
issue – sometimes it can be argued that convenience samples or full population
inventories can reasonably be studied as if they constituted a random sample from some
hypothetical larger population.) Note that it is the residuals (sometimes called error
terms) associated with these units that are assumed to be exchangeable, which means
that any fixed effects of explanatory variables have already been partialed out.
In addition, for a set of units to qualify as a non-trivial level in a multilevel analysis, the
dependent variable has to show some amount of residual, or unexplained, variation associated
with these units. For example, if the study is about the work satisfaction (dependent variable) of
employees (level-one units) in organisations (level-two units), this means that employees
in some organisations tend to have a higher satisfaction than in some other organisations,
and the researcher cannot totally pin this down to the effect of particular measured
variables. Interest in the population here means that the
researcher wants to know the amount of residual variability, i.e., the residual variance, in
average work satisfaction within the population of organisations (and perhaps also in the
more complicated types of residual variability discussed below). If the residual variance
is zero, then it is superfluous to use this set of units as a level in the multilevel analysis.
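The role of the residual variance can be illustrated with a small simulation. The following is a minimal sketch, not taken from the article: the function names, the simulated data, the balanced design (equal numbers of employees per organisation), and the simple moment (one-way ANOVA) estimator are all assumptions made for illustration. In practice a multilevel model would usually be fitted by maximum likelihood in dedicated software.

```python
import numpy as np


def simulate_satisfaction(n_orgs=200, n_per_org=20, tau2=0.5, sigma2=1.0, seed=0):
    """Simulate scores as grand mean + organisation effect + employee residual.

    All parameter values are hypothetical; tau2 is the between-organisation
    residual variance and sigma2 the within-organisation residual variance.
    """
    rng = np.random.default_rng(seed)
    org_effects = rng.normal(0.0, np.sqrt(tau2), size=n_orgs)       # level-two residuals
    residuals = rng.normal(0.0, np.sqrt(sigma2), size=(n_orgs, n_per_org))  # level-one residuals
    return 5.0 + org_effects[:, None] + residuals                   # shape (n_orgs, n_per_org)


def variance_components(y):
    """Moment estimates for a balanced design.

    Returns (between-organisation variance, within-organisation variance).
    """
    n_per_org = y.shape[1]
    within = y.var(axis=1, ddof=1).mean()        # pooled within-organisation variance
    var_of_means = y.mean(axis=1).var(ddof=1)    # observed variance of organisation means
    # Organisation means also differ because of sampling noise (within / n_per_org);
    # subtract that part and truncate the estimate at zero.
    between = max(var_of_means - within / n_per_org, 0.0)
    return between, within


y = simulate_satisfaction()
tau2_hat, sigma2_hat = variance_components(y)
print(f"between-organisation residual variance: {tau2_hat:.3f}")
print(f"within-organisation residual variance:  {sigma2_hat:.3f}")
print(f"share of variance between organisations: {tau2_hat / (tau2_hat + sigma2_hat):.3f}")
```

If the estimated between-organisation variance is close to zero, this is an indication that the organisations need not be treated as a level.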
When there are no theoretical or other prior guidelines about which variables should have
a random effect, the researcher can be led by the substantive focus of the investigation,
the empirical findings, and parsimony of modeling. This implies that those explanatory
variables that are especially important or have especially strong effects could be modeled
with random effects if the variances of these effects are important enough, as evidenced
by their significance and size. One should take care, however, that the number of variables
with random effects is not so large that the model becomes unwieldy.
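To make "the variances of these effects" concrete, a two-level model with one explanatory variable that has a random effect can be written as follows. The notation is an assumption introduced here for illustration and does not come from the article: i indexes level-one units, j indexes level-two units, and x_{ij} is the explanatory variable.

```latex
\begin{align*}
  Y_{ij} &= \gamma_0 + \gamma_1 x_{ij} + U_{0j} + U_{1j}\, x_{ij} + R_{ij}, \\
  \operatorname{var}(U_{0j}) &= \tau_0^2, \qquad
  \operatorname{var}(U_{1j}) = \tau_1^2, \qquad
  \operatorname{cov}(U_{0j}, U_{1j}) = \tau_{01}, \qquad
  \operatorname{var}(R_{ij}) = \sigma^2 .
\end{align*}
```

In this notation, the question of whether x should have a random effect is the question of whether \tau_1^2 is large enough to matter. Each additional variable with a random effect adds a variance and a covariance with every random effect already in the model, which is one reason for keeping their number small.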
Modeling an effect as random usually, although not necessarily, goes together with the
assumption of a normal distribution for the random effects. Sometimes this is not in
accordance with reality, which can then lead to biased results. The alternative,
entertaining models with non-normally distributed residuals, can be complicated, but
methods have been developed; see [2]. In addition, the assumption is made that the random
effects are uncorrelated with the explanatory variables. If there are doubts about
normality or independence for a so-called nuisance effect, i.e., an effect the researcher is
interested in not for its own sake but only because it must be statistically controlled for,
then there is an easy way out (at least for linear models). If the doubts concern the main
effect of a categorical variable, which also would be a candidate for being modeled as a
level as discussed above, then the easy solution is to model this categorical control
variable by fixed effects, i.e., using dummy variables for the units in the sample. If it is a
random slope for which such a statistical control is required without making the
assumption of residuals being normally distributed and independent of the other
explanatory variables, then the analogue is to use an interaction variable obtained by
multiplying the explanatory variable in question by the dummy variables for the units.
The consequence of this easy way out, however, is that the statistical generalizability to
the population of these units is lost.
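The dummy-variable approach for a main effect can be sketched as follows. This is a minimal illustration under stated assumptions, not the article's own analysis: the simulated data, the parameter values, and the function names are hypothetical. The unit effects are deliberately skewed and correlated with the explanatory variable, so that both assumptions discussed above fail. The comparison is with a regression that ignores the units altogether, not with a fitted random effects model.

```python
import numpy as np


def simulate(n_units=50, n_per_unit=20, beta=2.0, seed=0):
    """Simulate data whose unit effects are non-normal and correlated with x."""
    rng = np.random.default_rng(seed)
    unit = np.repeat(np.arange(n_units), n_per_unit)   # unit label for each observation
    u = rng.exponential(1.0, size=n_units) - 1.0       # skewed unit effects with mean zero
    x = u[unit] + rng.normal(size=unit.size)           # x depends on the unit effect
    y = 1.0 + beta * x + 3.0 * u[unit] + rng.normal(size=unit.size)
    return unit, x, y


def slope_ignoring_units(x, y):
    """Ordinary least squares of y on x with a single intercept."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]


def slope_with_unit_dummies(unit, x, y):
    """Ordinary least squares of y on x plus one dummy variable per unit."""
    labels = np.unique(unit)
    # One column per unit; together the dummies replace the common intercept.
    dummies = (unit[:, None] == labels[None, :]).astype(float)
    X = np.column_stack([x, dummies])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0]


unit, x, y = simulate()
print(f"slope ignoring units:    {slope_ignoring_units(x, y):.3f}")
print(f"slope with unit dummies: {slope_with_unit_dummies(unit, x, y):.3f}")
```

With the dummy variables included, the slope is estimated from within-unit variation only, so it does not depend on the distribution of the unit effects. The price is the one stated above: the fitted unit effects describe only the units in the sample.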
References:
[1] Raudenbush, S.W., & Bryk, A.S. Hierarchical Linear Models. Applications and Data
Analysis Methods. Newbury Park, CA: Sage, 2nd ed., 2002.
[2] Seltzer, M., & Choi, K. Model checking and sensitivity analysis for multilevel models.
In N. Duan & S. Reise (Eds.), Multilevel modeling: Methodological advances, issues, and
applications. Hillsdale, NJ: Lawrence Erlbaum, 2002.
[3] Snijders, T.A.B., & Bosker, R.J. Multilevel Analysis: An Introduction to Basic and
Advanced Multilevel Modeling. London etc.: Sage Publishers, 1999.