White 2010
Received 3 September 2009; Accepted 14 July 2010; Published online 30 November 2010 in Wiley Online Library
1. Introduction
a MRC Biostatistics Unit, Institute of Public Health, Robinson Way, Cambridge CB2 0SR, U.K.
b Hub for Trials Methodology Research, MRC Clinical Trials Unit and University College London, 222 Euston Road, London NW1 2DA, U.K.
c Department of Public Health and Primary Care, University of Cambridge, Strangeways Research Laboratory, Worts Causeway, Cambridge
CB2 8RN, U.K.
∗ Correspondence to: Ian R. White, MRC Biostatistics Unit, Institute of Public Health, Robinson Way, Cambridge CB2 0SR, U.K.
† E-mail: [email protected]
Copyright © 2010 John Wiley & Sons, Ltd. Statist. Med. 2011, 30 377–399
I. R. WHITE, P. ROYSTON AND A. M. WOOD
MI produces asymptotically unbiased estimates and standard errors and is asymptotically efficient. The three stages of
MI are described formally below:
Stage 1: Generating multiply imputed data sets: The unknown missing data are replaced by m independent simulated
sets of values drawn from the posterior predictive distribution of the missing data conditional on the observed data. For a
single incomplete variable z, this involves constructing an imputation model which regresses z on a set of variables with
complete data, say x_1, x_2, ..., x_k, among individuals with observed z. Choices of imputation model are discussed
in Section 2. Methods for multiple incomplete variables are discussed in Section 1.3. Let b̂ and V be the set of estimated
regression parameters and their corresponding covariance matrix from fitting the imputation model. The following
two steps are repeated m times. Let b∗ be a random draw from the posterior distribution, commonly approximated
by b∗ ∼ MVN(b̂, V) [2]. Imputations for z are drawn from the posterior predictive distribution of z using b∗ and the
appropriate probability distribution. This process is known as proper imputation because it incorporates all sources
of variability and uncertainty in the imputed values, including prediction errors of the individual values and errors of
estimation in the fitted coefficients of the imputation model. Alternatives to proper imputation [6] are not considered
here. An alternative way to draw proper imputations, predictive mean matching, is described in Section 4.2.
Stage 2: Analyzing multiply imputed data sets: Once the multiple imputations have been generated, each imputed data
set is analyzed separately. This is usually a simple task because complete-data methods can be used. The quantities of
scientific interest (usually regression coefficients) are estimated from each imputed data set, together with their variance–
covariance matrices. The results of these m analyses differ because the missing values have been replaced by different
imputations.
Stage 3: Combining estimates from multiply imputed data sets: The m estimates are combined into an overall estimate
and variance–covariance matrix using Rubin’s rules [2], which are based on asymptotic theory in a Bayesian framework.
The combined variance–covariance matrix incorporates both within-imputation variability (uncertainty about the results
from one imputed data set) and between-imputation variability (reflecting the uncertainty due to the missing information).
Suppose ĥ_j is an estimate of a univariate or multivariate quantity of interest (e.g. a regression coefficient) obtained from
the jth imputed data set and W_j is the estimated variance of ĥ_j. The combined estimate h̄ is the average of the
individual estimates:

    h̄ = (1/m) Σ_{j=1}^{m} ĥ_j.    (1)
The total variance of h̄ is formed from the within-imputation variance W̄ = (1/m) Σ_{j=1}^{m} W_j and the between-imputation
variance B = (1/(m − 1)) Σ_{j=1}^{m} (ĥ_j − h̄)²:

    var(h̄) = W̄ + (1 + 1/m) B.    (2)
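Rubin's rules in equations (1) and (2) are simple to apply directly. The following Python function is an illustrative sketch (not code from the paper; the numbers in the example are made up): it combines m point estimates and their within-imputation variances into a combined estimate and total variance.

```python
import math

def rubin_combine(estimates, variances):
    """Combine m point estimates and their within-imputation variances
    using Rubin's rules (equations (1) and (2))."""
    m = len(estimates)
    h_bar = sum(estimates) / m                               # equation (1)
    w = sum(variances) / m                                   # within-imputation variance
    b = sum((h - h_bar) ** 2 for h in estimates) / (m - 1)   # between-imputation variance
    total_var = w + (1 + 1 / m) * b                          # equation (2)
    return h_bar, total_var

# Hypothetical estimates and variances from m = 5 imputed data sets
est, var = rubin_combine([0.28, 0.31, 0.25, 0.30, 0.27], [0.004] * 5)
print(round(est, 3), round(math.sqrt(var), 3))
```

The combined standard error is the square root of the total variance; note that it exceeds the average within-imputation standard error whenever the estimates differ across imputations.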
Single imputation is sometimes considered as an alternative to multiple imputation, but it is unable to capture the
between-imputation variance B, hence standard errors are too small.
Wald-type significance tests and confidence intervals for a univariate h can be obtained in the usual way
from a t-distribution; degrees of freedom are given in references [7, 8]. Wald tests can also be constructed for a
multivariate h [7].
and ordered categorical) because each variable is imputed using its own imputation model. Suitable choices of imputation
models are discussed in Sections 2 and 4.
We here introduce approaches for imputing missing values in continuous, binary, unordered categorical (nominal) and
ordered categorical variables. In this section, and in Section 4, we assume that z is a variable whose missing values we
wish to impute from other (complete) variables x = (x_1, ..., x_k). For simplicity, we assume that x includes a column of
ones, so that k is the number of parameters estimated, including the intercept. Let n_obs be the number of individuals
with observed z values.
Let b̂ be the estimated parameter (a row vector of length k) from fitting this model to individuals with observed
z. Let V be the estimated covariance matrix of b̂, and σ̂ the estimated root mean-squared error. We next draw the
imputation parameters σ∗, b∗ from the exact joint posterior distribution of σ, b [2]. First, σ∗ is drawn as

    σ∗ = σ̂ √((n_obs − k)/g),

where g is a random draw from a χ² distribution on n_obs − k degrees of freedom. Second, b∗ is drawn as

    b∗ = b̂ + (σ∗/σ̂) u_1 V^{1/2},

where u_1 is a row vector of k independent random draws from a standard Normal distribution and V^{1/2} is the Cholesky
decomposition of V. Imputed values z_i∗ for each missing observation z_i are then obtained as

    z_i∗ = b∗ x_i + u_{2i} σ∗,

where u_{2i} is a random draw from a standard Normal distribution. Closely related approaches that allow for deviations
from the Normal assumption for continuous variables are discussed in Section 4.
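The draw of the imputation parameters and the imputed values can be sketched in Python for the simplest case of one complete predictor plus an intercept. This is an illustrative implementation on simulated data, not the authors' software; the data-generating values are made up.

```python
import math, random

random.seed(1)

def chol2(a, b_, c):
    """Cholesky factor (lower triangular) of the 2x2 symmetric matrix [[a, b_], [b_, c]]."""
    l11 = math.sqrt(a)
    l21 = b_ / l11
    l22 = math.sqrt(c - l21 ** 2)
    return l11, l21, l22

def impute_continuous(x_obs, z_obs, x_mis):
    """One proper-imputation draw for a Normal linear imputation model
    z = b0 + b1*x + e, following the sigma*, b* scheme in the text
    (sketch: single complete predictor x plus intercept, so k = 2)."""
    n, k = len(z_obs), 2
    xbar = sum(x_obs) / n
    zbar = sum(z_obs) / n
    sxx = sum((x - xbar) ** 2 for x in x_obs)
    b1 = sum((x - xbar) * (z - zbar) for x, z in zip(x_obs, z_obs)) / sxx
    b0 = zbar - b1 * xbar
    rss = sum((z - b0 - b1 * x) ** 2 for x, z in zip(x_obs, z_obs))
    sigma_hat = math.sqrt(rss / (n - k))
    # V = sigma_hat^2 (X'X)^{-1} for the design X = [1, x]
    sx2 = sum(x * x for x in x_obs)
    det = n * sx2 - (n * xbar) ** 2
    v00 = sigma_hat ** 2 * sx2 / det
    v01 = -sigma_hat ** 2 * n * xbar / det
    v11 = sigma_hat ** 2 * n / det
    # sigma*: scale sigma_hat by a chi-square draw on n - k degrees of freedom
    g = sum(random.gauss(0, 1) ** 2 for _ in range(n - k))
    sigma_star = sigma_hat * math.sqrt((n - k) / g)
    # b* = b_hat + (sigma*/sigma_hat) u1 V^{1/2}
    l11, l21, l22 = chol2(v00, v01, v11)
    u1, u2 = random.gauss(0, 1), random.gauss(0, 1)
    s = sigma_star / sigma_hat
    b0s = b0 + s * u1 * l11
    b1s = b1 + s * (u1 * l21 + u2 * l22)
    # imputations: z_i* = b* x_i + u_2i sigma*
    return [b0s + b1s * x + random.gauss(0, 1) * sigma_star for x in x_mis]

# Simulated complete cases and three missing cases (illustrative values)
x_obs = [i / 10 for i in range(30)]
z_obs = [1 + 2 * x + random.gauss(0, 0.5) for x in x_obs]
imputed = impute_continuous(x_obs, z_obs, [0.5, 1.5, 2.5])
print(imputed)
```

Repeating the final block m times, with fresh draws of g, u_1 and u_{2i}, produces the m imputed data sets.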
Let b̂ be the estimated parameter from fitting this model to individuals with observed z, with estimated variance–
covariance matrix V. Let b∗ be a draw from the posterior distribution of b, approximated by MVN(b̂, V) [2]. For each
missing observation z_i, let p_i∗ = [1 + exp(−b∗ x_i)]^{−1}, and draw an imputed value z_i∗ as

    z_i∗ = 1 if u_i < p_i∗, 0 otherwise,
where u i is a random draw from a uniform distribution on (0, 1). Such a procedure is straightforward to implement.
Problems can arise due to perfect prediction, which occurs when one or more observations have a fitted probability of
exactly 0 or exactly 1. This causes difficulty in drawing b∗. The same difficulty arises for unordered and ordered categorical
variables, and is further discussed in Section 10.3.
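The Bernoulli draw itself is simple to code. A minimal sketch, taking a posterior draw b∗ as given (the coefficient values and covariate rows below are hypothetical):

```python
import math, random

random.seed(2)

def impute_binary(b_star, x_rows):
    """Draw imputed 0/1 values from a logistic imputation model,
    given a posterior draw b_star of the coefficients (sketch)."""
    out = []
    for xi in x_rows:
        eta = sum(b * x for b, x in zip(b_star, xi))
        p = 1 / (1 + math.exp(-eta))     # p_i* = [1 + exp(-b* x_i)]^{-1}
        out.append(1 if random.random() < p else 0)   # z_i* = 1 if u_i < p_i*
    return out

# Each row of x includes a leading 1 for the intercept; b_star is a hypothetical draw
print(impute_binary([-0.5, 1.2], [(1, 0.0), (1, 1.0), (1, 3.0)]))
```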
where b_l is a vector of dimension k = dim(x) and b_1 = 0. Let b∗ be the usual random draw from a Normal approximation
to the posterior distribution of b = (b_2, ..., b_L), a vector of length k(L − 1). For each missing observation z_i, let
p_il∗ = Pr(z_i = l | x_i; b∗) (l = 1, ..., L) be the drawn class membership probabilities and c_il = Σ_{l′=1}^{l} p_il′∗. Each imputed value z_i∗ is

    z_i∗ = 1 + Σ_{l=1}^{L−1} I(u_i > c_il),

where u_i is a random draw from a uniform distribution on (0, 1) and I(u_i > c_il) = 1 if u_i > c_il, 0 otherwise.
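The categorical draw can be coded directly from this formula: accumulate the cumulative probabilities and count how many the uniform draw exceeds. A short sketch with illustrative class probabilities:

```python
import random

random.seed(3)

def impute_categorical(p_star, u=None):
    """Impute a class label 1..L from drawn class-membership probabilities
    p_star = (p_i1*, ..., p_iL*), via z_i* = 1 + sum_l I(u_i > c_il)."""
    u_i = random.random() if u is None else u
    c = 0.0
    z = 1
    for p in p_star[:-1]:        # cumulative probabilities c_il for l = 1..L-1
        c += p
        if u_i > c:
            z += 1
    return z

# Illustrative probabilities for L = 3 classes; fixed u values show the mapping
probs = [0.2, 0.5, 0.3]
print([impute_categorical(probs, u) for u in (0.1, 0.25, 0.69, 0.71, 0.95)])  # → [1, 2, 2, 3, 3]
```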
The UK700 trial was a multi-centre study conducted in four inner-city areas (here termed ‘centres’) [19]. Participants
were aged 18–65 with a diagnosed psychotic illness and two or more psychiatric hospital admissions, the most recent
within the previous 2 years. In the UK, such patients are typically managed in the community by a case manager. In
the trial, 708 participants were randomly allocated to a case manager either with a case load of 30–35 patients (standard
case management) or with a case load of 10–15 patients (intensive case management). The main trial findings have been
previously reported [19].
Table II. UK700 data: parameters for imputing sat94 and sat96 in the final cycle of the MICE algorithm.

              Parameters for imputing sat94              Parameters for imputing sat96
Imputation    β∗_sat96  β∗_rand  β∗_constant  σ∗         β∗_sat94  β∗_rand  β∗_constant  σ∗
1             0.298     0.524    13.627       4.498      0.308     −0.458   11.476       4.514
2             0.284     0.514    13.985       4.522      0.224     −0.360   13.029       4.619
3             0.262     0.433    14.272       4.547      0.297     −0.275   11.653       4.534
4             0.296     0.486    13.720       4.518      0.310     −0.367   11.423       4.512
5             0.287     0.602    13.856       4.517      0.286     −0.426   11.852       4.550
In order to illustrate MI, we use a subsample of 500 individuals selected at random and summarized in Table I; these
analyses are not intended to be definitive. The main aim of our analyses is to explore the effect of the intervention
(intensive case management, variable rand) on satisfaction with services after two years of follow-up (sat96), allowing
for baseline variables including levels of satisfaction at baseline (sat94). Both sat96 and sat94 are assumed to be
Normally distributed.
Ignoring all other variables in the data set, the simplest imputation model for sat96 is a linear regression on
sat94 and rand. Likewise, the simplest imputation model for sat94 is a linear regression on sat96 and rand. No
imputation model is needed for rand because it is complete. (The alternative of running separate imputation procedures
in different randomized groups is discussed later.) These imputation models assume that missingness in sat94 (sat96)
only depends on the observed values of sat96 (sat94) and rand. The Stata command for running the combined
imputation model with 5 imputations and loading the imputed data into memory is
. ice sat94 sat96 rand, m(5) clear
The command initializes the MICE procedure by replacing all missing values with randomly selected observed values. One
cycle comprises imputing sat94 from the linear imputation model that regresses sat94 on sat96 and rand, and then
imputing sat96 from the linear imputation model that regresses sat96 on sat94 and rand. The process is iterated
through 10 cycles to produce a single imputed data set. In each cycle, variables are imputed in increasing order of the
number of missing values. The procedure is repeated m = 5 times to produce 5 imputed data sets. The imputation parameters for the
last cycle of each of the five imputations are shown in Table II, observed and imputed sat94 and sat96 values for a single
imputation are shown in Figure 1, and the format of the imputed Stata data set is shown in Table III.
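The cycling scheme just described can be sketched in Python for two incomplete continuous variables. This is a deliberately simplified illustration on made-up data: it mirrors the initialization with randomly selected observed values and the regress-then-impute cycles, but, for brevity, omits the proper-imputation draws of σ∗ and b∗ (so it sketches only the cycling, not full proper imputation as ice performs it).

```python
import math, random

random.seed(4)

def ols(x, y):
    """OLS fit of y on x: intercept, slope and residual standard deviation."""
    xb, yb = sum(x) / len(x), sum(y) / len(y)
    b1 = sum((a - xb) * (b - yb) for a, b in zip(x, y)) / sum((a - xb) ** 2 for a in x)
    b0 = yb - b1 * xb
    sig = math.sqrt(sum((b - b0 - b1 * a) ** 2 for a, b in zip(x, y)) / (len(x) - 2))
    return b0, b1, sig

def mice_two_vars(z1, z2, cycles=10):
    """One MICE run for two incomplete continuous variables (None = missing),
    each imputed from the other. Improper-imputation sketch of the cycling."""
    m1 = [i for i, v in enumerate(z1) if v is None]
    m2 = [i for i, v in enumerate(z2) if v is None]
    # initialize missing entries with randomly selected observed values
    o1 = [v for v in z1 if v is not None]
    o2 = [v for v in z2 if v is not None]
    z1 = [v if v is not None else random.choice(o1) for v in z1]
    z2 = [v if v is not None else random.choice(o2) for v in z2]
    for _ in range(cycles):
        b0, b1, s = ols(z2, z1)                 # impute z1 from z2, then ...
        for i in m1:
            z1[i] = b0 + b1 * z2[i] + random.gauss(0, s)
        b0, b1, s = ols(z1, z2)                 # ... impute z2 from z1
        for i in m2:
            z2[i] = b0 + b1 * z1[i] + random.gauss(0, s)
    return z1, z2

# Made-up correlated data with a few values set to missing
x = [random.gauss(20, 4) for _ in range(50)]
y = [0.5 * v + random.gauss(10, 2) for v in x]
x[3] = x[7] = None
y[2] = y[11] = None
x_imp, y_imp = mice_two_vars(x, y)
print(all(v is not None for v in x_imp + y_imp))
```

Repeating the whole run m times, with different random states, yields the m imputed data sets.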
Non-Normally distributed continuous variables often occur. The drawback of imputing such variables by assuming
Normality is that the distribution of imputed values does not resemble that of the observed values (lack of face validity).
If, for instance, z is intrinsically positive and non-linearly related to the outcome but imputed by a linear regression on
x, the presence of non-positive imputed z values means that the correct model (linear regression of the outcome on ln z)
cannot be fitted to the MI data.
We discuss two main ways of dealing with non-Normality: transformation towards Normality and predictive mean
matching. A different view, that non-Normality may not matter, is considered in Section 6.
Figure 1. UK700 data: observed and imputed values for sat94 and sat96.
Table III. UK700 data: data format before and after imputation, for 5 selected individuals and 2
imputed data sets. The imputed data set includes two added identifiers, _mi for individuals and _mj
for imputed data sets, and includes the original data as _mj==0.

Before imputation         After imputation
sat94  rand  sat96        _mi  _mj  sat94  rand  sat96
20     0     .            1    0    20     0     .
18     1     22           2    0    18     1     22
17     0     16           12   0    17     0     16
.      1     .            45   0    .      1     .
.      0     14           61   0    .      0     14
                          1    1    20     0     8.35
                          2    1    18     1     22
                          12   1    17     0     16
                          45   1    20.28  1     13.66
                          61   1    19.53  0     14
                          1    2    20     0     11.10
                          2    2    18     1     22
                          12   2    17     0     16
                          45   2    24.91  1     16.81
                          61   2    22.76  0     14
4.1. Transformation
When z is non-Normal, a simple monotonic transformation f (·) can often be found such that the marginal distribution
of f (z) is approximately Normal [15]. Note that ideally the conditional distribution of f (z) given x would be Normal,
rather than the marginal distribution of f(z), but in practice this distinction may not matter much. Well-known one-parameter
candidates for f(·) include the Box–Cox transformation f(z) = (z^λ − 1)/λ (with f(z) → ln(z) as λ → 0), and
the shifted-log transformation f(z) = ln(±z − a), with a chosen such that (±z − a) > 0; the sign of z is taken as positive
when the distribution of z is positively (right) skewed and negative when z is negatively (left) skewed. The ancillary
parameters λ and a may be estimated by maximum likelihood and confidence intervals found. As exact values are not
critical, rounded values of λ or a may be chosen within their confidence intervals. In particular, if the confidence interval
for λ or a includes 0, λ or a may be taken as 0 and a simple log transformation used.
The two transformations always remove skewness, but non-Normal tails (kurtosis) or other non-Normalities may remain.
If so, two-parameter transformations to remove both skewness and kurtosis may be tried instead. These include the Johnson
SU family [20] and the modulus-exponential-Normal (MEN) and modulus-power-Normal (MPN) transformations [21].
To get imputed values of z, the imputed values of f (z) must be back-transformed to the original scale, thus f (·) must
be both monotonic and invertible.
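As a rough illustration of the transform, impute, back-transform workflow, the following sketch estimates the Box–Cox λ by a grid search over the profile log-likelihood (exact values are not critical, as noted above) and verifies that the transformation is invertible. The data values are made up; this is not the authors' procedure.

```python
import math

def boxcox(z, lam):
    """Box-Cox transform f(z) = (z^lam - 1)/lam, with f(z) = ln z at lam = 0."""
    if lam == 0:
        return [math.log(v) for v in z]
    return [(v ** lam - 1) / lam for v in z]

def inv_boxcox(fz, lam):
    """Inverse transform, needed to map imputed f(z) back to the z scale."""
    if lam == 0:
        return [math.exp(v) for v in fz]
    return [(lam * v + 1) ** (1 / lam) for v in fz]

def boxcox_mle(z):
    """Grid-search estimate of lambda maximizing the Box-Cox profile
    log-likelihood (a rough sketch; a rounded lambda is adequate)."""
    n = len(z)
    logsum = sum(math.log(v) for v in z)
    best = None
    for i in range(-20, 21):
        lam = i / 10
        t = boxcox(z, lam)
        mu = sum(t) / n
        s2 = sum((v - mu) ** 2 for v in t) / n
        ll = -n / 2 * math.log(s2) + (lam - 1) * logsum
        if best is None or ll > best[0]:
            best = (ll, lam)
    return best[1]

z = [0.5, 1.1, 1.9, 3.2, 5.5, 9.0, 15.0, 25.0]   # made-up right-skewed values
lam = boxcox_mle(z)
t = boxcox(z, lam)            # impute on this (approximately Normal) scale
back = inv_boxcox(t, lam)     # back-transform imputations to the z scale
print(lam, [round(v, 3) for v in back])
```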
Transformation to Normality is impossible with ‘semi-continuous’ variables for which a substantial proportion of
values are equal (often zero). An example is weekly alcohol consumption. Such variables are discussed in Section 4.4.
4.2.1. Example. We consider PMM in a small simulated example in which x is univariate. Data pairs (z_i, x_i) for
i = 1, ..., 30 were simulated by drawing z from a uniform distribution on [0.05, 1.05] and x from the non-linear model
x = ln z + N(0, σ²) where σ = 0.16; the population coefficient of determination is R² = 0.95. Ten values of z were deleted
completely at random and were imputed using the standard approach and using PMM. The imputation model for z|x
was assumed linear in x, which is mis-specified, since the true relationship is approximately exponential in x.
The upper left panel of Figure 2 shows the result of one imputation with the standard approach, assuming that z|x
is linear in x and Normally distributed. The imputed values are clearly biased at low x where the relationship is least
linear. However, PMM imputes missing z values appropriately (upper right panel of Figure 2).
Repeating the experiment with PMM and m = 10 imputations, and then estimating β_1 in the model E(x) = β_0 + β_1 ln z
by Rubin's rules, we find that β̂_1 = 0.98 (SE 0.05), which is consistent with the true value of β_1 = 1. With the standard
approach, the correct model cannot be estimated because of the negative imputed values of z.
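The donor-matching step of PMM can be sketched as follows: each missing case borrows the observed z of a randomly chosen donor among the k observed cases whose predicted means are closest to the missing case's predicted mean (here k = 3; the predicted means and observed values below are hypothetical). Because imputed values are copied from observed ones, face validity is automatic.

```python
import random

random.seed(5)

def pmm_impute(pred_obs, z_obs, pred_mis, k=3):
    """Predictive mean matching sketch: for each missing case, find the k
    observed cases with the closest predicted means, then impute the
    observed z of one donor chosen at random."""
    out = []
    for pm in pred_mis:
        donors = sorted(range(len(z_obs)), key=lambda i: abs(pred_obs[i] - pm))[:k]
        out.append(z_obs[random.choice(donors)])
    return out

z_obs = [0.1, 0.3, 0.45, 0.6, 0.8, 1.0]          # observed z values
pred_obs = [0.12, 0.28, 0.50, 0.58, 0.83, 0.97]  # fitted means for observed cases
print(pmm_impute(pred_obs, z_obs, [0.15, 0.9]))  # each result is an observed z
```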
Figure 2. Artificial example of PMM with a non-linear relationship between z and x. Ten values of z missing completely at
random are imputed from 20 observed values of x.
4.3.1. Example. Continuing the example of Section 4.2.1, we apply the ordinal logistic regression approach. The results
in one imputation are shown in the lower panel of Figure 2. The fit of the imputed values appears good. The estimate
β̂_1 is 0.97 (SE 0.06), similar to the result using PMM.
The previous sections have focussed on drawing imputations that correctly represent what the unobserved data might
have been. However, the ultimate aim in MI is to draw valid and efficient inferences by fitting models (‘analysis models’)
to multiply imputed data. We now discuss two key requirements to avoid bias and gain precision when selecting variables
for the imputation model. Section 6 discusses further requirements on the form of the imputation model.
5.1. Include the covariates and outcome from the analysis model
To avoid bias in the analysis model, the imputation model must include all variables that are in the analysis model [7].
In particular, when imputing missing values of analysis model covariates, the analysis model outcome must be in the
imputation model [25]. If the imputed data are to be used to fit several different analysis models, then the imputation
model should contain every variable included in any of the analysis models.
Consider, for example, the analysis of the UK700 data in Section 3, where the analysis model is a linear regression
of sat96 on sat94 and rand. Suppose that the imputation model for sat94 is chosen as a linear regression on
rand only, wrongly omitting sat96. This makes sat94 and sat96 uncorrelated in individuals with imputed sat94
and observed sat96, biasing the coefficient of sat94 in the analysis model towards zero. The solution is simple: add
sat96 to the imputation model for sat94. In the UK700 data, the coefficient of sat94 is 0.248 (standard error 0.057)
when sat96 is excluded from the imputation model for sat94, rather smaller than the value of 0.292 (standard error
0.055) when it is included.
When the analysis is based on a survival model, the outcome comprises time t and the censoring indicator d. Possible
approaches to including the outcome when imputing covariates are including t, log t and d [26], d and log t [27] or
d and t [28] in the imputation model. We have recently explored this issue in the case where the analysis model is a
proportional hazards model [29]. We found that using d and log(t) can bias associations toward the null. The correct
imputation model in the case of a single binary covariate involves d and H_0(t), where H_0(·) is the cumulative baseline
hazard function; this model is approximately correct in more complex settings. In general, H_0(·) is not known, but it
may be adequately approximated by Ĥ(·), the standard (Nelson–Aalen) estimate of the cumulative hazard function. This
procedure may be implemented in Stata using sts generate NA = na and then including the variables NA and the
censoring indicator _d in the ice call.
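Outside Stata, the Nelson–Aalen estimate evaluated at each subject's own time is straightforward to compute. A minimal sketch on hypothetical data (this reproduces, in outline, the NA variable created by sts generate NA = na):

```python
def nelson_aalen(times, events):
    """Nelson-Aalen estimate H(t) of the cumulative hazard at each subject's
    own time, for use as an imputation-model covariate (sketch)."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = len(times)
    h, cum = [0.0] * len(times), 0.0
    i = 0
    while i < len(order):
        t = times[order[i]]
        d = 0
        tied = []
        while i < len(order) and times[order[i]] == t:   # group tied times
            d += events[order[i]]
            tied.append(order[i])
            i += 1
        cum += d / at_risk          # increment d_j / n_j at each distinct time
        for j in tied:
            h[j] = cum
        at_risk -= len(tied)
    return h

# Hypothetical follow-up times and event indicators (1 = event, 0 = censored)
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
print(nelson_aalen(times, events))
```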
that are included in the imputation model. Thus the imputation model should include every variable that both predicts
the incomplete variable and predicts whether the incomplete variable is missing. For example, in the UK700 data,
centreid predicts both the outcome sat96 and whether sat96 is missing, hence centreid should be included
in the imputation model for sat96.
Second, adding predictors of the incomplete variable to the imputation model can improve the imputations and hence
reduce the standard errors of the estimates for the analysis model.
For example, consider estimating the intervention effect on outcome sat96 in the UK700 trial. It is well known
that post-randomization variables in clinical trials (treatment adherence variables, potential mediating variables and
other outcome variables) should not be entered in the analysis model when estimating treatment effects. However, such
‘auxiliary’ variables are potentially useful in the imputation model. Some individuals with missing values of sat96
have observed values of other trial outcomes such as cprs96. We may therefore get better imputations for sat96 if
we add cprs96 to the imputation model. This may also make the MAR assumption more plausible.
In practice, one could include in the imputation model all variables that significantly predict the incomplete variable,
or whose association with the incomplete variable exceeds some threshold. One might also include any other variables
which significantly predict whether the incomplete variable is missing, on the grounds that bias may be avoided if the
included variables have a true association with the incomplete variable that fails to reach statistical significance, whereas
little loss of precision is likely if the included variables do not predict the incomplete variable. Variable selection could
be based on univariate or multivariate associations in complete cases [26]; an alternative might be to use a provisional
imputation scheme with a small number of imputations (even just one) and apply a model selection procedure.
Difficulties of MI when the imputation model contains more variables than the analysis model have been much
debated [30--32]. The focus in this debate has been on theoretical deficiencies of the Rubin’s rules variance; there is
no suggestion that bias is incurred. A pragmatic conclusion is that such deficiencies are not of practical importance [7,
Section 4.5.4].
As well as including all variables in the analysis model, the imputation model must also include them in an appropriate
way: in the correct functional form and with any interactions that are required. We explore this problem by considering
whether the association between baseline and 2-year satisfaction in the UK700 data is linear or curved. We use a
quadratic analysis model of sat96 on sat94 and its square, sat94sq (although a more appropriate choice might
be a fractional polynomial model [33]). We focus on individuals with the observed sat96, for reasons explained in
Section 8.1, and we consider how to impute the missing values of sat94 and sat94sq.
We ignore the fact that sat94sq is defined as the square of sat94, and impute under a multivariate Normal model
for sat94, sat94sq and sat96: that is, the imputation model for sat94 is a linear regression on sat94sq and
Table IV. UK700 data: estimated quadratic coefficients (×100) from fitting a quadratic regression of
sat96 on sat94 using different imputation procedures.

Imputation method        Coefficient   Standard error
Linear passive           −1.05         0.77
Improved passive         −1.36         0.77
Just another variable    −1.34         0.80
sat96, and the imputation model for sat94sq is a linear regression on sat94 and sat96. We call this approach ‘just
another variable’ (JAV), because sat94sq is regarded as just another variable whose deterministic relationship with
sat94 is ignored. The JAV approach makes a very bad approximation to the joint density of sat94 and sat94sq,
yet it can yield valid inferences for the analysis model [34].
6.4. Interactions
Imputing covariates whose interactions appear in the analysis model also requires care. We explore an interaction between
randomized group rand and a dummy variable, manual, for the father being of manual social class at the patient’s
birth. If we had complete data then the analysis would involve a linear regression of sat96 on sat94, rand, manual
and the product rand*manual.
A linear passive approach imputes the three incomplete variables in the usual way: linear regression of sat96 on
sat94, rand and manual; linear regression of sat94 on sat96, rand and manual; logistic regression of manual
on sat94, sat96 and rand. It then imputes the rand*manual interaction passively.
We can improve this passive imputation model. The interaction between rand and manual in the analysis model
means that the association between sat96 and manual may differ between randomized groups. Congeniality requires
this interaction to be allowed for in the imputation model, so when we impute manual, we must allow for an interaction
between rand and sat96: this gives an improved passive model. The linear passive approach, which ignores this
interaction in the imputation model, is likely to underestimate the interaction in the analysis model.
A simpler congenial imputation approach involves separate imputation within each randomized group. This is conve-
nient because randomized group is complete.
The JAV approach is based on a multivariate normal model for the four variables jointly with the product
rand*manual, which requires that each variable is imputed using a linear regression. In particular, rand*manual
is imputed using a linear regression on sat96, sat94, rand and manual, and manual is imputed using a linear
regression on sat96, sat94, rand and rand*manual.
For analyses, we used m = 500 imputations because method comparisons with m = 100 were inconclusive. Stata code is
given in Appendix A2. We take imputation separately by randomized group as the gold standard. The results (Table V)
show that the linear passive approach suffers bias towards zero, and this bias is reduced but not completely removed by
the improved passive approach. (Monte Carlo errors are about 0.02 for the coefficients and 0.01 for the standard errors.)
Differences between JAV and by-group imputation are compatible with Monte Carlo error.
6.5. Recommendations
We have illustrated the passive and JAV approaches to imputation in non-linear models. Neither is without problems.
The simplest approach is passive imputation using simple linear and logistic regression models and ignoring interactions
(and other subtleties) that are in the analysis model. The cost is bias of relevant terms in the analysis model (typically, but
not always, towards zero) and a loss of power to detect non-linearities and interactions. Improving the passive approach
relies on correct specification of the imputation models, which is hard to do; as a result, some bias is possible, as in
Table V. As the number of variables increases, it becomes harder to find and estimate correct passive imputation models.
Table V. UK700 data: estimated interaction coefficients from fitting a linear regression of sat96
on sat94, rand, manual and the interaction of rand and manual, using different imputation
procedures.

Imputation procedure              Estimate   Standard error
Separately by randomized group    −1.77      1.15
Linear passive                    −1.46      1.15
Improved passive                  −1.67      1.17
Just another variable             −1.80      1.14
The JAV approach, on the other hand, produces implausible imputed values for derived variables, which can undermine
the credibility of the procedure. Furthermore, congenial models such as the multivariate normal are usually strongly
mis-specified. Proof that the JAV procedure is unbiased [34] relies on the MCAR assumption. Our simulation studies
suggest that the model mis-specification inherent in the JAV approach may lead to bias when data are MAR but not
MCAR (see Appendix B).
Thus it is hard to give concrete advice about the choice of imputation model. The best choice is to find an imputation
model that is both congenial and a good representation of the data. If this cannot be done, it may be worth experimenting
with more than one imputation model and exploring the impact of different choices on the analysis results. If several
analyses are to be run on one imputed data set, then the imputation model should be congenial with all analyses. An
awkward problem is checking for omitted non-linear and interaction terms in the analysis model: we suggest a possible
strategy for this in Section 8.3.3.
So far we have not explained how we choose the number of imputations. Standard texts on multiple imputation suggest
that small numbers of imputed data sets (m = 3 or 5) are adequate. We first review this argument, then explain why we
usually prefer larger values of m, and suggest a rule of thumb.
from 0.09 to 0.13 for the coefficient, from 0.03 to 0.28 for the lower confidence limit, from 0.11 to 0.35 for the upper
confidence limit, and from 0.01 to 0.05 for the P-value.
Table VI. UK700 data: estimated coefficients of ocfabth from fitting a linear regression of cprs94
on ocfabth and afcarib, in five different runs with m = 5.

Run   Coefficient   95 per cent CI     P-value
1     −1.24         −2.45 to −0.03     0.05
2     −1.26         −2.37 to −0.16     0.03
3     −0.99         −2.10 to 0.11      0.08
4     −1.29         −2.54 to −0.03     0.05
5     −1.15         −2.40 to 0.10      0.07
are highly correlated with the outcome variable (for standard error reduction), or if they are correlated with the outcome
variable and with the probability that the variable is missing (for bias reduction).
Table VII. UK700 data: estimated coefficients and standard errors from fitting a linear regression of
sat96 on sat94 and rand. Monte Carlo errors are given in square brackets.

Analysis                  n     β̂_sat94          se(β̂_sat94)     β̂_rand            se(β̂_rand)
(i) Complete cases        288   0.297            0.056            −0.201            0.539
(ii) Multiple imputation
  m = 5                   500   0.269 [0.0200]   0.065 [0.0169]   −0.320 [0.0802]   0.450 [0.0324]
  m = 100                 500   0.285 [0.0158]   0.065 [0.0117]   −0.377 [0.0316]   0.497 [0.0038]
(iii) Multiple imputation restricted to individuals with observed sat96
  m = 5                   349   0.296 [0.0038]   0.057 [0.0018]   −0.442 [0.0283]   0.496 [0.0111]
  m = 100                 349   0.295 [0.0024]   0.056 [0.0008]   −0.377 [0.0070]   0.494 [0.0007]
Table VIII. Common statistics that can and cannot be combined using Rubin's rules (equations (1)
and (2)).

Statistics that can be combined without any transformation:
  Mean, proportion, regression coefficient, linear predictor, C-index, area under the ROC curve.
Statistics that may require sensible transformation before combination:
  Odds ratio, hazard ratio, baseline hazard, survival probability, standard deviation, correlation,
  proportion of variance explained, skewness, kurtosis.
Statistics that cannot be combined:
  P-value, likelihood ratio test statistic, model chi-squared statistic, goodness-of-fit test statistic.
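For the ratio statistics in the middle group of Table VIII, a sensible transformation is the log. A sketch of combining, say, odds ratios across imputations by applying Rubin's rules on the log scale and back-transforming (the numbers are illustrative):

```python
import math

def combine_log_scale(ratios, ses_log):
    """Combine ratio estimates (e.g. odds ratios) from m imputations by
    applying Rubin's rules on the log scale, then back-transforming."""
    m = len(ratios)
    logs = [math.log(r) for r in ratios]
    mean_log = sum(logs) / m
    w = sum(s ** 2 for s in ses_log) / m                      # within-imputation
    b = sum((l - mean_log) ** 2 for l in logs) / (m - 1)      # between-imputation
    var = w + (1 + 1 / m) * b
    return math.exp(mean_log), math.sqrt(var)   # combined ratio, SE on log scale

# Hypothetical odds ratios and log-scale standard errors from m = 5 imputations
or_hat, se_log = combine_log_scale([1.8, 2.1, 1.7, 2.0, 1.9], [0.25] * 5)
print(round(or_hat, 2))
```

A confidence interval is formed on the log scale and then exponentiated, so the interval for the ratio is asymmetric, as it should be.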
8.3.1. Hypothesis testing. All model building procedures require hypothesis testing. The Wald statistics for testing the
univariate or multivariate null hypothesis h = 0 were discussed in the introduction. Likelihood ratio test statistics can
be combined using an approximation proposed by Meng and Rubin [44]: this procedure may be convenient when the
number of parameters to be tested is large, but it has not been shown to be superior to the Wald test. Thus, although
likelihood ratio tests are often preferable with complete data, Wald tests should usually be used for hypothesis tests with
multiply imputed data. In Stata, a set of regression parameters can be tested using mim: testparm.
8.3.2. Variable selection. Classical variable selection is usually implemented through iterative procedures such as
forward, backwards and stepwise selection, and modification is required for application to multiply imputed data. Each
variable selection step involves fitting the model under consideration to all imputed data sets (MI Stage 2) and combining
estimates across imputed data sets (MI Stage 3). This multi-stage iterative process has type 1 error comparable to what
would be achieved if there were no missing data [43]. However, it may be impractical under some circumstances such as
(i) large data sets, (ii) large m, (iii) when multiple outcomes are of interest or (iv) when numerous variables or possible
interaction terms are to be assessed. A pragmatic alternative is to analyze the multiply imputed data sets as a single data
set of length m ×n, and to perform the variable selection procedure on this one data set. When assessing inclusion or
exclusion of a covariate x, each observation receives a weight of (1− f x )/m where f x is the fraction of missing data
for x [43]. This approach has been shown to be a good approximation. It may also be helpful in selecting functional
forms for continuous predictors, for example using multivariable fractional polynomials [33].
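The weighting rule (1 − f_x)/m on the stacked data set is simple to construct. A sketch with hypothetical values, checking that the weights sum to n(1 − f_x), the effective number of complete observations for x:

```python
def stacked_weights(miss_frac_x, m, n):
    """Weights for variable selection on the stacked data set of length m*n:
    each row gets weight (1 - f_x)/m, where f_x is the fraction of missing
    values of the covariate x under assessment [43]."""
    w = (1 - miss_frac_x) / m
    return [w] * (m * n)

# Hypothetical: 20 per cent of x missing, m = 5 imputations, n = 10 individuals
weights = stacked_weights(miss_frac_x=0.2, m=5, n=10)
print(len(weights), sum(weights))   # total weight n(1 - f_x) = 8
```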
8.3.3. Selection of non-linear and interaction terms. Selection of non-linear and interaction terms presents further
difficulties. For brevity, we use ‘non-linear terms’ to include interaction terms. As noted in Section 6, non-linear terms
can only be correctly assessed when they have been allowed for in the imputation model. However, when imputing data,
one does not necessarily know what non-linear terms will be required in a sequence of analyses, and allowing for all
possible non-linear terms might make the imputation model impractically large. One possible strategy is
(1) Produce a provisional and relatively simple imputation model, including non-linear terms of key scientific interest,
but omitting all other non-linear terms.
(2) Use the imputed data to build and check an analysis model, including investigating the need for non-linear terms.
Note that these model checks are conservative when relevant non-linear terms were omitted from the imputation model.
(3) If any convincing non-linear terms are found, then recreate the imputations including the non-linear terms computed
with ‘JAV’ or a careful passive approach.
(4) Use the revised imputed data set to estimate the parameters of the final analysis model.
An alternative [43] would be to repeat the model building (step 2) if new imputations are drawn at step 3.
In this section we illustrate a possible analysis of the UK700 data in Stata. Our aim is to estimate the intervention effect
on satisfaction with services, with adjustment for some baseline variables, and using other trial outcomes as auxiliary
variables. Some output is omitted without comment.
If the data were complete then our analysis model would be
. regress sat96 rand sat94 Icentre* cprs94
where Icentre* are dummy variables for centres 1–3 (the reference category being centre 0). However, first we
must impute the missing values. We use cprs96 as an auxiliary variable to improve our imputations, because it is
observed for 73 of the 151 individuals with missing sat96. Recall from Section 3 that sat94, sat96 and cprs96
are incomplete while rand, Icentre* and cprs94 are complete.
In the previous sections we have imputed sat94 in the same way as other variables, ignoring the fact that it is a
baseline variable in a randomized trial. However, general statistical methods for missing data are not always appropriate
for baseline variables in randomized trials, because they may not respect the independence of baseline variables from
randomization. It is best to impute missing baselines deterministically using only other baseline data [45]. We do this
[Figure 3 shows five panels, one per imputation number, each plotting residuals (vertical axis, −10 to 20) against fitted values (horizontal axis, 10 to 25).]
Figure 3. UK700 data: plot of residuals against fitted values for analysis (ii) of Table VII.
   #missing |
     values |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        348       69.60       69.60
          1 |         74       14.80       84.40
          2 |         78       15.60      100.00
------------+-----------------------------------
      Total |        500      100.00
This output first shows the numbers of missing values, and then the models used to impute the two incomplete variables.
The imputed data set that is now loaded in memory contains both the original data (identified by _mj==0) and all the
imputed data sets (identified by _mj==1, _mj==2, etc).
We compare the observed and imputed data (Figure 4) using box plots over the imputations:
Gross discrepancies between the distributions of observed and imputed data would suggest errors in the imputation
procedure, although some differences are to be expected if the data are not MCAR [46]. Figure 4 shows that the
distribution of the imputed data is broadly similar to that of the observed data, but a few imputed values of sat96 lie
below the permitted range of 9–36. We could avoid this by using a transformation such as log {sat96/(45−sat96)}.
A graph like Figure 1 might also be useful here, and similar checks should be done for imputed values of cprs96.
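One generic way to respect such bounds is to impute on a logit-type scale and back-transform. The sketch below is illustrative Python with bounds taken from the permitted range above (the exact transformation quoted for sat96 differs slightly); it maps a bounded score onto the whole real line before imputation and back afterwards.

```python
import numpy as np

A, B = 9.0, 36.0                # illustrative bounds, as for sat96

def to_unbounded(y):
    """Map a score in (A, B) onto the whole real line before imputation."""
    return np.log((y - A) / (B - y))

def to_bounded(z):
    """Back-transform imputed values so they always respect the bounds."""
    return (A + B * np.exp(z)) / (1.0 + np.exp(z))

scores = np.array([10.0, 20.0, 35.0])
round_trip = to_bounded(to_unbounded(scores))   # recovers the original scores
```

Any value imputed on the transformed scale, however extreme, maps back into (A, B).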
We fit the analysis model to the imputations displayed in Figure 4, including observations with missing outcome to
benefit from the inclusion of cprs96 in the imputation model (see Section 8.1).
---------------------------------------------------------------------------------
   sat96 |    Coef.   Std. Err.      t    P>|t|    [95% Conf. Int.]          FMI
---------+-----------------------------------------------------------------------
    rand |  -.38696    .496623    -0.78   0.437   -1.36852    .594601       0.339
   sat94 |  .250918    .056591     4.43   0.000    .139408    .362428       0.229
Icentre1 | -1.02357    .721603    -1.42   0.158   -2.44852    .401368       0.310
Icentre2 |  .421178    .687126     0.61   0.541   -.937743     1.7801       0.357
Icentre3 |  .487475    .752643     0.65   0.518   -1.00255      1.977       0.386
  cprs94 |  .043834      .0199     2.20   0.029    .004528     .08314       0.318
   _cons |  11.8665    1.13555    10.45   0.000    9.62614    14.1068       0.279
---------------------------------------------------------------------------------
The intervention improves user satisfaction with services by 0.4 units (95 per cent CI from −0.6 to +1.4 units). As the
outcome standard deviation is 4.7 (Table IV), this would appear to exclude any clinically important effect.
Finally, we check that the Monte Carlo error is acceptable.
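mim, mcerror (used next) reports jackknife Monte Carlo standard errors; the underlying delete-one-imputation jackknife can be sketched generically (illustrative Python, not mim's exact implementation, and the function name is ours).

```python
import numpy as np

def mc_error_jackknife(estimates):
    """Jackknife Monte Carlo standard error of the MI point estimate:
    delete each imputation in turn, recompute the combined (mean)
    estimate, and take the jackknife standard error of those
    leave-one-out means."""
    q = np.asarray(estimates, dtype=float)
    m = len(q)
    loo = (q.sum() - q) / (m - 1)        # leave-one-out pooled estimates
    return float(np.sqrt((m - 1) / m * ((loo - loo.mean()) ** 2).sum()))
```

For the pooled point estimate this reduces to s/√m, where s is the between-imputation standard deviation, so the Monte Carlo error shrinks like 1/√m as m grows.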
. mim, mcerror
[Values displayed beneath estimates are Monte Carlo jackknife standard errors]
---------------------------------------------------------------------------------
   sat96 |    Coef.   Std. Err.      t    P>|t|    [95% Conf. Int.]          FMI
---------+-----------------------------------------------------------------------
    rand |  -.38696    .496623    -0.78   0.437   -1.36852    .594601       0.339
         |   .05154    .020382     0.11   .0662    .062639     .07194       0.056
[output omitted]
---------------------------------------------------------------------------------
The Monte Carlo errors are reasonably small: those for the coefficient and the standard error are less than 10 per cent
of the estimated standard error, as proposed in Section 7. In particular, it is clear that the result is almost sure to be
[Figure 4 shows box plots of sat96 (vertical axis, 0 to 40) for the observed data (0) and each of the 30 imputed data sets (1–30).]
Figure 4. UK700 data: comparison of observed (0) and imputed (1–30) data for sat96.
In this section we consider the main methodological limitation of the MICE procedure, that it lacks a clear theoretical
rationale, and in particular that the conditional regression models may be incompatible. We then discuss several pitfalls
in the practical use of MICE.
included the outcome in the imputation model only via the log of the survival time, omitting the event indicator
(Section 5.1). Because most of their data were censored, the event indicator conveys most of the outcome information.
This is likely to have biased the cholesterol coefficient towards zero by about 70 per cent (the fraction of missing data).
The authors’ revised analysis correctly included the censoring indicator as well as the log of the survival time in the
imputation model. A second possible problem with the QRISK analysis was that the cholesterol ratio was imputed by
imputing the two components separately. Some imputed values of the denominator may have been close to zero, leading
to influentially large imputed values of the ratio. This would cause the log hazard ratio to be further biased towards
zero, with an associated reduction in the standard error. By excluding extreme values of the imputed ratios, the revised
analysis avoided this problem.
10.5. Non-convergence
As MICE is an iterative procedure, it is important that convergence is achieved. This may be checked by computing, at
each cycle, the means of imputed values and/or the values of regression coefficients, and seeing if they are stable. For the
example in Section 9, these values appeared stable from the very first cycle. We have never found 10 cycles inadequate,
but larger numbers of cycles might in principle be required when incomplete variables are very strongly associated.
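A minimal sketch of this kind of monitoring, in illustrative Python rather than Stata: two correlated incomplete variables are imputed by chained stochastic regressions, and the mean of the imputed values of one variable is recorded at each cycle so that its stability can be inspected.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
u = rng.normal(size=n)
a = u + 0.5 * rng.normal(size=n)        # two correlated, incomplete variables
b = u + 0.5 * rng.normal(size=n)
miss_a = rng.random(n) < 0.3
miss_b = rng.random(n) < 0.3

# Initialise missing entries by sampling from the observed values.
a_imp, b_imp = a.copy(), b.copy()
a_imp[miss_a] = rng.choice(a[~miss_a], size=miss_a.sum())
b_imp[miss_b] = rng.choice(b[~miss_b], size=miss_b.sum())

chain_means = []                        # mean of imputed a, cycle by cycle
for cycle in range(10):
    # One MICE cycle: impute each variable from the other by stochastic
    # linear regression (predict plus residual noise).
    for tgt, other, m_ in ((a_imp, b_imp, miss_a), (b_imp, a_imp, miss_b)):
        X = np.column_stack([np.ones(n), other])
        coef, *_ = np.linalg.lstsq(X[~m_], tgt[~m_], rcond=None)
        sd = np.std(tgt[~m_] - X[~m_] @ coef)
        tgt[m_] = X[m_] @ coef + rng.normal(scale=sd, size=m_.sum())
    chain_means.append(a_imp[miss_a].mean())
```

Plotting chain_means (and the regression coefficients) against cycle number shows whether the chain has settled.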
A practical difficulty with MI that afflicts many reasonable attempts to impute missing values occurs when the data
set contains many (e.g. dozens of) variables; this is not necessarily an issue specific to MICE. Often, it is unknown in
advance what the analysis model should look like—how many and which variables are needed, what functional form is
required for continuous variables, even what type of model is appropriate. The advice in Sections 5 and 6 may lead to
very large and complex imputation models. In principle, a rich imputation structure is desirable, but in practice, fitting
such a complex set of imputation models may defeat the software or lead to model instability. For example, we have
found it particularly challenging to work with structures including several nominal categorical variables imputed by
multinomial logistic regression; convergence of such large models is an issue, tending to make the imputation process
unacceptably slow. It is hard to propose universal solutions, but careful exploration of the data may suggest smaller
imputation models that are unlikely to lead to substantial bias. In general, one should try to simplify the imputation
structure without damaging it; for example, omit variables that seem on exploratory investigation unlikely to be required
in ‘reasonable’ analysis models, but avoid omitting variables that are in the analysis model or variables that clearly
contribute towards satisfying the MAR assumption. Further practical experience and research are needed to develop useful
rules of thumb.
11. Discussion
Multiple imputation is an increasingly popular method of analysis. However, like any powerful statistical technique, it
must be used with caution. We have highlighted several key areas, including choice of imputation models, handling
categorical and skewed variables, identifying what quantities can and cannot be combined using Rubin’s rules and
avoiding pitfalls. We hope to have encouraged readers to use MI widely, but with understanding and care.
We are often asked what fraction of data can be imputed. The theoretical answer is that almost any fraction of data
can be validly imputed, provided that the imputation is done correctly and the MAR assumption is correct, but that any
imperfections in the imputation procedure and any departures from MAR will have a proportionately larger impact when
larger fractions of data are imputed. In the QRISK study, the large fraction of missing data (70 per cent) amplified the
consequences of imperfections in the imputation procedure. It would seem wise to take special care if more than 30–50
per cent missing data are to be imputed.
It is important to report MI analyses in a way that allows readers to assess the adequacy of the methods used.
Sterne et al. suggest reporting guidelines that include careful comparison of MI results with the results of complete-case
analysis [46]: although well-implemented MI results should be superior to complete-case results, it should be possible
to understand the differences between the two analyses in terms of their different assumptions about the missing data
and the usually greater precision of MI.
We now consider various alternatives to the methods we have described.
MI is not the only way to handle missing data. Likelihood-based methods [55] and inverse probability weighting [56]
are often good alternatives. Even complete-case analysis may be appropriate in some settings [57, 58], for example in a
clinical trial with missing data only in the outcome, although sensitivity analyses would also be required to explore the
impact of departures from MAR.
MICE is not the only way to perform MI. The NORM algorithm [59] is based on MCMC sampling and so has a firmer
theoretical foundation, but it assumes that the data are multivariate Normal. Results from NORM agree asymptotically with
those from MICE when all imputation models are linear [11], but NORM may perform poorly with non-Normal or categorical
data [60]. A simpler procedure is available for monotone missing data: simply impute variables in increasing order of
missingness, using appropriate regressions on more complete variables only [2, Chapter 5.4].
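This monotone procedure is simple enough to sketch directly (illustrative Python; the variable names and the stochastic_impute helper are ours): each incomplete variable is imputed once, in increasing order of missingness, by stochastic regression on the more complete variables only, with no cycling required.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400
x1 = rng.normal(size=n)                       # complete
x2 = x1 + 0.5 * rng.normal(size=n)            # missing for 25% of cases
x3 = x2 + 0.5 * rng.normal(size=n)            # missing for 50% of cases
m2 = np.arange(n) >= 300
m3 = np.arange(n) >= 200                      # monotone: x2 missing implies x3 missing

def stochastic_impute(target, predictors, miss):
    """Fit target on predictors among cases with target observed, then
    impute the missing cases by prediction plus residual noise."""
    X = np.column_stack([np.ones(len(target))] + list(predictors))
    coef, *_ = np.linalg.lstsq(X[~miss], target[~miss], rcond=None)
    sd = np.std(target[~miss] - X[~miss] @ coef)
    out = target.copy()
    out[miss] = X[miss] @ coef + rng.normal(scale=sd, size=miss.sum())
    return out

# Impute in increasing order of missingness, each variable regressed on
# the more complete (already completed) variables only.
x2_imp = stochastic_impute(x2, [x1], m2)
x3_imp = stochastic_impute(x3, [x1, x2_imp], m3)
```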
Finally, Stata is not the only software package that implements MI. In R, there are user-contributed libraries for MICE
[9] and NORM [59]. In SAS, PROC MI implements the NORM algorithm [61], and IVEware is a user-contributed
implementation of the MICE algorithm [11, 62]. In SPSS, the missing values module implements the MICE algorithm [63].
We have focussed on unstructured data sets. Longitudinal data can be straightforwardly imputed by regarding the
different time points as different variables. Imputing clustered (multi-level) data is harder, since the imputations should
respect the multi-level structure. If clusters are large, then it may be reasonable to treat cluster as a fixed effect in the
imputation model. If clusters are very small—say, all of size 1 or 2—then it may be possible to format the data with one
record per cluster and different variables for the first and second cluster members, and then to use conditional imputation
so that the second variable is only imputed when the cluster contains more than one individual. If clustering is not the
main focus of analysis, then it may be adequate in practice to ignore the clustering in the imputation model and only
allow for it in the analysis model: this is a topic for future research. Finally, a set of MLwiN macros for imputing
multi-level data are available from www.missingdata.org.uk.
Other topics for future research are exploring the implications of incompatible conditional models in MICE, evaluating
the performance of congenial but mis-specified imputation models, exploring how large the imputation model(s) can
safely be, developing methods valid under specified MNAR mechanisms, exploring the performance of PMM, and
exploring multivariable model building with fractional polynomials or splines.
From a practical perspective, it is worth considering at what stage of the data analysis process missing data should be
imputed. Early work assumed that imputation would be done just once on a data set and that the imputed data would be
released to all users. This arrangement is appealing for large surveys, when the imputer may have access to confidential
information that is helpful for imputing but which cannot be publicly released. However, it requires the imputer to allow
for any association that might be of interest to future analysts, a virtually impossible task. We prefer the arrangement
of imputing data once for each project to be conducted on a data set, since it should be possible to produce a moderate
list of variables and interactions that might be considered and include them all in the imputation model. It may then be
desirable to re-impute the data in a way that is congenial with key analysis models. A third alternative is imputing data
once for each individual analysis: this makes it easier to ensure that the imputation model is appropriate, but typically
involves more imputation effort than most analysts are willing to tolerate.
One data set of size 5000 was generated from the model x ∼ N(0, 1), y ∼ N(x + x², 1). Values of x were deleted either
with probability 0.3 (an MCAR mechanism) or if y > 3 (an extreme MAR mechanism). Missing values of x were imputed
using the JAV approach: that is, the variable x2 was computed as x², and MICE was performed with m = 10, imputing x
using linear regression on y and x2 and imputing x2 using a linear regression on y and x. Estimated values of β₂ from the
analysis model y ∼ N(β₀ + β₁x + β₂x², σ²) are given in the table.
The JAV procedure appears unbiased under MCAR but is biased under this form of MAR.
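The simulation can be reproduced in outline. The sketch below is illustrative Python using improper stochastic regression imputation (regression parameters are not redrawn, unlike proper MI), so it mirrors the design rather than replicating the paper's exact numbers; since x and x2 are missing together, each chained fit uses the fully observed cases.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m, cycles = 5000, 10, 5
x = rng.normal(size=n)
y = x + x**2 + rng.normal(size=n)

def jav_beta2(miss):
    """Treat x2 = x**2 as 'just another variable': impute x and x2 by
    chained stochastic linear regressions on y and each other, then
    return the pooled estimate of the x^2 coefficient in the quadratic
    analysis model."""
    x2 = x**2
    betas = []
    for _ in range(m):
        xi, x2i = x.copy(), x2.copy()
        xi[miss] = rng.choice(x[~miss], size=miss.sum())    # chain start
        x2i[miss] = rng.choice(x2[~miss], size=miss.sum())
        for _ in range(cycles):
            for tgt, other in ((xi, x2i), (x2i, xi)):
                X = np.column_stack([np.ones(n), y, other])
                coef, *_ = np.linalg.lstsq(X[~miss], tgt[~miss], rcond=None)
                sd = np.std(tgt[~miss] - X[~miss] @ coef)
                tgt[miss] = X[miss] @ coef + rng.normal(scale=sd,
                                                        size=miss.sum())
        A = np.column_stack([np.ones(n), xi, x2i])
        betas.append(np.linalg.lstsq(A, y, rcond=None)[0][2])
    return float(np.mean(betas))

b2_mcar = jav_beta2(rng.random(n) < 0.3)   # MCAR deletion
b2_mar = jav_beta2(y > 3)                  # extreme MAR deletion
```

Under MCAR the pooled β₂ lies close to its true value of 1; under the MAR mechanism it is typically visibly further from 1.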
Acknowledgements
The authors thank the UK700 investigators for permission to use a subset of their data; Shaun Seaman for very helpful comments
and discussions; and all the audiences on our courses for asking good questions.
Ian White and Patrick Royston were supported by UK Medical Research Council grants U.1052.00.006 and
U.1228.06.01.00002.01.
References
1. Little RJA, Rubin DB. Statistical Analysis with Missing Data (2nd edn). Wiley: Hoboken, NJ, 2002.
2. Rubin DB. Multiple Imputation for Nonresponse in Surveys. Wiley: New York, 1987.
3. Harel O, Zhou XH. Multiple imputation: review of theory, implementation and software. Statistics in Medicine 2007; 26:3057--3077.
4. Horton NJ, Kleinman KP. Much ado about nothing: a comparison of missing data methods and software to fit incomplete data regression
models. The American Statistician 2007; 61:79--90.
5. Carpenter JR, Kenward MG, White IR. Sensitivity analysis after multiple imputation under missing at random: a weighting approach.
Statistical Methods in Medical Research 2007; 16(3):259--275.
6. Robins J, Wang N. Inference for imputation estimators. Biometrika 2000; 87(1):113--124.
7. Schafer JL. Analysis of Incomplete Multivariate Data. Chapman & Hall: London, 1997.
8. Barnard J, Rubin DB. Small-sample degrees of freedom with multiple imputation. Biometrika 1999; 86:948--955.
9. van Buuren S, Oudshoorn CGM. Multivariate Imputation by Chained Equations: MICE V1.0 User’s manual. TNO Report PG/VGZ/00.038.
TNO Preventie en Gezondheid: Leiden, 2000. Available from: http://www.multiple-imputation.com/.
10. van Buuren S. Multiple imputation of discrete and continuous data by fully conditional specification. Statistical Methods in Medical
Research 2007; 16(3):219--242.
11. Raghunathan TE, Lepkowski JM, Hoewyk JV, Solenberger P. A multivariate technique for multiply imputing missing values using a
sequence of regression models. Survey Methodology 2001; 27:85--95.
12. StataCorp. Stata Statistical Software: Release 11. Stata Press: College Station, TX, 2009.
13. Royston P. Multiple imputation of missing values. Stata Journal 2004; 4:227--241.
14. Royston P. Multiple imputation of missing values: update. Stata Journal 2005; 5:188--201.
15. Royston P. Multiple imputation of missing values: update. Stata Journal 2005; 5:527--536.
16. Royston P. Multiple imputation of missing values: Further update of ice, with an emphasis on interval censoring. Stata Journal 2007;
7:445--464.
17. Royston P. Multiple imputation of missing values: further update of ice, with an emphasis on categorical variables. Stata Journal 2009;
9:466--477.
18. Carlin JB, Galati JC, Royston P. A new framework for managing and analysing multiply imputed data sets in Stata. Stata Journal 2008;
8:49--67.
19. Burns T, Creed F, Fahy T, Thompson S, Tyrer P, White I, for the UK700 trial group. Intensive versus standard case management for
severe psychotic illness: a randomised trial. Lancet 1999; 353:2185--2189.
20. Johnson NL. Systems of frequency curves generated by methods of translation. Biometrika 1949; 36:149--176.
21. Wright EM, Royston P. Age-specific reference intervals (‘normal ranges’). Stata Technical Bulletin 1996; 34:24--34.
22. Little RJA. Missing-data adjustments in large surveys. Journal of Business and Economic Statistics 1988; 6:287--296.
23. Schenker N, Taylor JMG. Partially parametric techniques for multiple imputation. Computational Statistics and Data Analysis 1996;
22:425--446.
24. Burton A, Billingham LJ, Bryan S. Cost-effectiveness in clinical trials: using multiple imputation to deal with incomplete cost data. Clinical
Trials 2007; 4:154--161.
25. Moons KG, Donders RA, Stijnen T, Harrell FE Jr. Using the outcome for imputation of missing predictor values was preferred. Journal
of Clinical Epidemiology 2006; 59:1092--1101.
26. van Buuren S, Boshuizen HC, Knook DL. Multiple imputation of missing blood pressure covariates in survival analysis. Statistics in
Medicine 1999; 18:681--694.
27. Clark TG, Altman DG. Developing a prognostic model in the presence of missing data: an ovarian cancer case study. Journal of Clinical
Epidemiology 2003; 56:28--37.
28. Barzi F, Woodward M. Imputations of missing values in practice: Results from imputations of serum cholesterol in 28 cohort studies.
American Journal of Epidemiology 2004; 160:34--45.
29. White IR, Royston P. Imputing missing covariate values for the Cox model. Statistics in Medicine 2009; 28:1982--1998.
30. Fay RE. When are inferences from multiple imputation valid? Proceedings of the Survey Research Methods Section. American Statistical
Association, Alexandria, VA, 1992; 227--232.
31. Meng XL. Multiple-imputation inferences with uncongenial sources of input. Statistical Science 1994; 9:538--558.
32. Rubin DB. Multiple imputation after 18+ years. Journal of the American Statistical Association 1996; 91:473--489.
33. Royston P, Sauerbrei W. Multivariable Model-building: A Pragmatic Approach to Regression Analysis based on Fractional Polynomials
for Modelling Continuous Variables. Wiley: Chichester, 2008.
34. Von Hippel PT. How to impute squares, interactions, and other transformed variables. Sociological Methodology 2009; 39:265--291.
35. Graham JW, Olchowski AE, Gilreath TD. How many imputations are really needed? Some practical clarifications of multiple imputation
theory. Prevention Science 2007; 8:206--213.
36. Horton NJ, Lipsitz SR. Multiple imputation in practice: comparison of software packages for regression models with missing variables.
American Statistician 2001; 55:244--254.
37. Royston P, Carlin JB, White IR. Multiple imputation of missing values: new features for mim. Stata Journal 2009; 9:252--264.
38. Bodner TE. What improves with increased missing data imputations? Structural Equation Modeling: A Multidisciplinary Journal 2008;
15:651--675.
39. Wood A, White I, Hillsdon M, Carpenter J. Comparison of imputation and modelling methods in the analysis of a physical activity trial
with missing outcomes. International Journal of Epidemiology 2005; 34:89--99.
40. Little RJA. Regression with missing X’s: a review. Journal of the American Statistical Association 1992; 87:1227--1237.
41. Von Hippel PT. Regression with missing Ys: an improved strategy for analyzing multiply imputed data. Sociological Methodology 2007;
37(1):83--117.
42. Molenberghs G, Kenward MG. Missing Data in Clinical Studies. Wiley: Chichester, 2007.
43. Wood AM, White IR, Royston P. How should variable selection be performed with multiply imputed data. Statistics in Medicine 2008;
27:3227--3246.
44. Meng XL, Rubin DB. Performing likelihood ratio tests with multiply-imputed data sets. Biometrika 1992; 79:103--111.
45. White IR, Thompson SG. Adjusting for partially missing baseline measurements in randomised trials. Statistics in Medicine 2005;
24:993--1007.
46. Sterne JAC, White IR, Carlin JB, Spratt M, Royston P, Kenward MG, Wood AM, Carpenter JR. Multiple imputation for missing data in
epidemiological and clinical research: potential and pitfalls. British Medical Journal 2009; 338:b2393.
47. Gilks W, Richardson S, Spiegelhalter D. Markov Chain Monte Carlo in Practice. Chapman & Hall/CRC: London, Boca Raton, FL, 1996.
48. Kenward MG, Carpenter J. Multiple imputation: current perspectives. Statistical Methods in Medical Research 2007; 16(3):199--218.
49. van Buuren S, Brand JPL, Groothuis-Oudshoorn CGM, Rubin DB. Fully conditional specification in multivariate imputation. Journal of
Statistical Computation and Simulation 2006; 76:1049--1064.
50. Hippisley-Cox J, Coupland C, Vinogradova Y, Robson J, May M, Brindle P. Derivation and validation of QRISK, a new cardiovascular
disease risk score for the United Kingdom: prospective open cohort study. British Medical Journal 2007; 335:136.
51. Hippisley-Cox J, Vinogradova Y, Robson J, May M, Brindle P. QRISK: authors’ response. British Medical Journal 2007. Available from:
http://www.bmj.com/cgi/eletters/bmj.39261.471806.55v1#174181.
52. Hippisley-Cox J, Coupland C, Vinogradova Y, Robson J, Brindle P. QRISK cardiovascular disease risk prediction algorithm—comparison
of the revised and the original analyses technical supplement, 2007. Available from: http://www.qresearch.org/Public_Documents/
QRISK1%20Technical%20Supplement.pdf.
53. White IR, Daniel R, Royston P. Avoiding bias due to perfect prediction in multiple imputation of incomplete categorical variables.
Computational Statistics and Data Analysis 2010; 54:2267--2275.
54. Collins LM, Schafer JL, Kam CM. A comparison of inclusive and restrictive strategies in modern missing data procedures. Psychological
Methods 2001; 6:330--351.
55. Carpenter JR, Kenward MG. Missing Data in Clinical Trials—A Practical Guide. National Institute for Health Research, Publication
RM03/JH17/MK: Birmingham, 2008. Available from: http://www.pcpoh.bham.ac.uk/publichealth/methodology/projects/RM03_JH17_MK.
shtml.
56. Hogan JW, Lancaster T. Instrumental variables and inverse probability weighting for causal inference from longitudinal observational
studies. Statistical Methods in Medical Research 2004; 13:17--48.
57. Allison PD. Multiple imputation for missing data: a cautionary tale. Sociological Methods and Research 2000; 28:301--309.
58. White IR, Carlin JB. Is multiple imputation always better than complete-case analysis for handling missing covariate values? Statistics in
Medicine, DOI: 10.1002/sim.3944.
59. Schafer JL. Software for multiple imputation, 2008. Available from: http://www.stat.psu.edu/∼jls/misoftwa.html.
60. Bernaards CA, Belin TR, Schafer JL. Robustness of a multivariate normal approximation for imputation of incomplete binary data. Statistics
in Medicine 2007; 26:1368--1382.
61. SAS Institute Inc. SAS/STAT 9.1 User's Guide. SAS Institute Inc.: Cary, NC, 2004, Chapter 46.
62. Raghunathan TE, Solenberger PW, Hoewyk JV. IVEware Imputation and Variance Estimation Software, 2007. Available from:
http://www.isr.umich.edu/src/smp/ive.
63. SPSS Inc. Build Better Models When You Fill in the Blanks. Available from: http://www.spss.com/media/collateral/statistics/missing-values.pdf
(accessed 20 April 2010).