Systematic Statistical Analysis of Microbial Data From Dilution Series

This document proposes a new statistical model for analyzing microbial dilution series data. Dilution series experiments involve diluting microbial samples multiple times and counting colony forming units (CFUs) at each dilution to estimate the original microbial abundance. The document outlines common challenges with existing models and proposes a novel binomial model that accounts for the experimental design and hierarchy of repetitions and inter-laboratory studies. A Bayesian approach is used to analyze real dilution series data sets.
Systematic statistical analysis of microbial data from dilution series


J Andrés Christen∗† Albert E. Parker‡
arXiv:2003.09039v1 [stat.AP] 19 Mar 2020

12MAR2020

Abstract
In microbial studies, samples are often treated under different experimental conditions and then
tested for microbial survival. A technique, dating back to the 1880’s, consists of diluting the
samples several times and incubating each dilution to verify the existence of microbial Colony
Forming Units or CFU’s, seen by the naked eye. The main problem in the dilution series data
analysis is the uncertainty quantification of the simple point estimate of the original number
of CFU’s in the sample (i.e., at dilution zero). Common approaches such as log-normal or
Poisson models do not seem to handle extreme cases with low or high counts well, among
other issues. We build a novel binomial model, based on the actual design of the experimental
procedure including the dilution series. For repetitions we construct a hierarchical model for
experimental results from a single lab and in turn a higher hierarchy for inter-lab analyses.
Results seem promising, with a systematic treatment of all data cases, including zeros, censored
data, repetitions, and intra- and inter-laboratory studies. Using a Bayesian approach, a robust and
efficient MCMC method is used to analyze several real data sets.

Keywords: Dilution experiments; binomial likelihood; Bayesian inference; Hierarchical models; MCMC.


∗ J Andrés Christen, Centro de Investigación en Matemáticas (CIMAT-CONACYT), Jalisco S/N, Valenciana, Guanajuato, GTO, 36023, MEXICO, email: jac at cimat.mx.
† Corresponding author.
‡ Center for Biofilm Engineering (CBE) and Department of Mathematical Sciences, Montana State University, Bozeman, MT, USA, email: parker at math.montana.edu.
1 Introduction
Dilution experiments are an important tool to detect the presence of microbes, even in very
low concentrations, relying on basic microbiology techniques and relatively simple laboratory
equipment. The microbes are first sampled from a natural environment, such as soil or
drinking water, or from an engineered bench-top reactor system where they can be grown
planktonically or in a biofilm before potentially being exposed to some treatment. Whatever
the case, the sample containing microbes is transferred into a volume V0 of liquid in tube
0, most commonly buffered water (Rice et al., 2017) but any appropriate liquid medium
may be used. It is then of interest to estimate the number of microbes in this tube 0, i.e.,
the abundance of microbes in the original sample. Analyzing the original sample directly
may be possible using specialized equipment and more complex lab processes (e.g., a confocal
scanning laser microscope; Parker et al., 2018a; Pitts and Stewart, 2008). Instead, subvolumes
from tube 0 may be “spread-plated” or “drop-plated” (Herigstad and Hamilton, 2001) onto
growth media in a Petri dish to let bacteria grow until distinguishable by the naked eye (see
Figure 1(c)). When plating directly from samples that contain a high density of microbes,
it is not possible to identify and count individual colonies, in which case it is necessary to
dilute the original volume until only a few colonies can be counted after plating in a Petri
dish. From these counts at some dilution(s), the microbial abundance is inferred in tube 0
and hence in the original sample. The process is called a dilution series and is described as
follows.
To begin the dilution series, from the volume V0 in tube 0 a subvolume V is taken which
is diluted at a factor α, typically α = 10, to form a new volume αV in a new tube at dilution
j = 1. This process is repeated for j = 1, ..., J − 1: a subvolume V is taken from the volume
αV in tube j, and then diluted by a factor α to form a new volume αV in a new tube at
dilution j + 1. A smaller volume u is then removed from each tube and then plated onto
growth media in a Petri dish, and kept at optimal environment conditions (e.g., temperature
and pH) to allow any viable microbes to grow and reproduce. Individual microbes or clumps
of microbes will start forming colonies in the Petri dish, until, at a high enough dilution,
they are distinguishable to the naked eye as dots called Colony Forming Units (CFUs).
Estimating microbial abundances from colonies dates at least as far back as the seminal
work of Robert Koch in the 1880’s (Prescott et al., 1996, p. 9). These estimates are
expressed as CFUs rather than as the number of microbes because of a number of known
limitations, the most obvious being the questionable assumption that each colony arises from
an individual cell (Prescott et al., 1996, p. 119; see also Cundell, 2015, for a more recent
discussion). Still, especially for a single microbial species isolated from a consortium in an
environmental sample, or for a mono-culture grown in the laboratory, the CFU remains a
useful quantitative measure for estimating microbial abundances. Dilution series for CFU
counting are performed routinely in many government, academic and private laboratories for
experimentation as well as for testing and public standard compliance (see for example Ben-
David and Davidson, 2014; FDA, 2018, or a “dilution series” search in fda.gov). Indeed,
CFU counting is a required international metric for assessing the efficacy of antimicrobial
treatments in North America and Europe (Parker et al., 2018b).
In a dilution series, for some lower dilutions, too many CFUs may cluster and be
impossible to count; these are reported as “too numerous to count” (TNTC). For higher
dilutions there will eventually be no CFUs (no microbial activity). Commonly, one dilution
is then selected, namely dilution $j$, $0 \le j \le J - 1$, for CFU counting, having a minimum
number of distinguishable CFUs per plate or drop, and referred to as the lowest countable
dilution. From this, the microbial abundance in the original sample is to be estimated. The
crudest estimate is (number of CFUs) $\times\, \alpha_0 \times \alpha^j \times \alpha_p$, where $\alpha_0 = V_0/V$ and $\alpha_p = V/u$
is the ratio of the dilution tube volume $V$ to the volume plated, or drop volume, $u$. The
usual formula used by practitioners is (number of CFUs) $\times\, V_0/u \times \alpha^j$, which is exactly
the same quantity, since $\alpha_0 \alpha_p = V_0/u$. See Figure 1 for an illustration of the process, CFU counting and basic
data analysis.
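As a rough illustration of this arithmetic, the sketch below evaluates the crude estimate for the drop counts reported in the Figure 1 caption. It assumes that "(number of CFUs)" is the mean count per drop at the selected dilution; the function name `crude_abundance` and its default arguments are illustrative choices, not part of the original analysis.

```python
from statistics import mean

def crude_abundance(counts, j, alpha=10.0, alpha_0=1.0, alpha_p=1000.0):
    """Crude estimate of the CFUs in tube 0 from the drop counts at dilution j.

    Assumes "(number of CFUs)" means the mean count per drop.
    """
    return mean(counts) * alpha_0 * alpha**j * alpha_p

# Drop counts at dilution j = 3, taken from the Figure 1 caption.
counts = [13, 10, 6, 9, 16]
estimate = crude_abundance(counts, j=3)
print(f"{estimate:.3g}")  # 1.08e+07, i.e. 10.8 x 10^6
```

The same function covers the case $\alpha_0 \neq 1$ simply by passing a different `alpha_0`.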
Instead of taking solely the lowest countable dilution, a possible variant is to take a
weighted average of the CFUs across multiple dilutions (Hedges, 2002; Maturin and Peeler, 1998;
Niemela, 1983; Niemi and Niemela, 2001; Parkhurst and Stern, 1998), which is motivated by
the Horvitz-Thompson estimator, popular in field ecology (Horvitz and Thompson, 1952).
Hamilton and Parker (2010) argue that the added information is minimal for common
dilution experimental designs. Our investigation (presented here) leads to this same conclusion,
which supports the microbiologist’s conventional practice of using data from only the first
countable dilution to estimate the microbial abundance in the original sample.
In most situations the abundance of microbes in the original sample spans several orders of
magnitude and therefore it is common to estimate a log abundance, that is, the log10-transform
of the CFU count in dilution 0. The estimated mean log abundance from statistical analyses
can easily be normalized to a mean log density per unit of the volume or surface area of the
original specimen, $S_c$, by simply subtracting $\log_{10}(S_c)$.

Figure 1: (a) Treated samples (here there are K = 3 repetitions) are transported and placed
in dilution 0 tubes (top right) and from these subsequent dilutions are formed (bottom left)
by adding buffered water and then homogenizing with an orbital shaker (center). (b) From
each dilution a volume is taken and drops are plated using an electronic micropipette. (c)
Example of drops plated in Petri dish from dilutions 0, 1, 2 and 3. Dilution 3 (i.e., $j = 3$) is
selected for CFU counting, others are “too numerous to count” (TNTC). Counts for dilution
3 are 13, 10, 6, 9 and 16. As in the example in Section 3.1, $\alpha_0 = 1$, $\alpha = 10$ and $\alpha_p = 10^3$,
leading to a simple mean estimator of $10.8 \times 10^6$. We use a binomial model to approach
this estimation problem formally, and use a Bayesian approach to quantify the uncertainty
in estimates of CFU counts.
The efficacy of an antimicrobial treatment is usually quantified by a log reduction (LR),
which is the estimated mean log abundance of microbes in a concurrent control minus the
estimated mean log abundance of microbes that survived the treatment. The
microbes in the control samples are subjected to the same conditions as the treated samples
with the exception that the controls are subjected to a placebo treatment. Perhaps not
surprisingly, the log abundances of the microbes in the control samples are typically much less
variable than those of the microbes subjected to an antimicrobial (Parker et al., 2018b). Hence the
constant variance assumption of many statistical models is often violated when analyzing a
data set that includes both control and treated samples.
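The definition of the LR can be sketched in a few lines. The abundances below are hypothetical values chosen only to illustrate the calculation (they are not data from the paper), and the log abundance is taken as log10(N + 1), the convention adopted later in Section 2, so that a sample with no survivors is well defined.

```python
from math import log10
from statistics import mean, stdev

# Hypothetical CFU abundances in tube 0 (illustrative only, not data from the paper).
control = [2.1e8, 1.8e8, 2.5e8]
treated = [3.0e3, 0.0, 4.5e4]

def log_abundance(n):
    """log10(N + 1), defined even when no CFUs survive (N = 0)."""
    return log10(n + 1)

log_control = [log_abundance(n) for n in control]
log_treated = [log_abundance(n) for n in treated]

# LR = mean log abundance of the control minus that of the treated samples.
log_reduction = mean(log_control) - mean(log_treated)

print(f"LR = {log_reduction:.2f}")
print(f"SD of control logs = {stdev(log_control):.2f}")
print(f"SD of treated logs = {stdev(log_treated):.2f}")
```

In this made-up example the treated samples are far more variable than the controls, which is the pattern that breaks a constant-variance assumption.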
There are two common approaches to analyzing CFU data. One uses a Poisson likelihood
model of the counts (Haas et al., 1999), while the other uses a log-normal likelihood model
of the counts (Hamilton et al., 2013). Both maximize the likelihood to provide microbial
abundance estimates and quantify uncertainty. Both also can be extended to handle random
effects (Zuur et al., 1999) due to samples being repeatedly collected from the same site,
experiment and/or the same laboratory. Software is readily available to fit either of these
types of models (see, e.g., Bates et al., 2015).
Haas et al. (1999, p. 213) argue that a Poisson maximum likelihood estimator (MLE)
approach is preferred over the log-normal MLE. One reason is that the Poisson likelihood
naturally deals with zero CFU counts. The obvious downside to the Poisson model is the
requirement that the variance is equal to the mean. The control data that we present here
clearly do not adhere to this restriction. The generalized Poisson and negative binomial are
both extensions of the Poisson that allow the variance to be estimated independently
of the mean (Joe and Zhu, 2005). Nonetheless, neither of these Poisson approaches allows
one to directly analyze LRs.
The log-normal MLE approach overcomes the restriction that the mean equals the
variance by including separate parameters for the mean and variance. To deal with the differing
variability of microbes treated by antimicrobials versus controls, one can aggregate the log
abundance estimates into LRs and then apply a normal model to the LRs (Hamilton et al.,
2013). Unfortunately, this approach does not allow one to separately model or estimate the
variance among counts on different plates at different dilutions. When presented with CFU
data including a zero or TNTC, a common tactic when using a log-normal likelihood is to
substitute a small value (CFU = 0.5 or 1) for zero and to substitute the largest count
for TNTC (30 for the drop plate method, 300 for the spread plate method). Many published
simulation studies show that as long as there are not too many censored data (≤15% of the
total data set), the log-normal model has little bias and mean squared error when estimating
the mean log abundance of organisms (see, e.g., Clarke, 1998; EPA, 1996; Haas and Scheff,
1990; Singh and Nocerino, 2002).
The approach that we present here has several advantages over the conventional
approaches just described.

1. The design and assumptions of the dilution experiments lead directly to our binomial
model in (1) and (2).

2. Any censored data are directly modeled in the binomial likelihood (i.e., no substitution
rules are employed) resulting in zeros and TNTCs handled systematically by the same
model.

3. Counts from multiple dilutions are directly incorporated into the model (ie. they are
not aggregated together before statistical analysis), which allows us to separate out the
variance among counts on different plates and on different dilutions from other sources
of variance that contribute to the variance of log abundances and LRs.

4. Our model accounts for clustering of the cells (with the miscount probability q) that
violates the assumption that one microbe generates one CFU. This aspect of the model
can also deal with miscounts by the technician who actually counts the CFUs.

5. Instead of summarizing the results from each experiment by a LR and then analyzing
the LRs using a normal model, we provide an over-arching hierarchy for the analysis
of LRs that has the explicit information regarding the CFUs and dilutions that led to
the LR.

The paper is organized as follows. In section 2 we build our binomial model from the
description of the experimental design. Using a basic theorem, a simplification is obtained
leading to a straightforward likelihood. We also describe our Bayesian inference process, a
hierarchical modeling strategy to analyze intra- and inter-lab data, and the MCMC method
used. Two examples are presented in section 3 considering real data and in section 4 we
present a discussion of the paper.

2 Model and Inference


As explained above, microbial samples are treated with, for example, a chemical agent at
some concentration, or water temperature for a specified contact time, and the effect of this
treatment is observed as the number of CFUs from surviving microbes using a dilution series.
Commonly, several experiments may be conducted with the same treatment, and within
each experiment, multiple repetitions or samples are also considered. To ease analysis and
notation, we concentrate on one experiment for a single treatment. We make a comment on
the analysis of multiple treatments in section 4.
Throughout, we will refer to the plate or drop from which CFUs were counted only as
‘drop’; our model may consider in principle any such volume from which CFUs are counted
simply by using a different division factor αp = V /u.

Let $K$ be the number of repetitions of microbial samples to be analyzed in one experiment.
Each sample, grown in an original surface area, volume etc. $S_c$, is transported into a volume
$V_0$ (typically buffered water) and carefully homogenized in tube 0 of each repetition $k =
1, 2, \ldots, K$. Some fraction $\alpha^{-1} \alpha_0^{-1} V_0$ of the volume $V_0$ in this first tube is taken to tube 1,
diluted in a volume $(1 - \alpha^{-1}) \alpha_0^{-1} V_0$ and carefully homogenized. This process is repeated to
produce tubes $j = 1, \ldots, J - 1$ for each repetition. Tubes 1 through $J - 1$ will have volume $V = \alpha_0^{-1} V_0$,
and $\alpha_0 = V_0/V$ is the ratio of the volume in tube 0 to the common volume $V$ in
the rest of the dilution series tubes. In our examples $V_0 = V = 10$ ml, $\alpha_0 = 1$ (intra-lab
example studying heat treatments) and $V_0 = 40$ ml, $V = 10$ ml and $\alpha_0 = 4$ (inter-lab example
studying bleach treatments).
$D$ ‘drops’ of volume $u$ are then plated from each dilution tube. For example, for the drop
plate method, a total of $D = 10$ drops, each with volume $u = 10$ µl, are usually plated. For
the spread plate method, $D = 2$ drops, each with volume $u = 0.1$ ml or $u = 1$ ml, are common.
One or several Petri dishes may be used to grow CFUs from these drops, but we will not
distinguish between different dishes from the same dilution tube. For $j = 1, \ldots, J - 1$ the
volume ratio for plating is $\alpha_p = V/u$ and for $j = 0$ it is $\alpha_p \alpha_0 = V_0/u$, so that a drop contains a fraction $\alpha_p^{-1}$ (respectively $\alpha_p^{-1} \alpha_0^{-1}$) of the tube volume.
Let $N_j^k$ be the r.v. representing the number of CFUs in dilution $j$ for repetition $k$ and let
$n_j^k$ be any particular realization of it (we use the standard probabilistic notation of upper
case for the r.v. and lower case for a particular value of it, e.g. $P(Y \le y \mid X = x)$, the
probability of $Y$ being less than or equal to the particular value $y$ conditional on $X = x$). Let
$Y_{j,i}^k$ be the CFU count in drop $i = 1, 2, \ldots, D$, for dilution $j$ for repetition $k$. Our approach
is to build a hierarchical model following the dilution process and the counting process just
described. Due to homogenization, we can safely assume that $N_j^k$ and $Y_{j,i}^k$ are binomial
random variables. Namely

$$Y_{j,i}^k \mid N_j^k = n_j^k, q \sim \mathrm{Bi}\left(n_j^k,\ \alpha_p^{-1} \alpha_0^{-\delta_{0,j}} (1 - q)\right) \qquad (1)$$

$$N_j^k \mid N_{j-1}^k = n_{j-1}^k \sim \mathrm{Bi}\left(n_{j-1}^k,\ \alpha^{-1} \alpha_0^{-\delta_{1,j}}\right) \qquad (2)$$

for $j = 0, 1, \ldots, J - 1$, where $\delta_{i,j}$ is the Kronecker delta. Here $1 - q$ is included as an additional
probability that each individual microbe in each drop actually does form a distinguishable
colony and adds to the CFU count.
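To make the two-level structure of (1) and (2) concrete, the following minimal sketch simulates one repetition of a dilution series. It is only an illustration under stated assumptions: the helper names are hypothetical, the binomial draws are generated by summing Bernoulli trials (slow but dependency-free), and the values `n0 = 50_000` and `alpha_p = 20` are deliberately small, hypothetical settings chosen so that pure-Python sampling is fast. They are not the design used in the paper.

```python
import random

def binomial_draw(rng, n, p):
    """Draw from Bi(n, p) by summing Bernoulli trials (adequate for small n)."""
    return sum(1 for _ in range(n) if rng.random() < p)

def simulate_series(rng, n0, J=4, D=5, alpha=10.0, alpha_0=1.0, alpha_p=20.0, q=0.05):
    """Simulate tube contents N_j and drop counts Y_{j,i} following (1) and (2)."""
    tubes = [n0]
    for j in range(1, J):
        # Transfer probability in (2): alpha^-1, times alpha_0^-1 only when j = 1.
        p_transfer = (1.0 / alpha) * (1.0 / alpha_0 if j == 1 else 1.0)
        tubes.append(binomial_draw(rng, tubes[-1], p_transfer))
    drops = []
    for j, n_j in enumerate(tubes):
        # Plating probability in (1): alpha_p^-1 (1 - q), times alpha_0^-1 only when j = 0.
        p_drop = (1.0 / alpha_p) * (1.0 / alpha_0 if j == 0 else 1.0) * (1.0 - q)
        drops.append([binomial_draw(rng, n_j, p_drop) for _ in range(D)])
    return tubes, drops

rng = random.Random(2020)
tubes, drops = simulate_series(rng, n0=50_000)
for j, (n_j, counts) in enumerate(zip(tubes, drops)):
    print(f"dilution {j}: N_j = {n_j}, drop counts = {counts}")
```

Each tube holds roughly one tenth of the CFUs of the previous one, and the drop counts shrink accordingly until they fall into a countable range.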
The proposed model above brings up an interesting philosophical point. This binomial
model is simply describing the way that the experiment is conducted, as opposed to a Poisson
or log-normal model. As explained in the introduction, the above model is in fact describing the
experimental design with no further assumptions in the statistical modeling than those already
assumed by the experimenters conducting the dilution experiments. These assumptions are
that, through careful homogenization, a drop from dilution 1 has a proportion $\alpha_p^{-1} \alpha^{-1}$ of
CFUs from dilution 0 (for $\alpha_0 = 1$) and dilution $j$ has a proportion $\alpha^{-1}$ of CFUs from dilution $j - 1$.
Moreover, only a small proportion $q$ (maybe less than 5%) of CFUs fail to make countable
colonies.
In addition, use of the binomial permits direct estimation of the number of CFUs (the $N_j^k$’s)
with no need for scaling and alleviates the main issues of the other models: over- and under-
dispersion in the Poisson case and the use of substitution rules for 0’s and TNTCs in the
log-normal case.
To model all repetitions in a single experiment, and because it is common to consider log
abundances when $N_0^k$ is large, we take the usual approach of constructing a model linking
all log abundances $\log_{10}(N_0^k + 1)$, $k = 1, 2, \ldots, K$, as realizations from a population having a
common mean $E$ with some dispersion. This again reflects/models what is being done in the
lab: the intention of performing repetitions is to try to estimate the (log) abundance of the
experiment itself, and assess its variability by performing $K$ repetitions under repeatable
conditions. Accordingly, $E$ is interpreted as the mean log abundance for a single experiment,
which we will infer; using a Bayesian approach, it will be taken as a r.v. and its posterior
distribution estimated. However, before describing the details of the latter, we first comment
on the following two points.
First, as opposed to usual practices, from the outset we add 1 when calculating the log
abundance to properly define it when $N_0^k = 0$, that is, when there are zero CFUs,
as occurs when no microbes survive an antimicrobial treatment. Note that for $N_0^k \ge 10$
(as almost always occurs for control samples), $\log_{10}(N_0^k + 1)$ is already nearly the same as
$\log_{10}(N_0^k)$, so the interpretation of the log abundance defined as $\log_{10}(N_0^k + 1)$ should be
straightforward for these large values of $N_0^k$. When $N_0^k$ is closer to 0, a log abundance
is less useful. Instead, one can examine directly $N_0^k$ and its posterior for statistical inference.
Still, Taylor’s Theorem shows that $\log_{10}(N_0^k + 1) \approx N_0^k / \ln(10)$ for small $N_0^k$, i.e., it is approximately
proportional to $N_0^k$, which suggests defining
the log abundance as $\log_{10}(N_0^k + 1)$ even in this case. Moreover, this definition will permit
us not only to be consistent in all cases, but also to discuss the limit of detection (LOD)
when no counts are detected; see section 2.2.
Second, it is common to normalize abundances to the volume or surface area of the
original specimen, $S_c$, via $N_0^k / S_c$. We recommend applying this normalization to convert
log abundances to log densities by $\log_{10}(N_0^k + 1) - \log_{10}(S_c) = \log_{10}\left(\frac{N_0^k + 1}{S_c}\right)$. This brings
up an interesting point that is not well appreciated by microbiologists. Changing the units
via $S_c$ clearly changes the mean log density but leaves the variance unchanged (due to
the multiplicative property of the log transform). For example, for biofilm samples, when
changing the units from CFU/mm$^2$ to CFU/cm$^2$, $S_c$ decreases by a factor of 100, so that the
mean log density increases by 2 but the standard error of the sample mean (SEM) and the
SD of the individual log densities remain unchanged. Hence, any frequentist hypothesis test
of the population mean of $\log_{10}(N_0^k + 1)$ that depends on a t-ratio of the sample mean to the
SEM will always become “statistically significant” for a drastic enough change in units. This
issue is mitigated when considering the log abundance of microbes compared to a concurrent
control via a log reduction (LR), as occurs when assessing antimicrobial treatments. This is
because the LR is unitless.
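The effect of a change of units can be checked numerically. The sketch below uses hypothetical abundances and a hypothetical specimen area (neither comes from the paper) and compares the log densities expressed per mm$^2$ and per cm$^2$: the mean shifts by exactly 2, the SD does not move, and so the t-ratio changes purely because of the units.

```python
from math import log10, sqrt
from statistics import mean, stdev

# Hypothetical CFU abundances for K = 4 biofilm samples (illustrative only).
abundances = [120, 45, 300, 80]
area_mm2 = 500.0             # hypothetical specimen area S_c in mm^2
area_cm2 = area_mm2 / 100.0  # the same area expressed in cm^2

def log_densities(counts, s_c):
    """log10(N + 1) - log10(S_c) for each sample."""
    return [log10(n + 1) - log10(s_c) for n in counts]

def t_ratio(values):
    """Sample mean divided by the standard error of the mean."""
    sem = stdev(values) / sqrt(len(values))
    return mean(values) / sem

per_mm2 = log_densities(abundances, area_mm2)
per_cm2 = log_densities(abundances, area_cm2)

print(f"mean shift = {mean(per_cm2) - mean(per_mm2):.3f}")
print(f"SD per mm^2 = {stdev(per_mm2):.3f}, SD per cm^2 = {stdev(per_cm2):.3f}")
print(f"t-ratio per mm^2 = {t_ratio(per_mm2):.2f}, per cm^2 = {t_ratio(per_cm2):.2f}")
```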
Returning to the experimental mean log abundance $E$, the first and nearly default
modeling approach would be taking

$$\log_{10}(N_0^k + 1) = E + \epsilon_k; \quad \epsilon_k \sim N(0, \sigma) \qquad (3)$$

or $\log_{10}(N_0^k + 1) \sim N(E, \sigma)$. This model might indeed be appropriate for relatively small
$\sigma$, to maintain $\log_{10}(N_0^k + 1)$ positive, but the Gaussian model is certainly not well suited in
general. To make a more robust modeling approach, first assume that

$$\mathbb{E}\left[\log_{10}(N_0^k + 1) \mid E = e\right] = e; \quad \text{for } k = 1, 2, \ldots, K \qquad (4)$$

i.e. $\mathbb{E}\left[\log_{10}(N_0^k + 1) \mid E\right] = E$. Second, we introduce $A$ as a dispersion parameter, to stipulate
the model

$$\log_{10}(N_0^k + 1) \mid E = e, A = a \sim \mathrm{Ga}(a, e a^{-1}). \qquad (5)$$

Here we use the parametrization for the Gamma distribution $\mathrm{Ga}(a, b)$ where $b$ is the ‘scale’
parameter, and therefore the expected value above is precisely $e$ as required. Moreover, its
standard deviation is $e/\sqrt{a}$ and its signal-to-noise ratio (i.e. mean over standard deviation or
the inverse of the coefficient of variation) is $\sqrt{a}$, representing the unitless dispersion in the
model, which is specially well suited for positive r.v.’s such as $\log_{10}(N_0^k + 1)$. Note also that
the above gamma model correctly generalizes the Gaussian model (3) since for large $A$ (e.g.
$A > 30$) the gamma distribution is already close to a Gaussian and $\log_{10}(N_0^k + 1) \mathrel{\dot\sim} N(E, E/\sqrt{A})$,
and this is precisely the case when we have relatively small variances (low coefficient of
variation), making the Gaussian model appropriate. Then simply, (5) generalizes the default
Gaussian model in (3) to positive only values, only using a different, and perhaps better
suited, parametrization.
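The moments quoted for the gamma model in (5) can be checked by simulation. The sketch below is a minimal check using Python's standard `random.gammavariate`, whose second argument is a scale parameter, matching the Ga(a, b) parametrization used here. The values `e = 6` and `a = 100` are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, stdev

def sample_log_abundance(rng, e, a):
    """Draw log10(N_0 + 1) from Ga(shape=a, scale=e/a), the model in (5)."""
    return rng.gammavariate(a, e / a)

rng = random.Random(1)
e, a = 6.0, 100.0  # hypothetical mean log abundance and shape parameter
draws = [sample_log_abundance(rng, e, a) for _ in range(20_000)]

print(f"sample mean = {mean(draws):.3f} (theory: e = {e})")
print(f"sample SD   = {stdev(draws):.3f} (theory: e/sqrt(a) = {e / sqrt(a):.3f})")
print(f"mean/SD     = {mean(draws) / stdev(draws):.2f} (theory: sqrt(a) = {sqrt(a):.1f})")
```

With a shape this large the draws are also close to Gaussian, which is the sense in which (5) contains the default model (3) as a limiting case.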
The specification above in fact creates a hierarchical model and, using a simple well known
result, we may integrate out the $N_j^k$; $j = 0, 1, \ldots, J - 1$ and the binomial model in (1) becomes

$$Y_{j,i}^k \mid N_0^k = n_0^k, q \sim \mathrm{Bi}\left(n_0^k,\ \alpha^{-j} \alpha_p^{-1} \alpha_0^{-1} (1 - q)\right). \qquad (6)$$

See further details on this and other technical elements of our model in Appendix A.
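The result used to obtain (6) is not spelled out in this section. One standard fact that produces it is that binomial thinning composes: if $M \sim \mathrm{Bi}(n, a)$ and $Y \mid M = m \sim \mathrm{Bi}(m, b)$, then $Y \sim \mathrm{Bi}(n, ab)$. The sketch below checks this identity numerically by summing out $M$, and evaluates the success probability in (6). Whether this is exactly the argument given in Appendix A is an assumption, and the function names are illustrative.

```python
from math import comb, isclose

def binom_pmf(y, n, p):
    """Binomial probability mass function Bi(n, p) evaluated at y."""
    return comb(n, y) * p**y * (1.0 - p) ** (n - y)

def two_stage_pmf(y, n, a, b):
    """P(Y = y) when M ~ Bi(n, a) and Y | M = m ~ Bi(m, b), summing out M."""
    return sum(binom_pmf(m, n, a) * binom_pmf(y, m, b) for m in range(y, n + 1))

def success_probability(j, alpha=10.0, alpha_0=1.0, alpha_p=1000.0, q=0.05):
    """Success probability in (6): alpha^-j * alpha_p^-1 * alpha_0^-1 * (1 - q)."""
    return alpha ** (-j) / (alpha_p * alpha_0) * (1.0 - q)

# Numerical check of the thinning identity on a small, arbitrary case.
n, a, b = 40, 0.1, 0.3
identity_holds = all(
    isclose(two_stage_pmf(y, n, a, b), binom_pmf(y, n, a * b), rel_tol=1e-9)
    for y in range(n + 1)
)
print("two-stage binomial equals Bi(n, a*b):", identity_holds)
print("success probability at dilution 3:", success_probability(3))
```

Applying the identity once per dilution step multiplies the probabilities in (1) and (2) together, which is where the factor $\alpha^{-j}$ in (6) comes from.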
We use a Bayesian approach to make inferences for the parameters of interest by first
stating prior distributions and then performing an MCMC to sample from the posterior.
We require prior distributions for $E, A, N_0^1, N_0^2, \ldots, N_0^K$. We take a pragmatic approach
in which the prior for $N_0^k$ is a discrete uniform distribution from 0 to $10^M$, $E \sim U(0, M)$
and $A \sim \mathrm{Exp}(b)$, where $b$ is a scale parameter. $10^M$ is a maximum physical capacity of CFUs
for the surface with surface area $S_c$ or the volume with volume $S_c$, etc. put to treatment.
In engineered reactor systems especially, experimentalists know the maximum microbial
abundance in a sample (i.e., they know $M$). Indeed, they choose which dilutions to plate
based on this knowledge. $A$ is the shape parameter of the Gamma conditional distribution
$f_{N_0^k \mid E, A}(n_0^k \mid e, a)$ in (5). A simple approach is to take the prior for $A$ as exponential, resulting
in most of its mass lying between 0 and $2b$. In the drop plate examples below we use $M = 10$ and $b = 500$.
This prior parameter for $A$ ($b = 500$) was calibrated such that each $\log_{10}(N_0^k + 1)$ had
nearly the same prior as $E$, that is a $U(0, M)$ (i.e., a priori $f_{N_0^k \mid E, A}(n_0^k \mid e, a)$ is approximately
$U(0, M)$ for all $k$; this may be done by calibrating $b$ with simulated samples from (5)).
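A calibration of this kind could be sketched as follows: draw $E$ and $A$ from their priors, draw a log abundance from (5), and compare the resulting sample with a $U(0, M)$. This is only one plausible reading of the calibration described above. The function name and the choice of comparing a few empirical probabilities are assumptions, not the authors' procedure.

```python
import random

def prior_draw_log_abundance(rng, M=10.0, b=500.0):
    """One prior draw of log10(N_0 + 1): E ~ U(0, M), A ~ Exp(scale b), then (5)."""
    e = max(rng.uniform(0.0, M), 1e-12)      # guard against a zero scale
    a = max(rng.expovariate(1.0 / b), 1e-9)  # expovariate takes a rate, i.e. 1/scale
    return rng.gammavariate(a, e / a)        # shape a, scale e/a

rng = random.Random(7)
M = 10.0
draws = [prior_draw_log_abundance(rng, M=M) for _ in range(20_000)]

# Under an exact U(0, M) prior these empirical probabilities would equal the targets.
for target in (0.25, 0.5, 0.75):
    frac = sum(1 for x in draws if x < target * M) / len(draws)
    print(f"P(log abundance < {target * M:.1f}) = {frac:.3f} (uniform: {target})")
```

Repeating the comparison for several values of `b` shows how smaller scales, which give smaller shape parameters $A$, spread the induced prior away from the uniform.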
We use the t-walk (Christen and Fox, 2010) to produce an MCMC algorithm to sample
from the resulting posterior distribution. The t-walk is a self-adjusting MCMC algorithm
that requires the log posterior and two initial points. In the resulting MCMC, in all examples
we typically generated chains of length 500,000 with an integrated autocorrelation time (IAT; Geyer, 1992) of 50, leading to
an effective sample size of roughly 10,000. In our Python implementation the corresponding
computations took 50 seconds on a 2.2 GHz processor. By simulating initial values for each
$N_0^k$ from its ‘free’ posterior (see Appendix A), the burn-in was very short in most
cases. The initial value for $E$ is taken as the mean of the $\log_{10}(N_0^k + 1)$ values and the initial value
for $A$ is taken by simulating from its prior distribution. Overall, the MCMC is very robust,
working nearly unsupervised in all the examples we tested, including all those presented
here.

2.1 Goodness of fit of the binomial model


Our binomial model simply follows what is actually done in the laboratory and therefore
we claim it models dilution series count data correctly. For example, it accounts for all
extreme or censored data cases. However, there may be unaccounted sources of variability
that could question the appropriateness of the binomial model. To support our claim that
our approach is an overall better model for making the correct inferences, we here compare
it to an alternative model. Obvious models to compare the binomial model with include
the Poisson, negative binomial, or generalized Poisson. The generalized Poisson and negative
binomial are both Poisson mixtures (Joe and Zhu, 2005), and would be good candidates for a
comparison, but it is not clear how to model the dilution series, as we did in (1) and (2), with
respect to the parameter of interest $N_0^k$ (the abundance in the original sample) to provide a
fair comparison with our model. Certainly alternative definitions can be attempted to use
some Poisson mixture as a model for dilution data, but a straightforward generalization of
our binomial approach is a Beta-binomial distribution.
The Beta-binomial distribution is the result of a mixture of a binomial and a Beta
distribution, and as such is a generalization of the binomial distribution. Namely, if $Y \mid S = s \sim
\mathrm{Bi}(n, s)$ and $S \sim \mathrm{Beta}(\alpha, \beta)$ then $Y \sim \mathrm{BetaBinomial}(n, \alpha, \beta)$. That is, this generalizes the
binomial distribution by letting the success probability be random, resulting in a variance
that is independent of the mean, whereas for the binomial model, its expected value over
variance is always > 1.
It is clear now how to extend our binomial model in (6), namely,

$$Y_{j,i}^k \mid N_0^k = n_0^k, S = s \sim \mathrm{Bi}(n_0^k, s) \quad \text{and} \quad S \sim \mathrm{Beta}(s^* \lambda, (1 - s^*) \lambda),$$

where $s^* = \alpha^{-j} \alpha_p^{-1} \alpha_0^{-1} (1 - q)$. Here $E(S) = s^*$ as required, for any $\lambda > 0$. That is, instead
of setting $Y_{j,i}^k \mid N_0^k = n_0^k \sim \mathrm{Bi}(n_0^k, s^*)$ as in (6) we let $s^*$ become a parameter, the r.v. $S$, with
$E(S) = s^*$, and in this way generalize the binomial model to a Beta-binomial. Here $\lambda$ is an
additional hyperparameter for the above Beta (prior) distribution for $S$.

Figure 2: Boxplots of the posterior odds (Bayes Factors, BF) comparing the Beta-binomial
vs the binomial model, except for 3 cases with BF > 4. From the 69 data sets analyzed,
only 4 had a BF > 3 and 43 had BF < 1. These results fail to indicate clear evidence in
favor of the Beta-binomial model but instead seem to favor the binomial model.
We use the Bayesian model comparison machinery, calculating the posterior odds of the
Beta-binomial vs the binomial model, namely, the Bayes Factors (BF) of the Beta-binomial
against the binomial model (see Kass and Raftery, 1995, for example). In either case, the
posterior probability for $N_0^k$ is discrete and its normalization constant is calculated with a
sum; this normalization constant is calculated for each of the binomial and the Beta-binomial
models. The posterior probability of each model is proportional to its normalization constant
and the BF is the ratio of these posterior probabilities or in fact the ratio of the normalization
constants of each model.
We still need to fix $\lambda$ to stipulate the Beta prior distribution for $S$ in each case. If $s^* \lambda < 1$
the prior mode for $S$ is at zero and would bias the Beta-binomial model towards zero CFU
counts in all cases. To avoid this we need $s^* \lambda \ge 1$, that is $\lambda \ge (s^*)^{-1}$. We have seen that
not restricting $s^* \lambda \ge 1$ leads to a BF of practically zero in favor of the Beta-binomial model,
except in cases with zero CFU counts, as expected (results not shown). Since in most cases
$(s^*)^{-1}$ is quite large, setting $\lambda = (s^*)^{-1} + 1$ represents a neutral choice.
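The comparison described above can be sketched on a toy scale. The code below computes the normalization constant of each model by summing the likelihood over $n_0$ under a discrete uniform prior, and takes their ratio as the BF. Several choices are assumptions made only for illustration: the design is shrunk to a hypothetical $s^* = 0.01$ with prior upper bound `n_max = 5000` so that the sum is cheap, the Figure 1 drop counts are reused with this hypothetical design, and each drop is given its own independent draw of $S$ (the text does not state whether $S$ is shared across drops).

```python
from math import exp, lgamma, log

def log_binom_pmf(y, n, p):
    """log Bi(n, p) at y, via log-gamma for numerical stability."""
    if y > n:
        return float("-inf")
    return (lgamma(n + 1) - lgamma(y + 1) - lgamma(n - y + 1)
            + y * log(p) + (n - y) * log(1.0 - p))

def log_betabinom_pmf(y, n, a, b):
    """log BetaBinomial(n, a, b) at y: C(n, y) B(y + a, n - y + b) / B(a, b)."""
    if y > n:
        return float("-inf")
    log_choose = lgamma(n + 1) - lgamma(y + 1) - lgamma(n - y + 1)
    log_num = lgamma(y + a) + lgamma(n - y + b) - lgamma(n + a + b)
    log_den = lgamma(a) + lgamma(b) - lgamma(a + b)
    return log_choose + log_num - log_den

def evidence(counts, log_pmf, n_max):
    """Normalization constant: uniform prior on 0..n_max times the likelihood."""
    total = 0.0
    for n0 in range(n_max + 1):
        total += exp(sum(log_pmf(y, n0) for y in counts))  # exp(-inf) == 0.0
    return total / (n_max + 1)

s_star = 0.01               # hypothetical success probability (not the paper's design)
lam = 1.0 / s_star + 1.0    # the "neutral" choice lambda = 1/s* + 1
counts = [13, 10, 6, 9, 16]
n_max = 5000                # hypothetical prior upper bound on N_0

ev_bin = evidence(counts, lambda y, n: log_binom_pmf(y, n, s_star), n_max)
ev_bb = evidence(
    counts,
    lambda y, n: log_betabinom_pmf(y, n, s_star * lam, (1.0 - s_star) * lam),
    n_max,
)
bayes_factor = ev_bb / ev_bin
print(f"BF (Beta-binomial vs binomial) = {bayes_factor:.3g}")
```

For the paper's actual designs the prior range extends to $10^M$, so a direct sum like this one would need to be restricted to the region where the likelihood is non-negligible.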
We performed the Bayesian model comparison on the data sets mentioned in section 3.
For the 51 CBE and 18 inter-lab individual dilution series count data we found only 4 BF’s
above 3: one for the former, with a BF = 15 (tube 3 of experiment 70 °C, 10 min), and
three for the latter data set, with BF’s of 15, 5.5 (from tube 1 of lab 5, tube 1 of lab 6)
and 3.45. Leaving out those BF’s above 4, the rest of the BF’s are plotted in the boxplots
in Figure 2. We are using the common recommendation that a BF above 3 provides positive
evidence against the default model (Kass and Raftery, 1995). These results do not provide
strong evidence for the Beta-binomial over the binomial model. Moreover, most of the BF’s
were below 1, which is evidence in favor of the binomial model over the alternative, more
general, Beta-binomial.

2.2 Censored data, zero counts and level of detection


As explained in the introduction, the usual practice when counting CFUs is that, besides the
actual dilution selected for counting, the other $Y_{j,i}^k$s are not recorded. Therefore, the CFUs in
the first uncounted dilutions could be considered as censored data since we know that drops
below the selected dilution are TNTC, and CFU counts in drops above the selected
dilution could also be counted, even if all zero. However, this added difficulty in recording
and analysis does not seem to add any substantial information (Hamilton and Parker, 2010),
although this will depend on the particular design, mainly on the dilution parameters $\alpha$ and $\alpha_p$.
We have confirmed this using our model and the usual drop plate design using a simulation
study (see Appendix B); we do not discuss this possibility any further.
Accordingly, let $j_k \in \{0, 1, \ldots, J - 1\}$ be the dilution selected for counting (i.e., the lowest
countable dilution) for repetition $k$ and, to ease notation, let $Y_i^k = Y_{j_k,i}^k$, the actual CFU count
in the $i$th drop. The selected dilution is part of our experimental design and is taken as the
first dilution such that $Y_{j,i}^k \le c$ ($c = 30$ when drop plating, $c = 300$ when spread plating). If
zero CFUs appeared even in the first tube, we let $j_k = 0$ and $Y_i^k = Y_{0,i}^k = 0$. Dealing with
zero counts, that is $Y_i^k = 0$ (i.e., no CFU counts even at dilution 0), represents no problem
and may be dealt with consistently since (6) leads to a likelihood from which a posterior
may be calculated; see Figure 3(a).
In the case where, even at the highest dilution, the CFU count is still above the threshold
c, that is $Y_{j,i}^{J} > c$, then the CFU count is recorded as TNTC, and we may treat this as right
censored data. The likelihood in this case is $P[Y_{j,i}^{J} > c \mid N_0^k = n_0^k, q]$ and this extreme case can

Figure 3: (a) Posterior distribution for the total CFUs in dilution 0, when zero CFUs are
counted, α = 10, αp = 1000 and D = 10 drops. A comparison is shown when the miscount
probability is q = 0.05 or q = 0.0; little difference is observed in this experimental setting. (b)
Example showing all TNTC at the highest dilution: ignoring the censorship by setting $Y_i^k = c$
(green) and properly including the censorship in the likelihood (blue). Note that in
the latter the posterior reaches a plateau, since any arbitrarily large value is equally likely,
and is only truncated by the prior, i.e., a maximal feasible number of CFUs for the coupon
surface. (c) Example of a hybrid case in which only three out of 10 drop counts were not
TNTC, all close to c = 30, ignoring the censorship (green) and considering the censorship in
the likelihood; little difference is already observed.

also be dealt with, although with a higher computational burden; it involves calculating the
binomial cdf in the likelihood. An example of this saturated data is presented in Figure 3(b)
and (c). In the real data examples presented in Sections 3.1 and 3.2 no saturated counts are
present.
Regarding the miscount parameter q, note that for the usual drop plate experimental design
(V = 10 ml, α = 10, αp = 1000) the success probability in the binomial model (6) is
$\alpha^{-j_k} \alpha_p^{-1} (1 - q)$, which will be quite similar to $\alpha^{-j_k} \alpha_p^{-1}$ for reasonable miscount probabilities
q ≤ 0.1. The effect of q will be barely noticeable. For other experimental settings the
effect of q could be more important, in which case q could also be considered a (nuisance)
parameter to be included in the posterior, with a tight prior for small q. It would
be advisable to design an experiment to learn about the miscount parameter q, with
a reference sample of known microbial abundance and several repetitions. However,
to our knowledge, such experiments have not yet been conducted. In the
examples presented in Section 3 we simply fix q = 0.05; this should barely have an effect, as
seen in Figure 3(a). Nonetheless, q is included for overall consistency of our approach.
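The size of the effect of q on the success probability can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the function name `success_probability` and the example abundance `n0 = 500` are assumptions made here, not part of the supplemental code.

```python
def success_probability(j, alpha=10.0, alpha_p=1000.0, q=0.0):
    """Per-drop success probability alpha^-j * alpha_p^-1 * (1 - q) at dilution j."""
    return alpha ** (-j) * (1.0 / alpha_p) * (1.0 - q)


n0 = 500  # hypothetical abundance in dilution 0, chosen only for illustration
for j in range(3):
    p_no_miscount = success_probability(j, q=0.0)
    p_miscount = success_probability(j, q=0.05)
    # Expected CFU count in one drop is n0 * p under the binomial model.
    print(j, n0 * p_no_miscount, n0 * p_miscount, p_miscount / p_no_miscount)
```

The ratio of the two probabilities is always 1 − q, so for q = 0.05 the expected drop counts change by 5%, which is small relative to the binomial sampling variability at these count levels.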
Regarding the lower limit of detection (LOD, Currie, 1968), that is, the minimum number
of CFUs that can be detected in dilution 0 with the chosen experimental design, we may
Figure 4: Posterior pmf for $10^E - 1$ (in 10 CFU bins), setting all drop counts to zero, with
K repetitions. Drop plate design α = 10, αp = 1000, D = 10 drops. Lower LODs, i.e., $L_K$
such that $P(10^E - 1 < L_K \mid Y_0^k = 0, k = 1, \ldots, K) < 0.95$, are L1 = 110, L3 = 50 and L12 = 30.

calculate $L_K$ such that $P(10^E - 1 < L_K \mid Y_0^k = 0, k = 1, \ldots, K) < 0.95$, for various numbers of repetitions
K. That is, setting all drop counts to zero, we define the lower LOD as the 95% upper
quantile $L_K$ of the corresponding posterior distribution for the microbial abundance $10^E - 1$.
In Figure 4 we present examples of these posteriors for the drop plate design (α = 10, αp =
1000, D = 10 drops) with K = 1, 3 and 12 repetitions. The results are L1 = 110, L3 = 50
and L12 = 30. A substantial improvement (reduction) in the lower LOD is obtained when increasing from 1
to 3 repetitions, but the improvement is slower from then onwards, coinciding with
the current practice of performing K = 3 repetitions per experiment. This same approach
can be applied to similarly assess the upper LOD for any experimental design (a useful
quantity to estimate when some drop counts are TNTC). We further comment on LODs in
our approach in Section 4.
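The idea of reading a lower LOD off a zero-count posterior can be illustrated in a simplified setting. The sketch below is not the hierarchical calculation behind Figure 4: it assumes a flat prior on N0 over {0, ..., n_max} and uses the 'free' posterior for a single repetition, so its numbers will not reproduce L1 = 110, L3 = 50 or L12 = 30. The function names and `n_max` are hypothetical.

```python
import numpy as np


def zero_count_posterior(n_max, D=10, alpha_p=1000.0, q=0.05):
    """Posterior pmf of N0 on {0, ..., n_max} when all D drops at dilution 0 show zero CFUs.

    Assumes a flat prior on N0 (a simplification made for this sketch).
    """
    p = (1.0 - q) / alpha_p
    n = np.arange(n_max + 1)
    # Each drop contributes the Bi(n, p) pmf at 0, that is (1 - p)^n.
    log_like = D * n * np.log1p(-p)
    weights = np.exp(log_like - log_like.max())
    return n, weights / weights.sum()


def upper_quantile(n, pmf, level=0.95):
    """Smallest grid value whose posterior cdf reaches the requested level."""
    cdf = np.cumsum(pmf)
    return int(n[np.searchsorted(cdf, level)])


n, pmf = zero_count_posterior(n_max=5000)
lod = upper_quantile(n, pmf)
print("95% upper quantile of N0 given all-zero counts:", lod)
```

Increasing the number of zero-count drops (for example `D=30`) lowers this quantile, which mirrors qualitatively the effect of adding repetitions.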

2.3 Inter-laboratory hierarchical model analysis


If the treatment has also been repeatedly studied in L different laboratories, each lab will
have an independent hierarchical E variable, denoted by $E_l$; l = 1, 2, . . . , L. Analogously we
may define global variables E and A where

$$E_l \mid E = e, A = a \sim \mathrm{Ga}(a, e a^{-1}); \quad l = 1, 2, \ldots, L \tag{7}$$

and again this generalizes the default Gaussian approach to positive values. This also describes
what practitioners initially intend to do: infer an overall log10(CFU + 1) across
many labs and assess its variability. This constitutes an additional hierarchy, where now
E represents the global mean for log10(CFU + 1) for the experiment across laboratories.
The t-walk can be used to generate a chain of posterior samples, as was the case for the
hierarchical models within each lab. We use this approach in the inter-laboratory examples
in Section 3.2.
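To see what the lab-level hierarchy (7) implies, one can simulate lab means for fixed global values. This is a minimal sketch under the assumption that Ga(a, b) denotes shape a and scale b (consistent with the density given in Appendix A), so that each E_l has mean e and variance e²/a. The values e = 4 and a = 50 are arbitrary illustrations.

```python
import numpy as np


def sample_lab_means(e, a, n_labs, rng):
    """Draw E_l ~ Ga(shape=a, scale=e/a); mean e, variance e^2 / a."""
    return rng.gamma(shape=a, scale=e / a, size=n_labs)


rng = np.random.default_rng(0)
draws = sample_lab_means(e=4.0, a=50.0, n_labs=100_000, rng=rng)
print("sample mean:", draws.mean(), "sample variance:", draws.var())
```

Larger values of a concentrate the lab means around the global mean e, while all draws stay positive, which is the sense in which the Gamma hierarchy generalizes a Gaussian one to positive values.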

3 Examples
All Python code and data from the examples in Sections 3.1 and 3.2 are available in the
supplemental material.

3.1 Intra-lab analysis


Data are taken from a series of experiments performed in the Center for Biofilm Engineering,
Montana State University, MT, USA. Biofilms of Sphingomonas parapaucimobilis were grown
on a cylinder with surface area $S_c = 4.52$ cm² and then subjected to different temperatures
for varying contact times. Each temperature and time combination represents a treatment
and each treatment was applied to K = 3 replicate biofilm samples, as described in Wahlen
et al. (2016). By sonication the biofilm is harvested from the cylinders into $V_0 = 10$ ml
of buffered water to form dilution j = 0. To begin an α = 10-fold dilution series,
$\alpha^{-1} \alpha_0^{-1} V_0 = 1$ ml is taken from dilution 0 and then 9 ml of water is added to form dilution
j = 1, etc., up to dilution j = 6. Ten drops of $u = 10\ \mu\mathrm{l} = 10^{-3} \cdot 10\ \mathrm{ml} = \alpha_p^{-1} \cdot V$ each
are plated and placed at 36 ± 2 °C for 48 ± 2 h and in turn inspected for CFU formation.
Therefore we have J = 7, α0 = 1, α = 10, αp = 10^3 and D = 10.
In Figure 5 we present the results of our analysis of 4 treatments: room temperature
for 15 min, 65 °C for 15 min, 70 °C for 10 min and 75 °C for 10 min. Also included
are results from the classic simple analysis (using the log-normal approach) of calculating
confidence intervals from a mean and a standard deviation of the estimated log abundances.
The resulting intervals, in general, do not coincide with the variability in the posterior
distributions and in some cases may include negative
values; see Figure 5(d).

Figure 5: Posterior marginal distributions for E (thick black) and $\log_{10}(N_0^k + 1)$; k = 1, 2, 3
without hierarchical model (thin blue), along with the 3σ intervals and data, for: (a) treatment
room temperature for 15 min, where the 3σ interval seems overdispersed; (b) treatment 65 °C
for 15 min, where the classic 3σ interval more or less coincides with the posterior variability; (c) treatment
70 °C for 10 min, where the 3σ interval looks clearly overdispersed; (d) treatment 75 °C for 10
min, where the 3σ interval is wrong, covering negative values.

A more extreme example is when most CFU counts are zero. This case is simply untreatable
using a mean and standard deviation. For the 80 °C and 2 min treatment, two of
the three repetitions had only zero counts (as mentioned in Section 2.2, we set $j_k = 0$ and
all $y_i^k = 0$, k = 1, 2) and the third repetition had one drop with one CFU only at dilution 0,
that is $j_3 = 0$, $y_1^3 = 1$ and $y_i^3 = 0$, i = 2, 3, . . . , D. We show the corresponding posterior in
Figure 6(b). Moreover, to appreciate the effect of the hierarchical model we include the ‘free’
posterior distributions for the $N_0^k$s (see Appendix A), that is, not considering the hierarchical
model, with each independent posterior for $N_0^k$ based only on the likelihood from (6);
see Figure 6(a). Since for repetitions k = 1, 2 we only had zero counts, there is a positive
probability that $N_0^k = 0$, and the mode of this free posterior is precisely at 0.
However, since repetition k = 3 had one CFU, $N_0^3 = 0$ is logically impossible
and therefore has zero posterior probability; see Figure 6(a).
The corresponding marginal posterior probabilities, now using the hierarchical model,
seen in Figure 6(b), do not match completely with the former posteriors transformed to the
log10(CFU + 1) scale. This is indeed a result of the hierarchical model, and the shift in the
individual marginal posteriors is a case of “borrowing strength” from one repetition to the
other.
In many cases it is desired to study the log abundance of a treatment with respect to its
concurrent untreated control, that is, the log reduction (LR). For example, it is common for
antimicrobial products to make claims such as “kills 99.9% of bacteria”, which corresponds
to LR = 3. A considerable advantage of using a Bayesian approach is that the LR is
analyzed explicitly through its constitutive parts (the controls and treated samples), which
can open new possibilities for more comprehensive and goal-oriented analyses for dilution
series experiments (see also the LR analysis in Section 3.2 and the ‘activation probability’
analysis discussed in Section 4). Namely, the same hierarchical model may be fitted to the
control data, obtaining a posterior sample for the hierarchical log abundance, which we call
E0, and inference regarding the LR is simply based on the posterior distribution of E0 − E.
Since we have MCMC posterior samples from both variables, obtaining a posterior sample
for E0 − E is immediate; see Figure 6(c) for an example. Moreover, we may calculate
P(LR > 3 | Data), which in this case equals 0.9993, i.e., “with near certainty the 80 °C for 2
min treatment killed at least 99.9% of the biofilm”.
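The step from two sets of posterior samples to P(LR > 3 | Data) can be sketched as follows. The samples here are synthetic Gamma draws standing in for MCMC output; they are not the paper's chains and will not reproduce 0.9993. The function name is hypothetical, and the element-wise subtraction assumes the control and treatment posteriors are independent and the two sample arrays have equal length.

```python
import numpy as np


def log_reduction_probability(e0_samples, e_samples, threshold=3.0):
    """Return posterior draws of LR = E0 - E and the Monte Carlo estimate of P(LR > threshold)."""
    e0 = np.asarray(e0_samples, dtype=float)
    e = np.asarray(e_samples, dtype=float)
    lr = e0 - e  # pairs draws element-wise; valid when the two posteriors are independent
    return lr, float(np.mean(lr > threshold))


rng = np.random.default_rng(1)
# Synthetic stand-ins for posterior samples of E0 (control) and E (treatment).
e0_draws = rng.gamma(shape=400.0, scale=7.0 / 400.0, size=20_000)
e_draws = rng.gamma(shape=4.0, scale=1.0 / 4.0, size=20_000)

lr_draws, prob = log_reduction_probability(e0_draws, e_draws)
print("estimated P(LR > 3):", prob)
```

The same pattern gives any other posterior summary of the LR, such as quantiles of `lr_draws`, at no extra modeling cost.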


Figure 6: Results from treatment 80 °C for 2 min: (a) Individual (‘free’) posterior pmf’s,
without hierarchical model, for $N_0^1$ and $N_0^2$ (dashed, both identical) and $N_0^3$ (dashed and
dotted) and (b) these same posteriors transformed to the log10(CFU + 1) scale as cdf’s, the
posterior marginal cdf for E (black) and for $\log_{10}(N_0^k + 1)$ (blue solid lines) in the hierarchical
model. (c) The log reduction of the experiment vs. the control (room temperature); this is
the posterior distribution of E0 − E, where E0 is the log10(CFU + 1) hierarchical parameter for
the control experiment, which is in fact seen in Figure 5(a). Using this posterior distribution
we calculate P(LR > 3 | Data) = 0.9993.

3.2 Inter-lab analysis


In this example we report on results from a multi-lab study of the so-called “single tube
method” (ASTM, 2013a,b), recently standardized by ASTM International (https://www.astm.org/),
to test antimicrobial efficacy against biofilms of Pseudomonas aeruginosa. In
this example we focus on 3 labs from the inter-laboratory study of the single tube method,
and a single experiment from each lab. In each experiment, we analyze K = 3 control
samples and K = 3 samples treated with a low concentration of bleach (i.e., 123 ppm of
sodium hypochlorite) against biofilms. The biofilms in the labs were harvested into a V0 = 40
ml volume to form dilution 0. As in the previous section, to begin an α = 10-fold dilution
series, $\alpha^{-1} \alpha_0^{-1} V_0 = 1$ ml is taken from dilution 0 and then 9 ml of water is added to form
dilution 1, etc., up to dilution 6 (so there is a volume V = 10 ml in each dilution tube). Two
drops of u = 100 µl each are spread plated, placed at 36 ± 2 °C for 26 ± 2 h and in turn
inspected for CFU formation. Therefore we have J = 7, α0 = 4, α = 10, αp = 10/0.1 = 100
and D = 2. Figure 7 presents the results of our analyses of the log densities (log10(CFU + 1))
for each of the control and bleach treatments, and also the LRs.
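The design constants quoted for the two examples can be recovered from the lab volumes. The sketch below assumes the relations α0 = V0/V, α = V/(transfer volume) and αp = V/u; these are inferred here from the numbers given in Sections 3.1 and 3.2 rather than quoted from a definition, and the function name is hypothetical.

```python
def dilution_design(v0_ml, v_ml, transfer_ml, drop_ml):
    """Return (alpha0, alpha, alpha_p) from volumes in ml (relations assumed for this sketch)."""
    alpha0 = v0_ml / v_ml        # harvest volume relative to the dilution tube volume
    alpha = v_ml / transfer_ml   # dilution factor between consecutive tubes
    alpha_p = v_ml / drop_ml     # tube volume relative to one plated drop
    return alpha0, alpha, alpha_p


# Section 3.1 (drop plate): V0 = 10 ml, V = 10 ml, 1 ml transferred, 10 microliter drops.
print(dilution_design(10.0, 10.0, 1.0, 0.01))
# Section 3.2 (spread plate): V0 = 40 ml, V = 10 ml, 1 ml transferred, 100 microliter drops.
print(dilution_design(40.0, 10.0, 1.0, 0.1))
```

In both cases V0/(α α0) returns the 1 ml transfer volume, matching the description of how the dilution series is started.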
Unlike other fields of science where there is a demonstrable lack of reproducibility
assessments, in the field of antimicrobial science many organizations have pooled resources to
quantifiably assess the reproducibility, across different laboratories, of methods that assess

Figure 7: Inter-lab analysis of results for sodium hypochlorite treatment: individual $E_l$s
for 3 labs not considering the hierarchical model (blue) and the global E (black) for the
hierarchical model in (a) control, (b) treatment and (c) LR.

antimicrobial efficacy (Parker et al., 2018b). The reproducibility of test results of
antimicrobials is of paramount importance for public health: regulators want to keep ineffective
products out of the marketplace, and producers of highly effective products want to bring
their products to market. All stakeholders want laboratory methods that accurately and
reproducibly make decisions regarding which products are effective.
The reproducibility issue seems apparent in this example since the LRs
for this treatment, as seen in Figures 7(b) and (c), span 4 to 5 orders of magnitude. In a
classical setting an analysis of variance (ANOVA) is performed on the dilution series data
from multiple labs, but that inherits the problems of the log-normal model already mentioned.
In our setting, a formal reproducibility question may be put forward and then assessed with
a posterior distribution. A first choice would be to compare the models for the mean log
abundance El for each individual lab versus the full hierarchical model for the global mean
log abundance E, calculating the posterior distribution of each case, for example comparing
the individual lab LR’s LRl = El0 − El vs. the global LR LR = E0 − E; see Table 1 for a
tentative discussion. This approach is one possibility, but a more in-depth analysis is needed
to address other reproducibility questions that stakeholders may phrase. The best approach
is to establish the necessary posterior probabilities and, to provide guidance for the
decisions to be taken, to maximize posterior expected utilities. Our Bayesian setting opens
up these possibilities, but not without further effort.

Table 1: Further investigating the reproducibility issues in our inter-laboratory example.
According to our hierarchical posterior, the global log reduction LR is expected to be within
1.7 logs of the individual LRl s for each lab (row 2). That is, the pooled hierarchical model
seems to work well; however, its posterior distribution spans more than 5 orders of magnitude,
as seen in Figure 7(c). This lab to lab inconsistency is also suggested by the individual
posterior P(LRl > 3) (row 3), spanning from close to zero to 1/2, while the inter-lab posterior
probability is in fact P[LR > 3] = 0.0646.

Laboratories       l = 1     l = 2     l = 3
E[|LR − LRl|]      1.6443    1.4053    1.2654
P(LRl > 3)         0.5348    0.0014    0.2257

4 Discussion
A key aspect of our model is the use of a binomial likelihood (Bi(N0, ·)) with the parameter of
interest being N0, the abundance of microbes in the original sample. We prefer the binomial
likelihood because it models what the technicians actually do in the lab. This is in stark
contrast to other common approaches (e.g., the log-normal and Poisson approaches) that
provide only an approximation, and markedly different from the case where the statistician
imposes an experimental design solely for convenience of the statistical analysis.
This work provides contributions to both the statistical and applied sciences. To the
former, we show how to apply a Bayesian hierarchical model with a binomial likelihood
to estimate and quantify uncertainty about microbial abundances (N0) from dilution series
experiments. The binomial likelihood has a rich history in analyzing microbiological data in
the case when, instead of CFUs, only a presence/absence response is measured from each
tube in a dilution series, from which N0 can be estimated using MLE theory (Cochran, 1950;
Garthright, 1993; McCrady, 1915). Interestingly, this MLE approach is called the “most probable
number” technique, coined by McCrady in 1915 before MLE theory had been developed
(Hamilton, 2011). To our knowledge, this is the first time that CFU data have
been modeled with a binomial likelihood. Regarding our contribution to the applied sciences,
we provide a sound alternative to the log-normal and Poisson modeling approaches that are
commonly applied in the analysis of CFU data (Haas et al., 1999; Hamilton et al., 2013). We
have shown that these common approaches have serious deficiencies when modeling CFUs.
For example: in the Poisson case, the restriction on the variance does not hold for CFUs

collected from control conditions (that tend to exhibit high abundances and low variability);
and in the log-normal case, it is not possible to deal with 0’s and TNTC’s (without ad hoc
substitution rules). The Negative binomial or Generalized Poisson models have been used to
extend the Poisson model (Joe and Zhu, 2005). Instead, we present a Bayesian approach with
a binomial likelihood that allows us to directly estimate the abundance of microbes without
a severe constraint on the variance like the Negative binomial or Generalized Poisson would
impose. Unlike these approaches, our approach can directly incorporate data from multiple
dilutions, directly analyze log reductions from the CFUs recovered from control and treated
samples that have different levels of variability, and account for miscounts (recently a shifted
Poisson model has been suggested to also handle miscounts; Ben-David and Davidson, 2014).
Furthermore, while there is readily available software for application of mixed effects models
with either a log-normal or Poisson likelihood to assess reproducibility (see, e.g., Bates et al.,
2015), we are not aware of similar extensions for the Negative binomial or Generalized Poisson
approaches. We show how our approach may be applied for reproducibility assessments,
by comparing the posterior for each individual lab to the posterior for the population of all
labs. While this is an area of active research, the results we have presented seem promising.
A point not directly addressed in this paper is the analysis and comparison of multiple
treatments, since here we concentrated on the novel Bayesian binomial approach using
hierarchies for multiple repetitions and multiple laboratories for a single treatment. The
usual objective in analyzing a series of treatments is to fit a regression model to test and/or
predict their effect (see for example Wahlen et al., 2016). One of the main benefits of performing
a Bayesian inference is that we may ask as many questions of interest as we require
regarding our parameter of interest, and these are answered with the corresponding posterior
probability. A simple initial approach, which exploits our Bayesian analysis, would be
to calculate the actual posterior probability of the desired result, and compare such posterior
probabilities across treatments. For example, commonly in microbiological experiments
we are interested in the number of surviving microbes after treatment. This translates into
calculating an ‘activation probability’ of a treatment intended to kill microbial activity, i.e.,
P(E < eh | Y = y) for eh some (small) agreed threshold. If compared to a control E0, then
we could calculate the posterior probability of a log reduction above some threshold, that is
P(LR = E0 − E > lh | Y = y, Y0 = y0), etc.; see Figure 8 for an example. This approach
lacks the formal predictive approach of fitting a regression model using covariates.

Figure 8: “Activation Probability” P(E < eh | Y = y) with threshold eh = 2, i.e., 100 CFUs,
for the data described in Section 3.1. Coupons were treated at different temperatures and
times; at 65 °C the log reduction was not achieved even over a 90 min exposure, whereas at
80 °C the Activation Probability is over 0.8 with an exposure time of only 2 min. There is
a clear difference between 70 °C and 75 °C, where similar results are obtained at 20 min at
75 °C and only after 60 min at 70 °C.

Although not trivial to do, a regression model may indeed be incorporated using a
Bayesian approach, embedded in our hierarchical model and binomial likelihood, in a
multi-experiment multi-lab scenario. These interesting possibilities are the focus of future research.
In general, using a Bayesian approach opens the door to a formal Uncertainty Quantification
(UQ) approach to the analysis of dilution series data, by exploiting the posterior distribution
obtained and the ease in calculation of expectations or posterior probabilities given the
MCMC sample obtained.
Instances of 0’s and TNTCs relate directly to the lower and upper limits of detection
(LOD) for the process used to generate CFUs (Currie, 1968). The process includes how the
dilution experiment was conducted such as the dilution factors (α0 and α), which dilutions
were plated (J), and the volume plated (u). The process also includes properties about the
method used to harvest the bacteria from the sample (e.g., sonication, scraping, stomaching
or a RODAC plate), the method used to disaggregate/homogenize the bacteria into single
cells in the original volume V0 and the environmental conditions used to incubate the bacteria
after plating. In microbiology, it is common to refer to either 0.5/Sc or 1/Sc as the LOD
of a CFU counting method. This is unfortunate since neither of these values is necessarily
associated with a high level of statistical confidence or probability regarding how many

viable microbes survive in the sample. In our analyses (Figure 4) we defined the LOD of the
process used to generate CFUs as the lowest microbial abundance that can survive in the
sample with probability 0.95 when there were no CFUs recovered from the sample. From the
Bayesian analyses that we present, this calculation is straightforward: the LOD is simply
the 95th percentile from the posterior distribution for N0 | Y = 0; see Figures 3(a) and 4.
Our hierarchical model can also easily and consistently provide the LOD as a function of
the dilution experiment design (i.e., J, α0, α and αp) and the number of repetitions K by using
the posterior for $10^E - 1$. For example, in Figure 4 we see the expected result that the
LOD decreases as the number of repetitions increases. The LOD for a particular experimental
design can be calculated beforehand, at minimal computational cost, to assess whether the proposed
design meets any LOD requirements.
Regarding the LOD or any other characteristic desired in an experiment, an intensive
search can be conducted to optimize the design parameters in Bayesian analyses (Christen
and Buck, 1998; Huan and Marzouk, 2014; Weaver et al., 2016). This, however, involves
calculations of far higher cost, commonly needing parallel computing and other sophisticated
software and numerical analysis resources. Given a design, parameters are simulated from
the prior and in turn synthetic data sets from the model, to calculate the corresponding
posterior for quantities of interest and so assess the expected information gain in data from the
design. This then needs to be repeated over a set of candidate designs to find an optimal
design. We leave this interesting dilution experiment design possibility for future research.

Acknowledgments
JAC is partially funded by CONACYT CB-2016-01-284451, RDECOMM and ONRG grants.
AEP is partially funded by NSF grant 1516951. The authors gratefully acknowledge the
industrial associates of the CBE, Lindsey Lorenz and Professor Emeritus Martin Hamilton.

A Details on the hierarchical model


Our hierarchical model may be summarized with the Directed Acyclic Graph (DAG) depicted
in Figure 9(a). As mentioned in the main text, using the following well known
result we may integrate out the $N_j^k$; j = 0, 1, . . . , J − 1. That is, if Z ∼ Bi(n, p1) and

X | Z = z ∼ Bi(z, p2) then X ∼ Bi(n, p1 p2). Using it recursively, (1) becomes, for
j = 0, 1, . . . , J − 1,

$$Y_{j,i}^k \mid N_0^k = n_0^k, q \sim \mathrm{Bi}\left(n_0^k,\ \alpha^{-j} \alpha_p^{-1} \alpha_0^{-1} (1 - q)\right) \tag{8}$$

and the corresponding DAG is as shown in Figure 9(b). This substantiates equation (6)
provided earlier. Since considering $N_0^k = 0$ is important and may indeed happen, note that
it must be the case that $P(Y_{j,i}^k = 0 \mid N_0^k = 0, q) = 1$. Accordingly we define Bi(0, p) as
Dirac’s delta at zero.
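The binomial thinning result used above can be verified numerically by summing over the intermediate variable. The following sketch uses only the standard library; the values n = 40, p1 = 0.1 and p2 = 0.3 are arbitrary and the function names are illustrative.

```python
from math import comb


def binom_pmf(x, n, p):
    """Binomial pmf; for n = 0 it is the point mass at x = 0."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)


def thinned_pmf(x, n, p1, p2):
    """P(X = x) when Z ~ Bi(n, p1) and X | Z = z ~ Bi(z, p2), summing over z."""
    return sum(binom_pmf(z, n, p1) * binom_pmf(x, z, p2) for z in range(x, n + 1))


n, p1, p2 = 40, 0.1, 0.3
max_gap = max(
    abs(thinned_pmf(x, n, p1, p2) - binom_pmf(x, n, p1 * p2)) for x in range(n + 1)
)
print("largest absolute difference between the two pmfs:", max_gap)
```

The difference is at the level of floating point rounding, which is the numerical counterpart of X ∼ Bi(n, p1 p2).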

Note that, strictly speaking, $\log_{10}(N_0^k + 1)$ is a discrete variable and we are modeling
it here as a continuous r.v. The binomial model may be changed to accept a non-integer
“number of trials” $N_0^k$ using gamma functions in the combination function, having as a
particular case the conventional binomial pmf (this we do in our implementation, providing
no further details here). Then for mathematical convenience $\log_{10}(N_0^k + 1)$ is modeled as a
continuous r.v. while $N_0^k$ is still taken as discrete.
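One way to write a binomial expression that accepts a non-integer number of trials is to replace the factorials in the binomial coefficient with Gamma functions. The sketch below is an assumption about how such an extension might look, not the implementation in the supplemental code; the function name is hypothetical.

```python
from math import comb, exp, lgamma, log, log1p


def log_binom_pmf_real_n(y, n, p):
    """Log of the binomial expression with real-valued n >= y and 0 < p < 1.

    The coefficient n! / (y! (n - y)!) is written with Gamma functions, so for
    integer n this reduces to the conventional binomial log pmf.
    """
    if n < y:
        return float("-inf")
    log_coef = lgamma(n + 1.0) - lgamma(y + 1.0) - lgamma(n - y + 1.0)
    return log_coef + y * log(p) + (n - y) * log1p(-p)


# Integer n: agrees with the usual pmf.
print(exp(log_binom_pmf_real_n(3, 20, 0.1)), comb(20, 3) * 0.1**3 * 0.9**17)
# Non-integer n: a smooth interpolation between neighbouring integer values.
print(exp(log_binom_pmf_real_n(3, 20.5, 0.1)))
```

Note that for non-integer n the values over y need not sum exactly to one; the expression is used here only as a likelihood in n.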
Moreover, by ignoring the hierarchy involving E and A we may calculate the posterior
distribution of each $N_0^k$ independently. In this case, since it is a single discrete parameter,
calculating its posterior pmf is straightforward, constructing a likelihood from (??). For
illustration purposes and comparisons this is done in Section 2.2 and in Figure 6(a). We call
this the free posterior for $N_0^k$, the microbial abundance in the kth repetition.
The full set of parameters is $E, A, N_0^1, N_0^2, \ldots, N_0^K$. Only binomial and Gamma densities
are involved and therefore the corresponding likelihood is simple to write and calculate. The
likelihood, since repetitions are exchangeable, is

$$f_{\mathbf{Y} \mid E, A, N_0^1, N_0^2, \ldots, N_0^K}(\mathbf{y} \mid e, a, n_0^1, n_0^2, \ldots, n_0^K) = \prod_{k=1}^{K} \left\{ \prod_{i=1}^{D} f_{Y_i^k \mid N_0^k}(y_i^k \mid n_0^k) \right\} f_{N_0^k \mid E, A}(n_0^k \mid e, a) \tag{9}$$

where $f_{Y_i^k \mid N_0^k}(y_i^k \mid n_0^k)$ is the corresponding binomial pmf stated in (8) and $f_{N_0^k \mid E, A}(n_0^k \mid e, a)$
arises from (5); Y and y are the r.v.’s of observed CFU counts arranged in K × D matrices
such that $\mathbf{Y} = (Y_i^k)$ and $\mathbf{y} = (y_i^k)$. $f_{N_0^k \mid E, A}(n_0^k \mid e, a)$ is established using the fact that if
$\log_{10}(Z) \sim \mathrm{Ga}(a, b)$ then $f_Z(z) = \frac{b^{-a}}{\Gamma(a)} (\log_{10}(z))^{a-1} z^{-(b \log(10))^{-1} - 1}$.


Figure 9: (a) Directed Acyclic Graph (DAG) representing our hierarchical model and (b)
DAG representing our model once the $N_j^k$s have been integrated out, depending now only
on the $N_0^k$s. Circle nodes are r.v.’s to be estimated, circle and square nodes are r.v.’s that
are observables (the CFU counts).

B Multidilution data analysis


Figure 10: Expected posterior distribution when all dilution counts are considered (blue)
and when only the first dilution counts are considered (green), for (a) drop plate design,
with true value N0 = 500, and (b) spread plate design, with true value N0 = 50.

We present results on whether it is worth analyzing CFU counts over all dilutions vs. over
only the first countable dilution. This question is difficult to answer in complete generality.
We therefore concentrate on two common designs that correspond to the data in Sections 3.1
and 3.2. Namely, α0 = 1, α = 10, αp = 1000, J = 6, c = 30 for the drop plate data in
Section 3.1 and α0 = 4, α = 10, αp = 100, J = 6, c = 300 for the spread plate data in
Section 3.2.
We study the extreme case when the first dilution can be counted, that is, all counts are
below the TNTC threshold c, and there is only one repetition (K = 1). We fix a true value
for $N_0^1$ and simulate data at all dilutions using the binomial model in (8). Then we calculate
the discrete posterior pmf of $N_0^1$ and repeat the process, averaging the resulting posteriors
over 120 simulated data sets. The posterior is calculated both when only the
first dilution is used and when all dilution counts are considered.
For the true value of the abundance $N_0^1$ we considered only a set of representative
values for both designs, below a maximum to maintain expected simulated counts below c.
Namely, $N_0^1$ = 500, 5,000, 10,000, 20,000, 30,000 for the drop plate data design and $N_0^1$ =
50, 500, 1,000, 2,000, 3,000 for the spread plate data design. The posteriors with the most
noticeable differences, although still quite low, were at $N_0^1 = 500$ and $N_0^1 = 50$, respectively;
we present these posteriors in Figure 10. In all other cases the differences in posteriors were
even smaller (not shown), suggesting that the added difficulty in counting, recording and
analyzing CFUs at all dilutions is not worth the expected information gain, and therefore
we recommend only to record and analyze the first countable dilution.
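The simulation study described above can be sketched for the drop plate design as follows. This is not the authors' code: it assumes a flat prior on N0 over {0, ..., n_max}, treats drop counts as independent given N0 (as in the product form of the likelihood), and all function names, the grid limit `n_max` and the random seed are choices made here for illustration.

```python
import numpy as np


def log_factorials(n_max):
    """Table of log(n!) for n = 0, ..., n_max."""
    return np.concatenate(([0.0], np.cumsum(np.log(np.arange(1, n_max + 1)))))


def simulate_counts(n0, J, D, alpha, alpha_p, alpha0=1.0, q=0.05, rng=None):
    """Simulate a J x D array of drop counts from the marginal binomial model (8)."""
    rng = np.random.default_rng() if rng is None else rng
    p = np.array([alpha ** (-j) / (alpha_p * alpha0) * (1.0 - q) for j in range(J)])
    counts = rng.binomial(n0, np.repeat(p[:, None], D, axis=1))
    return counts, p


def posterior_pmf(counts, p, n_grid, log_fact):
    """Posterior pmf of N0 over n_grid (flat prior) given counts[j, i] with success prob p[j]."""
    log_post = np.zeros(n_grid.shape, dtype=float)
    for j in range(counts.shape[0]):
        for y in counts[j]:
            term = np.full(n_grid.shape, -np.inf)  # N0 < y is impossible
            ok = n_grid >= y
            n_ok = n_grid[ok]
            term[ok] = (log_fact[n_ok] - log_fact[y] - log_fact[n_ok - y]
                        + y * np.log(p[j]) + (n_ok - y) * np.log1p(-p[j]))
            log_post += term
    weights = np.exp(log_post - log_post.max())
    return weights / weights.sum()


rng = np.random.default_rng(2)
n_max, n_sim = 3000, 120
n_grid = np.arange(n_max + 1)
log_fact = log_factorials(n_max)
avg_first = np.zeros(n_grid.size)
avg_all = np.zeros(n_grid.size)
for _ in range(n_sim):
    counts, p = simulate_counts(500, J=6, D=10, alpha=10.0, alpha_p=1000.0, rng=rng)
    avg_first += posterior_pmf(counts[:1], p[:1], n_grid, log_fact) / n_sim
    avg_all += posterior_pmf(counts, p, n_grid, log_fact) / n_sim

# Total variation distance between the two averaged posteriors.
tv_distance = 0.5 * np.abs(avg_first - avg_all).sum()
print("posterior means:", n_grid @ avg_first, n_grid @ avg_all, "TV distance:", tv_distance)
```

Changing the design constants to α0 = 4, αp = 100 and the true value to 50 gives the analogous comparison for the spread plate design.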
The reason that the posteriors are so similar can be seen in Figure 3. The likelihood
for TNTCs in dilution j becomes flat and provides no information for values higher than
$2c\,\alpha^j \alpha_p$. If there are countable drops for dilution j + 1, the likelihood for these data is placed
at α-fold higher values, well beyond $2c\,\alpha^j \alpha_p$, and therefore including the TNTC likelihood
in dilution j will not have any significant effect on the posterior, at least for the common
case where α = 10. Hence, including the censored likelihood terms for TNTCs will not add
substantial information and does not justify the added computational cost.

References
ASTM (2013a). An interlaboratory study was conducted by eight laboratories testing three
samples of varying targeted results to establish a precision statement for test method
e2871. ASTM International, West Conshohocken, PA, USA,
https://www.astm.org/DATABASE.CART/RESEARCH_REPORTS/RR-E35-1008.htm.

ASTM (2013b). Standard test method for evaluating disinfectant efficacy against
pseudomonas aeruginosa biofilm grown in cdc biofilm reactor using single tube method. ASTM
International, West Conshohocken, PA, USA, https://www.astm.org/Standards/E2871.htm.

Bates, D., Mächler, M., Bolker, B., and Walker, S. (2015). Fitting linear mixed-effects models
using lme4. Journal of Statistical Software, 67(1):1–48.

Ben-David, A. and Davidson, C. E. (2014). Estimation method for serial dilution experi-
ments. Journal of Microbiological Methods, 107:214–221.

Christen, J. and Fox, C. (2010). A general purpose sampling algorithm for continuous
distributions (the t-walk). Bayesian Analysis, 5(2):263–282.

Christen, J. A. and Buck, C. E. (1998). Sample selection in radiocarbon dating. Journal of
the Royal Statistical Society: Series C (Applied Statistics), 47:543–557.

Clarke, J. (1998). Evaluation of censored data to allow statistical comparisons among very
small samples with below detection limit observations. Environmental Science and
Technology, 32:177–183.

Cochran, W. G. (1950). Estimation of bacterial densities by means of the “most probable
number”. Biometrics, 6:263–282.

Cundell, T. (2015). The limitations of the colony-forming unit in microbiology. European
Pharmaceutical Review, 6.

Currie, L. (1968). Limits for qualitative detection and quantitative determination. Anal.
Chem., 40(586).

EPA (1996). Guidance for data quality assessment. Office of Research and Development,
US Environmental Protection Agency.

FDA (2018). BAM 4: Enumeration of Escherichia coli and the Coliform Bacteria. US Food
and Drug Administration,
https://www.fda.gov/food/laboratory-methods-food/bam-4-enumeration-escherichia-coli-and-coliform-bacteria.

Garthright, W. E. (1993). Bias of the logarithm of microbial density estimates from serial
dilutions. Biometrical Journal, 35(3):299–314.

Geyer, C. (1992). Practical markov chain monte carlo. Statistical Science, 7(4):473–511.

Haas, C., Rose, J., and Gerba, C. (1999). Quantitative Microbial Risk Assessment.

Haas, C. N. and Scheff, P. A. (1990). Estimation of averages in truncated samples.
Environmental Science and Technology, 24:912–919.

Hamilton, M., Hamilton, G., Goeres, D., and Parker, A. (2013). Guidelines for the statistical
analysis of a collaborative study of a laboratory disinfectant product performance test
method. Journal of AOAC International, 96(5):1138–1151.

Hamilton, M. A. (2011). The P/N formula for the log reduction when using a semi-
quantitative disinfectant test of type SQ1. Center for Biofilm Engineering.

Hamilton, M. A. and Parker, A. E. (2010). Enumerating viable cells by pooling counts for
several dilutions. Center for Biofilm Engineering.

Hedges, A. J. (2002). Estimating the precision of serial dilutions and viable bacterial counts.
International Journal of Food Microbiology, 76:207–214.

Herigstad, B. R. and Hamilton, M. (2001). How to optimize the drop plate method for
enumerating bacteria. Journal of Microbiological Methods, 44(2):121–129.

Horvitz, D. G. and Thompson, D. J. (1952). A generalization of sampling without
replacement from a finite universe. Journal of the American Statistical Association, 47:663–685.

Huan, X. and Marzouk, Y. M. (2014). Gradient-based stochastic optimization methods
in Bayesian experimental design. International Journal for Uncertainty Quantification,
4(6):479–510.

Joe, H. and Zhu, R. (2005). Generalized Poisson distribution: the property of mixture
of Poisson and comparison with negative binomial distribution. Biometrical Journal,
47(2):219–229.

Kass, R. E. and Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical
Association, 90(430):773–795.

Maturin, L. and Peeler, J. T. (1998). Chapter 3 – aerobic plate count, section: Conventional
plate count method. In Council, B., editor, FDA Bacteriological Analytical Manual. US
Food and Drug Administration.

McCrady, M. H. (1915). The numerical interpretation of fermentation-tube results.
Journal of Infectious Diseases, 17:183–212.

Niemela, S. (1983). Statistical evaluation of results from quantitative microbiological
examinations. Nordic Committee on Food Analysis. Ord & Form AB, Uppsala.

Niemi, M. M. and Niemela, S. I. (2001). Measurement uncertainty in microbiological
cultivation methods. Accreditation and Quality Assurance, 6:372–375.

Parker, A., Pitts, B., Lorenz, L., and Stewart, P. (2018a). Polynomial accelerated solutions
to a large Gaussian model for imaging biofilms: in theory and finite precision. Journal of
the American Statistical Association, 113(524):1431–1442.

Parker, A. E., Hamilton, M. A., and Goeres, D. M. (2018b). Reproducibility of antimicrobial
test methods. Scientific Reports, 8(1):12531.

Parkhurst, D. F. and Stern, D. A. (1998). Determining average concentrations of
Cryptosporidium and other pathogens in water. Environmental Science and Technology,
32:3424–3429.

Pitts, B. and Stewart, P. (2008). Confocal laser microscopy on biofilms: Successes and
limitations. Microscopy Today, 16(4):18–21.

Prescott, L., Harley, J., and Kline, D. (1996). Microbiology. Wm. C. Brown Publishers, third
edition.

Rice, E., Baird, R., and Eaton, A. (2017). Standard Methods for the Examination of Water
and Wastewater. American Public Health Association, American Water Works Association,
Water Environment Federation, 23rd edition.

Singh, A. and Nocerino, J. (2002). Robust estimation of mean and variance using
environmental data sets with below detection limit observations. Chemometrics and
Intelligent Laboratory Systems, 60:69–86.

Wahlen, L., Parker, A., Walker, D., Pasmore, M., and Sturman, P. (2016). Predictive
modeling for hot water inactivation of planktonic and biofilm-associated Sphingomonas
parapaucimobilis to support hot water sanitization programs. Biofouling, 32(7).

Weaver, B. P., Williams, B. J., Anderson-Cook, C. M., and Higdon, D. M. (2016).
Computational enhancements to Bayesian design of experiments using Gaussian processes.
Bayesian Analysis, 11:191–213.

Zuur, A., Ieno, E., Walker, N., Saveliev, A., and Smith, G. (1999). Mixed Effects Models
and Extensions in Ecology with R.
