0% found this document useful (0 votes)

2 views

Bayesian Computation and Model Selection without

The document discusses a novel approach to Bayesian computation and model selection that does not rely on likelihoods, specifically through the use of Approximate Bayesian Computation (ABC) and General Linear Models (GLM). The authors propose a reformulation of regression adjustment techniques to enhance model selection and apply this methodology to study population subdivision in western chimpanzees. This work aims to improve the efficiency and accuracy of Bayesian inference in complex models where likelihood functions are difficult to compute analytically.

Uploaded by

Anna Mucha

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

2 views

Bayesian Computation and Model Selection without

Uploaded by

Anna Mucha

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 32

Genetics: Published Articles Ahead of Print, published on September 28, 2009 as 10.1534/genetics.109.

109058

Bayesian Computation and Model Selection without

Likelihoods

Christoph Leuenberger1,2 and Daniel Wegmann1,3

1
Both authors contributed equally to this work
2
Département de mathématiques, Université de Fribourg, Switzerland
3
Computational and Molecular Population Genetics Laboratory, Institute of Ecology and Evolution,
University of Bern, Switzerland

1
Running Head: Bayesian Computation without Likelihoods

Keywords: Approximate Bayesian Computation, General Linear Model, Bayes Factor, Popula-
tion Genetics, Pan troglodytes verus

Corresponding author:
Christoph Leuenberger
Département de mathématiques,
Université de Fribourg,
Fribourg,
Switzerland
Phone +41 26 300 9192
Fax +41 26 300 9744
Email: [email protected]

2
ABSTRACT

Until recently, the use of Bayesian inference was limited to a few cases because for many
realistic probability models the likelihood function cannot be calculated analytically. The situation
changed with the advent of likelihood-free inference algorithms, often subsumed under the term
Approximate Bayesian Computation (ABC). A key innovation was the use of a post-sampling
regression adjustment, allowing larger tolerance values and as such shifting computation time to
realistic orders of magnitude. Here we propose a reformulation of the regression adjustment in
terms of a General Linear Model (GLM). This allows the integration into the sound theoretical
framework of Bayesian statistics and the use of its methods, including model selection via Bayes
factors. We then apply the proposed methodology to the question of population subdivision among
western chimpanzees Pan troglodytes verus.

3
INTRODUCTION

With the advent of ever more powerful computers and the refinement of algorithms like MCMC
or Gibbs sampling, Bayesian statistics has become an important tool for scientific inference during
the past two decades. Consider a model M creating data D (DNA sequence data, for example)
determined by parameters θ from some (bounded) parameter space Π ⊂ Rm whose joint prior
density we denote by π(θ). The quantity of interest is the posterior distribution of the parameters
which can be calculated by Bayes rule

π(θ|D) = c · fM (D|θ)π(θ),

where fM (D|θ) is the likelihood of the data and c is a normalizing constant. Direct use of this
formula, however, is often prevented by the fact that the likelihood function cannot be calculated
analytically for many realistic probability models. In these cases one is obliged to use stochastic
simulation. TAVARÉ et al. (1997) propose a rejection sampling method for simulating a posterior
random sample where the full data D is replaced by a summary statistic s (like the number of
segregating sites in their setting). Even if the statistic is not sufficient for D – that is, the statistic
does not capture the full information contained in the data –, rejection sampling allows for the
simulation of approximate posterior distributions of the parameters in question (the scaled mutation
rate in their model). This approach was extended to multiple-parameter models with multivariate
summary statistics s = (s1 , . . . , sn )T by W EISS and VON H AESELER (1998). In their setting
a candidate vector θ of parameters is simulated from a prior distribution and is accepted if its
corresponding vector of summary statistics is sufficiently close to the observed summary statistics
sobs with respect to some metric in the space of s, i.e. if dist(s, sobs ) < for a fixed tolerance .
We suppose that the likelihood fM (s|θ) of the full model is continuous and non-zero around sobs .
In practice the summary statistics are often discrete but the range of values is large enough to be
approximated by real numbers. The likelihood of the truncated model M (sobs ) obtained by this

4
acceptance-rejection process is given by

Z
f (s|θ) = Ind(s ∈ B (sobs )) · fM (s|θ) · ( fM (s|θ)ds)−1 (1)
B

where B = B (sobs ) = {s ∈ Rn |dist(s, sobs ) < } is the -ball in the space of summary statistics
and Ind(·) is the indicator function. Observe that f (s|θ) degenerates to a (Dirac) point measure
centered at sobs as → 0. If the parameters are generated from a prior π(θ) then the distribution of
the parameters retained after the rejection process outlined above is given by

R
π(θ) fM (s|θ)ds
π (θ) = R RB . (2)
Π
π(θ) B
fM (s|θ)dsdθ

We shall call this density the truncated prior. Combining 1 and 2 we get

Thus the posterior distribution of the parameters under the model M for s = sobs given the prior
π(θ) is exactly equal to the posterior distribution under the truncated model M (sobs ) given the
truncated prior π (θ). If we can estimate the truncated prior and make an educated guess for a
parametric statistical model of M (sobs ), we arrive at a reasonable approximation of the posterior
π(θ|sobs ) even if the likelihood of the full model M is unknown. It is to be expected that due to
the localization process the truncated model will exhibit a simpler structure than the full model M
and thus be easier to estimate.

Estimating π (θ) is straightforward, at least when the summary statistics can be sampled from
M in a reasonable amount of time: Sample the parameters from the prior π(θ), create their re-
spective statistics s from M and save those parameters whose statistics lie in B (sobs ) in a list
P = {θ 1 , . . . , θ N }. The empirical distribution of these retained parameters yields an estimate
of π (θ). If the tolerance is small then one can assume that fM (s|θ) is close to some (un-

5
known) constant over the whole range of B (sobs ). Under that assumption, formula (3) shows that
π(θ|sobs ) ≈ π (θ). However, when the dimension n of summary statistics is high – and for more
complex models dimensions like n = 50 are not unusual –, the “curse of dimensionality” implies
that the tolerance must be chosen rather large or else the acceptance rate becomes prohibitively low.
This, however, distorts the precision of the approximation of the posterior distribution by the trun-
cated prior (see W EGMANN et al. (2009)). This situation can be partially alleviated by speeding
up the sampling process; such methods are subsumed under the term approximate Bayesian com-
putation (ABC). M ARJORAM et al. (2003) develop a variant of the classical Metropolis-Hastings
algorithm (termed ABC-MCMC in S ISSON et al. (2007)) which allows them to sample directly
from the truncated prior π (θ). In S ISSON et al. (2007) a sequential Monte Carlo sampler (ABC–
PRC) is proposed requiring substantially less iterations than ABC–MCMC. But even when such
methods are applied, the assumption that fM (s|θ) is constant over the -ball is a very rough one,
indeed.
In order to take into account the variation of fM (s|θ) within the -ball, a post-sampling regres-
sion adjustment (ABC-REG) of the sample P of retained parameters is introduced in the important
article by B EAUMONT et al. (2002). Basically, they postulate a (locally) linear dependence between
the parameters θ and their associated summary statistics s. More precisely, the (local) model they
implicitly assume is of the form θ = Ms + m0 + , where M is a matrix of regression coefficients,
m0 a constant vector and a random vector of zero mean. Computer simulations suggest that for
many population models ABC–REG yields posterior marginal densities that have narrower HPD
(highest posterior density) regions and are more closely centered around the true parameter val-
ues than the empirical posterior densities directly produced by ABC–samplers (W EGMANN et al.,
2009). An attractive feature of ABC–REG is that the posterior adjustment is performed directly
on the simulated parameters which makes estimation of the marginal posteriors of individual pa-
rameters particularly easy. The method can also be extended to more complex, non–linear models
as demonstrated e.g. in B LUM and F RANCOIS (2009). In extreme situations, however, ABC-REG
may yield posteriors that are non-zero in parameter regions where the priors actually vanish (see

6
Figure 1B for an illustration of this phenomenon). Moreover, it is not clear how ABC-REG could
yield an estimate of the marginal density of model M at sobs , an information that is useful for
model comparison.

In contrast to ABC-REG we treat the parameters θ as exogenous and the summary statistics s as
endogenous variables and we stipulate for M (sobs ) a General Linear Model (abbreviated as GLM
in the literature – not to be confused with the Generalized Linear Models which unfortunately share
the same abbreviation.) To be precise, we assume the summary statistics s created by the truncated
model’s likelihood f (s|θ) to satisfy

s|θ = Cθ + c0 + , (4)

where C is a n × m-matrix of constants, c0 a n × 1-vector and a random vector with a

multivariate normal distribution of zero mean and covariance matrix Σs :

∼ N (0, Σs ).

A GLM has the advantage to take into account not only the (local) linearity, but also the strong
correlation normally present between the components of the summary statistics. Of course, the
model assumption (4) can never represent the full truth since its statistics are in principle unbounded
whereas the likelihood f (s|θ) is supported on the -ball around sobs . But since the multivariate
Gaussians will fall off rapidly in practice and not reach far out off the boundary of B (sobs ) this is
a disadvantage we can live with. In particular, the OLS-estimate outlined below implies that for
→ 0 the constant c0 tends to sobs whereas the design matrix C and the covariance matrix Σs both
vanish. This means that in the limit of zero tolerance = 0 our model assumption yields the true
posterior distribution of M.

7
THEORY SECTION

In this section we describe the above methodology – referred to as ABC-GLM in the following
– in more detail. The basic two-step procedure of ABC-GLM may be summarized as follows:

GLM1 Given a model M creating summary statistics s and given a value of observed summary
statistics sobs , create a sample of retained parameters θ j , j = 1, . . . , N , with aid of some
ABC-sampler (ABC-REJ, ABC-MCMC or ABC-PRC) based on a prior distribution π(θ)
and some choice of the tolerance > 0.

GLM2 Estimate the truncated model M (sobs ) as a General Linear Model and determine, based
on the sample θ j , from the truncated prior π (θ) an approximation to the posterior π(θ|sobs )
according to formula (3).

Let us look more closely at these two steps:

GLM1: ABC-sampling. We refer the reader to M ARJORAM et al. (2003) and S ISSON et al.
(2007) for details concerning ABC-algorithms and to M ARJORAM and TAVARÉ (2006) for a com-
prehensive review of computational methods for genetic data analysis. In practice, the dimension of
the summary statistics is often reduced by a Principal Components Analysis (PCA). PCA also has a
certain de-correlation effect. A more sophisticated method of reducing the dimension of summary
statistics, based on Partial Least Squares (PLS), is described in W EGMANN et al. (2009). In a re-
cent preprint, VOGL et al. (2009) propose a Box-Cox-type transformation of the summary statistics
which makes the likelihood close to multivariate Gaussian. This transformation might be especially
efficient in our context as we assume normality of the error terms in our model assumption.

To fix the notation, let P = {θ 1 , . . . , θ N } be a sample of vector-valued parameters created

by some ABC-algorithm simulating from some prior π(θ), and S = {s1 , . . . , sN } the sample of
associated summary statistics produced by the model M. Each parameter θ j is an m-dimensional
column vector θ j = (θj , . . . , θm
j t
) and each summary statistics an n-dimensional column vector
sj = (sj1 , . . . , sjn )t ∈ B (sobs ). The samples P and S can thus be viewed as m × N - and n × N -
matrices P and S, respectively.

8
The empirical estimate of the truncated prior π (θ) is given by the discrete distribution that puts
a point mass of 1/N on each value θ j ∈ P. We smooth out this empirical distribution by placing a
sharp Gaussian peak over each parameter value θ j . More precisely, we set

N
1 X
π (θ) = φ(θ − θ j , Σθ ), (5)
N j=1

where
1 j −1 j
− 21 (θ −θ )t Σθ (θ −θ )
φ(θ − θ j , Σθ ) = e
|2πΣθ |1/2

and
Σθ = diag(σ1 , . . . , σm )

is the covariance matrix of φ which determines the width of the Gaussian peaks. The larger the
number N of sampled parameter values, the sharper the peaks can be chosen in order to still get a
rather smooth π . If the parameter domain Π is normalized to [0, 1]m , say, then a reasonable choice
is σk = 1/N . Otherwise, σk should be adapted to the parameter range of the parameter component
θk . Too small values of σk will result in wiggly posterior curves, too large values might unduly
smear out the curves. The best advice is to run the calculations with several choices for Σθ . If
π induces a correlation between parameters, a non-diagonal Σθ might be beneficial. In practice,
however, the posterior estimates are most sensitive to the diagonal values of Σθ .

GLM2: General Linear Model. As explained in the introduction, we assume the truncated
model M (sobs ) to be normal linear, i.e. the random vectors s satisfy (4). The covariance matrix
Σs encapsulates the strong correlations normally present between the components of the summary
statistics. C, c0 and Σs can be estimated by standard multivariate regression analysis (OLS) from
.
the sample P, S created in step GLM1.1 To be specific, set X = (1..Pt ), where 1 is an N ×1-vector
1
Strictly speaking, one must redo an ABC-sample from uniform priors over Π in order to get an unbiased estimate
of the GLM if the prior π(θ) is not uniform already. On the other hand, ordinary least squares estimators are quite
insensitive to the prior’s influence. In practice, one can as well use the sample P to do the estimate. We applied
both estimation methods to the toy models presented in the simulation section of this article and found no significant
difference between the estimated posteriors. The same holds true for the so-called Feasible Generalized Least Squares
(FGLS) estimator, see G REENE (2003). In this two-stage algorithm the covariance matrix is first estimated as in our

9
of 1’s. C and c0 are determined by the usual least squares estimator

.
(ĉ0 ..Ĉ) = SX(Xt X)−1 ,

and for Σs we have the estimate

1
Σ̂s = R̂t R̂, (6)
N −m
.
where R̂ = St − X · (ĉ0 ..Ĉ)t are the residuals. The likelihood for this model – dropping the
hats on the matrices to unburden the notation – is given by

−1
f (s|θ) = |2πΣs |−1/2 · e− 2 (s−Cθ −c0 ) Σs (s−Cθ −c0 ) .
1 t
(7)

An exhaustive treatment of linear models in a Bayesian (econometric) context is given in Zell-

ner’s book (Z ELLNER, 1971).

Recall from (3) that for a prior π(θ) and an observed summary statistic sobs , the parameter’s
posterior distribution for our full model M is given by

π(θ|sobs ) = c · f (sobs |θ)π (θ). (8)

where f (sobs |θ) is the likelihood of the truncated model M (sobs ) given by (7) and π (θ) is the
estimated (and smoothed) truncated prior given by (5).
Performing some matrix algebra (see Appendix A) one can show that the posterior (8) is – up
to a multiplicative constant – of the form N 1
P
i=j exp(− 2 Qj ) where

setting but in a second round the design matrix C is newly estimated. When we applied FGLS to our toy models
we found a difference in the estimated matrices only after the eighth significant decimal. FGLS is a more efficient
estimator only when the sample sizes are relatively small as is often the case in economical data sets but not in ABC
situations. In theory, both OLS and FGLS are consistent estimators but FGLS is more efficient.

10
Qj = (θ − tj )t T−1 (θ − tj ) + . . .

. . . + (sobs − c0 )t Σ−1
s (sobs − c0 ) + . . .

. . . + (θ j )t Σ−1 j j t j
θ θ − (v ) Tv .

Here T, tj and vj are given by

−1
T = Ct Σ−1 −1
s C + Σθ (9)

and tj = Tvj , where

vj = Ct Σ−1 −1 j
s (sobs − c0 ) + Σθ θ . (10)

From this we get

N
c(θ j )e− 2 (θ −t θ −tj ) ,
1 j )t T−1 (
X
π(θ|sobs ) ∝ (11)
j=1

where

j 1 j t −1 j j t j

c(θ ) = exp − (θ ) Σθ θ − (v ) Tv . (12)
2

When the number of parameters exceeds two, graphical visualization of the posterior distribu-
tion becomes impractical and marginal distributions must be calculated. The marginal posterior
density of the parameter θk is defined by

Z
π(θk |s) = π(θ|s)dθ −k ,
Rm−1

where integration is performed along all parameters except θk .

Recall that the marginal distribution of a multivariate normal N (µ, Σ) with respect to the k-th
component is the univariate normal density N (µk , σk,k ). Using this fact, it is not hard to show that

11
the marginal posterior of parameter θk is given by

N
!
X (θk − tjk )2
π(θk |sobs ) = a · c(θ j ) exp − . (13)
j=1
2τk,k

where τk,k is the k-th diagonal element of the matrix T, tjk is the k-th component of the vector
tj , and c(θ j ) is still determined according to (12). The normalizing constant a could, in principle,
be determined analytically but is in practice more easily recovered by a numerical integration.
Strictly speaking, the integration should only be done over the bounded parameter domain Π and
not over the whole of Rm . But this no longer allows for an analytic form of the marginal posterior
distribution. For large values of N the diagonal elements in the matrix Σθ can be chosen so small
that the error is in any case negligible.

Model selection. The principal difficulty of model selection methods in non-parametric settings
is that it is nearly impossible to estimate the likelihood of M at sobs due to the high dimension of the
summary statistics (“curse of dimensionality”); see B EAUMONT (2007) for an approach based on
multinomial logit. Parametric models on the other hand lend themselves readily to model selection
via Bayes factors. Given the model M, one must determine the marginal density

Z
fM (sobs ) = f (sobs |θ)π(θ)dθ.
Π

It is easy to check from (1) and (2) that

Z
fM (sobs ) = A (sobs , π) · f (sobs |θ)π (θ)dθ.
Π

Here
Z Z
A (sobs , π) := π(θ) fM (s|θ)dsdθ (14)
Π B

is the acceptance rate p of the rejection process. It can easily be estimated with aid of ABC-
REJ: Sample parameters from the prior π(θ), create the corresponding statistics s from M and

12
count what fraction of the statistics fall into the -ball B centered at sobs .
If we assume the underlying model of M (sobs ) to be our GLM then the marginal density of
M at sobs can be estimated as follows:

N
A (sobs , π) X − 1 (sobs −mj )t D−1 (sobs −mj )
fM (sobs ) = e 2 (15)
N |2πD|1/2 j=1

where the sum runs over the parameter sample P = {θ 1 , . . . , θ N },

D = Σs + CΣθ Ct

and
mj = c0 + Cθ j .

For two models MA and MB with prior probabilities πA and πB = 1 − πA , the Bayes factor
BAB in favor of model MA over model MB is

fMA (sobs )
BAB = (16)
fMB (sobs )

where the marginal densities fMA and fMB are calculated according to (15). The posterior proba-
bility of model MA is
BAB πA
f (MA |sobs ) = .
BAB πA + πB

13
EXAMPLES FROM POPULATION GENETICS

Toy models. In Figure 1 we present the comparison of posteriors obtained with rejection sampling,
ABC-REG and ABC-GLM, with those determined analytically (“true posteriors”). As a toy model
we inferred the population-mutation parameter θ = 4N µ of a panmictic population model from the
number of segregating sites S of a sample of sequences with 10,000 bp for different observed values
and tolerance levels. Estimations are always based on 5000 simulations with dist(S, Sobs ) < , and
we report the average of 25 independent replications per data point. Estimation bias of the different
approaches was assessed by computing the total variation distance between the inferred posterior
and the true one obtained from analytical calculations using the likelihood function introduced by
WATTERSON (1975). Recall that the L1 -distance of two densities f (θ) and g(θ) is given by

Z
1
d1 (f, g) = |f (θ) − g(θ)|dθ
2

It is equal to 1 when f and g have disjoint supports and it vanishes when the functions are identical.
When we used a uniform prior θ ∼ Unif([0.005, 10]) (Figure 1A to C), both ABC-REG and
ABC-GLM give comparable results and improve the posterior estimation compared to the simple
rejection algorithm except for very low tolerance values where the rejection algorithm is expected
to be very close to the true posterior. The average total variation distances over all observed data sets
and tolerance values are 0.236, 0.130 and 0.091 for the rejection algorithm, ABC-REG and ABC-
GLM, respectively. Note that perfect matches between the approximate and the true posteriors are
difficult to obtain because all approximate posteriors depend on a smoothing step which may not
give accurate results close to borders of their supports. However, when we used a discontinuous
prior θ ∼ Unif([0.005, 3] ∪ [6, 10]) with an – admittedly extremely artificial – “gap” in the middle,
we observed a quite distinct pattern (Figure 1D to E). One clearly recognizes that posteriors inferred
with ABC-REG are frequently misplaced and often even further away from the true posterior (in
total variation distance) than the prior, especially for cases where the likelihood of the observed
data is maximal within the gap. The reason for this is that in the regression step of ABC-REG

14
parameter values may easily be shifted outside the prior support. This behavior of ABC-REG has
been observed earlier (B EAUMONT et al. (2002); TALLMON et al. (2004); E STOUP et al. (2004))
and as an ad hoc solution H AMILTON et al. (2006) proposed to transform the parameter values
x − a π −1
prior to the regression step by a transformation of the form y = −ln(tan( ) ) where a
b−a 2
and b are the lower and upper borders of the prior support interval. For more complex priors – like
the discontinuous prior used here – this transformation may not work. ABC-GLM is much less
affected by the “gap” prior than ABC-REG. The average total variation distance over all observed
data sets and tolerance values are 0.221, 0.246 and 0.094 for the rejection algorithm, ABC-REG
and ABC-GLM, respectively. Example posteriors with Sobs = 16 based on 5000 simulations with
dist(S, Sobs ) < 10 are shown in Figure 2.

The success of ABC-GLM depends on how well a General Linear Model fits the truncated
model M (sobs ). Under the null hypothesis that the fit is perfect the estimated residuals rj (see
(6)) are independently multivariate normally distributed random vectors. Hence the Mahalanobis
distances

dj = rtj Σ−1 2
s rj ∼ χn (17)

follow a χ2 -distribution with n degrees of freedom. As a quantification of model assessment

we propose to report the Kolmogorov-Smirnov test statistic for the empirical distribution of dj and
the reference χ2 -distribution. (Reporting p-values will be of little use in practice since the null
hypothesis does never hold exactly and hence the p-values will become very small due to the large
sample size.)
When the summary statistics are created from a General Linear Model the fit should be optimal.
This is indeed the case as the simulation results in Table 1 show. We performed 200 simulations of
randomly created General Linear Models with m = 3 parameters, n = 4 summary statistics and
a multivariate normal prior. The observed statistics were also created from the respective models.
For each simulated observed statistic and different acceptance rates p = 1.00, 0.50, 0.10, 0.05, 0.01
we calculated the approximate posterior distributions π , πREG , πGLM for the rejection algorithm,

15
ABC-REG and ABC-GLM, respectively. As the prior is multivariate normal, the true posterior
π0 can be analytically determined. Table 1 contains the means and standard deviations over the
200 simulations of the total variation distances of the approximate posteriors to the true posterior
π0 as well as the mean and standard deviations of the Kolmogorov-Smirnov test statistics for the
GLM model fit. As to be expected, the model fit is perfect (i.e. the K-S statistic is close to 0)
for acceptance rate p = 1. As the acceptance rate becomes lower the model fit deteriorates since
the truncated model of a GLM is no longer exactly a General Linear Model. The total variation
distance to the true posterior increases slightly as p gets smaller but the improved rejection posterior
π mostly outbalances the poorer model fit. As is to be expected in this ideal situation, ABC-GLM
as well as ABC-REG substantially improve the posterior estimation over the pure rejection prior.
In order to test the other extreme we performed 200 simulations for a non-linear one-parameter
model with uniformly rather than normally distributed error terms; the prior was again a normal
distribution. (The details of this toy model are described in Appendix B.) As Table 2 shows the
GLM model fit is already poor for an acceptance rate of p = 1.00 (KS statistic about 0.10) and
further deteriorates as p decreases. Note that the approximate posteriors πREG and πGLM are closer
to the true posterior in average than π and that both adjustment methods perform similarly. As
expected, the accuracy of the posteriors increases with smaller acceptance rates, despite the fact the
model fit within the –ball decreases. This suggests that the rejection step contributes substantially
to the estimation accuracy, especially when the true model is non-linear. We should mention that
in roughly 30% of the simulations both ABC-GLM and ABC-REG actually increased the distance
to the true posterior in comparison to the rejection posterior π . As a rule of thump we suggest that
posterior adjustments obtained by ABC-GLM or ABC-REG should not be trusted without further
validation if the Kolmogorov-Smirnov statistic for the GLM model fit exceeds a value of, say, 0.10.
In that case linear models are not sufficiently flexible to account for effects like non-linearity in the
parameters and non-normality and heteroscedasticity in the error terms. In the setting of ABC-REG
a wider class of models is introduced in B LUM and F RANCOIS (2009) where machine-learning
algorithms are applied for the parameter estimations. Whether these extensions can be applied in

16
our context remains to be seen. The advantage of the General Linear Model is that estimations
can be done with ordinary least squares and the important quantities like marginal posteriors and
marginal likelihoods can be obtained analytically. For more complex models these quantities will
probably only be accessible via numerical integration, Monte Carlo methods etc.

Application to Chimpanzees. In standard taxonomies, chimpanzees, the closest living relatives of

humans, are classified into two species: the common chimpanzee (Pan troglodytes) and the bonobo
(Pan paniscus). Both species are restricted to Africa and diverged roughly 9 MYA ago (W ON and
H EY, 2005; B ECQUET and P RZEWORSKI, 2007). The common chimpanzees are further subdi-
vided into three large populations or subspecies based on their separation by geographic barriers.
Among them, the western chimpanzees (P. troglodytes verus) form the most remote group. Inter-
estingly, recent multilocus studies found consistent levels of gene flow between the western and
the central (P. t. troglodytes) chimpanzees (W ON and H EY, 2005; B ECQUET and P RZEWORSKI,
2007). Nonetheless, a recent study of 310 microsatellites in 84 common chimpanzees supports a
clear distinction between the previously labeled populations (B ECQUET et al., 2007). Using a PCA
analysis, indication for substructure within the western chimpanzees was found in the same study.
To demonstrate the applicability of the model selection given in the Theory Section we contrast
two different models of the western chimpanzee population with this data set: a model of a single
panmictic population with constant size and a finite island model of constant size and constant
migration among demes. While we estimated θ = 2Ne µ, priors were set on Ne and µ separately
with log10 (Ne ) ∼ Unif([3, 5]) and µ ∼ N (5 · 10−4 , 2 · 10−4 ) truncated on µ ∈ [10−4 , 10−3 ]. In the
case of the finite island model, we had an additional prior npop ∼ Unif([10, 100]) on the number of
islands, and individuals were attributed randomly to the different islands.
We obtained genotypes for all 50 individuals reported to be of western chimpanzee origin from
the study of B ECQUET et al. (2007) excluding captive born hybrids. We checked the proposed
(B ECQUET et al., 2007) mutation pattern for each individual locus, and all alleles not matching
the assumed step-wise mutation model were set as missing data. A total of 265 loci were used,
after removing the loci on the X and Y chromosome as well as those being monomorphic among

17
the western chimpanzees. All simulations were performed using the software SIMCOAL2 (L AVAL
and E XCOFFIER, 2004) and we reproduced the pattern of missing data observed in the dataset.
Using the software package Arlequin3.0 (E XCOFFIER et al., 2005) we calculated two summary
statistics on the dataset: the average number of alleles per locus, K, and FIS , the fixation index
within the western chimpanzees. We performed a total of 100,000 simulations per model.
In Figure 3 we report the Bayes factor of the island model according to (16) for different accep-
tance rates A , see (14). While there is a large variation for very small acceptance rates, the Bayes
factor stabilizes for A ≥ 0.005. Note that A ≤ 0.005 corresponds to less than 500 simulations,
and that the ABC-GLM approach, based on a model estimation and a smoothing step, is expected
to produce poor results since the estimation of the model parameters is unreliable due to the small
sample size. The good news is that the Bayes factor is stable over a large range of tolerance values.
We may therefore safely reject the panmictic population model in favor of population subdivision
among western chimpanzees with a Bayes factor of B ≈ 105 .

18
DISCUSSION

Due to still increasing computational power it is nowadays possible to tackle estimation prob-
lems in a Bayesian framework for which analytical calculation of the likelihood is inhibited. In
such cases, Approximate Bayesian Computation is often the choice. A key innovation in speeding
up such algorithms was the use of a regression adjustment, termed ABC-REG in this note, which
used the frequently present linear relationship between generated summary statistics s and param-
eters of the model θ in a neighborhood of the observed summary statistics sobs (B EAUMONT et al.,
2002). The main advantage is that larger tolerance values still allow us to extract reasonable infor-
mation about the posterior distribution π(θ|s) and hence less simulations are required to estimate
the posterior density.
Here we present a new approach to estimate approximate posterior distributions, termed ABC-
GLM, similar in spirit to ABC-REG, but with two major advantages: Firstly, by using a GLM
to estimate the likelihood function, ABC-GLM is always consistent with the prior distribution.
Secondly, while we do not find the ABC-GLM approach to substantially outperform ABC-REG
in standard situations, it is naturally embedded into a standard Bayesian framework, which in turn
allows the application of well-known Bayesian methodologies such as model averaging or model
selection via Bayes factors. Our simulations show that the rejection step is especially beneficial if
the true model is non-linear for both ABC approaches. ABC-GLM is further compatible with any
type of ABC-sampler, including likelihood-free MCMC (M ARJORAM et al., 2003) or Population
Monte Carlo (B EAUMONT and C ORNUET J-M, 2009). Also, more complicated regression regimes
taking non-linearity or heteroscedacity into account may be envisioned when the GLM is replaced
by some more complex model. A great advantage of the current GLM-setting is its simplicity
which renders implementation in standard statistical packages feasible.
We showed the applicability of the model selection procedure via Bayes factors by opposing
two different models of population structure among the western chimpanzees Pan troglodytes verus.
Our analysis strongly suggests population substructure within the western chimpanzees since an
island model is significantly favored over a model of a panmictic population. While none of our

19
simple models is thought to mimic the real setting exactly, we still believe that they capture the main
characteristics of the demographic history influencing our summary statistics, namely the number
of alleles K and the fixation index FIS . While the observed FIS of 2.6% has been attributed to
inbreeding previously (B ECQUET et al., 2007), we propose that such values may easily arise if
diploid individuals are sampled in a randomly scattered way over a large, substructured population.
While it was almost impossible to simulate the value FIS = 2.6% in the model of a panmictic
population, it easily falls within the range of values obtained from an island model.

20
ACKNOWLEDGMENTS

We are grateful to Laurent Excoffier, David J. Balding, Christian P. Robert and the anonymous
referees for their useful comments on a first draft of this manuscript. This work has been supported
by a grant of the Swiss National Foundation No 3100A0-112072 to Laurent Excoffier.

21
APPENDIX A: PROOFS OF THE MAIN FORMULAE

To keep the paper self-contained, we present a proof of formulae (11) and (15). The argument
is an adaptation from the proof of Lemma 1 in L INDLEY and S MITH (1972). By linearity it clearly
suffices to show the formulae for one fixed sampled parameter θ j . The results then follows from

Theorem. Suppose that, given the parameter vector θ, the distribution of the statistics vector s
is multivariate normal,
s ∼ N (Cθ + c0 , Σs ),

and, given the fixed parameter vector θ j , the distribution of the parameter θ is

θ ∼ N (θ j , Σθ ).

Then:

1. The distribution of θ given s is

θ|s ∼ N (Tvj , T)

−1
where T = Ct Σ−1 −1
s C + Σθ and vj = Ct Σ−1 −1 j
s (s − c0 ) + Σθ θ .

2. The marginal distribution of s is

s ∼ N (mj , D)

where mj = c0 + Cθ j and D = Σs + CΣθ Ct .

Proof. By Bayes’ Theorem

π(θ|s) ∝ f (s|θ)π(θ).

22
The product on the right hand side is of the form exp(− 21 Q) where

Q = (s − c0 − Cθ)t Σs−1 (s − c0 − Cθ) + (θ − θ j )t Σ−1 j

θ (θ − θ )

= θ t (Ct Σ−1 −1 t −1 j t −1
s C + Σθ )θ − 2((s − c0 ) Σs Cθ + (θ ) Σθ )θ + . . .

· · · + (s − c0 )t Σ−1 j t −1 j
s (s − c0 ) + (θ ) Σθ θ

= θ t T−1 θ − 2(vj )t θ + (s − c0 )t Σ−1 j t −1 j

s (s − c0 ) + (θ ) Σθ θ

= (θ − Tvj )t T−1 (θ − Tvj ) − (vj )t Tvj + . . .

· · · + (θ j )t Σ−1 j t −1
θ θ + (s − c0 ) Σs (s − c0 ).

In the last step we completed the square with respect to θ and used the fact that T is symmetric.
Up to a constant which does not depend on θ j we hence get

j 1 j t −1 j
π(θ|s) ∝ c(θ ) exp − ((θ − Tv ) T (θ − Tv ))
2

where c(θ j ) = exp(− 12 ((θ j )t Σ−1 j j t j

θ θ − (v ) Tv )). This proves the first part of the theorem and –

by linear superposistion – the validity of formula (11).

To prove the second part of the theorem, observe that s = Cθ + c0 + with ∼ N (0, Σs ) and
θ = θ j + η with η ∼ N (0, Σθ ). Putting these equalities together, we get

s ∼ Cθ j + c0 + Cη + .

This, being a linear combination of independent multivariate normal variables, is still multivariate
normal with mean Cθ j + c0 and its covariance matrix is given by

E[(Cη + )(Cη + )t ] = E[Cη(Cη)t + t ] = CE[ηη t ]Ct + E[t ] = CΣθ Ct + Σs .

This proves the second part of the theorem as well as formula (15). 2

23
APPENDIX B: NONLINEAR TOY MODELS

In this section we describe a class of toy models that are non-linear in the parameter θ ∈ R and
have non-normal, possibly heteroscedastic error terms. Still their likelihoods are easy to calculate
analytically. We set    
 f1 (θ)   1 (θ) 
 .   . 
s = f (θ) + (θ) =  .  +  .. .
 .   
   
fn (θ) n (θ)

Here fi (θ) are monotonically increasing continuous functions of θ and i (θ) are independent, uni-
formly distributed error terms in the interval [−ui (θ), ui (θ)] ⊆ R where ui (θ) are non-decreasing,
continuous functions:
i (θ) ∼ Unif([−ui (θ), ui (θ)]).

It is straightforward to check that for a prior π(θ) the posterior distribution of θ given s = (s1 , . . . , sn )t
is (up to a normalizing constant)

1
π(θ|s) ∝ Ind([θmin , θmax ])π(θ)
u1 (θ) · . . . · un (θ)

where
θmin = max{gi−1 (si )}, θmax = min{h−1
i (si )}
i i

and
gi (θ) = fi (θ) + ui (θ), hi (θ) = fi (θ) − ui (θ).

For the simulations in Table 2 we chose n = 5, f1 (θ) = · · · = f5 (θ) = θ3 and u1 (θ) = · · · =

u5 (θ) ≡ 10. The prior was π(θ) ∼ N (0, 4).

24
REFERENCES

B EAUMONT, M., 2007 Simulations, Genetics, and Human Prehistory – a Focus on Islands.. Mc-
Donald Institute Monographs, Univ. of Cambridge, Cambridge UK.

B EAUMONT, M., and R. C. C ORNUET J-M, M ARIN J-M, 2009 Adaptivity for ABC Algorithms:
the ABC-PMC Scheme. Biometrika .

B EAUMONT, M., W. Z HANG, and D. BALDING, 2002 Approximate Bayesian Computation in

Population Genetics. Genetics 162: 2025–2035.

B ECQUET, C., N. PATTERSON, A. S TONE, M. P RZEWORSKI, and D. R EICH, 2007 Genetic Struc-
ture of Chimpanzee Populations. Genome Research 17: 1505–1519.

B ECQUET, C., and M. P RZEWORSKI, 2007 A new Approach to Estimate Parameters of Speciation
Models with Application to Apes. Genome Research 17: 1505–1519.

B LUM , M., and O. F RANCOIS, 2009 Non-linear Regression Models for Approximate Bayesian
Computation. Stat. Comput. (in press) .

E STOUP, A., M. B EAUMONT, F. S ENNEDOT, C. M ORITZ, and J. M. C ORNUET, 2004 Genetic

Analysis of Complex Demographic Scenarios: Spatially Expanding Populations of the Cane
Toad, Bufo marinus. Evolution 58: 2021–36. 0014-3820 Journal Article.

E XCOFFIER , L., G. L AVAL, and S. S CHNEIDER, 2005 Arlequin (version 3.0): an Integrated Soft-
ware Package for Population Genetics Data Analysis. Evolutionary Bioinformatics Online 1:
47–50.

G REENE , W., 2003 Econometric Analysis (5th ed.). Pearson Education Inc., New Jersey.

H AMILTON , G., M. S TONEKING, and L. E XCOFFIER, 2006 Molecular Analysis Reveals Tighter
Social Regulation of Immigration in Patrilocal Populations than in Matrilocal Populations. Proc
Natl Acad Sci USA 102: 7476–7480.

25
L AVAL , G., and L. E XCOFFIER, 2004 Simcoal 2.0: a Program to Simulate Genomic Diversity over
Large Recombining Regions in a Subdivided Population with a Complex History. Bioinformatics
20: 2485–2487.

L INDLEY, D., and A. S MITH, 1972 Bayes Estimates for the Linear Model. J. R. Statist. Soc. B 34:
1–44.

M ARJORAM , P., J. M OLITOR, V. P LAGNOL, and S. TAVARÉ, 2003 Markov Chain Monte Carlo
Without Likelihoods. Proc Natl Acad Sci USA 100: 15324–15328.

M ARJORAM , P., and S. TAVARÉ, 2006 Modern Computational Approaches for Analysing Molec-
ular Genetic Variation Data. Nat. Rev. Genet. 10: 759–770.

S ISSON , S., Y. FAN, and M. TANAKA, 2007 Sequential Monte Carlo Without Likelihoods. Proc
Natl Acad Sci USA 104: 1760–1765.

TALLMON , D. A., G. L UIKART, and M. A. B EAUMONT, 2004 Comparative Evaluation of a New

Effective Population Size Estimator Based on Approximate Bayesian Computation. Genetics
167: 977–88. Journal Article Research Support, Non-U.S. Gov’t Research Support, U.S. Gov’t,
Non-P.H.S. United States.

TAVARÉ , S., D. BALDING, R. G RIFFITHS, and P. D ONNELLY, 1997 Inferring Coalescence Times
from DNA Sequence Data. Genetics 145: 505–518.

VOGL , C., C. F UTSCHIK, and C. S CHLOETTERER, 2009 Approximate Inference of Population

Demography Using a Gaussian Likelihood with Transformed Summary Statistics. submitted .

WATTERSON , G., 1975 Number of Segregating Sites in Genetic Models without Recombination.
Theo. Pop. Biol. 7: 256–276.

W EGMANN , D., C. L EUENBERGER, and L. E XCOFFIER, 2009 Approximate Bayesian Computa-

tion and Markov Chain Monte Carlo. Genetics .

26
W EISS , G., and A. VON H AESELER, 1998 Inference of Population History Using a Likelihood
Approach. Genetics 149: 1539–1546.

W ON , Y., and J. H EY, 2005 Divergence Population Genetics of Chimpanzees. Mol. Biol. Evol.
22: 297–307.

Z ELLNER , A., 1971 An Introduction to Bayesian Inference in Econometrics.. Wiley, New York.

27
FIGURE LEGENDS

Figure 1. Comparison of rejection (A and D), ABC-REG (B and E) and ABC-GLM (C and
F) posteriors with those obtained from analytical likelihood calculations. We estimated the
population–mutation parameter θ = 4N µ of a panmictic population for different observed num-
bers of segregating sites, see text. Shades indicate the L1 -distance between inferred and analyt-
ically calculated posterior. White corresponds to an exact match (zero distance) and darker grey
shades indicate larger distances. If the inferred posterior differs from the analytical more than the
prior does, squares are marked in black. The upper row (A to C) corresponds to cases with a
uniform prior θ ∼ Unif([0.005, 10]), the lower row (D to F) to cases with a discontinuous prior
θ ∼ Unif([0.005, 3] ∪ [6, 10]) with “gap”. The tolerance is given as the absolute distance in num-
ber of segregating sites. Shown are averages over 25 independent estimations. In order to have a
fair comparison, we adjusted the smoothing parameters (bandwidths) such as to get the best results
for both approaches.

Figure 2. Example posteriors for uniform (A) and discontinuous priors (B). The model is the
same as in Figure 1. Posterior estimates using ABC-GLM and ABC-REG for Sobs = 16 were
based on 5000 simulations with dist(S, Sobs ) < 10. ABC-REg posteriors were smoothed with a
Bandwidth of 0.4, the width of the Dirac peaks in the ABC-GLM approach was set to 10−5 .

Figure 3. Bayes factor for the island relative to the panmictic population model for different
acceptance rates (logarithmic scale). For very low acceptance rates we observe large fluctuations
whereas the Bayes factor is quite stable for larger values. Note that A ≤ 0.005 corresponds to
≤ 500 simulations, too small a sample size for robust statistical model estimation.

28
TABLES

Table 1. Mean and standard deviation of the L1 –distance between inferred and expected posteriors
for randomly generated GLMs with NP = 3, NS = 4 (prior N (0, 0.22 ), 200 simulations).

pa d1 (π0 , π ) d1 (π0 , πREG ) d1 (π0 , πGLM ) KS statisticsb

1.00 0.51 ± 0.22 0.15 ± 0.10 0.01 ± 0.001 0.004 ± 0.001
0.50 0.42 ± 0.19 0.13 ± 0.10 0.02 ± 0.008 0.007 ± 0.003
0.10 0.29 ± 0.18 0.13 ± 0.11 0.03 ± 0.01 0.02 ± 0.01
0.05 0.24 ± 0.16 0.13 ± 0.12 0.03 ± 0.01 0.03 ± 0.01
0.01 0.21 ± 0.17 0.15 ± 0.14 0.05 ± 0.02 0.06 ± 0.02

a
Acceptance rate as a fraction.
b
KS statistic describing the linear model fit (see text).

Table 2. Mean and standard deviation of the L1 –distance between inferred and expected posteriors
for the uniform errors model (see Appendix) with NP = 1, NS = 5 (prior N (0, 22 ), error unif
[−10, 10] , 200 simulations).

pa d1 (π0 , π ) d1 (π0 , πREG ) d1 (π0 , πGLM ) KS statisticsb

1.00 0.56 ± 0.24 0.49 ± 0.25 0.46 ± 0.29 0.09 ± 0.01
0.50 0.40 ± 0.30 0.36 ± 0.28 0.37 ± 0.27 0.12 ± 0.01
0.10 0.38 ± 0.28 0.35 ± 0.26 0.34 ± 0.23 0.14 ± 0.03
0.05 0.34 ± 0.29 0.33 ± 0.27 0.32 ± 0.23 0.14 ± 0.02
0.01 0.29 ± 0.23 0.26 ± 0.22 0.26 ± 0.18 0.16 ± 0.03

a
Acceptance rate as a fraction.
b
KS statistic describing the linear model fit (see text).

IDQP Standards February 2018
No ratings yet
IDQP Standards February 2018
32 pages
1992 Smith Gelfand
No ratings yet
1992 Smith Gelfand
6 pages
Covid Vaccine Certificate
No ratings yet
Covid Vaccine Certificate
1 page
Bayesian Computation and Model Selection Without Likelihoods
No ratings yet
Bayesian Computation and Model Selection Without Likelihoods
10 pages
Approximate Bayesian Computation Using Asymptotically Normal Point Estimates
No ratings yet
Approximate Bayesian Computation Using Asymptotically Normal Point Estimates
38 pages
Iterative and Non-Iterative Simulation Algorithms
No ratings yet
Iterative and Non-Iterative Simulation Algorithms
6 pages
MCMC With Temporary Mapping and Caching With Application On Gaussian Process Regression
No ratings yet
MCMC With Temporary Mapping and Caching With Application On Gaussian Process Regression
16 pages
Using Early Rejection Markov Chain Monte Carlo and Gaussian Processes To Accelerate ABC Methods
No ratings yet
Using Early Rejection Markov Chain Monte Carlo and Gaussian Processes To Accelerate ABC Methods
33 pages
Lec8 MLE
No ratings yet
Lec8 MLE
35 pages
Barthelme EP2
No ratings yet
Barthelme EP2
58 pages
GLMConstrained
No ratings yet
GLMConstrained
11 pages
Point Estimation: Definition of Estimators
No ratings yet
Point Estimation: Definition of Estimators
8 pages
15.097: Probabilistic Modeling and Bayesian Analysis
No ratings yet
15.097: Probabilistic Modeling and Bayesian Analysis
42 pages
Talk On Regression Based Method For Bayesian Nonparanormal Graphical Models
No ratings yet
Talk On Regression Based Method For Bayesian Nonparanormal Graphical Models
40 pages
How to Make Pls Consistent
No ratings yet
How to Make Pls Consistent
6 pages
STAT 5392 hw6
No ratings yet
STAT 5392 hw6
16 pages
Variational Bayes
No ratings yet
Variational Bayes
38 pages
ECON 1630 Problem Set #2 Fall 2021: Bias Variance
No ratings yet
ECON 1630 Problem Set #2 Fall 2021: Bias Variance
9 pages
16-STS601
No ratings yet
16-STS601
3 pages
Inference For SDE Models Via Approximate Bayesian Computation
No ratings yet
Inference For SDE Models Via Approximate Bayesian Computation
27 pages
17-STS576REJ
No ratings yet
17-STS576REJ
3 pages
X400004_20220215_solutions
No ratings yet
X400004_20220215_solutions
8 pages
A Review Constructing Priors That Penalizes The Complexity of Gaussian Random Fields - Fuglstad Et Al
No ratings yet
A Review Constructing Priors That Penalizes The Complexity of Gaussian Random Fields - Fuglstad Et Al
8 pages
Gaussian Mixture Model: P (X - Y) P (Y - X) P (X)
No ratings yet
Gaussian Mixture Model: P (X - Y) P (Y - X) P (X)
3 pages
A Diffusion Process Perspective On Posterior Contraction Rates For Parameters
No ratings yet
A Diffusion Process Perspective On Posterior Contraction Rates For Parameters
36 pages
Fuskpaper Bayes
No ratings yet
Fuskpaper Bayes
51 pages
IntroBayesTimeSeries1
No ratings yet
IntroBayesTimeSeries1
72 pages
MCMC Bayes PDF
No ratings yet
MCMC Bayes PDF
27 pages
90206271d48a1ed0a3638e9d69bddc2117ee
No ratings yet
90206271d48a1ed0a3638e9d69bddc2117ee
10 pages
abc_slides
No ratings yet
abc_slides
68 pages
Lec16 SummarizingPosteriors BayesianModelSelection
No ratings yet
Lec16 SummarizingPosteriors BayesianModelSelection
59 pages
UNIT 3-Bayesian Statistics
No ratings yet
UNIT 3-Bayesian Statistics
80 pages
Solutions To The Exercises On The Bias-Variance Dilemma
No ratings yet
Solutions To The Exercises On The Bias-Variance Dilemma
8 pages
Rarefied Gas Dynamics - DSMC Course
No ratings yet
Rarefied Gas Dynamics - DSMC Course
50 pages
Machine Learning and Pattern Recognition Bayesian Complexity Control
No ratings yet
Machine Learning and Pattern Recognition Bayesian Complexity Control
4 pages
On Input Selection With Reversible Jump Markov Chain Monte Carlo Sampling
No ratings yet
On Input Selection With Reversible Jump Markov Chain Monte Carlo Sampling
10 pages
Statistics 202C Study Guide: Part I: Sampling Basic Unstructured Distributions and Monte Carlo Basics
No ratings yet
Statistics 202C Study Guide: Part I: Sampling Basic Unstructured Distributions and Monte Carlo Basics
14 pages
s-m-s-t-c--lecture-2425-3
No ratings yet
s-m-s-t-c--lecture-2425-3
61 pages
Approximate Bayesian Computation For Finite Mixture Models
No ratings yet
Approximate Bayesian Computation For Finite Mixture Models
21 pages
Model Selection/ Structure Learning Koller & Friedman Chapter 14 Mackay Chapter 28
No ratings yet
Model Selection/ Structure Learning Koller & Friedman Chapter 14 Mackay Chapter 28
49 pages
Linear - Regression
100% (1)
Linear - Regression
39 pages
2009 Paninsky Nonparametric estimation of entropy and distributions
No ratings yet
2009 Paninsky Nonparametric estimation of entropy and distributions
34 pages
Stats, Mle, and Other Stuff: 1 Sevssd
No ratings yet
Stats, Mle, and Other Stuff: 1 Sevssd
10 pages
Sorbom, Dag (1974)
No ratings yet
Sorbom, Dag (1974)
11 pages
i i 2 i 1 2 θ i 2 2 3 2
No ratings yet
i i 2 i 1 2 θ i 2 2 3 2
159 pages
Unit 5 - Machine Learning - WWW - Rgpvnotes.in
No ratings yet
Unit 5 - Machine Learning - WWW - Rgpvnotes.in
17 pages
Lecture Notes Statistics II PDF
No ratings yet
Lecture Notes Statistics II PDF
139 pages
Lec11 Introduction2BayesianStatistics
No ratings yet
Lec11 Introduction2BayesianStatistics
48 pages
slice
No ratings yet
slice
36 pages
Bayesian Monte Carlo: Carl Edward Rasmussen and Zoubin Ghahramani
No ratings yet
Bayesian Monte Carlo: Carl Edward Rasmussen and Zoubin Ghahramani
8 pages
3 Practical
No ratings yet
3 Practical
2 pages
Dalalyan - 2017 - Theoretical Guarantees For Approximate Sampling From Smooth and Log-Concave Densities
No ratings yet
Dalalyan - 2017 - Theoretical Guarantees For Approximate Sampling From Smooth and Log-Concave Densities
26 pages
Lab 8: Introduction To Winbugs: Goals
No ratings yet
Lab 8: Introduction To Winbugs: Goals
8 pages
Inference in a Class of Optimization Problems Confidence Regions and Finite Sample Bounds on Errors in Coverage Probabilities
No ratings yet
Inference in a Class of Optimization Problems Confidence Regions and Finite Sample Bounds on Errors in Coverage Probabilities
53 pages
Marginal Likelihood Estimation Via Power Posteriors: N. Friel
No ratings yet
Marginal Likelihood Estimation Via Power Posteriors: N. Friel
13 pages
Bayesian Uncertainty Quantification
No ratings yet
Bayesian Uncertainty Quantification
23 pages
Weatherwax Theodoridis Solutions
No ratings yet
Weatherwax Theodoridis Solutions
212 pages
Cram Er-Rao Lower Bound and Information Geometry: 1 Introduction and Historical Background
No ratings yet
Cram Er-Rao Lower Bound and Information Geometry: 1 Introduction and Historical Background
27 pages
Fifth Dimension: The Light to See
From Everand
Fifth Dimension: The Light to See
Marc E. King
No ratings yet
A-level Maths Revision: Cheeky Revision Shortcuts
From Everand
A-level Maths Revision: Cheeky Revision Shortcuts
Scool Revision
3.5/5 (8)
Lectures on Measure and Integration
From Everand
Lectures on Measure and Integration
Harold Widom
No ratings yet
Digital Signal and Image Processing using MATLAB, Volume 3: Advances and Applications, The Stochastic Case
From Everand
Digital Signal and Image Processing using MATLAB, Volume 3: Advances and Applications, The Stochastic Case
Gérard Blanchet
3/5 (1)
Jhered.esv095.Full
No ratings yet
Jhered.esv095.Full
11 pages
Tjepkes Umn 0130M 16753łosie
No ratings yet
Tjepkes Umn 0130M 16753łosie
80 pages
Finding Haplotype Tagging SNPs by Use of Principal Components Analysis
No ratings yet
Finding Haplotype Tagging SNPs by Use of Principal Components Analysis
12 pages
jcm-1529873 (1)
No ratings yet
jcm-1529873 (1)
14 pages
An introduction to repeatability estimation with rptR
No ratings yet
An introduction to repeatability estimation with rptR
27 pages
Kumar2022_Article_GeneticAdmixtureAndPopulationS
No ratings yet
Kumar2022_Article_GeneticAdmixtureAndPopulationS
12 pages
Correlated_variation_between_the_lateral
No ratings yet
Correlated_variation_between_the_lateral
11 pages
genes-14-02126
No ratings yet
genes-14-02126
11 pages
1-s2.0-S0006320722002026-main
No ratings yet
1-s2.0-S0006320722002026-main
9 pages
JAG2014
No ratings yet
JAG2014
10 pages
Comparison of SNP and microsatellite genotyping panels for spatial
No ratings yet
Comparison of SNP and microsatellite genotyping panels for spatial
8 pages
Differentiation of North African foxes and population genetic dynamics in the desert—insights into the evolutionary history
No ratings yet
Differentiation of North African foxes and population genetic dynamics in the desert—insights into the evolutionary history
25 pages
catttle diversity
No ratings yet
catttle diversity
85 pages
APA Style Reerence Citations
No ratings yet
APA Style Reerence Citations
4 pages
Toronto Star 1806
No ratings yet
Toronto Star 1806
37 pages
KB Rebar Brochure
100% (1)
KB Rebar Brochure
4 pages
Transpo Law Case Digest
No ratings yet
Transpo Law Case Digest
6 pages
H001HHL1525962_BHFL In-Principle Sanction Letter
No ratings yet
H001HHL1525962_BHFL In-Principle Sanction Letter
3 pages
Alloying Elements Excel
No ratings yet
Alloying Elements Excel
18 pages
Letter From James Pallotta
100% (2)
Letter From James Pallotta
2 pages
Bell Tech Introduction Brocher 1
No ratings yet
Bell Tech Introduction Brocher 1
4 pages
Owners & Installation Manual: Classic H2100M & I3100L Wood Fireplace Insert & Hearth Heater
No ratings yet
Owners & Installation Manual: Classic H2100M & I3100L Wood Fireplace Insert & Hearth Heater
32 pages
Quectel BG96 TCPIP at Commands Manual V1.1
No ratings yet
Quectel BG96 TCPIP at Commands Manual V1.1
44 pages
The Doctors' INN The Engineers' INN Physics II: Ch#21 Nuclear Physics Test
100% (1)
The Doctors' INN The Engineers' INN Physics II: Ch#21 Nuclear Physics Test
3 pages
Modul Grammar Basic Beserta Latihannya
No ratings yet
Modul Grammar Basic Beserta Latihannya
20 pages
Mirza _CV
No ratings yet
Mirza _CV
3 pages
26 Headache Disability Index
No ratings yet
26 Headache Disability Index
2 pages
Capacitacion Tecnico-Comercial Generac, Personal Ride
No ratings yet
Capacitacion Tecnico-Comercial Generac, Personal Ride
81 pages
DDC-QAQC-DMI24-ITP-001 - R00 Dimisa Steel Structure
No ratings yet
DDC-QAQC-DMI24-ITP-001 - R00 Dimisa Steel Structure
3 pages
Top Floor Publishing - Advanced Web Sites Made Easy
No ratings yet
Top Floor Publishing - Advanced Web Sites Made Easy
313 pages
Indo Uas Nyabuk Gunung Mendeley JADI
No ratings yet
Indo Uas Nyabuk Gunung Mendeley JADI
5 pages
Group6 Ebook Onboarding
No ratings yet
Group6 Ebook Onboarding
20 pages
Class Test 2 - Some Basic Concepts of Macroeconomics
No ratings yet
Class Test 2 - Some Basic Concepts of Macroeconomics
4 pages
Phrozen User_Guide_Onyx_Impact_Plus_EN
No ratings yet
Phrozen User_Guide_Onyx_Impact_Plus_EN
13 pages
AWP Reference Guide PDF
No ratings yet
AWP Reference Guide PDF
40 pages
Notes Technical Writing
No ratings yet
Notes Technical Writing
4 pages
The Relationship Between, Internal Control and Fraud: What Have We Learned From Parmalat?
No ratings yet
The Relationship Between, Internal Control and Fraud: What Have We Learned From Parmalat?
5 pages
RESUME2025
No ratings yet
RESUME2025
5 pages
Protocol Deviation Tracking Log Ver2!07!17-2015
No ratings yet
Protocol Deviation Tracking Log Ver2!07!17-2015
3 pages
W1 Orientation - PPT
No ratings yet
W1 Orientation - PPT
13 pages
Review Test: I: Choose The Word Which Has Different Sound in The Underlined Part in Each Line
No ratings yet
Review Test: I: Choose The Word Which Has Different Sound in The Underlined Part in Each Line
4 pages

Bayesian Computation and Model Selection without

Uploaded by

Bayesian Computation and Model Selection without

Uploaded by

Genetics: Published Articles Ahead of Print, published on September 28, 2009 as 10.1534/genetics.109.

Bayesian Computation and Model Selection without

Christoph Leuenberger1,2 and Daniel Wegmann1,3

where C is a n × m-matrix of constants, c0 a n × 1-vector and  a random vector with a

Let us look more closely at these two steps:

To fix the notation, let P = {θ 1 , . . . , θ N } be a sample of vector-valued parameters created

and for Σs we have the estimate

An exhaustive treatment of linear models in a Bayesian (econometric) context is given in Zell-

π(θ|sobs ) = c · f (sobs |θ)π (θ). (8)

Here T, tj and vj are given by

and tj = Tvj , where

From this we get

where integration is performed along all parameters except θk .

It is easy to check from (1) and (2) that

where the sum runs over the parameter sample P = {θ 1 , . . . , θ N },

follow a χ2 -distribution with n degrees of freedom. As a quantification of model assessment

Application to Chimpanzees. In standard taxonomies, chimpanzees, the closest living relatives of

1. The distribution of θ given s is

2. The marginal distribution of s is

where mj = c0 + Cθ j and D = Σs + CΣθ Ct .

Proof. By Bayes’ Theorem

Q = (s − c0 − Cθ)t Σs−1 (s − c0 − Cθ) + (θ − θ j )t Σ−1 j

= θ t T−1 θ − 2(vj )t θ + (s − c0 )t Σ−1 j t −1 j

= (θ − Tvj )t T−1 (θ − Tvj ) − (vj )t Tvj + . . .

where c(θ j ) = exp(− 12 ((θ j )t Σ−1 j j t j

by linear superposistion – the validity of formula (11).

E[(Cη + )(Cη + )t ] = E[Cη(Cη)t + t ] = CE[ηη t ]Ct + E[t ] = CΣθ Ct + Σs .

For the simulations in Table 2 we chose n = 5, f1 (θ) = · · · = f5 (θ) = θ3 and u1 (θ) = · · · =

B EAUMONT, M., W. Z HANG, and D. BALDING, 2002 Approximate Bayesian Computation in

E STOUP, A., M. B EAUMONT, F. S ENNEDOT, C. M ORITZ, and J. M. C ORNUET, 2004 Genetic

TALLMON , D. A., G. L UIKART, and M. A. B EAUMONT, 2004 Comparative Evaluation of a New

VOGL , C., C. F UTSCHIK, and C. S CHLOETTERER, 2009 Approximate Inference of Population

W EGMANN , D., C. L EUENBERGER, and L. E XCOFFIER, 2009 Approximate Bayesian Computa-

pa d1 (π0 , π ) d1 (π0 , πREG ) d1 (π0 , πGLM ) KS statisticsb

pa d1 (π0 , π ) d1 (π0 , πREG ) d1 (π0 , πGLM ) KS statisticsb

You might also like

where C is a n × m-matrix of constants, c0 a n × 1-vector and a random vector with a

π(θ|sobs ) = c · f (sobs |θ)π (θ). (8)

E[(Cη + )(Cη + )t ] = E[Cη(Cη)t + t ] = CE[ηη t ]Ct + E[t ] = CΣθ Ct + Σs .

pa d1 (π0 , π ) d1 (π0 , πREG ) d1 (π0 , πGLM ) KS statisticsb

pa d1 (π0 , π ) d1 (π0 , πREG ) d1 (π0 , πGLM ) KS statisticsb