Convergence Analysis of Multifidelity Monte Carlo Estimation
B. Peherstorfer
Department of Mechanical Engineering and Wisconsin Institute for Discovery, University of
Wisconsin-Madison, Madison, WI 53706
E-mail: [email protected]
M. Gunzburger
Department of Scientific Computing, Florida State University, 400 Dirac Science Library,
Tallahassee FL 32306-4120
E-mail: [email protected]
K. Willcox
Department of Aeronautics & Astronautics, MIT, Cambridge, MA 02139
E-mail: [email protected]
1 Introduction
Inputs to systems are often modeled as random variables to account for the
uncertainties in the inputs due to inaccuracies and incomplete knowledge.
Given the input random variable and a model of the system of interest, an
important task is to estimate statistics of the model output random variable.
Monte Carlo estimation is one popular approach to estimate statistics. Ba-
sic Monte Carlo estimation generates samples of the input random variable,
discretizes the model and then solves the discretized model—the high-fidelity
model—up to the required accuracy at these samples, and averages over the
corresponding outputs to estimate statistics of the model output random vari-
able. This basic Monte Carlo estimation often requires many samples, and
consequently many approximations of the model outputs, which can become
too costly if the high-fidelity model solves are expensive. We note that techniques
other than Monte Carlo estimation are available for estimating statistics of
model outputs; see, e.g., [1, 33, 21, 15, 14, 47, 43, 45].
Several variance reduction techniques have been presented to reduce the
costs of Monte Carlo simulation compared to basic Monte Carlo estimators,
e.g., antithetic variates [39, 23, 28] and importance sampling [39, 27, 36]. Our
focus here is on the control variate framework that exploits the correlation
between the model output random variable and an auxiliary random vari-
able that is cheap to sample [30]. A major class of control variate methods
derives the auxiliary random variable from cheap approximations of the out-
puts of the high-fidelity model. For example, in situations where the model is
governed by (often elliptic) partial differential equations (PDEs), coarse-grid
approximations of the PDE—low-fidelity models—can provide cheap approx-
imations of the outputs obtained from a fine-grid high-fidelity discretization
of the PDE; however, other types of low-fidelity models are possible in the
context of PDEs, e.g., projection-based reduced models [41, 40, 20, 3, 37], data-
fit interpolation and regression models [13, 12], machine-learning-based models
such as support vector machines [46, 11], and other simplified models [29, 32].
The multifidelity Monte Carlo (MFMC) method [38] uses a control variate
approach to combine auxiliary random variables stemming from low-fidelity
models into an estimator of the statistics of the high-fidelity model output.
Key to the MFMC approach is the selection of how often each of the auxiliary
random variables is sampled, and therefore how often each of the low-fidelity
models is solved. The MFMC approach derives this selection from the correla-
tion coefficients between the auxiliary random variables and the high-fidelity
model output random variable. The selection of the MFMC approach is opti-
mal in the sense that the variance of the MFMC estimator is minimized for
given maximal costs of the estimation. We refer to the discussions in [38, 31]
for details on MFMC.
The work [38] discusses the properties of MFMC estimation in a setting
where only mild assumptions on the high- and low-fidelity models are made.
We consider here the setting where we can make further assumptions on the
errors and costs of outputs obtained with a hierarchy of low- and high-fidelity
models. Our contribution is to show that for an MFMC estimator with mean-squared
error (MSE) below a threshold parameter ε > 0, the costs of the
estimation can be bounded by ε^{−1} up to a constant under certain conditions
on the error and cost bounds of the models in the hierarchy.
We discuss that the conditions we require in the MFMC context are similar
to the conditions exploited by the multilevel Monte Carlo method [9, Theo-
rem 1]. Our analysis shows that MFMC estimation is as efficient in terms of
error and costs as multilevel Monte Carlo estimation under certain conditions
that we discuss below in detail. Multilevel Monte Carlo uses a hierarchy of
low-fidelity models—typically coarse-grid approximations—to derive a hierar-
chy of auxiliary random variables, which are combined in a judicious way to
reduce the runtime of Monte Carlo simulation. Multilevel Monte Carlo was in-
troduced in [26] and extended and made popular by the work [18]. Since then,
the properties of the multilevel Monte Carlo estimators have been studied ex-
tensively in different settings, see, e.g., [9, 8, 2, 6, 42]. Multilevel Monte Carlo
and its variants have also been applied to density estimation [5], variance
estimation [4], and rare event simulation [44]. We also mention the continuation
multilevel Monte Carlo method [10] and the multi-index Monte Carlo extension
that allows different mesh widths in different dimensions [22]. In [34, 35], a fault-tolerant
multilevel Monte Carlo is introduced and analyzed, which is well suited for
massively parallel computations. An integer optimization problem is solved
to determine the optimal number of model evaluations depending on the rate
of compute-node failures. The fault-tolerant approach thus takes into account
node failure by adapting the number of model evaluations accordingly. The
relationship between multilevel Monte Carlo and sparse grid quadrature [7,
16, 17] is discussed in [24, 25, 19].
The outline of the presentation is as follows. Section 2 introduces the prob-
lem setup and basic, multilevel, and multifidelity Monte Carlo estimators. Sec-
tion 3 derives the new convergence analysis of MFMC estimation. Numerical
examples in Section 4 illustrate the derived bounds. Conclusions are drawn in
Section 5.
2 Problem setup
This section introduces the problem setup and the various types of Monte Carlo
estimators required throughout the presentation. Section 2.1 introduces the
notation and Section 2.2 the basic Monte Carlo estimator. Multilevel Monte
Carlo and the MFMC estimation are summarized in Section 2.3 and Sec-
tion 2.4, respectively.
2.1 Preliminaries
Assumption 2 states that there exist κ_3 ∈ R+, γ ∈ R+, and s > 1 such that
the costs w_ℓ of evaluating Q_ℓ satisfy

w_ℓ ≤ κ_3 s^{γℓ} , ℓ ∈ N .

We make the assumption that there exist positive lower and upper bounds
for the variance σ_ℓ² = Var[Q_ℓ(Z)] with respect to the level ℓ ∈ N.

Assumption 3 There exist σ_low ∈ R+ and σ_up ∈ R+ such that σ_low ≤ σ_ℓ ≤
σ_up for ℓ ∈ N.

The Pearson product-moment correlation coefficient of the random variables
Q_ℓ(Z) and Q_l(Z) is denoted as

ρ_{ℓ,l} = Cov[Q_ℓ(Z), Q_l(Z)] / (σ_ℓ σ_l) , ℓ, l ∈ N , (1)

where Cov[Q_ℓ(Z), Q_l(Z)] is the covariance of Q_ℓ(Z) and Q_l(Z).
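A sample estimate of (1) takes only a few lines. In the sketch below, a squared Gaussian and a noisy copy of it are stand-ins for Q_ℓ(Z) and Q_l(Z); they are purely illustrative assumptions, not outputs of the paper's models.

```python
import numpy as np

# Sample-based estimate of the correlation coefficient rho_{l,k} in (1).
# q_fine and q_coarse are toy stand-ins for two model outputs Q_l(Z), Q_k(Z);
# the quadratic map and the noise level are made up for illustration.
rng = np.random.default_rng(0)
z = rng.standard_normal(10_000)                       # samples of the input Z
q_fine = z ** 2                                       # stand-in for Q_l(Z)
q_coarse = z ** 2 + 0.1 * rng.standard_normal(z.size) # stand-in for Q_k(Z)

cov = np.cov(q_fine, q_coarse)[0, 1]                  # Cov[Q_l(Z), Q_k(Z)]
rho = cov / (q_fine.std(ddof=1) * q_coarse.std(ddof=1))
# sanity check against numpy's built-in correlation matrix
assert abs(rho - np.corrcoef(q_fine, q_coarse)[0, 1]) < 1e-12
```

Because the noise perturbing the second output is small relative to its variance, the estimated ρ is close to (but below) one, which is the regime where control variates pay off.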
We consider the situation where the random variable Z represents an input
random variable and Q is a function that maps an input, i.e., a realization
of Z, onto an output. In our situation, evaluating Q entails solving a PDE
(“model”), but exact solutions of the PDE are unavailable. We therefore resort to
solving an approximate PDE (“discretized model”), where the approximation
(e.g., the mesh width) is controlled by the level `. The functions Q` map the
input onto the output obtained by solving the approximate PDE on level `.
Assumption 1 specifies in which sense Q` converges to Q with ` → ∞. Solving
the approximate PDE on level ` incurs costs w` . One task in this context is to
derive estimators of E[Q(Z)] using the functions Q_ℓ. We assess the efficiency
of an estimator Q̂ with its MSE

e(Q̂) = E[(Q̂ − E[Q(Z)])²] ,
2.2 Basic Monte Carlo estimation

Let ℓ ∈ N and define the basic Monte Carlo estimator Q̂^MC_{ℓ,m} of E[Q_ℓ(Z)] as

Q̂^MC_{ℓ,m} = (1/m) Σ_{i=1}^{m} Q_ℓ(Z_i) ,

with m ∈ N samples Z_1, …, Z_m of Z. The MSE of the estimator is

e(Q̂^MC_{ℓ,m}) = m^{−1} Var[Q_ℓ(Z)] + (E[Q(Z) − Q_ℓ(Z)])² . (2)

The term m^{−1} Var[Q_ℓ(Z)] is the variance term and the term (E[Q_ℓ(Z) − Q(Z)])²
is the bias term. The costs of the estimator Q̂^MC_{ℓ,m} are

c(Q̂^MC_{ℓ,m}) = m w_ℓ .
Assuming the variance σ_ℓ² is approximately constant with respect to the level
ℓ, the costs of the basic Monte Carlo estimator Q̂^MC_{L,m} with MSE below the
threshold ε are

c(Q̂^MC_{L,m}) ≲ ε^{−1−γ/(2α)} ,

see [9, Section 2.1] for a proof. The costs of the basic Monte Carlo estimator
thus scale with the rates γ and α.
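The variance/bias split in (2) can be illustrated with a minimal sketch. The exponential toy integrand, the artificial level-dependent bias term, and all constants below are assumptions made for illustration; they are not the PDE setting of the paper.

```python
import numpy as np

# Minimal sketch of the basic Monte Carlo estimator Q^MC_{l,m}: average m
# evaluations of a level-l approximation Q_l. The toy "model" is exp(Z) plus
# an artificial bias ~ s^{-alpha*l}; it only illustrates the variance/bias
# split in the MSE (2).
def Q_level(z, level, s=2.0, alpha=1.0):
    # exact output exp(z) plus a deterministic level-l discretization bias
    return np.exp(z) + s ** (-alpha * level)

def basic_mc(level, m, rng):
    z = rng.standard_normal(m)           # m samples of the input Z ~ N(0, 1)
    return Q_level(z, level).mean()

rng = np.random.default_rng(1)
est = basic_mc(level=5, m=100_000, rng=rng)
exact = np.exp(0.5)                      # E[exp(Z)] for Z ~ N(0, 1)
# the error combines a sampling error ~ m^{-1/2} and the bias 2^{-5}
print(abs(est - exact))
```

Increasing m shrinks only the variance term; the bias term persists until the level is refined, which is exactly the trade-off the rates α and γ quantify.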
2.3 Multilevel Monte Carlo estimation

We follow [9] for the presentation of the multilevel Monte Carlo estimation.
Consider the threshold ε ∈ R+ and define the maximal level L ∈ N as in (3).
Multilevel Monte Carlo exploits the linearity of the expected value to write

E[Q_L(Z)] = E[Q_1(Z)] + Σ_{ℓ=2}^{L} E[Q_ℓ(Z) − Q_{ℓ−1}(Z)] = Σ_{ℓ=1}^{L} E[Δ_ℓ(Z)] ,
where Δ_ℓ(Z) = Q_ℓ(Z) − Q_{ℓ−1}(Z) for ℓ > 1 and Δ_1(Z) = Q_1(Z). The basic
Monte Carlo estimator of Δ_ℓ(Z) with m_ℓ ∈ N samples Z_1, …, Z_{m_ℓ} is

Δ̂^MC_{ℓ,m_ℓ} = (1/m_ℓ) Σ_{i=1}^{m_ℓ} (Q_ℓ(Z_i) − Q_{ℓ−1}(Z_i)) .
The multilevel Monte Carlo estimator Q̂^ML_{L,m} is then given by

Q̂^ML_{L,m} = Σ_{ℓ=1}^{L} Δ̂^MC_{ℓ,m_ℓ} . (4)
The following assumption sets the rate β of the decay of the variance Var[Q` (Z)−
Q`−1 (Z)] in relation to the rate γ of the increase of the costs with level `.
Assumption 5 For the rates γ of Assumption 2 and β of Assumption 4, we
have β > γ.
Set the number of samples m^ML = [m_1^ML, …, m_L^ML]^T to

m_ℓ^ML = 2 ε^{−1} κ_2 (1 − s^{−(β−γ)/2})^{−1} s^{−(β+γ)ℓ/2} , ℓ = 1, …, L , (5)
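The telescoping construction (4) can be sketched as follows. The toy level hierarchy, the rates, and the geometrically decaying sample sizes below are made-up stand-ins that only mimic the shape of (5); they are not the paper's models or constants.

```python
import numpy as np

# Sketch of the multilevel estimator (4): a sum of basic Monte Carlo
# estimators of the differences Delta_l(Z) = Q_l(Z) - Q_{l-1}(Z), with
# independent samples across levels. The toy hierarchy has a level error
# term ~ s^{-alpha*l}, so Var[Delta_l] decays with rate beta = 2*alpha.
def Q_level(z, level, s=2.0, alpha=1.0):
    # exact output exp(z) plus a level-decaying error term (illustrative)
    return np.exp(z) + s ** (-alpha * level) * (1.0 + z)

def mlmc(L, m, rng):
    est = 0.0
    for level in range(1, L + 1):
        z = rng.standard_normal(m[level - 1])   # fresh samples per level
        coarse = Q_level(z, level - 1) if level > 1 else 0.0
        est += (Q_level(z, level) - coarse).mean()
    return est

rng = np.random.default_rng(2)
m = [4 ** (6 - l) * 100 for l in range(1, 7)]   # decaying samples per level
est = mlmc(L=6, m=m, rng=rng)
print(abs(est - np.exp(0.5)))                   # bias 2^{-6} plus sampling error
```

Most samples are spent on the cheap coarse levels, while only few evaluations of the finest level are needed because the differences Δ_ℓ have small variance.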
2.4 MFMC estimation

Consider furthermore the basic Monte Carlo estimators

Q̂^MC_{ℓ,m_{ℓ+1}} = (1/m_{ℓ+1}) Σ_{i=1}^{m_{ℓ+1}} Q_ℓ(Z_i) , ℓ = 1, …, L − 1 , (9)

which use the samples (6) and the evaluations (7). Note that the estimators in
(9) use the first m_{ℓ+1} samples of the samples (6). Thus, the estimators Q̂^MC_{ℓ,m_ℓ}
and Q̂^MC_{ℓ,m_{ℓ+1}} are dependent for ℓ = 1, …, L − 1. The MFMC estimator Q̂^MF_{L,m}
is defined as

Q̂^MF_{L,m} = Q̂^MC_{L,m_L} + Σ_{ℓ=1}^{L−1} a_ℓ (Q̂^MC_{ℓ,m_ℓ} − Q̂^MC_{ℓ,m_{ℓ+1}}) ,
where a = [a_1, …, a_{L−1}]^T ∈ R^{L−1} are coefficients. The costs of the MFMC
estimator Q̂^MF_{L,m} are

c(Q̂^MF_{L,m}) = w^T m ,

where w = [w_1, …, w_L]^T and m = [m_1, …, m_L]^T. The coefficients are set to

a_ℓ^MF = ρ_{L,ℓ} σ_L / σ_ℓ , ℓ = 1, …, L − 1 ,

and the number of samples m^MF = [m_1^MF, …, m_L^MF]^T is set to

m_ℓ^MF = m_L^MF r_ℓ , ℓ = 1, …, L ,
where

r_ℓ = √( w_L (ρ²_{L,ℓ} − ρ²_{L,ℓ−1}) / ( w_ℓ (1 − ρ²_{L,L−1}) ) ) , ℓ = 1, …, L , (10)
with ρ_{L,0} = 0. Note that the selection of m^MF and a^MF is independent of the
rates α, β, γ, which means the approach is also applicable in situations where
the rates capture the behavior of the functions Q_1, …, Q_L only poorly, see,
e.g., [38] for examples. Note further that the components of the number of
samples m^MF are rounded up to integers, as in the multilevel Monte Carlo
method, see (5) in Section 2.3. We note that in [34] an
integer optimization problem is solved to adapt the number of model evalua-
tions in multilevel Monte Carlo for an increased processor-failure tolerance on
massively-parallel compute platforms.
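The allocation (10) is straightforward to evaluate. The sketch below computes r_ℓ, the coefficients a_ℓ^MF, and the sample numbers m^MF; the costs, standard deviations, correlation coefficients, and budget are all made up for illustration, not the paper's estimates.

```python
import numpy as np

# Sketch of the MFMC coefficients a_l^MF and sample allocation (10); all
# constants below (costs w_l, standard deviations sigma_l, correlations
# rho_{L,l}, budget p) are illustrative assumptions.
w = np.array([1e-4, 1e-3, 1e-2, 1e-1, 1.0])      # costs w_1, ..., w_L
rho = np.array([0.90, 0.97, 0.995, 0.999, 1.0])  # rho_{L,1}, ..., rho_{L,L} = 1
sigma = np.array([1.10, 1.05, 1.02, 1.01, 1.0])  # sigma_1, ..., sigma_L

rho2 = rho ** 2
drho2 = rho2 - np.concatenate(([0.0], rho2[:-1]))    # rho_{L,l}^2 - rho_{L,l-1}^2
r = np.sqrt(w[-1] * drho2 / (w * (1.0 - rho2[-2])))  # r_l as in (10)
a = rho[:-1] * sigma[-1] / sigma[:-1]                # a_l^MF = rho_{L,l} sigma_L / sigma_l

p = 100.0                       # computational budget
m_L = p / (w @ r)               # from the budget constraint w^T m^MF = p
m = m_L * r                     # m_l^MF = m_L^MF r_l

assert np.all(np.diff(m) < 0) and abs(w @ m - p) < 1e-9
```

In particular r_L = 1, so m_L^MF is the number of high-fidelity evaluations, and the cheaper, less correlated models receive progressively more samples.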
The MFMC estimator is unbiased with respect to E[Q_L(Z)], see [38, Lemma 3.1].
The variance of the MFMC estimator Q̂^MF_{L,m^MF} is [38]

Var(Q̂^MF_{L,m^MF}) = ( σ_L² (1 − ρ²_{L,L−1}) / ((m_L^MF)² w_L) ) p .
The work [38] investigates the costs and the MSE of the MFMC estimator only
in the context of Assumption 6 and Assumption 7, and does not give insights
into the behavior of the MFMC estimator if additionally Assumptions 1–5 are
made.
3 New properties of the multifidelity Monte Carlo estimator

We now discuss the error and cost behavior of the MFMC estimator in a
typical setting of the multilevel Monte Carlo estimators where Assumption 4
on the rate of the variance decay and Assumption 5 on the relative costs hold.
Our main result is Theorem 1, which states that the MFMC estimator is efficient
under Assumptions 1–7, which means that the MFMC estimator achieves an
MSE e(Q̂^MF_{L,m^MF}) ≲ ε with costs c(Q̂^MF_{L,m^MF}) ≲ ε^{−1}, independent of the rates α
and γ. We first state Theorem 1 and then prove two lemmata in Section 3.1
and provide the proof of Theorem 1 in Section 3.2. Corollary 1 discusses the
convergence rates of MFMC if Assumption 5 is violated.
p = κ_4 ε^{−1} , (11)

and the costs are bounded as c(Q̂^MF_{L,m^MF}) ≲ ε^{−1}.
Note that the MLMC theory developed in [9, Theorem 1] and [18, Theo-
rem 3.1] requires an additional assumption on the rate α because the rounding
up of the numbers of samples to an integer is explicitly taken into account, see
also [4, Theorem 3.2]. We ignore the rounding here and therefore can avoid
that assumption; however, we emphasize that we expect that a similar as-
sumption is necessary for MFMC as well if the rounding of the numbers of
samples is taken into account explicitly.
This section proves two lemmata that we use in the proof of Theorem 1 in
Section 3.2.
for ` ∈ N. We obtain
Var[Q`+1 (Z) − Q`−1 (Z)] ≤ Var[Q`+1 (Z) − Q` (Z)] + Var[Q` (Z) − Q`−1 (Z)]
+ 2| Cov[Q`+1 (Z) − Q` (Z), Q` (Z) − Q`−1 (Z)]| . (13)
where the last inequality holds because s > 1. Define now the sequence (bj )
with
b0 = 1 , bj = s−βj/2 + bj−1 (1 + 2s−βj/2 ) , j ∈ N.
From (14) and from the definition of the sequence (b_j), it follows by induction
that

Var[Q_{ℓ+j}(Z) − Q_{ℓ−1}(Z)] ≤ κ_2 s^{−β(ℓ+j)} + κ_2 b_{j−1} s^{−βℓ} + 2 κ_2 s^{−βℓ} (b_{j−1} s^{−βj})^{1/2}
 ≤ κ_2 s^{−βℓ} ( s^{−βj} + b_{j−1} + 2 (b_{j−1} s^{−βj})^{1/2} )
 ≤ κ_2 s^{−βℓ} ( s^{−βj} + b_{j−1} + 2 b_{j−1} s^{−βj/2} )
 ≤ κ_2 s^{−βℓ} ( s^{−βj/2} + b_{j−1} (1 + 2 s^{−βj/2}) )
 ≤ κ_2 s^{−βℓ} b_j ,

because b_j ≥ 1 (and therefore b_j^{1/2} ≤ b_j) and s > 1 for j ∈ N. To bound the
sequence (b_j), rewrite

b_j = Σ_{i=0}^{j} s^{−βi/2} Π_{r=i+1}^{j} (1 + 2 s^{−βr/2}) ,
converges, which is the case because s > 1 and therefore s−β/2 < 1. Denote
Π_{r=0}^{∞} (1 + 2 s^{−βr/2}) = κ_5 ,
where we used
Var[QL (Z)−Q`−1 (Z)] = Var[QL (Z)]+Var[Q`−1 (Z)]−2 Cov[QL (Z), Q`−1 (Z)] .
Because
(1/2) x + 1/(2x) ≥ 1 , x > 0 ,
where we used Lemma 1 to bound Var[QL (Z) − Q`−1 (Z)] and the lower bound
σlow of Assumption 3. Since ρL,` + ρL,`−1 ≤ 2, we obtain
ρ²_{L,ℓ} − ρ²_{L,ℓ−1} = (ρ_{L,ℓ} − ρ_{L,ℓ−1})(ρ_{L,ℓ} + ρ_{L,ℓ−1}) ≲ (2/σ_low²) s^{−βℓ} .
3.2 Proof of main theorem
With the Lemmata 1–2 discussed in Section 3.1, we now prove Theorem 1.
Proof (of Theorem 1) The MSE of the MFMC estimator Q̂^MF_{L,m^MF} is split into
the bias and the variance term

e(Q̂^MF_{L,m^MF}) = Var[Q̂^MF_{L,m^MF}] + (E[Q(Z) − Q_L(Z)])² . (16)
We first consider the bias term of the MSE. With the maximal level L
defined as in (3), we obtain with Assumption 1

(E[Q(Z) − Q_L(Z)])² ≲ ε/2 .
Consider now the variance term Var[Q̂^MF_{L,m^MF}]. Assumption 3 means that σ_ℓ ≤
σ_up for ℓ = 1, …, L. We therefore have

Var[Q̂^MF_{L,m^MF}] ≤ ( σ_up² (1 − ρ²_{L,L−1}) / ((m_L^MF)² w_L) ) p
 = ( σ_up² (1 − ρ²_{L,L−1}) / (p w_L) ) ( Σ_{ℓ=1}^{L} w_ℓ r_ℓ )²
 = (σ_up² / p) ( Σ_{ℓ=1}^{L} √( w_ℓ (ρ²_{L,ℓ} − ρ²_{L,ℓ−1}) ) )² ,

where the last equality follows from the definition (10) of r_ℓ;
see [38, Proof of Corollary 3.5] for the transformations. With Assumption 2
and Lemma 2, we obtain
Σ_{ℓ=1}^{L} √( w_ℓ (ρ²_{L,ℓ} − ρ²_{L,ℓ−1}) ) ≲ (1/σ_low) Σ_{ℓ=1}^{L} √( s^{γℓ} s^{−βℓ} ) = (1/σ_low) Σ_{ℓ=1}^{L} s^{(γ−β)ℓ/2} . (18)
Assumption 5 gives β > γ, and therefore s^{(γ−β)/2} < 1 (because s > 1). Therefore,
we obtain with the geometric series that

Σ_{ℓ=1}^{L} √( w_ℓ (ρ²_{L,ℓ} − ρ²_{L,ℓ−1}) ) ≲ (1/σ_low) · s^{(γ−β)/2} / (1 − s^{(γ−β)/2}) .
This means that we have bounded the variance term and the bias term each
by ε/2 and therefore the MSE is bounded by ε. The choice of the budget p in
(11) leads to c(Q̂^MF_{L,m^MF}) ≲ ε^{−1}.
The following corollary considers the case where Assumption 5 is violated,
i.e., where β ≤ γ.
Proof Consider (18) in the proof of Theorem 1 and note that equation (18)
holds even if Assumption 5 is violated. Note that the following proof closely
follows [9, Theorem 1] and [18, Theorem 3.1].
We first consider the case γ > β and obtain

s^{−(γ−β)/2} − 1 ≤ 0 ,

and therefore

Σ_{ℓ=0}^{L} s^{(γ−β)ℓ/2} ≤ s^{(γ−β)L/2} / (1 − s^{−(γ−β)/2}) .
Define

κ_7 = 2 (σ_up²/σ_low²) ( (√(2κ_1))^{(γ−β)/(2α)} s^{(γ−β)/2} / (1 − s^{−(γ−β)/2}) )² ,

so that

Var[Q̂^MF_{L,m^MF}] ≲ (κ_7 / (2p)) ε^{−(γ−β)/(2α)} .

Thus, with p = κ_7 ε^{−1−(γ−β)/(2α)}, the bound (19) follows for the case γ > β.
For the case γ = β, we obtain

L + 1 ≤ κ_8 ln(ε^{−1}) ,

with

κ_8 = α^{−1} log_s(√(2κ_1)) + α^{−1} / (2 ln(s)) + 2 .

Set

p = 2 (σ_up²/σ_low²) κ_8² ε^{−1} ln(ε)² ,

where we used that ln(ε^{−1})² = ln(ε)², to obtain the bound (19) for the case
γ = β.
4 Numerical experiment
Let G = (0, 1)2 be a domain with boundary ∂G. Consider the linear elliptic
PDE with random coefficients
                 rate            constant
Assumption 1     α ≈ 1.0579      κ_1 ≈ 4.0528 × 10^1
Assumption 2     γ ≈ 1.0551      κ_3 ≈ 2.3615 × 10^{−6}
Assumption 4     β ≈ 1.9365      κ_2 ≈ 1.3744 × 10^3

Table 1: The table reports the rates and constants of Assumptions 1, 2, and 4 that we
estimated for our problem (20)–(21).
The domain D is D = [10−4 , 10−1 ]9 . The right-hand side is set to f (x) = 10.
The function Q : D → R is

Q(Z(ω)) = ( ∫_G u(ω, x)² dx )^{1/2} .
Dirichlet problems such as (20)–(21) are well studied in the multilevel Monte
Carlo literature. We therefore refer to the literature for theoretical consider-
ations in the context of multilevel Monte Carlo of problem (20)–(21) and its
variations [9, 8].
We estimate the rates of Assumptions 1–4 numerically from n = 104 sam-
ples Z1 , . . . , Zn of the random variable Z and the corresponding evaluations
of Q3 , . . . , Q8 . Consider first Assumption 1. We use basic Monte Carlo esti-
mators with n = 104 samples to estimate |E[Q8 (Z) − Q` (Z)]| for ` = 3, . . . , 7
and then find κ1 ∈ R+ and α ∈ R+ that best fit the estimates in the L2 norm.
Since the domain G is in a two-dimensional space, we set s = 22 = 4. Note
that we estimate the constant κ1 and the rate α with respect to Q8 instead
of Q. We follow [9] and ignore levels that lead to too coarse grids. Note that
a general discussion on which models to select for MFMC estimation is given
in [38, Section 3.5]. The behavior of |E[Q8 (Z) − Q` (Z)]| for ` = 3, . . . , 7 is
shown in Figure 1a. The constant κ1 and the rate α are reported in Table 1.
We repeat the same procedure to obtain the rates and constants of Assump-
tions 2–4, which are visualized in Figure 1 and reported in Table 1. Note that
our estimated rates satisfy β > γ, cf. Assumption 5.
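The fitting step can be sketched as an ordinary least-squares problem in log space. The decay data below are synthetic (a planted rate with small perturbations) standing in for the Monte Carlo estimates of |E[Q_8(Z) − Q_ℓ(Z)]|; the planted values are not the paper's estimates.

```python
import numpy as np

# Sketch of the rate-fitting step: given estimates e_l of |E[Q_8(Z) - Q_l(Z)]|
# for l = 3, ..., 7, fit kappa_1 and alpha in e_l ~ kappa_1 * s^(-alpha*l)
# by linear least squares in log space. The data are synthetic: exact decay
# with planted rate 1.05 plus small alternating perturbations.
s = 4.0                                   # s = 2^2 for a 2d mesh hierarchy
levels = np.arange(3, 8)
errors = 40.0 * s ** (-1.05 * levels) * np.exp(0.05 * np.array([1, -1, 1, -1, 1]))

# log e_l = log kappa_1 - alpha * l * log s  ->  ordinary least squares
A = np.column_stack([np.ones_like(levels, dtype=float), -levels * np.log(s)])
coef, *_ = np.linalg.lstsq(A, np.log(errors), rcond=None)
kappa1, alpha = np.exp(coef[0]), coef[1]
print(alpha)   # recovers the planted rate 1.05
```

The same log-linear fit, applied to the cost and variance-decay data, yields the estimates of (γ, κ_3) and (β, κ_2).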
(Figure 1: panel (a) expected absolute error E[|Q_8(Z) − Q_ℓ(Z)|], panel (b) costs w_ℓ
(runtime [s]), panel (c) decay of Var[Q_ℓ(Z) − Q_{ℓ−1}(Z)] and of ρ²_ℓ − ρ²_{ℓ−1} with rate
β ≈ 1.9365, panel (d) variance Var[Q_ℓ(Z)] with rate ≈ 0.0078; all over levels ℓ.)
Fig. 1: The plot in (a) shows that the rate of the decay of the expected absolute error is
α ≈ 1, see Assumption 1. The plot in (b) reports the rate γ ≈ 1 of the increase of the runtime
of the evaluations Q` for ` = 3, . . . , 7, see Assumption 2. The plots in (c) and (d) report
the behavior of the variance with respect to Assumption 4 and Assumption 3, respectively.
Note that β > γ as required by Assumption 5.
Table 2: The table reports the costs w3 , . . . , w7 of functions Q3 , . . . , Q7 , and the sample
estimates of the variances σ32 , . . . , σ72 and the correlation coefficients ρ8,3 , . . . , ρ8,7 of the
random variables Q3 (Z), . . . , Q7 (Z) estimated from 104 samples.
(Figure 2: share of samples [%] of each level in the total number of samples, for
ε = 10^0, 10^{−1}, …, 10^{−5}; panel (a) multilevel Monte Carlo, panel (b) MFMC.)
Fig. 2: The plots report the share of the number of samples of each level in the total number
of samples. MFMC evaluates the coarsest model more often than multilevel Monte Carlo in
this example.
For ε ∈ {10^0, 10^{−1}, …, 10^{−5}}, we derive multilevel Monte Carlo and MFMC
estimates of E[Q(Z)] following Section 2.3 and Section 2.4, respectively. The
number of samples for the multilevel Monte Carlo estimators are derived using
the rates in Table 1. The number of samples and the coefficients for the MFMC
estimators are obtained using the costs, variances, and correlation coefficients
reported in Table 2. Figure 2 compares the number of samples obtained with
multilevel Monte Carlo and MFMC. The absolute numbers of samples are
reported in Table 3 for multilevel Monte Carlo and in Table 4 for MFMC. Both
methods lead to similar numbers of samples. MFMC assigns more samples to
level ℓ = 3 than multilevel Monte Carlo. A detailed comparison is shown in
Figure 3 for ε = 10^{−5}, which illustrates that multilevel Monte Carlo distributes
the number of samples logarithmically among the levels depending on the
rates β and γ, see Section 2.4. MFMC directly uses the costs, variances, and
correlation coefficients and derives a more fine-grained distribution among the
4.4 MFMC AND COARSE-GRID (WEAKLY-CORRELATED) MODELS
19
(Figure 3: bar plot comparing the shares of samples [%] of MLMC and MFMC at
ε = 10^{−5}; labeled shares include 87.42%, 97.32%, 0.02%, and 0.03%.)
Fig. 3: The bar plot shows a detailed comparison of the share of the samples determined
by multilevel Monte Carlo (MLMC) and MFMC for ε = 10^{−5}. Multilevel Monte Carlo
distributes the number of samples logarithmically among the levels, whereas MFMC
determines a fine-grained distribution of the number of samples. Thus, the bars have the
same size on a logarithmic scale for multilevel Monte Carlo but different sizes for MFMC.
The percentage share of the total number of samples is shown next to each bar.
levels than multilevel Monte Carlo. We refer to [38] for further investigations
on the number of samples in the context of MFMC.
We repeat the multilevel Monte Carlo and the MFMC estimation 100 times
and report in Figure 4 the estimated MSE

ê(Q̂) = (1/100) Σ_{i=1}^{100} (Q̂_i − Q̂^Ref)² , (22)
where Q̂_i is either a multilevel Monte Carlo estimator or an MFMC estimator,
and where Q̂^Ref is the reference estimate, see Section 4.1. Figure 4 additionally
shows error bars with length

(1/99) Σ_{i=1}^{100} ( ê(Q̂) − (Q̂_i − Q̂^Ref)² )² , (23)
(Figure 4: estimated MSE of multilevel Monte Carlo and MFMC, plotted (a) with
respect to runtime in seconds and (b) with respect to the tolerance ε.)
Fig. 4: The results are in agreement with Theorem 1, which states that the costs of the
MFMC estimator with MSE e(Q̂^MF_{L,m^MF}) ≲ ε are bounded by c(Q̂^MF_{L,m^MF}) ≲ ε^{−1} under
Assumptions 1–7. The behavior of the MFMC estimator is similar to the behavior of the
multilevel Monte Carlo estimator.
Table 3: The table reports the number of samples used in the multilevel Monte Carlo esti-
mator and the runtime in seconds. The runtime is averaged over 100 runs.
Table 4: The table reports the number of samples used in the MFMC estimator. While the
total number of samples is higher than in multilevel Monte Carlo (see Table 3), the multi-
level Monte Carlo method requires more samples than MFMC at higher levels (i.e., more
expensive evaluations), and thus the runtimes are about the same for each ε ∈ {10^0, …, 10^{−5}}.
Table 5: The table reports the costs w1 , . . . , w5 of functions Q1 , . . . , Q5 , and the sample esti-
mates of the variances σ12 , . . . , σ52 and the correlation coefficients ρ8,1 , . . . , ρ8,5 of the random
variables Q1 (Z), . . . , Q5 (Z) estimated from 104 samples. Note that the costs, variances, and
correlation coefficients for levels ` = 3, . . . , 7 are reported in Table 2.
(Figure 5: estimated MSE, plotted (a) with respect to runtime in seconds and (b) with
respect to the tolerance ε.)

Fig. 5: The plots report the estimated MSE of multilevel Monte Carlo and the MFMC
estimators that combine Q_1(Z), …, Q_5(Z) corresponding to levels ℓ = 1, …, 5. The random
variables Q_1(Z) and Q_2(Z) are only weakly correlated with Q_8(Z). The MFMC estimator
shows a similar behavior as the multilevel Monte Carlo estimator.

5 Conclusions
of the PDE. In this setting, and under standard assumptions on the discretiza-
tions of the PDE, the MFMC estimator is efficient, which means that the costs
of the MFMC estimator with MSE below a threshold are bounded linearly in
the threshold. Our numerical results illustrated the theory.
Acknowledgment
The first and the third author were supported in part by the AFOSR MURI
on multi-information sources of multi-physics systems under Award Number
FA9550-15-1-0038, program manager Jean-Luc Cambier, and by the United
States Department of Energy Applied Mathematics Program, Awards DE-
FG02-08ER2585 and DE-SC0009297, as part of the DiaMonD Multifaceted
Mathematics Integrated Capability Center. The second author was supported
by the US Department of Energy Office of Science grant DE-SC0009324 and
the Air Force Office of Scientific Research grant FA9550-15-1-0001. Some of the numer-
ical examples were computed on the computer cluster of the Munich Centre
of Advanced Computing.
References
27. J. Li and D. Xiu. Evaluation of failure probability via surrogate models. Journal of
Computational Physics, 229(23):8966–8980, 2010.
28. J. S. Liu. Monte Carlo strategies in scientific computing. Springer, 2008.
29. A. J. Majda and B. Gershgorin. Quantifying uncertainty in climate change science
through empirical information theory. Proceedings of the National Academy of Sciences
of the United States of America, 107(34):14958–14963, Aug. 2010.
30. B. L. Nelson. On control variate estimators. Computers & Operations Research,
14(3):219–225, 1987.
31. L. Ng and K. Willcox. Multifidelity approaches for optimization under uncertainty.
International Journal for Numerical Methods in Engineering, 100(10):746–772, 2014.
32. L. Ng and K. Willcox. Monte-Carlo information-reuse approach to aircraft conceptual
design optimization under uncertainty. Journal of Aircraft, pages 1–12, 2015.
33. F. Nobile, R. Tempone, and C. G. Webster. A sparse grid stochastic collocation method
for partial differential equations with random input data. SIAM Journal on Numerical
Analysis, 46(5):2309–2345, 2008.
34. S. Pauli and P. Arbenz. Determining optimal multilevel Monte Carlo parameters with
application to fault tolerance. Computers & Mathematics with Applications, 70(11):2638–2651,
2015.
35. S. Pauli, P. Arbenz, and C. Schwab. Intrinsic fault tolerance of multilevel Monte Carlo
methods. Journal of Parallel and Distributed Computing, 84:24–36, 2015.
36. B. Peherstorfer, T. Cui, Y. Marzouk, and K. Willcox. Multifidelity importance sampling.
Computer Methods in Applied Mechanics and Engineering, 300:490–509, 2016.
37. B. Peherstorfer and K. Willcox. Online adaptive model reduction for nonlinear systems
via low-rank updates. SIAM Journal on Scientific Computing, 37(4):A2123–A2150,
2015.
38. B. Peherstorfer, K. Willcox, and M. Gunzburger. Optimal model management for multi-
fidelity Monte Carlo estimation. SIAM Journal on Scientific Computing, 38(5):A3163–
A3194, 2016.
39. C. Robert and G. Casella. Monte Carlo Statistical Methods. Springer, 2004.
40. G. Rozza, D. Huynh, and A. Patera. Reduced basis approximation and a posteriori
error estimation for affinely parametrized elliptic coercive partial differential equations.
Archives of Computational Methods in Engineering, 15(3):1–47, 2007.
41. L. Sirovich. Turbulence and the dynamics of coherent structures. Quarterly of Applied
Mathematics, 45:561–571, 1987.
42. A. L. Teckentrup, R. Scheichl, M. Giles, and E. Ullmann. Further analysis of mul-
tilevel Monte Carlo methods for elliptic PDEs with random coefficients. Numerische
Mathematik, 125(3):569–600, 2013.
43. E. Ullmann, H. C. Elman, and O. G. Ernst. Efficient iterative solvers for stochastic
Galerkin discretizations of log-transformed random diffusion problems. SIAM Journal
on Scientific Computing, 34(2):A659–A682, 2012.
44. E. Ullmann and I. Papaioannou. Multilevel estimation of rare events. SIAM/ASA
Journal on Uncertainty Quantification, 3(1):922–953, 2015.
45. E. Ullmann and C. E. Powell. Solving log-transformed random diffusion problems by
stochastic Galerkin mixed finite element methods. SIAM/ASA Journal on Uncertainty
Quantification, 3(1):509–534, 2015.
46. V. Vapnik. Statistical Learning Theory. Wiley, 1998.
47. D. Xiu. Fast numerical methods for stochastic computations: A review. Communications
in Computational Physics, 5:242–272, 2009.