

Convergence analysis of multifidelity Monte Carlo estimation

Benjamin Peherstorfer · Max Gunzburger · Karen Willcox

Abstract The multifidelity Monte Carlo method provides a general framework for combining cheap low-fidelity approximations of an expensive high-fidelity model to accelerate the Monte Carlo estimation of statistics of the high-fidelity model output. In this work, we investigate the properties of multifidelity Monte Carlo estimation in the setting where a hierarchy of approximations can be constructed with known error and cost bounds. Our main result is a convergence analysis of multifidelity Monte Carlo estimation, for which we prove a bound on the costs of the multifidelity Monte Carlo estimator under assumptions on the error and cost bounds of the low-fidelity approximations. The assumptions that we make are typical in the setting of similar Monte Carlo techniques. Numerical experiments illustrate the derived bounds.

Keywords multifidelity · multilevel · hierarchical methods · Monte Carlo · surrogates · coarse-grid approximations · partial differential equations with random coefficients · uncertainty quantification

B. Peherstorfer
Department of Mechanical Engineering and Wisconsin Institute for Discovery, University of
Wisconsin-Madison, Madison, WI 53706
E-mail: [email protected]
M. Gunzburger
Department of Scientific Computing, Florida State University, 400 Dirac Science Library,
Tallahassee FL 32306-4120
E-mail: [email protected]
K. Willcox
Department of Aeronautics & Astronautics, MIT, Cambridge, MA 02139
E-mail: [email protected]

1 Introduction

Inputs to systems are often modeled as random variables to account for the
uncertainties in the inputs due to inaccuracies and incomplete knowledge.
Given the input random variable and a model of the system of interest, an
important task is to estimate statistics of the model output random variable.
Monte Carlo estimation is one popular approach to estimate statistics. Ba-
sic Monte Carlo estimation generates samples of the input random variable,
discretizes the model and then solves the discretized model—the high-fidelity
model—up to the required accuracy at these samples, and averages over the
corresponding outputs to estimate statistics of the model output random vari-
able. This basic Monte Carlo estimation often requires many samples, and
consequently many approximations of the model outputs, which can become
too costly if the high-fidelity model solves are expensive. We note that techniques other than Monte Carlo estimation are available to estimate statistics of model outputs, see, e.g., [1, 33, 21, 15, 14, 47, 43, 45].
Several variance reduction techniques have been presented to reduce the
costs of Monte Carlo simulation compared to basic Monte Carlo estimators,
e.g., antithetic variates [39, 23, 28] and importance sampling [39, 27, 36]. Our
focus here is on the control variate framework that exploits the correlation
between the model output random variable and an auxiliary random vari-
able that is cheap to sample [30]. A major class of control variate methods
derives the auxiliary random variable from cheap approximations of the out-
puts of the high-fidelity model. For example, in situations where the model is
governed by (often elliptic) partial differential equations (PDEs), coarse-grid
approximations of the PDE—low-fidelity models—can provide cheap approx-
imations of the outputs obtained from a fine-grid high-fidelity discretization
of the PDE; however, other types of low-fidelity models are possible in the
context of PDEs, e.g., projection-based reduced models [41, 40, 20, 3, 37], data-
fit interpolation and regression models [13, 12], machine-learning-based models
such as support vector machines [46, 11], and other simplified models [29, 32].
The multifidelity Monte Carlo (MFMC) method [38] uses a control variate
approach to combine auxiliary random variables stemming from low-fidelity
models into an estimator of the statistics of the high-fidelity model output.
Key to the MFMC approach is the selection of how often each of the auxiliary
random variables is sampled, and therefore how often each of the low-fidelity
models is solved. The MFMC approach derives this selection from the correla-
tion coefficients between the auxiliary random variables and the high-fidelity
model output random variable. The selection of the MFMC approach is opti-
mal in the sense that the variance of the MFMC estimator is minimized for
given maximal costs of the estimation. We refer to the discussions in [38, 31]
for details on MFMC.
The work [38] discusses the properties of MFMC estimation in a setting
where only mild assumptions on the high- and low-fidelity models are made.
We consider here the setting where we can make further assumptions on the
errors and costs of outputs obtained with a hierarchy of low- and high-fidelity
models. Our contribution is to show that for an MFMC estimator with mean-squared error (MSE) below a threshold parameter ε > 0, the costs of the estimation can be bounded by ε^{-1} up to a constant under certain conditions on the error and cost bounds of the models in the hierarchy.
We discuss that the conditions we require in the MFMC context are similar
to the conditions exploited by the multilevel Monte Carlo method [9, Theo-
rem 1]. Our analysis shows that MFMC estimation is as efficient in terms of
error and costs as multilevel Monte Carlo estimation under certain conditions
that we discuss below in detail. Multilevel Monte Carlo uses a hierarchy of
low-fidelity models—typically coarse-grid approximations—to derive a hierar-
chy of auxiliary random variables, which are combined in a judicious way to
reduce the runtime of Monte Carlo simulation. Multilevel Monte Carlo was in-
troduced in [26] and extended and made popular by the work [18]. Since then,
the properties of the multilevel Monte Carlo estimators have been studied ex-
tensively in different settings, see, e.g., [9, 8, 2, 6, 42]. Multilevel Monte Carlo
and its variants have also been applied to density estimation [5], variance es-
timation [4], and rare event simulation [44]. We also mention the continuation
multilevel Monte Carlo [10] and the extension multi-index Monte Carlo that
allows different mesh widths in the dimensions [22]. In [34, 35], a fault-tolerant
multilevel Monte Carlo is introduced and analyzed, which is well suited for
massively parallel computations. An integer optimization problem is solved
to determine the optimal number of model evaluations depending on the rate
of compute-node failures. The fault-tolerant approach thus takes into account
node failure by adapting the number of model evaluations accordingly. The
relationship between multilevel Monte Carlo and sparse grid quadrature [7,
16, 17] is discussed in [24, 25, 19].
The outline of the presentation is as follows. Section 2 introduces the prob-
lem setup and basic, multilevel, and multifidelity Monte Carlo estimators. Sec-
tion 3 derives the new convergence analysis of MFMC estimation. Numerical
examples in Section 4 illustrate the derived bounds. Conclusions are drawn in
Section 5.

2 Problem setup

This section introduces the problem setup and the various types of Monte Carlo
estimators required throughout the presentation. Section 2.1 introduces the
notation and Section 2.2 the basic Monte Carlo estimator. Multilevel Monte Carlo estimation and MFMC estimation are summarized in Section 2.3 and Section 2.4, respectively.

2.1 Preliminaries

The set of positive real numbers is denoted as R⁺ = {x ∈ R : x > 0}. For two positive quantities a and b, we write a ≲ b if a/b is bounded by a constant whose value is independent of any parameters on which a and b depend.

Let d ∈ N be the dimension and let D ⊂ R^d be a Lipschitz domain. Let Z : Ω → D be a random variable over a probability space (Ω, F, P), where Ω denotes the set of outcomes, F the σ-algebra of events, and P : F → [0, 1] a probability measure. Let further Q : D → R be a function in a suitable function space and let Qℓ : D → R, for ℓ ∈ N, be functions that approximate Q in the sense of the following assumption. Note that we assume that Q(Z) and Qℓ(Z) are integrable.
Assumption 1 There exist s ∈ R with s > 1 and a rate α ∈ R⁺ such that

\[ \left| \mathbb{E}[Q(Z) - Q_\ell(Z)] \right| \le \kappa_1 s^{-\alpha\ell}, \qquad \ell \in \mathbb{N}, \]

where κ₁ ∈ R⁺ is a constant independent of ℓ.

The parameter ℓ ∈ N is the level of Qℓ. Let further wℓ ∈ R⁺ be the costs of evaluating Qℓ for ℓ ∈ N. The following assumption gives a bound on the costs with respect to the level ℓ.
Assumption 2 There exists a rate γ ∈ R⁺ with

\[ w_\ell \le \kappa_3 s^{\gamma\ell}, \qquad \ell \in \mathbb{N}, \]

where the constant s is given by Assumption 1 and κ₃ ∈ R⁺ is a constant independent of ℓ.

Note that in Assumption 2 the same constant s as in Assumption 1 is used.
The variance of the random variable Qℓ(Z) is denoted as

\[ \sigma_\ell^2 = \mathrm{Var}[Q_\ell(Z)], \qquad \ell \in \mathbb{N}. \]

We make the assumption that the standard deviation σℓ is bounded from below and above uniformly in the level ℓ ∈ N.

Assumption 3 There exist σlow ∈ R⁺ and σup ∈ R⁺ such that σlow ≤ σℓ ≤ σup for ℓ ∈ N.
The Pearson product-moment correlation coefficient of the random variables Qℓ(Z) and Qₗ(Z) is denoted as

\[ \rho_{\ell,l} = \frac{\mathrm{Cov}[Q_\ell(Z), Q_l(Z)]}{\sigma_\ell \sigma_l}, \qquad \ell, l \in \mathbb{N}, \tag{1} \]

where Cov[Qℓ(Z), Qₗ(Z)] is the covariance of Qℓ(Z) and Qₗ(Z).
We consider the situation where the random variable Z represents an input random variable and Q is a function that maps an input, i.e., a realization of Z, onto an output. In our situation, evaluating Q entails solving a PDE (the “model”), but exact solutions of the PDE are unavailable. We therefore resort to solving an approximate PDE (the “discretized model”), where the approximation quality (e.g., the mesh width) is controlled by the level ℓ. The functions Qℓ map the
input onto the output obtained by solving the approximate PDE on level ℓ. Assumption 1 specifies in which sense Qℓ converges to Q as ℓ → ∞. Solving the approximate PDE on level ℓ incurs costs wℓ. One task in this context is to derive estimators of E[Q(Z)] using the functions Qℓ. We assess the efficiency of an estimator Q̂ with its MSE

\[ e(\hat{Q}) = \mathbb{E}\left[ \left( \hat{Q} - \mathbb{E}[Q(Z)] \right)^2 \right], \]

and its costs c(Q̂), which are the sum of the evaluation costs wℓ of the functions Qℓ used in the estimator Q̂. An estimator Q̂ with MSE e(Q̂) ≲ ε below a threshold ε ∈ R⁺ is efficient if its costs are bounded as c(Q̂) ≲ ε^{-1}, i.e., by ε^{-1} up to a constant. Note that ε bounds the MSE, in contrast to the root-mean-squared error (RMSE) as in, e.g., [9].

2.2 Basic Monte Carlo estimation

Let ℓ ∈ N and define the basic Monte Carlo estimator Q̂^MC_{ℓ,m} of E[Qℓ(Z)] as

\[ \hat{Q}^{\mathrm{MC}}_{\ell,m} = \frac{1}{m} \sum_{i=1}^{m} Q_\ell(Z_i), \]

with m ∈ N independent and identically distributed (i.i.d.) samples Z₁, …, Z_m of Z. The MSE of the Monte Carlo estimator Q̂^MC_{ℓ,m} with respect to E[Q(Z)] is

\[ e(\hat{Q}^{\mathrm{MC}}_{\ell,m}) = m^{-1} \mathrm{Var}[Q_\ell(Z)] + \left( \mathbb{E}[Q(Z) - Q_\ell(Z)] \right)^2. \tag{2} \]

The term m^{-1} Var[Qℓ(Z)] is the variance term and the term (E[Q(Z) − Qℓ(Z)])² is the bias term. The costs of the estimator Q̂^MC_{ℓ,m} are

\[ c(\hat{Q}^{\mathrm{MC}}_{\ell,m}) = m\, w_\ell, \]

because Qℓ is evaluated at m samples, each evaluation having costs wℓ.
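For concreteness, a minimal sketch of the basic Monte Carlo estimator; the callables Q_ell and sample_Z are hypothetical stand-ins for a level-ℓ model and a sampler of Z, not part of the paper.

    import numpy as np

    def basic_mc(Q_ell, sample_Z, m, rng):
        """Basic Monte Carlo estimate of E[Q_ell(Z)] from m i.i.d. samples of Z."""
        Z = sample_Z(m, rng)                   # draw m i.i.d. samples of Z
        return np.mean([Q_ell(z) for z in Z])  # sample average (1/m) sum_i Q_ell(Z_i)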


Let now ε ∈ R⁺ be a threshold. One approach to obtain a basic Monte Carlo estimator Q̂^MC_{ℓ,m} with e(Q̂^MC_{ℓ,m}) ≲ ε is to derive a maximal level L ∈ N and a number of samples m such that the bias term and the variance term are each bounded by ε/2 up to constants. Consider first the choice of the maximal level L ∈ N. With Assumption 1, the maximal level L is given by

\[ L = \left\lceil \alpha^{-1} \log_s\!\left( \sqrt{2}\, \kappa_1\, \varepsilon^{-1/2} \right) \right\rceil, \tag{3} \]

where κ₁ is the constant in Assumption 1. Note that the maximal level L defines the high-fidelity model Q_L in the terminology of the introduction, see Section 1.
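To see that this choice of L bounds the bias term by ε/2, note that (3) gives s^{−αL} ≤ (√2 κ₁ ε^{−1/2})^{−1}, so with Assumption 1,

\[ \left( \mathbb{E}[Q(Z) - Q_L(Z)] \right)^2 \le \kappa_1^2 s^{-2\alpha L} \le \kappa_1^2 \left( \sqrt{2}\, \kappa_1\, \varepsilon^{-1/2} \right)^{-2} = \frac{\varepsilon}{2}. \]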
To achieve that the variance term is bounded by ε/2 up to a constant, the number of samples m is selected such that ε^{-1} ≲ m. With Assumption 2, and assuming the variance σℓ² is approximately constant with respect to the level ℓ, the costs of the basic Monte Carlo estimator Q̂^MC_{L,m} are

\[ c(\hat{Q}^{\mathrm{MC}}_{L,m}) \lesssim \varepsilon^{-1 - \gamma/(2\alpha)}, \]

see [9, Section 2.1] for a proof. The costs of the basic Monte Carlo estimator thus scale with the rates γ and α.
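The exponent can be read off directly: with m ≂ ε^{-1}, wL ≤ κ₃ s^{γL} from Assumption 2, and s^{L} ≲ (√2 κ₁ ε^{−1/2})^{1/α} from (3),

\[ c(\hat{Q}^{\mathrm{MC}}_{L,m}) = m\, w_L \lesssim \varepsilon^{-1} s^{\gamma L} \lesssim \varepsilon^{-1} \left( \sqrt{2}\, \kappa_1\, \varepsilon^{-1/2} \right)^{\gamma/\alpha} \lesssim \varepsilon^{-1 - \gamma/(2\alpha)}. \]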

2.3 Multilevel Monte Carlo estimation

We follow [9] for the presentation of multilevel Monte Carlo estimation. Consider the threshold ε ∈ R⁺ and define the maximal level L ∈ N as in (3). Multilevel Monte Carlo exploits the linearity of the expected value to write

\[ \mathbb{E}[Q_L(Z)] = \mathbb{E}[Q_1(Z)] + \sum_{\ell=2}^{L} \mathbb{E}[Q_\ell(Z) - Q_{\ell-1}(Z)] = \sum_{\ell=1}^{L} \mathbb{E}[\Delta_\ell(Z)], \]

where Δℓ(Z) = Qℓ(Z) − Qℓ₋₁(Z) for ℓ > 1 and Δ₁(Z) = Q₁(Z). The basic Monte Carlo estimator of E[Δℓ(Z)] with mℓ ∈ N samples Z₁, …, Z_{mℓ} is

\[ \hat{\Delta}^{\mathrm{MC}}_{\ell,m_\ell} = \frac{1}{m_\ell} \sum_{i=1}^{m_\ell} Q_\ell(Z_i) - Q_{\ell-1}(Z_i). \]

The multilevel Monte Carlo estimator Q̂^ML_{L,m} is then given by

\[ \hat{Q}^{\mathrm{ML}}_{L,\mathbf{m}} = \sum_{\ell=1}^{L} \hat{\Delta}^{\mathrm{MC}}_{\ell,m_\ell}, \tag{4} \]

where the vector m = [m₁, …, m_L]ᵀ ∈ Nᴸ collects the number of samples at each level. Note that each basic Monte Carlo estimator Δ̂^MC_{ℓ,mℓ} in (4) uses a separate, independent set of samples. Note further that the functions Q₁, …, Q_{L−1} are low-fidelity models in the terminology of the introduction, see Section 1.
Under the following two assumptions, and with a judicious choice of the number of samples m, the multilevel Monte Carlo estimator is efficient, which means that the estimator Q̂^ML_{L,m} achieves an MSE of e(Q̂^ML_{L,m}) ≲ ε with costs c(Q̂^ML_{L,m}) ≲ ε^{-1}. The first assumption states that the variance of Δℓ(Z) decays with the level ℓ.

Assumption 4 There exists a rate β ∈ R⁺ with

\[ \mathrm{Var}[Q_\ell(Z) - Q_{\ell-1}(Z)] \le \kappa_2 s^{-\beta\ell}, \qquad \ell \in \mathbb{N}, \]

where s is the constant of Assumption 1 and κ₂ ∈ R⁺ is a constant independent of ℓ.

The following assumption sets the rate β of the decay of the variance Var[Qℓ(Z) − Qℓ₋₁(Z)] in relation to the rate γ of the increase of the costs with the level ℓ.

Assumption 5 For the rate γ of Assumption 2 and the rate β of Assumption 4, we have β > γ.

Set the number of samples m^ML = [m^ML₁, …, m^ML_L]ᵀ to

\[ m^{\mathrm{ML}}_\ell = 2 \varepsilon^{-1} \kappa_2 \left( 1 - s^{-(\beta - \gamma)/2} \right)^{-1} s^{-(\beta + \gamma)\ell/2}, \qquad \ell = 1, \dots, L, \tag{5} \]

where κ₂ is the constant in Assumption 4 and s is defined as in Assumption 1. Note that the components of m^ML are rounded up. It is shown in [9] that if Assumptions 1–5 hold, then the multilevel Monte Carlo estimator Q̂^ML_{L,m^ML} with m^ML defined in (5) achieves an MSE of e(Q̂^ML_{L,m^ML}) ≲ ε with costs c(Q̂^ML_{L,m^ML}) ≲ ε^{-1}. Note that under Assumptions 1–5 it is sufficient to select the number of samples with the rates β and γ to achieve an efficient estimator. We refer to [26, 18, 9] for details on multilevel Monte Carlo estimation.
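To make the construction concrete, here is a minimal sketch of the estimator (4) with the level choice (3) and the sample numbers (5), rounded up as in the text. The callables Q and sample_Z, the rng argument, and all parameter names are illustrative assumptions, not the authors' implementation.

    import numpy as np
    from math import ceil, log

    def mlmc_estimate(Q, sample_Z, eps, s, alpha, beta, gamma, kappa1, kappa2, rng):
        """Multilevel Monte Carlo estimate of E[Q_L(Z)], following (3)-(5).

        Q(ell, z) evaluates the level-ell approximation; sample_Z(m, rng)
        draws m i.i.d. samples of Z.
        """
        L = ceil(log(np.sqrt(2) * kappa1 * eps**-0.5, s))  # maximal level (3)
        est = 0.0
        for ell in range(1, L + 1):
            # Number of samples per level from (5), rounded up.
            m = ceil(2 / eps * kappa2 * s**(-(beta + gamma) * ell / 2)
                     / (1 - s**(-(beta - gamma) / 2)))
            Z = sample_Z(m, rng)  # independent sample set for each level
            if ell == 1:
                est += np.mean([Q(1, z) for z in Z])
            else:
                est += np.mean([Q(ell, z) - Q(ell - 1, z) for z in Z])
        return est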

2.4 Multifidelity Monte Carlo estimation

The MFMC estimator [38] uses the functions Q₁, …, Q_L up to the maximal level L to derive an estimate of E[Q(Z)], similarly to the multilevel Monte Carlo estimator; however, the functions Q₁, …, Q_L are combined in a different way than in the multilevel Monte Carlo estimator, and the numbers of samples m are selected by directly using correlation coefficients and costs instead of rates.

MFMC imposes on the number of samples m = [m₁, …, m_L]ᵀ that m₁ ≥ m₂ ≥ ⋯ ≥ m_L > 0. Let

\[ Z_1, \dots, Z_{m_1} \in D \tag{6} \]

be m₁ i.i.d. samples of the random variable Z. Let further

\[ Q_\ell(Z_1), \dots, Q_\ell(Z_{m_\ell}) \tag{7} \]

be the evaluations of Qℓ at the first mℓ samples Z₁, …, Z_{mℓ}, for ℓ = 1, …, L. Consider now the basic Monte Carlo estimators

\[ \hat{Q}^{\mathrm{MC}}_{\ell,m_\ell} = \frac{1}{m_\ell} \sum_{i=1}^{m_\ell} Q_\ell(Z_i), \qquad \ell = 1, \dots, L, \tag{8} \]

and

\[ \hat{Q}^{\mathrm{MC}}_{\ell,m_{\ell+1}} = \frac{1}{m_{\ell+1}} \sum_{i=1}^{m_{\ell+1}} Q_\ell(Z_i), \qquad \ell = 1, \dots, L-1, \tag{9} \]

which use the samples (6) and the evaluations (7). Note that the estimators in (9) use the first m_{ℓ+1} of the samples (6). Thus, the estimators Q̂^MC_{ℓ,mℓ}
and Q̂^MC_{ℓ,m_{ℓ+1}} are dependent for ℓ = 1, …, L − 1. The MFMC estimator Q̂^MF_{L,m} is defined as

\[ \hat{Q}^{\mathrm{MF}}_{L,\mathbf{m}} = \hat{Q}^{\mathrm{MC}}_{L,m_L} + \sum_{\ell=1}^{L-1} a_\ell \left( \hat{Q}^{\mathrm{MC}}_{\ell,m_\ell} - \hat{Q}^{\mathrm{MC}}_{\ell,m_{\ell+1}} \right), \]

where a = [a₁, …, a_{L−1}]ᵀ ∈ R^{L−1} are coefficients. The costs of the MFMC estimator Q̂^MF_{L,m} are

\[ c(\hat{Q}^{\mathrm{MF}}_{L,\mathbf{m}}) = \mathbf{w}^{\mathsf{T}} \mathbf{m}, \]

where w = [w₁, …, w_L]ᵀ, see [38].
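Note why this construction is unbiased with respect to E[Q_L(Z)] (cf. [38, Lemma 3.1], cited below): both basic Monte Carlo estimators in each correction term have mean E[Qℓ(Z)], so

\[ \mathbb{E}\left[ \hat{Q}^{\mathrm{MF}}_{L,\mathbf{m}} \right] = \mathbb{E}[Q_L(Z)] + \sum_{\ell=1}^{L-1} a_\ell \left( \mathbb{E}[Q_\ell(Z)] - \mathbb{E}[Q_\ell(Z)] \right) = \mathbb{E}[Q_L(Z)]. \]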


The MFMC method provides a framework to select the number of samples
m and the coefficients a such that the variance Var[Q b MF ] of the MFMC
L,m
MF MF
estimator Q
b
L,m with costs c(Qb
L,m ) = p is minimized for a given computational
budget p ∈ R+ . The number of samples m and the coefficients a are derived
under two assumptions on the correlation coefficients of Q1 (Z), . . . , QL (Z)
and the costs w1 , . . . , wL . The first assumption specifies the ordering of the
functions Q1 (Z), . . . , QL (Z).
Assumption 6 The random variables Q1 (Z), . . . , QL (Z) are ordered ascend-
ing with respect to the absolute values of the correlation coefficients

|ρL,1 | < |ρL,2 | < · · · < |ρL,L | .

The second assumption describes inequalities of the correlation coefficients and


the costs.
Assumption 7 The costs w1 , . . . , wL and correlation coefficients ρL,1 , . . . , ρL,L
satisfy
w`+1 ρ2L,`+1 − ρ2L,`
> 2
w` ρL,` − ρ2L,`−1
for ` = 1, . . . , L − 1.
Assumption 7 enforces that the cost savings associated with a model justify its
decrease in accuracy (measured by correlation) relative to other models in the
hierarchy. If a particular model violates the condition in Assumption 7, the
MFMC method omits the model from the hierarchy. See [38] for more details.
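As an illustration, a small sketch that checks the condition of Assumption 7 for a given hierarchy, using the convention ρ_{L,0} = 0 defined with (10) below; the arrays w and rho are hypothetical inputs (costs and correlations with the high-fidelity output), not from the paper.

    import numpy as np

    def satisfies_assumption_7(w, rho):
        """Check w_{l+1}/w_l > (rho_{L,l+1}^2 - rho_{L,l}^2) / (rho_{L,l}^2 - rho_{L,l-1}^2)."""
        w = np.asarray(w, dtype=float)
        rho0 = np.concatenate(([0.0], np.asarray(rho, dtype=float)))  # prepend rho_{L,0} = 0
        lhs = w[1:] / w[:-1]                                          # cost ratios
        rhs = (rho0[2:]**2 - rho0[1:-1]**2) / (rho0[1:-1]**2 - rho0[:-2]**2)
        return bool(np.all(lhs > rhs))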
Under Assumptions 6–7, the number of samples m and the coefficients a that minimize the variance Var[Q̂^MF_{L,m}] subject to the costs c(Q̂^MF_{L,m}) = p are given as follows [38]. The coefficients a^MF = [a^MF₁, …, a^MF_{L−1}]ᵀ are set to

\[ a^{\mathrm{MF}}_\ell = \frac{\rho_{L,\ell}\, \sigma_L}{\sigma_\ell}, \qquad \ell = 1, \dots, L-1, \]

and the number of samples m^MF = [m^MF₁, …, m^MF_L]ᵀ is set to

\[ m^{\mathrm{MF}}_\ell = m^{\mathrm{MF}}_L r_\ell, \qquad \ell = 1, \dots, L, \]

where

\[ r_\ell = \sqrt{ \frac{ w_L \left( \rho^2_{L,\ell} - \rho^2_{L,\ell-1} \right) }{ w_\ell \left( 1 - \rho^2_{L,L-1} \right) } }, \qquad \ell = 1, \dots, L, \tag{10} \]

with ρ_{L,0} = 0 and m^MF_L = p/(wᵀr), where r = [r₁, …, r_L]ᵀ. Note that the selection of m^MF and a^MF is independent of the rates α, β, γ, which means the approach is applicable also in situations where rates capture the properties of the functions Q₁, …, Q_L only poorly, see, e.g., [38] for examples. Note further that the components of m^MF are rounded up to integers as in the multilevel Monte Carlo method, see (5) in Section 2.3. We note that in [34] an integer optimization problem is solved to adapt the number of model evaluations in multilevel Monte Carlo for an increased tolerance to processor failures on massively parallel compute platforms.
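A compact sketch of the MFMC construction above, under Assumptions 6–7 (so that m₁ ≥ ⋯ ≥ m_L) and with the sample numbers rounded up as in the text; the model list, cost/correlation/variance inputs, and the sampler are hypothetical placeholders, not the authors' implementation.

    import numpy as np

    def mfmc_estimate(models, w, rho, sigma, p, sample_Z, rng):
        """MFMC estimate of E[Q_L(Z)] for a budget p, following [38].

        models[ell] evaluates Q_{ell+1}; w and sigma are costs and standard
        deviations; rho[ell] is the correlation of Q_{ell+1}(Z) with Q_L(Z).
        """
        L = len(models)
        w = np.asarray(w, dtype=float)
        rho = np.asarray(rho, dtype=float)
        sigma = np.asarray(sigma, dtype=float)
        rho0 = np.concatenate(([0.0], rho))                 # rho_{L,0} = 0
        r = np.sqrt(w[-1] * (rho0[1:]**2 - rho0[:-1]**2)
                    / (w * (1.0 - rho[-2]**2)))             # ratios r_ell from (10)
        a = rho[:-1] * sigma[-1] / sigma[:-1]               # coefficients a^MF
        m = np.ceil(p / np.dot(w, r) * r).astype(int)       # m_L = p/(w^T r), m_ell = m_L r_ell
        Z = sample_Z(m[0], rng)                             # m_1 shared i.i.d. samples of Z
        Y = [np.array([models[ell](z) for z in Z[:m[ell]]]) for ell in range(L)]
        est = np.mean(Y[-1])                                # Q^MC_{L,m_L}
        for ell in range(L - 1):                            # control-variate corrections
            est += a[ell] * (np.mean(Y[ell]) - np.mean(Y[ell][:m[ell + 1]]))
        return est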
The MFMC estimator is unbiased with respect to E[Q_L(Z)], see [38, Lemma 3.1]. The variance of the MFMC estimator Q̂^MF_{L,m^MF} is [38]

\[ \mathrm{Var}(\hat{Q}^{\mathrm{MF}}_{L,\mathbf{m}^{\mathrm{MF}}}) = \frac{\sigma_L^2 \left( 1 - \rho^2_{L,L-1} \right)}{\left( m^{\mathrm{MF}}_L \right)^2 w_L}\, p. \]

The work [38] investigates the costs and the MSE of the MFMC estimator only in the context of Assumption 6 and Assumption 7, and does not give insights into the behavior of the MFMC estimator if additionally Assumptions 1–5 are made.

3 New properties of the multifidelity Monte Carlo estimator

We now discuss the error and cost behavior of the MFMC estimator in a typical setting of multilevel Monte Carlo estimators, where Assumption 4 on the rate of the variance decay and Assumption 5 on the relative costs hold. Our main result is Theorem 1, which states that the MFMC estimator is efficient under Assumptions 1–7, meaning that the MFMC estimator achieves an MSE e(Q̂^MF_{L,m^MF}) ≲ ε with costs c(Q̂^MF_{L,m^MF}) ≲ ε^{-1}, independent of the rates α and γ. We first state Theorem 1, then prove two lemmata in Section 3.1, and provide the proof of Theorem 1 in Section 3.2. Corollary 1 discusses the convergence rates of MFMC if Assumption 5 is violated.

Theorem 1 With Assumptions 1–5, as well as Assumption 6 and Assumption 7, set the maximal level L as in (3) and set the budget p to

\[ p = \kappa_4 \varepsilon^{-1}, \tag{11} \]

with the constant

\[ \kappa_4 = 2 \frac{\sigma_{\mathrm{up}}^2}{\sigma_{\mathrm{low}}^2} \left( \frac{ s^{\frac{\gamma-\beta}{2}} }{ 1 - s^{\frac{\gamma-\beta}{2}} } \right)^2. \]

For the number of samples m^MF and the coefficients a^MF ∈ R^{L−1} defined in Section 2.4, the MSE e(Q̂^MF_{L,m^MF}) of the MFMC estimator with respect to the statistic E[Q(Z)] is bounded as

\[ e(\hat{Q}^{\mathrm{MF}}_{L,\mathbf{m}^{\mathrm{MF}}}) \lesssim \varepsilon, \]

and the costs are bounded as c(Q̂^MF_{L,m^MF}) ≲ ε^{-1}.

Note that the MLMC theory developed in [9, Theorem 1] and [18, Theorem 3.1] requires an additional assumption on the rate α because the rounding up of the numbers of samples to integers is explicitly taken into account, see also [4, Theorem 3.2]. We ignore the rounding here and therefore can avoid that assumption; however, we emphasize that we expect that a similar assumption is necessary for MFMC as well if the rounding of the numbers of samples is taken into account explicitly.

3.1 Preliminary lemmata

This section proves two lemmata that we use in the proof of Theorem 1 in
Section 3.2.

Lemma 1 Let L ∈ N be the maximal level. From Assumption 4, it follows that

\[ \mathrm{Var}[Q_L(Z) - Q_{\ell-1}(Z)] \lesssim s^{-\beta\ell}, \tag{12} \]

for ℓ = 2, …, L − 1.

Proof Let κ₂ be the constant in Assumption 4 so that we have

\[ \mathrm{Var}[Q_\ell(Z) - Q_{\ell-1}(Z)] \le \kappa_2 s^{-\beta\ell}, \]

for ℓ ∈ N. We obtain

\[ \mathrm{Var}[Q_{\ell+1}(Z) - Q_{\ell-1}(Z)] \le \mathrm{Var}[Q_{\ell+1}(Z) - Q_\ell(Z)] + \mathrm{Var}[Q_\ell(Z) - Q_{\ell-1}(Z)] + 2 \left| \mathrm{Cov}[Q_{\ell+1}(Z) - Q_\ell(Z),\, Q_\ell(Z) - Q_{\ell-1}(Z)] \right|. \tag{13} \]

With Assumption 4 and the Cauchy-Schwarz inequality, it follows that

\[ \mathrm{Var}[Q_{\ell+1}(Z) - Q_{\ell-1}(Z)] \le \kappa_2 s^{-\beta(\ell+1)} + \kappa_2 s^{-\beta\ell} + 2 \sqrt{ \mathrm{Var}[Q_{\ell+1}(Z) - Q_\ell(Z)]\, \mathrm{Var}[Q_\ell(Z) - Q_{\ell-1}(Z)] }, \]

and therefore we have

\[ \begin{aligned} \mathrm{Var}[Q_{\ell+1}(Z) - Q_{\ell-1}(Z)] &\le \kappa_2 s^{-\beta(\ell+1)} + \kappa_2 s^{-\beta\ell} + 2 \kappa_2 s^{-\beta(2\ell+1)/2} \\ &\le \kappa_2 s^{-\beta\ell} \left( s^{-\beta} + 1 + 2 s^{-\beta/2} \right) \\ &\le \kappa_2 s^{-\beta\ell} \left( s^{-\beta/2} + 1 + 2 s^{-\beta/2} \right), \end{aligned} \tag{14} \]

where the last inequality holds because s > 1. Define now the sequence (b_j) with

\[ b_0 = 1, \qquad b_j = s^{-\beta j/2} + b_{j-1} \left( 1 + 2 s^{-\beta j/2} \right), \qquad j \in \mathbb{N}. \]

From (14) and from the definition of the sequence (b_j), it follows by induction that

\[ \begin{aligned} \mathrm{Var}[Q_{\ell+j}(Z) - Q_{\ell-1}(Z)] &\le \kappa_2 s^{-\beta(\ell+j)} + \kappa_2 b_{j-1} s^{-\beta\ell} + 2 \kappa_2 s^{-\beta\ell} \left( b_{j-1} s^{-\beta j} \right)^{1/2} \\ &\le \kappa_2 s^{-\beta\ell} \left( s^{-\beta j} + b_{j-1} + 2 \left( b_{j-1} s^{-\beta j} \right)^{1/2} \right) \\ &\le \kappa_2 s^{-\beta\ell} \left( s^{-\beta j} + b_{j-1} + 2 b_{j-1} s^{-\beta j/2} \right) \\ &\le \kappa_2 s^{-\beta\ell} \left( s^{-\beta j/2} + b_{j-1} \left( 1 + 2 s^{-\beta j/2} \right) \right) \\ &\le \kappa_2 s^{-\beta\ell} b_j, \end{aligned} \]

because b_j ≥ 1 (and therefore b_j^{1/2} ≤ b_j) and s > 1 for j ∈ N. To bound the sequence (b_j), rewrite

\[ b_j = \sum_{i=0}^{j} s^{-\beta i/2} \prod_{r=i+1}^{j} \left( 1 + 2 s^{-\beta r/2} \right), \]

and observe that

\[ \prod_{r=i+1}^{j} \left( 1 + 2 s^{-\beta r/2} \right) \le \prod_{r=0}^{\infty} \left( 1 + 2 s^{-\beta r/2} \right). \]

The infinite product converges if and only if the series

\[ \sum_{r=0}^{\infty} 2 s^{-\beta r/2} \]

converges, which is the case because s > 1 and therefore s^{-β/2} < 1. Denote

\[ \prod_{r=0}^{\infty} \left( 1 + 2 s^{-\beta r/2} \right) = \kappa_5, \]

with the constant κ₅ ∈ R, and obtain a bound κ₆ ∈ R with

\[ b_j \le \kappa_5 \sum_{i=0}^{j} s^{-\beta i/2} \le \kappa_6, \qquad j \in \mathbb{N}. \]

Using b_j ≤ κ₆ and (14) shows the lemma.

Lemma 2 From Assumption 4, Assumption 3, and Lemma 1, it follows that

\[ \rho^2_{L,\ell} - \rho^2_{L,\ell-1} \lesssim \frac{1}{\sigma_{\mathrm{low}}^2}\, s^{-\beta\ell}. \]
Proof First, we ensure ρ_{L,ℓ} ≥ 0 for ℓ ∈ N without loss of generality by redefining Qℓ to −Qℓ if necessary, and subsequently using −Qℓ in the estimators (8) and (9). With the definition of the correlation coefficient (1), we obtain

\[ \begin{aligned} 0 \le \rho_{L,\ell} - \rho_{L,\ell-1} &= \rho_{L,\ell} - \frac{\mathrm{Cov}[Q_L(Z), Q_{\ell-1}(Z)]}{\sigma_L \sigma_{\ell-1}} \\ &= \rho_{L,\ell} - \frac{1}{\sigma_L \sigma_{\ell-1}} \mathrm{Cov}[Q_L(Z), Q_{\ell-1}(Z)] + \frac{1}{2} \frac{\sigma_L^2}{\sigma_L \sigma_{\ell-1}} - \frac{1}{2} \frac{\sigma_L^2}{\sigma_L \sigma_{\ell-1}} + \frac{1}{2} \frac{\sigma_{\ell-1}^2}{\sigma_L \sigma_{\ell-1}} - \frac{1}{2} \frac{\sigma_{\ell-1}^2}{\sigma_L \sigma_{\ell-1}} \\ &= \rho_{L,\ell} + \frac{1}{2 \sigma_L \sigma_{\ell-1}} \mathrm{Var}[Q_L(Z) - Q_{\ell-1}(Z)] - \frac{1}{2} \left( \frac{\sigma_L}{\sigma_{\ell-1}} + \frac{\sigma_{\ell-1}}{\sigma_L} \right), \end{aligned} \tag{15} \]

where we used

\[ \mathrm{Var}[Q_L(Z) - Q_{\ell-1}(Z)] = \mathrm{Var}[Q_L(Z)] + \mathrm{Var}[Q_{\ell-1}(Z)] - 2\, \mathrm{Cov}[Q_L(Z), Q_{\ell-1}(Z)]. \]

With x = σ_L/σ_{ℓ−1}, we can rewrite the last term in (15) as

\[ \frac{1}{2} \left( \frac{\sigma_L}{\sigma_{\ell-1}} + \frac{\sigma_{\ell-1}}{\sigma_L} \right) = \frac{1}{2} \left( x + \frac{1}{x} \right). \]

Because

\[ \frac{1}{2} \left( x + \frac{1}{x} \right) \ge 1 \]

holds for x ∈ R⁺, and because 0 ≤ ρ_{L,ℓ} ≤ 1 by definition, we obtain the following bound on ρ_{L,ℓ} − ρ_{L,ℓ−1}:

\[ \begin{aligned} 0 \le \rho_{L,\ell} - \rho_{L,\ell-1} &= \rho_{L,\ell} + \frac{1}{2 \sigma_L \sigma_{\ell-1}} \mathrm{Var}[Q_L(Z) - Q_{\ell-1}(Z)] - \frac{1}{2} \left( \frac{\sigma_L}{\sigma_{\ell-1}} + \frac{\sigma_{\ell-1}}{\sigma_L} \right) \\ &\le \frac{1}{2 \sigma_L \sigma_{\ell-1}} \mathrm{Var}[Q_L(Z) - Q_{\ell-1}(Z)] \\ &\lesssim \frac{1}{\sigma_{\mathrm{low}}^2}\, s^{-\beta\ell}, \end{aligned} \]

where we used Lemma 1 to bound Var[Q_L(Z) − Q_{ℓ−1}(Z)] and the lower bound σlow of Assumption 3. Since ρ_{L,ℓ} + ρ_{L,ℓ−1} ≤ 2, we obtain

\[ \rho^2_{L,\ell} - \rho^2_{L,\ell-1} = \left( \rho_{L,\ell} - \rho_{L,\ell-1} \right) \left( \rho_{L,\ell} + \rho_{L,\ell-1} \right) \lesssim \frac{1}{\sigma_{\mathrm{low}}^2}\, s^{-\beta\ell}. \]

3.2 Proof of main theorem

With the Lemmata 1–2 discussed in Section 3.1, we now prove Theorem 1.
Proof (of Theorem 1) The MSE of the MFMC estimator Q̂^MF_{L,m^MF} is split into the bias term and the variance term,

\[ e(\hat{Q}^{\mathrm{MF}}_{L,\mathbf{m}^{\mathrm{MF}}}) = \mathrm{Var}[\hat{Q}^{\mathrm{MF}}_{L,\mathbf{m}^{\mathrm{MF}}}] + \left( \mathbb{E}[Q(Z) - Q_L(Z)] \right)^2. \tag{16} \]

We first consider the bias term of the MSE. With the maximal level L defined as in (3), we obtain with Assumption 1

\[ \left( \mathbb{E}[Q(Z) - Q_L(Z)] \right)^2 \lesssim \frac{\varepsilon}{2}. \]

Consider now the variance term Var[Q̂^MF_{L,m^MF}]. Assumption 3 means that σℓ ≤ σup for ℓ = 1, …, L. We therefore have

\[ \mathrm{Var}[\hat{Q}^{\mathrm{MF}}_{L,\mathbf{m}^{\mathrm{MF}}}] \le \frac{\sigma_{\mathrm{up}}^2 \left( 1 - \rho^2_{L,L-1} \right)}{\left( m^{\mathrm{MF}}_L \right)^2 w_L}\, p = \frac{\sigma_{\mathrm{up}}^2 \left( 1 - \rho^2_{L,L-1} \right)}{p\, w_L} \left( \sum_{\ell=1}^{L} w_\ell r_\ell \right)^2, \]

where we used m^MF_L = p/(wᵀr) and r = [r₁, …, r_L]ᵀ defined in (10) in Section 2.4. Note that Assumptions 6–7 are required for m^MF and a^MF to be optimal in the sense defined in Section 2.4. We further have with the definition of r in (10) in Section 2.4 that

\[ \frac{\sigma_{\mathrm{up}}^2 \left( 1 - \rho^2_{L,L-1} \right)}{p\, w_L} \left( \sum_{\ell=1}^{L} w_\ell r_\ell \right)^2 = \frac{\sigma_{\mathrm{up}}^2}{p} \left( \sum_{\ell=1}^{L} \sqrt{ w_\ell \left( \rho^2_{L,\ell} - \rho^2_{L,\ell-1} \right) } \right)^2, \tag{17} \]

see [38, Proof of Corollary 3.5] for the transformations. With Assumption 2 and Lemma 2, we obtain

\[ \sum_{\ell=1}^{L} \sqrt{ w_\ell \left( \rho^2_{L,\ell} - \rho^2_{L,\ell-1} \right) } \lesssim \frac{1}{\sigma_{\mathrm{low}}} \sum_{\ell=1}^{L} \sqrt{ s^{\gamma\ell} s^{-\beta\ell} } \lesssim \frac{1}{\sigma_{\mathrm{low}}} \sum_{\ell=1}^{L} s^{\frac{\gamma-\beta}{2}\ell}. \tag{18} \]

Assumption 5 gives β > γ, and therefore s^{(γ−β)/2} < 1 (because s > 1). Therefore, we obtain with the geometric series that

\[ \sum_{\ell=1}^{L} \sqrt{ w_\ell \left( \rho^2_{L,\ell} - \rho^2_{L,\ell-1} \right) } \lesssim \frac{1}{\sigma_{\mathrm{low}}} \frac{s^{\frac{\gamma-\beta}{2}}}{1 - s^{\frac{\gamma-\beta}{2}}}. \]

This means that we have

\[ \mathrm{Var}[\hat{Q}^{\mathrm{MF}}_{L,\mathbf{m}^{\mathrm{MF}}}] \lesssim \frac{\sigma_{\mathrm{up}}^2}{p} \left( \frac{1}{\sigma_{\mathrm{low}}} \frac{s^{\frac{\gamma-\beta}{2}}}{1 - s^{\frac{\gamma-\beta}{2}}} \right)^2 = \frac{\kappa_4}{2p} = \frac{\varepsilon}{2}. \]

Thus, the variance term and the bias term are each bounded by ε/2 up to constants, and therefore the MSE is bounded by ε. The choice of the budget p in (11) leads to c(Q̂^MF_{L,m^MF}) ≲ ε^{-1}.
The following corollary considers the case where Assumption 5 is violated, i.e., where β ≤ γ.

Corollary 1 Consider the same setup as in Theorem 1, except that Assumption 5 is violated and that ε < e^{-1}. We obtain the following bounds on the costs:

\[ c(\hat{Q}^{\mathrm{MF}}_{L,\mathbf{m}^{\mathrm{MF}}}) \lesssim \begin{cases} \varepsilon^{-1} \ln(\varepsilon)^2, & \gamma = \beta, \\ \varepsilon^{-1 - \frac{\gamma-\beta}{2\alpha}}, & \gamma > \beta, \end{cases} \tag{19} \]

where ln denotes the logarithm with base e.

Proof Consider (18) in the proof of Theorem 1 and note that equation (18) holds even if Assumption 5 is violated. Note that the following proof closely follows [9, Theorem 1] and [18, Theorem 3.1].

We first consider the case γ > β and obtain

\[ \sum_{\ell=0}^{L} s^{\frac{\gamma-\beta}{2}\ell} = \frac{1 - s^{\frac{\gamma-\beta}{2}(L+1)}}{1 - s^{\frac{\gamma-\beta}{2}}} = \frac{s^{-\frac{\gamma-\beta}{2}} - s^{\frac{\gamma-\beta}{2}L}}{s^{-\frac{\gamma-\beta}{2}} - 1} = \frac{s^{-\frac{\gamma-\beta}{2}}}{s^{-\frac{\gamma-\beta}{2}} - 1} - \frac{s^{\frac{\gamma-\beta}{2}L}}{s^{-\frac{\gamma-\beta}{2}} - 1}. \]

Because γ > β and s > 1, we obtain for the first term

\[ \frac{s^{-\frac{\gamma-\beta}{2}}}{s^{-\frac{\gamma-\beta}{2}} - 1} \le 0, \]

and therefore

\[ \sum_{\ell=0}^{L} s^{\frac{\gamma-\beta}{2}\ell} \le \frac{s^{\frac{\gamma-\beta}{2}L}}{1 - s^{-\frac{\gamma-\beta}{2}}}. \]

With the definition of L in (3) and ⌈x⌉ ≤ x + 1 for x ∈ R, we obtain

\[ \frac{s^{\frac{\gamma-\beta}{2}L}}{1 - s^{-\frac{\gamma-\beta}{2}}} \le \frac{s^{\frac{\gamma-\beta}{2}}}{1 - s^{-\frac{\gamma-\beta}{2}}}\, s^{\frac{\gamma-\beta}{2\alpha} \log_s\left( \sqrt{2}\,\kappa_1\,\varepsilon^{-1/2} \right)} = \frac{s^{\frac{\gamma-\beta}{2}} \left( \sqrt{2}\,\kappa_1 \right)^{\frac{\gamma-\beta}{2\alpha}}}{1 - s^{-\frac{\gamma-\beta}{2}}}\, \varepsilon^{-\frac{\gamma-\beta}{4\alpha}}. \]

With the constant

\[ \kappa_7 = 2 \frac{\sigma_{\mathrm{up}}^2}{\sigma_{\mathrm{low}}^2} \left( \frac{s^{\frac{\gamma-\beta}{2}} \left( \sqrt{2}\,\kappa_1 \right)^{\frac{\gamma-\beta}{2\alpha}}}{1 - s^{-\frac{\gamma-\beta}{2}}} \right)^2 \]

and (17), we obtain

\[ \mathrm{Var}[\hat{Q}^{\mathrm{MF}}_{L,\mathbf{m}^{\mathrm{MF}}}] \lesssim \frac{\kappa_7}{2p}\, \varepsilon^{-\frac{\gamma-\beta}{2\alpha}}. \]

Thus, with p = κ₇ ε^{−1−(γ−β)/(2α)}, the bound (19) follows for the case γ > β.

Consider now the case γ = β. We obtain

\[ \sum_{\ell=0}^{L} s^{\frac{\gamma-\beta}{2}\ell} = L + 1 \le \alpha^{-1} \log_s\left( \sqrt{2}\,\kappa_1\,\varepsilon^{-1/2} \right) + 2 = \alpha^{-1} \log_s\left( \sqrt{2}\,\kappa_1 \right) + \alpha^{-1} \frac{\ln(\varepsilon^{-1})}{2 \ln(s)} + 2. \]

From ε < e^{-1} follows 1 ≤ ln(ε^{-1}), and therefore

\[ L + 1 \le \kappa_8 \ln(\varepsilon^{-1}), \]

with

\[ \kappa_8 = \alpha^{-1} \log_s\left( \sqrt{2}\,\kappa_1 \right) + \alpha^{-1} \frac{1}{2 \ln(s)} + 2. \]

Set

\[ p = 2 \frac{\sigma_{\mathrm{up}}^2}{\sigma_{\mathrm{low}}^2}\, \kappa_8^2\, \varepsilon^{-1} \ln(\varepsilon)^2, \]

where we used that ln(ε^{-1})² = ln(ε)², to obtain the bound (19) for the case γ = β.

4 Numerical experiment

This section demonstrates Theorem 1 numerically on an elliptic PDE with random coefficients.

4.1 Problem setup

Let G = (0, 1)² be a domain with boundary ∂G. Consider the linear elliptic PDE with random coefficients

\[ -\nabla \cdot \left( k(\omega, x) \nabla u(\omega, x) \right) = f(x), \qquad x \in G, \tag{20} \]
\[ u(\omega, x) = 0, \qquad x \in \partial G, \tag{21} \]

where u : Ω × Ḡ → R is the solution function defined on the set of outcomes Ω and the closure Ḡ of G. The coefficient k is given as

\[ k(\omega, x) = \sum_{i=1}^{d} z_i(\omega) \exp\left( - \frac{\| x - v_i \|_2}{0.045} \right), \]

where d = 9, Z = [z₁, …, z_d]ᵀ is a random vector with components that are independent and distributed uniformly in [10⁻⁴, 10⁻¹], and the points V = [v₁, …, v_d] ∈ R^{2×d} are given by the matrix

\[ V = \begin{bmatrix} 0.5 & 0.2 & 0.8 & 0.8 & 0.2 & 0 & 0.5 & 1 & 0.5 \\ 0.5 & 0.2 & 0.2 & 0.8 & 0.8 & 0.5 & 0 & 0.5 & 1 \end{bmatrix}. \]
                 rate              constant
Assumption 1     α ≈ 1.0579        κ₁ ≈ 4.0528 × 10¹
Assumption 2     γ ≈ 1.0551        κ₃ ≈ 2.3615 × 10⁻⁶
Assumption 4     β ≈ 1.9365        κ₂ ≈ 1.3744 × 10³

Table 1: The table reports the rates and constants of Assumptions 1, 2, and 4 that we estimated for our problem (20)–(21).

The domain D is D = [10⁻⁴, 10⁻¹]⁹. The right-hand side is set to f(x) = 10. The function Q : D → R is

\[ Q(Z(\omega)) = \left( \int_G u(\omega, x)^2 \,\mathrm{d}x \right)^{1/2}. \]

We are interested in estimating E[Q(Z)].

We discretize the problem (20)–(21) with piecewise bilinear finite elements on a rectangular grid in the domain G. The level ℓ defines the mesh width 2^{−ℓ} of the grid in one dimension. The solution of the discretized problem at level ℓ is denoted as uℓ, which gives rise to the functions

\[ Q_\ell(Z(\omega)) = \left( \int_G u_\ell(\omega, x)^2 \,\mathrm{d}x \right)^{1/2}, \qquad \ell \in \mathbb{N}. \]

Our reference estimate Q̂^Ref ≈ 10.54829 of E[Q(Z)] is a basic Monte Carlo estimate obtained from 10⁴ samples.
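As an illustration, a short sketch of drawing a realization of Z and evaluating the coefficient k at a point; the function names are our own illustrative choices, not the authors' implementation (which uses Matlab).

    import numpy as np

    # Centers v_1, ..., v_9 of the bumps in the coefficient k (columns of V above).
    V = np.array([[0.5, 0.2, 0.8, 0.8, 0.2, 0.0, 0.5, 1.0, 0.5],
                  [0.5, 0.2, 0.2, 0.8, 0.8, 0.5, 0.0, 0.5, 1.0]])

    def sample_coefficient(rng):
        """Draw z uniformly from [1e-4, 1e-1]^9 and return k(., x) for this realization."""
        z = rng.uniform(1e-4, 1e-1, size=9)
        def k(x):  # x is a point in G = (0, 1)^2, array of shape (2,)
            dist = np.linalg.norm(x[:, None] - V, axis=0)  # distances to the centers
            return np.sum(z * np.exp(-dist / 0.045))
        return k

    # Usage: k = sample_coefficient(np.random.default_rng(0)); k(np.array([0.5, 0.5]))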

4.2 Numerical illustration of the assumptions

Dirichlet problems such as (20)–(21) are well studied in the multilevel Monte Carlo literature. We therefore refer to the literature for theoretical considerations in the context of multilevel Monte Carlo for problem (20)–(21) and its variations [9, 8].
We estimate the rates of Assumptions 1–4 numerically from n = 104 sam-
ples Z1 , . . . , Zn of the random variable Z and the corresponding evaluations
of Q3 , . . . , Q8 . Consider first Assumption 1. We use basic Monte Carlo esti-
mators with n = 104 samples to estimate |E[Q8 (Z) − Q` (Z)]| for ` = 3, . . . , 7
and then find κ1 ∈ R+ and α ∈ R+ that best fit the estimates in the L2 norm.
Since the domain G is in a two-dimensional space, we set s = 22 = 4. Note
that we estimate the constant κ1 and the rate α with respect to Q8 instead
of Q. We follow [9] and ignore levels that lead to too coarse grids. Note that
a general discussion on which models to select for MFMC estimation is given
in [38, Section 3.5]. The behavior of |E[Q8 (Z) − Q` (Z)]| for ` = 3, . . . , 7 is
shown in Figure 1a. The constant κ1 and the rate α are reported in Table 1.
We repeat the same procedure to obtain the rates and constants of Assump-
tions 2–4, which are visualized in Figure 1 and reported in Table 1. Note that
our estimated rates satisfy β > γ, cf. Assumption 5.
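A sketch of one common way to recover a rate and constant from such level-wise estimates by least squares in log_s space; the authors fit in the L² norm, and the exact fitting procedure may differ from this stand-in.

    import numpy as np

    def fit_rate(levels, values, s=4.0):
        """Fit values[j] ~ kappa * s**(-rate * levels[j]) via least squares in log space."""
        levels = np.asarray(levels, dtype=float)
        A = np.column_stack([np.ones_like(levels), -levels])  # model: log_s kappa - rate * ell
        coef, *_ = np.linalg.lstsq(A, np.log(values) / np.log(s), rcond=None)
        return coef[1], s**coef[0]                            # (rate, kappa)

    # E.g., alpha, kappa1 = fit_rate([3, 4, 5, 6, 7], abs_errors)  # Assumption 1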
[Figure 1: four log-scale panels over the level ℓ: (a) expected absolute error E[|Q₈(Z) − Qℓ(Z)|], fitted rate α ≈ 1.0579; (b) runtime costs wℓ in seconds, fitted rate γ ≈ 1.0551; (c) decay of Var[Qℓ(Z) − Qℓ₋₁(Z)], fitted rate β ≈ 1.9365, and of ρ²_{8,ℓ} − ρ²_{8,ℓ−1}, fitted rate ≈ 1.9883; (d) variances Var[Qℓ(Z)], fitted rate ≈ 0.0078.]

Fig. 1: The plot in (a) shows that the rate of the decay of the expected absolute error is α ≈ 1, see Assumption 1. The plot in (b) reports the rate γ ≈ 1 of the increase of the runtime of the evaluations of Qℓ for ℓ = 3, …, 7, see Assumption 2. The plots in (c) and (d) report the behavior of the variance with respect to Assumption 4 and Assumption 3, respectively. Note that β > γ, as required by Assumption 5.

              costs [s]        variances   correlation coefficients
level ℓ=3     2.94 × 10⁻⁴      9.41        9.990894578 × 10⁻¹
level ℓ=4     8.77 × 10⁻⁴      9.40        9.999374083 × 10⁻¹
level ℓ=5     3.18 × 10⁻³      9.34        9.999961196 × 10⁻¹
level ℓ=6     1.54 × 10⁻²      9.10        9.999997721 × 10⁻¹
level ℓ=7     6.78 × 10⁻²      8.27        9.999999908 × 10⁻¹

Table 2: The table reports the costs w₃, …, w₇ of the functions Q₃, …, Q₇, and the sample estimates of the variances σ₃², …, σ₇² and the correlation coefficients ρ₈,₃, …, ρ₈,₇ of the random variables Q₃(Z), …, Q₇(Z), estimated from 10⁴ samples.

We measure the costs of evaluating the functions Q₃, …, Q₇ by averaging the runtime over 10⁴ runs. We use Matlab for the implementation and Matlab's backslash operator to solve systems of linear equations. The time measurements were performed on nodes with Xeon E5-1620 CPUs and 32GB RAM. The variances σ₃², …, σ₇² and the correlation coefficients ρ₈,₃, …, ρ₈,₇ are obtained from 10⁴ samples, see [38]. The costs w₃, …, w₇, the variances σ₃², …, σ₇², and the correlation coefficients ρ₈,₃, …, ρ₈,₇ are reported in Table 2.

[Figure 2: bar plots of the share of the samples per level (ℓ = 3, …, 7) for ε = 10⁰, …, 10⁻⁵: (a) multilevel Monte Carlo, (b) MFMC. The coarsest level ℓ = 3 receives about 87% of the samples under multilevel Monte Carlo and about 97% under MFMC.]

Fig. 2: The plots report the share of the number of samples of each level in the total number of samples. MFMC evaluates the coarsest model more often than multilevel Monte Carlo in this example.

4.3 Numerical illustration of main theorem

For ε ∈ {10⁰, 10⁻¹, …, 10⁻⁵}, we derive multilevel Monte Carlo and MFMC estimates of E[Q(Z)] following Section 2.3 and Section 2.4, respectively. The numbers of samples for the multilevel Monte Carlo estimators are derived using the rates in Table 1. The numbers of samples and the coefficients for the MFMC estimators are obtained using the costs, variances, and correlation coefficients reported in Table 2. Figure 2 compares the numbers of samples obtained with multilevel Monte Carlo and MFMC. The absolute numbers of samples are reported in Table 3 for multilevel Monte Carlo and in Table 4 for MFMC. Both methods lead to similar numbers of samples. MFMC assigns more samples to level ℓ = 3 than multilevel Monte Carlo. A detailed comparison is shown in Figure 3 for ε = 10⁻⁵, which illustrates that multilevel Monte Carlo distributes the number of samples logarithmically among the levels depending on the rates β and γ, see Section 2.3. MFMC directly uses the costs, variances, and correlation coefficients and derives a more fine-grained distribution among the levels than multilevel Monte Carlo. We refer to [38] for further investigations of the number of samples in the context of MFMC.
[Figure 3: bar plot of the share of the samples per level for ε = 10⁻⁵; multilevel Monte Carlo assigns 87.42% and MFMC 97.32% of all samples to the coarsest level ℓ = 3.]

Fig. 3: The bar plot shows a detailed comparison of the share of the samples determined by multilevel Monte Carlo (MLMC) and MFMC for ε = 10⁻⁵. Multilevel Monte Carlo distributes the number of samples logarithmically among the levels, whereas MFMC determines a fine-grained distribution of the number of samples. Thus, the bars have the same size on a logarithmic scale for multilevel Monte Carlo but different sizes for MFMC. The percent share of the total number of samples is shown next to each bar.
We repeat the multilevel Monte Carlo and the MFMC estimation 100 times and report in Figure 4 the estimated MSE

\[ \hat{e}(\hat{Q}) = \frac{1}{100} \sum_{i=1}^{100} \left( \hat{Q}_i - \hat{Q}^{\mathrm{Ref}} \right)^2, \tag{22} \]

where Q̂ᵢ is either a multilevel Monte Carlo estimator or an MFMC estimator, and where Q̂^Ref is the reference estimate, see Section 4.1. Figure 4 additionally shows error bars with length

\[ \frac{1}{99} \sum_{i=1}^{100} \left( \hat{e}(\hat{Q}) - \left( \hat{Q}_i - \hat{Q}^{\mathrm{Ref}} \right)^2 \right)^2, \tag{23} \]

which is an estimate of the variance of the error ê(Q̂) if ê(Q̂) is considered as a random variable. The estimated MSEs for the multilevel Monte Carlo and the MFMC estimators are reported in Figure 4. Both estimators lead to similar estimated MSEs, which is in agreement with Theorem 1. The runtimes of the multilevel Monte Carlo and the MFMC estimators are reported in Table 3 and Table 4, respectively.
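A direct transcription of (22) and (23) as a sketch; estimates holds the 100 repeated estimator values and Q_ref the reference estimate (hypothetical variable names).

    import numpy as np

    def estimated_mse(estimates, Q_ref):
        """Estimated MSE (22) and error-bar length (23) over repeated runs."""
        sq_err = (np.asarray(estimates, dtype=float) - Q_ref)**2
        e_hat = np.mean(sq_err)                                   # (22)
        var_e = np.sum((e_hat - sq_err)**2) / (len(sq_err) - 1)   # (23)
        return e_hat, var_e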

4.4 MFMC and coarse-grid (weakly-correlated) models

The random variables Q₃(Z), …, Q₇(Z) corresponding to the levels ℓ = 3, …, 7 are highly correlated to the random variable Q₈(Z), see Table 2.

[Figure 4: estimated MSE of multilevel Monte Carlo and MFMC, (a) with respect to the runtime in seconds and (b) with respect to the tolerance ε, for ε = 10⁰, …, 10⁻⁵.]

Fig. 4: The results are in agreement with Theorem 1, which states that the costs of the MFMC estimator with MSE e(Q̂^MF_{L,m^MF}) ≲ ε are bounded as c(Q̂^MF_{L,m^MF}) ≲ ε^{-1} under Assumptions 1–7. The behavior of the MFMC estimator is similar to the behavior of the multilevel Monte Carlo estimator.

Table 3: The table reports the number of samples used in the multilevel Monte Carlo estimator and the runtime in seconds. The runtime is averaged over 100 runs.

ε       ℓ=3          ℓ=4          ℓ=5          ℓ=6          ℓ=7          total        time [s]
10⁰     1.20 × 10¹   0            0            0            0            1.20 × 10¹   0.003
10⁻¹    1.20 × 10²   1.60 × 10¹   0            0            0            1.36 × 10²   0.054
10⁻²    1.19 × 10³   1.51 × 10²   1.90 × 10¹   0            0            1.36 × 10³   0.606
10⁻³    1.19 × 10⁴   1.50 × 10³   1.89 × 10²   2.40 × 10¹   0            1.36 × 10⁴   6.499
10⁻⁴    1.19 × 10⁵   1.50 × 10⁴   1.89 × 10³   2.38 × 10²   0            1.36 × 10⁵   64.955
10⁻⁵    1.19 × 10⁶   1.50 × 10⁵   1.88 × 10⁴   2.37 × 10³   2.99 × 10²   1.36 × 10⁶   674.360

Table 4: The table reports the number of samples used in the MFMC estimator. While the total number of samples is higher than in multilevel Monte Carlo (see Table 3), the multilevel Monte Carlo method requires more samples than MFMC at higher levels (i.e., more expensive evaluations), and thus the runtimes are about the same for each ε ∈ {10⁰, …, 10⁻⁵}.

ε       ℓ=3          ℓ=4          ℓ=5          ℓ=6          ℓ=7          total        time [s]
10⁰     1.20 × 10¹   0            0            0            0            1.20 × 10¹   0.003
10⁻¹    1.75 × 10²   4.00 × 10⁰   0            0            0            1.79 × 10²   0.051
10⁻²    1.88 × 10³   4.30 × 10¹   5.00 × 10⁰   0            0            1.92 × 10³   0.554
10⁻³    1.97 × 10⁴   4.65 × 10²   6.20 × 10¹   6.00 × 10⁰   0            2.02 × 10⁴   6.505
10⁻⁴    1.96 × 10⁵   4.64 × 10³   6.18 × 10²   5.80 × 10¹   0            2.02 × 10⁵   64.966
10⁻⁵    2.01 × 10⁶   4.80 × 10⁴   6.60 × 10³   7.24 × 10²   7.20 × 10¹   2.07 × 10⁶   674.380

              costs [s]        variances   correlation coefficients
level ℓ=1     1.12 × 10⁻⁴      3.23        7.761297293 × 10⁻¹
level ℓ=2     1.67 × 10⁻⁴      6.09        9.884862151 × 10⁻¹
level ℓ=3     2.94 × 10⁻⁴      9.41        9.990894578 × 10⁻¹
level ℓ=4     8.77 × 10⁻⁴      9.40        9.999374083 × 10⁻¹
level ℓ=5     3.18 × 10⁻³      9.34        9.999961196 × 10⁻¹

Table 5: The table reports the costs w₁, …, w₅ of the functions Q₁, …, Q₅, and the sample estimates of the variances σ₁², …, σ₅² and the correlation coefficients ρ₈,₁, …, ρ₈,₅ of the random variables Q₁(Z), …, Q₅(Z), estimated from 10⁴ samples. Note that the costs, variances, and correlation coefficients for levels ℓ = 3, …, 7 are reported in Table 2.

We now consider MFMC with Q₁(Z), Q₂(Z), …, Q₅(Z), where we include the levels ℓ = 1 and ℓ = 2. The estimated correlation coefficients, costs, and variances are reported in Table 5. The random variable Q₁(Z) corresponding to level ℓ = 1 is significantly more weakly correlated to Q₈(Z) than the random variables Q₂(Z), …, Q₅(Z). As in Section 4.2, we estimate the rates of Assumptions 1, 2, and 4, and obtain α ≈ 0.9255, β ≈ 1.6202, and γ ≈ 0.7160. These rates are similar to the rates reported in Table 1. Note that β > γ.

We derive multilevel Monte Carlo and MFMC estimates of E[Q(Z)] for ε ∈ {10¹, 10⁰, 10⁻¹, 10⁻²} and report the estimated MSE (22) in Figure 5. The error bars show the variance estimate (23). The results illustrate that MFMC achieves an estimated MSE that is in agreement with Theorem 1 also in this case, where the random variable Q₁(Z) corresponding to the coarsest discretization is only weakly correlated to Q₈(Z). Multilevel Monte Carlo and MFMC show a similar behavior. We refer to [38, Section 3.4, Section 4.3], where the performance of MFMC with weakly correlated models is further investigated analytically and numerically.

[Figure 5: estimated MSE of the multilevel Monte Carlo and MFMC estimators built from Q₁(Z), …, Q₅(Z), (a) with respect to the runtime in seconds and (b) with respect to the tolerance ε.]

Fig. 5: The plots report the estimated MSE of multilevel Monte Carlo and the MFMC estimators that combine Q₁(Z), …, Q₅(Z) corresponding to levels ℓ = 1, …, 5. The random variables Q₁(Z) and Q₂(Z) are only weakly correlated to Q₈(Z). The MFMC estimator shows a similar behavior as the multilevel Monte Carlo estimator.

5 Conclusions

The MFMC method provides a general framework for combining multiple


approximations into an estimator of statistics of a random variable that is
expensive (or impossible) to sample. We discussed MFMC in the special case
where sampling the random variable requires solving a PDE, and where we can
sample only approximations that correspond to a hierarchy of discretizations
22 References

Multilevel Monte Carlo Multilevel Monte Carlo


1e+01 MFMC 1e+01 MFMC
estimated MSE

estimated MSE
1e+00 1e+00
1e-01 1e-01
1e-02 1e-02
1e-03 1e-03
1e-02 1e-01 1e+00 1e+01 1e+00 1e-01 1e-02
runtime [s] tolerance 
(a) estimated MSE w.r.t. runtime in seconds (b) estimated MSE w.r.t. tolerance 

Fig. 5: The plots report the estimated MSE of multilevel Monte Carlo and the MFMC
estimators that combine Q1 (Z), . . . , Q5 (Z) corresponding to levels ` = 1, . . . , 5. The random
variables Q1 (Z) and Q2 (Z) are only weakly correlated to Q8 (Z). The MFMC estimator
shows a similar behavior as the multilevel Monte Carlo estimator.

of the PDE. In this setting, and under standard assumptions on the discretiza-
tions of the PDE, the MFMC estimator is efficient, which means that the costs
of the MFMC estimator with MSE below a threshold are bounded linearly in
the threshold. Our numerical results illustrated the theory.

Acknowledgment

The first and the third author were supported in part by the AFOSR MURI on multi-information sources of multi-physics systems under Award Number FA9550-15-1-0038, program manager Jean-Luc Cambier, and by the United States Department of Energy Applied Mathematics Program, Awards DE-FG02-08ER2585 and DE-SC0009297, as part of the DiaMonD Multifaceted Mathematics Integrated Capability Center. The second author was supported by the US Department of Energy Office of Science grant DE-SC0009324 and the Air Force Office of Scientific Research grant FA9550-15-1-0001. Some of the numerical examples were computed on the computer cluster of the Munich Centre of Advanced Computing.

References

1. I. Babuška, F. Nobile, and R. Tempone. A stochastic collocation method for elliptic partial differential equations with random input data. SIAM Journal on Numerical Analysis, 45(3):1005–1034, 2007.
2. A. Barth, C. Schwab, and N. Zollinger. Multi-level Monte Carlo Finite Element method
for elliptic PDEs with stochastic coefficients. Numerische Mathematik, 119(1):123–161,
Sept. 2011.
3. P. Benner, S. Gugercin, and K. Willcox. A survey of projection-based model reduction
methods for parametric dynamical systems. SIAM Review, 57(4):483–531, 2015.
4. C. Bierig and A. Chernov. Convergence analysis of multilevel Monte Carlo variance
estimators and application for random obstacle problems. Numerische Mathematik,
130(4):579–613, 2015.

5. C. Bierig and A. Chernov. Approximation of probability density functions by the multilevel Monte Carlo maximum entropy method. Journal of Computational Physics, 314:661–681, 2016.
6. C. Bierig and A. Chernov. Estimation of arbitrary order central statistical moments
by the multilevel Monte Carlo method. Stochastics and Partial Differential Equations
Analysis and Computations, 4(1):3–40, 2016.
7. H.-J. Bungartz and M. Griebel. Sparse grids. Acta Numerica, 13:1–123, 2004.
8. J. Charrier, R. Scheichl, and A. L. Teckentrup. Finite element error analysis of elliptic
PDEs with random coefficients and its application to multilevel Monte Carlo methods.
SIAM Journal on Numerical Analysis, 51(1):322–352, 2013.
9. K. A. Cliffe, M. Giles, R. Scheichl, and A. L. Teckentrup. Multilevel Monte Carlo
methods and applications to elliptic PDEs with random coefficients. Computing and
Visualization in Science, 14(1):3–15, 2011.
10. N. Collier, A.-L. Haji-Ali, F. Nobile, E. von Schwerin, and R. Tempone. A continuation
multilevel Monte Carlo algorithm. BIT Numerical Mathematics, 55(2):399–432, 2015.
11. C. Cortes and V. Vapnik. Support-vector networks. Machine Learning, 20(3):273–297,
1995.
12. A. I. J. Forrester and A. J. Keane. Recent advances in surrogate-based optimization.
Progress in Aerospace Sciences, 45(1–3):50–79, Jan. 2009.
13. A. I. J. Forrester, A. Sóbester, and A. J. Keane. Engineering design via surrogate modelling: a practical guide. Wiley, 2008.
14. F. Franzelin, P. Diehl, and D. Pflüger. Non-intrusive uncertainty quantification with
sparse grids for multivariate peridynamic simulations. In M. Griebel and A. M.
Schweitzer, editors, Meshfree Methods for Partial Differential Equations VII, pages
115–143, Cham, 2015. Springer International Publishing.
15. F. Franzelin and D. Pflüger. From data to uncertainty: An efficient integrated data-
driven sparse grid approach to propagate uncertainty. In J. Garcke and D. Pflüger,
editors, Sparse Grids and Applications - Stuttgart 2014, pages 29–49, Cham, 2016.
Springer International Publishing.
16. T. Gerstner and M. Griebel. Numerical integration using sparse grids. Numerical
Algorithms, 18(3):209–232, 1998.
17. T. Gerstner and M. Griebel. Dimension–adaptive tensor–product quadrature. Comput-
ing, 71(1):65–87, 2003.
18. M. Giles. Multi-level Monte Carlo path simulation. Operations Research, 56(3):607–617,
2008.
19. M. Griebel, H. Harbrecht, and M. Peters. Multilevel quadrature for elliptic paramet-
ric partial differential equations on non-nested meshes. Stochastic Partial Differential
Equations: Analysis and Computations, 2015. submitted.
20. S. Gugercin, A. Antoulas, and C. Beattie. H2 Model Reduction for Large-Scale Linear
Dynamical Systems. SIAM Journal on Matrix Analysis and Applications, 30(2):609–
638, Jan. 2008.
21. M. D. Gunzburger, C. G. Webster, and G. Zhang. Stochastic finite element methods
for partial differential equations with random input data. Acta Numerica, 23:521–650,
5 2014.
22. A.-L. Haji-Ali, F. Nobile, and R. Tempone. Multi-index Monte Carlo: when sparsity
meets sampling. Numerische Mathematik, 132(4):767–806, June 2015.
23. J. M. Hammersley and D. C. Handscomb. Monte Carlo Methods. Methuen London,
1964.
24. H. Harbrecht, M. Peters, and M. Siebenmorgen. On multilevel quadrature for elliptic
stochastic partial differential equations. In J. Garcke and M. Griebel, editors, Sparse
Grids and Applications, pages 161–179, Berlin, Heidelberg, 2013. Springer Berlin Hei-
delberg.
25. H. Harbrecht, M. Peters, and M. Siebenmorgen. Multilevel accelerated quadrature
for PDEs with log-normally distributed diffusion coefficient. SIAM/ASA Journal on
Uncertainty Quantification, 4(1):520–551, 2016.
26. S. Heinrich. Multilevel Monte Carlo Methods. In S. Margenov, J. Waśniewski, and
P. Yalamov, editors, Large-Scale Scientific Computing, number 2179 in Lecture Notes
in Computer Science, pages 58–67. Springer, 2001.

27. J. Li and D. Xiu. Evaluation of failure probability via surrogate models. Journal of
Computational Physics, 229(23):8966–8980, 2010.
28. J. S. Liu. Monte Carlo strategies in scientific computing. Springer, 2008.
29. A. J. Majda and B. Gershgorin. Quantifying uncertainty in climate change science
through empirical information theory. Proceedings of the National Academy of Sciences
of the United States of America, 107(34):14958–14963, Aug. 2010.
30. B. L. Nelson. On control variate estimators. Computers & Operations Research,
14(3):219–225, 1987.
31. L. Ng and K. Willcox. Multifidelity approaches for optimization under uncertainty.
International Journal for Numerical Methods in Engineering, 100(10):746–772, 2014.
32. L. Ng and K. Willcox. Monte-Carlo information-reuse approach to aircraft conceptual
design optimization under uncertainty. Journal of Aircraft, pages 1–12, 2015.
33. F. Nobile, R. Tempone, and C. G. Webster. A sparse grid stochastic collocation method
for partial differential equations with random input data. SIAM Journal on Numerical
Analysis, 46(5):2309–2345, 2008.
34. S. Pauli and P. Arbenz. Determining optimal multilevel Monte Carlo parameters with
application to fault tolerance. Computers & Mathematics with Applications, 70(11):2638
– 2651, 2015.
35. S. Pauli, P. Arbenz, and C. Schwab. Intrinsic fault tolerance of multilevel Monte Carlo
methods. Journal of Parallel and Distributed Computing, 84:24 – 36, 2015.
36. B. Peherstorfer, T. Cui, Y. Marzouk, and K. Willcox. Multifidelity importance sampling.
Computer Methods in Applied Mechanics and Engineering, 300:490 – 509, 2016.
37. B. Peherstorfer and K. Willcox. Online adaptive model reduction for nonlinear systems
via low-rank updates. SIAM Journal on Scientific Computing, 37(4):A2123–A2150,
2015.
38. B. Peherstorfer, K. Willcox, and M. Gunzburger. Optimal model management for multi-
fidelity Monte Carlo estimation. SIAM Journal on Scientific Computing, 38(5):A3163–
A3194, 2016.
39. C. Robert and G. Casella. Monte Carlo Statistical Methods. Springer, 2004.
40. G. Rozza, D. Huynh, and A. Patera. Reduced basis approximation and a posteriori
error estimation for affinely parametrized elliptic coercive partial differential equations.
Archives of Computational Methods in Engineering, 15(3):1–47, 2007.
41. L. Sirovich. Turbulence and the dynamics of coherent structures. Quarterly of Applied
Mathematics, 45:561–571, 1987.
42. A. L. Teckentrup, R. Scheichl, M. Giles, and E. Ullmann. Further analysis of mul-
tilevel Monte Carlo methods for elliptic PDEs with random coefficients. Numerische
Mathematik, 125(3):569–600, 2013.
43. E. Ullmann, H. C. Elman, and O. G. Ernst. Efficient iterative solvers for stochastic
Galerkin discretizations of log-transformed random diffusion problems. SIAM Journal
on Scientific Computing, 34(2):A659–A682, 2012.
44. E. Ullmann and I. Papaioannou. Multilevel estimation of rare events. SIAM/ASA
Journal on Uncertainty Quantification, 3(1):922–953, 2015.
45. E. Ullmann and C. E. Powell. Solving log-transformed random diffusion problems by
stochastic Galerkin mixed finite element methods. SIAM/ASA Journal on Uncertainty
Quantification, 3(1):509–534, 2015.
46. V. Vapnik. Statistical Learning Theory. Wiley, 1998.
47. D. Xiu. Fast numerical methods for stochastic computations: A review. Communications
in computational physics, 5:242–272, 2009.
