Variance Reduction Techniques in Monte Carlo Methods: SSRN Electronic Journal November 2010

This document discusses variance reduction techniques in Monte Carlo methods. Monte Carlo methods are used to estimate quantities in statistical models through computer simulations. Variance reduction techniques are needed to improve efficiency since basic Monte Carlo sampling has high variance. The document introduces common variance reduction techniques like antithetic variables, importance sampling, control variates, and stratified sampling. It also discusses estimating probabilities of rare events, which is an important application area for variance reduction techniques.

Variance Reduction Techniques in Monte Carlo Methods

Article in SSRN Electronic Journal · November 2010

DOI: 10.2139/ssrn.1715474

All content following this page was uploaded by Jack P.C. Kleijnen on 26 March 2019.

No. 2010-117

VARIANCE REDUCTION TECHNIQUES IN MONTE CARLO METHODS

By J. P. C. Kleijnen, A. A. N. Ridder, R.Y. Rubinstein

November 2010

ISSN 0924-7815

Variance Reduction Techniques in Monte Carlo Methods

Jack P. C. Kleijnen (1), Ad A. N. Ridder (2) and Reuven Y. Rubinstein (3)

(1) Tilburg University, Tilburg, The Netherlands, [email protected]

(2) Vrije University, Amsterdam, The Netherlands, [email protected]
(3) Technion, Haifa, Israel, [email protected]

Keywords: common random numbers, antithetic random numbers, importance sampling, control variates, conditioning, stratified sampling, splitting, quasi Monte Carlo

JEL: C0, C1, C9

INTRODUCTION
Monte Carlo methods are simulation algorithms to estimate a numerical quantity in a statistical model of a real system. These algorithms are executed by computer programs. Variance reduction techniques (VRT) are needed, even though computer speed has been increasing dramatically ever since the introduction of computers. This increased computer power has stimulated simulation analysts to develop ever more realistic models, so that the net result has not been faster execution of simulation experiments; e.g., some modern simulation models need hours or days for a single 'run' (one replication of one scenario or combination of simulation input values). Moreover, there are some simulation models that represent rare events (events which have extremely small probabilities of occurrence), so even modern computers would take 'for ever' (centuries) to execute a single run, were it not that special VRT can reduce these excessively long runtimes to practical magnitudes.

Preliminaries
In this contribution the focus is to estimate a quantity

$$\ell = E(H(\mathbf{Y})), \tag{1}$$

where $H(\mathbf{Y})$ is the performance function driven by an input vector $\mathbf{Y}$ with probability density function $f(\mathbf{y})$. To estimate $\ell$ through simulation, one generates a random sample $\mathbf{Y}_i$ with $i = 1, \ldots, N$ from $f(\mathbf{y})$, computes the sample function $H(\mathbf{Y}_i)$, and the sample-average estimator

$$\hat{\ell}_N = \frac{1}{N} \sum_{i=1}^{N} H(\mathbf{Y}_i).$$

This is called crude Monte Carlo sampling (CMC). The resulting sample-average estimator is an unbiased estimator for $\ell$. Furthermore, as $N$ gets large, laws of large numbers may be invoked (assuming simple conditions) to verify that the sample-average estimator stochastically converges to the actual quantity to be estimated. The efficiency of the estimator is captured by its relative error (RE), i.e., the standard error divided by the mean: $\mathrm{RE} = \sqrt{\mathrm{Var}(\hat{\ell}_N)} / E(\hat{\ell}_N)$. Applying the Central Limit Theorem, one easily gets that $z_{1-\alpha/2}\,\mathrm{RE} < \varepsilon$, where $z_{1-\alpha/2}$ is the $(1-\alpha/2)$th quantile of the standard normal distribution (typically one takes $\alpha = 0.05$ so $z_{1-\alpha/2} = 1.96$), if and only if

$$P\left( \left| \frac{\hat{\ell}_N - \ell}{\ell} \right| < \varepsilon \right) > 1 - \alpha. \tag{2}$$

When (2) holds, the estimator is said to be $(1-\alpha, \varepsilon)$-efficient.
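
The CMC estimator and its relative error can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the test problem $\ell = E(e^{U})$ with $U$ uniform on $(0,1)$, the function name `crude_monte_carlo`, and the 1% target for $\varepsilon$ are not from the text; they are chosen because the exact answer $e - 1$ is known.

```python
import math
import random

def crude_monte_carlo(h, sampler, n, rng):
    """Return (estimate, relative_error) of E[h(Y)] from n IID samples."""
    values = [h(sampler(rng)) for _ in range(n)]
    mean = sum(values) / n
    # Unbiased sample variance of a single observation h(Y_i).
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    std_err = math.sqrt(var / n)      # standard error of the sample average
    return mean, std_err / mean       # RE = standard error / mean

rng = random.Random(12345)
# Hypothetical test problem: ell = E[exp(U)] = e - 1 with U ~ Uniform(0, 1).
estimate, rel_err = crude_monte_carlo(math.exp, lambda r: r.random(), 100_000, rng)
z = 1.96                              # z_{1 - alpha/2} for alpha = 0.05
print("estimate:", estimate, "exact:", math.e - 1)
print("relative error:", rel_err)
print("(95%, 1%)-efficient according to z * RE < eps:", z * rel_err < 0.01)
```

In practice the relative error is itself estimated from the sample, as done here, so the efficiency check is only approximate.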

To illustrate, consider the one-dimensional version of (1):

$$\ell = \int h(y) f(y)\, dy.$$

Monte Carlo integration is a good way to estimate the value of the integral when the dimension is much higher than one, but the concept is still the same. Monte Carlo integration has become an important tool in financial engineering for pricing financial products such as options, futures, and swaps (Glasserman, 2003). This Monte Carlo estimate samples $Y_1, \ldots, Y_N$ independently from $f$ and calculates

$$\hat{\ell}_N = \frac{1}{N} \sum_{i=1}^{N} h(Y_i).$$

Then $\hat{\ell}_N$ is an unbiased estimator for $\ell$, and the standard error is

$$\sqrt{\mathrm{Var}(\hat{\ell}_N)} = \sqrt{\frac{1}{N} \mathrm{Var}(h(Y))} = \sqrt{\frac{1}{N} E\big(h(Y) - \ell\big)^2} = \sqrt{\frac{1}{N} \int (h(y) - \ell)^2 f(y)\, dy}.$$

Hence, the relative error (or efficiency) of the estimator is proportional to $1/\sqrt{N}$. This is a poor efficiency in case of high-dimensional problems where the generation of a single output vector is costly and consumes large computing time and memory. VRT improve efficiency if they indeed require smaller sample sizes. To be more specific, consider again the performance measure (1), and assume that besides the CMC-estimator $\hat{\ell}_N$, a VRT results in another unbiased estimator, denoted $\hat{\ell}_N^*$, also based on a sample of $N$ independent and identical observations. The VRT-estimator is said to be statistically more efficient than the CMC-estimator if

$$\mathrm{Var}(\hat{\ell}_N^*) < \mathrm{Var}(\hat{\ell}_N).$$

Then one usually computes the reduction factor for the variance:

$$\frac{\mathrm{Var}(\hat{\ell}_N) - \mathrm{Var}(\hat{\ell}_N^*)}{\mathrm{Var}(\hat{\ell}_N)} \times 100\%.$$

Notice that this factor does not depend on the sample size $N$. Suppose that the reduction factor is $100r\%$, so $r = 1 - \mathrm{Var}(\hat{\ell}^*)/\mathrm{Var}(\hat{\ell})$, and suppose that $(1-\alpha, \varepsilon)$-efficiency is desired. The required sample size for the CMC-estimator is $N$, given by $z_{1-\alpha/2}\,\mathrm{RE} = \varepsilon$, which holds iff

$$\frac{\ell^2 \varepsilon^2}{z_{1-\alpha/2}^2} = \mathrm{Var}(\hat{\ell}_N) = \frac{1}{N} \mathrm{Var}(\hat{\ell}_1) \iff N = \frac{z_{1-\alpha/2}^2}{\ell^2 \varepsilon^2} \mathrm{Var}(\hat{\ell}_1).$$

The same reasoning holds for the VRT-estimator with a required sample size $N^*$. Consequently, the reduction in sample size becomes

$$\frac{N - N^*}{N} = \frac{\mathrm{Var}(\hat{\ell}_1) - \mathrm{Var}(\hat{\ell}_1^*)}{\mathrm{Var}(\hat{\ell}_1)} = r,$$

which is the same reduction as for the variance.

Generating samples under a VRT generally consumes more computer time (exceptions are antithetic and common random numbers; see next section). Thus, to make a fair comparison with CMC, the computing time should be incorporated when assessing efficiency improvement. Therefore, denote the required time to compute $\hat{\ell}_N$ by $\mathrm{TM}(\hat{\ell}_N)$. Then the effort of an estimator may be defined to be the product of its variance and its computing time: $\mathrm{EFFORT}(\hat{\ell}_N) = \mathrm{Var}(\hat{\ell}_N) \times \mathrm{TM}(\hat{\ell}_N)$. Notice that the effort does not depend on the sample size, if the computing time of $N$ samples equals $N$ times the computing time of a single sample. Then the VRT-estimator $\hat{\ell}_N^*$ is called more efficient than the CMC-estimator $\hat{\ell}_N$ if the former requires less effort:

$$\mathrm{EFFORT}(\hat{\ell}_N^*) < \mathrm{EFFORT}(\hat{\ell}_N).$$

Again, a reduction factor for the effort can be defined, and one can analyze the reduction in computer time needed to obtain $(1-\alpha, \varepsilon)$-efficiency.

Estimating the Probability of Rare Events

An important class of statistical problems assesses probabilities of risky or undesirable events. These problems have become an important issue in many fields; examples are found in reliability systems (system failure), risk management (value-at-risk), financial engineering (credit default), insurance (ruin), and telecommunication (packet loss); see Juneja and Shahabuddin (2006); Rubino and Tuffin (2009). These problems can be denoted in the format of this contribution by assuming that a set $A$ contains all the risky or undesirable input vectors $\mathbf{y}$, so that (1) becomes

$$\ell = P(A) = P(\mathbf{Y} \in A) = E(I_A(\mathbf{Y})),$$

where $I_A$ is the indicator function of the set $A$ (and thus in (1) $H = I_A$). The standard error of the Monte Carlo estimator is easily computed as $\sqrt{\ell(1-\ell)/N}$. Hence, the relative error becomes

$$\mathrm{RE} = \frac{\sqrt{\ell(1-\ell)}}{\ell \sqrt{N}} = \frac{\sqrt{1-\ell}}{\sqrt{\ell N}}. \tag{3}$$

This equation implies that the sample size is inversely proportional to the target probability $\ell$ when requiring a prespecified efficiency; for instance, to obtain (95%,10%)-efficiency, the sample size should be $N \approx 385(1-\ell)/\ell$. This leads immediately to the main issue of this contribution; namely $\ell \ll 1$, so $A$ is called a rare event. To illustrate, suppose $\mathbf{Y} = (Y_1, \ldots, Y_n)$, where $Y_j$ ($j = 1, \ldots, n$) are identically and independently distributed (IID) with finite mean $\mu = E(Y_1)$ and standard deviation $\sigma = \sqrt{\mathrm{Var}(Y_1)}$. Denote their sum by $S(\mathbf{Y}) = Y_1 + \cdots + Y_n$, and let the rare event be $A = \{S(\mathbf{Y}) > n(\mu + \delta)\}$ for a positive $\delta$. A normal approximation with $n = 500$, $\delta = 0.5$, and $\sigma = 1$ gives $\ell \approx$ 2.5E-29. A (95%,10%)-efficient CMC-estimator would need sample size $N \approx$ 1.5E+31, which is impossible to realize. For example, the practical problem might require the daily simulation of a financial product for a period of two years in which a single normal variate needs to be generated per simulated day. Fast algorithms for normal variate generation on standard PCs require about 20 seconds for E+9 samples. This gives only E+5 vector samples $\mathbf{Y}$ per second. Note that the number of calls of the random number generator (RNG) is at least $N \times n$, which in our numerical example equals 7.5E+33; this number is large, but modern RNGs can meet this requirement (L'Ecuyer, 2006).
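
The numbers in this example can be checked with a short calculation. The sketch below assumes the normal approximation stated in the text, so the tail probability is $P(Z > \delta\sqrt{n}/\sigma)$ for a standard normal $Z$; the helper names `normal_tail` and `cmc_sample_size` are hypothetical. It uses $(1.96/0.1)^2 \approx 384.2$ instead of the rounded constant 385.

```python
import math

def normal_tail(x):
    """P(Z > x) for a standard normal Z, computed via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def cmc_sample_size(ell, eps=0.10, z=1.96):
    """Sample size N solving z * RE = eps with RE from equation (3)."""
    return (z / eps) ** 2 * (1.0 - ell) / ell

n, delta, sigma = 500, 0.5, 1.0
ell = normal_tail(delta * math.sqrt(n) / sigma)   # normal approximation of P(A)
N = cmc_sample_size(ell)                          # required CMC sample size
rng_calls = N * n                                 # at least one RNG call per component

print(f"ell       ~ {ell:.2e}")
print(f"N         ~ {N:.2e}")
print(f"RNG calls ~ {rng_calls:.2e}")
```

The printed orders of magnitude should agree with the values 2.5E-29, 1.5E+31, and 7.5E+33 quoted above.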

In conclusion, the desired level of efficiency of the CMC estimator for rare event problems requires sample sizes that go far beyond available resources. Hence, researchers have looked for ways to reduce the variance of the estimator as much as possible for the same amount of sampling resources. Traditional VRTs are common random numbers, antithetic variates, control variates, conditioning, stratified sampling and importance sampling (Law, 2007; Rubinstein and Kroese, 2008). Modern VRTs include splitting techniques, and quasi-Monte Carlo sampling (Asmussen and Glynn, 2007; Glasserman, 2003).

ANTITHETIC AND COMMON RANDOM NUMBERS

Consider again the problem of estimating $\ell = E(H(\mathbf{Y}))$ defined in (1). Now let $\mathbf{Y}_1$ and $\mathbf{Y}_2$ be two input samples generated from $f(\mathbf{y})$. Denote $X_i = H(\mathbf{Y}_i)$ with $i = 1, 2$. Then $\hat{\ell} = (X_1 + X_2)/2$ is an unbiased estimator of $\ell$ with variance

$$\mathrm{Var}(\hat{\ell}) = \frac{1}{4} \big( \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + 2\,\mathrm{Cov}(X_1, X_2) \big).$$

If $X_1$ and $X_2$ were independent (as is the case in CMC), then $\mathrm{Var}(\hat{\ell})$ would be $\frac{1}{4}(\mathrm{Var}(X_1) + \mathrm{Var}(X_2))$. Obviously, variance reduction is obtained if $\mathrm{Cov}(X_1, X_2) < 0$. The usual way to make this covariance negative is as follows. Whenever the uniform random number $U$ is used for a particular purpose (for example, the second service time) in generating $\mathbf{Y}_1$, use the antithetic number $1 - U$ for the same purpose to generate $\mathbf{Y}_2$. Because $U$ and $1 - U$ have correlation coefficient $-1$, it is to be expected that $\mathrm{Cov}(X_1, X_2) < 0$. This can be formalized by the following technical conditions.

(a). The sample vector $\mathbf{Y} = (Y_1, \ldots, Y_n)$ has components $Y_j$ that are one-dimensional, independent random variables with distribution functions $F_j$ that are generated by the inverse transformation method; i.e., $Y_j = F_j^{-1}(U_j)$, for $j = 1, \ldots, n$.

(b). The performance function $H$ is monotone.

Under these conditions, negative correlation can be proved (Rubinstein and Kroese, 2008). In condition (a) the inverse transformation requirement can be replaced by the assumption that all $Y_j$-components are Gaussian: when $Y \sim N(\mu, \sigma^2)$, then $\tilde{Y} = 2\mu - Y \sim N(\mu, \sigma^2)$, and clearly $Y$ and $\tilde{Y}$ are negatively correlated. This alternative assumption is typically applied in financial engineering for option pricing (Glasserman, 2003).
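
A minimal sketch of antithetic sampling follows. The test problem is an assumption made for illustration: $H(Y) = e^{Y}$ with $Y = U$ uniform on $(0,1)$, so the inverse transformation is the identity and $H$ is monotone, which matches conditions (a) and (b). The function name `pair_averages` is hypothetical.

```python
import math
import random
import statistics

def pair_averages(h, n_pairs, rng, antithetic):
    """Return n_pairs values of (h(U1) + h(U2)) / 2.

    With antithetic=True the second input is 1 - U1; otherwise it is an
    independent uniform, which corresponds to crude Monte Carlo.
    """
    averages = []
    for _ in range(n_pairs):
        u1 = rng.random()
        u2 = 1.0 - u1 if antithetic else rng.random()
        averages.append(0.5 * (h(u1) + h(u2)))
    return averages

rng = random.Random(2010)
independent = pair_averages(math.exp, 10_000, rng, antithetic=False)
antithetic = pair_averages(math.exp, 10_000, rng, antithetic=True)

print("exact value        :", math.e - 1)
print("independent pairs  :", statistics.fmean(independent),
      "variance per pair:", statistics.variance(independent))
print("antithetic pairs   :", statistics.fmean(antithetic),
      "variance per pair:", statistics.variance(antithetic))
```

Both variants use two function evaluations per pair, so the comparison of the per-pair variances is a fair one for this toy problem.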

The method of common random numbers (CRN) is often applied in practice, because simulationists find it natural to compare alternative systems under 'the same circumstances'; for example, they compare different queueing disciplines (such as First-In-First-Out or FIFO, Last-In-First-Out or LIFO, Shortest-Jobs-First or SJF) using the same sampled arrival and service times in the simulation.

To be more specific, let $\mathbf{Y}$ be an input vector for two system performances $E(H_1(\mathbf{Y}))$ and $E(H_2(\mathbf{Y}))$, and the performance quantity of interest is their difference

$$\ell = E(H_1(\mathbf{Y})) - E(H_2(\mathbf{Y})).$$

To estimate $\ell$, two choices produce an unbiased estimator:

1. Generate one sequence of IID input vectors $\mathbf{Y}_1, \ldots, \mathbf{Y}_N$, and estimate $\ell$ by

$$\hat{\ell}_N^{(1)} = \frac{1}{N} \sum_{i=1}^{N} \big( H_1(\mathbf{Y}_i) - H_2(\mathbf{Y}_i) \big).$$

2. Generate two independent IID sequences of input vectors $\mathbf{Y}_1^{(1)}, \ldots, \mathbf{Y}_N^{(1)}$, and $\mathbf{Y}_1^{(2)}, \ldots, \mathbf{Y}_N^{(2)}$, and estimate $\ell$ by

$$\hat{\ell}_N^{(2)} = \frac{1}{N} \sum_{i=1}^{N} H_1(\mathbf{Y}_i^{(1)}) - \frac{1}{N} \sum_{i=1}^{N} H_2(\mathbf{Y}_i^{(2)}).$$

The first method is the CRN method, and is intuitively preferred because it reduces variability:

$$\mathrm{Var}(\hat{\ell}_N^{(1)}) < \mathrm{Var}(\hat{\ell}_N^{(2)}).$$

To prove this inequality, denote $X_i = H_i(\mathbf{Y}_i)$. Then $\hat{\ell} = X_1 - X_2$ is an unbiased estimator of $\ell$ with variance

$$\mathrm{Var}(\hat{\ell}) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) - 2\,\mathrm{Cov}(X_1, X_2). \tag{4}$$

If $X_1$ and $X_2$ are independent (as is the case in the second method), then (4) becomes $\mathrm{Var}(X_1) + \mathrm{Var}(X_2)$. Hence, variance reduction is obtained if $\mathrm{Cov}(X_1, X_2) > 0$ in (4). This requirement is precisely the opposite of what was needed in antithetic variates. To force the covariance to become positive through CRN, the uniform random number $U$ used for a particular purpose in generating $\mathbf{Y}_1$ is used for the same purpose to generate $\mathbf{Y}_2$. This can be formalized by technical conditions completely analogous to those for antithetic variates.

CRN is often applied not only because it seems 'fair' but also because CRN is the default in many simulation software systems; e.g., Arena compares different scenarios using the same seed, unless the programmer explicitly selects different seeds to initialize the various sampling processes (arrival process, service time at work station 1, etc.) for different scenarios. Detailed examples are given in Law (2007), pp. 582-594.

So while the simulation programmers need to invest little extra effort to implement CRN, the
comparisons of various scenarios may be expected to be more accurate; i.e., the what-if or
sensitivity analysis gives estimators with reduced variances. However, some applications may
require estimates of the absolute (instead of the relative) responses; i.e., instead of sensitivity
analysis the analysis aims at prediction or interpolation from the observed responses for the
scenarios that have already been simulated. In these applications, CRN may give worse
predictions; also see Chen, Ankenman, and Nelson (2010).

The analysis of simulation experiments with CRN should go beyond (4), which compares only two scenarios. The simplest extension is to compare a fixed set of (say) $k$ scenarios using (4) combined with the Bonferroni inequality so that the type-I error rate does not exceed (say) $\alpha$; i.e., in each comparison of two scenarios the value $\alpha$ is replaced by $\alpha/m$ where $m$ denotes the number of comparisons (e.g., if all $k$ scenarios are compared, then $m = k(k-1)/2$). Multiple comparison and ranking techniques are discussed in Chick and Gans (2009).

However, the number of interesting scenarios may not be fixed in advance; e.g., the scenarios differ in one or more quantitative inputs (e.g., arrival speed, number of servers) and the optimal input combination is wanted. In such situations, regression analysis is useful; i.e., the regression model is then a metamodel that enables validation, sensitivity analysis, and optimization of the simulation model; see Kleijnen (2008). The estimated regression coefficients (regression parameters) may have smaller variances if CRN is used, because of arguments based on (4), except for the intercept (or the 'grand mean' in Analysis of Variance or ANOVA terminology). Consequently, CRN is not attractive in prediction, but it is in sensitivity analysis and optimization.

A better metamodel for prediction may be a Kriging or Gaussian Process model, assuming the scenarios correspond with combinations of quantitative inputs; e.g., the scenarios represent different traffic rates in a queueing simulation. Kriging implies that the correlation between the responses of different scenarios decreases with the distance between the corresponding input combinations; i.e., the Gaussian process is stationary (Kleijnen, 2008). In random simulation (unlike deterministic simulation, which is popular in engineering) the Kriging metamodel also requires the estimation of the correlations between the 'intrinsic' noises of different scenarios caused by the use of random numbers $U$; see Chen, Ankenman, and Nelson (2010).

An important issue in the implementation of antithetics and CRN is synchronization, which is a controlling mechanism to ensure that the same random variables are generated by the same random numbers from the random number generator. As an example, consider comparing a single-server queue GI/GI/1 with a two-server system GI/GI/2. The two systems have statistically similar arrivals and service times, but the single server works twice as fast. The performance measure is the expected waiting time per customer (which is conjectured to be less in the two-server system). In a simulation study, the two simulation models with CRN should have the same arrival variates, and the same service-time variates. Suppose that $A_1, A_2, \ldots$ are the consecutive interarrival times in a simulation run of the GI/GI/1 model, and $S_1, S_2, \ldots$ are their associated service-time requirements. Then, in the corresponding simulation run of the GI/GI/2 model, these same values are used for the consecutive interarrival times, and their associated service times; see Kelton, Sadowski, and Sturrock (2007); Law (2007).
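
Synchronization can be illustrated with a deliberately simplified sketch. It does not reproduce the GI/GI/1 versus GI/GI/2 comparison above; instead it assumes two single-server queues with exponential interarrival and service times that differ only in their service rate (1.5 versus 2.0, arbitrary values), simulated with Lindley's recursion. Each purpose (arrivals, services) gets its own random stream, and with CRN both systems reuse the same two seeds. All names and parameter values are hypothetical.

```python
import math
import random
import statistics

def mean_wait(service_rate, n_customers, arrival_rng, service_rng, arrival_rate=1.0):
    """Average waiting time of the first n customers (Lindley's recursion)."""
    wait, total = 0.0, 0.0
    for _ in range(n_customers):
        total += wait
        # Inverse transformation: one uniform per service time and per interarrival time.
        service = -math.log(1.0 - service_rng.random()) / service_rate
        interarrival = -math.log(1.0 - arrival_rng.random()) / arrival_rate
        wait = max(0.0, wait + service - interarrival)
    return total / n_customers

def waiting_time_difference(rep, crn, n_customers=200):
    """One replication of (slow system) - (fast system)."""
    base = 4 * rep
    slow = mean_wait(1.5, n_customers, random.Random(base), random.Random(base + 1))
    if crn:   # same seeds: the k-th uniform serves the same purpose in both systems
        arrivals, services = random.Random(base), random.Random(base + 1)
    else:     # different seeds: independent sampling
        arrivals, services = random.Random(base + 2), random.Random(base + 3)
    fast = mean_wait(2.0, n_customers, arrivals, services)
    return slow - fast

with_crn = [waiting_time_difference(rep, crn=True) for rep in range(200)]
without_crn = [waiting_time_difference(rep, crn=False) for rep in range(200)]
print("CRN         : mean", statistics.fmean(with_crn),
      "variance", statistics.variance(with_crn))
print("independent : mean", statistics.fmean(without_crn),
      "variance", statistics.variance(without_crn))
```

Keeping separate streams per purpose is what preserves synchronization here: if both purposes drew from a single stream, a change in the model logic could shift which uniform is used for which variate.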
Antithetic and common random numbers can be combined. Their optimal combination is the
goal of the Schruben-Margolin strategy; i.e., some blocks of scenarios use CRN, whereas
other blocks use antithetic variates, etc.; see Song and Chiu (2007).

CONTROL VARIATES
Suppose that $\hat{\ell}$ is an unbiased estimator of $\ell$ in the estimation problem (1). A random variable $C$ is called a control variate for $\hat{\ell}$ if it is correlated with $\hat{\ell}$ and its expectation $\gamma$ is known; for example, $C$ is the arrival time in a queueing simulation. The linear control random variable $\hat{\ell}(\alpha)$ is defined as

$$\hat{\ell}(\alpha) = \hat{\ell} - \alpha (C - \gamma),$$

where $\alpha$ is a scalar parameter. It is easy to prove that the variance of $\hat{\ell}(\alpha)$ is minimized by

$$\alpha^* = \frac{\mathrm{Cov}(\hat{\ell}, C)}{\mathrm{Var}(C)}.$$

The resulting minimal variance is

$$\mathrm{Var}(\hat{\ell}(\alpha^*)) = \big(1 - \rho_{\hat{\ell} C}^2\big)\, \mathrm{Var}(\hat{\ell}), \tag{5}$$

where $\rho_{\hat{\ell} C}$ denotes the correlation coefficient between $\hat{\ell}$ and $C$. Since $\mathrm{Cov}(\hat{\ell}, C)$ is unknown, the optimal control coefficient $\alpha^*$ must be estimated from the simulation. Estimating both $\mathrm{Cov}(\hat{\ell}, C)$ and $\mathrm{Var}(C)$ means that linear regression analysis is applied to estimate $\alpha^*$. Estimation of $\alpha^*$ implies that the variance reduction becomes smaller than (5) suggests, and that the estimator may become biased. The method can be easily extended to multiple control variables (Rubinstein and Marcus, 1985).
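
A minimal sketch of a linear control variate with an estimated coefficient is given below. The test problem is an assumption chosen for illustration: the response is $X = e^{U}$ with $U$ uniform on $(0,1)$, and the control is $C = U$ with known mean $\gamma = 0.5$. The function name is hypothetical, and the coefficient is estimated from the same sample, which is the source of the possible small bias mentioned above.

```python
import math
import random
import statistics

def control_variate_estimate(n, rng):
    """Estimate E[exp(U)] with control C = U (known mean 0.5).

    Returns (estimate, variance of controlled estimator, variance of crude estimator).
    """
    us = [rng.random() for _ in range(n)]
    xs = [math.exp(u) for u in us]            # responses
    gamma = 0.5                               # known expectation of the control
    x_bar, c_bar = statistics.fmean(xs), statistics.fmean(us)
    cov = sum((x - x_bar) * (c - c_bar) for x, c in zip(xs, us)) / (n - 1)
    alpha = cov / statistics.variance(us)     # estimated optimal coefficient
    controlled = [x - alpha * (c - gamma) for x, c in zip(xs, us)]
    return (statistics.fmean(controlled),
            statistics.variance(controlled) / n,
            statistics.variance(xs) / n)

estimate, var_cv, var_crude = control_variate_estimate(10_000, random.Random(5))
print("estimate:", estimate, "exact:", math.e - 1)
print("variance with control variate:", var_cv)
print("variance of crude estimator  :", var_crude)
```

The ratio of the two printed variances is an estimate of $1 - \rho^2$ in (5); for this toy problem the correlation is high, so the reduction is large.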

A well-known application of control variates is pricing of Asian options. The payoff of an Asian call option is given by

$$H(\mathbf{Y}) = \max\Big( 0, \frac{1}{n} \sum_{j=1}^{n} Y_j - K \Big),$$

where $Y_j = S_{jT/n}$, the expiration date $T$ is discretized into $n$ time units, $K$ is the strike price, and $S_t$ is the asset price at time $t$, which follows a geometric Brownian motion. Let $r$ be the interest rate; then the price of the option becomes

$$\ell = E\big( e^{-rT} H(\mathbf{Y}) \big).$$

A possible control variate is $C = e^{-rT} \max(0, S_T - K)$, whose expectation is readily available from the Black-Scholes formula. Alternative control variates are $S_T$, or $\frac{1}{n} \sum_{j=1}^{n} S_{jT/n}$.
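
The Asian option example can be sketched as follows, using the discounted European call payoff as control variate with its Black-Scholes price as known mean. All market parameters (initial price 100, strike 100, rate 5%, volatility 20%, one year, 12 monitoring dates) and function names are assumptions made for illustration only.

```python
import math
import random
import statistics

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(s0, strike, r, sigma, t):
    """Black-Scholes price of a European call, used as the known mean of the control."""
    d1 = (math.log(s0 / strike) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s0 * normal_cdf(d1) - strike * math.exp(-r * t) * normal_cdf(d2)

def discounted_payoffs(s0, strike, r, sigma, t, n_steps, rng):
    """Simulate one geometric Brownian motion path; return (Asian, European) payoffs."""
    dt = t / n_steps
    s, total = s0, 0.0
    for _ in range(n_steps):
        s *= math.exp((r - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0))
        total += s
    discount = math.exp(-r * t)
    return discount * max(0.0, total / n_steps - strike), discount * max(0.0, s - strike)

def asian_call_with_control(n_paths, rng, s0=100.0, strike=100.0, r=0.05,
                            sigma=0.2, t=1.0, n_steps=12):
    pairs = [discounted_payoffs(s0, strike, r, sigma, t, n_steps, rng) for _ in range(n_paths)]
    xs = [p[0] for p in pairs]                 # discounted Asian payoffs
    cs = [p[1] for p in pairs]                 # control: discounted European payoffs
    gamma = black_scholes_call(s0, strike, r, sigma, t)
    x_bar, c_bar = statistics.fmean(xs), statistics.fmean(cs)
    cov = sum((x - x_bar) * (c - c_bar) for x, c in zip(xs, cs)) / (n_paths - 1)
    alpha = cov / statistics.variance(cs)      # estimated control coefficient
    controlled = [x - alpha * (c - gamma) for x, c in zip(xs, cs)]
    return (statistics.fmean(controlled), statistics.variance(controlled) / n_paths,
            statistics.variance(xs) / n_paths, gamma)

price, var_cv, var_crude, european = asian_call_with_control(5_000, random.Random(42))
print("Asian call estimate:", price, "(European call:", european, ")")
print("variance with control:", var_cv, "crude variance:", var_crude)
```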

CONDITIONING
The method of conditional Monte-Carlo is based on the following basic probability formulas. Let $X$ and $Z$ be two arbitrary random variables, then

$$E(E(X|Z)) = E(X) \quad \text{and} \quad \mathrm{Var}(X) = E(\mathrm{Var}(X|Z)) + \mathrm{Var}(E(X|Z)). \tag{6}$$

Because the last two terms are both nonnegative, variance reduction is obvious:

$$\mathrm{Var}(E(X|Z)) \le \mathrm{Var}(X).$$

The same reasoning holds for the original problem (1), setting $X = H(\mathbf{Y})$. Also $Z$ is allowed to be a vector variable. These formulas are used in a simulation experiment as follows. The vector $\mathbf{Z}$ is simulated, and the conditional expectation $C = E(H(\mathbf{Y})|\mathbf{Z})$ is computed. Repeating this $N$ times gives the conditional Monte-Carlo estimator

$$\hat{\ell}_N = \frac{1}{N} \sum_{i=1}^{N} C_i.$$

A typical example is a level-crossing probability of a random number of variables:

$$\ell = P\Big( \sum_{j=1}^{R} Y_j > b \Big),$$

where $Y_1, Y_2, \ldots$ are IID positive random variables, $R$ is a nonnegative integer-valued random variable, independent of the $Y_j$ variables, and $b$ is some specified constant. Such problems are of interest in insurance risk models for assessing aggregate claim distributions (Glasserman, 2003). CMC can be improved by conditioning on the value of $R$ for which level crossing occurs. To be more specific, denote the event of interest by $A$, so $\ell = E(I_A(\mathbf{Y}))$. Define

$$M = \min\Big\{ r : \sum_{j=1}^{r} Y_j > b \Big\}.$$

Assume that the distribution of $Y$ can be easily sampled, and that the distribution of $R$ is known and numerically available (for instance, Poisson). Then it is easy to generate a value of $M$. Suppose that $M = m$. Then $E(I_A(\mathbf{Y})|M = m) = P(R \ge m)$, which can be easily computed.
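
The level-crossing example can be sketched as follows. The concrete distributions are assumptions for illustration: $Y_j$ exponential with mean 1, $R$ Poisson with mean 5, and $b = 10$. The crude estimator samples $R$ and the sum, whereas the conditional estimator samples only the crossing index $M$ and evaluates $P(R \ge M)$ exactly. Function names are hypothetical.

```python
import math
import random
import statistics

def poisson_tail(m, lam):
    """P(R >= m) for R ~ Poisson(lam), via the complement of the partial sum."""
    term, cdf = math.exp(-lam), 0.0
    for k in range(m):
        cdf += term                 # adds P(R = k)
        term *= lam / (k + 1)
    return max(0.0, 1.0 - cdf)

def first_crossing_index(b, rng):
    """M = min{r : Y_1 + ... + Y_r > b} for IID exponential(1) variables."""
    total, m = 0.0, 0
    while total <= b:
        total += rng.expovariate(1.0)
        m += 1
    return m

def sample_poisson(lam, rng):
    """Poisson variate by multiplying uniforms (suitable for small lam)."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def compare_estimators(n_samples, rng, lam=5.0, b=10.0):
    crude, conditional = [], []
    for _ in range(n_samples):
        # Crude Monte Carlo: sample R, then R variables, and check the event.
        r = sample_poisson(lam, rng)
        total = sum(rng.expovariate(1.0) for _ in range(r))
        crude.append(1.0 if total > b else 0.0)
        # Conditional Monte Carlo: sample only M and use P(R >= M) exactly.
        conditional.append(poisson_tail(first_crossing_index(b, rng), lam))
    return crude, conditional

crude, conditional = compare_estimators(20_000, random.Random(1))
print("crude      :", statistics.fmean(crude), "variance", statistics.variance(crude))
print("conditional:", statistics.fmean(conditional), "variance", statistics.variance(conditional))
```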

STRATIFIED SAMPLING
Recall the original estimation problem $\ell = E(H(\mathbf{Y}))$, and its crude Monte Carlo estimator $\hat{\ell}_N$. Suppose now that there is some finite random variable $Z$ taking values from $\{z_1, \ldots, z_m\}$, say, such that

(i). the probabilities $p_i = P(Z = z_i)$ are known;
(ii). for each $i = 1, \ldots, m$, it is easy to sample from the conditional distribution of $\mathbf{Y}$ given $Z = z_i$.

Because

$$\ell = E(E(H(\mathbf{Y})|Z)) = \sum_{i=1}^{m} p_i\, E(H(\mathbf{Y})|Z = z_i),$$

the stratified sampling estimator of $\ell$ may be

$$\hat{\ell}_N^* = \sum_{i=1}^{m} p_i \frac{1}{N_i} \sum_{j=1}^{N_i} H(\mathbf{Y}_{ij}),$$

where $N_i$ IID samples $\mathbf{Y}_{i1}, \ldots, \mathbf{Y}_{iN_i}$ are generated from the conditional distribution of $\mathbf{Y}$ given $Z = z_i$, such that $N_1 + \cdots + N_m = N$. Notice that the estimator is unbiased. To assess its variance, denote the conditional variance of the performance estimator by $\sigma_i^2 = \mathrm{Var}(H(\mathbf{Y})|Z = z_i)$. The variance of the stratified sampling estimator is then given by

$$\mathrm{Var}(\hat{\ell}_N^*) = \sum_{i=1}^{m} \frac{p_i^2 \sigma_i^2}{N_i}.$$

Because of (6),

$$\mathrm{Var}(H(\mathbf{Y})) \ge E(\mathrm{Var}(H(\mathbf{Y})|Z)) = \sum_{i=1}^{m} p_i \sigma_i^2.$$

Selecting proportional strata sample sizes $N_i = p_i N$ gives variance reduction:

$$\mathrm{Var}(\hat{\ell}_N^*) = \sum_{i=1}^{m} \frac{p_i \sigma_i^2}{N} \le \frac{1}{N} \mathrm{Var}(H(\mathbf{Y})) = \mathrm{Var}(\hat{\ell}_N).$$

It can be shown that the strata sample sizes $N_i$ that minimize this variance are

$$N_i = N \frac{p_i \sigma_i}{\sum_{j=1}^{m} p_j \sigma_j};$$

see Rubinstein and Kroese (2008). A practical problem is that the standard deviations $\sigma_i$ are usually unknown, so they are estimated by pilot runs. Stratified sampling is used in financial engineering to get variance reductions in problems such as value-at-risk, and pricing path-dependent options (Glasserman, 2003).
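
A minimal sketch of stratified sampling with proportional allocation follows. The setting is an assumption for illustration: $Y = U$ uniform on $(0,1)$, $H = \exp$, and the stratification variable $Z$ is the index of the subinterval of length $1/m$ containing $U$, so $p_i = 1/m$ and the conditional distribution is uniform on that subinterval. The function name is hypothetical.

```python
import math
import random
import statistics

def stratified_estimate(h, m, n_total, rng):
    """Stratified estimate of E[h(U)], U ~ Uniform(0, 1), with m equal-width strata.

    Returns (estimate, estimated variance of the estimator).
    """
    p = 1.0 / m                       # known stratum probabilities p_i
    n_i = n_total // m                # proportional allocation N_i = p_i * N
    estimate, variance = 0.0, 0.0
    for i in range(m):
        # Conditional distribution of U given stratum i: uniform on [i/m, (i+1)/m).
        values = [h((i + rng.random()) / m) for _ in range(n_i)]
        estimate += p * statistics.fmean(values)
        variance += p ** 2 * statistics.variance(values) / n_i
    return estimate, variance

rng = random.Random(77)
n = 10_000
strat_est, strat_var = stratified_estimate(math.exp, 10, n, rng)
crude_values = [math.exp(rng.random()) for _ in range(n)]
crude_var = statistics.variance(crude_values) / n

print("stratified estimate:", strat_est, "exact:", math.e - 1)
print("estimated variance, stratified:", strat_var)
print("estimated variance, crude     :", crude_var)
```

The optimal allocation from the text would additionally require the stratum standard deviations, which this sketch does not estimate.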

IMPORTANCE SAMPLING
The idea of importance sampling is explained best in case of estimating the probability of an event $A$. The underlying sample space is $(\Omega, \mathcal{F})$ for which $A \in \mathcal{F}$, and the probability measure $P$ on this space is given by the specific simulation model. In a simulation experiment for estimating $P(A)$, the CMC estimator would be $\hat{\ell}_N = \frac{1}{N} \sum_{i=1}^{N} I_A^{(i)}$, where $I_A^{(1)}, \ldots, I_A^{(N)}$ are IID indicator functions of event $A$ generated under $P$. On average, the event $A$ occurs in only one out of $1/P(A)$ generated samples, and thus for rare events (where $P(A)$ is extremely small) this procedure fails. Suppose that there is an alternative probability measure $P^*$ on the same $(\Omega, \mathcal{F})$ such that (i) $A$ occurs much more often, and (ii) $P$ is absolutely continuous with respect to $P^*$, meaning

$$\forall F \in \mathcal{F}: \quad P(F) > 0 \Rightarrow P^*(F) > 0.$$

Then according to the Radon-Nikodym theorem, there is a measurable function $L$ on $\Omega$ such that $\int_F dP = \int_F L\, dP^*$ for all $F \in \mathcal{F}$. The function $L$ is called the likelihood ratio and is usually written as $L = dP/dP^*$; the alternative probability measure $P^*$ is said to be the importance sampling probability measure, or the change of measure. Thus, by weighting the occurrence $I_A$ of event $A$ with the associated likelihood ratio, simulation under the change of measure yields an unbiased importance sampling estimator

$$\hat{\ell}_N^* = \frac{1}{N} \sum_{i=1}^{N} L^{(i)} I_A^{(i)}.$$

More importantly, variance reduction is obtained when the change of measure has been chosen properly, as will be explained below. Importance sampling has been applied successfully in a variety of simulation areas, such as stochastic operations research, statistics, Bayesian statistics, econometrics, finance, systems biology; see Rubino and Tuffin (2009). This section will show that the main issue in importance sampling simulation is the question of which change of measure to consider. The choice is very much problem dependent, however, and unfortunately, it is difficult to prevent gross misspecification of the change of measure $P^*$, particularly in multiple dimensions.

Exponential change of measure

As an illustration, consider the problem of estimating the level-crossing probability

$$\ell_n = P(A_n) \quad \text{with} \quad A_n = \{Y_1 + \cdots + Y_n > na\}, \tag{7}$$

where $Y_1, \ldots, Y_n$ are IID random variables with finite mean $\mu = E(Y) < a$ and with a light-tailed PDF $f(y; v)$, in which $v$ denotes a parameter vector, such as mean and variance of a normal density. It is well-known from Cramér's Theorem that $P(A_n) \to 0$ exponentially fast as $n \to \infty$. Suppose that under the importance sampling probability measure the random variables $Y_1, \ldots, Y_n$ remain IID, but with an exponentially tilted PDF (also called exponentially twisted), with tilting factor $t$:

$$f_t(y; v) = \frac{f(y; v)\, e^{ty}}{\int f(y; v)\, e^{ty}\, dy}.$$

Thus, in the importance sampling simulations the $Y_k$-samples are generated from $f_t(y; v)$. Because of the IID assumption, the likelihood ratio becomes

$$L(Y_1, \ldots, Y_n) = \prod_{k=1}^{n} \frac{f(Y_k; v)}{f_t(Y_k; v)} = \exp\Big( n\psi(t) - t \sum_{k=1}^{n} Y_k \Big), \tag{8}$$

with $\psi(t) = \log \int f(y; v)\, e^{ty}\, dy$. Variance reduction is obtained if

$$\mathrm{Var}_t(\hat{\ell}_N^*) \le \mathrm{Var}(\hat{\ell}_N) \iff \mathrm{Var}_t(\hat{\ell}_1^*) \le \mathrm{Var}(\hat{\ell}_1) \iff E_t[(\hat{\ell}_1^*)^2] \le E[(\hat{\ell}_1)^2] \iff E_t[(I_A L(Y_1, \ldots, Y_n))^2] \le E[(I_A)^2].$$

Because of (8), it is easy to show that the variance is minimized for $t = (\psi')^{-1}(a)$. In that case the importance sampling estimator is logarithmically efficient (also called asymptotically optimal; see Rubino and Tuffin (2009; Chapter 4)):

$$\lim_{n \to \infty} \frac{\log E_t[(\hat{\ell}_N^*)^2]}{\log E_t[\hat{\ell}_N^*]} = 2,$$

where the subscript $t$ means that the underlying probability is the change of measure. Asymptotic optimality implies that $\mathrm{RE}(\hat{\ell}_N^*)$ grows subexponentially as $n \to \infty$, whereas for CMC the relative error grows exponentially (see (3)).
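
The exponential change of measure can be sketched for the special case of standard normal jumps, which is an assumption made here so that everything is available in closed form: $\psi(t) = t^2/2$, the tilted PDF is the $N(t, 1)$ density, and the tilting factor solving $\psi'(t) = a$ is $t = a$. The values $n = 100$ and $a = 0.5$ are arbitrary, and the function name is hypothetical.

```python
import math
import random

def tilted_estimate(n, a, n_samples, rng):
    """Importance sampling estimate of P(Y_1 + ... + Y_n > n*a) for standard normal Y_k.

    Returns (estimate, estimated relative error).
    """
    t = a                        # solves psi'(t) = a when psi(t) = t**2 / 2
    psi = 0.5 * t * t
    total, total_sq = 0.0, 0.0
    for _ in range(n_samples):
        s = sum(rng.gauss(t, 1.0) for _ in range(n))   # jumps drawn from the tilted PDF
        # Likelihood ratio exp(n * psi(t) - t * sum), counted only when the event occurs.
        value = math.exp(n * psi - t * s) if s > n * a else 0.0
        total += value
        total_sq += value * value
    mean = total / n_samples
    var = (total_sq / n_samples - mean * mean) / n_samples
    return mean, math.sqrt(var) / mean

n, a = 100, 0.5
estimate, rel_err = tilted_estimate(n, a, 2_000, random.Random(3))
exact = 0.5 * math.erfc(a * math.sqrt(n) / math.sqrt(2.0))   # the sum is exactly N(0, n)
print("importance sampling estimate:", estimate)
print("exact probability           :", exact)
print("estimated relative error    :", rel_err)
```

For comparison, equation (3) suggests that CMC would need on the order of $1/\ell$ samples, here several million, before the event is observed even once on average.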

The cross-entropy method

A general heuristic for constructing an importance sampling algorithm is to consider only a parameterized family of changes of measures. Consider again problem (1), with PDF $f = f(\mathbf{y}; v)$ where $v$ is the parameter vector. Thus, let $\Theta$ be the set of all feasible parameter vectors for $f$. For any $\theta \in \Theta$, the change of measure $P_\theta$ induces the (single-run) importance sampling estimator

$$\hat{\ell}_\theta = H(\mathbf{Y}) \frac{dP}{dP_\theta}(\mathbf{Y}) = H(\mathbf{Y}) \frac{f(\mathbf{Y}; v)}{f(\mathbf{Y}; \theta)}.$$

The optimal change of measure is found by variance minimization. Since the estimators are unbiased, it suffices to minimize the second moment:

$$\min_{\theta \in \Theta} E_\theta \Big[ \Big( H(\mathbf{Y}) \frac{f(\mathbf{Y}; v)}{f(\mathbf{Y}; \theta)} \Big)^2 \Big].$$

Generally, this problem is hard. A successful approach is based on cross-entropy minimization as explained in Rubinstein and Kroese (2004). First, consider the optimal change of measure, resulting in a zero-variance estimator:

$$dP^{opt}(\mathbf{Y}) = \frac{H(\mathbf{Y})\, dP(\mathbf{Y})}{\ell}. \tag{9}$$

This change of measure is not implementable as it requires knowledge of the unknown quantity $\ell$. The cross-entropy method finds $P_\theta$ by minimizing the Kullback-Leibler distance (or cross-entropy) within the class of feasible changes of measure:

$$\min_{\theta \in \Theta} D(dP^{opt}, dP_\theta),$$

where the cross-entropy is defined by

$$D(dP^{opt}, dP_\theta) = E^{opt} \Big[ \log \frac{dP^{opt}}{dP_\theta}(\mathbf{Y}) \Big] = E_v \Big[ \frac{dP^{opt}}{dP}(\mathbf{Y}) \log \frac{dP^{opt}}{dP_\theta}(\mathbf{Y}) \Big].$$

Substituting expression (9), and canceling constant terms and factors, the equivalent cross-entropy problem becomes

$$\max_{\theta \in \Theta} E_v [ H(\mathbf{Y}) \log dP_\theta(\mathbf{Y}) ].$$

There are several ways to solve this stochastic optimization problem. The original description of the cross-entropy method for such problems proposes to solve the stochastic counterpart iteratively; see Rubinstein and Kroese (2004). This approach has been applied successfully to a variety of estimation and rare-event problems.
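
The iterative solution of the stochastic counterpart is not spelled out in the text. The sketch below follows the multilevel scheme commonly described in the cross-entropy literature, and every concrete choice in it is an assumption: a one-dimensional problem $\ell = P(Y > \gamma)$ with $Y$ exponential with mean $u = 1$ and $\gamma = 20$, the family of exponential densities with mean $\theta$, an elite fraction $\rho = 0.1$, and the sample sizes. For this family the maximizer of the weighted log-likelihood has the closed form used in the update step.

```python
import math
import random

def likelihood_ratio(y, u, theta):
    """f(y; u) / f(y; theta) for exponential densities with means u and theta."""
    return (theta / u) * math.exp(-y * (1.0 / u - 1.0 / theta))

def ce_rare_event(gamma, rng, u=1.0, rho=0.1, n_ce=2_000, n_final=50_000, max_iter=50):
    """Estimate P(Y > gamma), Y ~ exponential(mean u), by a multilevel cross-entropy scheme.

    Returns (estimate, final mean parameter theta).
    """
    theta = u
    for _ in range(max_iter):
        ys = sorted(rng.expovariate(1.0 / theta) for _ in range(n_ce))
        # Intermediate level: the (1 - rho) sample quantile, capped at the target level.
        level = min(gamma, ys[int((1.0 - rho) * n_ce)])
        elite = [y for y in ys if y >= level]
        weights = [likelihood_ratio(y, u, theta) for y in elite]
        # Closed-form maximizer of sum_i I_i * W_i * log f(Y_i; theta) for this family.
        theta = sum(w * y for w, y in zip(weights, elite)) / sum(weights)
        if level >= gamma:
            break
    # Final importance sampling run under the selected parameter.
    total = 0.0
    for _ in range(n_final):
        y = rng.expovariate(1.0 / theta)
        if y > gamma:
            total += likelihood_ratio(y, u, theta)
    return total / n_final, theta

estimate, theta = ce_rare_event(20.0, random.Random(4))
print("cross-entropy estimate:", estimate, "exact:", math.exp(-20.0))
print("selected mean parameter theta:", theta)
```

The intermediate levels are what make the scheme workable: sampling directly under the original parameter would almost never produce an observation above the target level.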

State-dependent importance sampling

The importance sampling algorithms described above were based on a static change of measure; i.e., the samples are generated by a fixed alternative statistical law; see (8). In specific problems, such as (7), the static importance sampling algorithm yields an efficient estimator. However, for many problems it is known that efficient estimators require an adaptive or state-dependent importance sampling algorithm (Juneja and Shahabuddin, 2006). To illustrate this concept, consider again the problem of estimating the level-crossing probability (7). The $Y_k$-variables are called jumps of a random walk $(S_k)_{k=0}^{n}$, defined by $S_0 = 0$, and for $k \ge 1$: $S_k = \sum_{j=1}^{k} Y_j = S_{k-1} + Y_k$. Under a state-dependent change of measure, the next jump $Y_{k+1}$ might be generated from a PDF $f(y | k+1, S_k)$; i.e., it depends on jump time $k + 1$ and current state $S_k$. Hence, under the change of measure, the process $(S_k)_{k=0}^{n}$ becomes an inhomogeneous Markov chain. Given a generated sequence $Y_1, \ldots, Y_n$, the associated likelihood ratio is

$$L(Y_1, \ldots, Y_n) = \prod_{k=1}^{n} \frac{f(Y_k; v)}{f(Y_k | k, S_{k-1})}.$$

The next question is: Which time-state dependent PDFs should be chosen for this kind of change of measure? The criterion could be (i) variance minimization, (ii) cross-entropy minimization, or (iii) efficiency.

(i). A small set of rare-event problems are suited to find so-called zero-variance approximate importance sampling algorithms, notably level-crossing problems with Gaussian jumps, reliability problems, and certain Markov chain problems; see L'Ecuyer et al. (2010).

(ii). A cross-entropy minimization is applied after each state $S_k$ for determining the PDF of the next jump (Ridder and Taimre, 2009). The result is that when the level-crossing at time $n$ can be reached from state $S_k$ just by following the natural drift, no change of measure is applied. Otherwise, the next jump is drawn from an exponentially tilted PDF with tilting factor $t = (\psi')^{-1}((an - S_k)/(n - k))$. This would be the static solution given before when starting at time $k = 0$. This approach gives logarithmic efficiency.

(iii). The method developed by Dupuis and Wang (2007) considers the rare-event problem as an optimal control problem in a differential game. Applying dynamic programming techniques while using large-deviations expressions, the authors develop logarithmically efficient importance sampling algorithms. This approach works also for rare events in Jackson networks (Dupuis, Sezer, and Wang, 2007).
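
The state-dependent tilting rule $t = (\psi')^{-1}((an - S_k)/(n - k))$ can be sketched for standard normal jumps, an assumption under which $\psi'(t) = t$ and the natural drift is zero, so no tilting is applied once $S_k \ge an$. This is only an illustration of the likelihood-ratio bookkeeping for a state-dependent change of measure; it is not claimed to reproduce the cited algorithms in detail, and the parameter values and function name are hypothetical.

```python
import math
import random

def state_dependent_estimate(n, a, n_samples, rng):
    """Estimate P(S_n > n*a) for a random walk with standard normal jumps.

    Each jump is drawn from N(t_k, 1), where t_k depends on the current state.
    """
    total = 0.0
    for _ in range(n_samples):
        s, log_lr = 0.0, 0.0
        for k in range(n):
            # Drift needed per remaining step; zero if the level is already reached.
            t = max(0.0, (a * n - s) / (n - k))
            y = rng.gauss(t, 1.0)
            # log of f(y) / f_t(y) = psi(t) - t * y with psi(t) = t**2 / 2.
            log_lr += 0.5 * t * t - t * y
            s += y
        if s > a * n:
            total += math.exp(log_lr)
    return total / n_samples

n, a = 100, 0.5
estimate = state_dependent_estimate(n, a, 2_000, random.Random(6))
exact = 0.5 * math.erfc(a * math.sqrt(n) / math.sqrt(2.0))
print("state-dependent estimate:", estimate)
print("exact probability       :", exact)
```

Accumulating the logarithm of the likelihood ratio, rather than the product itself, avoids numerical underflow for long paths.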

Markov chains
Many practical estimation problems in statistical systems (e.g., reliability, production,
inventory, queueing, communications) can be reformulated as a Markov model to estimate a
quantity ` = P(YT 2 F ). Let fYt : t = 0; 1; : : :g denote a discrete-time Markov chain with a
state space X with transition probabilities p(x; y); F X is a subset of states, and T is a
stopping time. A typical example is a system of highly reliable components where the
response of interest is the probability of a break down of the system.

Assume that the importance sampling is restricted to alternative probability measures $P^*$ such
that the Markov chain property is preserved, with transition probabilities $p^*(x, y)$ satisfying
\[ p(x, y) > 0 \iff p^*(x, y) > 0. \]
This constraint ensures the absolute continuity condition. Furthermore, assuming that the
initial distribution remains unchanged, the likelihood ratio of a simulated path of the chain
becomes simply
\[ L = \prod_{t=0}^{T-1} \frac{p(Y_t, Y_{t+1})}{p^*(Y_t, Y_{t+1})}. \]
Thus, it suffices to find the importance-sampling transition probabilities $p^*(x, y)$. Considering
these probabilities as parameters, the method of cross-entropy is most convenient; Ridder
(2010) gives sufficient conditions to guarantee asymptotic optimality. However, many realistic
systems are modeled by Markov chains with millions of transitions, which causes several
difficulties: the dimensionality of the parameter space, the danger of degeneracy of the
estimation, and numerical underflow in the computations. Several approaches are proposed to
reduce the parameter space in the cross-entropy method (de Boer and Nicola, 2002; Kaynar
and Ridder, 2010).
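To make the path likelihood ratio concrete, here is a minimal sketch for a toy chain that is not taken from the paper: a nearest-neighbour random walk on {0, ..., 12} with up-probability 0.3, failure set F = {12}, and regeneration state 0. The importance-sampling transition probabilities simply swap the up and down probabilities; this choice is an assumption for illustration and is not the cross-entropy solution.

```python
import random

P_UP = 0.3        # original up-probability p(x, x+1); down-probability is 0.7
P_UP_IS = 0.7     # assumed importance-sampling up-probability p*(x, x+1)
BARRIER = 12      # failure set F = {12}; the regeneration state is 0
START = 1

def one_replication(rng):
    """Simulate one path under p* and return L * 1{Y_T in F}."""
    x, lr = START, 1.0
    while 0 < x < BARRIER:
        if rng.random() < P_UP_IS:
            lr *= P_UP / P_UP_IS                  # p(x, x+1) / p*(x, x+1)
            x += 1
        else:
            lr *= (1 - P_UP) / (1 - P_UP_IS)      # p(x, x-1) / p*(x, x-1)
            x -= 1
    return lr if x == BARRIER else 0.0

rng = random.Random(1)
n_rep = 10000
estimate = sum(one_replication(rng) for _ in range(n_rep)) / n_rep

# Closed-form gambler's-ruin probability, used only to check the estimate.
r = (1 - P_UP) / P_UP
exact = (1 - r ** START) / (1 - r ** BARRIER)
print(f"estimate = {estimate:.3e}, exact = {exact:.3e}")
```

Both measures assign positive probability to exactly the same transitions, so the absolute continuity condition holds and the weighted indicator is unbiased.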

Another approach to importance sampling in Markov chains approximates the zero-variance
probability measure $P^{\mathrm{opt}}$. It is known that this $P^{\mathrm{opt}}$ implies transition probabilities of the form
\[ p^{\mathrm{opt}}(x, y) = p(x, y) \, \frac{\gamma(y)}{\gamma(x)}, \]
where $\gamma(x) = P(Y_T \in F \mid Y_0 = x)$. As these quantities are unknown (and in fact the subject of
interest), these zero-variance transition probabilities cannot be implemented. However,
approximations of the $\gamma(x)$ probabilities may be considered (L'Ecuyer et al., 2010). Under
certain conditions this approach leads to strong efficiency of the importance sampling
estimator.
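The zero-variance property can be checked numerically on a toy chain for which $\gamma(x)$ happens to be known. The sketch below assumes a nearest-neighbour random walk on {0, ..., 12} with up-probability 0.3 and F = {12} (an illustrative example, not one from the paper), plugs the exact $\gamma$ into the transition probabilities above, and confirms that every replication returns the same value $\gamma(y_0)$.

```python
import random

P_UP, BARRIER, START = 0.3, 12, 1
R = (1 - P_UP) / P_UP

def gamma(x):
    """Exact P(hit BARRIER before 0 | Y_0 = x) for this particular walk."""
    return (1 - R ** x) / (1 - R ** BARRIER)

def one_replication(rng):
    """Simulate under the zero-variance transitions; return L * 1{Y_T in F}."""
    x, lr = START, 1.0
    while 0 < x < BARRIER:
        p_down_opt = (1 - P_UP) * gamma(x - 1) / gamma(x)
        if rng.random() >= p_down_opt:
            # Up-move; its probability P_UP * gamma(x+1) / gamma(x) equals
            # 1 - p_down_opt up to rounding, because gamma is harmonic.
            p_up_opt = P_UP * gamma(x + 1) / gamma(x)
            lr *= P_UP / p_up_opt
            x += 1
        else:
            lr *= (1 - P_UP) / p_down_opt
            x -= 1
    return lr if x == BARRIER else 0.0

rng = random.Random(3)
values = [one_replication(rng) for _ in range(200)]
print(min(values), max(values), gamma(START))
```

The likelihood ratio telescopes to $\gamma(y_0)/\gamma(Y_T)$, and $\gamma(Y_T) = 1$ because the chain can no longer step into state 0, which is why the replications agree up to rounding.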

SPLITTING
The splitting method may handle rare-event probability estimation. Unlike importance
sampling, the probability laws remain unchanged, but a drift to the rare event is constructed by
splitting (cloning) favorable trajectories and terminating unfavorable trajectories. This idea
may be explained as follows. Consider a discrete-time Markov chain $\{Y_t : t = 0, 1, \ldots\}$ on a
state space $\mathcal{X}$. Suppose that the chain has a regeneration state or set $0$, a set of failure states
$F$, and a starting state $y_0$. The response of interest is the probability that the chain hits $F$
before $0$. More formally, if $T$ denotes the stopping time

\[ T = \inf\{t : Y_t \in 0 \cup F\}, \]

then
\[ \ell = P(Y_T \in F). \]
The initial state $y_0 \notin 0 \cup F$ may either have some initial distribution or be fixed and known.
The assumption is that $\ell$ is so small that CMC is impractical. Suppose that the state space is
partitioned into nested sets according to

\[ \mathcal{X} \supset \mathcal{X}_1 \supset \mathcal{X}_2 \supset \cdots \supset \mathcal{X}_m = F, \qquad (10) \]

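with $0 \in \mathcal{X} \setminus \mathcal{X}_1$. Usually these sets are defined through an importance function $\varphi : \mathcal{X} \to \mathbb{R}$,
such that for each $k$, $\mathcal{X}_k = \{y : \varphi(y) \ge L_k\}$ for certain levels $L_1 \le L_2 \le \cdots \le L_m$, with
$\varphi(0) = L_0 < L_1$. Now define stopping times $T_k$ and associated events $A_k$ by

\[ T_k = \inf\{t : Y_t \in 0 \cup \mathcal{X}_k\}, \qquad A_k = \{Y_{T_k} \in \mathcal{X}_k\}. \]

Because of (10), clearly $A_1 \supset A_2 \supset \cdots \supset A_m = A = \{Y_T \in F\}$. Thus the rare-event
probability $\ell = P(A)$ can be decomposed as a telescoping product:
\[ \ell = P(A_1) \prod_{k=2}^{m} P(A_k \mid A_{k-1}). \]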

To estimate $\ell$, one might estimate all conditional probabilities $P(A_k \mid A_{k-1})$ separately by
(say) $\hat{\ell}_k$, which gives the product estimator
\[ \hat{\ell} = \prod_{k=1}^{m} \hat{\ell}_k, \qquad (11) \]

where $\hat{\ell}_1$ estimates $P(A_1)$. The splitting method implements the following algorithm for
constructing the $\hat{\ell}_k$ estimators in such a way that the product estimator is unbiased. In the initial
stage ($k = 0$), run $N_0$ independent trajectories of the chain starting at the initial state $y_0$. Each
trajectory is run until it either enters $\mathcal{X}_1$ or returns to $0$, whichever comes first. Let $R_1$ be the
number of "successful" trajectories, i.e., trajectories that reach $\mathcal{X}_1$ before $0$. Then set
$\hat{\ell}_1 = R_1/N_0$. Consider stage $k \ge 1$, and suppose that $R_k$ trajectories have entered set $\mathcal{X}_k$ in
entrance states $Y_1^{(k)}, \ldots, Y_{R_k}^{(k)}$ (not necessarily distinct). Replicate (clone) these states until a
sample of size $N_k$ has been obtained. From each of these states, run a trajectory of the chain,
independently of the others. Each trajectory is run until it either enters $\mathcal{X}_{k+1}$ or returns to $0$,
whichever comes first. Let $R_{k+1}$ be the number of successful trajectories, i.e., trajectories that
reach $\mathcal{X}_{k+1}$ before $0$. Then set $\hat{\ell}_{k+1} = R_{k+1}/N_k$. This procedure is continued until all
trajectories have either entered $F$ or returned to $0$.
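The staged procedure just described can be sketched as follows. This is a minimal illustration under assumptions that are not in the paper: the chain is a nearest-neighbour random walk on {0, ..., 12} with up-probability 0.3, the importance function is the state itself, the levels are 3, 5, 7, 9, 12, and the cloning step draws with replacement from the entrance states using the same sample size at every stage.

```python
import random

P_UP = 0.3
START = 1
LEVELS = [3, 5, 7, 9, 12]   # assumed levels L_1 < ... < L_m; F = {12}

def run_until(x, level, rng):
    """Run the walk from x until it reaches `level` or returns to 0."""
    while 0 < x < level:
        x += 1 if rng.random() < P_UP else -1
    return x

def splitting_estimate(n_per_stage, rng):
    """Product estimator: one factor R_{k+1} / N_k per stage."""
    states = [START] * n_per_stage            # stage 0: N_0 copies of y_0
    estimate = 1.0
    for level in LEVELS:
        ends = [run_until(x, level, rng) for x in states]
        entrance = [x for x in ends if x >= level]   # successful trajectories
        estimate *= len(entrance) / len(states)
        if not entrance:                      # every trajectory returned to 0
            return 0.0
        # Clone: draw N_k entrance states at random, with replacement.
        states = [rng.choice(entrance) for _ in range(n_per_stage)]
    return estimate

rng = random.Random(11)
estimate = splitting_estimate(5000, rng)

# Closed-form gambler's-ruin probability, used only to check the estimate.
r = (1 - P_UP) / P_UP
exact = (1 - r ** START) / (1 - r ** LEVELS[-1])
print(f"estimate = {estimate:.3e}, exact = {exact:.3e}")
```

Because the walk moves one step at a time, all entrance states at a level coincide in this example; in a general chain they differ, which is what makes the cloning step matter.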

Recently this form of the splitting method has attracted a lot of interest (see the reference list
in Rubino and Tuffin (2009, Chapter 3)), both from a theoretical point of view, analyzing the
efficiency, and from a practical point of view, describing several applications. The analysis
shows that the product estimator (11) is unbiased. Furthermore, the efficiency
of the splitting technique depends on the implementation of (a) selecting the levels, (b) the
splitting (cloning) of successful trajectories, and (c) the termination of unsuccessful
trajectories. Generally, the problem of solving these issues optimally is like choosing an
optimal change of measure in importance sampling. In fact, Dean and Dupuis (2008)
discuss this relationship when the model satisfies a large deviations principle.

Concerning issue (c), the standard splitting technique terminates a trajectory that returns to the
regeneration state 0, or, in the case of an importance function, when the trajectory falls back to
level $L_0$. This approach, however, may be inefficient for trajectories that already start at a high
level $L_k$. Therefore, there are several adaptations such as truncation (L'Ecuyer, Demers, and
Tuffin, 2007), RESTART (Villen-Altamirano and Villen-Altamirano, 1994), and the Russian
roulette principle (Melas, 1997).

Concerning issue (b), there are numerous ways to clone a trajectory that has entered the next
level, but the two ways implemented most often are (i) fixed effort, and (ii) fixed splitting. Fixed
effort means that the sample sizes $N_k$ are predetermined, and thus each of the $R_k$ entrance
states at set $\mathcal{X}_k$ is cloned $c_k = \lfloor N_k/R_k \rfloor$ times. The remaining $N_k \bmod R_k$ clones are selected
randomly. An alternative is to draw $N_k$ times at random (with replacement) from the $R_k$
available entrance states. Fixed splitting means that the splitting factors $c_k$ are predetermined,
and each of the $R_k$ entrance states at set $\mathcal{X}_k$ is cloned $c_k$ times to give sample size $N_k = c_k R_k$.
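The two cloning rules differ only in what is held fixed, which a short sketch makes explicit. The function names are hypothetical, and selecting the leftover fixed-effort clones without replacement is an assumption; the text only says that they are selected randomly.

```python
import random

def clone_fixed_effort(entrance_states, n_k, rng):
    """Return exactly n_k clones: floor(n_k / R_k) copies of each entrance
    state, plus n_k mod R_k further states picked at random (no repeats)."""
    r_k = len(entrance_states)
    c_k, remainder = divmod(n_k, r_k)
    clones = [s for s in entrance_states for _ in range(c_k)]
    clones += rng.sample(entrance_states, remainder)
    return clones

def clone_fixed_splitting(entrance_states, c_k):
    """Return c_k copies of each entrance state, so N_k = c_k * R_k is random."""
    return [s for s in entrance_states for _ in range(c_k)]

rng = random.Random(0)
entrance = ["a", "b", "c"]                    # R_k = 3 entrance states
effort = clone_fixed_effort(entrance, 10, rng)
split = clone_fixed_splitting(entrance, 4)
print(len(effort), len(split))
```

With fixed effort the work per stage is known in advance, whereas with fixed splitting the population size depends on how many trajectories succeeded.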

For a certain class of models, Glasserman et al. (1999) have shown that fixed splitting gives
asymptotic optimality (as $\ell \to 0$) when the number of levels is $m \approx -\ln(\ell)/2$, with sets $\mathcal{X}_k$ such
that the $P(A_k \mid A_{k-1})$ are all equal (namely, roughly equal to $e^{-2}$) and splitting factors such that
$c_k P(A_{k+1} \mid A_k) = 1$. However, since $\ell$ and the $P(A_{k+1} \mid A_k)$ are unknown in practice, this result
can only be approximated. Moreover, one should take into account the amount of work or
computing time in the analysis; for example, Lagnoux (2006) determines the optimal setting
under a budget constraint on the expected total computing time.
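As a rough worked example of these design rules, the sketch below turns a guessed order of magnitude for $\ell$ into a number of levels, a common per-level probability, and a splitting factor. The helper name and the rounding of $m$ to an integer are assumptions made for illustration; in practice the guess for $\ell$ would come from a pilot run.

```python
import math

def fixed_splitting_design(ell_guess):
    """Return (m, p_level, c): number of levels m ~ -ln(ell)/2, the common
    per-level probability ell**(1/m), and the splitting factor c = 1/p_level."""
    m = max(1, round(-math.log(ell_guess) / 2))
    p_level = ell_guess ** (1 / m)
    c = 1 / p_level
    return m, p_level, c

m, p_level, c = fixed_splitting_design(1e-9)
print(f"levels = {m}, per-level probability = {p_level:.4f}, "
      f"splitting factor = {c:.2f}, exp(-2) = {math.exp(-2):.4f}")
```

Because $m$ is rounded, the per-level probability is only close to $e^{-2}$ rather than exactly equal to it, and the splitting factor is generally not an integer, so it has to be rounded or randomized in an implementation.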

Application to counting
Recently, counting problems have attracted the interest of the theoretical computer science and
the operations research communities. A standard counting problem is model counting, or
#SAT: how many assignments to boolean variables satisfy a given boolean formula consisting
of a conjunction of clauses? The related classical decision problem is: does there exist an
assignment that makes the formula true? Because exact counting is impracticable due to the exponential
increase in memory and running times, attention shifted to approximate counting, notably by
applying randomized algorithms. In this randomized setting, the counting problem is
equivalent to rare-event simulation: let $\mathcal{X}^*$ be the set of all solutions of the problem, whose
number $|\mathcal{X}^*|$ is unknown and the subject of study. Assume that there is a larger set of points
$\mathcal{X} \supset \mathcal{X}^*$ with two properties:

1. the number of points $|\mathcal{X}|$ is known;

2. it is easy to generate points $x \in \mathcal{X}$ uniformly.

Because
\[ |\mathcal{X}^*| = \frac{|\mathcal{X}^*|}{|\mathcal{X}|} \, |\mathcal{X}|, \]
it suffices to estimate
\[ \ell = \frac{|\mathcal{X}^*|}{|\mathcal{X}|} = P(U \in \mathcal{X}^*), \]
where $U$ is the uniform random vector on $\mathcal{X}$. Typically $\ell$ is extremely small, and thus rare-event
techniques are required. Splitting techniques with Markov chain Monte Carlo (MCMC)
simulations have been developed in Botev and Kroese (2008) and Rubinstein (2010) to handle
such counting problems.
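The identity between counting and probability estimation can be illustrated on a deliberately small instance. The sketch below uses crude Monte Carlo, not the splitting-with-MCMC algorithms cited above, and the formula is a hypothetical example with only 10 variables, so the solution set is not rare and the exact count can be obtained by enumeration for comparison.

```python
import itertools
import random

N_VARS = 10
# Hypothetical CNF formula: positive k means variable k, negative k its negation.
CLAUSES = [(1, -2, 3), (-1, 4, 5), (2, -6, 7),
           (-3, 8, -9), (5, -7, 10), (-4, 6, -10)]

def satisfies(assignment, clauses):
    """True if every clause contains at least one literal that holds."""
    return all(any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses)

rng = random.Random(7)
n_rep = 20000
hits = sum(satisfies([rng.random() < 0.5 for _ in range(N_VARS)], CLAUSES)
           for _ in range(n_rep))
ell_hat = hits / n_rep                      # estimates P(U in solution set)
count_estimate = ell_hat * 2 ** N_VARS      # |X*| = ell * |X| with |X| = 2^n

# Brute-force count, feasible only because the instance is tiny.
exact_count = sum(satisfies(bits, CLAUSES)
                  for bits in itertools.product([False, True], repeat=N_VARS))
print(f"estimated count = {count_estimate:.1f}, exact count = {exact_count}")
```

For realistic instances the fraction of satisfying assignments is far too small for this crude estimator, which is the reason rare-event techniques are needed.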

QUASI MONTE-CARLO
Suppose that the performance function $H$ in (??) is defined on the $d$-dimensional unit
hypercube $[0, 1)^d$, and the problem is to compute its expectation with respect to the uniform
distribution:
\[ \ell = E(H(U)) = \int_{[0,1)^d} H(u) \, du. \]

As was shown in the introduction, the variance of the CMC estimator $\hat{\ell}_{Nm}$ using a sample size
$Nm$ equals $\sigma^2/(Nm)$, where
\[ \sigma^2 = \int_{[0,1)^d} H^2(u) \, du - \ell^2. \]

Let $P_N = \{u_1, \ldots, u_N\} \subset [0, 1)^d$ be a deterministic point set that is constructed according to a
quasi-Monte Carlo rule with low discrepancy, such as a lattice rule (Korobov), or a digital net
(Sobol', Faure, Niederreiter); see Lemieux (2006). The quasi-Monte Carlo approximation of $\ell$
would be
\[ \frac{1}{N} \sum_{j=1}^{N} H(u_j). \]

This deterministic approach is transformed into Monte Carlo simulation by applying a
randomization of the point set. A simple randomization technique is the random shift:
generate $m$ IID random vectors $v_i$, uniformly distributed on $[0, 1)^d$, $i = 1, \ldots, m$, and compute the
quasi-Monte Carlo approximations
\[ \hat{\ell}_i = \frac{1}{N} \sum_{j=1}^{N} H\big((u_j + v_i) \bmod 1\big). \]

Then the randomized quasi-Monte Carlo estimator using sample size $Nm$ is defined by
\[ \hat{\ell} = \frac{1}{m} \sum_{i=1}^{m} \hat{\ell}_i. \]

The scrambling technique is based on permuting the digits of the coordinates of the points $u_j$. Other
techniques of randomizing quasi-Monte Carlo point sets are less used. The main property is
that when the performance function $H$ is sufficiently smooth, these randomized quasi-Monte
Carlo methods give considerable variance reduction (Lemieux, 2006).
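A randomly shifted lattice rule takes only a few lines to try out. The sketch below is an illustration under assumptions that are not in the paper: the dimension is $d = 2$, the point set is the Korobov lattice with $N = 987$ and generator $a = 610$ (two consecutive Fibonacci numbers, a standard choice in two dimensions), and the integrand is the smooth test function $H(u) = \exp(u_1 + u_2)$, whose integral is $(e - 1)^2$.

```python
import math
import random
import statistics

def lattice_points(n, a, d):
    """Korobov lattice: u_j = (j / n) * (1, a, a^2, ...) mod 1, j = 0..n-1."""
    gen = [pow(a, i, n) for i in range(d)]
    return [[(j * g % n) / n for g in gen] for j in range(n)]

def H(u):
    """Smooth test integrand (an assumption); its integral is (e - 1)^d."""
    return math.exp(sum(u))

def rqmc_block_means(points, m, rng):
    """One quasi-Monte Carlo average per independent uniform random shift."""
    d = len(points[0])
    means = []
    for _ in range(m):
        shift = [rng.random() for _ in range(d)]
        total = sum(H([(x + v) % 1.0 for x, v in zip(u, shift)]) for u in points)
        means.append(total / len(points))
    return means

def cmc_block_means(n, m, d, rng):
    """Crude Monte Carlo averages over m blocks of n IID uniform points."""
    return [sum(H([rng.random() for _ in range(d)]) for _ in range(n)) / n
            for _ in range(m)]

rng = random.Random(2006)
N, A_GEN, D, M = 987, 610, 2, 10
rqmc = rqmc_block_means(lattice_points(N, A_GEN, D), M, rng)
cmc = cmc_block_means(N, M, D, rng)

exact = (math.e - 1) ** D
rqmc_estimate, rqmc_var = statistics.mean(rqmc), statistics.variance(rqmc)
cmc_estimate, cmc_var = statistics.mean(cmc), statistics.variance(cmc)
print(f"exact = {exact:.5f}, RQMC = {rqmc_estimate:.5f}, CMC = {cmc_estimate:.5f}")
print(f"variance across blocks: RQMC = {rqmc_var:.2e}, CMC = {cmc_var:.2e}")
```

The $m$ shifted estimates are independent and identically distributed, so their sample variance gives an error estimate for the randomized quasi-Monte Carlo estimator in the same way as for CMC.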

REFERENCES
Asmussen, S. and Glynn, P.W. (2007). Stochastic Simulation, Springer-Verlag, New York.
de Boer, P.T. and Nicola, V.F. (2002). Adaptive state-dependent importance sampling
simulation of Markovian queueing networks, European Transactions on Telecommunications,
13, 303-315.
Botev, Z.I. and Kroese, D.P. (2008). An efficient algorithm for rare-event probability
estimation, combinatorial optimization, and counting, Methodology and Computing in Applied
Probability, 10, 471-505.
Chen, X., Ankenman, B. and Nelson, B.L. (2010). The effects of common random numbers
on stochastic kriging metamodels. Working Paper, Department of Industrial Engineering and
Management Sciences, Northwestern University, Evanston, IL USA.
Chick, S.E. and Gans, N. (2009). Economic analysis of simulation selection problems,
Management Science, 55, 421-437.
Dean, T. and Dupuis, P. (2008). Splitting for rare event simulation: a large deviations
approach to design and analysis, Stochastic Processes and their Applications, 119, 562-587.
Dupuis, P., Sezer, D., and Wang, H. (2007). Dynamic importance sampling for queueing
networks, Annals of Applied Probability, 17, 1306-1346.
Dupuis, P. and Wang, H. (2007). Subsolutions of an Isaacs equation and efficient schemes for
importance sampling, Mathematics of Operations Research, 32, 723-757.
L'Ecuyer, P. (2006). Uniform random number generator. Handbook in Operations Research
and Management Science Vol 13: Simulation; S.G. Henderson and B.L. Nelson (Eds.),
Chapter 3, pp. 55-81.
L'Ecuyer, P., Blanchet, J.H., Tuffin, B., and Glynn, P.W. (2010). Asymptotic robustness of
estimators in rare-event simulation, ACM Transactions on Modeling and Computer
Simulation, 20, 1, article 6.
L'Ecuyer, P., Demers, V., and Tuffin, B. (2007). Rare events, splitting, and quasi-Monte Carlo,
ACM Transactions on Modeling and Computer Simulation, 17, 2, article 9.
Glasserman, P. (2003). Monte Carlo Methods in Financial Engineering, Springer-Verlag, New
York.
Glasserman, P., Heidelberger, P., Shahabuddin, P., and Zajic, T. (1999). Multilevel splitting for
estimating rare event probabilities, Operations Research, 47, 585-600.
Juneja, S. and Shahabuddin, P. (2006). Rare-event simulation techniques: an introduction and
recent advances. Handbook in Operations Research and Management Science Vol 13:
Simulation; S.G. Henderson and B.L. Nelson (Eds.), Elsevier, Amsterdam, Chapter 11,
pp. 291-350.
Kaynar, B. and Ridder, A. (2010). The cross-entropy method with patching for rare-event
simulation of large Markov chains, European Journal of Operational Research, 207,
1380-1397.
Kelton, W.D., Sadowski, R.P., and Sturrock, D.T. (2007). Simulation with Arena, Fourth
edition. McGraw-Hill, Boston.
Kleijnen, J.P.C. (2008). Design and Analysis of Simulation Experiments, Springer.
Lagnoux, A. (2006). Rare event simulation, Probability in the Engineering and Informational
Sciences, 20, 45-66.
Law, A.M. (2007). Simulation Modeling & Analysis, Fourth edition, McGraw-Hill, Boston.

Lemieux, C. (2006). Quasi-random number techniques. Handbook in Operations Research
and Management Science Vol 13: Simulation; S.G. Henderson and B.L. Nelson (Eds.),
Chapter 12, pp. 351-379.
Melas, V.B. (1997). On the efficiency of the splitting and roulette approach for sensitivity
analysis. Proceedings of the 1997 Winter Simulation Conference, pp. 269-274.
Ridder, A. (2010). Asymptotic optimality of the cross-entropy method for Markov chain
problems, Procedia Computer Science, 1, 1565-1572.
Ridder, A. and Taimre, T. (2009). State-dependent importance sampling schemes via
minimum cross-entropy. To appear in Annals of Operations Research.
Rubino, G. and Tuffin, B. (Eds.). (2009). Rare event simulation using Monte Carlo methods,
Wiley.
Rubinstein, R.Y. (2010). Randomized algorithms with splitting: Why the classic randomized
algorithms do not work and how to make them work, Methodology and Computing in Applied
Probability, 12, 1-50.
Rubinstein, R.Y. and Kroese, D.P. (2004). The cross-entropy method: a unified approach to
combinatorial optimization, Monte-Carlo simulation and machine learning, Springer.
Rubinstein, R.Y. and Kroese, D.P. (2008). Simulation and the Monte Carlo method, Wiley.
Rubinstein, R.Y. and Marcus, R. (1985). Efficiency of multivariate control variables in Monte
Carlo simulation, Operations Research, 33, 661-667.
Song, W.T. and Chiu, W. (2007). A five-class variance swapping rule for simulation
experiments: a correlated-blocks design, IIE Transactions, 39, 713-722.
Villen-Altamirano, M. and Villen-Altamirano, J. (1994). RESTART: a straightforward method
for fast simulation of rare events. Proceedings of the 1994 Winter Simulation Conference,
pp. 282-289.

