
Simulation Modelling

IE 519

Contents
Input Modelling - 3
Random Number Generation - 41
Generating Random Variates - 80
Output Analysis - 134
Resampling Methods - 205
Comparing Multiple Systems - 220
Simulation Optimization - 248
Metamodels - 278
Variance Reduction - 292
Case Study - 350

IE 519

Input Modelling

IE 519

Input Modelling
You make custom Widgets
How do you model the input
process?
Is it deterministic?
Is it random?
Look at some data
IE 519

Orders
1/5/2004
1/12/2004
1/20/2004
1/29/2004

2/3/2004
2/15/2004
2/19/2004
2/25/2004
2/28/2004

3/6/2004
3/15/2004
3/27/2004
3/31/2004

4/10/2004
4/14/2004
4/17/2004
4/21/2004
4/22/2004
4/28/2004

5/2/2004
5/3/2004
5/24/2004
5/26/2004

6/4/2004
6/15/2004

Now what?
IE 519

Histogram
[Figure: histogram of monthly order counts]

IE 519

Other Observations
Trend?

Stationary or non-stationary process

Seasonality

May require multiple processes

IE 519

Choices for Modelling


Use the data directly (trace-driven
simulation)
Use the data to fit an empirical
distribution
Use the data to fit a theoretical
distribution

IE 519

Assumptions
To fit a distribution, the data should
be drawn from IID observations
Could it be from more than one
distribution?

Statistical test

Is it independent?

Statistical test
IE 519

Activity I
Hypothesize families of
distributions

Look at the data


Determine what is a reasonable
process
Summary statistics
Histograms
Quantile summaries and box plots
IE 519

10

Activity II
Estimate the parameters

Maximum likelihood estimator (MLE)


Sometimes a very simple statistics
Sometimes requires numerical
calculations

IE 519

11

Activity III
Determine quality of fit

Compare theoretical distribution with


observations graphically
Goodness of fit tests
Chi-square tests
Kolmogorov-Smirnov test

Software

IE 519

12

Chi-Square Test
Formal comparison of a histogram and
the probability density/mass function
Divide the range of the fitted distribution into k intervals
$$[a_0, a_1),\; [a_1, a_2),\; \ldots,\; [a_{k-1}, a_k)$$
Count the number of observations in each interval:
$$N_j = \text{number of } X_i\text{'s in } [a_{j-1}, a_j)$$
IE 519

13

Chi-Square Test
Compute the expected proportion for each interval:
$$p_j = \int_{a_{j-1}}^{a_j} f(x)\,dx \;\;\text{(continuous data)}, \qquad p_j = \sum_{a_{j-1} \le x_i < a_j} p(x_i) \;\;\text{(discrete data)}$$
Test statistic:
$$\chi^2 = \sum_{j=1}^{k} \frac{(N_j - np_j)^2}{np_j}$$
Reject if too large


IE 519

14
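The test is straightforward to script. A minimal Python sketch (not from the slides; the exponential fit, the equal-probability intervals, and the data are illustrative assumptions):

```python
import numpy as np
from scipy import stats

def chi_square_gof(data, cdf, edges):
    """Chi-square GoF statistic comparing observed counts N_j with
    expected proportions p_j under a fitted continuous CDF."""
    n = len(data)
    N_j, _ = np.histogram(data, bins=edges)          # observed counts per interval
    p_j = np.diff(cdf(edges))                        # expected proportions p_j
    chi2 = np.sum((N_j - n * p_j) ** 2 / (n * p_j))  # test statistic
    k = len(edges) - 1
    # if s parameters were estimated, the true df lies between k-1-s and k-1
    return chi2, stats.chi2.sf(chi2, df=k - 1)

# Hypothetical interarrival times with an exponential fit (MLE: sample mean)
x = np.random.default_rng(0).exponential(scale=2.0, size=200)
edges = np.quantile(x, np.linspace(0, 1, 11))        # 10 equal-probability cells
edges[0], edges[-1] = 0.0, np.inf
print(chi_square_gof(x, stats.expon(scale=x.mean()).cdf, edges))
```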

How good is the data?


Assumption of IID observations
Sometimes time-dependent (nonstationary)
Assessment

Correlation plot
Scatter diagram
Nonparametric tests
IE 519

15

Correlation Plot
Calculate and plot the sample
correlation
$\hat{\rho}_j$ = correlation of observations $j$ apart
$$H_0: \rho_j = 0$$

IE 519

16

Scatter Diagram
Plot pairs $(X_i, X_{i+1})$

Should be scattered randomly


through the plane
If there is a pattern then this
indicates correlation

IE 519

17

Multiple Data Sets


Often you have multiple data sets (e.g.,
different days, weeks, operators)
$$X_{11}, X_{12}, \ldots, X_{1n_1}$$
$$X_{21}, X_{22}, \ldots, X_{2n_2}$$
$$\vdots$$
$$X_{k1}, X_{k2}, \ldots, X_{kn_k}$$
Is the data drawn from the same process
(homogeneous) and can thus be combined?
Kruskal-Wallis test
IE 519

18

Kruskal-Wallis (K-W) Statistic
Assign rank 1 to the smallest observation, rank 2 to the second smallest, etc.
$$n = \sum_{i=1}^{k} n_i, \qquad R(X_{ij}) = \text{rank of } X_{ij}, \qquad R_i = \sum_{j=1}^{n_i} R(X_{ij})$$
Calculate
$$T = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(n+1)$$

IE 519

19

K-W Test
The null hypothesis is
H0: All the population distribution are identical
H1: At least one is larger than at least one other
We reject H0 at level $\alpha$ if
$$T > \chi^2_{k-1,\,1-\alpha}$$
In other words, under H0 the test statistic approximately follows a chi-square distribution with k-1 degrees of freedom

IE 519

20
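The test is available in standard libraries, so it rarely needs to be hand-coded. A short sketch using scipy.stats.kruskal on hypothetical samples from three data-collection days:

```python
from scipy import stats

# Hypothetical service-time samples from three different days
day1 = [2.1, 3.4, 1.8, 2.9, 3.1]
day2 = [2.5, 2.2, 3.0, 2.7, 2.4]
day3 = [4.1, 3.8, 3.5, 4.4, 3.9]

T, p_value = stats.kruskal(day1, day2, day3)  # rank-based K-W statistic
# Pool the data sets only if homogeneity is NOT rejected (p_value >= alpha)
print(T, p_value)
```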

Absence of Data
We have assumed that we had data
to fit a distribution
Sometimes no data is available
Try to obtain minimum, maximum,
and mode and/or mean of the
distribution

Documentation
SMEs
IE 519

21

Triangular Distribution

IE 519

22

Symmetric Beta Distributions
$\alpha_1 = \alpha_2 = 2$, $\alpha_1 = \alpha_2 = 3$, $\alpha_1 = \alpha_2 = 5$, $\alpha_1 = \alpha_2 = 10$
[Figure: symmetric beta densities]

IE 519

23

Skewed Beta Distributions
$\alpha_1 = 2$, $\alpha_2 = 4$
[Figure: skewed beta densities]

IE 519

24

Beta Parameters
Mean:
$$\mu = a + \frac{\alpha_1}{\alpha_1 + \alpha_2}(b - a)$$
Mode:
$$c = a + \frac{\alpha_1 - 1}{\alpha_1 + \alpha_2 - 2}(b - a)$$
Estimates:
$$\hat{\alpha}_1 = \frac{(\mu - a)(2c - a - b)}{(c - \mu)(b - a)}, \qquad \hat{\alpha}_2 = \frac{(b - \mu)\,\hat{\alpha}_1}{\mu - a}$$
IE 519

25
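A small sketch of these matching formulas in Python (the numeric inputs are hypothetical values one might elicit from documentation or SMEs):

```python
def beta_params_from_summaries(a, b, c, mu):
    """Estimate beta shape parameters on [a, b] from mode c and mean mu,
    using the slide's formulas; assumes a < c < b and a < mu < b."""
    alpha1 = (mu - a) * (2 * c - a - b) / ((c - mu) * (b - a))
    alpha2 = alpha1 * (b - mu) / (mu - a)
    return alpha1, alpha2

# Task duration between 2 and 10 hours, most likely 4, mean 5 (hypothetical)
print(beta_params_from_summaries(2, 10, 4, 5))   # -> (1.5, 2.5)
```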

Benefits of Fitting a
Parametric Distribution
We have focused mainly on the approach
where we fit a distribution to data
Benefits:

Fill in gaps and smooth data


Make sure tail behavior is represented
Extreme events are very important to the simulation but may not appear in the data

Can easily incorporate changes in the input


process
Change mean, variability, etc.

Reflect dependencies in the inputs

IE 519

26

What About Dependencies


Assumed so far an IID process
Many processes are not:

A customer places a monthly order. Since the


customer keeps inventory of the product, a large
order is often followed by a small order
A distributor with several warehouses places
monthly orders, and these warehouses can supply
the same customers
The behavior of customers logging on to a web site
depends on age, gender, income, and where they
live

Do not ignore it!


IE 519

27

Solutions
A customer places a monthly order.

Should use a time-series model that


captures the autocorrelation

A distributor with several warehouses

Need a vector time-series model

Customers logging on to a web site

Need a random vector model where each


component may have a different distribution

IE 519

28

Taxonomy of Input Models

Examples of models

Univariate

Discrete
Continuous
Mixed

Binomial, etc.
Normal, gamma, beta, etc.
Empirical/Trace-driven

Multivariate

Discrete
Continuous
Mixed

Independent binomial
Multivariate normal
Bivariate-exponential

Time-independent

Discrete-state Markov chains (stationary?)


Discrete-time

Cont.-state
Discrete-state

Stochastic Processes
Continuous-time

Cont.-state

IE 519

Time-series models
Poisson process (stationary?)
Markov process

29

What if it Changes over


Time?
Do not ignore it!
Non-stationary input process
Examples:

Arrivals of customers to a restaurant


Arrivals of email to a server
Arrivals of bug discovery in software

Could model as nonhomogeneous


Poisson process
IE 519

30

Goodness-of-Fit Test
The distribution fitted is tested using
goodness-of-fit tests (GoF)
How good are those tests?
The null hypothesis is that the data
is drawn from the chosen
distribution with the estimated
parameters
Is it true?
IE 519

31

Power of GoF Tests


The null hypothesis is always false!
If the GoF test is powerful enough then it
will always be rejected
What we see in practice:

Few data points:


no distribution is rejected
A great deal of data:
all distributions are rejected

At best, GoF tests should be used as a


guide
IE 519

32

Input Modeling Software


Many software packages exist for
input modeling (fitting distributions)
Each has at least 20-30 distributions
You input IID data, the software
gives you a ranked list of
distributions (according to GoF tests)
Pitfalls?
IE 519

33

Why Fit a Distribution at


All?
There is a growing sentiment that
we should never fit distributions
(not consensus, just growing)
A couple of issues:

You don't always benefit from data

Fitting a distribution can be misleading

IE 519

34

Is Data Reality?
Data is often

Distorted
Poorly communicated, mistranslated or recorded

Dated
Data is always old by definition

Deleted
Some of the data is often missing

Dependent
Often only summaries, or collected at certain times

Deceptive
This may all be on purpose!

IE 519

35

Problems with Fitting


Fitting an input distribution can be
misleading for numerous reasons

There is rarely a theoretical justification for


the distribution. Simulation is often sensitive
to the tails and this is where the problem is!
Selecting the correct model is futile
The model gives the simulation practitioner a false sense of the model being well-defined
IE 519

36

Alternative
Use empirical/trace-driven
simulation when there is sufficient
data
Treat other cases as if there is no
data, and use beta distribution
IE 519

37

Empirical Distribution
Observations $X_1, X_2, \ldots, X_n$
Empirical distribution function (CDF):
$$F_X(x) = \frac{\text{Number of } X_i \le x}{n}$$
or we can order the observations $X_{(1)} \le X_{(2)} \le \ldots \le X_{(n)}$ and define
$$F_X(x) = \begin{cases} 0 & x < X_{(1)} \\ \dfrac{i-1}{n-1} + \dfrac{x - X_{(i)}}{(n-1)\left(X_{(i+1)} - X_{(i)}\right)} & X_{(i)} \le x < X_{(i+1)} \\ 1 & X_{(n)} \le x \end{cases}$$

IE 519

38

Beta Distribution Shapes

IE 519

39

What to Do?
Old rule of thumb based on the number of data points available:
<20: Not enough data to fit
21-50: Fit, rule out poor choices
50-200: Fit a distribution
>200: Use empirical distribution
IE 519

40

Random Number
Generation

IE 519

41

Random-Number
Generation
Any simulation with random
components requires generating a
sequence of random numbers
E.g., we have talked about arrival times,
service times being drawn from a
particular distribution
We do this by first generating a random
number (uniform between [0,1]) and
then transforming it appropriately
IE 519

42

Three Alternatives
True random numbers

Throw a dice
Not possible to do with a computer

Pseudo-random numbers

Deterministic sequence that is statistically


indistinguishable from a random sequence

Quasi-random numbers

A regular distribution of numbers over the


desired interval
IE 519

43

Why is this Important?


Validity

The simulation model may not be valid


due to cycles and dependencies in the
model

Precision

You can improve the output analysis by


carefully choosing the random numbers
IE 519

44

Pseudo-Random Numbers
Want an iterative algorithm that
outputs numbers on a fixed interval
When we subject this sequence to a
number of statistical test, we cannot
distinguish it from a random
sequence
In reality, it is completely
deterministic
IE 519

45

Linear Congruential
Generators (LCG)
Introduced in the early 50s and still in
very wide use today
Recursive formula:
$$Z_i = (aZ_{i-1} + c) \bmod m$$
a: multiplier, c: increment, m: modulus, $Z_0$: seed
Every number is determined by these four values
IE 519

46

Transform to Unit Uniform


Simply divide by m:
$$U_i = \frac{Z_i}{m}$$
What values can we take?

IE 519

47

Examples
$$Z_i = (11 Z_{i-1}) \bmod 16, \quad Z_0 = 1$$
$$Z_i = (3 Z_{i-1}) \bmod 13, \quad Z_0 = 1$$
$$Z_i = (Z_{i-1} + 12) \bmod 16, \quad Z_0 = 1$$
IE 519

48
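A quick sketch that traces each example generator until it returns to its seed, making the cycling behavior of LCGs concrete (pure Python; nothing assumed beyond the recursions above):

```python
def lcg(a, c, m, seed):
    """Linear congruential generator Z_i = (a*Z_{i-1} + c) mod m."""
    z = seed
    while True:
        z = (a * z + c) % m
        yield z

for a, c, m in [(11, 0, 16), (3, 0, 13), (1, 12, 16)]:
    out = []
    for z in lcg(a, c, m, seed=1):
        out.append(z)
        if z == 1:              # back at the seed: one full cycle observed
            break
    print(f"a={a}, c={c}, m={m}: cycle length {len(out)}, values {out}")
```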

Characteristics
All LCGs loop
The length of the cycle is the period
LCGs with period m have full period
This happens if and only if

The only positive integer that divides both m


and c is 1
If q is a prime that divides m, then q divides
a-1
If 4 divides m then 4 divides a-1

IE 519

49

Types of LCGs
If c=0 then it is called
multiplicative LCG, otherwise
mixed LCG
Mixed and multiplicative LCG
behave rather differently

IE 519

50

Comments on Parameters
Mixed Generator

Want m to be large
A good choice is m = 2b, where b is the number of
bits
Obtain full period if c is odd and a-1 is divisible
by 4

Multiplicative LCGs

Simpler
Cannot have full period (first condition cannot
be satisfied)
Still an attractive option

IE 519

51

Performance Tests
Empirical tests
Use the RNG to generate some
numbers and then test the null
hypothesis
H0: The sequence is IID U(0,1)

IE 519

52

Test 1: Chi-Square Test
Similar to before:
Generate $U_1, U_2, \ldots, U_n$
Split [0,1] into k subintervals (k ≥ 100)
Test statistic:
$$\chi^2 = \frac{k}{n} \sum_{j=1}^{k} \left( f_j - \frac{n}{k} \right)^2, \qquad f_j = \text{number of } U_i\text{'s in the } j\text{th subinterval}$$
With k-1 degrees of freedom
IE 519

53

Test 2: Serial Test
Consider nonoverlapping d-vectors
$$\mathbf{U}_1 = (U_1, U_2, \ldots, U_d), \quad \mathbf{U}_2 = (U_{d+1}, U_{d+2}, \ldots, U_{2d}), \ldots$$
Similar to before:
$$\chi^2(d) = \frac{k^d}{n} \sum_{j_1=1}^{k} \sum_{j_2=1}^{k} \cdots \sum_{j_d=1}^{k} \left( f_{j_1 j_2 \cdots j_d} - \frac{n}{k^d} \right)^2$$
$f_{j_1 j_2 \cdots j_d}$ = number of $\mathbf{U}_i$'s in the corresponding subinterval

IE 519

54

Test 3: Runs Test
Calculate for $U_1, U_2, \ldots, U_n$:
$$r_i = \begin{cases} \text{number of runs up of length } i & i = 1, 2, \ldots, 5 \\ \text{number of runs up of length} \ge 6 & i = 6 \end{cases}$$
Test statistic (chi-square with 6 d.f.):
$$R = \frac{1}{n} \sum_{i=1}^{6} \sum_{j=1}^{6} a_{ij} \left( r_i - n b_i \right) \left( r_j - n b_j \right)$$
where the $a_{ij}$ and $b_i$ values are given empirically
IE 519

55

Test 4: Correlation Test
For uniform variables:
$$E[U] = \frac{1}{2}, \qquad \mathrm{Var}(U) = \frac{1}{12}$$
$$\rho_j = \frac{C_j}{\mathrm{Var}(U)}, \quad C_j = \mathrm{Cov}(U_i, U_{i+j}) = E[U_i U_{i+j}] - E[U_i]E[U_{i+j}], \quad \text{so} \quad \rho_j = 12\, E[U_i U_{i+j}] - 3$$

IE 519

56

Test 4: Correlation Test
Empirical estimate:
$$\hat{\rho}_j = \frac{12}{h+1} \sum_{k=0}^{h} U_{1+kj}\, U_{1+(k+1)j} - 3, \qquad h = \left\lfloor \frac{n-1}{j} \right\rfloor - 1$$
$$\widehat{\mathrm{Var}}(\hat{\rho}_j) = \frac{13h + 7}{(h+1)^2}$$
Test statistic:
$$A_j = \frac{\hat{\rho}_j}{\sqrt{\widehat{\mathrm{Var}}(\hat{\rho}_j)}}$$
Approximately standard normal
IE 519

57
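A sketch of this estimator and test statistic (the input here is NumPy's default generator, used only as an illustrative sequence to test):

```python
import numpy as np

def lag_correlation_stat(u, j):
    """A_j = rho_hat_j / sqrt(Var(rho_hat_j)); approximately N(0,1)
    under H0 that the sequence is IID U(0,1)."""
    n = len(u)
    h = (n - 1) // j - 1
    k = np.arange(h + 1)
    rho_hat = 12.0 / (h + 1) * np.sum(u[k * j] * u[(k + 1) * j]) - 3.0
    return rho_hat / np.sqrt((13.0 * h + 7.0) / (h + 1) ** 2)

u = np.random.default_rng(1).random(10000)
print([round(lag_correlation_stat(u, j), 2) for j in (1, 2, 5)])
```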

Passing the Test


A RNG with long period that passes a
fixed set of statistical test is no
guarantee of this being a good RNG
Many commonly used generators are
not good at all, even though they
pass all of the most basic tests

IE 519

58

Classic LCG16807
Multiplicative LCGs cannot have full period, but they can get very close:
$$Z_i = 16807\, Z_{i-1} \bmod (2^{31} - 1), \qquad U_i = \frac{Z_i}{2^{31} - 1}$$
Has period of $2^{31} - 2$, the best possible
Dates back to 1969
Suggested in many simulation texts and was (is) the standard for simulation software
Still in use in many software packages
IE 519

59

Java RNG
Mixed LCG with full period:
$$Z_i = (25214903917\, Z_{i-1} + 11) \bmod 2^{48}$$
$$U_i = \frac{2^{27} \lfloor Z_{2i} / 2^{22} \rfloor + \lfloor Z_{2i+1} / 2^{21} \rfloor}{2^{53}}$$
Variant of the old drand48() Unix LCG, which uses
$$U_i = \frac{Z_i}{2^{48}}$$
IE 519

60

Two More LCGs
VB:
$$Z_i = (1140671485\, Z_{i-1} + 12820163) \bmod 2^{24}, \qquad U_i = \frac{Z_i}{2^{24}}$$
Excel:
$$U_i = (9821.0\, U_{i-1} + 0.211327) \bmod 1$$

IE 519

61

Simple Simulation Tests
Collision Test:
Divide [0,1) into d equal intervals, giving $d^t$ boxes in $[0,1)^t$
Generate n points in $[0,1)^t$
C = number of times a point falls in a box that already has a point (a collision)
Birthday Spacing Test:
Have k boxes, labeled with the sorted points $I_{(1)} \le I_{(2)} \le \cdots \le I_{(n)}$
Define the spacings $S_j = I_{(j+1)} - I_{(j)}$
Consider the indices j with $S_{j+1} = S_j$, $j = 1, \ldots, n-2$

IE 519

62

Performance: Collision
After $2^{15}$ numbers, VB starts failing
After $2^{17}$ numbers, Excel starts failing
After $2^{19}$ numbers, LCG16807 starts failing
The Java RNG does OK up to at least $2^{20}$ numbers
Note that this means that a clear pattern is observed from the VB RNG with fewer than 100,000 numbers generated!
IE 519

63

Performance: B-day Spacing
After $2^{10}$ numbers, VB starts failing
After $2^{14}$ numbers, Excel starts failing
After $2^{14}$ numbers, LCG16807 starts failing
After $2^{18}$ numbers, Java starts failing
For this test, the VB RNG is only good for about 1000 numbers!
The performance gets even worse if we look at less significant digits.
IE 519

64

Combined LCG
A better RNG is obtained as follows:
$$Z_{1,i} = (a_{1,1} Z_{1,i-1} + a_{1,2} Z_{1,i-2} + \cdots + a_{1,k} Z_{1,i-k}) \bmod m_1$$
$$Z_{2,i} = (a_{2,1} Z_{2,i-1} + a_{2,2} Z_{2,i-2} + \cdots + a_{2,k} Z_{2,i-k}) \bmod m_2$$
$$U_i = \left( \frac{Z_{1,i}}{m_1} - \frac{Z_{2,i}}{m_2} \right) \bmod 1$$
Recommended parameters (k = 3):
$(a_{1,1}, a_{1,2}, a_{1,3}) = (0,\; 1403580,\; -810728)$
$(a_{2,1}, a_{2,2}, a_{2,3}) = (527612,\; 0,\; -1370589)$
$m_1 = 2^{32} - 209$, $m_2 = 2^{32} - 22853$
Cycle length of about $2^{191}$ with good structure


IE 519

65

Why do RNGs Fail?


We have seen that many commonly used RNGs fail simulation tests, even though they pass the standard empirical tests
Why do these RNGs fail?
Need to analyze the structure of
the RNG
IE 519

66

Lattice Structure
For all LCGs, the numbers
generated fall in a fixed number of
planes
We want this to be as many planes
as possible and fill-up the space
This should be true in many
dimensions
IE 519

67

Example: Two Full-Period


LCGs

IE 519

68

LCG RANDU in 3
Dimensions

IE 519

69

Theoretical Tests
Based on analyzing the structure
of the numbers that can be
generated
Lattice test
Spectral test

IE 519

70

Selecting the Seed


$$Z_i = (11 Z_{i-1}) \bmod 16$$
Say we need two independent sequences of 8 numbers
Select seed values 1 and 15:
1 10 15 0 13 6 11 12 9 2 7 8 5 14 3 4 1
(Seed = 1 starts one segment; seed = 15 starts the other)
Good RNGs will have precomputed seed values
IE 519

71

Streams and Substreams


A segment corresponding to a seed is
usually called a stream
Also want to be able to get independent
substreams of each stream
Example: Assign each stream to
generating one type of numbers & and
use each substream for independent
replications
Requires very long period generators,
and precomputed streams
IE 519

72

$$Z_i = (11 Z_{i-1}) \bmod 16$$
Analysis of RNG
Sequence: 1 10 15 0 13 6 11 12 9 2
$U_i$ values: 0.06, 0.63, 0.94, 0.00, 0.81, 0.38, 0.69, 0.75, 0.56, 0.13
[Figure: histogram of the $U_i$ over the intervals [0,0.25), [0.25,0.5), [0.5,0.75), [0.75,1)]

IE 519

73

Do We Need Randomness?
For certain applications, definitely
For simulation, maybe not always
Quasi-random numbers
Say we want to estimate an expected value
$$\theta = \int_{[0,1)^s} f(\mathbf{u})\, d\mathbf{u}$$

IE 519

74

Monte Carlo Estimate
Using n independent simulation runs:
$$\hat{\theta}_n = \frac{1}{n} \sum_{i=0}^{n-1} f(\mathbf{u}_i)$$
$$\mathrm{Var}(\hat{\theta}_n) = \frac{\sigma^2}{n}, \qquad \sqrt{n}\, \frac{\hat{\theta}_n - \theta}{\sigma} \Rightarrow N(0,1)$$
Error converges at rate $n^{-1/2}$


IE 519

75

Quasi-Monte Carlo
Replace the random points with a set of
points that cover [0,1)s more uniformly

IE 519

76

Discussion
By using Quasi-random numbers,
we are able to achieve faster
convergence rate
When estimating an integral, real
randomness is not really an issue
What about discrete event
simulation?
IE 519

77

Discussion
Generating random numbers is
important to every simulation
project

Validity of the simulation


Precision of the output analysis

Not all RNG are very good

IE 519

78

Discussion
Problems

Too short a period (a period of $2^{31}$ is not sufficient)

Unfavorable lattice structure
Numbers generated by RANDU fall on 15 planes in $\mathbb{R}^3$

Inability to get truly independent subsequences


Need streams (segments), and substreams

Should choose a RNG that passes both


empirical and theoretical tests, has a very
long period, and allows us to get good
streams
IE 519

79

Generating Random
Variates

IE 519

80

Generating Random
Variates
Say we have fitted an exponential
distribution to interarrival times of
customers
Every time we anticipate a new customer arrival (place an arrival event on the event list), we need to generate a realization of the interarrival time
Know how to generate unit uniform
Can we use this to generate exponential?
(And other distributions)
IE 519

81

Two Types of Approaches


Direct

Obtain an analytical expression


Inverse transform
Requires inverse of the distribution function

Composition & Convolution


For special forms of distribution functions

Indirect

Acceptance-rejection
IE 519

82

Inverse-Transform Method

IE 519

83

Formulation
Algorithm:
1. Generate $U \sim U(0,1)$
2. Return $X = F^{-1}(U)$
Proof:
$$P(X \le x) = P\left( F^{-1}(U) \le x \right) = P(U \le F(x)) = F(x)$$
IE 519

84

Example: Weibull

IE 519

85

Example: Exponential
$$F(x) = \begin{cases} 1 - e^{-x/\beta} & x \ge 0 \\ 0 & x < 0 \end{cases}$$
Inverting gives $X = -\beta \ln(1 - U)$

IE 519

86
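A sketch of the inverse transform for the exponential case (NumPy for convenience; the mean and sample size are illustrative):

```python
import numpy as np

def exponential_variates(beta, n, rng=None):
    """Exponential(mean beta) via the inverse transform X = -beta*ln(1-U)."""
    rng = rng or np.random.default_rng()
    u = rng.random(n)
    return -beta * np.log(1.0 - u)   # -beta*log(u) also works, since 1-U ~ U(0,1)

x = exponential_variates(beta=2.0, n=100000, rng=np.random.default_rng(42))
print(x.mean())   # should be close to 2.0
```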

Discrete Distributions

IE 519

87

Formulation
Algorithm ($X$ can take values $x_1, x_2, \ldots$):
1. Generate $U \sim U(0,1)$
2. Return $X = x_I$ where $I = \min \{ i : U \le F(x_i) \}$
Proof: need to show $P(X = x_i) = p(x_i)$ for all $i$
IE 519

88

Continuous, Discrete, Mixed
Algorithm:
1. Generate $U \sim U(0,1)$
2. Return $X = \min \{ x : F(x) \ge U \}$

IE 519

89

Discussion: Disadvantages
Must evaluate the inverse of the
distribution function

May not exist in closed form


Could still use numerical methods

May not be the fastest way

IE 519

90

Discussion: Advantages
Facilitates variance reduction:
$$X_1 = F_1^{-1}(U_1), \qquad X_2 = F_2^{-1}(U_2)$$
Can select $U_1, U_2$ independent, or $U_1 = U_2$, or $U_1 = 1 - U_2$
Ease of generating truncated distributions
IE 519

91

Composition
Assume that
$$F(x) = \sum_{j=1}^{\infty} p_j F_j(x), \qquad \sum_{j=1}^{\infty} p_j = 1$$
Algorithm:
1. Generate a positive random integer J such that $P(J = j) = p_j$
2. Return X with distribution $F_J$
IE 519

92

Convolution
Assume that $X = Y_1 + Y_2 + \cdots + Y_m$, where the $Y_i$ are IID with CDF G
Algorithm:
1. Generate $Y_1, Y_2, \ldots, Y_m$ IID, each with CDF G
2. Return $X = Y_1 + Y_2 + \cdots + Y_m$

IE 519

93

Acceptance-Rejection Method
Specify a function t that majorizes the density:
$$t(x) \ge f(x) \quad \text{for all } x$$
New density function:
$$r(x) = \frac{t(x)}{c}, \qquad c = \int_{-\infty}^{\infty} t(x)\, dx$$
Algorithm:
1. Generate Y with density r
2. Generate U independent of Y
3. If $U \le f(Y)/t(Y)$, return $X = Y$. Otherwise go back to step 1.
IE 519

94
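A sketch of the algorithm on a simple case of my choosing (not from the slides): the beta(2,2) density f(x) = 6x(1-x) on [0,1], majorized by the constant t(x) = 1.5, so that r is the U(0,1) density:

```python
import numpy as np

def beta22_acceptance_rejection(n, rng=None):
    """Generate beta(2,2) variates by acceptance-rejection with t(x) = 1.5."""
    rng = rng or np.random.default_rng()
    out = []
    while len(out) < n:
        y = rng.random()                    # step 1: Y ~ r (uniform here)
        u = rng.random()                    # step 2: U independent of Y
        if u <= 6 * y * (1 - y) / 1.5:      # step 3: accept if U <= f(Y)/t(Y)
            out.append(y)
    return np.array(out)

x = beta22_acceptance_rejection(100000)
print(x.mean(), x.var())   # beta(2,2) has mean 0.5 and variance 0.05
```

The acceptance probability is 1/c = 1/1.5, so about a third of the candidates are wasted; a tighter majorizing function is more efficient, which is exactly the point of the examples that follow.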

Example:

IE 519

95

Example: More Efficient

IE 519

96

Simple Distributions
Uniform: $X = a + (b - a)U$
Exponential: $X = -\beta \ln U$
m-Erlang:
$$X = -\frac{\beta}{m} \ln \prod_{i=1}^{m} U_i$$

IE 519

97

Gamma
Distribution function (for integer shape $\alpha$):
$$F(x) = \begin{cases} 1 - e^{-x/\beta} \displaystyle\sum_{j=0}^{\alpha-1} \frac{(x/\beta)^j}{j!} & x > 0 \\ 0 & \text{otherwise} \end{cases}$$
No closed-form inverse
Note that if $X \sim \text{gamma}(\alpha, 1)$ then $\beta X \sim \text{gamma}(\alpha, \beta)$
IE 519

98

Gamma(,1) Density

IE 519

99

Gamma(α,1)
Gamma(1,1) is exponential(1)
0 < α < 1: acceptance-rejection with
$$t(x) = \begin{cases} 0 & x \le 0 \\ \dfrac{x^{\alpha-1}}{\Gamma(\alpha)} & 0 < x \le 1 \\ \dfrac{e^{-x}}{\Gamma(\alpha)} & 1 < x \end{cases}$$
This majorizes the Gamma(α,1) density, but can we generate random variates from it?
IE 519

100

Gamma(α,1), 0 < α < 1
The integral of the majorizing function:
$$c = \int_0^{\infty} t(x)\, dx = \int_0^1 \frac{x^{\alpha-1}}{\Gamma(\alpha)}\, dx + \int_1^{\infty} \frac{e^{-x}}{\Gamma(\alpha)}\, dx = \frac{b}{\alpha \Gamma(\alpha)}, \qquad b = \frac{e + \alpha}{e}$$
New density:
$$r(x) = \begin{cases} 0 & x \le 0 \\ \dfrac{\alpha x^{\alpha-1}}{b} & 0 < x \le 1 \\ \dfrac{\alpha e^{-x}}{b} & 1 < x \end{cases}$$

IE 519

101

Gamma(α,1), 0 < α < 1
The distribution function is
$$R(x) = \int_0^x r(y)\, dy = \begin{cases} \dfrac{x^{\alpha}}{b} & 0 \le x \le 1 \\ 1 - \dfrac{\alpha e^{-x}}{b} & 1 < x \end{cases}$$
Invert:
$$R^{-1}(u) = \begin{cases} (bu)^{1/\alpha} & u \le 1/b \\ -\ln \dfrac{b(1-u)}{\alpha} & \text{otherwise} \end{cases}$$

IE 519

102

Gamma(,1), 0<<1
1. Generate $U_1 \sim U(0,1)$ and let $P = bU_1$. If P > 1, go to step 3. Otherwise go to step 2.
2. Let $Y = P^{1/\alpha}$, and generate $U_2 \sim U(0,1)$. If $U_2 \le e^{-Y}$, return X = Y. Otherwise, go back to step 1.
3. Let $Y = -\ln[(b - P)/\alpha]$, and generate $U_2 \sim U(0,1)$. If $U_2 \le Y^{\alpha-1}$, return X = Y. Otherwise, go back to step 1.
IE 519

103

Gamma(α,1), 1 < α
Acceptance-rejection with a log-logistic majorizing function $t(x) = c\, r(x)$, where
$$\lambda = \sqrt{2\alpha - 1}, \qquad c = \frac{4 \alpha^{\alpha} e^{-\alpha}}{\Gamma(\alpha) \sqrt{2\alpha - 1}}$$
IE 519

104

Gamma(α,1), 1 < α
Distribution function (log-logistic):
$$R(x) = \frac{x^{\lambda}}{\alpha^{\lambda} + x^{\lambda}}$$
Inverse:
$$R^{-1}(u) = \alpha \left( \frac{u}{1 - u} \right)^{1/\lambda}$$

IE 519

105

Normal
Distribution function does not have a closed form (so neither does the inverse)
Can use numerical methods for the inverse-transform method
Note that if $X \sim N(0,1)$ then $\mu + \sigma X \sim N(\mu, \sigma^2)$
If we can generate a unit normal, then we can generate any normal
IE 519

106

Normal: Box-Muller
Algorithm:
1. Generate independent $U_1, U_2 \sim U(0,1)$
2. Set $X_1 = \sqrt{-2 \ln U_1} \cos(2\pi U_2)$, $X_2 = \sqrt{-2 \ln U_1} \sin(2\pi U_2)$
3. Return $X_1$ and $X_2$
Technically independent N(0,1), but serious problems if used with LCGs
IE 519

107

Polar Method
Algorithm:
1. Generate independent $U_1, U_2 \sim U(0,1)$. Let $V_i = 2U_i - 1$, $W = V_1^2 + V_2^2$
2. If W > 1, go back to step 1. Otherwise, let
$$Y = \sqrt{\frac{-2 \ln W}{W}}, \qquad X_1 = V_1 Y, \qquad X_2 = V_2 Y$$
IE 519

108
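A sketch of the polar method exactly as stated (standard library only):

```python
import math, random

def polar_normal_pair(rng=random):
    """Return a pair of independent N(0,1) variates via the polar method."""
    while True:
        v1 = 2.0 * rng.random() - 1.0
        v2 = 2.0 * rng.random() - 1.0
        w = v1 * v1 + v2 * v2
        if 0.0 < w <= 1.0:                 # accepted ~78.5% of the time
            y = math.sqrt(-2.0 * math.log(w) / w)
            return v1 * y, v2 * y

print(polar_normal_pair())
```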

Derived Distributions
Several distributions are derived
from the gamma and normal
Can take advantage of knowing
how to generate those two
distributions

IE 519

109

Beta
Density:
$$f(x) = \begin{cases} \dfrac{x^{\alpha_1 - 1} (1 - x)^{\alpha_2 - 1}}{B(\alpha_1, \alpha_2)} & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}, \qquad B(\alpha_1, \alpha_2) = \int_0^1 t^{\alpha_1 - 1} (1 - t)^{\alpha_2 - 1}\, dt$$
No closed-form CDF, no closed-form inverse
Must use numerical methods for the inverse-transform method
IE 519

110

Beta Distribution Shapes

IE 519

111

Beta Properties
Sufficient to consider beta on [0,1]
If $X \sim \text{beta}(\alpha_1, \alpha_2)$ then $1 - X \sim \text{beta}(\alpha_2, \alpha_1)$
If $\alpha_2 = 1$ then
$$f(x) = \frac{x^{\alpha_1 - 1}(1 - x)^{0}}{B(\alpha_1, 1)} = \alpha_1 x^{\alpha_1 - 1}, \qquad F(x) = x^{\alpha_1}, \qquad F^{-1}(u) = u^{1/\alpha_1}$$
If $\alpha_1 = \alpha_2 = 1$ then $X \sim U(0,1)$
IE 519

112

Beta: General Approach
If $Y_1 \sim \text{gamma}(\alpha_1, 1)$ and $Y_2 \sim \text{gamma}(\alpha_2, 1)$, and $Y_1$ and $Y_2$ are independent, then
$$\frac{Y_1}{Y_1 + Y_2} \sim \text{beta}(\alpha_1, \alpha_2)$$
Thus, if we can generate two gamma random variates, we can generate a beta with arbitrary parameters
IE 519

113

Pearson Type V and Type VI
Pearson Type V:
$X \sim \text{PT5}(\alpha, \beta)$ iff $1/X \sim \text{gamma}(\alpha, 1/\beta)$
Pearson Type VI:
If $Y_1 \sim \text{gamma}(\alpha_1, \beta)$ and $Y_2 \sim \text{gamma}(\alpha_2, 1)$, and $Y_1$ and $Y_2$ are independent, then
$$\frac{Y_1}{Y_2} \sim \text{PT6}(\alpha_1, \alpha_2, \beta)$$
IE 519

114

Pearson Type V

IE 519

115

Pearson Type VI

IE 519

116

Normal Derived Distributions
Lognormal:
If $Y \sim N(\mu, \sigma^2)$ then $e^Y \sim LN(\mu, \sigma^2)$
Test distributions (not often used for modeling):
Chi-squared
Student's t distribution
F distribution
IE 519

117

Log-Normal

IE 519

118

Empirical
Use the inverse-transform method
Do not need to search through observations because changes occur precisely at 0, 1/(n-1), 2/(n-1), ...
Algorithm:
1. Generate $U \sim U(0,1)$. Let $P = (n-1)U$ and $I = \lfloor P \rfloor + 1$
2. Return $X = X_{(I)} + (P - I + 1)\left( X_{(I+1)} - X_{(I)} \right)$

IE 519

119
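A sketch of this interpolation algorithm (the data used to build the empirical distribution is hypothetical):

```python
import math
import numpy as np

def empirical_variate(x_sorted, u):
    """Inverse transform from the continuous empirical CDF above.
    x_sorted holds the order statistics X_(1) <= ... <= X_(n)."""
    n = len(x_sorted)
    p = (n - 1) * u
    i = math.floor(p) + 1                  # 1-based index I
    # X = X_(I) + (P - I + 1)(X_(I+1) - X_(I)); arrays below are 0-based
    return x_sorted[i - 1] + (p - i + 1) * (x_sorted[i] - x_sorted[i - 1])

data = np.sort(np.random.default_rng(3).exponential(2.0, size=100))
print(empirical_variate(data, np.random.default_rng(4).random()))
```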

Empirical Distribution
Function

IE 519

120

Discrete Distributions
Can always use the inverse-transform method
May not be most efficient
Algorithm:
1. Generate $U \sim U(0,1)$
2. Return the nonnegative integer $X = I$ that satisfies
$$\sum_{j=0}^{I-1} p(j) \le U < \sum_{j=0}^{I} p(j)$$
IE 519

121

Alias Method
Another general method is the
alias method, which works for
every finite range discrete
distribution

IE 519

122

Alias Method: Example
$$p(x) = \begin{cases} 0.1 & x = 0 \\ 0.4 & x = 1 \\ 0.2 & x = 2 \\ 0.3 & x = 3 \end{cases}$$
Aliases: $L_0 = 1$, $L_2 = 3$
1. Generate $I \sim DU(0, n)$ and $U \sim U(0,1)$
2. If $U \le F_I$ return $X = I$. Otherwise return $X = L_I$
IE 519

123

Bernoulli
Mass function:
$$p(x) = \begin{cases} 1 - p & x = 0 \\ p & x = 1 \\ 0 & \text{otherwise} \end{cases}$$
Algorithm:
1. Generate $U \sim U(0,1)$
2. If $U \le p$ return $X = 1$. Otherwise return $X = 0$

IE 519

124

Binomial
Mass function:
$$p(x) = \begin{cases} \dbinom{t}{x} p^x (1-p)^{t-x} & x \in \{0, 1, \ldots, t\} \\ 0 & \text{otherwise} \end{cases}$$
Use the fact that if $X \sim \text{bin}(t, p)$ then
$$X = Y_1 + Y_2 + \cdots + Y_t, \qquad Y_i \sim \text{Bernoulli}(p)$$
IE 519

125

Geometric
Mass function:
$$p(x) = \begin{cases} p (1-p)^x & x \in \{0, 1, \ldots\} \\ 0 & \text{otherwise} \end{cases}$$
Use inverse-transform:
1. Generate $U \sim U(0,1)$
2. Return $X = \left\lfloor \dfrac{\ln(1-U)}{\ln(1-p)} \right\rfloor$

IE 519

126

Negative Binomial
Mass function:
$$p(x) = \begin{cases} \dbinom{s + x - 1}{x} p^s (1-p)^x & x \in \{0, 1, \ldots\} \\ 0 & \text{otherwise} \end{cases}$$
Note that $X \sim \text{negbin}(s, p)$ iff
$$X = Y_1 + Y_2 + \cdots + Y_s, \qquad Y_i \sim \text{Geometric}(p)$$
IE 519

127

Poisson
Mass function:
$$p(x) = \begin{cases} \dfrac{e^{-\lambda} \lambda^x}{x!} & x \in \{0, 1, \ldots\} \\ 0 & \text{otherwise} \end{cases}$$
Algorithm:
1. Let $a = e^{-\lambda}$, $b = 1$, $i = 0$
2. Generate $U_{i+1} \sim U(0,1)$ and replace b by $bU_{i+1}$. If $b < a$, return $X = i$. Otherwise go to step 3.
3. Let $i = i + 1$ and go back to step 2.
Rather slow; no very fast algorithm exists for the Poisson distribution
IE 519

128

Poisson Process
A stochastic process {N(t), t ≥ 0} that counts the number of events up until time t is a Poisson process if:
Events occur one at a time
N(t+s) - N(t) is independent of {N(u), u ≤ t}
A Poisson process is determined by its rate function
$$\lambda(t) = \frac{d}{dt} E[N(t)]$$

IE 519

129

Generating a Poisson Process
Stationary with rate λ > 0
Times between events $A_i = t_i - t_{i-1}$ are IID exponential
Algorithm:
1. Generate $U \sim U(0,1)$
2. Return $t_i = t_{i-1} - \frac{1}{\lambda} \ln U$
IE 519

130

Nonstationary Case
Can we simply generalize, replacing λ with λ(t)?
[Figure: rate function λ(t) between $t_{i-1}$ and $t_i$]
IE 519

131

Thinning Algorithm
1. Set $t = t_{i-1}$
2. Generate $U_1, U_2$ IID U(0,1)
3. Replace t by $t - \frac{1}{\lambda^*} \ln U_1$, where $\lambda^* = \max_t \lambda(t)$
4. If $U_2 \le \lambda(t)/\lambda^*$, return $t_i = t$. Otherwise, go back to step 2.
IE 519

132
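A sketch of the thinning algorithm generating all arrivals on a horizon; the sinusoidal rate function and the horizon are illustrative assumptions:

```python
import math, random

def thinning_arrivals(rate_fn, rate_max, horizon, rng=random):
    """Arrival times of a nonhomogeneous Poisson process on [0, horizon].
    Requires rate_max >= rate_fn(t) for all t in [0, horizon]."""
    t, arrivals = 0.0, []
    while True:
        t -= math.log(rng.random()) / rate_max     # candidate from rate-lambda* process
        if t > horizon:
            return arrivals
        if rng.random() <= rate_fn(t) / rate_max:  # keep w.p. lambda(t)/lambda*
            arrivals.append(t)

rate = lambda t: 5.0 + 4.0 * math.sin(2.0 * math.pi * t / 24.0)
print(len(thinning_arrivals(rate, rate_max=9.0, horizon=24.0)))
```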

Summary
For any stochastic simulation it is
necessary to generate random
variates from either a theoretical
distribution or an empirical
distribution
General methods we covered

Inverse-transform
Acceptance-rejection
Alias method
IE 519

133

Output Analysis

IE 519

134

Output Analysis
Analyzing the output of the simulation is
a part that is often done incorrectly (by
analysts and commercial software)
We consider several issues

Obtaining statistical estimates of


performance measures of interest
Improving precision of those estimates
through variance reduction
Comparing estimates from different models
Finding the optimal performance value
IE 519

135

Simulation Output
The output from a single simulation run
is a stochastic process Y1, Y2,
Observations (n replications of length
m):
y y y
11

1i

1m

y21 y2i y2 m

yn1 yni ynm


IE 519

136

Parameter Estimation
Want to estimate some parameter θ based on these observations
Unbiased? $E[\hat{\theta}] = \theta$?
Consistent? $\hat{\theta} \to \theta$ as the run length grows?

IE 519

137

Transient vs Steady State

IE 519

138

Initial Values: M/M/1


Queue

IE 519

139

Types of Simulation
Terminating simulation
Non-terminating simulation:
Steady-state parameters
Steady-state cycle parameters
Other parameters

IE 519

140

Terminating Simulation
Examples:

A retail establishment that is open for


fixed hours per day
A contract to produce x number of a
high cost product
Launching of a spacecraft

Never reaches steady-state


Initial conditions are included
IE 519

141

Non-Terminating
Simulation
Any system in continuous operation
(could have a break)
Interested in steady-state parameters
Initial conditions should be discarded
Sometimes no steady-state because the
system is cyclic
Then we are interested in steady-state
cycle parameters
IE 519

142

Terminating Simulation
Let $X_j$ be a random variable defined on the jth replication
Want to estimate the mean $\mu = E(X_j)$:
$$\bar{X}(n) \pm t_{n-1, 1-\alpha/2} \sqrt{\frac{S^2(n)}{n}}$$
Fixed-sample-size procedure
CI assumes the $X_j$'s are normally distributed
IE 519

143

Quality of Confidence Interval
The number of coverage failures depends on both the underlying distribution and the number of replications
[Figures: coverage for average delay over 25 customers and over 500 customers]

IE 519

144

Specifying the Precision
Absolute error: $|\bar{X} - \mu| \le \beta$
To obtain this, note that if the CI half-length is at most β, then
$$1 - \alpha \approx P\left( \bar{X} - \text{half-length} \le \mu \le \bar{X} + \text{half-length} \right) \le P\left( |\bar{X} - \mu| \le \beta \right)$$

IE 519

145

Replications Needed
To obtain an absolute error of β, the number of replications needed is approximately
$$n_a^*(\beta) = \min \left\{ i \ge n : t_{i-1, 1-\alpha/2} \sqrt{\frac{S^2(n)}{i}} \le \beta \right\}$$

IE 519

146

Relative Error
Also interested in the relative error: $|\bar{X} - \mu| / |\mu| \le \gamma$
Now, if the half-length satisfies half-length$/|\bar{X}| \le \gamma/(1+\gamma)$, we have
$$1 - \alpha \approx P\left( |\bar{X} - \mu| \le \frac{\gamma}{1+\gamma} |\bar{X}| \right) \le P\left( \frac{|\bar{X} - \mu|}{|\mu|} \le \gamma \right)$$

IE 519

147

Replications Needed
To obtain a relative error of γ, the number of replications needed is approximately
$$n_r^*(\gamma) = \min \left\{ i \ge n : \frac{t_{i-1, 1-\alpha/2} \sqrt{S^2(n)/i}}{|\bar{X}(n)|} \le \frac{\gamma}{1+\gamma} \right\}$$

IE 519

148

Sequential Procedure
Define
$$\gamma' = \frac{\gamma}{1+\gamma}, \qquad \delta(n, \alpha) = t_{n-1, 1-\alpha/2} \sqrt{\frac{S^2(n)}{n}}$$
Algorithm:
0. Make $n_0$ replications and set $n = n_0$
1. Compute $\bar{X}(n)$ and $\delta(n, \alpha)$ from $X_1, X_2, \ldots, X_n$
2. If $\delta(n, \alpha) / |\bar{X}(n)| \le \gamma'$, use $\bar{X}(n)$ as the estimate of μ and stop. Otherwise, let $n = n + 1$, make an additional replication, and go to step 1.
IE 519

149
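A sketch of the procedure with the replication stubbed out by a normal draw (run_once, n0, and the tolerances are illustrative names and values):

```python
import numpy as np
from scipy import stats

def sequential_mean_estimate(run_once, gamma=0.05, alpha=0.10, n0=10):
    """Add replications until the CI half-length is at most
    gamma' * |Xbar(n)|, with gamma' = gamma/(1+gamma)."""
    xs = [run_once() for _ in range(n0)]
    gamma_adj = gamma / (1.0 + gamma)
    while True:
        n = len(xs)
        xbar = np.mean(xs)
        half = stats.t.ppf(1 - alpha / 2, n - 1) * np.std(xs, ddof=1) / np.sqrt(n)
        if half <= gamma_adj * abs(xbar):
            return xbar, half, n
        xs.append(run_once())          # one more replication, then re-check

rng = np.random.default_rng(7)
print(sequential_mean_estimate(lambda: rng.normal(10.0, 2.0)))
```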

Other Measures
If we only use averages, the results
can often be misleading or wrong
What about the variance?
Alternative/additional measures:
Proportions
Probabilities
Quantiles
IE 519

150

Example
Suppose we are interested in customer delay X. We can estimate:
Average delay E[X]
Proportion of customers with X ≤ a
Probabilities, e.g., P[X ≤ a]
The q-quantile $x_q$
IE 519

151

Estimating Proportions
Define an indicator function
$$I_i = \begin{cases} 1 & \text{if } X_i \le a \\ 0 & \text{otherwise} \end{cases}$$
Obtain a point estimate of the proportion:
$$\hat{p} = \frac{1}{n} \sum_{i=1}^{n} I_i$$

IE 519

152

Estimating Probabilities
Want to estimate p = P(X ∈ B)
Have n replications $X_1, X_2, \ldots, X_n$
Define S = number of observations that fall in set B; then $S \sim \text{binomial}(n, p)$
An unbiased estimate is $\hat{p} = S/n$
IE 519

153

Estimating Quantiles
Let $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ be the order statistics corresponding to n simulation runs
A point estimator is then
$$\hat{x}_q = \begin{cases} X_{(nq)} & \text{if } nq \text{ is an integer} \\ X_{(\lceil nq \rceil)} & \text{otherwise} \end{cases}$$

IE 519

154

Initial Conditions
In terminating simulation there is no
steady-state
Hence, the initial conditions are included
in the performance measure estimates
How should they be selected?

Use an artificial warm-up period just to get


reasonable start-up state
Collect data and model the initial conditions
explicitly

IE 519

155

Discussion
For terminating simulation we must use
replications (cannot increase length of simulation
run)
Point estimates of performance measures:

Unbiased estimate and an approximate CI is easily


constructed for the mean performance
Also, obtained point estimates for proportions,
probabilities, and quantiles (mean not always enough)

It is important to be able to control the precision


determine how many replications are needed
Initial conditions are always included in the
estimates for terminating simulations must be
selected carefully

IE 519

156

Steady-State Behavior
Now we're interested in parameters related to the limit distribution:
$$F_i(y) = P(Y_i \le y) \;\longrightarrow\; F(y) = P(Y \le y) \quad \text{as } i \to \infty$$
Problem: we cannot wait until infinity!
IE 519

157

Estimating the Mean
Suppose we want to estimate the steady-state mean
$$\nu = \lim_{i \to \infty} E[Y_i]$$
Problem: $E[\bar{Y}(m)] \ne \nu$ for finite m
One solution is to add a warm-up period of length l and get a less biased estimator:
$$\bar{Y}(m, l) = \frac{1}{m - l} \sum_{i=l+1}^{m} Y_i$$
IE 519

158

Approaches for Estimating ν
There are numerous approaches for estimating the mean:
Replication/deletion (start with this)
One long replication:
Batch-means
Autoregressive method
Spectrum analysis
Regenerative method
Standardized time series method

IE 519

159

Choosing the Warm-Up Period
In the replication/deletion method the main issue is to choose the warm-up period
Would like $E[\bar{Y}(m, l)] \approx \nu$ for $m \gg l$
Tradeoff:
If l is too small then we still have a large bias
If l is too large then the estimate will have a large variance
Very difficult to determine from a single replication
IE 519

160

Welch's Procedure
Make n replications, each of length m, giving observations $Y_{ji}$ (replication j, observation i):
$$\begin{matrix} Y_{11} & Y_{12} & \cdots & Y_{1,m} \\ Y_{21} & Y_{22} & \cdots & Y_{2,m} \\ \vdots & & & \vdots \\ Y_{n1} & Y_{n2} & \cdots & Y_{n,m} \end{matrix}$$
Average across replications:
$$\bar{Y}_i = \frac{1}{n} \sum_{j=1}^{n} Y_{ji}, \qquad i = 1, 2, \ldots, m$$
These averages $\bar{Y}_1, \bar{Y}_2, \ldots, \bar{Y}_m$ are then smoothed (next slide)
IE 519

161

Welch's Procedure
Key is to smooth out high-frequency oscillations in the averages:
$$\bar{Y}_i(w) = \begin{cases} \dfrac{1}{2w+1} \displaystyle\sum_{s=-w}^{w} \bar{Y}_{i+s} & i = w+1, \ldots, m-w \\ \dfrac{1}{2i-1} \displaystyle\sum_{s=-(i-1)}^{i-1} \bar{Y}_{i+s} & i = 1, 2, \ldots, w \end{cases}$$
Then plot the moving average


IE 519

162
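A sketch of the two-case smoother, operating on the across-replication averages:

```python
import numpy as np

def welch_moving_average(ybar, w):
    """Welch's smoothed averages Ybar_i(w) for i = 1, ..., m - w.
    ybar[i-1] is the across-replication average of observation i."""
    m = len(ybar)
    out = np.empty(m - w)
    for i in range(1, m - w + 1):                      # 1-based i as on the slide
        if i <= w:
            out[i - 1] = ybar[0:2 * i - 1].mean()      # window of 2i - 1 points
        else:
            out[i - 1] = ybar[i - 1 - w:i + w].mean()  # window of 2w + 1 points
    return out
```

Plotting the returned series for a few window sizes w and picking the point where it flattens out is the usual way to choose the warm-up length l.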

Example: Hourly Throughput
[Figure: hourly throughput over time]
When is it warmed up?


IE 519

163

Welchs Procedure
Much smoother and
easier to tell where it
has converged
Want to err on the side
of selecting it too large

IE 519

164

Replication/Deletion
Similar to terminating simulation:
$$\bar{X}(n') \pm t_{n'-1, 1-\alpha/2} \sqrt{\frac{S^2(n')}{n'}}, \qquad X_j = \frac{1}{m' - l} \sum_{i=l+1}^{m'} Y_{ji}$$
Need n pilot runs to determine the warm-up period l, and then throw away the first l observations from the new n' runs

IE 519

165

Discussion
Replication/deletion approach

Easiest to understand and implement


Has good statistical performance if done
correctly
Applies to all output parameters and can be
used to estimate several different parameters
for the same model
Can be used to compare different systems

Nonetheless, some other methods have


clear advantages
IE 519

166

Covariance Stationary Process
Classic statistical inference assumes independent and identically distributed (IID) observations
Even after eliminating the initial transient this is not true for most simulations because most simulation output is autocorrelated
However, it is reasonable to assume that after the initial transient the output will be covariance stationary, that is,
$$C_k = \mathrm{cov}(Y_i, Y_{i+k})$$
is independent of i

IE 519

167

Notation
Simulation output: $Y_1, Y_2, \ldots, Y_n$ (assumed covariance stationary)
Mean: $\nu = E[Y_j]$
Variance: $\sigma^2 = \mathrm{Var}(Y_j) = C_0$
Covariance: $C_k = \mathrm{cov}(Y_i, Y_{i+k})$
Correlation: $\rho_k = C_k / C_0$

IE 519
168

Implications of Autocorrelation
If the process is covariance stationary the average is still an unbiased estimator, that is,
$$\bar{Y}(n) = \frac{1}{n} \sum_{j=1}^{n} Y_j \quad \text{and} \quad E[\bar{Y}(n)] = \nu$$
However, the same cannot be said about the standard estimate of the variance:
$$\frac{S^2(n)}{n} = \frac{1}{n(n-1)} \sum_{j=1}^{n} \left( Y_j - \bar{Y}(n) \right)^2$$
In fact,
$$E\left[ S^2(n) \right] = \sigma^2 \left[ 1 - \frac{2 \sum_{k=1}^{n-1} (1 - k/n)\, \rho_k}{n-1} \right]$$
IE 519

169

Expression for Variance
Assuming a covariance stationary process it can be shown that:
$$\mathrm{Var}\left( \bar{Y}(n) \right) = \frac{\sigma^2}{n} \left[ 1 + 2 \sum_{k=1}^{n-1} \left( 1 - \frac{k}{n} \right) \rho_k \right]$$
We hope the estimate $S^2(n)/n$ of the variance is unbiased, that is,
$$E\left[ \frac{S^2(n)}{n} \right] = \mathrm{Var}\left( \bar{Y}(n) \right)$$
By combining the top equation above with the last equation on the previous slide, we can check this for an independent and an autocorrelated output process
IE 519

170

Independent Process
If the output process is independent then
$$\rho_k = 0, \quad k \ge 1$$
and so
$$E\left[ \frac{S^2(n)}{n} \right] = \frac{\sigma^2}{n} = \mathrm{Var}\left( \bar{Y}(n) \right)$$
The usual variance estimate is unbiased in this case
IE 519

171

Autocorrelation in Process
If the process is positively correlated (the usual case), $\rho_k > 0$ for $k \ge 1$, then
$$E\left[ \frac{S^2(n)}{n} \right] < \mathrm{Var}\left( \bar{Y}(n) \right)$$
Hence, the estimator has less precision than predicted and the CI is misleading
IE 519

172

Batch-Means Estimators
Batch-means estimators are the most popular
alternative to replication/deletion
The idea here is to do one very long simulation
run and estimate the parameters from this run
Advantage is that the simulation only has to go
through the initial transient once
Assuming covariance-stationary output

No problem estimating the mean


Estimating the variance is difficult because the data is
likely to be autocorrelated, that is, Yi and Yi+1 are
correlated

IE 519

173

Classical Approach
Partition the run of n observations into k equal-size contiguous batches (macro replications), each composed of m = n/k observations (micro replications)
Point estimator:
$$\hat{\theta} = \frac{1}{k} \sum_{j=1}^{k} \hat{\theta}_j$$

IE 519

174

CI Analysis
Assuming as before that $Y_1, Y_2, \ldots$ is covariance stationary with $E[Y_i] = \nu = E[\hat{\theta}_j]$
If the batch size is large enough, then the
estimates will be approximately uncorrelated
Suppose we can also choose k large enough so
that they are approximately normal
It follows that the batch estimates have the
same mean and variance
Hence we can treat them as approximately IID
normal and get the usual confidence interval
IE 519

175
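A sketch of the classical nonoverlapping batch-means CI from one long (post-warm-up) run; k = 30 follows the rule of thumb given later:

```python
import numpy as np
from scipy import stats

def batch_means_ci(y, k=30, alpha=0.10):
    """Split a single run into k contiguous batches and form a t CI
    from the (approximately IID normal) batch averages."""
    m = len(y) // k                                   # batch size
    batch_avgs = np.asarray(y)[: m * k].reshape(k, m).mean(axis=1)
    xbar = batch_avgs.mean()
    half = stats.t.ppf(1 - alpha / 2, k - 1) * batch_avgs.std(ddof=1) / np.sqrt(k)
    return xbar, half
```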

Variants of Batch-Means
Non-overlapping batches: $Y_1, \ldots, Y_m \mid Y_{m+1}, \ldots, Y_{2m} \mid \ldots$
Spaced batches: leave a gap of l observations between consecutive batches
Overlapping batches: batch j is $Y_j, Y_{j+1}, \ldots, Y_{j+m-1}$

IE 519

176

Steady-State Batching
General variance estimator:
$$S_B^2 = \frac{\sum_{j \in B} \left( \hat{\theta}_j - \hat{\theta} \right)^2}{|B| - 1}$$
Non-overlapping batches: $B = \{1,\; m+1,\; 2m+1,\; \ldots,\; (k-1)m + 1\}$
Overlapping batches: $B = \{1, 2, \ldots, n - m + 1\}$

IE 519

177

Determining the Batch


Size
Tradeoff

Large batch sizes have the needed


asymptotic properties
Small batch sizes yield more batches
That is, choice between bias due to poor
asymptotics and variance due to few batches

Rule of thumb (empirical):

Little benefit to more than 30 batches


Should not have fewer than 10 batches

IE 519

178

Mean Squared Error
The mean squared error (MSE) of an estimator is
$$\mathrm{MSE}(\hat{\theta}, \theta) = E\left[ (\hat{\theta} - \theta)^2 \right] = \mathrm{Bias}^2(\hat{\theta}, \theta) + \mathrm{Var}(\hat{\theta})$$
This is the classic measure of quality
Can use it to select the optimal batch size
IE 519

179

Optimal Batch Size
The asymptotically MSE-optimal batch size is
$$m^* = \left( \frac{2n\, c_b\, \gamma_1}{c_v\, \gamma_0} \right)^{1/3}$$
where $c_b$ is the bias constant, $c_v$ is the variance constant, and $\gamma_1/\gamma_0$ is the "center of gravity" of the autocovariance function; the key point is that $m^*$ grows as $n^{1/3}$
IE 519

180

Regenerative Method
Similar to batch-means, the regenerative
method also tries to construct
independent replications from a single run
Assume that Y1, Y2, has a sequence of
random points 1 B1 < B2 < called
regeneration points, and the process from
Bj is independent of the process prior to Bj
The process between two successive
regeneration points is called a
regeneration cycle
IE 519

181

Estimating the Mean
$$\nu = \frac{E[Z]}{E[N]}, \qquad Z_j = \sum_{i=B_j}^{B_{j+1} - 1} Y_i, \qquad N_j = B_{j+1} - B_j$$
$$\hat{\nu}(n') = \frac{\bar{Z}(n')}{\bar{N}(n')}, \qquad n' = \text{number of regeneration cycles}$$
IE 519

182

Analysis
The estimator is not unbiased; however, it is strongly consistent:
$$\hat{\nu}(n') \to \nu \;\text{(w.p. 1)} \quad \text{as } n' \to \infty$$
Let $\Sigma$ be the covariance matrix of $U_j = (Z_j, N_j)^T$ and let $V_j = Z_j - \nu N_j$
These are IID with mean 0 and variance
$$\sigma_V^2 = \sigma_{11} - 2\nu \sigma_{12} + \nu^2 \sigma_{22}$$
IE 519

183

Analysis
From the CLT:
$$\frac{\bar{V}(n')}{\sqrt{\sigma_V^2 / n'}} \Rightarrow N(0,1)$$
Have estimates
$$\hat{\Sigma}(n') = \begin{pmatrix} \hat{\sigma}_{11}(n') & \hat{\sigma}_{12}(n') \\ \hat{\sigma}_{12}(n') & \hat{\sigma}_{22}(n') \end{pmatrix}$$
$$\hat{\sigma}_V^2(n') = \hat{\sigma}_{11}(n') - 2 \hat{\nu}(n')\, \hat{\sigma}_{12}(n') + \hat{\nu}^2(n')\, \hat{\sigma}_{22}(n')$$

IE 519

184

Analysis
It can be shown that $\hat{\sigma}_V^2(n') \to \sigma_V^2$ (w.p. 1) as $n' \to \infty$
Hence
$$\frac{\hat{\nu}(n') - \nu}{\sqrt{\hat{\sigma}_V^2(n') / n'} \,/\, \bar{N}(n')} \Rightarrow N(0,1)$$
We get a CI:
$$\hat{\nu}(n') \pm \frac{z_{1-\alpha/2} \sqrt{\hat{\sigma}_V^2(n') / n'}}{\bar{N}(n')}$$
IE 519

185

Non-Independence
Non-overlapping batch-means and
regeneration methods try to create
independence between batches/cycles
An alternative is to use estimates of the
autocorrelation structure to estimate the
variance of the sample mean
(Again, estimating the mean is no
problem, just the variance)
Spectrum analysis and autoregressive
methods attempt to do this
IE 519

186

Spectral Variance Estimator
Assume the process is covariance stationary:
$$E[Y_j] = \nu, \qquad E\left[ (Y_j - \nu)(Y_{j+l} - \nu) \right] = \gamma(l)$$
The variance of the sample mean can be expressed through the autocovariances $\gamma(l)$
The spectral density function of the process:
$$f(\lambda) = \frac{1}{2\pi} \sum_{l=-\infty}^{\infty} \gamma(l) \cos(\lambda l)$$
Since $2\pi f(0) = \sum_l \gamma(l)$, an estimate of the spectral density function at frequency 0 is an estimate of the variance

IE 519

187

Spectral Variance Estimator
Using standard results:
$$\hat{\sigma}^2(n) = \sum_{l=-(m-1)}^{m-1} w_n(l)\, \hat{\gamma}(l), \qquad \hat{\gamma}(l) = \frac{1}{n} \sum_{r=1}^{n-l} \left( Y_r - \bar{Y}(n) \right)\left( Y_{r+l} - \bar{Y}(n) \right)$$
where m plays the role of a batch size and the weights satisfy $w_n(0) = 1$, $|w_n(l)| \le 1$

IE 519

188

Parameters
For the batch size m: $m \to \infty$ and $m/n \to 0$ as $n \to \infty$
Examples of weight functions:
$$w_n(l) = \begin{cases} 1 & |l| \le m-1 \\ 0 & \text{otherwise} \end{cases} \qquad \text{and} \qquad w_n(l) = \begin{cases} 1 - |l|/m & |l| \le m-1 \\ 0 & \text{otherwise} \end{cases}$$
IE 519

189

Autoregressive Method
Again assume a covariance-stationary output process, and also a pth-order autoregressive model:
$$\sum_{j=0}^{p} b_j \left( Y_{i-j} - \nu \right) = \epsilon_i, \qquad b_0 = 1$$
where $\{\epsilon_i\}$ are uncorrelated random variables with mean 0 and variance $\sigma_\epsilon^2$
IE 519

190

Convergence Result
It can be shown that
$$\lim_{m \to \infty} m\, \mathrm{Var}\left( \bar{Y}(m) \right) = \frac{\sigma_\epsilon^2}{\left( \sum_{j=0}^{p} b_j \right)^2}$$
Can estimate these quantities and get
$$\widehat{\mathrm{Var}}\left( \bar{Y}(m) \right) = \frac{\hat{\sigma}_\epsilon^2}{m \left( \sum_{j=0}^{p} \hat{b}_j \right)^2}$$
A CI can be constructed using the t-distribution

IE 519

191

What is the Coverage?
[Table: empirical coverage of nominal 90% CIs for two simulation models]

IE 519

192

Discussion
Replication/deletion is certainly the most
popular in practice (easy to understand)
Batch-means is very effective. There are
practical algorithms and still a lot of research
Spectral methods are still a subject of active
research but probably not used much in
practice (very complicated)
Autoregressive methods appear not to be used/investigated much
Regeneration methods are theoretically
impeccable but practically useless!
IE 519

193

Comments on Variance
Estimates
We have spent considerable time looking at
alternative estimates of the variance
Why does it matter?
Simulation output is usually (always) autocorrelated, which makes it difficult to estimate
variance, and hence the CI may be incorrect
Most seriously, the precision of the estimate
may be less than predicted and hence
inference drawn from the model may not be
valid
IE 519

194

Implications of
Autocorrelation
Because simulation output is usually
autocorrelated we cannot simply use all of
the observations to estimate the mean
We need some way of obtaining no
correlation

Replication/deletion gets this through


independent replications
Batch-means gets this (almost) through non-overlapping batches
Regenerative method get this through
independent regenerative cycles

IE 519

195

Sequential Procedures
None of the single-run methods we have discussed can assure any given precision (which we need to make a decision)
Several sequential procedures exist that
allow us to do this

More complicated than for


replication/deletion
May require very long simulation runs

IE 519

196

Good Sequential
Procedures
Batch-means and relative error stopping
rule

Law and Carson procedure (1979)


Automated Simulation Analysis Procedure
(ASAP) and extension ASAP3 (2002, 2005)

Spectral method and relative stopping rule

WASSP (2005)

All of these methods obtain much better


coverage
However, they are rarely if ever used!
IE 519

197

Estimating Probabilities
Know how to estimate means
How about probabilities p = P[Y ∈ B]?
Note that for
$$Z = \begin{cases} 1 & \text{if } Y \in B \\ 0 & \text{otherwise} \end{cases}$$
$$P(Y \in B) = P(Z = 1) = 1 \cdot P(Z = 1) + 0 \cdot P(Z = 0) = E[Z]$$
We therefore already know how to do this!


IE 519

198

Estimating Quantiles
Suppose we want to estimate the qquantile yq, that is, P[Y yq]=q
More complicated
Most estimates based on order
statistics

Biased estimates
Computationally expensive
Coverage low if sample size is too low
IE 519

199

Cyclic Parameters
No steady-state distribution, but with some cycle definition:
$$F_i^C(y) = P\left( Y_i^C \le y \right) \;\longrightarrow\; F^C(y) = P\left( Y^C \le y \right)$$
All of the techniques we discussed before for steady-state parameters still apply to this new process
IE 519

200

Multiple Measures
In practice we are usually interested in multiple measures simultaneously, so we have several CIs:
$$P(\mu_1 \in I_1) = 1 - \alpha_1, \;\; \ldots, \;\; P(\mu_k \in I_k) = 1 - \alpha_k$$
How does this affect our overall confidence?
$$P(\mu_s \in I_s, \; s = 1, \ldots, k) = \,?$$

IE 519

201

Bonferroni Inequality
No problem if independent:
$$P(\mu_s \in I_s, \; s = 1, \ldots, k) = \prod_{s=1}^{k} P(\mu_s \in I_s)$$
In practice performance measures are very unlikely to be independent
If they are not independent, we can use the Bonferroni inequality:
$$P(\mu_s \in I_s, \; s = 1, \ldots, k) \ge 1 - \sum_{s=1}^{k} \alpha_s$$
IE 519

202

Computational
Implications
Say we have 5 performance measures and
we want a 90% CI
Two alternatives:

We can get five 98% CI for each of the


performance measures, which gives us a 90%
overall CI. This is computationally expensive
We can get five 90% CI and live with the fact
that one or more of them is likely to not cover
the true value of the parameter

We will revisit this topic when we talk


about multiple comparison procedures
IE 519

203

Output Analysis: Discussion
Terminating simulation:
Replications defined by a terminating event
Can determine precision
Initial conditions are included
Multiple runs
Non-terminating simulation:
Replication/deletion: issue with bias, elimination of the initial transient
Single long run: batch-means, regenerative method, etc.
Autocorrelation is the problem when estimating the variance

IE 519

204

Resampling Methods

IE 519

205

Sources of Variance
We have learned how to estimate variance and construct CIs, predict the number of simulation runs needed, etc.
Where does the variance come from?
Random number generator (RNG)
Generation of random variates
Computer only approximates real values (made worse by long runs!)
Initial transient/stopping rules and inherently biased estimators (made better by long runs!)
Modelling error?

IE 519

206

Input Modelling
We have discussed input modeling and
output analysis separately
Recall main approaches for input
modeling:

Fit a parametric distribution


Fit an empirical distribution
Use a trace
Use beta distribution

In practice fitting a parametric distribution


is the most common approach
IE 519

207

Numerical Example
The underlying system is an M/M/1/10 queue
The simulation model is 1 station, capacity of
10, and empirical distribution for interarrival
and service times from 100 observations
Want to estimate the expected time in system
E[W]
Typical simulation experiment:

10 replications
Very long run of 5000 customers
Very long warm-up period of 1000 customers
CI constructed using t-distribution

We would expect a very good estimate for the


performance of the model
IE 519
208

Effect of Estimating Distribution Parameters
[Table: coverage for "true model", "no resampling", and "direct resampling"]
True model assumes that the true distributions for interarrival and service times are known
No resampling is the traditional approach: one empirical distribution, then a sample mean based on 10 replications
Direct resampling obtains a new sample of 100 data points for each of the 10 replications
IE 519

Why Poor Coverage?


The uncertainty due to replacing the
true distribution with an estimate is
neglected

This is the case for all commercial


simulation software

Remedies

Direct resampling
Bootstrap resampling
Uniformly randomized resampling
IE 519

210

Direct Resampling
For each replication (simulation
run) use a new sample to create an
empirical distribution function
Requires a lot more data
Alternatively what data is available
can be split among the replications
Can confidence intervals be
constructed?
IE 519

211

Bootstrap Resampling
Use the bootstrap to create a new
sample for a new empirical
distribution function for each
replication
Bootstrap: sampling with
replacement
No need for additional data and
may even be able to use less data
IE 519

212

Bootstrap Resampling Algorithm
For each input quantity q modeled, sample n values $v_{q(1)}^i, v_{q(2)}^i, \ldots, v_{q(n)}^i$ from the observed data with replacement
Construct an empirical distribution for each q based on these samples
Do a simulation run based on these input distributions (ith output)
Repeat
IE 519

213
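A sketch of the resampling step (the observed data here is synthetic, and building the empirical CDF plus running the simulation are left as stubs):

```python
import numpy as np

def bootstrap_input_samples(data, n_reps, rng=None):
    """One bootstrap resample (with replacement) of the observed input
    data per replication; each defines that run's empirical distribution."""
    rng = rng or np.random.default_rng()
    return [rng.choice(data, size=len(data), replace=True) for _ in range(n_reps)]

observed = np.random.default_rng(0).exponential(1.0, size=100)  # hypothetical data
for sample in bootstrap_input_samples(observed, n_reps=10):
    pass  # build the empirical CDF from `sample`, then run one replication
```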

Uniformly Randomized
Note that if F is the CDF of X, then F(X) is uniform on [0,1]:
$$X_{[1]} \le X_{[2]} \le \cdots \le X_{[n]} \;\Rightarrow\; F(X_{[1]}) \le F(X_{[2]}) \le \cdots \le F(X_{[n]})$$
$$F(X_{[k]}) \sim \text{beta}(k,\; n - k + 1)$$

IE 519

214

Uniform Randomized Bootstrap
For each input quantity q modeled, order the observed data $x_{q(1)}, x_{q(2)}, \ldots, x_{q(n)}$
Generate a sample of n ordered values $u_{q(1)}^i, u_{q(2)}^i, \ldots, u_{q(n)}^i$ from a uniform distribution
Set $\tilde{F}_q(x_{q(j)}) = u_{q(j)}^i$ and construct an empirical distribution for each q based on these samples
Do a simulation run based on these input distributions (ith output)
Repeat
IE 519

215

Numerical Results
[Table: 90% CI coverage for an M/M/1/10 queue with varying traffic intensity {0.7, ...}]
Observations of interarrival and service times: {50, 100, 500}

IE 519

216

Numerical Results
[Table: 90% CI coverage for an M/U/1/10 queue with varying traffic intensity {0.7, ...}]
Observations of interarrival and service times: {50, 100, 500}

IE 519

217

Discussion
Uncertainty in the input modeling can affect the precision of the output
For a given application you can
estimate this effect by selecting 3-5
random subsets of the data, and
performing the analysis on each
Bootstrap resampling can help fix
the problem
IE 519

218

Discussion
Bootstrap resampling is much more general, and
provides an answer to the question:

Given a random sample & a statistic T calculated on


this sample, what is the distribution of T ?

Assumptions:

The empirical distribution converges to the true


distribution as the number of samples increases
T is sufficiently smooth
Problems with extreme point estimates

Other simulation applications:

Model validation
Ranking-and-selection, etc.

IE 519

219

Comparing Multiple
Systems

IE 519

220

Multiple Systems
We know something about how to
evaluate the output of a single
system
Simulation is rarely used to simply
evaluate one system
Comparison:

Two alternative systems can be built


Proposed versus existing system
What-if analysis for current system
IE 519

221

Types of Comparisons
Comparison of two systems
Comparison of multiple systems

Comparison with a standard


All pair-wise comparison
Multiple comparison with the best (MCB)

Ranking-and-selection

Selecting the best system of k systems


Selecting a subset of m systems containing the
best
Selecting the m best of k systems

Combinatorial optimization
IE 519

222

Overview of Various Approaches


Comparison of Systems

Construct (simultaneous) confidence intervals

Ranking-and-selection

Indifference zone: the system that is selected has performance within an indifference zone of the best performance, with a fixed probability
This is the most common method

This is the most common method


Bayesian approach
Optimal simulation budget allocation

Optimization

Design of experiments/Response surfaces


Search procedures

IE 519

223

Example: One or Two


Servers?

IE 519

224

Comparing Two Systems
Have IID observations from two output processes and want to construct a CI for the expected difference:
$$X_{11}, X_{12}, \ldots, X_{1n_1}, \quad \mu_1 = E[X_{1i}]$$
$$X_{21}, X_{22}, \ldots, X_{2n_2}, \quad \mu_2 = E[X_{2i}]$$
$$\zeta = \mu_1 - \mu_2$$
IE 519

225

A Paired-t CI
If $n_1 = n_2 = n$ we can construct a paired-t CI:
$$Z_i = X_{1i} - X_{2i}, \qquad E[Z_i] = \zeta$$
$$\bar{Z}(n) = \frac{1}{n} \sum_{i=1}^{n} Z_i, \qquad \widehat{\mathrm{Var}}\left( \bar{Z}(n) \right) = \frac{1}{n(n-1)} \sum_{i=1}^{n} \left( Z_i - \bar{Z}(n) \right)^2$$
$$\bar{Z}(n) \pm t_{n-1, 1-\alpha/2} \sqrt{\widehat{\mathrm{Var}}\left( \bar{Z}(n) \right)}$$
IE 519
226
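A sketch of the paired-t interval (the delay observations are hypothetical; pairing works best when the two systems share common random numbers):

```python
import numpy as np
from scipy import stats

def paired_t_ci(x1, x2, alpha=0.10):
    """CI for zeta = mu1 - mu2 from n paired replications."""
    z = np.asarray(x1) - np.asarray(x2)
    n = len(z)
    half = stats.t.ppf(1 - alpha / 2, n - 1) * z.std(ddof=1) / np.sqrt(n)
    return z.mean() - half, z.mean() + half

lo, hi = paired_t_ci([4.2, 3.9, 4.5, 4.1, 4.3, 4.0, 4.4, 4.2, 3.8, 4.6],
                     [3.7, 3.6, 4.0, 3.9, 3.8, 3.5, 4.1, 3.8, 3.4, 4.2])
print(lo, hi)   # an interval excluding 0 suggests a real difference
```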

Welch CI
Now we do not require equal sample sizes, but assume that the two processes are independent:
$$\bar{X}_i(n_i) = \frac{1}{n_i} \sum_{j=1}^{n_i} X_{ij}, \qquad S_i^2(n_i) = \frac{1}{n_i - 1} \sum_{j=1}^{n_i} \left( X_{ij} - \bar{X}_i(n_i) \right)^2$$
$$\bar{X}_1(n_1) - \bar{X}_2(n_2) \pm t_{\hat{f}, 1-\alpha/2} \sqrt{\frac{S_1^2(n_1)}{n_1} + \frac{S_2^2(n_2)}{n_2}}$$
with estimated degrees of freedom
$$\hat{f} = \frac{\left[ S_1^2(n_1)/n_1 + S_2^2(n_2)/n_2 \right]^2}{\left[ S_1^2(n_1)/n_1 \right]^2 / (n_1 - 1) + \left[ S_2^2(n_2)/n_2 \right]^2 / (n_2 - 1)}$$
IE 519

227

Obtaining IID Observations


Need the observations from each
system to be IID
Terminating simulation

Each run is IID, so no problem

Non-terminating simulation

Replication/deletion approach
Non-overlapping batch-means
IE 519

228

Comparing Multiple
Systems

Comparison with a standard


All pair-wise comparison
Multiple comparison with the best
(MCB)

IE 519

229

Comparison with a Standard
Now assume that one of the systems is the standard, e.g., an existing system
Construct CIs with overall confidence level 1-α for $\mu_2 - \mu_1, \mu_3 - \mu_1, \ldots, \mu_k - \mu_1$
Using the Bonferroni inequality: construct k-1 confidence intervals, each at level $1 - \alpha/(k-1)$
The individual CIs can be constructed using any method, as Bonferroni will always hold

IE 519

230

All Pair-Wise Comparison
Now we want to construct CIs to compare every system with every other system
Quite difficult because we need each individual CI at level $1 - \alpha / \binom{k}{2}$ to guarantee an overall level of 1-α
Only feasible for a relatively small number of systems k
IE 519

231

Multiple Comparison with the Best (MCB)
We are really interested in whatever is the best system, and hence construct CIs to see if it is significantly better than each of the others; for each i we bound
$$\mu_i - \max_{l \ne i} \mu_l \in \left[ \left( \bar{X}_i(n) - \max_{l \ne i} \bar{X}_l(n) - h \sqrt{\frac{2\hat{\sigma}^2}{n}} \right)^{\!-}, \;\; \left( \bar{X}_i(n) - \max_{l \ne i} \bar{X}_l(n) + h \sqrt{\frac{2\hat{\sigma}^2}{n}} \right)^{\!+} \right]$$
Here h is a critical parameter, $x^+ = \max\{0, x\}$, and $x^- = \min\{0, x\}$
MCB procedures are related to ranking-and-selection
IE 519

232

Ranking-and-Selection
Have k systems and IID observations from each system:
$$\mu_i = E[X_{ij}], \qquad \mu_{i_1} \le \mu_{i_2} \le \cdots \le \mu_{i_k}$$
Want to select the best system, that is, the system with the largest mean. We call this the correct selection (CS).
Can we guarantee CS?
IE 519

233

Indifference Zone Approach
We say that the selected system i* is the correct selection (CS) if
$$\mu_{i^*} \ge \mu_{i_k} - \delta$$
Here δ is called the indifference zone
Our goal is
$$P(\text{CS}) \ge P^* \quad \text{whenever} \quad \mu_{i_k} - \mu_{i_{k-1}} \ge \delta$$
Here P* is a user-selected probability (Bechhofer's approach)
IE 519

234

Two-Stage Approach: Stage I
Obtain $n_0$ samples and calculate
$$\bar{X}_i^{(1)}(n_0) = \frac{1}{n_0} \sum_{j=1}^{n_0} X_{ij}, \qquad S_i^2(n_0) = \frac{1}{n_0 - 1} \sum_{j=1}^{n_0} \left( X_{ij} - \bar{X}_i^{(1)}(n_0) \right)^2$$
Calculate the total samples needed:
$$N_i = \max \left\{ n_0 + 1, \; \left\lceil \frac{h_1^2 S_i^2(n_0)}{\delta^2} \right\rceil \right\}$$

IE 519

235

Two-Stage Approach: Stage II
Obtain $N_i - n_0$ more observations, and calculate the second-stage and overall (weighted) means:
$$\bar{X}_i^{(2)}(N_i - n_0) = \frac{1}{N_i - n_0} \sum_{j=n_0+1}^{N_i} X_{ij}$$
$$\tilde{X}_i(N_i) = w_{i1} \bar{X}_i^{(1)}(n_0) + w_{i2} \bar{X}_i^{(2)}(N_i - n_0)$$
$$w_{i1} = \frac{n_0}{N_i} \left[ 1 + \sqrt{1 - \frac{N_i}{n_0} \left( 1 - \frac{(N_i - n_0)\, \delta^2}{h_1^2 S_i^2(n_0)} \right)} \right], \qquad w_{i2} = 1 - w_{i1}$$
IE 519

236

Comments: Assumptions
As usual: normal assumption
Do not need equal or known variances
(many statistical selection procedures
do)
Two-stage approach requires an estimate
of the variance (remember controlling
the precision)
The above approach assumes the least
favorable configuration
IE 519

237

Subset Selection
In most applications, many of the systems are clearly inferior and can be eliminated quite easily
Subset selection: find a subset $I \subseteq \{1, 2, \ldots, k\}$ such that
$$P(i_k \in I) \ge 1 - \alpha = P^*$$
Gupta's approach:
$$I = \left\{ l : \bar{X}_l(n) \ge \max_{i \ne l} \bar{X}_i(n) - h \sqrt{\frac{2\sigma^2}{n}} \right\}$$

IE 519

238

Proof
$$P(i_k \in I) = P\left( \bar{X}_{i_k}(n) \ge \bar{X}_i(n) - h\sqrt{2\sigma^2/n}, \;\; \forall i \ne i_k \right)$$
$$= P\left( \frac{\bar{X}_i(n) - \bar{X}_{i_k}(n) - (\mu_i - \mu_{i_k})}{\sqrt{2\sigma^2/n}} \le h + \frac{\mu_{i_k} - \mu_i}{\sqrt{2\sigma^2/n}}, \;\; \forall i \ne i_k \right)$$
$$\ge P\left( Z_i \le h, \;\; i = 1, 2, \ldots, k-1 \right)$$
where the $Z_i$ are approximately standard normal; select h so that the last probability equals 1-α
IE 519

239

Two-Stage Bonferroni: Stage I
Specify $P^* = 1 - \alpha$ and let $t = t_{1 - \alpha/(k-1),\, n_0 - 1}$
Make $n_0$ replications and calculate the sample variance of each difference:
$$S_{ij}^2 = \frac{1}{n_0 - 1} \sum_{l=1}^{n_0} \left( X_{il} - X_{jl} - \left( \bar{X}_i - \bar{X}_j \right) \right)^2$$
Calculate the second-stage sample size:
$$N = \max \left\{ n_0, \; \max_{j \ne i} \left\lceil \frac{t^2 S_{ij}^2}{\delta^2} \right\rceil \right\}$$
IE 519

240

Two-Stage Bonferroni: Stage II
Obtain the additional samples, calculate the overall sample means, and select the best system, with the following CI:
$$\mu_i - \max_{j \ne i} \mu_j \in \left[ \min \left\{ 0, \; \bar{X}_i - \max_{j \ne i} \bar{X}_j - \delta \right\}, \;\; \max \left\{ 0, \; \bar{X}_i - \max_{j \ne i} \bar{X}_j + \delta \right\} \right]$$
IE 519

241

Combined Procedure
Initialization: calculate
$$\bar{X}_i^{(1)}(n_0) \quad \text{and} \quad S_i^2 = \frac{1}{n_0 - 1} \sum_{j=1}^{n_0} \left( X_{ij} - \bar{X}_i^{(1)}(n_0) \right)^2$$
Subset selection: calculate $W_{il} = t \left( \left( S_i^2 + S_l^2 \right)/n_0 \right)^{1/2}$ and
$$I = \left\{ i : \bar{X}_i(n_0) \ge \bar{X}_l(n_0) - W_{il}, \;\; \forall l \ne i \right\}$$
If |I| = 1, stop. Otherwise, calculate the second-stage sample sizes
$$N_i = \max \left\{ n_0, \; \left\lceil h^2 S_i^2 / \delta^2 \right\rceil \right\}$$
Obtain $N_i - n_0$ more samples from each system $i \in I$
Compute the overall sample means and select the best system

IE 519

242

Sequential Procedure
Compute
$$\eta = \frac{1}{2} \left[ \left( \frac{2\alpha}{k-1} \right)^{-2/(n_0 - 1)} - 1 \right], \qquad h^2 = 2\eta (n_0 - 1)$$
$$S_{il}^2 = \frac{1}{n_0 - 1} \sum_{j=1}^{n_0} \left( X_{ij} - X_{lj} - \left( \bar{X}_i(n_0) - \bar{X}_l(n_0) \right) \right)^2$$
Screen: set
$$I_{\text{new}} = \left\{ i \in I_{\text{old}} : \bar{X}_i(r) \ge \bar{X}_l(r) - W_{il}(r), \;\; \forall l \ne i \right\}, \qquad W_{il}(r) = \max \left\{ 0, \; \frac{\delta}{2r} \left( \frac{h^2 S_{il}^2}{\delta^2} - r \right) \right\}$$
If |I| = 1, stop. Otherwise, take one more observation from each surviving system and go back to screening.
IE 519

243

Where Does the h Come From?
Solved numerically from (Rinott):
$$P^* = \int_0^{\infty} \int_0^{\infty} \left[ \Phi\left( \frac{h}{\sqrt{(n_0 - 1)\left( \frac{1}{x} + \frac{1}{y} \right)}} \right) \right]^{k-1} f_{n_0 - 1}(x)\, dx \; f_{n_0 - 1}(y)\, dy$$
where Φ is the standard normal CDF and $f_{n_0-1}$ is the $\chi^2_{n_0-1}$ density
More commonly, you look it up in a table (some are in the book)
IE 519

244

Large Number of Alternatives
Two-stage ranking-and-selection procedures are usually only efficient for up to about 20 alternatives
They always protect against the least-favorable configuration (LFC): $\mu_i = \mu_{i_k} - \delta$, $i = 1, 2, \ldots, k-1$
For a large number of systems the LFC would be very unlikely
Use screening followed by two-stage R&S, or use a sequential procedure
The sequential procedure given earlier can be used for up to 500 alternatives or so

IE 519

245

Other Approaches
Focused on comparing expected values
of performance to identify the best
Alternatives:

Select the system most likely to be best

Select the largest probability of success

Bayesian procedures

IE 519

246

Bayesian Procedures
Posterior and prior
Take action to maximize/minimize the posterior
R&S: given a fixed computing budget, find the allocation of simulation runs to systems that minimizes some loss function, e.g.
$$L_{0\text{-}1}(i, \mu) = \begin{cases} 0 & \mu_i = \max_j \mu_j \\ 1 & \text{otherwise} \end{cases}, \qquad L_{\text{o.c.}}(i, \mu) = \max_j \mu_j - \mu_i$$
IE 519

247

Discussion: Selecting the


Best
Three major lines of research:

Indifference-zone procedures
Most popular, easy to understand, use the LFC

assumption

Screening or subset selection based on


constructing a confidence interval
Can be applied for more alternatives, do not give you a

final selection. Can be combined with indifference zone


selection

Allocating your simulation budget to minimize a


posterior loss function
More efficient use of simulation effort, but does not give

you the same guarantee as indifference-zone methods

IE 519

248

Simulation Optimization

IE 519

249

Larger Problems
Even with the best methods, R&S
can only be extended to perhaps
500 alternatives
Often faced with more when we
can set certain parameters for the
problem
Need simulation optimization

IE 519

250

What is Simulation
Optimization?
Optimization where the objective
function is evaluated using
simulation

Complex systems

Often large scale systems

No analytical expression available


IE 519

251

Problem Setting
Components of any optimization problem:
Decision variables θ
Objective function $f : \mathbb{R}^n \to \mathbb{R}$
Constraints $\Theta \subseteq \mathbb{R}^n$

IE 519

252

Simulation Evaluation
No closed-form expression for the function $f : \mathbb{R}^n \to \mathbb{R}$
Estimated using the output $X(\theta)$ of a stochastic discrete event simulation
Typically, we have $f(\theta) = E[X(\theta)]$
IE 519

253

Types of Techniques
Continuous decision variables: gradient-based methods
Discrete decision variables, small feasible region: ranking & selection
Discrete decision variables, large feasible region: random search
Note: these are direct optimization methods. Metamodels approximate the objective function and then optimize (later).
IE 519

254

Continuous Decision Variables
Most methods are gradient based (for minimization):
$$\theta^{(k+1)} = \theta^{(k)} - a_k \widehat{\nabla f}\left( \theta^{(k)} \right)$$
Issues:
All the same issues as in non-linear programming
How to estimate the gradient $\widehat{\nabla f}(\theta^{(k)})$
IE 519

255

Stochastic Approximation
Fundamental work by Robbins and Monro (1951) and Kiefer and Wolfowitz (1952)
Asymptotic convergence can be assured under conditions on the step sizes, e.g. $\lim_{k \to \infty} a_k = 0$
Generally slow convergence


IE 519

256

Estimating the Gradient
Challenge to estimate the gradient:
$$\nabla f(\theta) = \left( \frac{\partial f}{\partial \theta_1}, \frac{\partial f}{\partial \theta_2}, \ldots, \frac{\partial f}{\partial \theta_n} \right)$$
Finite differences are simple:
$$\frac{\partial f}{\partial \theta_i} \approx \frac{X(\theta + e_i \Delta_i) - X(\theta)}{\Delta_i}$$
(could also be two-sided)
IE 519

257

Improving Gradient
Estimation
Finite differences require two simulation runs for each estimate
May be numerically unstable
Better: estimate the gradient during the same simulation run as $X(\theta)$:
Perturbation analysis
Likelihood ratio or score function method
IE 519

258

Other Methods
Stochastic approximation variants
have received most attention by
researchers
Other methods for continuous domains
include

Sample path methods

Response surface methods (later)


IE 519

259

Discrete Decision
Variables
Two types of feasible regions:

Feasible region small (have seen this)
Trivial for the deterministic case, but must still account for the simulation noise

Feasible region large
E.g., the stochastic counterparts of combinatorial optimization problems
IE 519

260

Statistical Selection
Selecting between a few alternatives $\theta_1, \theta_2, \ldots, \theta_m$
Can evaluate every point and compare
Must still account for simulation noise
We now know several methods:

Subset selection
Indifference zone ranking & selection
Multiple comparison procedures (MCP)
Decision theoretic methods
IE 519

261

Large Feasible Region


When the feasible region is large it
is impossible to enumerate and
evaluate each alternative
Use random search methods

Academic research focused on


methods for which asymptotic
convergence is assured
In practice, use of metaheuristics
IE 519

262

Random Search (generic)

Step 0: Select an initial solution θ(0) and simulate its performance X(θ(0)). Set k = 0.
Step 1: Select a candidate solution θ(c) from the neighborhood N(θ(k)) of the current solution and simulate its performance X(θ(c)).
Step 2: If the candidate satisfies the acceptance criterion, let θ(k+1) = θ(c); otherwise let θ(k+1) = θ(k).
Step 3: If the stopping criterion is satisfied, terminate the search; otherwise let k = k+1 and go to Step 1.

IE 519

263
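The four steps map directly to a loop. A sketch under invented choices: the neighborhood is a unit step on the integers, acceptance is greedy (accept only improvements), and the stopping rule is a fixed budget.

import random

def simulate(theta):
    # Invented noisy performance X(theta); smaller is better here.
    return (theta - 7) ** 2 + random.gauss(0, 1)

def neighbor(theta):
    # Placeholder neighborhood N(theta): one step left or right.
    return theta + random.choice([-1, 1])

def random_search(theta0, iters=1000):
    theta, perf = theta0, simulate(theta0)      # Step 0
    for _ in range(iters):                      # Step 3: fixed budget
        cand = neighbor(theta)                  # Step 1
        cand_perf = simulate(cand)
        if cand_perf < perf:                    # Step 2: greedy acceptance
            theta, perf = cand, cand_perf
    return theta

print(random_search(0))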

Random Search Variants


Specify a neighborhood structure
Specify a procedure for selecting
candidates
Specify an acceptance criterion
Specify a stopping criterion
IE 519

264

Metaheuristics
Random search methods that have
been found effective for
combinatorial optimization
For simulation optimization

Simulated annealing
Tabu search
Genetic algorithms
Nested partitions method
IE 519

265

Simulated Annealing
Falls within the random search framework
Novel acceptance criterion (for minimization):

$$ P\{\text{Accept } \theta_c\} = \begin{cases} \exp\!\left( -\dfrac{X(\theta_c) - X(\theta^{(k)})}{T_k} \right) & \text{if } X(\theta_c) > X(\theta^{(k)}) \\ 1 & \text{otherwise} \end{cases} $$

The key parameter is Tk, which is called


the temperature
IE 519

266
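The acceptance criterion above takes only a few lines (minimization assumed); the geometric cooling schedule is just one illustrative choice of T_k.

import math, random

def accept(cand_perf, cur_perf, temp):
    # Always accept improvements; accept worse moves with prob exp(-diff / T_k).
    if cand_perf <= cur_perf:
        return True
    return random.random() < math.exp(-(cand_perf - cur_perf) / temp)

def temperature(k, t0=10.0, rate=0.99):
    # Illustrative geometric cooling; the next slide notes that a constant
    # temperature may work as well for simulation optimization.
    return t0 * rate ** k

print(accept(5.2, 5.0, temperature(10)))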

Temperature Parameter
Usually the temperature is decreased
as the search evolves
If it decreases sufficiently slowly then
asymptotic convergence is assured
For simulation optimization there are
indications that constant temperature
works as well or better

IE 519

267

Tabu Search
Can be fit into the random search framework
A unique feature is the restriction of the
neighborhood:

Solutions requiring the reversal of recent moves are not allowed in the neighborhood

Maintain a tabu list of moves

Other features include long-term memory that restarts the search with a different tabu list at good solutions
Has been applied successfully to simulation
optimization

IE 519

268

Genetic Algorithms
Works with sets of solutions (populations)
rather than single solutions
Operates on the population
simultaneously:

Survival
Cross-over
Mutation

Novel construction of a neighborhood


Has been used successfully for simulation
optimization
IE 519

269

Nested Partitions Method


Originally designed for simulation
optimization
Uses

Partitioning
Random sampling
Local search improvements

Has asymptotic convergence


IE 519

270

NP Method
[Diagram: in the k-th iteration the most promising region σ(k) is partitioned into j = 2 subregions σ_1(k) and σ_2(k); the rest of the feasible region, Θ \ σ(k), is the superregion.]

Partition of the feasible region
In each iteration there is the most promising region σ(k)
Use sampling to determine where to go next
IE 519

271

Sampling
Sources of randomness:

Performance of a subset is based on a random sample of solutions from that subset
Performance of each individual sample is estimated using simulation

Difficulty of estimating performance depends on how much variability there is in the region
Intuitively appealing to have more sampling from regions that have high variance
IE 519

272

Two-Stage Sampling
Use two-stage statistical selection methods to
determine the number of samples
Phase I:

Obtain initial samples from each region

Calculate estimated mean and variance

Calculate how many additional samples are needed

Phase II:

Obtain remaining samples

Estimate performance of region

IE 519

273

Convergence
Single-stage NP converges
asymptotically (useless?)
Two-stage NP converges to a solution
that is within an indifference zone of
optimum with a given probability

Reasonable goal softening

A statement familiar to simulation users

IE 519

274

Theory versus Practice


Asymptotically converging methods

Good theoretical properties
May not converge fast or be easy to use/understand

Practical methods

Often based on heuristic search
Do not necessarily account for randomness
Do not guarantee convergence
IE 519

275

Commercial Software
SimRunner (Promodel)

Genetic algorithms

AutoStat (AutoMod)

Evolutionary & genetic algorithms

OPTIMIZ (Simul8)

Neural networks

OptQuest (Arena, Crystal Ball, etc)

Scatter search, tabu search, neural nets


IE 519

276

Optimization in Practice
In academic work we have very specific
definitions:

Optimization = find the best solution


Approximation = find a solution that is within
a given distance of optimal performance
Heuristic = seek the optimal solution (without a guarantee)

In practice, people do not always think


about the theoretical ideal optimum that
is the basis for all of the above

Optimization = improvement

IE 519

277

Combining Theory & Practice

Best of both worlds:

Robustness and computational power of


heuristics
Guarantee performance somehow

Some examples:

Combine genetic algorithms with statistical


selection
Two-stage NP-Method guarantees
convergence within an indifference zone
with a prespecified probability
IE 519

278

Metamodels

IE 519

279

Response Surfaces
Obtaining a precise simulation estimate is
computationally expensive
We often want to do this for many
different parameter values (and even find
some optimal parameter values)
An alternative is to construct a response
surface of the output as a function of
these input parameters
This response surface is a model of the
simulation model, that is, a metamodel
IE 519

280

Metamodels

Simulation can be (simply) represented as
$$ y = g(\theta) $$
For a single output and additive randomness, we can write this as
$$ y = g(\theta) + \varepsilon $$
The metamodel $f$ models $g$, and $\tilde{y} = f(\theta)$ models $y$.
IE 519

281

Example

Instead of simulating an exact contour, construct a metamodel using a few values

IE 519

282

Regression

Most commonly, regression models have been used for metamodels:

$$ f(x) = \sum_k \beta_k p_k(x) $$

e.g.

$$ p_1(x) = x_1, \qquad p_2(x) = x_2, \qquad p_3(x) = x_1 x_2 $$

The issues are determining how many terms to include and estimating the coefficients.
IE 519

283

DOE for RS Models

The coefficients are given by

$$ \hat{\beta} = (X^T X)^{-1} X^T y $$

Key issues:

Would like to minimize the variance of $\hat{\beta}$
Can be done by controlling the random number stream

Would like to estimate $\hat{\beta}$ with fewer simulation runs
Designs to reduce bias
IE 519

284
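A sketch of fitting the metamodel f(x) = Σ β_k p_k(x) with the basis p_1 = x_1, p_2 = x_2, p_3 = x_1 x_2 from the previous slide, solving the normal equations above via least squares; the replicated 2-factor design and the noisy response are invented for illustration.

import numpy as np

rng = np.random.default_rng(0)

# Invented 2^2 factorial design (coded -1/+1 levels), replicated 5 times.
pts = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]] * 5, dtype=float)
x1, x2 = pts[:, 0], pts[:, 1]

# Hypothetical simulation responses: linear model with interaction plus noise.
y = 10 + 2 * x1 - 3 * x2 + 1.5 * x1 * x2 + rng.normal(0, 0.5, len(pts))

# Design matrix with basis functions 1, x1, x2, x1*x2.
X = np.column_stack([np.ones(len(pts)), x1, x2, x1 * x2])

# beta_hat = (X'X)^{-1} X'y, computed stably by least squares.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)    # roughly [10, 2, -3, 1.5]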

Why Response Surfaces?

Box, Hunter, and Hunter (1978). Statistics for Experimenters, Wiley.

IE 519

285

Compare with True Optimum

Why did this fail?


IE 519

286

Response Surface
Optimization

IE 519

287

Second Order Model

IE 519

288

Experimental Process
State your hypothesis
Plan an experiment

Design of Experiments (DOE)

Conduct the experiment

Run a simulation

Analyze the data

Output analysis

Repeat steps as needed


IE 519

289

DOE
Define the goals of the experiment
Identify and classify independent and
dependent variables (see example)
Choose a probability model

Linear, second order, other (see later)

Choose an experimental design

Factorial design, fractional factorial, Latin hypercubes, etc.

Validate the properties of the design


IE 519

290

Example of Variables

Dependent        Independent
Throughput       Job release policy, lot size, previous maintenance, speed
Cycle Time       Job release policy, lot size, previous maintenance, speed
Operating Cost   Previous maintenance, speed

IE 519

291

Other Metamodels

Many other approaches can be taken to metamodeling:

Splines
Have been used widely for deterministic simulation responses

Radial basis functions
Neural networks

Kriging
Has also been used widely in deterministic simulation and is gaining a lot of ground in stochastic simulation

IE 519

292

Variance Reduction

IE 519

293

Variance Reduction
As opposed to physical experiments, in simulation
we can control the source of randomness
May be able to take advantage to improve
precision

Output analysis
Ranking & selection
Experimental designs, etc.

Several methods:

Common random numbers


Comparing two or more systems

Antithetic variates
Improving precision of a single system

Control variates, indirect estimation, conditioning

IE 519

294

Common Random Numbers

Most useful technique
Use the same stream of random numbers for each system when comparing
Motivation:

$$ Z_j = X_{1j} - X_{2j}, \qquad \bar{Z}(n) = \frac{1}{n} \sum_{j=1}^{n} Z_j $$

$$ \mathrm{Var}[\bar{Z}(n)] = \frac{\mathrm{Var}[Z_j]}{n}, \qquad \mathrm{Var}[Z_j] = \mathrm{Var}[X_{1j}] + \mathrm{Var}[X_{2j}] - 2\,\mathrm{Cov}[X_{1j}, X_{2j}] $$

Positive covariance induced by CRN reduces the variance of the difference.
IE 519

295
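A toy illustration of the motivation: two systems whose outputs are monotone transforms of the same uniforms, so CRN induces positive covariance and shrinks Var[Z_j]. The exponential "service time" models (means 5 and 4) are invented for the example.

import numpy as np

rng = np.random.default_rng(42)
n = 10_000
u = rng.uniform(size=n)

# Both systems generate outputs from uniforms by inversion (monotone in u).
x1 = -5.0 * np.log(1 - u)                          # system 1, stream u
x2_crn = -4.0 * np.log(1 - u)                      # system 2, same stream (CRN)
x2_ind = -4.0 * np.log(1 - rng.uniform(size=n))    # system 2, independent stream

print("Var[Z_j], independent:", np.var(x1 - x2_ind))   # about 25 + 16 = 41
print("Var[Z_j], CRN:        ", np.var(x1 - x2_crn))   # about 1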

Applicability

IE 519

296

Synchronization
We must match up random
numbers from the different systems
Careful synchronization of the
random number stream

Assign one substream to each process


Divide that substream up to get
replications

Does not happen automatically


IE 519

297

Example: Failed Sync.

IE 519

298

Example: M/M/1 vs M/M/2


Independent sampling

CRN
IE 519

299

Example: Correlation
Induced

IE 519

300

Example: System
Difference

IE 519

301

Methods for CRN Use


Many methods assume independence
between systems
Ranking & Selection, etc.

Some methods designed to use CRN,


while it violates the assumptions of others

Experimental design

Can design the experiments specifically to


take advantage of CRNs
IE 519

302

Discussion
Dramatic improvements can be
achieved with CRN (but can also be
harmful)
Recommendations:

Make sure CRN are applicable (pilot study)


Use methods that take advantage of CRN
Synchronize the RNG
Use one-to-one random variate generator

IE 519

303

Antithetic Variates
We now turn to improving precision of a
simulation of a single system
Basic idea:

Pairs of runs
Large observations offset by small
observations
Use the average, which will have smaller
variance

Need to induce negative correlation


IE 519

304

Mathematical Motivation

Recall that for a covariance stationary process $Y_1, Y_2, \ldots, Y_n$ we have

$$ \mathrm{Var}[\bar{Y}] = \frac{\sigma^2}{n}\left[ 1 + 2 \sum_{l=1}^{n-1} \left( 1 - \frac{l}{n} \right) \rho_l \right], \qquad \rho_l = \mathrm{Corr}[Y_i, Y_{i+l}] $$

If the covariance terms are negative the variance will be reduced
Difficult to get all of them negative
IE 519

305

Complementary Random Numbers

The simplest approach is to use complementary random numbers
Suppose service times are exponentially distributed with mean = 5

U        X        1-U      X        Pair avg.
0.37     4.98     0.63     2.30     3.64
0.55     3.02     0.45     3.96     3.49
0.98     0.09     0.02     20.17    10.13
0.24     7.19     0.76     1.36     4.27
0.71     1.70     0.29     6.22     3.96
Avg.     3.40              6.80     5.10
S.Dev.   2.78              7.70     2.83

IE 519

306

Example (cont.)

U        X        1-U      X        Pair avg.
0.07     13.39    0.93     0.36     6.87
0.35     5.26     0.65     2.15     3.70
0.21     7.86     0.79     1.16     4.51
0.57     2.81     0.43     4.23     3.52
0.66     2.08     0.34     5.39     3.74
Avg.     6.28              2.66     4.47
S.Dev.   4.58              2.11     1.40

Does this prove that antithetic variates work for this example?

IE 519

307

What You Need

As for CRN, we need a monotone relationship between the (many) unit uniform random numbers and the (single) output $X$ that we are interested in
(When there are multiple outputs this needs to hold for each output.)
As before:
Synchronization
Inverse-transform
IE 519

308

Formulation

$$ X_{2j-1} = F^{-1}(U_j), \qquad X_{2j} = F^{-1}(1 - U_j) $$

The simulation maps $X_{2j-1} \to Y_{2j-1}$ and $X_{2j} \to Y_{2j}$; use the pair average

$$ \bar{Y}_j = \frac{Y_{2j-1} + Y_{2j}}{2} $$
IE 519

309
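A sketch of the pairing for exponential (mean 5) service times via the inverse transform, with the service time itself taken as the output purely to show the variance effect on the pair averages.

import numpy as np

rng = np.random.default_rng(7)
mean, pairs = 5.0, 10_000
u = rng.uniform(size=pairs)

x_odd  = -mean * np.log(1 - u)       # X_{2j-1} = F^{-1}(U_j)
x_even = -mean * np.log(u)           # X_{2j}   = F^{-1}(1 - U_j)
pair_avg = 0.5 * (x_odd + x_even)    # antithetic pair average

indep = -mean * np.log(1 - rng.uniform(size=2 * pairs))
indep_avg = indep.reshape(pairs, 2).mean(axis=1)

print("Var of pair average, antithetic: ", np.var(pair_avg))
print("Var of pair average, independent:", np.var(indep_avg))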

Example: M/M/1 Queue


Independent sampling

Antithetic sampling

IE 519

310

Complementary Processes

Imagine a queueing simulation with arrivals and services
Large interarrival times will in general have the same effect on performance measures as small service times (both reduce congestion)
Idea: Use the random numbers used for generating interarrival times in the first run of a pair to generate service times in the second run, and vice versa
This could be extended to any situation where you can argue a similar complementary relationship

IE 519

311

Combining CRN with AV


Both CRN and AV are using very similar
ideas, so why not combine them?
System 1: Run 1.1 and Run 1.2
System 2: Run 2.1 and Run 2.2
If we have all the right correlations within and across systems, the cross pairs are affected too:

Run 1.1 and Run 2.2 become negatively correlated
Run 1.2 and Run 2.1 become negatively correlated

These negative cross-correlations inflate the variance of the difference, so the overall performance may be worse
IE 519

312

Discussion
Basic idea is to induce negative
correlation to reduce variance
Success is model dependent
Must show that it works

Based on model structure


Pilot experiments

Since we need a monotone relationship


between the RNG and output:
synchronization and inverse-transform
IE 519

313

Control Variates

We are again interested in improving the precision of some output $Y$:

$$ Y_C = Y - a(X - \mu) $$

where $Y$ is the observed value of the output, $\mu = E[X]$ is the known mean, and $X$ is the control variate (any correlated r.v. with known mean)

$a > 0$ if $X$ and $Y$ are positively correlated
$a < 0$ if $X$ and $Y$ are negatively correlated
IE 519

314

Estimator Properties

The controlled estimator is unbiased:

$$ E[Y_C] = E[Y - a(X - \mu)] = E[Y] - a(E[X] - \mu) = E[Y] $$

The variance is

$$ \mathrm{Var}[Y_C] = \mathrm{Var}[Y - a(X - \mu)] = \mathrm{Var}[Y] + a^2 \mathrm{Var}[X] - 2a\,\mathrm{Cov}(X, Y) $$
IE 519

315

Optimal a Given Y

$$ 0 = \frac{\partial}{\partial a}\left( \mathrm{Var}[Y] + a^2 \mathrm{Var}[X] - 2a\,\mathrm{Cov}(X, Y) \right) = 2a\,\mathrm{Var}[X] - 2\,\mathrm{Cov}(X, Y) $$

$$ a^* = \frac{\mathrm{Cov}(X, Y)}{\mathrm{Var}[X]} $$

$$ \frac{\partial^2}{\partial a^2}\left( \mathrm{Var}[Y] + a^2 \mathrm{Var}[X] - 2a\,\mathrm{Cov}(X, Y) \right) = 2\,\mathrm{Var}[X] > 0 $$
IE 519

316

Optimal Variance

With the optimal value $a^*$:

$$ \mathrm{Var}[Y_C^*] = \mathrm{Var}[Y] + (a^*)^2 \mathrm{Var}[X] - 2a^*\,\mathrm{Cov}(X, Y) = \mathrm{Var}[Y] - \frac{\mathrm{Cov}(X, Y)^2}{\mathrm{Var}[X]} = \left( 1 - \rho_{XY}^2 \right) \mathrm{Var}[Y] $$
IE 519

317

Observations

By using the optimal value $a^*$:

The controlled estimator is never more variable than the uncontrolled estimator
If there is any correlation, the controlled estimator is more precise
Perfect correlation ($\rho_{XY} = \pm 1$) means a perfect estimate:

$$ \mathrm{Var}[Y_C^*] = (1 - \rho_{XY}^2)\,\mathrm{Var}[Y] = 0 $$

Where's the catch?
IE 519

318

Estimating a*

We never know Cov[X, Y] and hence not $a^*$
Need to estimate:

$$ Y_C^*(n) = \bar{Y}(n) - \hat{a}^*(n)\left( \bar{X}(n) - \mu \right) $$

$$ \hat{a}^*(n) = \frac{\hat{C}_{XY}(n)}{S_X^2(n)}, \qquad \hat{C}_{XY}(n) = \frac{\sum_{j=1}^{n} \left( Y_j - \bar{Y}(n) \right)\left( X_j - \bar{X}(n) \right)}{n - 1} $$

This will in general be a biased estimator
Jackknifing can be used to reduce bias
IE 519

319
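A sketch of the controlled estimator with the estimated coefficient: X is a control whose mean is known by construction, and Y is an invented output positively correlated with it.

import numpy as np

rng = np.random.default_rng(3)
n = 1_000
mu = 5.0                                    # known E[X] (by construction here)

x = rng.exponential(mu, size=n)             # control variate (e.g. service times)
y = 2.0 * x + rng.normal(0, 2.0, size=n)    # invented output, correlated with x

a_hat = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)   # estimate of Cov/Var
y_c = y.mean() - a_hat * (x.mean() - mu)                  # controlled estimator

print("uncontrolled:", y.mean())
print("controlled:  ", y_c)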

Example: M/M/1 Queue


Want to estimate the expected
customer delay in system
Possible control variates:

Service times
Positive correlation

Interarrival times
Negative correlation

IE 519

320

Example: Service Time CVs

Rep     Y        X
1       13.84    0.92
2       3.18     0.95
3       2.26     0.88
4       2.76     0.89
5       4.33     0.93
6       1.35     0.81
7       1.82     0.84
8       3.01     0.92
9       1.68     0.85
10      3.60     0.88

$\bar{Y}(10) = 3.78$; the true expected delay is 4.13 (which shouldn't be known!)
$\bar{X}(10) = 0.89$, with known mean $E[X] = 0.9$
$S_X^2(10) = 0.002$
$\hat{C}_{XY}(10) = 0.07$

$$ \hat{a}^*(10) = \frac{\hat{C}_{XY}(10)}{S_X^2(10)} = 35.00 $$

$$ Y_C^*(10) = \bar{Y}(10) - \hat{a}^*(10)\left( \bar{X}(10) - 0.9 \right) = 3.78 - 35(-0.01) = 4.13 $$

IE 519

321

Multiple Control Variates

Perhaps we have two correlated random variables (e.g., both service times and interarrival times):

$$ Y_C = Y - a(X - \mu) = Y - a\left( X^{(1)} - \mu^{(1)} \right) - a\left( X^{(2)} - \mu^{(2)} \right), \qquad X = X^{(1)} + X^{(2)} $$

Problems?
IE 519

322

Multiple Control Variates

To take best advantage of each control variate we need different weights:

$$ Y_C = Y - a_1\left( X^{(1)} - \mu^{(1)} \right) - a_2\left( X^{(2)} - \mu^{(2)} \right) $$

Find the partial derivatives with respect to both and solve for the optimal values as before.

IE 519

323

In General

$$ Y_C = Y - \sum_{i=1}^{m} a_i \left( X^{(i)} - \mu^{(i)} \right) $$

$$ \mathrm{Var}[Y_C] = \mathrm{Var}[Y] + \sum_{i=1}^{m} a_i^2 \mathrm{Var}\!\left[ X^{(i)} \right] - 2 \sum_{i=1}^{m} a_i \mathrm{Cov}\!\left( Y, X^{(i)} \right) + 2 \sum_{i=1}^{m} \sum_{j > i} a_i a_j \mathrm{Cov}\!\left( X^{(i)}, X^{(j)} \right) $$
IE 519

324

Types of Control Variates

Internal
Input random variables, or functions of those random variables
Known expectation
Must generate anyway

External
We cannot know E[Y]
However, with some simplifying assumptions we may have an analytical model that we can solve, and hence know $\mu' = E[Y']$ for the same output
Requires a simulation of the simplified system
IE 519

325

Indirect Estimation
Has primarily been used for queueing simulations:

$D_i$ = delay of the i-th customer, $d = E[D_i]$
$W_i$ = total wait of the i-th customer, $w = E[W_i]$
$Q(t)$ = number of customers in queue at time t
$L(t)$ = number of customers in system at time t
IE 519

326

Direct Estimators

$$ \hat{d}(n) = \frac{1}{n} \sum_{i=1}^{n} D_i, \qquad \hat{w}(n) = \frac{1}{n} \sum_{i=1}^{n} W_i $$

$$ \hat{Q}(n) = \frac{1}{T(n)} \int_0^{T(n)} Q(t)\,dt, \qquad \hat{L}(n) = \frac{1}{T(n)} \int_0^{T(n)} L(t)\,dt $$
IE 519

327

Known Relationships

$$ \hat{w}(n) = \hat{d}(n) + \bar{S}(n), \qquad \bar{S}(n) = \frac{1}{n} \sum_{i=1}^{n} S_i $$

where $S_i$ is the service time of customer i and $E[\bar{S}(n)] = E[S]$.

Can we take advantage of this?
IE 519

328

Indirect Estimator

Replace the average with the known expectation:

$$ \tilde{w}(n) = \hat{d}(n) + E[S] $$

This avoids a source of variation. For any G/G/1 queue it can be shown that

$$ \mathrm{Var}[\tilde{w}(n)] \le \mathrm{Var}[\hat{w}(n)] $$

Is this trivial?
IE 519

329
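A toy comparison of the direct estimator ŵ(n) = d̂(n) + S̄(n) and the indirect estimator w̃(n) = d̂(n) + E[S] for an M/M/1 queue, using Lindley's recursion for the delays; the arrival rate 0.5 and service rate 1 are invented parameters.

import numpy as np

rng = np.random.default_rng(5)
lam, mu_rate, n, reps = 0.5, 1.0, 2_000, 100

def delays_and_services():
    # Lindley's recursion: D_i = max(0, D_{i-1} + S_{i-1} - A_i).
    a = rng.exponential(1 / lam, n)        # interarrival times
    s = rng.exponential(1 / mu_rate, n)    # service times
    d = np.zeros(n)
    for i in range(1, n):
        d[i] = max(0.0, d[i - 1] + s[i - 1] - a[i])
    return d, s

direct, indirect = [], []
for _ in range(reps):
    d, s = delays_and_services()
    direct.append(d.mean() + s.mean())        # w_hat = d_hat + S_bar(n)
    indirect.append(d.mean() + 1 / mu_rate)   # w_tilde = d_hat + E[S]

print("Var of direct estimator:  ", np.var(direct))
print("Var of indirect estimator:", np.var(indirect))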

Little's Law

A key result from queueing is Little's Law:

$$ L = \lambda w, \qquad Q = \lambda d $$

where $\lambda$ is the arrival rate to the system.

Indirect estimators of the average number of customers in the queue/system:

$$ \tilde{Q}(n) = \lambda\,\hat{d}(n), \qquad \tilde{L}(n) = \lambda\left( \hat{d}(n) + E[S] \right) $$
IE 519

330

Numerical Example

M/G/1 queue

Service Dist.       ρ = .5   ρ = .7   ρ = .9
Exponential         15       11       4
4-Erlang            22       17
Hyperexponential

IE 519

331

Conditioning

Again replace an estimate with its exact analytical value, hence removing a source of variability:

$$ E[X \mid Z = z] \quad \text{analytically known} $$

$$ E[X] = E\big[ E[X \mid Z] \big] $$

$$ \mathrm{Var}\big[ E[X \mid Z] \big] = \mathrm{Var}[X] - E\big[ \mathrm{Var}[X \mid Z] \big] \le \mathrm{Var}[X] $$
IE 519

332

Discussion
We need:

$Z$ can be easily generated
$E[X \mid Z = z]$ can be easily calculated
$E[\mathrm{Var}[X \mid Z]]$ is large

This is going to be very model dependent
IE 519

333

Example: Time-Shared
Computer Model

Want to estimate the expected delay in queue


for CPU (dC), disk (dD) and tape (dT)

IE 519

334

Conditioning

Estimating $d_T$ may be hard due to lack of data
Observe the number $N_T$ in the tape queue every time a job leaves the CPU
If this job were to go to the tape queue, its expected delay would be

$$ E[D_T \mid N_T] = E[S_T]\, N_T = 12.50\, N_T $$

So with $Z = N_T$,

$$ E[D_T \mid N_T = z] = 12.50\, z \quad \text{(known analytically)} $$

A variance reduction of 56% was observed
IE 519

335
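A sketch of the conditional estimator from this slide: instead of averaging noisy observed tape delays, average the analytically known conditional mean 12.50 N_T over the observed queue lengths; the N_T observations are invented.

import numpy as np

rng = np.random.default_rng(11)
E_ST = 12.50                       # mean tape service time from the slide

# Hypothetical observations of the tape-queue length N_T, recorded
# each time a job leaves the CPU.
n_t = rng.poisson(2.0, size=1_000)

# Conditional estimator: average E[D_T | N_T] = 12.50 * N_T, replacing
# simulated delays with their analytical conditional expectation.
d_t_hat = np.mean(E_ST * n_t)
print(d_t_hat)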

Discussion
Both indirect estimation and
conditioning are extremely
application dependent
Require both good knowledge of
the system as well as some
background from the analyst
Can achieve good variance
reduction when used properly
IE 519

336

Variance Reduction
Discussion
Have discussed several methods:

Common random numbers
Antithetic random variates
Control variates
Indirect estimation (more application specific)
Conditioning (more application specific)
IE 519

337

Applicability &
Connections
Can we use VRT with any technique for
output analysis (e.g., batch-means)?
Can we use VRT (especially CRN) with
ranking-and-selection and multiple
comparison methods?
Can we design our simulation experiments
(DOE) to take advantage of VRT
(especially when building a metamodel)?
Can we use VRT with simulation
optimization techniques?
IE 519

338

VRT & Batch-Means


Batch-means is a very important method
for output analysis (non-overlapping &
overlapping)
Problem is that there may be correlation
between batches
Generally no problem with the use of
common random numbers or antithetic
variates
Use of control variates requires some
additional consideration but can be done
IE 519

339

VRT & Ranking & Selection


In R&S we need to make statements about the differences

$$ \bar{X}_i(n) - \bar{X}_l(n), \qquad i \ne l $$

If we use CRNs then the two averages will be dependent, which complicates the analysis
Two general methods:

Look at pair-wise differences
Bonferroni inequality

Assume some structure for the dependence induced by the CRNs

IE 519

340

Pair-Wise Differences
We can replace the observations

$$ X_{ij}, \qquad i = 1, 2, \ldots, k;\ j = 1, 2, \ldots, n $$

with the pair-wise differences

$$ Z_{il,j} = X_{ij} - X_{lj}, \qquad i \ne l;\ j = 1, 2, \ldots, n $$

This will then include the effect of the CRN-induced covariance
IE 519

341

Bonferroni Approach
We can break up the joint statement using the Bonferroni inequality:

$$ P\left\{ \bigcap_{i=1}^{k-1} A_i \right\} \ge 1 - \sum_{i=1}^{k-1} P\{ A_i^c \} $$

e.g.

$$ A_i = \left\{ \bar{X}_i(n) - \bar{X}_k(n) \text{ covers } \mu_i - \mu_k \right\} $$

Very conservative approach
IE 519

342

Assumed Structure

E.g., the Nelson-Matejcik modification of two-stage ranking and selection assumes sphericity:

$$ \mathrm{Cov}\left( X_{ij}, X_{lj} \right) = \begin{cases} \beta_i + \beta_l + \tau^2 & i = l \\ \beta_i + \beta_l & i \ne l \end{cases} $$

Turns out to be a robust assumption

IE 519

343

VRT & DOE/Metamodeling

An experimental design X is used in many simulation studies to construct a metamodel (usually a regression model) of the response

$$ y = X\beta + \varepsilon $$

Can we take advantage of variance reduction to improve the design?
IE 519

344

2³ Factorial Design

How would you use VRT for this design?

IE 519

345

Assignment Rule

In an m-point experiment that admits orthogonal blocking into two blocks of size m1 and m2, use a common stream of random numbers for the first block and the antithetic random numbers for the second block:

$$ U_1 = (u_{11}, u_{12}, \ldots, u_{1v}), \qquad U_2 = (u_{21}, u_{22}, \ldots, u_{2v}) = (\bar{u}_{11}, \bar{u}_{12}, \ldots, \bar{u}_{1v}) $$

where each $u_{jv} = (u_{jv,1}, u_{jv,2}, \ldots)$ and its complement is $\bar{u}_{jv} = (1 - u_{jv,1}, 1 - u_{jv,2}, \ldots)$.
IE 519

346
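A sketch of the assignment rule: one uniform stream drives every design point in the first block, and its complement drives the second block. The two-point blocks and the response function are invented.

import numpy as np

rng = np.random.default_rng(9)
u = rng.uniform(size=1_000)             # one stream of uniforms

def run(design_point, stream):
    # Hypothetical simulation: response depends on the design point and on
    # exponential variates generated from the stream by inversion.
    return design_point * np.mean(-np.log(1 - stream))

block1 = [1.0, 2.0]                     # design points in block 1 (invented)
block2 = [3.0, 4.0]                     # design points in block 2 (invented)

y1 = [run(d, u) for d in block1]        # common random numbers in block 1
y2 = [run(d, 1 - u) for d in block2]    # antithetic numbers in block 2
print(y1, y2)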

Blocking

IE 519

347

2³ Factorial Design in 2 Blocks

In physical experiments we block because we have to (we lose the three-way interaction effect).
In simulation we do it because it is better.
IE 519
348

VRT and Optimization


Most of the optimization techniques used with
simulation do not make any assumption (e.g.,
just heuristics from deterministic optimization
applied to simulation)

No problem with using variance reduction

Nested partitions method

Need independence between iterations


Key is to make a correct selection of a region in each
iteration
Can use CRN within each iteration to make that
selection

IE 519

349

Discussion

Variance reduction techniques can be very effective in improving the precision of simulation experiments
Of course variance is only part of the equation, and you should also consider bias $E[X] - \mu$ and efficiency:

$$ \mathrm{MSE}(X) = E\left[ (X - \mu)^2 \right] = \mathrm{Var}[X] + \left( E[X] - \mu \right)^2 $$

$$ \mathrm{Eff}(X) = \frac{1}{\mathrm{MSE}(X)\, C(X)}, \qquad C(X) = \text{cost of computing } X $$
IE 519

350
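The two formulas in a few lines, on invented numbers, to make the bias/variance decomposition concrete.

import numpy as np

mu = 10.0                                    # true quantity being estimated
x = np.array([9.2, 10.5, 11.1, 9.8, 10.9])   # invented estimates of mu

mse = np.mean((x - mu) ** 2)                 # MSE(X) = E[(X - mu)^2]
decomp = np.var(x) + (x.mean() - mu) ** 2    # Var[X] + bias^2
print(mse, decomp)                           # the decomposition agrees

cost = 2.5                                   # invented cost C(X) of computing X
print(1.0 / (mse * cost))                    # Eff(X) = 1 / (MSE(X) C(X))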

Case Study

IE 519

351

Manufacturing Simulations
Objectives

Increase throughput
Reduce in-process inventory
Increase utilization
Improve on-time delivery
Validate a proposed design
Improve understanding of the system
IE 519

352

Evaluate Need for Resources
How many machines are needed?
Where to put inventory buffers?
Effect of change/increase in
production mix/volume?
Evaluation of capital investments
(e.g., new machine)

IE 519

353

Performance Evaluation
Throughput
Response time
Bottlenecks

IE 519

354

Evaluate Operational
Procedures
Scheduling
Control strategies
Reliability
Quality control

IE 519

355

Sources of Randomness
Interarrival times between orders, parts,
or raw material
Processing, assembly, or inspection time
Times to failure
Times to repair
Loading/unloading times
Setup times
Rework
Product yield
IE 519

356

Example: Assembly Line


Increase in demand expected
Questions about the ability of an
assembly line to meet demand
Requested to simulate the line to
evaluate different options to
improve throughput

IE 519

357

Project Objective
Improve throughput in the line
Simulate the following options:

Optimize logic of central conveyor loop


Reconfigure the functional test stations
to allow parallel flow of pallets
Eliminate the conveyor and move to
manual material handling

IE 519

358

Assembly Line
[Line diagram: Manual Station 1 (assemble heat sinks and fans), Soldering, Manual Station 2 (install power module onto power PCBA), Hi-Pot Test, Strapping, Flashing, Functional Tests, HIM Assembly, Verification Test, Packaging.]
IE 519

359

Simulation Project
Define a conceptual model of the line
Gather data on processes
Validate the model
Implement model in Arena
Test model on known scenarios
Evaluate options
Recommend solutions
IE 519

360

How Can Throughput be Improved?

Change the queuing logic
This determines how pallets move from one station to the next

Flash station
Two stations in a single loop

Functional test station
Three loops with two stations each
IE 519

361

Logic of the Flash Stations

[Diagram of the flash loop; the four cases: the frame goes to the 2nd station; the frame goes to the 2nd station and waits in its queue; the frame goes to the 1st station; the frame goes to the 1st station and waits in its queue.]
IE 519

362

Logic of the Functional Test

[Diagram: three loops with two functional test stations each, numbered 1 through 6.]

IE 519

363

How Can Throughput be Improved?

Reconfigure functional test stations
Parallel tests would be more efficient with respect to the flow of pallets
Would take up more space on the floor and require longer travel distances
Is it worthwhile to reconfigure?

IE 519

364

Circulate Pallets in System

IE 519

365

How Can Throughput be Improved?
Eliminate the conveyor

Manual material handling

Increase number of pallets

Currently 48 pallets
Sometimes run out

IE 519

366

Arena Simulation Model

The conceptual model was implemented using the Arena software
The current configuration was simulated and its output compared to what we have observed
Performance of several alternative configurations compared
IE 519

367

Options Considered
Current configuration
Pallets re-circulate rather than queue
Various queue logics at functional tests
Flash station in series, functional test in
parallel
Both flash and functional test stations in
parallel
Increased number of pallets in system
Eliminate conveyor
IE 519

368

Queue Logic Options


Option 1: Queue one drive
at second station in each
loop starting with furthest
away loop
Option 2: Queue one drive
at both stations in each
loop, starting with furthest
away loop
Option 3: No queuing in
loops
Option 4: Queue at second
station in first loop only,
start with furthest away loop

IE 519

369

Throughput Comparison

Configuration                   Throughput (drives/day)
Current                         265
Recirculation of pallets        275 (4% increase)
Queue logic: Option 1           274 (3% increase)
Queue logic: Option 2           279 (5% increase)
Queue logic: Option 3           280 (6% increase)
Queue logic: Option 4           295 (11% increase)
Mixed series/parallel           282 (6% increase)
All tests in parallel           296 (12% increase)
Increase to 60 pallets (25%)    291 (10% increase)
No conveyor                     256 (3% decrease)

IE 519

370

Why Does Throughput Improve?

Consider the utilization of the test stations:

First loop:   Station 1: 0.67,  Station 2: 0.94  (average 0.81, difference 0.27)
Second loop:  Station 3: 0.45,  Station 4: 0.81  (average 0.63, difference 0.36)
Third loop:   Station 5: 0.30,  Station 6: 0.52  (average 0.41, difference 0.22)

IE 519

371

Improving Utilization

Backfilling will improve the balance between different loops (Option 2)
Loop utilization: 0.53, 0.68, 0.76
Does not solve the whole issue

Not queuing at test stations will balance the load between stations within a loop (Option 3)
Station utilization: 0.64, 0.65, 0.75, 0.71, 0.80, 0.76
No queuing may leave a station empty too easily
IE 519

372

Intermediate Options
Option 1: Queue one drive at second station
in each loop starting with furthest away
loop

Backfilling
Balance between no-queuing and current
method of queuing one drive at each station

Option 4: Queue at second station in first


loop only, start with furthest away loop

Uses backfilling idea


Balance between no-queuing and queuing at
second station

IE 519

373

Utilization Comparison

Configuration              Functional Test Utilization
Current                    0.67, 0.94, 0.45, 0.81, 0.30, 0.52
Recirculation of pallets   0.71, 0.94, 0.51, 0.86, 0.26, 0.54
Queue logic: Option 1      0.42, 0.67, 0.56, 0.83, 0.67, 0.91
Queue logic: Option 2      0.39, 0.67, 0.52, 0.84, 0.61, 0.91
Queue logic: Option 3      0.64, 0.65, 0.75, 0.71, 0.80, 0.76
Queue logic: Option 4      0.53, 0.82, 0.65, 0.65, 0.73, 0.70

IE 519

374

Comments on Utilization
Utilization of functional test
stations is currently uneven and
can be improved
Key ideas

Backfilling
Correct amount of queuing allowed

IE 519

375

Bottleneck Analysis

Utilization of various stations:

Manual station 1             80%   bottleneck*
Manual station 2             65%
Soldering                    79%   bottleneck*
Hi-Pot                       15%
Strapping                    37%
Flashing                     53% average
Functional test              71% average (third highest utilization)
Functional test station 2    94%   bottleneck
HIM                          31%
Verification                 47%
Packing                      57%

*Statistically equivalent

IE 519

376

Bottleneck Identification
Functional Test Station 2 is the most
heavily loaded station on the line
On average, the functional test
stations are slightly less loaded than
Manual Station 1 and Soldering
Station, which should hence also be
considered bottlenecks

IE 519

377

Functional Test Bottleneck

Configuration              Queue Length     Maximum
Current                    0.73 ± 0.34      MAX = 10 (21%)
Recirculation of pallets   0.17 ± 0.04      MAX = 1 (2%)
Queue logic: Option 1      0.72 ± 0.32      MAX = 10 (21%)
Queue logic: Option 2      0.20 ± 0.10      MAX = 8 (17%)
Queue logic: Option 3      1.28 ± 0.50      MAX = 17 (35%)
Queue logic: Option 4      0.79 ± 0.27      MAX = 11 (23%)
Mixed series/parallel      0.96 ± 0.26      MAX = 19 (40%)
All tests in parallel      1.14 ± 0.33      MAX = 14 (29%)
Increase to 60 pallets     1.34 ± 0.55      MAX = 16 (33%)

IE 519

378

Comments on Queue
Length
Functional test queue

Average queue length relatively short


Occasionally very long queues

Similar results for other stations,


e.g., HIM assembly station
Not a cause for concern

IE 519

379

Recommendations
Throughput can be improved:

Queuing logic at test stations


Requires reprogramming of conveyor

Configuring test stations in parallel


Requires significant reorganization
ROI must be evaluated carefully

Increase number of pallets


Currently close to point of rapidly diminishing

returns
Will not combine well with other improvements

IE 519

380

Further Improvements
The optimal logic of the functional tests
depends on the mix of drives, daily load,
etc.
Possibility of dynamically changed
logic?
Determine a relationship between
product mix parameters and best
logic
IE 519

381

Other Areas of Improvement
Scheduling of drives

Mix of frames made on each day


Order of how different frames are made

Suggestion

Grouping and spacing


Group similar drives together for efficiency
Space time consuming drives apart
Account for deadlines and resource
availability
IE 519

382

Will Scheduling Improvements Help?

Simulation results
Assume batch sizes within a certain range and a certain most common batch size:

Type       Min   Max   Most common   Throughput
Batching         27    14            279
Batching         22    10            265
No Batch                             274

Clearly improvements can be made


IE 519

383

Discussion
Significant improvement can be obtained through inexpensive changes

Recommend changing the queuing logic as an inexpensive but high-return alternative

Worthwhile to consider issues of


scheduling
Simulation model can be reused to
consider other potential improvements
Company followed recommendations and
increased throughput as predicted
IE 519

384
