4.2 Estimation of Absolute Performance

The document discusses different types of output analysis for simulations. It describes terminating simulations that run for a set duration with initial conditions, and non-terminating or steady-state simulations that run continuously. The output from stochastic simulations exhibits random variability both within and across replications. Absolute measures like means, proportions, and quantiles can be used as point estimators, while confidence intervals incorporate uncertainty in point estimates.

Uploaded by

Ansh Ganatra
Copyright
© All Rights Reserved

Estimation of Absolute Performance

Outline:
1. Type of Simulation w.r.t. Output Analysis
2. Stochastic Nature of Output Data
3. Absolute Measures
4. Output Analysis for Terminating Simulation
5. Output Analysis for Steady-State Simulation
1. Type of Simulation w.r.t. Output Analysis

Output analysis is the examination of the data generated by a simulation.
Its purpose is either to predict the performance of a system or to compare the
performance of two or more alternative system designs.
The need for statistical output analysis is based on the observation that the
output data from a simulation exhibit random variability:
• The variability is due to the use of random numbers to produce input variables.
• Two different streams or sequences of random numbers will produce two
sets of outputs, which will differ.


Objective: Estimate system performance via simulation.

If the system performance is measured by a parameter $\theta$, the result of a set of simulation
experiments will be an estimator $\hat{\theta}$ of $\theta$.
The precision of the estimator $\hat{\theta}$ can be measured by:
• The standard error of $\hat{\theta}$.
• The width of a confidence interval (CI) for $\theta$.
Purpose of statistical analysis:
• To estimate the standard error or CI.
• To figure out the number of observations required to achieve a desired error or CI width.
Potential issues to overcome:
• Autocorrelation, e.g., inventory costs for subsequent weeks lack statistical
independence.
• Initial conditions, e.g., inventory on hand and number of backorders at time 0 would
most likely influence the performance of week 1.


Distinguish the two types of simulation: transient vs. steady state.
Illustrate the inherent variability in a stochastic discrete-event simulation.
Cover the statistical estimation of performance measures.
Discuss the analysis of transient simulations.
Discuss the analysis of steady-state simulations.


Terminating versus non-terminating simulations

Terminating simulation:
• Runs for some duration of time $T_E$, where E is a specified event that stops
the simulation.
• Starts at time 0 under well-specified initial conditions.
• Ends at the stopping time $T_E$.
• Bank example: Opens at 8:30 am (time 0) with no customers present and 8
of the 11 tellers working (initial conditions), and closes at 4:30 pm (time $T_E$
= 480 minutes).
• The simulation analyst chooses to consider it a terminating system because
the object of interest is one day's operation.


Non-terminating simulation:
• Runs continuously, or at least over a very long period of time.
• Examples: assembly lines that shut down infrequently, telephone systems,
hospital emergency rooms.
• Initial conditions defined by the analyst.
• Runs for some analyst-specified period of time TE.
• Study the steady-state (long-run) properties of the system, properties that
are not influenced by the initial conditions of the model.
Whether a simulation is considered to be terminating or non-terminating
depends on both
• The objectives of the simulation study and
• The nature of the system.

2. Stochastic Nature of Output Data

Model output consists of one or more random variables, because the model is an
input-output transformation and the input variables are random variables.
M/G/1 queueing example:
• Poisson arrivals at a rate of 0.1 per minute;
service time ~ $N(\mu = 9.5, \sigma = 1.75)$ minutes.
• System performance: long-run mean of the queue length $L_Q(t)$.
• Suppose we run a single simulation for a total of 5,000 minutes.
– Divide the time interval [0, 5000) into 5 equal subintervals of 1,000
minutes.
– The average number of customers in queue from time $(j-1)1000$ to $j(1000)$ is
$Y_j$.


M/G/1 queueing example (cont.):

• Batched average queue length for 3 independent replications:

Batching interval   Batch   Replication 1   Replication 2   Replication 3
(minutes)           j       Y_1j            Y_2j            Y_3j
[0, 1000)           1       3.61            2.91            7.67
[1000, 2000)        2       3.21            9.00            19.53
[2000, 3000)        3       2.18            16.15           20.36
[3000, 4000)        4       6.92            24.53           8.11
[4000, 5000)        5       2.82            25.19           12.62
[0, 5000)           (avg)   3.75            15.56           13.66

• There is inherent variability in stochastic simulation, both within a single replication
and across different replications.
• The averages across the 3 replications, $\bar{Y}_{1\cdot}, \bar{Y}_{2\cdot}, \bar{Y}_{3\cdot}$, can be regarded as independent
observations, but the averages within a replication, $Y_{11}, \ldots, Y_{15}$, are not.
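
As a small check on the table, the sketch below recomputes the within-replication averages over [0, 5000) from the five batch means of each replication. The variable names are illustrative and do not come from the slides.

```python
# Batch means Y_rj from the table: 3 replications x 5 batches of 1000 minutes.
batch_means = {
    1: [3.61, 3.21, 2.18, 6.92, 2.82],
    2: [2.91, 9.00, 16.15, 24.53, 25.19],
    3: [7.67, 19.53, 20.36, 8.11, 12.62],
}

# Within-replication averages over [0, 5000): one value per replication.
rep_averages = {r: sum(ys) / len(ys) for r, ys in batch_means.items()}

for r, avg in rep_averages.items():
    print(f"replication {r}: average queue length = {avg:.2f}")
```

The three averages differ widely even though the model is identical, which is the across-replication variability the slide refers to.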

3. Absolute Measures

Consider the estimation of a performance parameter, $\theta$ (or $\phi$), of a simulated system.
It is desired to have a "point estimate" and an "interval estimate" of $\theta$ (or $\phi$).
• In many cases, there is an obvious or natural candidate for a point
estimator. The sample mean is such an example.
• Interval estimates expand on point estimates by incorporating the
uncertainty of point estimates.
– Different samples from different intervals may have different means.
– An interval estimate quantifies this uncertainty by computing lower
and upper values with a given level of confidence (i.e., probability).


Simulation output data of the form $\{Y_1, Y_2, \ldots, Y_n\}$ for estimating $\theta$ are referred to
as discrete-time data, because the index n is discrete valued.
Simulation data of the form $\{Y(t),\ 0 \le t \le T_E\}$ for estimating $\phi$ are referred to as continuous-time
data with a time-weighted mean, because the index t is continuous valued.
Point estimation for discrete-time data.
• The point estimator:
$$\hat{\theta} = \frac{1}{n}\sum_{i=1}^{n} Y_i$$
– Is unbiased if its expected value is $\theta$, that is, if $E(\hat{\theta}) = \theta$ (desired).
– Is biased if $E(\hat{\theta}) \ne \theta$.

3. Absolute Measures: Point Estimator

Point estimation for continuous-time data.
• The point estimator:
$$\hat{\phi} = \frac{1}{T_E}\int_{0}^{T_E} Y(t)\,dt$$
– Is biased in general, that is, $E(\hat{\phi}) \ne \phi$.
– An unbiased or low-bias estimator is desired.
Usually, system performance measures can be put into the common framework
of $\theta$ or $\phi$.
• E.g., for the proportion of days on which sales are lost through an out-of-stock
situation, let:
$$Y_i = \begin{cases} 1, & \text{if out of stock on day } i \\ 0, & \text{otherwise} \end{cases}$$
so that the sample mean of the $Y_i$ is the observed proportion of out-of-stock days.
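
The two point estimators can be sketched in a few lines of Python. This is a minimal illustration, assuming the continuous-time output is piecewise constant (as a queue length or WIP level would be); the function names and the data are hypothetical.

```python
def discrete_mean(ys):
    """Sample mean of discrete-time observations Y_1, ..., Y_n."""
    return sum(ys) / len(ys)


def time_weighted_mean(change_times, values, t_end):
    """Time average of a piecewise-constant Y(t) on [0, t_end].

    values[k] holds from change_times[k] until the next change time
    (or t_end for the last one); change_times[0] must be 0.
    """
    area = 0.0
    for k, (t, y) in enumerate(zip(change_times, values)):
        t_next = change_times[k + 1] if k + 1 < len(change_times) else t_end
        area += y * (t_next - t)  # area of the rectangle under Y(t)
    return area / t_end


# Hypothetical data: out-of-stock indicator over 10 days (1 = out of stock).
out_of_stock = [0, 0, 1, 0, 0, 0, 1, 1, 0, 0]
proportion = discrete_mean(out_of_stock)

# Hypothetical queue-length path: 0 on [0, 2), 3 on [2, 5), 1 on [5, 10].
avg_queue = time_weighted_mean([0, 2, 5], [0, 3, 1], t_end=10)
```

Treating the indicator as ordinary discrete-time data is what lets a proportion be handled as a special case of a mean.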


A performance measure that does not fit this common framework is a "quantile" or
"percentile":
$$\Pr\{Y \le \theta\} = p$$
• E.g., p = 0.85: 85% of the customers will experience a delay of $\theta$ minutes or
less. Equivalently, a customer has only a 0.15 probability of experiencing a delay longer
than $\theta$ minutes.
• Estimating quantiles is the inverse of the problem of estimating a proportion or
probability. In estimating a probability, $\theta$ is given and p is to be
estimated; but in estimating a quantile, p is given and $\theta$ is to be estimated.
• Consider a histogram of the observed values Y:
– Find $\hat{\theta}$ such that 100p% of the histogram is to the left of (smaller than) $\hat{\theta}$.
– E.g., if we observe n = 250 customer delays, then an estimate of the 85th
percentile of delay is a value $\hat{\theta}$ such that (0.85)(250) = 212.5 ≈ 213 of the
observed values are less than or equal to $\hat{\theta}$.

3. Absolute Measures: Confidence-Interval Estimation

To understand confidence intervals fully, it is important to distinguish between
measures of error and measures of risk:
• Contrast the confidence interval with a prediction interval (another useful
output-analysis tool).
• Both confidence and prediction intervals are based on the premise that the data
being produced by the simulation are well represented by a probability model.


Consider a manufacturing system producing parts, where the performance measure is
cycle time for parts (time from release into the factory until completion). $Y_{ij}$ is the
cycle time for the jth part produced in the ith replication.

Within-Replication Data                  Across-Replication Data
Y_11   Y_12   ...   Y_1n1                Ybar_1. , S_1^2 , H_1
Y_21   Y_22   ...   Y_2n2                Ybar_2. , S_2^2 , H_2
...                                      ...
Y_R1   Y_R2   ...   Y_RnR                Ybar_R. , S_R^2 , H_R
                                         Ybar.. , S^2 , H

H is the confidence-interval half-width.



Suppose the model is the normal distribution with mean $\theta$ and variance $\sigma^2$ (both
unknown).
• Let $\bar{Y}_{i\cdot}$ be the average cycle time for parts produced on the ith
replication (representing a day of production) of the simulation.
– Therefore, its mathematical expectation is $\theta$, and let $\sigma$ be the day-
to-day variation of the average cycle time.
• Suppose our goal is to estimate $\theta$.
• Average cycle time will vary from day to day, but over the long run
the average of the averages will be close to $\theta$.
• The natural estimator for $\theta$ is the overall sample mean of R
independent replications,
$$\bar{Y}_{\cdot\cdot} = \frac{1}{R}\sum_{i=1}^{R} \bar{Y}_{i\cdot}$$
but it is not $\theta$; it is only an estimate.
• A confidence interval (CI) is a measure of that error.
• Let the sample variance across R replications be:
$$S^2 = \frac{1}{R-1}\sum_{i=1}^{R} (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2$$


Confidence Interval (CI):
• A measure of error.
• Assumes the $\bar{Y}_{i\cdot}$ are normally distributed.
$$\bar{Y}_{\cdot\cdot} \pm t_{\alpha/2,\,R-1}\frac{S}{\sqrt{R}}$$
where $t_{\alpha/2,\,R-1}$ is the quantile of the t distribution with R − 1 degrees of freedom.
• We cannot know for certain how far $\bar{Y}_{\cdot\cdot}$ is from $\theta$, but the CI attempts to bound
that error.
• A confidence level, such as 95%, tells us how much we can trust the interval to actually
bound the error between $\bar{Y}_{\cdot\cdot}$ and $\theta$.
• The more replications we make, the less error there is in $\bar{Y}_{\cdot\cdot}$ (converging to 0
as R goes to infinity).
• Unfortunately, the confidence interval itself may be wrong!


Prediction Interval (PI):
• A measure of risk.
• A good guess for the average cycle time on a particular day is our
estimator, but it is unlikely to be exactly right, as the daily average varies.
• A PI is designed to be wide enough to contain the actual average cycle
time on any particular day with high probability.
• Normal-theory prediction interval:
$$\bar{Y}_{\cdot\cdot} \pm t_{\alpha/2,\,R-1}\, S\sqrt{1 + \frac{1}{R}}$$
• The length of the PI will not go to 0 as R increases, because we can never
simulate away risk.
• The PI's limit is $\theta \pm z_{\alpha/2}\,\sigma$, indicating that no matter how much we simulate,
the daily average still varies.

Example:
• Suppose that the overall average of the average cycle time on
120 replications of a manufacturing simulation is 5.80 hours,
with a sample standard deviation of 1.60 hours.
• Since $t_{0.025,119} = 1.98$, a 95% confidence interval for the long-run
expected daily average cycle time is $5.80 \pm 1.98(1.60/\sqrt{120})$, or
$5.80 \pm 0.29$ hours.
– Our best guess for average cycle time is 5.80 hours, but there
could be as much as ±0.29 hours of error in that estimate.
• On any particular day, we are 95% confident that the average
cycle time for all parts produced on that day will be
$5.80 \pm 1.98(1.60)\sqrt{1 + 1/120} = 5.80 \pm 3.18$ hours!
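
The arithmetic in this example can be reproduced with a short sketch. It assumes the normal-theory formulas for the confidence and prediction intervals and takes the t quantile 1.98 directly from the example rather than computing it; the variable names are illustrative.

```python
import math

R = 120            # number of replications
y_bar = 5.80       # overall average of the daily average cycle times (hours)
s = 1.60           # sample standard deviation across replications (hours)
t_quantile = 1.98  # t_{0.025,119}, taken from the example, not computed here

# Confidence interval: error in the estimate of the long-run mean.
ci_half_width = t_quantile * s / math.sqrt(R)

# Prediction interval: spread of the daily average on one particular day.
pi_half_width = t_quantile * s * math.sqrt(1 + 1 / R)

print(f"95% CI: {y_bar:.2f} +/- {ci_half_width:.2f} hours")
print(f"95% PI: {y_bar:.2f} +/- {pi_half_width:.2f} hours")
```

Increasing R shrinks the CI half-width toward 0, while the PI half-width only approaches `t_quantile * s`.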

4. Output Analysis for Terminating Simulations

A terminating simulation runs over a simulated time interval $[0, T_E]$ and results
in observations $Y_1, \ldots, Y_n$.
The sample size n may be a fixed number or a random variable.
A common goal is to estimate:
$$\theta = E\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right), \quad \text{for discrete output}$$
$$\phi = E\left(\frac{1}{T_E}\int_{0}^{T_E} Y(t)\,dt\right), \quad \text{for continuous output } Y(t),\ 0 \le t \le T_E$$
In general, R independent replications are used, each run using a different
random number stream and independently chosen initial conditions.

4. Output Analysis for Terminating Simulations: Statistical
Background

It is very important to distinguish within-replication data from across-
replication data.
The issue is further confused by the fact that simulation languages provide only
summaries of the measures and not the raw data.
For example, consider simulation of a manufacturing system:
• Two performance measures of that system: cycle time for parts and
work in process (WIP).
• Let $Y_{ij}$ be the cycle time for the jth part produced in the ith replication.
• Across-replication data are formed by summarizing within-replication
data, e.g., the replication average $\bar{Y}_{i\cdot}$.


Across replication:
• For example: the daily cycle time averages (discrete-time data)
– The average:
$$\bar{Y}_{\cdot\cdot} = \frac{1}{R}\sum_{i=1}^{R} \bar{Y}_{i\cdot}$$
– The sample variance:
$$S^2 = \frac{1}{R-1}\sum_{i=1}^{R} (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2$$
– The confidence-interval half-width:
$$H = t_{\alpha/2,\,R-1}\frac{S}{\sqrt{R}}$$
Within replication:
• For example: the WIP (continuous-time data)
– The average:
$$\bar{Y}_{i\cdot} = \frac{1}{T_{E_i}}\int_{0}^{T_{E_i}} Y_i(t)\,dt$$
– The sample variance:
$$S_i^2 = \frac{1}{T_{E_i}}\int_{0}^{T_{E_i}} \left(Y_i(t) - \bar{Y}_{i\cdot}\right)^2 dt$$
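
The across-replication statistics can be computed as sketched below. The function name is hypothetical, and the t quantile is passed in by the caller (for instance from a t table) rather than computed. The input reuses the three replication averages from the M/G/1 table earlier in the slides.

```python
import math


def across_replication_summary(rep_averages, t_quantile):
    """Return (mean, sample variance, CI half-width) across replications.

    t_quantile is t_{alpha/2, R-1}, supplied by the caller (e.g. from a table).
    """
    R = len(rep_averages)
    y_bar = sum(rep_averages) / R
    s2 = sum((y - y_bar) ** 2 for y in rep_averages) / (R - 1)
    half_width = t_quantile * math.sqrt(s2 / R)
    return y_bar, s2, half_width


# Replication averages from the M/G/1 table; t_{0.025,2} = 4.303 from a t table.
y_bar, s2, h = across_replication_summary([3.75, 15.56, 13.66], t_quantile=4.303)
```

With only three replications the half-width exceeds the mean itself, which shows how little precision so few replications provide.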


The overall sample average, $\bar{Y}_{\cdot\cdot}$, and the individual replication sample averages, $\bar{Y}_{i\cdot}$, are
always unbiased estimators of the expected daily average cycle time or daily
average WIP.

Across-replication data are independent (different random numbers) and
identically distributed (same model), but within-replication data do not have
these properties.

4. Output Analysis for Terminating Simulations: C.I. with
Specified Precision

Sometimes we would like to estimate a CI with a specified precision.
The half-length H of a $100(1 - \alpha)\%$ confidence interval for a mean $\theta$, based on
the t distribution, is given by:
$$H = t_{\alpha/2,\,R-1}\frac{S}{\sqrt{R}}$$
where R is the number of replications and $S^2$ is the sample variance.

Suppose that an error criterion $\varepsilon$ is specified with probability $1 - \alpha$. A sufficiently
large sample size should satisfy:
$$P\left(\left|\bar{Y}_{\cdot\cdot} - \theta\right| < \varepsilon\right) \ge 1 - \alpha$$
(in other words, it is desired to estimate $\theta$ by $\bar{Y}_{\cdot\cdot}$ to within $\pm\varepsilon$ with high probability).


Assume that an initial sample of $R_0$ independent replications has been
observed.
Obtain an initial estimate $S_0^2$ of the population variance $\sigma^2$.
Then, choose the sample size R such that $R \ge R_0$:
• Since $t_{\alpha/2,\,R-1} \ge z_{\alpha/2}$, an initial estimate of R is:
$$R \ge \left(\frac{z_{\alpha/2}\, S_0}{\varepsilon}\right)^2$$
where $z_{\alpha/2}$ is the upper $\alpha/2$ point of the standard normal distribution.
• R is the smallest integer satisfying $R \ge R_0$ and
$$R \ge \left(\frac{t_{\alpha/2,\,R-1}\, S_0}{\varepsilon}\right)^2$$
Collect $R - R_0$ additional observations.
The $100(1 - \alpha)\%$ C.I. for $\theta$:
$$\bar{Y}_{\cdot\cdot} \pm t_{\alpha/2,\,R-1}\frac{S}{\sqrt{R}}$$
Call Center Example: estimate the agent's utilization $\rho$ over the first 2 hours of the workday.
• An initial sample of size $R_0 = 4$ is taken, and an initial estimate of the population variance is $S_0^2 =
(0.072)^2 = 0.00518$.
• The error criterion is $\varepsilon = 0.04$ and the confidence coefficient is $1 - \alpha = 0.95$; hence, the final sample
size must be at least:
$$\left(\frac{z_{0.025}\, S_0}{\varepsilon}\right)^2 = \frac{1.96^2 \times 0.00518}{0.04^2} \approx 12.44$$
• For the final sample size, R must be at least as large as the value in the last row:

R                                  13      14      15
t_{0.025, R-1}                     2.18    2.16    2.14
(t_{0.025, R-1} S_0 / ε)^2         15.39   15.10   14.83

• R = 15 is the smallest integer satisfying the error criterion, so $R - R_0 = 11$ additional
replications are needed.
• After obtaining the additional outputs, the half-width should be checked.
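
The search for the final sample size in this example can be sketched as a small loop. The t quantiles are hardcoded from the example's table, so the helper (a hypothetical name) only searches the tabulated values of R; a fuller version would obtain the quantiles from a statistics library.

```python
# t_{0.025, R-1} values as given in the example's table.
T_TABLE = {13: 2.18, 14: 2.16, 15: 2.14}


def required_replications(s0, eps, z, t_table):
    """Smallest tabulated R with R >= (t_{alpha/2,R-1} * s0 / eps)^2.

    Starts from the normal-based lower bound (z * s0 / eps)^2 and only
    searches the R values present in t_table; returns None if none qualifies.
    """
    lower_bound = (z * s0 / eps) ** 2
    for R in sorted(t_table):
        if R < lower_bound:
            continue  # below the normal-based bound, cannot qualify
        if R >= (t_table[R] * s0 / eps) ** 2:
            return R
    return None


R_final = required_replications(s0=0.072, eps=0.04, z=1.96, t_table=T_TABLE)
additional = R_final - 4  # R0 = 4 initial replications were already made
```

The loop stops at R = 15, matching the example, so 11 further replications are needed.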

4. Output Analysis for Terminating Simulations: Quantiles

To present the interval estimator for quantiles:
• It is helpful to look at the interval estimator for a mean in the special
case when the mean represents a proportion or probability, p.
• In this book, a proportion or probability is treated as a special case of a
mean.


When the number of independent replications $Y_1, \ldots, Y_R$ is large enough that
$t_{\alpha/2,\,R-1} \approx z_{\alpha/2}$, the confidence interval for a probability p is often written as:
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{R - 1}}$$
where $\hat{p}$ is the sample proportion.

Quantile estimation is the inverse of the probability-estimation
problem:
p is given.
Find $\theta$ such that $\Pr(Y \le \theta) = p$.


The best way is to sort the outputs and use the (R·p)th smallest value, i.e., find
$\hat{\theta}$ such that 100p% of the data in a histogram of Y is to the left of $\hat{\theta}$.
• Example: If we have R = 10 replications and we want the p = 0.8 quantile,
first sort, then estimate $\theta$ by the (10)(0.8) = 8th smallest value (round if
necessary).

Sorted data:
5.6
7.1
8.8
8.9
9.5
9.7
10.1
12.2 (this is our point estimate)
12.5
12.9
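
The sort-and-pick rule can be sketched as follows. Rounding R·p up to the next integer is one reading of "round if necessary" (it matches the earlier 212.5 → 213 example); the function name is illustrative.

```python
import math


def quantile_estimate(observations, p):
    """Return the ceil(R*p)-th smallest observation as the p-quantile estimate."""
    data = sorted(observations)
    R = len(data)
    # round() guards against floating-point noise such as 8.000000000000002.
    rank = math.ceil(round(R * p, 9))
    rank = min(max(rank, 1), R)  # keep the rank inside 1..R
    return data[rank - 1]


# The ten outputs from the example, deliberately unsorted.
outputs = [9.5, 12.9, 5.6, 10.1, 8.8, 12.2, 7.1, 12.5, 8.9, 9.7]
estimate = quantile_estimate(outputs, p=0.8)
```

For the example data the 8th smallest value is returned, which agrees with the point estimate marked in the sorted list.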

• Confidence interval of quantiles: an approximate $(1 - \alpha)100\%$
confidence interval for $\theta$ can be obtained by finding two values,
$\theta_l$ and $\theta_u$.
• $\theta_l$ cuts off $100p_l\%$ of the histogram (the $Rp_l$-th smallest value of the
sorted data).
• $\theta_u$ cuts off $100p_u\%$ of the histogram (the $Rp_u$-th smallest value of the
sorted data),
where
$$p_l = p - z_{\alpha/2}\sqrt{\frac{p(1 - p)}{R - 1}}$$
$$p_u = p + z_{\alpha/2}\sqrt{\frac{p(1 - p)}{R - 1}}$$
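
A minimal sketch of this interval is given below. Rounding the ranks up and clamping them to the range 1..R are assumptions of the sketch, not something the slides specify, and the data are synthetic.

```python
import math


def quantile_ci(observations, p, z):
    """Approximate CI for the p-quantile from sorted observations.

    z is z_{alpha/2}. Ranks are rounded up and clamped to [1, R]; that
    clamping is an assumption of this sketch, not part of the slides.
    """
    data = sorted(observations)
    R = len(data)
    delta = z * math.sqrt(p * (1 - p) / (R - 1))
    p_l, p_u = p - delta, p + delta
    rank_l = min(max(math.ceil(R * p_l), 1), R)
    rank_u = min(max(math.ceil(R * p_u), 1), R)
    return data[rank_l - 1], data[rank_u - 1]


# Hypothetical data: 250 delays equal to 1, 2, ..., 250, so rank == value.
delays = list(range(1, 251))
lower, upper = quantile_ci(delays, p=0.85, z=1.96)
```

With very few replications (say R = 10 and p = 0.8), $p_u$ exceeds 1, so the interval is only usable when R is reasonably large.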

5. Output Analysis for Steady-State Simulation

Consider a single run of a simulation model to estimate a steady-state or long-
run characteristic of the system.
• The single run produces observations $Y_1, Y_2, \ldots$ (generally the samples of an
autocorrelated time series).
• Performance measure:
$$\theta = \lim_{n \to \infty} \frac{1}{n}\sum_{i=1}^{n} Y_i, \quad \text{for a discrete measure (with probability 1)}$$
$$\phi = \lim_{T_E \to \infty} \frac{1}{T_E}\int_{0}^{T_E} Y(t)\,dt, \quad \text{for a continuous measure (with probability 1)}$$
– Independent of the initial conditions.

5. Output Analysis for Steady-State Simulation

The sample size is a design choice, with several considerations in mind:
• Any bias in the point estimator that is due to artificial or arbitrary initial
conditions (bias can be severe if run length is too short).
• Desired precision of the point estimator.
• Budget constraints on computer resources.
Notation: the estimation of $\theta$ from a discrete-time output process.
• One replication (or run), the output data: $Y_1, Y_2, Y_3, \ldots$
• With several replications, the output data for replication r: $Y_{r1}, Y_{r2}, Y_{r3}, \ldots$

5. Output Analysis for Steady-State Simulation: Initialization
Bias

Methods to reduce the point-estimator bias caused by using artificial and
unrealistic initial conditions:
• Intelligent initialization.
• Divide simulation into an initialization phase and data-collection phase.
Intelligent initialization
• Initialize the simulation in a state that is more representative of long-run
conditions.
• If the system exists, collect data on it and use these data to specify more
nearly typical initial conditions.
• If the system can be simplified enough to make it mathematically solvable,
e.g. queueing models, solve the simplified model to find long-run expected or
most likely conditions, use that to initialize the simulation.


Divide each simulation into two phases:


• An initialization phase, from time 0 to time T0.
• A data-collection phase, from T0 to the stopping time T0+TE.
• The choice of T0 is important:
– After T0, system should be more nearly representative of steady-state
behavior.
• System has reached steady state: the probability distribution of the system
state is close to the steady-state probability distribution (bias of response
variable is negligible).


M/G/1 queueing example: A total of 10 independent replications were made.
• Each replication began in the empty and idle state.
• Simulation run length on each replication was $T_0 + T_E$ = 15,000 minutes.
• Response variable: queue length, $L_Q(t, r)$ (at time t of the rth replication).
• Batching intervals of 1,000 minutes, batch means.
Ensemble averages:
• To identify a trend in the data due to initialization bias.
• The average of the corresponding batch means across the R replications:
$$\bar{Y}_{\cdot j} = \frac{1}{R}\sum_{r=1}^{R} Y_{rj}$$
• The preferred method to determine the deletion point.
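
Ensemble averaging amounts to a column-wise mean over the replications. The sketch below uses made-up batch means and a hypothetical function name, only to show the indexing.

```python
def ensemble_averages(batch_means_by_rep):
    """Average the j-th batch mean across replications, for every batch j."""
    R = len(batch_means_by_rep)
    # zip(*rows) walks the table column by column, i.e. batch by batch.
    return [sum(column) / R for column in zip(*batch_means_by_rep)]


# Hypothetical batch means: 3 replications (rows) x 4 batches (columns).
reps = [
    [1.0, 4.0, 6.0, 6.5],
    [2.0, 5.0, 7.0, 5.5],
    [3.0, 6.0, 5.0, 6.0],
]
ens = ensemble_averages(reps)
```

In this made-up data the ensemble averages rise over the first batches and then level off; a pattern like that in real output would suggest deleting the early batches.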

5. Output Analysis for Steady-State Simulation: Error Estimation

If $\{Y_1, \ldots, Y_n\}$ are not statistically independent, then $S^2/n$ is a biased estimator of
the true variance.
• This is almost always the case when $\{Y_1, \ldots, Y_n\}$ is a sequence of output
observations from within a single replication (autocorrelated sequence,
time series).
Suppose the point estimator $\hat{\theta}$ is the sample mean
$$\bar{Y} = \sum_{i=1}^{n} Y_i / n$$
• The variance of $\bar{Y}$ is almost impossible to estimate.
• A system with a steady state produces an output process that is
approximately covariance stationary (after passing the transient phase).
– The covariance between two random variables in the time series
depends only on the lag (the number of observations between them).


For a covariance stationary time series, $\{Y_1, \ldots, Y_n\}$:
• Lag-k autocovariance is: $\gamma_k = \mathrm{cov}(Y_1, Y_{1+k}) = \mathrm{cov}(Y_i, Y_{i+k})$
• Lag-k autocorrelation is: $\rho_k = \gamma_k / \sigma^2$

If a time series is covariance stationary, then the variance of $\bar{Y}$ is:
$$V(\bar{Y}) = \frac{\sigma^2}{n}\left[1 + 2\sum_{k=1}^{n-1}\left(1 - \frac{k}{n}\right)\rho_k\right]$$
where the bracketed factor is denoted c.

The expected value of the variance estimator is:
$$E\left(\frac{S^2}{n}\right) = B\,V(\bar{Y}), \quad \text{where } B = \frac{n/c - 1}{n - 1}$$

[Figure: three example time series: a stationary series $Y_i$ exhibiting positive autocorrelation, a stationary series $Y_i$ exhibiting negative autocorrelation, and a nonstationary series with an upward trend.]


The expected value of the variance estimator is:
$$E\left(\frac{S^2}{n}\right) = B\,V(\bar{Y}), \quad \text{where } B = \frac{n/c - 1}{n - 1}$$
Here $V(\bar{Y})$ is the variance of $\bar{Y}$ and c is the bracketed autocorrelation factor in $V(\bar{Y})$.

• If the $Y_i$ are independent, then $S^2/n$ is an unbiased estimator of $V(\bar{Y})$.
• If the autocorrelations $\rho_k$ are primarily positive, then $S^2/n$ is biased low as an
estimator of $V(\bar{Y})$.
• If the autocorrelations $\rho_k$ are primarily negative, then $S^2/n$ is biased high as
an estimator of $V(\bar{Y})$.
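
The direction of the bias can be checked numerically. The sketch below evaluates c = 1 + 2 Σ (1 − k/n) ρ_k and B = (n/c − 1)/(n − 1) for an assumed geometric autocorrelation pattern ρ_k = r^k; that pattern and the function name are illustrative choices, not something the slides specify.

```python
def bias_factor(rho, n):
    """Return (c, B) for a covariance-stationary series of length n.

    rho(k) gives the lag-k autocorrelation; c is the bracketed factor in
    V(Ybar) and B = (n/c - 1)/(n - 1) is the ratio E[S^2/n] / V(Ybar).
    """
    c = 1 + 2 * sum((1 - k / n) * rho(k) for k in range(1, n))
    B = (n / c - 1) / (n - 1)
    return c, B


n = 100
# Assumed geometric autocorrelations rho_k = r**k, purely for illustration.
_, B_indep = bias_factor(lambda k: 0.0, n)       # independent observations
_, B_pos = bias_factor(lambda k: 0.5 ** k, n)    # positive autocorrelation
_, B_neg = bias_factor(lambda k: (-0.5) ** k, n) # alternating-sign autocorrelation
```

B equals 1 in the independent case, falls below 1 for positive autocorrelation (S²/n biased low), and rises above 1 for the alternating case (biased high).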

5. Output Analysis for Steady-State Simulation: Replication
Method

Used to estimate point-estimator variability and to construct a confidence
interval.
Approach: make R replications, initializing and deleting from each one in the
same way.
It is important to do a thorough job of investigating the initial-condition bias:
• Bias is not affected by the number of replications; instead, it is affected
only by deleting more data (i.e., increasing $T_0$) or extending the length
of each run (i.e., increasing $T_E$).
The basic raw output data $\{Y_{rj},\ r = 1, \ldots, R;\ j = 1, \ldots, n\}$ may be any of the following:
• An individual observation from within replication r.
• A batch mean from within replication r of some number of discrete-time
observations.
• A batch mean of a continuous-time process over time interval j.

5. Output Analysis for Steady-State Simulation: Batch Means for
Interval Estimation

Using a single, long replication:
• Problem: data are dependent, so the usual estimator is biased.
• Solution: batch means.
Batch means: divide the output data from one replication (after appropriate deletion) into
a few large batches, and then treat the means of these batches as if they were
independent.

A continuous-time process, $\{Y(t),\ T_0 \le t \le T_0 + T_E\}$:
• k batches of size $m = T_E/k$, batch means:
$$\bar{Y}_j = \frac{1}{m}\int_{(j-1)m}^{jm} Y(t + T_0)\,dt$$

A discrete-time process, $\{Y_i,\ i = d+1, d+2, \ldots, n\}$:
• k batches of size $m = (n - d)/k$, batch means:
$$\bar{Y}_j = \frac{1}{m}\sum_{i=(j-1)m+1}^{jm} Y_{i+d}$$

$$\underbrace{Y_1, \ldots, Y_d}_{\text{deleted}},\ \underbrace{Y_{d+1}, \ldots, Y_{d+m}}_{\bar{Y}_1},\ \underbrace{Y_{d+m+1}, \ldots, Y_{d+2m}}_{\bar{Y}_2},\ \ldots,\ \underbrace{Y_{d+(k-1)m+1}, \ldots, Y_{d+km}}_{\bar{Y}_k}$$

Starting with either continuous-time or discrete-time data, the variance of the
sample mean is estimated by:
$$\frac{S^2}{k} = \frac{1}{k}\sum_{j=1}^{k}\frac{(\bar{Y}_j - \bar{Y})^2}{k - 1} = \frac{\sum_{j=1}^{k}\bar{Y}_j^2 - k\bar{Y}^2}{k(k - 1)}$$
If the batch size is sufficiently large, successive batch means will be
approximately independent, and the variance estimator will be
approximately unbiased.
There is no widely accepted and relatively simple method for choosing an acceptable
batch size m (see text for a suggested approach). Some simulation software
does it automatically.
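
For discrete-time data, the batch-means procedure can be sketched as below. The function name and the data are hypothetical, and dropping leftover observations that do not fill a whole batch is a simplifying assumption of the sketch.

```python
def batch_means_variance(ys, d, k):
    """Delete the first d observations, form k batches, and return
    (batch means, grand mean, S^2/k as the variance estimate of the grand mean).

    Trailing observations that do not fill a whole batch are dropped, which
    is a simplifying assumption of this sketch.
    """
    kept = ys[d:]
    m = len(kept) // k  # batch size
    means = [sum(kept[j * m:(j + 1) * m]) / m for j in range(k)]
    grand = sum(means) / k
    s2 = sum((yb - grand) ** 2 for yb in means) / (k - 1)
    return means, grand, s2 / k


# Hypothetical output series: 2 warm-up values followed by 3 batches of 2.
series = [0.0, 0.0, 1.0, 3.0, 4.0, 6.0, 8.0, 10.0]
means, grand, var_hat = batch_means_variance(series, d=2, k=3)
```

A confidence interval would then use `var_hat` with a t quantile on k − 1 degrees of freedom, treating the batch means as approximately independent.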