Cpsc531 Input
Carey Williamson
Department of Computer Science
University of Calgary
Fall 2017
Motivational Quote
2
(Slightly Revised) Motivational Quote
model
“If you can’t measure it, you can’t improve it.”
- Peter Drucker
3
Simulation Input Analysis
4
Data Collection
5
Data Analysis Checklist (meta-level)
6
Data Analysis Checklist (detailed-level)
9
Histograms (2 of 3)
▪ Guideline:
— The number of cells ≈ the square root of the sample size
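The square-root guideline can be tried out with a minimal sketch like the one below. The function name `histogram_counts`, the equal-width binning, and the sample values are illustrative assumptions, not part of the slides.

```python
import math

def histogram_counts(data, num_cells=None):
    """Bin data into equal-width cells; the default cell count is about sqrt(n)."""
    n = len(data)
    if num_cells is None:
        num_cells = max(1, round(math.sqrt(n)))
    lo, hi = min(data), max(data)
    # Fall back to a width of 1.0 when all values are equal.
    width = (hi - lo) / num_cells or 1.0
    counts = [0] * num_cells
    for x in data:
        # The maximum value is placed in the last cell rather than a new one.
        idx = min(int((x - lo) / width), num_cells - 1)
        counts[idx] += 1
    return counts

# Hypothetical sample of 16 observations, so about 4 cells.
sample = [1.2, 3.4, 2.2, 5.1, 4.8, 0.7, 2.9, 3.3,
          4.1, 1.9, 2.5, 3.8, 0.4, 4.4, 2.0, 3.1]
counts = histogram_counts(sample)
print(len(counts), counts)
```

Passing a different `num_cells` for the same data shows how strongly the apparent shape depends on the interval size.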
10
Histograms (3 of 3)
Same data with different interval sizes
11
Selecting the Family of Distributions (1 of 4)
12
Selecting the Family of Distributions (2 of 4)
13
Selecting the Family of Distributions (3 of 4)
14
Selecting the Family of Distributions (4 of 4)
15
Quantile-Quantile Plots (1 of 8)
$F(x_q) = \mathbb{P}(X \le x_q) = q, \quad 0 < q < 1$
16
Quantile-Quantile Plots (2 of 8)
17
Quantile-Quantile Plots (3 of 8)
$F_n(x) = \dfrac{\text{number of } X_i\text{'s} \le x}{n}$
▪ It follows that
$F_n(x) = \dfrac{j}{n}$
where $j$ is the rank or order of $x$, i.e., $x$ is the $j$-th smallest value among the $X_i$'s.
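As a small check of the relation between the empirical CDF and rank, the sketch below evaluates the empirical CDF at each observation. The function name `empirical_cdf` and the sample values are hypothetical, and the observations are assumed to be distinct.

```python
def empirical_cdf(sample, x):
    """Return F_n(x): the fraction of observations less than or equal to x."""
    return sum(1 for xi in sample if xi <= x) / len(sample)

sample = [3.1, 0.8, 2.4, 1.7, 4.0]  # hypothetical observations, all distinct
n = len(sample)
for j, xj in enumerate(sorted(sample), start=1):
    # The j-th smallest observation has rank j, so F_n equals j/n there.
    print(j, xj, empirical_cdf(sample, xj), j / n)
```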
18
Quantile-Quantile Plots (4 of 8)
▪ Problem:
— For finite value $x = X_{(n)}$, we have $F_n^{-1}(1) = X_{(n)}$
— But from the model we generally have: $F^{-1}(1) = \infty$
— How to resolve this mismatch?
19
Quantile-Quantile Plots (5 of 8)
▪ 𝐹(𝑥): the CDF fitted to the observed data, i.e., the model
20
Quantile-Quantile Plots (6 of 8)
— $X_{(j)}$'s are plotted versus $F^{-1}\left(\dfrac{j-0.5}{n}\right)$, where $F$ is the normal CDF with
sample mean (99.93 sec) and sample STD (1.29 sec)
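The plotting positions can be computed with a short sketch such as the following. It assumes a normal model fitted with the sample mean and sample standard deviation, uses Python's `statistics.NormalDist` for the inverse CDF, and runs on hypothetical timing values, since the original data set is not reproduced here. Because $(j-0.5)/n$ always lies strictly between 0 and 1, the model quantile is finite for every observation.

```python
from statistics import NormalDist, mean, stdev

def normal_qq_points(sample):
    """Pair each model quantile F^{-1}((j - 0.5)/n) with the order statistic X_(j)."""
    n = len(sample)
    model = NormalDist(mean(sample), stdev(sample))  # normal CDF fitted to the sample
    ordered = sorted(sample)
    return [(model.inv_cdf((j - 0.5) / n), ordered[j - 1]) for j in range(1, n + 1)]

# Hypothetical times in seconds (an assumption for illustration only).
times = [99.8, 100.5, 98.9, 101.2, 99.4, 100.1, 97.8, 100.9, 99.6, 102.0]
points = normal_qq_points(times)
for model_q, observed in points:
    print(f"{model_q:8.2f} {observed:8.2f}")
```

If the points fall close to a straight line through the origin with slope 1, the normal model is a plausible fit.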
21
Quantile-Quantile Plots (7 of 8)
▪ Example (continued):
Check whether the door installation
times follow a normal distribution.
Straight line, supporting the hypothesis of a normal distribution
Superimposed density function of the normal distribution, scaled by the number of observations, that is, $20 \times f(x)$
22
Quantile-Quantile Plots (8 of 8)
23
Parameter Estimation (1 of 4)
24
Parameter Estimation (2 of 4)
25
Parameter Estimation (3 of 4)
26
Parameter Estimation (4 of 4)
27
Goodness-of-Fit Tests (1 of 2)
28
Goodness-of-Fit Tests (2 of 2)
— Hypothesis:
𝑋 has a Poisson distribution with rate 3.64
29
Chi-Square Test (1 of 11)
Intuition:
30
Chi-Square Test (2 of 11)
Concepts:
▪ Null hypothesis $H_0$:
The observed random variable $X$ conforms to the model distribution
▪ Alternative hypothesis $H_1$:
The observed random variable $X$ does not conform to the model distribution
▪ Test statistic $\chi^2$:
The measure of the difference between sample data and the model
distribution
▪ Significance level 𝛼:
The probability of rejecting the null hypothesis when the null hypothesis is
true. Common values are 0.05 and 0.01.
31
Chi-Square Test (3 of 11)
Approach:
32
Chi-Square Test (4 of 11)
Test Statistic:
$E_i = n \cdot \sum_{a_{i-1} \le x < a_i} p(x)$
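For a discrete model, the expected frequency of a cell is the sample size times the total model probability of the values in that cell. The sketch below illustrates this for a Poisson model with rate 3.64; the helper names and the sample size of 100 are assumptions made for illustration.

```python
import math

def poisson_pmf(x, rate):
    """Poisson probability mass function p(x)."""
    return math.exp(-rate) * rate ** x / math.factorial(x)

def expected_frequency(n, pmf, a_lo, a_hi):
    """E_i = n * sum of p(x) over the integers x with a_lo <= x < a_hi."""
    return n * sum(pmf(x) for x in range(a_lo, a_hi))

n = 100      # assumed sample size
rate = 3.64  # Poisson rate from the hypothesis

# Expected frequency of the cell that groups the values 0 and 1.
e_first = expected_frequency(n, lambda x: poisson_pmf(x, rate), 0, 2)
print(f"{e_first:.2f}")
```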
33
Chi-Square Test (5 of 11)
Test Statistic:
34
Chi-Square Test (6 of 11)
Chi-square distribution curves for $df = 2$, $df = 5$, and $df = 10$
35
Chi-Square Test (7 of 11)
Intuition:
36
Chi-Square Test (8 of 11)
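Critical Value:
▪ For significance level $\alpha$, the critical value $\chi^2_{critical}$ is defined such that:
$\mathbb{P}\left(\chi^2_{k-s-1} \ge \chi^2_{critical}\right) = \alpha$
▪ $\chi^2_{critical} = \chi^2_{k-s-1,\,1-\alpha}$:
the $(1-\alpha)$-quantile of the chi-square distribution with $k - s - 1$ degrees of freedom
Figure: chi-square PDF. The shaded area to the right of $\chi^2_{critical}$ equals $\alpha$. Do not reject to the left of $\chi^2_{critical}$; reject to the right.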
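The critical value is normally read from a table or a statistics library. As a self-contained illustration, the sketch below computes it with the standard library only: the chi-square CDF is evaluated through the series expansion of the regularized lower incomplete gamma function, and the quantile is found by bisection. The function names are hypothetical. The degrees of freedom are passed in directly; the usual reading of $k - s - 1$, assumed here, is that $k$ is the number of cells and $s$ the number of parameters estimated from the data.

```python
import math

def chi2_cdf(x, df):
    """Chi-square CDF: regularized lower incomplete gamma P(df/2, x/2), by series."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 0
    while term > 1e-15 * total:
        k += 1
        term *= z / (a + k)  # next term: z^k / (a (a+1) ... (a+k))
        total += term
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def chi2_critical(df, alpha):
    """Return the (1 - alpha)-quantile of the chi-square distribution by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < 1.0 - alpha:
        hi *= 2.0  # widen the bracket until it contains the quantile
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

for df in (2, 5, 10):
    print(df, round(chi2_critical(df, 0.05), 3))
```

The results can be compared against a printed chi-square table as a sanity check.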
37
Chi-Square Test (9 of 11)
▪ Interpretation:
— If the test statistic is at most the critical value,
the null hypothesis is not rejected
— If the test statistic is greater than the critical value,
the null hypothesis is rejected
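The decision rule can be walked through end to end with a sketch like the one below. It assumes the standard Pearson form of the statistic, the sum over cells of $(O_i - E_i)^2 / E_i$, tests a Poisson model with rate 3.64, and uses illustrative observed counts, since the data table is not reproduced in this text. The critical value 11.07 is the table value for 5 degrees of freedom at $\alpha = 0.05$.

```python
import math

def poisson_pmf(x, rate):
    """Poisson probability mass function p(x)."""
    return math.exp(-rate) * rate ** x / math.factorial(x)

def chi_square_statistic(observed, expected):
    """Pearson statistic: sum over cells of (O_i - E_i)^2 / E_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Illustrative counts for the cells {0,1}, 2, 3, 4, 5, 6, {7 or more}.
observed = [22, 19, 17, 10, 8, 7, 17]
n = sum(observed)
rate = 3.64

probs = [poisson_pmf(0, rate) + poisson_pmf(1, rate)]
probs += [poisson_pmf(x, rate) for x in range(2, 7)]
probs.append(1.0 - sum(probs))  # open-ended last cell takes the remaining probability
expected = [n * p for p in probs]

statistic = chi_square_statistic(observed, expected)
critical = 11.07  # table value for df = 7 - 1 - 1 = 5 at alpha = 0.05
reject = statistic > critical
print(f"chi-square = {statistic:.2f}, critical = {critical}, reject H0: {reject}")
```

Here one parameter (the rate) is treated as estimated from the data, which is why the degrees of freedom are taken as 7 - 1 - 1 = 5.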
40
Kolmogorov-Smirnov Test
▪ Intuition:
— Formalizes the idea behind examining a Q-Q plot
— The test compares the CDF of the hypothesized
distribution with the empirical CDF of the sample
observations, based on the maximum distance between
the two cumulative distribution functions.
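The maximum distance between the two CDFs can be computed with a minimal sketch like the following. The function name `ks_statistic`, the uniform model, and the sample values are illustrative assumptions. Because the empirical CDF is a step function, the gap is checked just before and just after each jump.

```python
def ks_statistic(sample, cdf):
    """Largest vertical gap between the empirical CDF and the model CDF."""
    ordered = sorted(sample)
    n = len(ordered)
    d = 0.0
    for j, x in enumerate(ordered, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (j - 1)/n to j/n at x; check both sides.
        d = max(d, j / n - f, f - (j - 1) / n)
    return d

sample = [0.1, 0.35, 0.5, 0.8, 0.95]  # hypothetical observations in [0, 1]
d_uniform = ks_statistic(sample, lambda x: x)  # Uniform(0, 1) CDF is F(x) = x on [0, 1]
print(round(d_uniform, 3))
```

The resulting distance would then be compared with a Kolmogorov-Smirnov critical value for the chosen significance level and sample size.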
41
Selecting Model without Data (1 of 2)
42
Selecting Model without Data (2 of 2)
44