Kriging Metamodeling in Constrained Simulation Optimization: An Explorative Study
Kriging to simulation experimentation. Popular software for obtaining an LHD includes Crystal Ball, @Risk, and Risk Solver; also see the references in (Kleijnen 2007).

[Figure 2: grid with rows s = 20, 21, ..., 39 and columns S = 40, 42, ..., 78; exactly one design point is marked in each row and in each column (every row sum and column sum equals 1; total 20).]
Figure 2: A latin hypercube design for two inputs

2 KRIGING: BASICS

Kriging is an interpolation method that predicts unknown [...]

$$\hat{Y}(x_0) = \sum_{i=1}^{n} \lambda_i \cdot Y(x_i) = \lambda' \cdot Y \quad \text{with} \quad \sum_{i=1}^{n} \lambda_i = 1. \qquad (2)$$

The weights $\lambda = (\lambda_1, \ldots, \lambda_n)'$ in (2) depend on the distances between the input to be predicted $x_0$ and the inputs already observed $x_i$. Kriging assumes that the closer the input data are, the more positively correlated the prediction errors are. This assumption is modeled through the correlogram (or the related variogram). In simulation, a popular class of correlograms is

$$\rho(h) = \prod_{j=1}^{k} \exp\left(-\theta_j \cdot |h_j|^{p_j}\right) \qquad (3)$$

where $k$ denotes the number of inputs, $h = (h_1, \ldots, h_k)'$ the distance vector between two inputs, say $x_i$ and $x_{i'}$, $\theta_j$ the importance of input $j$ (that is, the higher $\theta_j$ is, the less effect input $j$ has), and $p_j$ the smoothness of the correlogram function. Often, these powers $p_j$ are chosen as $p_j = p = 2$. Then, the resulting correlogram is the infinitely differentiable so-called “Gaussian” correlation function.

The criterion to select the weights $\lambda$ is the mean-squared prediction error $\sigma_e^2$, defined as

$$\sigma_e^2 = E\left(\left(\hat{Y}(x_0) - Y(x_0)\right)^2\right). \qquad (4)$$

Minimizing (4) under the constraint (2) gives the optimal weights

$$\lambda' = \left(\gamma + \mathbf{1}\,\frac{1 - \mathbf{1}'\Gamma^{-1}\gamma}{\mathbf{1}'\Gamma^{-1}\mathbf{1}}\right)' \Gamma^{-1} \qquad (5)$$

where $\gamma$ denotes the vector of (co)variances $(\gamma(x_0 - x_1), \ldots, \gamma(x_0 - x_n))'$, $\Gamma$ denotes the $n \times n$ matrix [...]

$$2\hat{\gamma}(h) = \frac{1}{|N(h)|} \sum_{N(h)} \left(Y(x_i) - Y(x_j)\right)^2 \qquad (7)$$

where $|N(h)|$ denotes the number of distinct pairs in $N(h) = \{(x_i, x_j) : x_i - x_j = h;\ i, j = 1, \ldots, n\}$; see Matheron (1962). These estimates imply estimates of the corresponding correlations, so the parameters $\theta_j$ and $p_j$ in (3) can be fitted. For this fitting, standard Kriging software uses Maximum Likelihood Estimation (MLE); Van Beers and Kleijnen (2003), however, use Weighted Least Squares (WLS) estimation for a linear correlogram function. These estimated covariances are substituted into (5) to estimate the optimal weights. The statistical complications
Biles, Kleijnen, van Beers, and van Nieuwenhuyse
arising from this estimation are discussed in Den Hertog, Kleijnen, and Siem (2006).

3 KRIGING IN CONSTRAINED OPTIMIZATION

Kleijnen and van Beers (2004) applied Kriging for sensitivity analysis of stochastic simulation with a single output. Now we explore the use of Kriging in constrained optimization.
We studied two experiments with an (s, S) inventory system. In such a system, a replenishment order is placed as soon as the inventory position (= on-hand inventory + outstanding orders - backlogs) drops to or below the reorder point s. This replenishment order brings the inventory position back to the order-up-to level S. Note that S consequently denotes the maximum inventory position. We define an auxiliary variable Q = S - s (the actual order size is a random variable that is obviously at least equal to Q).
In general, the objective of our optimization problem is to find those values of s and S (or, equivalently, s and Q) that minimize a given cost function subject to a number of constraints:

min y0(X)                                   (8)
such that
yj(X) ≥ aj (j = 1, …, m)
x1 ≥ 0, x2 ≥ 0

where the vector X = (x1, x2) refers to an (s, S) (or, equivalently, (s, Q)) combination.
In our two experiments we studied different objective functions and different constraints. In the next two subsections, we describe these two experiments and their results.

3.1 Experiment 1: An (s, S) Inventory System With Holding Cost And Shortage Cost Constraints

In the first experiment, the objective is to minimize the expected “total” cost y0(X) (defined as the sum of holding, shortage and ordering cost) subject to limitations on the expected holding cost y1(X) (as there is a scarcity of storage space for the product) and shortage cost y2(X) (as management wishes to maintain customer satisfaction). More specifically, we have:

min y0(X)                                   (9)
such that
y1(X) ≤ 25
y2(X) ≤ 10
x1 ≥ 0, x2 ≥ 0

where x1 stands for s, and x2 stands for S. The parameters for this (s, S) inventory control model were as follows:
• The simulated period was 120 days, with costs maintained on a $/day basis.
• The ordering cost was $32 per order plus $3 per unit ordered.
• The order lead-time was uniformly distributed between 0.5 and 1: U(0.5, 1).
• The inventory review interval was one day.
• Demand was Poisson distributed with a rate of λ = 10 customers/day.
• The number of units demanded per customer varied between 1 and 4 and followed the (cumulative) distribution (0.17, 1; 0.5, 2; 0.83, 3; 1, 4).
• The holding cost was $1/unit/day.
• The shortage cost was $5/unit/day.
An Arena-based (s, S) model with these parameters was simulated for r = 5 replications at each design point in the 20-point LHD shown earlier in Figure 2; each replicate started with an inventory of S units. The average results for holding, shortage and total cost are displayed in Table 1; the three rightmost columns show whether the design point violates the holding cost constraint (CVh), the shortage cost constraint (CVs), or either of the two (CV, the Boolean OR of CVh and CVs). The last column shows that one-half of the 20 design points turned out to be feasible, which indicates that the design points were well placed in the experimental region.

Table 1: Simulation results for the LHD applied to the (s, S) inventory system.

Trial   s    S    Hold    Short   Total    CVh  CVs  CV
  1    20   48   11.60   18.84   124.99    0    1    1
  2    21   58   16.53   13.33   119.63    0    1    1
  3    22   68   21.45   10.49   121.41    0    1    1
  4    23   78   26.69    8.88   123.27    1    0    1
  5    24   40    9.73   15.04   128.18    0    1    1
  6    25   50   14.18   12.23   122.06    0    1    1
  7    26   60   18.89    8.68   118.24    0    0    0
  8    27   70   25.03    6.27   120.27    1    0    1
  9    28   56   17.71    8.01   120.62    0    0    0
 10    29   42   12.16    8.95   125.62    0    0    0
 11    30   52   16.98    7.46   121.96    0    0    0
 12    31   62   20.79    6.06   120.23    0    0    0
 13    32   72   27.38    4.03   124.79    1    0    1
 14    33   66   24.08    4.18   119.68    0    0    0
 15    34   44   14.16    6.68   126.22    0    0    0
 16    35   54   20.41    3.86   125.45    0    0    0
 17    36   64   24.63    3.55   128.18    0    0    0
 18    37   74   30.60    1.98   125.95    1    0    1
 19    38   76   31.70    2.35   126.97    1    0    1
 20    39   46   15.95    4.91   127.47    0    0    0

The graphical function of Minitab (2007) gave the response surfaces shown in Figure 3. The total cost function y0(X) is very well behaved and follows a nearly quadratic form (Minitab uses all design points to fit the surface; hence, we observe a slight departure from a perfect quadratic form). The surfaces for the constraint functions for holding cost and shortage cost are highly planar in the region of interest defined by 10 ≤ x1 ≤ 40 and 40 ≤ x2 ≤ 90.
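The constraint-violation columns of Table 1 follow mechanically from the limits in (9), so they can be recomputed from the reported averages. The sketch below does this in Python; the names `ROWS`, `HOLD_MAX`, `SHORT_MAX`, and `violations` are illustrative and do not come from the original study, and the check uses only the 20 design-point averages, not any fitted metamodel.

```python
# (trial, s, S, hold, short, total): averages copied from Table 1.
ROWS = [
    (1, 20, 48, 11.60, 18.84, 124.99), (2, 21, 58, 16.53, 13.33, 119.63),
    (3, 22, 68, 21.45, 10.49, 121.41), (4, 23, 78, 26.69, 8.88, 123.27),
    (5, 24, 40, 9.73, 15.04, 128.18), (6, 25, 50, 14.18, 12.23, 122.06),
    (7, 26, 60, 18.89, 8.68, 118.24), (8, 27, 70, 25.03, 6.27, 120.27),
    (9, 28, 56, 17.71, 8.01, 120.62), (10, 29, 42, 12.16, 8.95, 125.62),
    (11, 30, 52, 16.98, 7.46, 121.96), (12, 31, 62, 20.79, 6.06, 120.23),
    (13, 32, 72, 27.38, 4.03, 124.79), (14, 33, 66, 24.08, 4.18, 119.68),
    (15, 34, 44, 14.16, 6.68, 126.22), (16, 35, 54, 20.41, 3.86, 125.45),
    (17, 36, 64, 24.63, 3.55, 128.18), (18, 37, 74, 30.60, 1.98, 125.95),
    (19, 38, 76, 31.70, 2.35, 126.97), (20, 39, 46, 15.95, 4.91, 127.47),
]

HOLD_MAX = 25.0   # right-hand side of y1(X) <= 25 in (9)
SHORT_MAX = 10.0  # right-hand side of y2(X) <= 10 in (9)


def violations(hold, short):
    """Return (CVh, CVs, CV) as 0/1 flags; CV is the Boolean OR."""
    cvh = int(hold > HOLD_MAX)
    cvs = int(short > SHORT_MAX)
    return cvh, cvs, int(cvh or cvs)


# Design points that satisfy both constraints.
feasible = [row for row in ROWS if violations(row[3], row[4])[2] == 0]

# Feasible design point with the lowest average total cost.
best = min(feasible, key=lambda row: row[5])
```

Among the design points alone, the cheapest feasible combination is trial 7 (s = 26, S = 60), which gives a starting reference for the metamodel-based optima discussed next.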
Excel Solver gave as the optimal solution: x1 = 25, x2 = 63, y0 = $118.47, y1 = $20.56, and y2 = $8.13. Hence, a response surface approach to optimization yielded an unconstrained [...] variables s, S, and Total in Table 1.

[Figure 3, panel (a): Holding cost plot (surface plot of Hold vs s, S_1)]
[Figure 3, panel (b): Shortage cost plot (surface plot of Short vs s, S_1)]
[Figure 3, panel (c): surface plot of Total vs s, S_1]
Figure 3. Plots for (a) Holding cost, (b) Shortage cost, and (c) Total cost for the (s, S) inventory system, resulting from a 20-point LHD.

Figure 4 shows the plot for the total cost produced through a 20-by-20 grid of Kriging predictions. The resulting optimum is s = 27, S = 60, y0 = $118.56, y1 = $19.30, and y2 = $7.94. Again, neither of the constraints is binding.
However, this grid used only the even values of S over the range from 40 to 80. To find a possibly better solution, we conducted a localized Kriging search for s = 26, 27, and 28 and S = 59, 60, and 61. These Kriging predictions confirmed the above optimum.

[Figure 4: Kriging predictions of total cost plotted against little s and Big S]
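To make the Kriging predictions for total cost more concrete, the following is a minimal Python sketch of an ordinary Kriging predictor of the form (2) with the Gaussian correlogram (3), applied to the Total column of Table 1. It is not the software used in the study. Several choices are assumptions made only for illustration: the weights are computed in the correlation form of (5) (with a correlation matrix and correlation vector in place of Γ and γ), the inputs are rescaled to the unit square, and the values in `THETA` are picked by hand rather than fitted by MLE. All function names are hypothetical.

```python
import math

# (s, S, average total cost) for the 20 design points of Table 1.
DESIGN = [
    (20, 48, 124.99), (21, 58, 119.63), (22, 68, 121.41), (23, 78, 123.27),
    (24, 40, 128.18), (25, 50, 122.06), (26, 60, 118.24), (27, 70, 120.27),
    (28, 56, 120.62), (29, 42, 125.62), (30, 52, 121.96), (31, 62, 120.23),
    (32, 72, 124.79), (33, 66, 119.68), (34, 44, 126.22), (35, 54, 125.45),
    (36, 64, 128.18), (37, 74, 125.95), (38, 76, 126.97), (39, 46, 127.47),
]

THETA = (20.0, 20.0)  # ASSUMED correlogram parameters, not fitted by MLE


def scale(s, S):
    """Map (s, S) to the unit square spanned by the design (assumption)."""
    return ((s - 20) / 19.0, (S - 40) / 38.0)


def corr(u, v):
    """Gaussian correlogram (3) with p_j = 2."""
    return math.exp(-sum(t * (a - b) ** 2 for t, a, b in zip(THETA, u, v)))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def kriging_weights(s, S):
    """Ordinary Kriging weights; they sum to 1 as required by (2)."""
    pts = [scale(p[0], p[1]) for p in DESIGN]
    x0 = scale(s, S)
    R = [[corr(u, v) for v in pts] for u in pts]  # correlations among design points
    r0 = [corr(x0, u) for u in pts]               # correlations with the new point
    a = solve(R, r0)
    b = solve(R, [1.0] * len(pts))
    mult = (1.0 - sum(a)) / sum(b)                # enforces the sum-to-one constraint
    return [ai + mult * bi for ai, bi in zip(a, b)]


def predict(s, S):
    """Kriging prediction of total cost: weighted sum of observed averages."""
    lam = kriging_weights(s, S)
    return sum(w * p[2] for w, p in zip(lam, DESIGN))
```

At a design point such as (26, 60) this predictor reproduces the observed average, because Kriging interpolates the data exactly; predictions at other points depend on the assumed `THETA` and will generally differ from the values reported above.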
3.2 Experiment 2: An Integrated Production-Inventory System with an (s, S) Policy and a Service Level Constraint

The second experiment illustrates the application of Kriging to determine the optimal values of s and S in an integrated production-inventory system. In this system, every time a replenishment order is triggered, the production system needs to first produce the order before it ends up in inventory.
The optimal policy is defined as the policy that minimizes expected partial cost per time unit (consisting of inventory holding cost and ordering cost), subject to the constraint of a target Customer Service Level (CSL) (defined as the percentage of units ordered that could be immediately delivered from stock).
The production time per product unit at the factory is [...]

[...] period was determined by means of Welch’s method (Law and Kelton 2000, p. 520). To estimate the partial cost per time unit and the CSL for a given (s, Q) combination, we used the replication-deletion approach described in (Law and Kelton 2000, p. 525).
The initial number of replications for any input combination is set at two. Next, extra replications are added sequentially, to obtain a 95% confidence interval with a target relative error of 0.01 for the partial cost per time unit for a given (s, Q) combination. So, based on $m_0$ replications, we obtain

$$\bar{y} \pm t_{m_0 - 1;\, 1 - \alpha/2} \sqrt{\frac{S_y^2}{m_0}} \qquad (10)$$

with [...]
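The sequential rule described above (start with two replications, then add replications until the 95% confidence interval reaches a relative error of 0.01) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' procedure: it takes the usual sample variance for the variance term in (10), compares the half-width divided by the absolute mean directly with the target, and reads Student t quantiles from a small table (falling back to 1.96 beyond 30 degrees of freedom). The `simulate` callable and the `m_max` cap are hypothetical.

```python
import math
import statistics

# Student t quantiles t_{df; 0.975} for df = 1..30 (two-sided 95% interval).
T_975 = [
    12.706, 4.303, 3.182, 2.776, 2.571, 2.447, 2.365, 2.306, 2.262, 2.228,
    2.201, 2.179, 2.160, 2.145, 2.131, 2.120, 2.110, 2.101, 2.093, 2.086,
    2.080, 2.074, 2.069, 2.064, 2.060, 2.056, 2.052, 2.048, 2.045, 2.042,
]


def t_quantile_975(df):
    """Table lookup; uses the normal quantile 1.96 as an approximation for df > 30."""
    return T_975[df - 1] if df <= 30 else 1.96


def relative_half_width(outputs):
    """Half-width of the interval in (10) divided by |mean| (mean must be nonzero)."""
    m = len(outputs)
    mean = statistics.mean(outputs)
    half = t_quantile_975(m - 1) * math.sqrt(statistics.variance(outputs) / m)
    return half / abs(mean)


def replicate_until_precise(simulate, target=0.01, m_init=2, m_max=1000):
    """Add replications one at a time until the relative error target is met.

    `simulate` is a hypothetical zero-argument callable returning one
    replication's output; `m_max` is an assumed safety cap.
    """
    outputs = [simulate() for _ in range(m_init)]
    while relative_half_width(outputs) > target and len(outputs) < m_max:
        outputs.append(simulate())
    return outputs
```

With only two replications the t quantile is 12.706, so the interval is wide unless the two outputs nearly coincide; this is why the rule typically adds replications for noisy input combinations.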
(s*, Q*) = (55, 48) so S* = 55 + 48 = 103, yielding a predicted partial cost y0 of 5.6472 and a CSL of 0.9511.
The DACE toolbox can also be used to estimate the gradients of the goal (or objective) function (partial cost in this case) and the constraint (CSL) at the predicted optimum (s* = 55, Q* = 48); see Lophaven et al. (2002, pp. 15-16). This yields the gradients displayed in Table 2.

Table 2. Gradients of the partial cost function and the CSL function, determined through DACE

                          partial cost          CSL
∇f = (∂f/∂s, ∂f/∂Q)       (0.0930, -0.0616)     (0.0040, -0.0027)
‖∇f‖                      0.1116                0.0048
∇f / ‖∇f‖                 (0.8338, -0.5520)     (0.8313, -0.5558)

Dividing both gradients by their norm in Table 2 shows that the gradients point roughly in the same direction, as the Karush-Kuhn-Tucker (KKT) first-order optimality conditions require. In future research, we shall apply a statistical procedure to test whether the KKT conditions for constrained optimization indeed hold in the predicted optimum; see Bettonvil et al. (2007).

Step 3: confirmatory simulation
Finally, we compare the optima predicted by the Kriging model and an exhaustive simulation search. We therefore simulated each integer (s, Q) combination in the neighborhood of the predicted optimum; i.e., we chose to explore the experimental area defined by 54 < s < 57 and 46 < Q < 51, with a step size of 1 for both s and Q. This experiment gives as the optimum (s*, Q*) = (56, 50) with a simulated partial cost y0 of 5.6218 and a simulated CSL of 0.95.
The minimum partial cost predicted by Kriging is close to the minimum simulated. The locations of the two optima, however, do not coincide. Further analysis reveals that at (s*, Q*) = (56, 50) Kriging predicts a partial cost y0 of 5.6191, which is very close to the simulated partial cost; however, the Kriging model for CSL predicts a CSL of 0.9495, so the Kriging model considers this point to be infeasible.
We believe that the latter result indicates that the precision of the Kriging methodology for constrained optimization may benefit substantially from efforts to further improve the Kriging prediction for the constraint in the neighborhood of the constraint’s cut-off value. This issue will be further explored in future research.

4 CONCLUSIONS

This paper considered the problem of constrained optimization in stochastic simulation. It described two experiments in which Kriging was applied to solve this problem. Until now, Kriging has been applied chiefly to deterministic simulations. Van Beers and Kleijnen (2003) have demonstrated its application to sensitivity analysis of stochastic simulations with a single response. The present paper demonstrates that Kriging has potential for constrained optimization in stochastic simulation, though a number of issues remain to be investigated.
Firstly, future research might aim at how to improve the precision of the Kriging model for the constrained response, in the neighborhood where the constraint is binding; i.e., are the slacks significantly positive or negative? Secondly, additional work is underway to apply methods developed in Mathematical Programming to the Kriging approximations.
Simulation optimization remains a problem solved by heuristics; i.e., it is impossible to identify a truly optimum solution in stochastic simulation. For example, in multiple comparison and ranking procedures the classical assumption is that the simulation outputs are normally distributed with constant variance; hence, the procedures can only make probability statements about the optimum. In constrained simulation optimization, the problem is even more complicated. When dealing with an estimated optimum solution at a boundary, we are faced with a more complicated probability statement involving possible violations of the constraints. In current research we are using a t statistic to test the feasibility of a candidate solution, and a bootstrapped statistic to test whether the KKT conditions hold at that solution.
Unlike many other simulation optimization heuristics (which have been discussed at recent Winter Simulation Conferences), our heuristic assumes neither many simulated factor combinations nor many replications (per combination). An advantage is that our heuristic is appropriate for computationally expensive simulation experiments. A disadvantage is that we have not yet succeeded in proving the convergence of our heuristic to the true optimum.

ACKNOWLEDGMENTS

Professor Biles received a Visitor’s Grant from the Netherlands Organization for Scientific Research (NWO) to do research at Tilburg University during 4½ months in 2007.

REFERENCES

Bettonvil, B., E. del Castillo, and J. P. C. Kleijnen. 2007. Statistical testing of optimality conditions in multiresponse simulation-based optimization. Working Paper, Tilburg University, Tilburg, Netherlands.
Cressie, N. A. C. 1993. Statistics for spatial data. John Wiley & Sons, Inc., New York.
den Hertog, D., J. P. C. Kleijnen, and A. Y. D. Siem. 2006. The correct Kriging variance estimated by bootstrapping. Journal of the Operational Research Society, 57(4):400-409.
Kelton, W. D., R. P. Sadowski, and D. T. Sturrock. 2007. Simulation with Arena, fourth edition. McGraw-Hill, New York.
Kleijnen, J. P. C. 2007. DASE: Design and analysis of simulation experiments. Springer, New York.
Kleijnen, J. P. C. and W. C. M. van Beers. 2004. Application-driven sequential designs for simulation experiments: Kriging metamodeling. Journal of the Operational Research Society, 55(9):876-883.
Law, A. M. and W. D. Kelton. 2000. Simulation modeling and analysis, third edition. McGraw-Hill, Boston, MA.
Lophaven, S. N., H. B. Nielsen, and J. Sondergaard. 2002. DACE: A Matlab Kriging toolbox. Technical Report IMM-TR-2002-12.
Matheron, G. 1962. Traité de géostatistique appliquée. Memoires du Bureau de Recherches Geologiques et Minieres, Editions Technip, Paris, 14:57-59.
McKay, M. D., R. J. Beckman, and W. J. Conover. 1979. A comparison of three methods for selecting input values in the analysis of output from a computer code. Technometrics, 21(2):239-245.
Minitab, Inc. 2007. Minitab 15 statistical software. State College, PA.
The MathWorks, Inc. 2001. Matlab: the language of technical computing. Novi, Michigan.
Van Beers, W. C. M. and J. P. C. Kleijnen. 2003. Kriging for interpolation in random simulation. Journal of the Operational Research Society, 54:255-262.
Wackernagel, H. 2003. Multivariate geostatistics. Springer-Verlag, Berlin.

AUTHOR BIOGRAPHIES

[...] the “Operations Research Group” of the “Center for Economic Research (CentER)”. He also teaches at the Eindhoven University of Technology, in the Postgraduate International Program in Logistics Management Systems. He is an “external fellow” of the Mansholt Graduate School of Social Sciences of Wageningen University. His research concerns the statistical design and analysis of simulation experiments, information systems, and supply chains. He has been a consultant for several organizations in the USA and Europe, and serves on many international editorial boards and scientific committees. He spent several years in the USA, at universities and private companies. He received a number of international awards, including the INFORMS Simulation Society's “Lifetime Professional Achievement Award (LPAA)” of 2005. More information can be found on his website at <http://center.uvt.nl/staff/kleijnen/>.

WIM VAN BEERS is a researcher in the Department of Information Management of Tilburg University. His primary research area is Kriging for interpolation in random simulation. He is also a member of the Department of Mathematics and Computer Science of Eindhoven University of Technology. His e-mail address is <wvbeers@uvt.nl>.

INNEKE VAN NIEUWENHUYSE is an Assistant Professor at the Faculty of Economics and Applied Economics, Department of Decision Sciences and Information Management at K.U.Leuven (Belgium). Her research focuses on the performance analysis of production and supply chain systems, using queuing theory and simulation. Her e-mail address is <inneke.vannieuwenhuyse@econ.kuleuven.be>.