Production-inventory systems with imperfect advance demand information and updating
August 8, 2006
Abstract
We consider a supplier with finite production capacity and stochastic production times that
produces a single product. Customers provide advance demand information (ADI) to the sup-
plier by announcing orders ahead of their due dates. However, this information is not perfect,
and customers may request an order be fulfilled prior to or later than the expected due date
or they may decide to cancel. Hence, the demand leadtime (the time between when an or-
der is announced and when it becomes due or is canceled) is random. Customers update the
status of their orders, but the time between consecutive updates is random as well. We con-
sider several schemes through which ADI is revealed and updated. For each, we formulate the
production-control problem as a continuous-time Markov decision process and prove that there
is an optimal (among all policies) state-dependent base-stock policy, where the base-stock levels
depend upon the number of orders at various stages of update. In addition, we show that the
state-dependent base-stock levels are non-decreasing in the number of orders in each stage of
update. In a numerical study, we examine the benefit of ADI to both supplier and customers
and study the effect of having full versus partial ADI. We also compare the performance of a
class of simple heuristics to that of an optimal policy.
Keywords: Advance demand information, production-inventory systems, make-to-stock queues,
continuous-time Markov decision processes
1 Introduction
By sharing advance demand information (ADI), firms can reduce inventory costs, improve service
levels, and help coordinate the supply chain. Large manufacturers such as Dell and retailers such
as Wal-Mart have implemented sophisticated processes that enable them to share inventory usage
and point-of-sale (POS) data in real time with thousands of their suppliers. According to a recent
survey, more than 50% of manufacturers in the personal computer industry now provide demand
information to their suppliers (Thonemann, 2002). Initiatives such as the inter-industry consor-
tium on Collaborative Planning, Forecasting and Replenishment (CPFR) offer firms a framework
for sharing demand forecasts and for coordinating production and inventory control decisions. The
emergence of software standards such as the Extended Markup Language (XML) and new com-
munication technologies such as Radio Frequency Identification (RFID) is expected to enable even
greater levels of information sharing and coordination.
There are a number of ways firms may share ADI. For instance, a manufacturer may share
forecasts of future demand with its suppliers, a retailer may provide inventory usage information
to its distributors, or customers may place orders ahead of their due dates with a manufacturer.
In each case, ADI consists of additional information regarding the timing or quantity of future
demand. This information can be perfect (exact information about future orders) or less than
perfect (estimates of timing or quantity of future orders). The information can also be either
explicit, with customers directly stating their intent about future orders, or implicit, with customers
allowing the suppliers to observe their internal operations and to deduce estimates of future orders
(e.g., suppliers observe their customers’ POS data).
The benefit of ADI is intuitively clear. By having more information about future demand, the
supplier can reduce the need for inventory or excess capacity. Customers may also benefit, directly
through improved quality of service from the supplier, or indirectly through lower supplier costs.
However, the availability of ADI also raises several questions. How should the supplier use ADI
to make actual decisions? How valuable is ADI to both supplier and customers, and how is
this value affected by operating characteristics of the supplier and the quality of the information
provided by the customer? Are there significant benefits to receiving information further in advance
of due dates? Are there simple, effective policies that take advantage of ADI?
In this paper, we address these and other related questions for a supplier that produces a single
product. Customers furnish the supplier with advance demand information by announcing orders
ahead of their due dates. However, this information is not perfect, and orders may become due
prior to or later than the announced expected due date or they can be canceled altogether. Hence,
the demand leadtime (the time between when an order is announced and when it is requested or
canceled) is random. Customers provide status updates as their orders progress towards becoming
due, but the time between consecutive updates is also random. We are primarily motivated by
settings where the demand information is provided by the customers to the supplier implicitly. The
supplier observes the internal operations of its customers (e.g., order fulfillment, manufacturing, or
inventory usage) and uses this information to estimate when customers will eventually place orders.
We will sometimes refer to such internal operations as the demand leadtime system.
We consider three schemes through which ADI is revealed. In the first, actual due dates of
orders are independent, and orders that are announced later can become due before those that
are announced earlier. Similarly, updates are independent and do not follow a first-announced,
first-updated rule. We refer to this as the system with independent due dates (IDD). In the second
scheme, announced orders are updated and the orders become due in the sequence in which they are
announced. We term this the system with sequential due dates (SDD). We consider two variants
of systems with SDD, each of which differs slightly in how orders progress through the demand
leadtime system. Finally, in the third scheme, there is exactly one order announced at a time and
it is progressively updated. We call this the system with a single order due date (SODD).
The following examples illustrate the three schemes. For the system with IDD, consider a
supplier that provides a component to a manufacturer of a large and complex product, such as
an aircraft. The manufacturer informs the supplier each time it initiates the production of a new
product and each time it completes a stage of the production process. The component provided
by the supplier is not immediately needed and is required only at a later stage of the production
process. The manufacturer does not accept early deliveries, but wishes to have the component
available as soon as it is needed in a just-in-time fashion. The supplier uses the information about
the progression of the product through the manufacturer’s production process to estimate when it
will need to make a delivery to the manufacturer. In making such estimates, the supplier uses its
knowledge of the manufacturer’s operations and available data from past interactions. However,
the estimates are clearly imperfect and the manufacturer (due to inherent variability in the time
it takes to complete each production stage) may complete a production stage sooner or later than
expected. The manufacturer may initiate, in response to its own demand, the production of
multiple products simultaneously (e.g., an aircraft manufacturer may have multiple airplanes being
assembled in parallel). The evolution of these products through the production process is largely
independent, so that a product that enters a particular stage of production later than another
product may complete it sooner. To use a queueing analogy, the manufacturer’s production process
can be viewed as consisting of parallel servers, with each server carrying out a series of tasks whose
durations are random.
In addition to manufacturing, the IDD scheme arises in other settings. For example, van
Donselaar et al. (2001) describe a case study of how ADI is shared among building constructors and
material suppliers. Building constructors inform suppliers about the start and progress of building
projects. The suppliers use this information to estimate when their material will be needed. This
estimation is imperfect because progress on a building project can be highly variable and because
design specifications may change over the course of the project, sometimes leading the constructors
not to place orders after all.
For the system with SDD, consider again a setup similar to the one described for IDD, except
that now the manufacturer’s production process is a serial production line. The production line
consists of a series of individual workstations that process jobs one at a time on a first-come, first-
served (FCFS) basis. If a workstation is busy, incoming jobs must wait in its queue. Hence, the
queueing analogy here is of a series of single server queues. The manufacturer informs the supplier
each time it releases a job into the production line (this may correspond to the manufacturer re-
ceiving an order from its own customers) and then updates the supplier each time the job completes
processing at one of the workstations. The supplier uses this information to estimate when it would
receive a request for a delivery from the manufacturer. Such a request would coincide with a job
arriving at the workstation where the component provided by the supplier is needed.
For the system with SODD, consider a supplier that produces a product sold through a single
retailer. The retailer continuously reviews its inventory and follows a (Q, r) policy for placing
orders with the supplier. This means that the retailer places an order for Q units each time its
inventory position drops below r. The supplier has real-time access to POS data from the retailer
and is aware of the retailer’s ordering policy. Each order placed with the retailer can be used by
the supplier to update its estimate of the time until the next replenishment order. This updating
process progresses through Q stages, culminating in the placement of an order for a single batch of
Q units. From the perspective of the supplier, there is always exactly one announced order, whose
due date is updated each time the retailer’s inventory position changes.
In this paper, we present models for each of the three ADI schemes: IDD, SDD, and SODD.
We treat continuous-review systems where orders are announced according to a Poisson process
and orders are updated a fixed number of times. We consider systems with endogenous supply
leadtimes where the supplier has finite capacity with exponentially distributed production times.
The supplier has the ability to produce items ahead of their due dates in a make-to-stock fashion.
However, items in inventory incur a holding cost. When an order becomes due and it cannot be
immediately satisfied from inventory, it is backordered but it incurs a backorder cost. The supplier’s
objective is to find a production control policy to minimize the expected total discounted cost or
the expected average cost per unit time.
For each ADI scheme, we formulate the problem as a continuous-time Markov decision pro-
cess (MDP). In each case, the state of the system is described by a k + 1 dimensional vector
(x, y1 , . . . , yk ), where x represents the supplier’s net inventory of finished products and yi repre-
sents the number of announced orders that have undergone the (i − 1)th update. We show that
there is an optimal production policy that is a state-dependent base-stock policy; it is optimal for
the supplier to produce if and only if x < s(y), where s(y) is the optimal base-stock level when
the number of announced orders is y = (y1 , . . . , yk ). In addition, we show that (1) the optimal
base-stock level is non-decreasing in each state variable yi , (2) a unit increase in yi leads to at most
a unit increase in the base-stock level s(y), and (3) the increase in the base-stock level is higher
with an increase in the number of orders at a later stage than with an increase in the number of
orders at an earlier stage. The derived structure is useful because it allows one to compute and
store an optimal policy in terms of just the base-stock levels.
The structural results are also important because they motivate our development of certain easy-
to-implement heuristics for large problems not computationally amenable to exact (i.e., optimal)
solution. These heuristics control production via a positive integer r and a vector (α_1, . . . , α_k) with 0 ≤ α_i ≤ 1. Under such heuristics, the decision is to produce if x − Σ_{i=1}^k α_i y_i < r and to idle otherwise. We refer to these heuristics as linear base-stock policies (LBPs) because the base-stock levels {r + Σ_{i=1}^k α_i y_i} are linear combinations of the state variables y_i. By restricting a subset of
the weights to be zero (or other specific values), we can specify different versions of the heuristic.
The heuristics mimic the optimal policy by determining a state-dependent (dynamic) base-stock
level that increases with the number of announced orders. By letting 0 ≤ αi ≤ 1 and αj ≤ αi for
j < i, the heuristics preserve all the structural properties of the optimal policy.
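To make the decision rule concrete, here is a minimal sketch in Python (the function name, argument layout, and numbers are ours, chosen purely for illustration) of the LBP produce/idle decision.

def lbp_produce(x, y, r, alpha):
    """Linear base-stock policy: produce if and only if the net inventory x is
    below the state-dependent base-stock level r + sum_i alpha_i * y_i."""
    base_stock = r + sum(a * yi for a, yi in zip(alpha, y))
    return x < base_stock

# Illustrative values (not from the paper): k = 2 stages, with a larger weight
# on stage-2 orders because they are closer to becoming due.
print(lbp_produce(x=3, y=(4, 2), r=2, alpha=(0.3, 0.8)))  # True, since 3 < 4.8

Choosing 0 ≤ α_1 ≤ α_2 ≤ 1, as in the text, keeps the heuristic's base-stock levels consistent with the structural properties of the optimal policy.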
Finally, we present results from an extensive numerical study of systems with IDD in which we
(1) examine the value of ADI by comparing systems with ADI, without ADI, and with partial ADI,
(2) study the benefit of ADI to the customers who provide it, and (3) evaluate the performance of
various versions of the heuristics. The analysis reveals the following insights:
• The benefit of having ADI can be significant, with up to 24 percent reduction in long-run av-
erage cost for some examples we consider. However, the magnitude of the relative cost savings
is sensitive to system parameters, including demand leadtime, utilization of the production
facility, and the ratio of backorder to holding costs.
• ADI offers significant relative cost savings (in comparisons between systems with and with-
out ADI) for moderate mean demand leadtimes, but small relative cost savings when mean
demand leadtimes are either very large or very small. When orders are announced shortly
before they are due, the supplier has little time to take advantage of this information. When
orders are announced far in advance, the accuracy of ADI is lower due to the higher variance
in demand leadtime.
• All else equal, for systems with demand arrival rate close to the maximum production rate,
ADI offers little relative cost savings. The optimal policy, with or without ADI, is to produce
most of the time. However, for systems with low demand arrival rate, the relative cost savings
from ADI depends upon the mean demand leadtime, with short mean demand leadtimes
giving a low relative value of ADI, and long mean demand leadtimes sometimes giving a high
relative value of ADI.
• For fixed expected demand leadtimes, ADI yields the largest relative cost reduction when the
ratio of backorder to holding costs is moderate, and offers little relative cost reduction when
the ratio is either very large or very small. When the ratio of backorder to holding costs is
low, ignoring ADI and producing in a make-to-order fashion (i.e., holding little or no finished
goods inventory in anticipation of future demand) carries a relatively small penalty. When
the ratio is large, the amount of finished goods inventory held is high for systems both with
and without ADI, and the probability of backorders is relatively small in both systems.
• The benefit of full ADI relative to partial ADI is typically limited, where under partial ADI
the supplier has access to the status of future orders only when they enter a later stage of the
demand leadtime system. We find in many cases that most of the value of full ADI is realized
by announcing orders as they enter the last one or two stages of the demand leadtime system.
• Although ADI leads to an overall reduction in cost, it can in some cases lead to an increase
in backorder costs. In other words, ADI may be used by the supplier to reduce inventory,
but at the expense of increasing backorders. This implies that customers that provide ADI
may not necessarily see an improvement in their service levels and, in some cases, may even
witness a deterioration. In exchange for providing ADI, customers may however negotiate an
increase in the backorder penalty they apply to the supplier.
• For small problems in which direct comparisons are possible, various versions of the LBP heuristics perform well in relation to the optimal policy.
Besides expanding our understanding of advance demand information, this paper also makes a
methodological contribution by developing an approach for proving structural results for continuous-
time Markov decision processes with unbounded jump rates. (The model with IDD has unbounded
jump rates.) The usual approach for proving structural properties is to first uniformize the
continuous-time MDP to get an equivalent discrete-time MDP, and then to show that certain
properties of functions are preserved by the MDP transition operator. Results typically then follow
using induction and the convergence of value iteration. See page 211 of Stidham (2002) for comments on the importance of uniformization in this approach. For problems with unbounded
jump rates, uniformization cannot be applied, and hence the approach does not work. Our new
method involves proving the desired structural properties for each of a sequence of problems with
bounded jump rates (using the proof outline just mentioned), and then extending the results to the
problem with unbounded jump rates by taking an appropriate limit. Although the idea is some-
what intuitive, there are a number of non-trivial technical issues to be worked out. The general
approach may prove useful in other problem contexts.
2 Related Literature
There is a growing literature dealing with inventory systems with ADI. A review of much of this
work can be found in Gallego and Özer (2002). Models can be broadly classified into two categories
based on whether inventory is reviewed periodically or continuously.
For systems with periodic review, ADI is typically modeled as information available about
demand in future periods. Under varying assumptions, Gallego and Özer (2001), Özer and Wei
(2004), and Schwarz et al. (1997) have shown the existence of optimal state-dependent base-stock
policies for periodic-review problems with ADI. In these papers, the base-stock levels depend upon
the vector of advance orders for future periods. Other papers that consider periodic review systems
with ADI include Thonemann (2002), DeCroix and Mookerjee (1997), and Gavirneni et al. (1999).
For continuous-review inventory systems with ADI, Buzacott and Shanthikumar (1994) consider
production-inventory systems with ADI and evaluate policies that use two parameters: a base-stock
level and a release leadtime. Hariharan and Zipkin (1995) introduced the notion of demand leadtime
in a system where orders are announced exactly L units of time before they are due. For constant
supply leadtimes and Poisson order arrivals, they show that there is an optimal base-stock policy
with a fixed base-stock level. Karaesmen et al. (2002) analyze a discrete-time model with constant
demand leadtimes that is similar to our SDD model with no due-date updating. They prove the
optimality of state-dependent base-stock policies. Gallego and Özer (2002, Section 2.4) consider
a system similar to our SODD setting, but with exogenous load-independent supply leadtimes.
Gayon et al. (2004) study a system similar to our IDD scheme but with multiple demand classes,
lost sales, and no due-date updates. They show that there is an optimal state-dependent base-stock
policy. Other papers that deal with continuous-review systems include Liberopoulos et al. (2003),
and Karaesmen et al. (2004).
Advance demand information can be viewed as a form of forecast updating. Examples of papers
that deal with inventory systems with periodic forecast updates include Graves et al. (1986), Heath
and Jackson (1994), Güllü (1996), Sethi et al. (2001), Zhu and Thonemann (2004), and references
therein. The models we present in this paper can be viewed as dealing with forecast updates.
However, in our case the updates are with respect to the timing of future demand.
Finally, we note that there is a literature that deals with how a supplier should quote delivery
leadtimes to its customers; see for example Duenyas and Hopp (1995), Hopp and Sturgis (2001), and
references therein. The setting studied in this literature is quite different from ours and typically
concerns make-to-order systems where no finished goods inventory is held in advance of customer
orders.
Our paper appears to be the first to consider imperfect advance demand information with
updates for continuous-review production-inventory systems. It also appears to be the first to
provide a unified framework for modeling the three ADI schemes: IDD, SDD, and SODD. Our
results regarding the structure of optimal policies bear some similarity to some of those cited
above. However, our general modeling framework is quite different — we analyze a continuous-
time MDP that incorporates imperfect ADI, updates, as well as endogenous supply leadtimes. In
addition, we provide a rigorous and detailed treatment of both discounted-cost and average-cost
problems.
The rest of the paper is organized as follows. In Section 3, we formulate the problem for a
system with IDD and describe the structure of an optimal policy. We do the same in Sections 4
and 5 for systems with SDD and SODD respectively. In Section 6, we present numerical results.
In Section 7, we offer a summary of main results and concluding comments.
3 Systems with Independent Due Dates
Consider a supplier of a single product, who can produce at most one unit of the product at a
time. Assume that production times are exponentially distributed with mean µ−1 and that orders
are placed by customers according to a Poisson process with rate λ. Orders are placed before their
due dates. The amount of time between when an order is announced (initially placed) and when
it becomes due is random. We refer to this random variable as the demand leadtime. We assume
orders are homogeneous in the sense that demand leadtimes have the same distribution for all
orders, and hence the expected demand leadtime is the same for all orders.
After an order is announced, it progresses through the demand leadtime system before becoming
due. Specifically, it undergoes a series of k − 1 updates (k ≥ 1). For i = 1, . . . , k − 1, the time
between the (i − 1)th and ith update is a random variable that has the exponential distribution with mean ν_i^{-1}, independent of everything else. Here, the 0th update is the initial announcement of the order. The time between the (k − 1)th update and the time the order becomes due has the exponential distribution with mean ν_k^{-1}. Hence, each demand leadtime can be viewed as consisting of k exponentially distributed stages, with the expected demand leadtime of each order equal to ν_1^{-1} + · · · + ν_k^{-1}. Viewed in this fashion, the ith update corresponds to the demand leadtime moving from stage i to stage i + 1. When an order undergoes its ith update, the supplier learns that the order's expected remaining demand leadtime has decreased from ν_i^{-1} + · · · + ν_k^{-1} to ν_{i+1}^{-1} + · · · + ν_k^{-1}.
When an order has undergone exactly i − 1 updates we say that it is in stage i. The special
case with k = 1 of this setup represents a situation with no updates and demand leadtime with
the exponential distribution. Equivalently, we may think of demand leadtime as having a phase
type distribution with k phases in series. Information is provided each time the demand leadtime
completes a phase. In the case where νi = ν for i = 1, . . . , k, the distribution of demand leadtimes
becomes Erlang with k phases and parameter ν. The process by which orders are placed, progress
through the demand leadtime, and eventually become due can be viewed as an M/G/∞ queueing
system with Poisson arrivals, infinite number of parallel servers, and service times with a phase-type
distribution with k phases in series.
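To illustrate this queueing view, the following short Python sketch (all names and the numerical rates are our own, for illustration only) samples a demand leadtime as a sum of k independent exponential stages and checks its mean against ν_1^{-1} + · · · + ν_k^{-1}.

import random

def sample_demand_leadtime(nu):
    """One demand leadtime: the sum of k independent exponential stage durations,
    where stage i has rate nu[i] (a phase-type distribution with k phases in series)."""
    return sum(random.expovariate(rate) for rate in nu)

# Illustrative rates (not from the paper): k = 3 stages with rates 0.5, 1.0, 2.0.
nu = [0.5, 1.0, 2.0]
samples = [sample_demand_leadtime(nu) for _ in range(100_000)]
print(sum(samples) / len(samples))   # close to 3.5 in simulation
print(sum(1.0 / r for r in nu))      # exact expected leadtime: 2.0 + 1.0 + 0.5 = 3.5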
An example of a system with IDD is the case of a supplier who provides a component to
a customer involved in developing complex projects with multiple stages (e.g., aircraft assembly,
building construction, production of weapon systems, etc). The component provided by the supplier
is needed after the first k stages of the project are completed. The customer informs the supplier
each time a new project is initiated and each time a project completes a stage. The timing of
project initiation is random and determined by the customer’s own demand. Also random are the
durations of project stages. At any point in time there may be several projects underway at various
stages of progress. The evolution of each project is independent of the others, so that a project
that starts a particular stage earlier than another project may finish later, and vice-versa. The
supplier, who is expected to make a delivery whenever a project completes the first k stages, uses
information on how many projects are at each particular stage to decide whether or not to produce.
In our model, we assume that demand is Poisson and both production times and times be-
tween consecutive updates are exponentially distributed. These assumptions are made in part for
mathematical tractability as they allow us to formulate the control problem as an MDP and en-
able us to describe the structure of an optimal policy. They are also useful in approximating the
behavior of systems where variability is high. The assumptions of Poisson demand and exponential
production times are consistent with previous treatments of production-inventory systems; see for
example, Buzacott and Shanthikumar (1993), Ha (1997), Zipkin (2000), and de Véricourt et al.
(2002) among others. The exponential distribution is appropriate for modeling the time between
updates as well since, for the applications we have in mind, this time corresponds to the duration
of activities at the customer level. For example, in the case of building construction, the time until
certain tasks are completed can be highly variable, which perhaps justifies the high coefficient of
variation of an exponential distribution. In Section 7, we discuss how these assumptions may be
partially relaxed.
In this section and the next, we assume that the total number of announced orders at any instant remains bounded by a finite integer m < ∞, so that Σ_{i=1}^k y_i ≤ m, where y_i is the number of orders in stage i. Orders are rejected if Σ_{i=1}^k y_i = m. This assumption is made for mathematical
convenience, because it allows us to formulate the problem as a Markov decision process with
bounded jump rates. From a queueing perspective, the introduction of the finite m means that we
approximate the M/G/∞ queue mentioned above by an M/G/m/m queue (sometimes called an
Erlang loss system). The constant m can be made arbitrarily large so that the fraction of rejected
orders (which can be computed exactly using the Erlang loss formula) is arbitrarily small; hence,
the assumption is not too restrictive from a practical standpoint. Using the results of the present
section as a building block, we show in Section 3.3 how to extend our results to the case of IDD with
no bound on the total announced orders. In the SDD and SODD models of subsequent sections,
the jump rates naturally remain bounded without introducing the bound m; hence, we also do
away with the assumption when analyzing those models.
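Since the rejection probability under the bound m is the Erlang loss (Erlang B) probability with offered load λ times the mean demand leadtime, it is easy to check how small it is for a chosen m. The sketch below uses the standard recursive evaluation of the Erlang B formula; the numbers are illustrative and not taken from the paper's experiments.

def erlang_b(m, offered_load):
    """Blocking probability of an M/G/m/m loss system with offered load
    a = lambda * E[demand leadtime], via the standard stable recursion."""
    b = 1.0
    for j in range(1, m + 1):
        b = offered_load * b / (j + offered_load * b)
    return b

# Illustrative values (ours): lam = 0.6 orders per unit time, mean leadtime 20.
lam, mean_leadtime, m = 0.6, 20.0, 200
print(erlang_b(m, lam * mean_leadtime))  # a vanishingly small rejection probability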
3.1 The Discounted-Cost Criterion

We formulate the problem as a continuous-time Markov decision process. First we need some
notation. Let Z and Z_+ be respectively the sets of integers and non-negative integers, and let Z^k and Z^k_+ be their respective k-dimensional cross products. Let R be the real numbers. Throughout, y = (y_1, . . . , y_k). The MDP has state space S_m := Z × Z^k_+(m), where Z^k_+(m) := {y ∈ Z^k_+ : Σ_{i=1}^k y_i ≤ m}.
Throughout this section, in order to keep notation clean, we will reflect the dependence on m only
in the notation that is later used for extending to the case with no m. It is, however, important to
keep in mind that most of the quantities in this section do depend upon m, even if this dependence
is not reflected in the notation.
The state of the system is determined by X(t), which represents the net inventory at time t,
and Yi (t), which represents the number of announced orders in stage i at time t; i = 1, . . . , k. Let
Y(t) = (Y1 (t), . . . , Yk (t)). In each state, two actions are possible: produce or idle (do not produce).
The objective is to identify a production policy that minimizes the long-run expected discounted
cost. Without loss of optimality, we consider control policies that make decisions only at transition
times. Let the set of such (possibly history-dependent) policies be denoted by Π. A deterministic
stationary policy π := {π(x, y) : (x, y) ∈ Sm } specifies the action taken at any time as a function
only of the state of the system, where π(x, y) = 1 means produce if the state is (x, y), whereas
π(x, y) = 0 means idle if the state is (x, y).
We will work with a uniformized version (see, e.g., Lippman, 1975; Serfozo, 1979; Puterman, 1994) of the problem in which the transition rate in each state under any action is Λ := λ + µ + m Σ_{i=1}^k ν_i, so that the transition times 0 = τ_0 ≤ τ_1 ≤ τ_2 ≤ . . . are such that {τ_{n+1} − τ_n : n ≥ 0} is a sequence of i.i.d. exponential random variables, each with mean Λ^{-1}.
intuition regarding Λ later. Let {(Xn , Yn ) : n ≥ 0} denote the embedded Markov chain of states;
that is, (Xn , Yn ) := (X(τn ), Y(τn )) is the state immediately after the n-th transition.
For i = 1, . . . , k, let ei be the k−dimensional vector with 1 in position i and zeros elsewhere.
Let e_0 be the k-dimensional vector of zeros. If action a ∈ {0, 1} is selected in state (x, y), then the next state of the embedded Markov chain is (x′, y′) with probability

p_{(x,y),(x′,y′)}(a) :=
  Λ^{-1} µ I{a=1}                                  if (x′, y′) = (x + 1, y),
  Λ^{-1} λ I{ȳ<m}                                  if (x′, y′) = (x, y + e_1),
  Λ^{-1} ν_i y_i I{y_i ≥ 1}                        if (x′, y′) = (x, y + e_{i+1} − e_i), i = 1, . . . , k − 1,
  Λ^{-1} ν_k y_k I{y_k ≥ 1}                        if (x′, y′) = (x − 1, y − e_k),
  Λ^{-1} (Λ − µ I{a=1} − λ I{ȳ<m} − Σ_{i=1}^k ν_i y_i I{y_i ≥ 1})   if (x′, y′) = (x, y),
  0                                                 otherwise,

where ȳ := Σ_{i=1}^k y_i and I{·} is the indicator function. Throughout the paper, the cost rate when
the state is (x, y) is

c(x) := h x^+ + b x^−,   (1)

where h > 0 and b > 0 are the per-unit holding and backorder cost rates, and x^+ = max{x, 0} and x^− = − min{x, 0}.
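The following Python sketch (function and variable names are ours; the parameter values are only illustrative) spells out the cost rate and the one-step transition probabilities of the uniformized embedded chain given above.

def cost_rate(x, h=10.0, b=100.0):
    """Holding/backorder cost rate c(x) = h*x^+ + b*x^-; the default h and b
    mirror values used in one of the paper's examples."""
    return h * max(x, 0) + b * max(-x, 0)

def transitions(x, y, a, lam, mu, nu, m):
    """Transition probabilities out of state (x, y) under action a (1 = produce,
    0 = idle) for the uniformized embedded chain; y = (y_1, ..., y_k)."""
    k = len(nu)
    Lam = lam + mu + m * sum(nu)               # uniformization constant
    probs = {}
    if a == 1:                                  # potential production completion
        probs[(x + 1, y)] = mu / Lam
    if sum(y) < m:                              # a new order is announced
        probs[(x, (y[0] + 1,) + y[1:])] = lam / Lam
    for i in range(k - 1):                      # an order moves from stage i+1 to i+2
        if y[i] >= 1:
            y_new = list(y)
            y_new[i] -= 1
            y_new[i + 1] += 1
            probs[(x, tuple(y_new))] = nu[i] * y[i] / Lam
    if y[-1] >= 1:                              # an order becomes due
        probs[(x - 1, y[:-1] + (y[-1] - 1,))] = nu[-1] * y[-1] / Lam
    probs[(x, y)] = 1.0 - sum(probs.values())   # null (self) transition
    return probs

p = transitions(x=0, y=(1, 0), a=1, lam=0.6, mu=1.0, nu=(0.2, 0.2), m=5)
print(p, sum(p.values()))                       # the probabilities sum to one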
The value function, which specifies the optimal expected total discounted cost, is given by

v*_m(x, y) := inf_{π∈Π} E^π_{(x,y)} [ ∫_{t=0}^∞ e^{−βt} c(X(t)) dt ] = inf_{π∈Π} E^π_{(x,y)} [ Σ_{n=0}^∞ (Λ/γ)^n c(X_n)/γ ],   (2)

where β > 0 is the discount rate, γ := β + Λ, and E^π_{(x,y)} denotes expectation with respect to the probability measure determined by policy π and (X(0), Y(0)) = (x, y).
Let V be the set of real-valued functions on S_m and let v be an arbitrary element of V. Define T_λ, T_i^1, T_µ : V → V as follows:

T_λ v(x, y) := v(x, y + e_1) if Σ_{i=1}^k y_i < m, and T_λ v(x, y) := v(x, y) if Σ_{i=1}^k y_i = m;

T_i^1 v(x, y) := v(x, y + e_{i+1} − e_i) if i = 1, . . . , k − 1 and y_i ≥ 1; T_i^1 v(x, y) := v(x − 1, y − e_k) if i = k and y_k ≥ 1; and T_i^1 v(x, y) := v(x, y) otherwise;

T_µ v(x, y) := min{v(x, y), v(x + 1, y)}.   (3)

In terms of these operators, and with γ := β + Λ as above, define T : V → V by

T v(x, y) := γ^{-1} [ c(x) + λ T_λ v(x, y) + Σ_{i=1}^k ν_i y_i T_i^1 v(x, y) + Σ_{i=1}^k ν_i (m − y_i) v(x, y) + µ T_µ v(x, y) ].   (4)

The optimality equation is then

v = T v,   (5)
and moreover a stationary policy that specifies for each (x, y) an action that attains the minimum
on the right-hand side of (5) is optimal (see Propositions 3.1.1 and 3.1.3 and Section 5.1 of Bertsekas
2001 or Section 11.5 of Puterman 1994).
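As a rough illustration of how the optimality equation can be solved numerically, the sketch below runs value iteration with the operator T for the special case k = 1 (no intermediate updates), with a truncated inventory range and parameter values that are entirely our own; it then reads off the state-dependent base-stock levels s_y = min{x : v(x + 1, y) − v(x, y) ≥ 0}.

import numpy as np

# Illustrative parameters (ours, not the paper's); k = 1 update stage.
lam, mu, nu, beta = 0.6, 1.0, 0.2, 0.1
m = 20                        # bound on the number of announced orders
x_lo, x_hi = -15, 25          # crude truncation of the net-inventory range
h, b = 10.0, 100.0

Lam = lam + mu + m * nu       # uniformization constant
gamma = beta + Lam
nx = x_hi - x_lo + 1
v = np.zeros((nx, m + 1))     # v[x - x_lo, y]

def c(x):
    return h * max(x, 0) + b * max(-x, 0)

for _ in range(1000):                         # value iteration: v <- T v
    v_new = np.empty_like(v)
    for xi in range(nx):
        x = x_lo + xi
        for y in range(m + 1):
            t_lam = v[xi, y + 1] if y < m else v[xi, y]               # T_lambda
            t_due = v[max(xi - 1, 0), y - 1] if y >= 1 else v[xi, y]  # T_1^1 (crude edge handling)
            t_mu = min(v[xi, y], v[min(xi + 1, nx - 1), y])           # T_mu
            v_new[xi, y] = (c(x) + lam * t_lam + nu * y * t_due
                            + nu * (m - y) * v[xi, y] + mu * t_mu) / gamma
    v = v_new

for y in (0, 5, 10):          # state-dependent base-stock levels
    s_y = next((x_lo + xi for xi in range(nx - 1)
                if v[xi + 1, y] - v[xi, y] >= 0), x_hi)
    print(y, s_y)

The base-stock levels printed this way should be non-decreasing in y, in line with the structural results established below.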
In the optimality equation (5), operator Tλ corresponds to the arrival of a customer. More
precisely, if v(x, y) represents the “value” of being in state (x, y), then Tλ v(x, y) is the value just
after an arrival occurs when the state is (x, y). Similarly, operator Ti1 ; i = 1, . . . , k − 1 corresponds
to an update of an order from stage i to stage i + 1 and operator Tk1 corresponds to an order
becoming due. Operator Tµ corresponds to the production decision. When v(x + 1, y) < v(x, y),
it is better to produce a unit of inventory than it is to idle. In this case Tµ v(x, y) = v(x + 1, y)
represents the value just after the completion of the unit of inventory when the state is (x, y). When
v(x + 1, y) ≥ v(x, y), it is instead better to idle, in which case T_µ v(x, y) = v(x, y). The other term — Σ_{i=1}^k ν_i (m − y_i) v(x, y) — in the optimality equation corresponds to null transitions introduced through uniformization of the jump rate. To understand the term λ that multiplies T_λ v(x, y), note that in state (x, y), the next event will be an arrival with probability λ/Λ. Similar interpretations are possible for the other multipliers. The term λ also represents the rate of order announcements. Likewise, µ is the rate of potential production completions, ν_i y_i is the rate of updates at stage i when y_i orders are in stage i, and Σ_{i=1}^k ν_i (m − y_i) is the rate of null transitions when there are y jobs in the demand leadtime system. Hence Λ is the overall rate of (real and null) transitions.
In preparation for Theorem 1, we will need Propositions 1 and 2 below. We start by giving the
following definitions. For v ∈ V let ∆v(x, y) := v(x + 1, y) − v(x, y), and define U := {v ∈ V :
v satisfies conditions (C1)–(C4)}, where conditions (C1)–(C4) are defined as follows:
As we will see shortly, the value function satisfies these four conditions. Condition (C1) is a
convexity property that will be used in the proof of Theorem 1 below to show the existence of a state-
dependent base-stock optimal policy. Conditions (C2) and (C3) will imply that the announcement
of a new order or the update of an existing order will cause the base-stock level either to increase
by one or to remain unchanged. Condition (C3) will be used to show that it is optimal to produce
whenever there are backorders.
Proposition 1 If v ∈ U , then T v ∈ U .
Proposition 2 The value function is an element of U; that is, v*_m ∈ U.
The proofs of Propositions 1 and 2 are in the appendix. We are now ready for the main result
of the section. Theorem 1 describes the structure of an optimal policy.

Theorem 1 The state-dependent base-stock policy π* = {π*(x, y)} given by

π*(x, y) := 0 if x ≥ s_y, and π*(x, y) := 1 if x < s_y,   (6)

with s_y := min{x : v*_m(x + 1, y) − v*_m(x, y) ≥ 0}, is optimal. Moreover,
(a) s_{y+e_l} ∈ {s_{y+e_j}, s_{y+e_j} + 1} for j = 0, . . . , k − 1 and l = j + 1, . . . , k;
(b) π*(x, y) = 1 for x < 0.
Proof. Any stationary policy that specifies for each (x, y) ∈ S_m an action that attains the minimum in T v*_m(x, y) is optimal; see equations (3)–(5) and the citations thereafter. Hence, the stationary policy that prescribes action a = 1 in states S^1_m := {(x, y) ∈ S_m : v*_m(x + 1, y) < v*_m(x, y)} and action a = 0 in states S^0_m := {(x, y) ∈ S_m : v*_m(x + 1, y) ≥ v*_m(x, y)} is optimal. By Proposition 2, the value function v*_m satisfies Condition (C1), and so π* defined in (6) satisfies π*(x, y) = 1 for (x, y) ∈ S^1_m and π*(x, y) = 0 for (x, y) ∈ S^0_m; hence π* is optimal. Conditions (C2) and (C3) yield s_{y+e_j} ≤ s_{y+e_l} and s_{y+e_j} + 1 ≥ s_{y+e_l}. Therefore, s_{y+e_j} ≤ s_{y+e_l} ≤ s_{y+e_j} + 1, and hence s_{y+e_l} is equal to either s_{y+e_j} or s_{y+e_j} + 1.
Theorem 1 states that for each vector y of announced orders there exists a threshold sy such that
it is optimal to produce if inventory on hand is less than sy , and it is optimal to idle if inventory on
hand is greater than or equal to sy . We refer to the parameters {sy } as the y-dependent base-stock
levels. Part (a) with l = j + 1 indicates that the y-dependent base-stock level increases by at most
one unit if any order is updated or if a new order is announced. It also follows from part (a) that
sy is increasing in each of the components of y; that is sy ≤ sy+iej for i ∈ Z+ and j = 1, . . . , k.
Part (b) states that it is optimal to produce whenever there is a backorder.
Figure 1 illustrates the structure described in Theorem 1 for two examples, each with k = 2
stages. In part (a) of the figure, the mean time 1/νi spent in each of the stages is relatively long,
and hence the production policy is much less sensitive to orders in stage 1 than it is to orders in
stage 2. In part (b) the mean time 1/νi is shorter, and hence the production policy treats orders
in stage 1 almost the same as orders in stage 2.
In some settings, customers may cancel their orders after they have been announced. This
means that announcing a potential future order does not commit a customer to eventually place an
order. In particular, consider a situation where, with each update, an order is either canceled or its
due date is updated. In this case, ADI is imperfect with regard to both timing and realization of
future orders. For example, in the context of a building project, changes to building specifications
at some stage of the project may lead the building constructor to cancel orders for certain material.
Extending the model for systems with IDD to include order cancelation is relatively straight-
forward. Let pi denote the probability that an order is canceled at the end of its ith stage. The
case where p_i = 0 for all i corresponds to a system with no cancelations. We assume there is no penalty
to the customer for canceling an order. The state space, action space, and cost rates are as in a
system without order cancelations. To handle cancelations, we need only redefine T_i^1 as

T_i^1 v(x, y) :=
  (1 − p_i) v(x, y + e_{i+1} − e_i) + p_i v(x, y − e_i)   if i = 1, . . . , k − 1 and y_i ≥ 1,
  (1 − p_k) v(x − 1, y − e_k) + p_k v(x, y − e_k)          if i = k and y_k ≥ 1,
  v(x, y)                                                    otherwise.
It can be shown that with cancelations there is again an optimal policy as described in Theo-
rem 1. Of course, the state-dependent base-stock levels are affected by the values of the cancelation
probabilities. From numerical results (not shown), we observed that the base-stock levels are
non-increasing in each of the probabilities pi . Similar extensions are possible for the models in
subsequent sections.
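For concreteness, a small Python sketch of the cancelation-adjusted update operator is given below (the function name and the way the value function is passed in are ours); it simply averages the two possible outcomes with the cancelation probability, mirroring the redefinition of T_i^1 above.

def T_update_cancel(v, x, y, i, p, k):
    """Cancelation-adjusted update operator for stage i (1-based). v is any
    function v(x, y) with y a tuple of length k; p[i-1] is the probability
    that an order is canceled at the end of its ith stage."""
    if y[i - 1] < 1:
        return v(x, y)
    y_out = list(y)
    y_out[i - 1] -= 1                     # the order leaves stage i
    if i < k:
        y_next = list(y_out)
        y_next[i] += 1                    # if not canceled, it enters stage i+1
        return (1 - p[i - 1]) * v(x, tuple(y_next)) + p[i - 1] * v(x, tuple(y_out))
    # i == k: if not canceled, the order becomes due and consumes one unit
    return (1 - p[i - 1]) * v(x - 1, tuple(y_out)) + p[i - 1] * v(x, tuple(y_out))

# Illustrative use with a placeholder value function (ours, not the paper's).
v = lambda x, y: 10 * max(x, 0) + 100 * max(-x, 0)
print(T_update_cancel(v, x=0, y=(1, 1), i=2, p=(0.1, 0.2), k=2))  # 0.8*100 + 0.2*0 = 80.0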
3.2 The Average-Cost Criterion

In this section we address an average-cost version of a problem with IDD. The main result is
that the structural characteristics of the optimal policy described in the previous section for the
discounted-cost setting carry over to the average-cost case. For policy π ∈ Π, the average-cost is
given by
J_π(x, y) := limsup_{n→∞} E^π_{(x,y)}[ Σ_{l=0}^{n−1} c(X_l)(τ_{l+1} − τ_l) ] / E^π_{(x,y)}[τ_n] = limsup_{n→∞} E^π_{(x,y)}[ Σ_{l=0}^{n−1} c(X_l) ] / n.
[Figure 1 appears here as two three-dimensional surface plots, panel (a) with ν1 = ν2 = 0.01 and panel (b) with ν1 = ν2 = 0.10; the axes are announced orders in stage 1 (y1), announced orders in stage 2 (y2), and net inventory x.]

Figure 1: Optimal policies for two different systems with IDD and k = 2: The surfaces depict the state-dependent base-stock levels. For a given y = (y1, y2), if the net inventory on hand x is below the surface, it is optimal to produce; if the net inventory on hand x is on or above the surface, it is optimal to idle. (m = 200, µ = 1, λ = 0.6, h = 10, b = 100)
Let J(x, y) := inf π∈Π Jπ (x, y). A policy that yields average cost J(x, y) for all (x, y) ∈ Sm is said
to be optimal for the average-cost problem. We have the following theorem.
Theorem 2 Suppose that λ < µ. Then there exists a stationary state-dependent base-stock policy π^A = {π^A(x, y)} that is optimal for the average-cost problem. Its base-stock levels {s^A_y} satisfy s^A_{y+e_l} ∈ {s^A_{y+e_j}, s^A_{y+e_j} + 1} for j = 0, . . . , k − 1, l = j + 1, . . . , k, and π^A(x, y) = 1 for x < 0. In addition, the optimal average cost is finite and independent of the initial state; that is, there exists a finite constant J such that J(x, y) = J for all (x, y) ∈ S_m.
Proof. The main idea of the proof is to obtain the desired results for the average-cost problem
by using Proposition 2 for the discounted-cost problem and letting β ↓ 0.
Given a discount rate, let v̌(x, y) := γ v*_m(x, y). The optimality equation (5) can be rewritten as:

v̌(x, y) = min_{a∈{0,1}} { c(x) + (Λ/γ) Σ_{(x′,y′)∈S_m} p_{(x,y),(x′,y′)}(a) v̌(x′, y′) }.
Let α = Λ/γ = Λ/(β + Λ), and define hα (x, y) := v̌α (x, y) − v̌α (0, e0 ), where we have appended a
subscript α to indicate dependence on α (and hence on β).
Parts (i) and (ii) of Sennott’s Theorem 7.2.3 state that under conditions I, II, and III given
below there exists a sequence {αn := Λ/(βn + Λ)} and a real-valued function h(·) such that αn ↑ 1
and limn→∞ hαn (x, y) = h(x, y). (Note that αn ↑ 1 means βn ↓ 0.) Moreover, the function h(·)
satisfies
J + h(x, y) ≥ min_{a∈{0,1}} { c(x) + Σ_{(x′,y′)∈S_m} p_{(x,y),(x′,y′)}(a) h(x′, y′) },   (7)
so ∆h(x, y) = ∆ limn→∞ hαn (x, y) = limn→∞ ∆hαn (x, y) ≤ limn→∞ ∆hαn (x + 1, y) = ∆ limn→∞
hαn (x + 1, y) = ∆h(x + 1, y). Similar arguments show that h(·) also satisfies conditions (C2)–(C4).
Hence, as in the proof of Theorem 1, it follows that the policy

π^A(x, y) := 0 if x ≥ s^A_y, and π^A(x, y) := 1 if x < s^A_y,

with s^A_y := min{x : h(x + 1, y) − h(x, y) ≥ 0}, is optimal for the average-cost problem, and that
(I) There exists a stationary policy and a state z ∈ Sm such that the induced Markov chain
has a positive recurrent class R ⊆ Sm and the expected first passage time and expected first
passage cost (for definitions see Lemma 1 below) from any state (x, y) ∈ Sm to z are finite.
(III) For each state (x, y) ∈ Sm \R, there exists a policy that induces a Markov chain for which the
expected first passage time and expected first passage cost from state z to (x, y) are finite.
Let z = (0, e0 ). Define $ := {$(x, y) : (x, y) ∈ Sm } to be the stationary policy that produces if
the net inventory is less than zero and idles if the net inventory is at least zero; i.e. $(x, y) = I {x<0} .
In Lemma 1 below we show that under policy $, the class R := {(x, y) ∈ Sm : x ≤ 0} is positive
recurrent and the expected first passage time and cost from any state (x, y) ∈ Sm to (0, e0 ) are
finite. This verifies that condition I holds. Condition II is clearly satisfied for our case, since
c(x) = hx+ + bx− is convex in x with a minimum at c(0) = 0, and c(x) ↑ ∞ as x ↑ ∞ or as
x ↓ −∞. To prove that condition III holds one may use an argument similar to that in the proof
of condition I. The details are omitted for brevity.
large.” In the discounted-cost case in Section 3.1, we did not need this assumption because costs
were discounted.
3.3 Systems with No Bound on Announced Orders

In this section we again consider the system with IDD under the discounted-cost criterion. The model
is identical to that considered in Section 3.1, except that here we do not place the bound m on the
number of announced orders that may be in the system. Hence, there are no rejected orders, but
rather all arrivals enter the demand leadtime system. The state space is now S := Z×Zk+ . Without
the bound m, we have a continuous-time Markov decision process with unbounded transition rates.
In particular, the conditional rate of transitions out of state (x, y) ∈ S under action a ∈ {0, 1} is λ + Σ_{i=1}^k ν_i y_i + µ I{a=1}. Without the bound m on Σ_{i=1}^k y_i, this conditional rate is clearly not bounded.
Theorem 3 below shows that the results in Theorem 1 continue to hold in the setting with
unbounded jump rates. Although this may not be surprising because Theorem 1 holds for any fi-
nite m, it is important to highlight that the presence of unbounded transition rates poses a technical
challenge. In particular, it is not possible to apply uniformization to a problem with unbounded
jump rates. Without uniformization, the continuous-time problem cannot be transformed into an
“equivalent” discrete-time problem as in Section 3.1. Such a transformation is typically a crucial
step for proving structural properties of optimal policies using inductive approaches (as in the
proofs of Propositions 1 and 2). An additional difficulty is that a theory has only recently been developed for problems with both unbounded jump rates and unbounded cost rates that characterizes
the value function as a particular solution of the optimality equation, and ensures the existence of
stationary optimal policies. See Guo and Hernández-Lerma (2003) for results and references. This
literature, however, does not consider how to prove structural properties of optimal policies.
Roughly speaking, our proof of Theorem 3 below establishes the structure of an optimal policy
and of the value function for the problem with unbounded jump rates by letting m grow to infinity
through a sequence of problems such as those considered in Section 3.1. In doing so, there are
a number of technical points, such as the existence of various limits, that must be treated with
care. The approach may provide a template that could be used for analyzing other continuous-time
MDPs with unbounded jump rates and cost rates.
To begin, it will be helpful to re-write the optimality equation (5) from Section 3.1 as v = L m v
where operator L_m is given by

L_m v(x, y) := (1 / Q_m(y)) [ c(x) + λ I{Σ_{i=1}^k y_i < m} v(x, y + e_1) + Σ_{i=2}^k ν_{i−1} y_{i−1} v(x, y + e_i − e_{i−1}) + ν_k y_k v(x − 1, y − e_k) + µ min{v(x, y), v(x + 1, y)} ]

and Q_m(y) := λ I{Σ_{i=1}^k y_i < m} + Σ_{i=1}^k ν_i y_i + µ.
By rearranging terms, it can be checked that the equation v = Lm v is equivalent to (5). Note,
however, that Lm is a different operator than T . (Keep in mind, too, that T also depends upon m,
although this is not reflected in the notation.)
Let v ∗ denote the value function of the problem with unbounded jump rates. The optimality
equation for the problem with unbounded jump rates is v = Lv, where L is given by
Lv(x, y) := (1 / Q(y)) [ c(x) + λ v(x, y + e_1) + Σ_{i=2}^k ν_{i−1} y_{i−1} v(x, y + e_i − e_{i−1}) + ν_k y_k v(x − 1, y − e_k) + µ min{v(x, y), v(x + 1, y)} ]

and Q(y) := λ + Σ_{i=1}^k ν_i y_i + µ.
Recall that v*_m is the value function for the problem with the finite bound m on the number of announced orders. For each m, we extend the domain of v*_m from S_m to S by defining v*_m(x, y) := 0 for (x, y) ∈ S \ S_m. We will use the following lemma, which provides a pointwise bound on v*_m, independent of m. The proof, which is based upon what we believe is a novel coupling argument, is in the appendix.
In preparation for the main theorem of the section, consider the set of functions BR (S) := {v :
there exist constants c1 , c2 ≥ 0 so that |v(x, y)| ≤ c1 + c2 R(x, y) for all (x, y) ∈ S} where R(x, y)
is given in (8). To employ the theory of Guo and Hernández-Lerma (2003) as we do below, one
must identify a non-negative function R suitable for defining BR (S). Guo and Hernández-Lerma
do not specify a particular R for the use of their theory. “Suitable” means that the function R must
satisfy some conditions that relate to the cost and transition rates of the continuous-time Markov
chains induced by stationary policies. In the proof (in the appendix) of Lemma 3 we verify that
our choice of R in (8) is indeed suitable.
Lemma 3 The system with IDD and unbounded jump rates satisfies Assumptions A, B, and C of
Guo and Hernández-Lerma (2003).
Theorem 3 For the system with IDD and unbounded jump rates, the value function v ∗ is the unique
function in BR (S) that solves the optimality equation v = Lv. Moreover, v ∗ satisfies conditions
(C1)–(C4), and there exists an optimal state-dependent base-stock policy. The base-stock levels
satisfy the conditions in (a) and (b) in Theorem 1.
Proof. The first statement follows from Theorem 3.2 of Guo and Hernández-Lerma (2003).
Lemma 3 shows that their conditions A, B, and C (which are needed to apply their theorem) hold
for our problem.
For the remainder, we begin by showing that there exists a pointwise convergent subsequence of {v*_m}. Lemma 2 implies for each (x, y) ∈ S that {v*_m(x, y)} is a bounded sequence of real numbers (note that the bound in the lemma does not depend upon m). Hence, for each (x, y), the sequence {v*_m(x, y)} has a convergent subsequence in R.
Next let {z_1, z_2, z_3, . . . } be an enumeration of the countable space S [each z_i is some element (x, y) of S]. We now proceed with a diagonalization argument to construct the pointwise convergent subsequence of {v*_m}. Let {m_{1,j} : j = 1, 2, . . . } be such that lim_{j→∞} v*_{m_{1,j}}(z_1) exists. Next, let {m_{2,j} : j = 1, 2, . . . } be a subsequence of {m_{1,j} : j = 2, 3, . . . } for which lim_{j→∞} v*_{m_{2,j}}(z_2) exists. Note also that lim_{j→∞} v*_{m_{2,j}}(z_1) exists, because {m_{2,j} : j = 1, 2, . . . } ⊆ {m_{1,j} : j = 2, 3, . . . }. Continuing in this manner and then taking the diagonal sequence {m_j := m_{j,j} : j = 1, 2, . . . }, we obtain a subsequence along which the limit exists for every z_i. (As an alternative to the previous argument, we may appeal to Tychonoff's Theorem; see, e.g., Brémaud 1999.) Let v** denote the limit; that is, v** : S → R is defined to be the function for which lim_{j→∞} v*_{m_j}(x, y) = v**(x, y) for all (x, y) ∈ S.
Next, we show that the limit v** is in fact the value function v* for the problem with unbounded jump rates. For any function v on S and any (x, y) ∈ S, observe that L_m v(x, y) = Lv(x, y) when m > Σ_{i=1}^k y_i. Hence, for any (x, y) ∈ S it follows that

v**(x, y) = lim_{j→∞} v*_{m_j}(x, y) = lim_{j→∞} L_{m_j} v*_{m_j}(x, y) = lim_{j→∞} L v*_{m_j}(x, y).   (9)
For any function v on S and any (x, y) ∈ S we next re-express Lv(x, y). To this end, for given
(x, y), consider the continuous function L_{(x,y)} : R^{k+3} → R defined by

L_{(x,y)}(ϕ_1, . . . , ϕ_{k+3}) := (1 / Q(y)) [ c(x) + λ ϕ_1 + Σ_{i=2}^{k+1} ν_{i−1} y_{i−1} ϕ_i + µ min{ϕ_{k+2}, ϕ_{k+3}} ].

Then, for any sequence of functions {u_j} on S that converges pointwise to a function u,

lim_{j→∞} L u_j(x, y) = lim_{j→∞} L_{(x,y)}( u_j(x, y + e_1), . . . , u_j(x, y), u_j(x + 1, y) )
                      = L_{(x,y)}( u(x, y + e_1), . . . , u(x, y), u(x + 1, y) )
                      = Lu(x, y).   (10)
Note that in (10), we may pass the limit inside L(x,y) because L(x,y) is continuous. Applying the
preceding observation with {u_j} = {v*_{m_j}} and u = v** and using (9), it follows that v**(x, y) = Lv**(x, y). Now, because (x, y) was arbitrary, we see that v** = Lv**. That is, v** solves the optimality equation for the problem with unbounded jump rates. Moreover, it can be seen from Lemma 2 that v** is in B_R(S) with c_1 = β^{−2}(h + b)(λ + µ) and c_2 = β^{−1}(h + b). Therefore, it follows from the first part of the theorem that v** = v*; that is, v** is the value function for the problem with unbounded jump rates. It can now be readily verified that v* = lim_{j→∞} v*_{m_j} satisfies conditions (C1) through (C4) with Z^k_+(m − 1) and Z^k_+(m) replaced by Z^k_+. Arguing as in the proof of Theorem 1, the remaining statements in the theorem follow from Theorems 3.2 and 3.3 of Guo and Hernández-Lerma (2003).
4 Systems with Sequential Due Dates

We consider a setup similar to the one described in Section 3. However, in this section, we model
a system where due dates of orders are updated (and the orders become due) in the same sequence
that orders are initially placed. For updating, we consider two cases: (1) multiple order updating,
and (2) single order updating.
4.1 Multiple Order Updating

Under multiple order updating, there can be multiple orders at each stage of update. An update
could occur for any stage and would affect the oldest order at that stage. Hence orders at each
stage are updated one at time in the sequence they have arrived at that stage. The queueing
analogy for this setting is a system where before becoming due, orders progress through a series
of k single-server queues with service time at the ith queue exponentially distributed with rate ν i .
Orders at each queue are served on a first-come, first-served basis.
This form of updating is motivated by settings where there is a supplier that provides a compo-
nent to a manufacturer. The manufacturer’s production system consists of a series of workstations
that process items one at a time on a FCFS basis. Production times are random and vary from
one workstation to another. The component provided by the supplier is used in the (k + 1)th
workstation and is expected to be delivered by the supplier as soon as an item goes through the
first k workstations. The manufacturer shares information about when items are released into the
production system and when they complete an operation at any of the workstations. In our model,
an item released into the first workstation corresponds to an order being announced. An item that
completes processing on the ith workstation is equivalent to an order that undergoes the ith update.
An item that completes processing at the kth workstation corresponds to an order becoming due.
Under single order updating, only the oldest announced order is progressively updated. The
time between the ith and (i + 1)th update is exponentially distributed with mean ν i−1 . Once
the oldest announced order becomes due, the next oldest announced order becomes available for
updating. At any given time, there may be multiple announced orders but only one order that
has undergone one or more updates. The supplier knows the total number of announced orders
and the update stage of the oldest announced order. To use a queueing analogy, announced orders
can be viewed as going through a single-server queue with Poisson arrivals and service times that
have a phase-type distribution with k phases in series, such that the ith phase has an exponential
distribution with rate νi .
Single order updating arises in settings similar to those of multiple order updating, except
that items at the manufacturer are now processed one unit at time with all the operations carried
out on a single workstation (instead of a series of workstations). At any given time, there may
be multiple items waiting to be processed in the queue of the workstation, but only one item (at
most) undergoing processing. Each time the workstation completes an operation, the manufacturer
informs the supplier. The manufacturer also informs the supplier each time a new item is released
to the workstation. In our model, the items waiting in the queue of the workstation correspond to
orders that have been announced but not updated yet. The item being processed, if any, corresponds
to the oldest announced order and the number of operations it has completed indicates the number
of due date updates through which it has gone.
and T̂_µ := T_µ, T̂_i := T_i^1. The optimality equation is given by v = T̂ v, where the operator T̂ : V̂ → V̂ is defined as follows:

T̂ v(x, y) := γ̂^{-1} [ c(x) + λ T̂_λ v(x, y) + Σ_{i=1}^k ν_i T̂_i v(x, y) + µ T̂_µ v(x, y) ].
Theorem 4 The state-dependent base-stock policy π̂* = {π̂*(x, y)} given by

π̂*(x, y) := 0 if x ≥ ŝ_y, and π̂*(x, y) := 1 if x < ŝ_y

is optimal. Moreover,
(a) ŝ_{y+e_l} ∈ {ŝ_{y+e_j}, ŝ_{y+e_j} + 1} for j = 0, . . . , k − 1 and l = j + 1, . . . , k;
(b) π̂*(x, y) = 1 if x < 0.
From Theorem 4, we can see that there is an optimal policy with essentially the same structure as
in Theorem 1. This optimal policy is a state-dependent base-stock policy in which the y-dependent
base-stock level increases by at most one unit if an additional order is announced or if the due
date of an announced order is updated. It is again also optimal to produce whenever there is a
backorder.
4.2 Single Order Updating
Here, there can be at most one order in stages i = 2, . . . , k. The state space is S̄ := Z × Z̄^k, where Z̄^k := {y : y_1 ∈ Z_+, y_i ∈ {0, 1} for i = 2, . . . , k, Σ_{i=2}^k y_i ≤ 1}. The state of the system at time t is determined by the net inventory X(t), and the number of announced orders Ȳ_i(t) in stage i; i = 1, . . . , k. The quantity Ȳ_1(t) denotes the number of orders that have been announced but whose due date has not yet been updated. The cost function is (1).

Let V̄ be the set of real-valued functions on S̄ and define T̄ : V̄ → V̄ as follows:

T̄ v(x, y) := γ^{-1} [ c(x) + λ T̄_λ v(x, y) + Σ_{i=1}^k ν_i T̄_i v(x, y) + µ T̄_µ v(x, y) ],
5 Systems with a Single Order Due Date

In this section, we consider a system where only one order is announced at a time and its due date is
progressively updated. The time between the ith and (i + 1)th updates is once again exponentially
distributed with mean νi−1 . Whenever an order becomes due, a new order is announced. Hence,
at any time, there is exactly one announced order in the system, and the time between consecutive
order due dates (and consecutive order announcements) has a phase-type distribution with k phases.
We assume the supplier has the ability to observe the phase of the arrival process. To use a
queueing analogy, the demand leadtime system can be viewed as a closed queueing network in which
exactly one customer repeatedly circulates through a series of single servers each with exponentially
distributed service times.
An example of this type of ADI is a setting where the supplier has a single customer in the form
of a retailer who faces a Poisson demand process with rate ν and uses a (Q, r) policy when placing
orders with the supplier (here, k = Q and νi = ν for all i). That is, the retailer places an order of
size Q whenever its own inventory position (sum of inventory on order and inventory on hand less
backorders) reaches r. Here, the retailer’s inventory position takes on values Q+r, Q+r−1, . . . , r+1
with the transition time between consecutive values being exponentially distributed with rate ν.
If the supplier has access to the information about inventory position at the retailer, then the
supplier can use inventory position information to update the expected time at which the retailer
will eventually place an order. From the perspective of the supplier there can be only one announced
order at a time and the time between updates (there is a total of Q updates) is exponentially
distributed with rate ν. In this setting, one “unit” for the supplier is an order of size Q from
the retailer and the service time for this unit is exponential with rate $\mu$. Here, the time between
consecutive due dates has an Erlang distribution with $Q$ phases, each with mean $\nu^{-1}$, so that the
mean time between due dates is $Q\nu^{-1}$.
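For intuition only, the mapping from the retailer's ordering behavior to the supplier's interarrival distribution can be illustrated with a small simulation. The sketch below is not part of the model formulation; Q, ν, and the sample size are arbitrary placeholder values.

    import random

    def retailer_order_cycle(Q, nu, rng):
        # One order cycle as seen by the supplier: the retailer's inventory position
        # drops from Q+r to r through Q exponential stages, each with rate nu.
        # Returns the time until the next order of size Q is placed.
        return sum(rng.expovariate(nu) for _ in range(Q))

    rng = random.Random(0)
    Q, nu = 5, 0.5  # hypothetical values chosen for illustration
    samples = [retailer_order_cycle(Q, nu, rng) for _ in range(100_000)]
    mean = sum(samples) / len(samples)
    print(f"mean time between orders ~ {mean:.2f} (theory: Q/nu = {Q/nu:.2f})")

The sample mean is close to Q/ν, consistent with the Erlang interarrival distribution described above.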
We formulate the problem as a continuous-time Markov decision process with state space $\tilde{S} := \mathbb{Z} \times \tilde{\mathcal{Z}}^k$,
where $\tilde{\mathcal{Z}}^k := \{e_i : i = 1, \ldots, k\}$. Each element of the state space again describes the
supplier's net inventory on hand and the number of announced orders in each stage of update.
Let $\tilde{V}$ be the set of real-valued functions on $\tilde{S}$. Let $\tilde{\gamma} := \beta + \sum_{i=1}^{k} \nu_i + \mu$ and define $\tilde{T}_i : \tilde{V} \to \tilde{V}$
as follows:
$$\tilde{T}_i v(x, y) := \begin{cases} v(x, y + e_{i+1} - e_i) & \text{if } i = 1, \ldots, k-1;\ y = e_i, \\ v(x - 1, y + e_1 - e_k) & \text{if } i = k;\ y = e_k, \\ v(x, y) & \text{otherwise.} \end{cases}$$
Then, the optimality equation is given by $v = \tilde{T}v$, where the operator $\tilde{T} : \tilde{V} \to \tilde{V}$ is defined by
$$\tilde{T}v(x, y) := \tilde{\gamma}^{-1}\Big[ c(x) + \sum_{i=1}^{k} \nu_i \tilde{T}_i v(x, y) + \mu \tilde{T}_\mu v(x, y) \Big],$$
with $\tilde{T}_\mu := T_\mu$.
As in previous sections, it can be shown that there exists an optimal state-dependent base-stock
policy with properties similar to those detailed earlier. We omit the details.
6 Numerical Results
In this section, we present results from a numerical study. The goal is to examine the benefits from
using ADI and to compare the impact of having full versus partial ADI. In addition, we describe a
class of simple heuristic policies and compare their performance to that of an optimal policy. For
brevity, we limit our discussion to systems with IDD.
In all cases we set ρ = λ/µ < 1, so that Theorem 2 applies. In all examples, the holding-cost
rate is h = 10 and the service rate is µ = 1, unless stated otherwise. We used m = 300, and for
the purpose of performing numerical calculations, we restricted the net inventory x to be between
−100 and 100. This value of m ensures that the fraction of demand that is rejected is practically
insignificant. (With finite m, the fraction of arrivals that are rejected is given by the well-known
Erlang loss formula. Using the formula, we found that across all cases considered, the fraction of
rejected arrivals was no higher than roughly $10^{-68}$.) For each problem instance, we obtained the
long-run average cost by solving the MDP using value iteration. The value iteration algorithm was
terminated once the average cost was correct to four decimal places (see Section 8.5 of Puterman
1994). We use average cost instead of discounted cost, because average cost is independent of the
initial state and the discount factor. Average cost is also a more widely used performance measure
in practice.
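The claim that rejections are negligible is easy to verify: the Erlang loss probability can be evaluated with the standard recursion. The sketch below assumes the single-stage case (k = 1), for which the offered load of the demand leadtime system is λ/ν; the specific numbers are illustrative.

    def erlang_b(m, offered_load):
        # Erlang loss probability B(m, a) via the standard recursion.
        b = 1.0
        for n in range(1, m + 1):
            b = offered_load * b / (n + offered_load * b)
        return b

    # Illustration: lambda = 0.8, nu = 0.01 gives offered load 80, far below m = 300.
    lam, nu, m = 0.8, 0.01, 300
    print(erlang_b(m, lam / nu))  # an astronomically small rejection probability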
To assess the benefit of ADI, we compare the optimal average cost, JA , for a system with ADI to
the optimal average cost, JN , for a system with no ADI and obtain the percentage cost reduction
PCR := 100 × (JN − JA )/JN . The two systems are identical in all respects, except that for the
system with no ADI, orders are not announced ahead of their due dates. In other words, information
about when orders enter the demand leadtime system and when they move from one stage to the
next is withheld. Only departures from the last stage of the demand leadtime system are observed.
In general, the distribution of the departure process from the demand leadtime system is different
from the distribution of its arrival process, which is assumed to be a Poisson process with rate λ.
However, the arrival and departure processes in steady state do have identical (Poisson with rate
λ) distributions for systems with IDD and no bound m. This observation follows from the fact that
the departure process from an M/G/∞ queue in steady state is a homogeneous Poisson process
with the same rate as the exogenous input Poisson process. For systems with IDD with finite m,
the departure process in steady state is closely approximated by a Poisson process when m is large
and the fraction of rejected orders is small (see discussion in the previous paragraph).
Long-run average cost is unaffected by the transient behavior of the demand leadtime system. As
the demand leadtime system approaches steady state, its departure process converges in distribution
to a Poisson process with rate λ. Therefore, we may, for the purpose of computing long-run average
cost for the system without ADI, assume the arrival process to the system with no ADI is a Poisson
process with rate λ. Finally, we note that there is an optimal policy for a system with no ADI
with Poisson arrivals and exponential production times that is a base-stock policy with a fixed
base-stock level; see, e.g., Veatch and Wein (1996).
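As a cross-check on the no-ADI benchmark (and not the value-iteration procedure used in the study), JN can also be obtained from the standard M/M/1 make-to-stock formulas: the number of outstanding orders is geometrically distributed, so the average cost of a fixed base-stock level has a closed form. The sketch below uses illustrative parameter values.

    def avg_cost_no_adi(s, lam, mu, h, b):
        # Long-run average cost of fixed base-stock level s in an M/M/1 make-to-stock
        # queue; outstanding orders N satisfy P(N = n) = (1 - rho) * rho**n.
        rho = lam / mu
        expected_backorders = rho ** (s + 1) / (1 - rho)
        expected_inventory = s - rho * (1 - rho ** s) / (1 - rho)
        return h * expected_inventory + b * expected_backorders

    def optimal_no_adi(lam, mu, h, b, s_max=500):
        return min((avg_cost_no_adi(s, lam, mu, h, b), s) for s in range(s_max + 1))

    # With h = 10, mu = 1, b = 100, lambda = 0.8 this gives about 107 at s = 10,
    # close to the no-ADI average cost reported for lambda = 0.8 in Table 2.
    print(optimal_no_adi(lam=0.8, mu=1.0, h=10, b=100))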
Representative numerical results comparing systems with and without ADI are shown in Table 1
and Figures 2 and 3, where the PCR is shown for varying values of parameters ν, λ, and b. The
results are shown for a system with a single stage (i.e., k = 1). The effect of multiple updating
stages is discussed separately in Sections 6.2 and 6.4.
                                                     ν1
           λ      0.01    0.02    0.05    0.1     0.2     0.5      1       2       5      10
           0.1    0.00    0.00    0.00    0.02    0.37    3.94    8.07   17.84   12.60    7.53
           0.2    0.00    0.00    0.06    0.64    2.36    8.39   14.11   18.41   11.74    6.84
  b = 10   0.4    0.46    1.62    4.79    8.47   12.69   16.17   20.93   17.70    9.60    5.35
           0.6    0.84    2.45    6.13    9.46   12.03   12.82    9.81    1.94    1.38    0.81
           0.8    8.52    8.47   10.54   10.61    9.06    5.87    3.26    2.02    0.54    0.04
           0.9    9.02    9.25    7.58    5.37    3.03    1.21    1.17    1.03    0.99    0.92
           0.1    0.36    1.96    8.01   16.33   25.47   44.92   40.43   28.71   14.76    8.12
           0.2    3.06    6.18   12.88   20.44   28.61   38.63   28.61   13.15    0.38    0.29
  b = 50   0.4    5.16    7.59   12.43   17.59   22.53   17.66   15.30   10.69    5.17    2.74
           0.6    4.24    7.58   12.88   15.46   14.77    7.77    4.75    1.51    0.86    0.47
           0.8   11.20   13.78   13.29    9.87    5.96    2.60    1.31    0.90    0.38    0.16
           0.9   11.73   12.78    5.30    2.74    1.14    0.44    0.23    0.20    0.13    0.02
           0.1    8.93   14.28   23.84   32.73   49.49   51.82   39.08   23.15    6.63    0.05
           0.2    0.13    1.08    6.14   13.96   25.43   18.72    6.63    5.90    3.11    1.55
  b = 100  0.4    2.08    4.78   10.96   16.58   19.37   15.63    5.05    4.07    2.06    1.04
           0.6    5.85    9.72   15.17   16.82   13.96    6.95    3.14    1.89    0.86    0.52
           0.8   12.93   14.98   12.63    8.04    4.42    1.64    0.83    0.46    0.24    0.13
           0.9    9.85   10.09    6.98    4.01    2.03    0.41    0.23    0.20    0.12    0.03
Table 1: The percentage cost reduction (PCR) for a system with IDD (k = 1).

Figure 2: The effect of mean demand leadtime 1/ν on PCR for a system with IDD and k = 1, b = 100. Curves are shown for λ = 0.4, 0.6, 0.8, and 0.9.

Figure 3: The effect of the ratio b/h on PCR for a system with IDD and k = 1, λ = 0.8. Curves are shown for ν = 0.2, 0.5, and 1.
The effect of ν on PCR, when all other parameter values are fixed, is not monotonic, with
PCR initially increasing in the mean demand leadtime 1/ν and then decreasing. ADI offers the
greatest benefit in terms of PCR when the size of 1/ν is moderate. The percentage cost reduction
is relatively small when either 1/ν is very large or very small. This can be explained as follows.
When 1/ν is small, the mean time between when an order is announced and when it becomes due is
small. Hence, the information is of little use. When 1/ν is large, the mean time between an order’s
announcement and due date is long, but so is the variance. This makes the information about future
demand relatively less useful. The meanings of “large,” “small,” and “moderate” 1/ν depend upon
the value of λ. Although the joint effect of λ, µ, and ν on PCR is complicated, it appears that the
value of 1/ν that maximizes PCR for a given λ is increasing in λ. The largest value of 1/ν shown
in Figure 2 and Table 1 is 1/ν = 100; however, computations for larger values bear out the claim
that the relative benefit of ADI is small for large 1/ν. For instance, with b = 100, λ = 0.8, and
1/ν = 500, we find that PCR is 3.22. These results highlight an important insight: having earlier
notice of future orders may not always be desirable since the quality of this information tends
also to deteriorate (i.e., the variance in the demand leadtime increases when the average demand
leadtime increases). In our model, this is due to the fact that demand leadtime is assumed to have
the exponential distribution. However, this also captures the fact that in practice the earlier an
order is announced the less reliable will be the estimate of its due date (see Section 6.3 for further
results and discussion for systems with multiple stages of updating).
For each fixed ν, the effect of λ on PCR is also not monotonic. In general, for each fixed ν,
the relative benefit of ADI is small when λ is very large (close to µ = 1). When λ is large, the
optimal policy with or without ADI is for the production facility to produce most of the time.
Hence, the availability of ADI makes little difference for the decisions taken. When λ is small, the
absolute cost reduction from ADI is small, because costs in the systems with and without ADI both
approach zero as λ ↓ 0. However, for small λ we have observed different behaviors of PCR (which
measures relative cost reduction) depending upon whether or not ν is smaller than ν ∗ (µ) := hµ/b.
In particular, when ν is smaller than ν ∗ (µ) then PCR approaches zero as λ approaches zero. On
the other hand, when ν is larger than ν ∗ (µ) then
$$\text{PCR} \approx 100 \times \frac{C_1 - C_2}{C_1} = 100 \times \frac{b\mu\nu - h\mu^2}{b\mu\nu + b\nu^2} \qquad (11)$$
for λ near zero. (We have no formal proof of the preceding observation, but our numerical exper-
iments have borne this out. In the appendix we provide an informal analytical explanation.) We
see close agreement between these approximations and the exact values of PCR shown in Table 1
for λ = 0.1. For instance, when ν = 1, µ = 1, h = 10, and b = 50 the expression on the right-hand
side of (11) is 40. For ν = 1, λ = 0.1, and b = 50 the table shows the exact (to two decimal places)
value of PCR to be 40.43. The approximation is even closer for smaller values of λ (not shown).
Note also that the entries in Table 1 with PCR of 0.00 have ν < ν*(µ).
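The right-hand side of (11) is trivial to evaluate; the snippet below reproduces the ν = 1, µ = 1, h = 10, b = 50 example cited above (values chosen from that example, not new results).

    def pcr_small_lambda(nu, mu, h, b):
        # Small-lambda approximation (11); meaningful when nu exceeds nu*(mu) = h*mu/b.
        return 100.0 * (b * mu * nu - h * mu ** 2) / (b * mu * nu + b * nu ** 2)

    print(pcr_small_lambda(nu=1.0, mu=1.0, h=10, b=50))  # 40.0, versus the exact 40.43 in Table 1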
The effect of the ratio b/h is also not monotonic, with the value of PCR relatively small when
b/h is either very small or very large. When the ratio b/h is small, ignoring ADI and producing in a
make-to-order fashion (i.e., holding little or no inventory in anticipation of future demand) carries
a relatively small penalty. When the ratio b/h is large, the base-stock levels are high for systems
both with and without ADI, and the probability of backorders is relatively small in both systems.
Hence, ADI becomes relatively less useful.
In settings where the demand leadtime system consists of multiple stages, the supplier and the
customer often have a choice of how much information is shared. For example, should the customer
inform the supplier as soon as an order enters the first stage or wait until an order has sufficiently
progressed before forwarding the information to the supplier? Similarly, should the customer update
the supplier each time an order enters a new stage or should it wait until the order has passed a
specified number of stages? These questions are relevant when there is a cost associated with
collecting the information, transmitting it from one party to another, and then making decisions
based on it. To explore the benefit of full versus partial information sharing, we consider a system
where the demand leadtime has two stages (k = 2) and compare the performance of this system
when there is full ADI (information is shared as soon as orders enter the first stage and as they
leave one stage and enter the next) to its performance when there is no ADI and when there is
only partial ADI (information is shared only when orders enter the second stage). The systems
can be viewed as identical in all respects except for the number of update stages, with full ADI
corresponding to k = 2, partial ADI to k = 1, and no ADI to k = 0. In the system with full ADI,
the time spent in each stage is exponentially distributed with mean 1/ν. In the system with partial
ADI, the time between when an order is announced and when it becomes due is 1/ν.
Representative numerical results are displayed in Table 2, which not surprisingly shows that
full ADI is indeed superior. (A formal proof of this observation follows from the fact that any
policy for the system with partial or no ADI can be reproduced for the system with full ADI by
basing decisions in the latter only on the net inventory and the number of orders in the second
stage.) The value of full ADI is most significant when both ν and λ are in the mid-range, and least
significant when both ν and λ are either very small or very large, as in the upper left corner and
lower right corner of Table 2. This is consistent with results from Section 6.1. More significantly,
the incremental benefit from full ADI (k = 2) over partial ADI (k = 1) is typically small. This
suggests that partial ADI can be sufficient, especially if updating is expensive to implement. This
insight appears to be robust with respect to the number of stages. For example, in Section 6.4, we
consider the case of k > 2 when a heuristic policy is used instead of the optimal policy and we
observe a diminishing effect to the value of ADI as the number of stages for which ADI is available
increases.
We close this section by noting a subtle difference between the effect of increasing ADI by
increasing the number of stages observable to the supplier and increasing ADI by increasing the
length of a particular stage. For example, compare a system with k stages with each stage having
mean lead time 1/ν to a system with a single stage with mean k/ν. Both systems have the same
overall mean, k/ν, but the variance of the system with k stages is k/ν 2 while the one for the system
with a single stage is k 2 /ν 2 (i.e., k times larger). This helps explain why observing more stages
of the demand process is always beneficial, but increasing the average length of a particular stage
may not be.
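To make the comparison concrete, the short sketch below contrasts the two demand-leadtime variances for the same overall mean k/ν; the values of k and ν are arbitrary and used only for illustration.

    k, nu = 4, 0.1
    mean = k / nu                     # same mean for both schemes
    var_k_stages = k / nu ** 2        # k exponential stages, each with rate nu (Erlang)
    var_single_stage = (k / nu) ** 2  # one exponential stage with mean k/nu
    print(mean, var_k_stages, var_single_stage)  # 40.0, 400.0, 1600.0 -> k times larger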
               λ = 0.4                      λ = 0.6                      λ = 0.8
   ν       k = 0   k = 1   k = 2       k = 0   k = 1   k = 2       k = 0    k = 1    k = 2
  0.01     25.07   24.54   24.53       46.38   43.67   43.65      107.24    93.38    92.10
                   2.08%   2.14%               5.85%   5.90%               12.93%   14.12%
  0.02     25.07   23.87   23.85       46.38   41.87   41.74      107.24    91.18    87.81
                   4.78%   4.86%               9.72%  10.00%               14.98%   18.12%
  0.05     25.07   22.32   22.26       46.38   39.34   38.47      107.24    93.70    86.93
                  10.96%  11.21%              15.17%  17.06%               12.63%   18.94%
  0.10     25.07   20.91   19.80       46.38   38.58   36.60      107.24    98.62    91.50
                  16.58%  21.00%              16.82%  21.10%                8.04%   14.69%
  0.20     25.07   20.21   19.00       46.38   39.91   36.33      107.24   102.51    98.04
                  19.37%  24.20%              13.96%  21.68%                4.42%    8.58%
  1.00     25.07   23.80   20.62       46.38   44.93   43.05      107.24   106.35   105.48
                   5.05%  17.75%               3.14%   7.18%                0.83%    1.64%
  2.00     25.07   24.05   23.63       46.38   45.51   44.89      107.24   106.75   106.33
                   4.07%   5.74%               1.89%   3.22%                0.46%    0.85%
Table 2: Average cost and percentage cost reduction (PCR) for systems with IDD (b = 100).
The columns labeled “k = 0” show the average cost for systems without ADI.
In evaluating the benefit of ADI, we have so far taken the perspective of the supplier who manages
the production process. In this section, we consider the impact of ADI on the customer who provides
it. In particular, we address the question of whether or not both supplier and customer benefit from
sharing demand information. It is often argued that ADI can reduce costs of the supplier (which
we observed to be true) and improve the quality of service received by the customer. In our setting,
the latter assertion would mean that customers experience fewer backorders and shorter fulfillment
delays. In Figure 4, we show the breakdown of supplier cost in terms of inventory holding and
backorder cost for one set of examples. As we can see, ADI does not always reduce the backorder
costs. In some cases, the supplier uses ADI to reduce inventory holding cost at the expense of
backorder cost. Generally, whether backorder cost, holding cost, or both decrease (both cannot
increase) depends upon problem parameters. Hence, there is no guarantee that sharing ADI will
lead to improved service levels to the customers.
This, of course, raises the question of why a customer would be willing to provide ADI only to
see service levels suffer. One possible answer is that in practice customers who provide ADI also
require a contractual agreement that service levels be improved or, alternatively, that the penalties
for poor service be increased. For example, the customer could offer ADI, but simultaneously
increase the penalty for backorders. In Figure 5, for a system with a single stage, we illustrate the
impact on the cost of the supplier of having customers simultaneously provide ADI and increase
unit backorder costs. The figure shows the average cost for the system with no ADI and b = 50,
as well as the average cost for the system with ADI for different values of b. From the figure, it
can be seen that, for instance, if ν = 0.5 then the supplier is indifferent between operating without
ADI at b = 50 and operating with ADI at b roughly equal to 70. In other words, in exchange for
receiving ADI, the supplier is willing to accept up to a 40% increase in the backorder penalty rate.

Figure 4: Average holding and backorder costs, with and without ADI, as a function of the mean demand leadtime 1/ν.

Figure 5: Average cost with ADI as a function of the unit backorder cost rate b, for ν = 0.05, 0.2, and 0.5, compared with the average cost of a system without ADI and b = 50.
In settings where the state space is large, computing and storing an optimal policy is difficult due
to the well known “curse of dimensionality” of dynamic programming. This is a crucial issue when
the number of update stages k is large. To get a feel for how k affects computation times, for a
system with IDD and λ = 0.6, b = 100, νi = 0.1, m = 40, and |x| ≤ 70 it took about 5 seconds of
CPU time on an Intel Xeon 2.40 GHz processor for value iteration to converge for k = 1. For k = 2
the computations took roughly 23 minutes, and for k = 3 they took roughly 34 hours. We did not
run to completion for k = 4 for this example, but estimate that it would have taken months. There
is apparently no simple solution for these computational problems for general k, because the size of
the state space grows exponentially in k. The optimal policy is also cumbersome to communicate
and visualize when the number of update stages is large (it requires storing and displaying a multi-
dimensional look-up table). Hence, there is a need in some cases for alternatives to the optimal
policy in the form of easy to implement, yet reasonably effective, heuristics.
In this section, we introduce a class of such heuristics. The heuristics control production via a
threshold $r$ and a "weight" vector $\alpha = (\alpha_1, \ldots, \alpha_k)$, where $0 \le \alpha_i \le 1$. Using such a heuristic, the
decision is to produce if $x - \sum_{i=1}^{k} \alpha_i y_i < r$ and to idle otherwise (recall that $x$ is the net inventory
level and $y_i$ is the number of orders in stage $i$). As we mentioned in the introduction, we refer to these
heuristics as linear base-stock policies (LBPs) because the base-stock levels $s(y) = r + \sum_{i=1}^{k} \alpha_i y_i$
increase linearly in the number of announced orders. By restricting a subset of the weights to be
specific values (e.g., zero or one), different versions of the heuristics can be specified. The heuristic
is of course optimal when k = 0; in that case the optimal policy is a base-stock policy with a
fixed base-stock level. More generally, the heuristics mimic the optimal policy by specifying state-
dependent base-stock levels {s(y)} that are non-decreasing in the number of announced orders.
Because 0 ≤ αi ≤ 1, a unit increase in the number of orders announced leads to at most a unit
increase in the base-stock level. Moreover, by taking αi ≥ αj for i > j, base-stock levels are more
sensitive to increases in the number of orders in later stages of update. In short, the heuristics
preserve the properties of the optimal policy described in Theorem 2.
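In code, the LBP decision rule is a one-liner; the sketch below is a hypothetical illustration of the rule, with all parameter values chosen arbitrarily.

    def lbp_produce(x, y, alpha, r):
        # Linear base-stock policy: produce iff net inventory x is below the
        # state-dependent base-stock level s(y) = r + sum_i alpha_i * y_i.
        s_y = r + sum(a * n for a, n in zip(alpha, y))
        return x < s_y

    # Example: k = 2 stages, weights alpha = (0.4, 0.9), threshold r = 3
    print(lbp_produce(x=2, y=(1, 1), alpha=(0.4, 0.9), r=3))  # True: 2 < 4.3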
For small k, it is straightforward to evaluate the average cost for the LBP heuristic for a given
combination of r and α using the policy-evaluation version of value iteration. For large k, exact
computation of the average cost of an LBP policy becomes infeasible from a practical standpoint
(these are the same instances where obtaining an optimal policy is also computationally intractable).
However, for given α and r it is not difficult to evaluate the average cost associated with an LBP
policy using simulation, even for large k.
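One way to carry out such a simulation is to generate the continuous-time Markov chain event by event. The following is a minimal sketch under the IDD assumptions (Poisson announcements capped at m orders, independent exponential stage durations, exponential production); it is not the authors' implementation, and every numeric value in it is a placeholder.

    import random

    def simulate_lbp(lam, mu, nu, alpha, r, h=10.0, b=100.0, m=300, horizon=500_000.0, seed=1):
        # Estimate the long-run average cost of a linear base-stock policy for the
        # k-stage IDD model. y[i] = number of announced orders in update stage i+1.
        rng = random.Random(seed)
        k = len(nu)
        x = 0               # net inventory (negative values are backorders)
        y = [0] * k
        t, cost = 0.0, 0.0
        while t < horizon:
            produce = x < r + sum(a * n for a, n in zip(alpha, y))
            rates = [lam if sum(y) < m else 0.0]        # a new order is announced
            rates += [nu[i] * y[i] for i in range(k)]   # a stage-i order advances (IDD)
            rates += [mu if produce else 0.0]           # a unit completes production
            total = sum(rates)
            dt = rng.expovariate(total)
            cost += dt * (h * max(x, 0) + b * max(-x, 0))  # accrue c(x) = h x+ + b x-
            t += dt
            u, event = rng.random() * total, 0
            while event < len(rates) - 1 and u >= rates[event]:
                u -= rates[event]
                event += 1
            if event == 0:
                y[0] += 1                               # announcement enters stage 1
            elif event <= k:
                i = event - 1
                y[i] -= 1
                if i + 1 < k:
                    y[i + 1] += 1                       # order moves to the next stage
                else:
                    x -= 1                              # order becomes due: demand occurs
            else:
                x += 1                                  # production completion
        return cost / t

    # Hypothetical illustration: k = 2, lambda = 0.6, mu = 1, nu_i = 0.1, alpha = (0.2, 0.2), r = 1.
    print(simulate_lbp(lam=0.6, mu=1.0, nu=(0.1, 0.1), alpha=(0.2, 0.2), r=1))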
Numerical results comparing the performance of an optimal policy and that of the LBP heuristic
are shown in Tables 3 and 4 for systems with k = 1 and k = 2 respectively. The base-stock level r
and weight vector α were obtained using a heuristic search. Therefore, there may be better values
of r and α than those displayed. As we can see, LBP performs well for the range of parameters
tested. Across all examples with k = 1 or k = 2, the difference between the average cost of an
optimal policy and the average cost of the heuristic is no greater than 7%, suggesting that LBPs
are effective. Unfortunately, it is difficult to evaluate the performance gap between optimal policies
and LBPs for k > 2 because the state space is too large. Nevertheless, it is possible to implement
an LBP heuristic in such instances (see related discussion later in this section); we are just not able
to evaluate how close it is to optimal.

Table 3: The differences between the average cost of an LBP and the optimal average cost, as a percentage of the optimal average cost, for systems with IDD and k = 1 (columns correspond to ν1 = 0.01, 0.05, 0.10, 0.20, 0.50, 1.00, 2.00; rows correspond to λ). The numbers in parentheses are (α1, r).

           LBP                                        LBP-S                            SSOP
 ν1 = ν2   λ=0.4         λ=0.6         λ=0.8          λ=0.4     λ=0.6     λ=0.8        λ=0.4   λ=0.6   λ=0.8
  0.01     2.88          1.64          3.51           3.10      1.64      4.51         0.04    0.05    1.39
           (0.1,0.1,-5)  (0,0.1,-2)    (0.1,0.2,-12)  (0.1,-2)  (0.1,-2)  (0.2,-7)
  0.02     3.50          3.56          4.30           4.92      4.99      4.66         0.08    0.31    3.84
           (0.1,0.1,-2)  (0.1,0.2,-5)  (0.1,0.4,-10)  (0.1,-1)  (0.1,0)   (0.5,-11)
  0.05     1.03          1.94          6.02           1.29      2.81      8.84         0.28    2.27    7.78
           (0.1,0.2,-1)  (0.1,0.4,-3)  (0.1,0.5,0)    (0.2,0)   (0.4,-2)  (0.9,-6)
  0.10     6.01          3.79          6.39           8.96      6.46      7.83         7.78    5.42    7.79
           (0.1,0.3,0)   (0.2,0.2,1)   (0.4,0.4,3)    (0.4,0)   (0.7,-1)  (0.9,2)
  0.20     4.94          2.90          3.11           6.88      9.91      4.56         6.37    9.87    4.56
           (0.1,0.5,0)   (0.5,0.5,0)   (0.5,0.5,6)    (0.6,0)   (0.8,1)   (1,6)
  1.00     0.47          0.73          0.74           15.61     4.36      1.11         15.43   4.36    0.83
           (0.6,0.6,1)   (0.5,0.5,3)   (0.1,0.1,9)    (0.5,2)   (1,3)     (0.4,10)
  2.00     0.14          0.11          0.32           1.76      1.37      0.44         1.76    1.37    0.39
           (0.6,0.6,1)   (0.6,0.6,3)   (0.4,0.4,9)    (1,2)     (1,5)     (1,10)

Table 4: The differences between the average cost of LBP, LBP-S, and SSOP and the optimal average cost, as a percentage of the optimal average cost, for systems with IDD and k = 2. The values in parentheses under LBP are (α1, α2, r). The values in parentheses under LBP-S are (α2, r). All cases use b = 100.
Observe that in Table 3 with k = 1, the value of the weight is chosen to be 1 for higher values
of ν. This reflects the fact that when ν is high, the mean demand leadtime is small, and therefore
it is desirable to essentially factor announced orders directly into the inventory position x, which is
then compared to the threshold r. On the other hand, for low values of ν, the weight is chosen to
be small (e.g., α1 = 0.1, which was the smallest strictly positive value considered in the heuristic
search). This is also consistent with our prior observation that ADI is less valuable when the mean
demand leadtime is very long. In Table 4 we see that for higher values of νi the best LBP policy
(as found by the heuristic search) gives even weights to both y1 and y2 , consistent with what we
saw in part (b) of Figure 1. In addition, for the larger νi , Table 4 shows that the LBP policy yields
a cost that is only slightly higher than does an optimal policy (although these are also the cases
where ADI is relatively less valuable). In such cases, optimal base-stock levels as in Theorem 2 lie
roughly on a plane (again, see part (b) of Figure 1) for k = 2. Hence, it is natural that LBPs will
perform well, because they control production using base-stock levels that lie on a plane for k = 2
(a hyperplane for larger k).
As discussed in Section 6.1, the incremental value of updating is typically small when an optimal
policy is followed. Therefore, including information from only one or two of the k stages may be
sufficient. Our computational experiments support a similar conclusion when the system is operated
under an LBP. In Table 4, for a system with k = 2, we compare the LBP heuristic with a version
of the heuristic that sets α1 = 0 (i.e., only information from stage 2 is used). Under this simplified
heuristic, the action is to produce if x − α2 y2 < r and to idle otherwise. We refer to this simplified
heuristic as LBP-S. As we can see, the percentage difference in cost between LBP and LBP-S is
relatively small. In Table 4, we also include the performance of an optimal policy for k = 1 that
ignores all but the orders in the last stage of update. We refer to this policy as the single-stage
optimal policy (SSOP). The performance of all three heuristics is somewhat comparable. Of course
the cost of the SSOP is always lower than that of LBP-S. However, LBP-S is simpler because it
requires storage of just two parameters.
In Table 5, we compare the performance of different versions of the LBP heuristic when the number of stages is varied. A common feature of the versions we consider is simplicity of implementation. In all cases, the heuristics are characterized by either one or two parameters. In the first heuristic, LBP-Sum, we set $\alpha_i = 1$ for $i = 1, \ldots, k$, so that we produce if $x - \sum_{i=1}^{k} y_i < r$ and we do not produce otherwise. There is a single parameter for which a search is needed in this case: the parameter $r$. In the second heuristic, LBP-G (G is for geometric), we set $\alpha_i = \alpha^{k-i+1}$, where $0 \le \alpha \le 1$, so that we produce if $x - \sum_{i=1}^{k} \alpha^{k-i+1} y_i < r$. In this case, there are two parameters that need to be determined: $\alpha$ and $r$. A special case is $\alpha = 1$, in which case LBP-G reduces to LBP-Sum. The third heuristic is LBP-S described above, where we set $\alpha_i = 0$ for $i \ne k$ and optimize over $\alpha_k$ and $r$. We also consider a simplified version of LBP-S in which we fix $\alpha_k = 1$ and search only over $r$. We call this version LBP-1.
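The four variants differ only in how the weight vector is pinned down. The sketch below simply spells out the corresponding weight constructions; k and α are placeholder arguments.

    def weights_lbp_sum(k):           # LBP-Sum: alpha_i = 1 for all i
        return [1.0] * k

    def weights_lbp_g(k, alpha):      # LBP-G: alpha_i = alpha**(k - i + 1), 0 <= alpha <= 1
        return [alpha ** (k - i + 1) for i in range(1, k + 1)]

    def weights_lbp_s(k, alpha_k):    # LBP-S: only the last stage is used
        return [0.0] * (k - 1) + [alpha_k]

    def weights_lbp_1(k):             # LBP-1: alpha_k = 1, all other weights zero
        return [0.0] * (k - 1) + [1.0]

    print(weights_lbp_g(4, 0.5))      # [0.0625, 0.125, 0.25, 0.5]: later stages weigh more

Note that α = 1 makes LBP-G coincide with LBP-Sum, and that the LBP-G weights satisfy the property that later update stages receive larger weights.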
As we can see, all four heuristics typically lead to improvements over the case of no ADI. This
shows that ADI can still be beneficial, even when using simple control policies. The data also
suggest that while updating is beneficial, its incremental value indeed diminishes (under policies
LBP-Sum and LBP-G) as the number of update stages increases. In fact, the table shows that in
some examples the performance of LBP-Sum can get worse as k increases. This occurs because it is
not desirable to place a high weight (of 1) on orders that are not close to being due. This does not
occur (at least for the examples in the table) when using the slightly more sophisticated LBP-G.
On the other hand, the comparison between LBP-S and LBP-G shows that there is some value in
accounting for the demand information from the first stages.

Table 5: The average cost of LBPs for systems with IDD and λ = 0.8, νi = 0.1 (columns: k, No ADI, LBP-Sum, LBP-G, LBP-S, LBP-1). In parentheses under LBP-Sum and LBP-1 [respectively, LBP-G, LBP-S] are values of r [resp., (α, r), (αk, r)]. Values for k ≥ 3 are based upon simulation estimates.
7 Concluding Comments
In this paper, we have considered a production-inventory system where the production facility has
access to ADI in the form of advance announcements and subsequent updates. This ADI is not
perfect because (a) customers may request an order prior to or later than the announced expected
due date, (b) the time between due date updates is random, and (c) announced orders may be
canceled prior to becoming due. Given the current inventory level and the number of announced
orders at various stages of update, the production facility is faced with the decision of whether
or not to produce. We considered several schemes through which demand information is revealed
and updated. For each scheme, we formulated the problem as a continuous-time Markov decision
process and showed that there is an optimal state-dependent base-stock policy, with base-stock
levels that are non-decreasing in the number of announced orders at each stage of update. We also
showed that the base-stock level increases by at most one unit with a unit increase in the number
of announced orders at any stage.
In numerical experiments, we observed that the cost reduction to the supplier from the intro-
duction of ADI is sensitive to system parameter values such as expected demand leadtime, arrival
rate of orders, and the ratio of backorder cost to holding cost. In particular, there appear to be val-
ues of these parameters for which ADI is most valuable. In addition, the numerical results suggest
that most of the benefit of full ADI (access to information about the status of orders in each stage
of the demand leadtime system) can be achieved with partial ADI (access to information about
only the last stage(s) of the demand leadtime system). Although ADI is always beneficial to the
supplier, we observed that this may not be the case for the customers who provide the ADI. In some
cases, the supplier uses ADI to reduce inventory at the expense of higher backorders. We showed
that a possible remedy is for the customers to negotiate higher backorder penalties in exchange for
ADI. Finally, we introduced a simple and easy to implement family of heuristics, linear base-stock
policies. Numerical results show that these heuristics can be nearly as effective as an optimal policy.
There are several potential avenues for future research. It would be of interest to consider
systems where order sizes are variable and where the actual number of units in each order is not
exactly known until the order becomes due. This would generalize the model with order cancelation
by assigning a probability distribution to order sizes that allows for sizes other than zero or one.
It would also be of interest to consider systems with multiple customer classes, where customers
are not homogeneous but instead vary in their demand rates, demand leadtimes, update frequency,
or backorder costs. Although the problem becomes more difficult to solve because of the multi-
dimensionality of the state space, we expect it will still be possible to determine the structure of an
optimal policy. In particular, we expect the production policy to remain a state-dependent base-
stock policy, but the state of the system would include backorder levels from each customer class.
In addition to production, the optimal policy would specify whether an order from a customer
that becomes due should be satisfied from available inventory, if there is any, or backordered. This
decision would of course depend on the backorder cost associated with the customer class. Finally,
it would be interesting to relax the exponential assumption associated with order inter-arrival
times, production times, and update durations. For example, it may be possible to substitute
the exponential distribution with phase-type distributions which are useful in approximating other
distributions. The use of phase-type distributions retains the Markovian property of the system
and continues to allow the formulation of the control problem as an MDP.
Acknowledgment
We would like to thank Yves Dallery, Jean-Philippe Gayon, and Francis de Véricourt for many
useful discussions.
Appendix
$$T v(x, y) = \gamma^{-1}\Big[ c(x) + \lambda T_\lambda v(x, y) + \sum_{i=1}^{k} \nu_i T_i v(x, y) + \mu T_\mu v(x, y) \Big].$$
(i) For operator $T_\lambda$ we need to show that $\Delta T_\lambda v(x, y+e_j) \le \Delta T_\lambda v(x+1, y+e_l)$ for $j = 0, \ldots, k-1$
and $l = j+1, \ldots, k$. If $\sum_{i=1}^{k} y_i < m-1$, then
$$\le \Delta v(x+1, y+e_l+e_1) = \Delta T_\lambda v(x+1, y+e_l).$$
If $\sum_{i=1}^{k} y_i = m-1$, then
$$\le \Delta v(x+1, y+e_l) = \Delta T_\lambda v(x+1, y+e_l).$$
The inequalities follow from the fact that $v$ satisfies condition (C2).
(ii) For operator Ti , we need to show that ∆Ti v(x, y+ej ) ≤ ∆Ti v(x+1, y+el ) for j = 0, . . . , k −1
and l = j + 1, . . . , k. For i = 1, . . . , k − 1, let J = I{i=j} and L = I{i=l} . When (J, L) ∈
{(0, 0), (0, 1)} the inequalities (12) and (13) below follow from the fact that v satisfies con-
dition (C2). For (J, L) = (1, 0), the inequalities follow from the fact that v satisfies condi-
tions (C2) and (C3).
If yi ≥ 1, we have
≤ 0. (12)
If yi = 0, we have
≤ 0. (13)
Now we consider operator Tk . Let I = I{l=k} . In both cases below (yk ≥ 1 and yk = 0), the
inequalities follow from the fact that v satisfies conditions (C2) and (C3).
If yk ≥ 1, we have
= yk ∆v(x − 1, y + ej − ek ) + (m − yk )∆v(x, y + ej )
≤ 0.
For yk = 0, we have
≤ 0.
(iii) To verify Tµ v satisfies condition (C2), let x∗y+ej := min{x : ∆v(x, y + ej ) ≥ 0} and x∗y+el :=
min{x : ∆v(x, y + el ) ≥ 0}. By condition (C3) and the definition of x∗y+el , we have
∆v(x∗y+el , y + ej ) ≥ ∆v(x∗y+el , y + el ) ≥ 0. This implies x∗y+el ≥ x∗y+ej . Also, by condi-
tion (C2) and the definition of x∗y+ej , we have ∆v(x∗y+ej + 1, y + el ) ≥ ∆v(x∗y+ej , y + ej ) ≥ 0.
This implies x∗y+ej + 1 ≥ x∗y+el . Therefore, x∗y+ej ≤ x∗y+el ≤ x∗y+ej + 1 and consequently
x∗y+el is equal to either x∗y+ej or x∗y+ej + 1.
Condition (C3):
(i) For operator Tλ it is straightforward to check that Tλ v satisfies condition (C3) when v ∈ U .
(ii) For operator Ti , we need to show that ∆Ti v(x, y+ej+1 ) ≤ ∆Ti v(x, y+ej ) for j = 0, . . . , k −1.
Consider first i = 1, . . . , k − 1, and let J = I{i=j} and K = I{i=j+1} . The three possible
combinations of J and K are (J, K) ∈ {(0, 0), (0, 1), (1, 0)}. The inequalities (14) and (15)
below follow from the fact that v satisfies condition (C3).
If yi ≥ 1, we have
≤ 0. (14)
If yi = 0, we have
≤ 0. (15)
Now we consider operator Tk . Let I = I{j=k−1} . The inequalities (16) and (17) below follow
from the fact that v satisfies conditions (C2) and (C3).
If yk ≥ 1, we have
≤ 0. (16)
If yk = 0, we have
≤ 0. (17)
(iii) For Tµ , we need to show that ∆Tµ v(x, y + ej+1 ) ≤ ∆Tµ v(x, y + ej ) for j = 0, . . . , k − 1.
In part (iii) of the argument for Condition (C2), we showed that x∗y+ej+1 is either x∗y+ej or
x∗y+ej + 1.
(b) x = x∗y+ej − 1 :
∆Tµ v(x, y + ej+1 ) = ∆Tµ v(x, y + ej ) = 0.
(c) x = x∗y+ej :
Condition (C4):
It is easy to verify that if v ∈ U , then Tλ v and Ti v; i = 1, . . . , k satisfy condition (C4). For Tµ ,
when x < 0, we have:
Therefore,
Tµ v(x + 1, y) ≤ min{v(x + 1, y), v(x, y)} = Tµ v(x, y).
where $v_0$ is the function that is identically zero on $S_m$. It is simple to show by induction that
$$0 \le T^n v_0(x, y) \le \frac{b+h}{\gamma} \sum_{j=0}^{n-1} (|x| + j)\, \alpha^j,$$
where $\alpha := \gamma^{-1}\big(\lambda + \mu + m \sum_{i=1}^{k} \nu_i\big) \in [0, 1)$. Hence, we have $0 \le v_m^*(x, y) = \lim_{n\to\infty} T^n v_0(x, y) < \infty$.
Therefore, $v_m^*$ is a real-valued function on $S_m$ (i.e., $v_m^* \in V_m$).
where operator $P : V_R \to V_R$ is defined by $P f(x, y) := \sum_{(x', y') \in R} p_{(x,y),(x',y')}(\varpi(x, y))\, f(x', y')$.

Define $f(x, y) := -x + \sum_{i=1}^{k} y_i$, $H := G$, and $\epsilon := (\mu - \lambda)/\Lambda$. Note that the condition $\lambda < \mu$ ensures
that $\epsilon > 0$. Let $\bar{y} := \sum_{i=1}^{k} y_i$, $L = I_{\{\bar{y} < m\}}$, and $I_i = I_{\{y_i \ge 1\}}$ for $i = 1, \ldots, k$. For $(x, y) \in R \setminus H$, we
have
$$\begin{aligned}
P f(x, y) &= \frac{\mu}{\Lambda} f(x+1, y) + \frac{\lambda}{\Lambda} f(x, y + L e_1) + \sum_{i=1}^{k-1} \frac{\nu_i y_i}{\Lambda} f(x, y + I_i(e_{i+1} - e_i)) \\
&\qquad + \frac{\nu_k y_k}{\Lambda} f(x - I_k, y - I_k e_k) + \sum_{i=1}^{k} \frac{\nu_i (m - y_i)}{\Lambda} f(x, y) \\
&\le \frac{\mu}{\Lambda}(-x - 1 + \bar{y}) + \frac{\lambda}{\Lambda}(-x + \bar{y} + 1) + \sum_{i=1}^{k-1} \frac{\nu_i y_i}{\Lambda}(-x + \bar{y}) \\
&\qquad + \frac{\nu_k y_k}{\Lambda}(-x + \bar{y}) + \sum_{i=1}^{k} \frac{(m - y_i)\nu_i}{\Lambda}(-x + \bar{y}) \\
&= -x + \bar{y} - \frac{\mu}{\Lambda} + \frac{\lambda}{\Lambda} \\
&= f(x, y) - \epsilon.
\end{aligned}$$
Therefore (18) holds, and $R$ is positive recurrent. As mentioned above, this also yields $E^{\varpi}_{(x,y)} T < \infty$
for any state $(x, y) \in R$.
Now we show that the expected first passage cost of going from any state $(x, y) \in R$ to state
$(0, e_0)$ is finite; that is, $E^{\varpi}_{(x,y)} C < \infty$. The first step is to show that there exists a nonnegative
function $g \in V_R$ and a finite set $H \subset R$ with $(0, e_0) \in H$ such that $g(x, y) - P g(x, y) \ge c(x)$ for all $(x, y) \in R \setminus H$.

Define $g(x, y) := \theta \kappa^{-x + \bar{y}}$ and $H := G$. For $\kappa > 1$, $\theta > 0$, a calculation as above shows that for
$(x, y) \in R \setminus H$, we have
$$g(x, y) - P g(x, y) \ge \frac{\theta}{\Lambda} \kappa^{-x + \bar{y} - 1} \Big[ (\lambda + \mu)\kappa - \mu - \lambda \kappa^2 \Big]. \qquad (19)$$
For $\kappa \in (1, \mu/\lambda)$ the term in square brackets in (19) is strictly positive. Hence, for such $\kappa$ and with
$\theta$ large enough, the right-hand side of (19) exceeds $c(x) = h x^+ + b x^-$ for all $(x, y) \in R \setminus H$. The
assumption $\lambda < \mu$ ensures that $(1, \mu/\lambda)$ is non-empty. By Corollary C.2.4 of Sennott, the above
proves that $E^{\varpi}_{(0, e_0)} C < \infty$. Finally, by Proposition C.2.2(iv) of Sennott it follows that $E^{\varpi}_{(x,y)} C < \infty$.
Proof of Lemma 2. Fix $m$ and $(x, y) \in S_m$. For any policy $\pi \in \Pi$ we can obtain an upper bound
on $v_m^*(x, y)$ by computing the value of the policy $\pi$. Hence, to bound $v_m^*(x, y)$ from above, it suffices
to bound from above the value of using the policy $\pi^+$ that "always produces" [$\pi^+(x', y') = 1$ for
all $(x', y') \in S_m$].
To this end, we begin by developing an explicit construction of a version of the continuous-time
Markov chain (CTMC) induced by $\pi^+$. Suppose that $\{A_i : i = 1, 2, \ldots\}$ is an i.i.d. sequence of
uniform $[0, 1]$ random variables and that $\{E_i : i = 1, 2, \ldots\}$ is an i.i.d. sequence of exponential
random variables each with mean $\Lambda^{-1}$, independent of $\{A_i : i = 1, 2, \ldots\}$. Let $\hat{E}_0 := 0$, $\hat{E}_n := \sum_{j=1}^{n} E_j$
for $n = 1, 2, \ldots$, and $N(t) := \max\{n \ge 0 : \hat{E}_n \le t\}$.
Recall the notation $\bar{y}' = \sum_{i=1}^{k} y_i'$. Consider the function $f : S_m \times [0, 1] \to S_m$ given by
$$f((x', y'), a) := \begin{cases}
(x' + 1,\ y') & \text{if } a \in \big[0,\ \mu/\Lambda\big] \\
(x',\ y' + e_1 I_{\{\bar{y}' < m\}}) & \text{if } a \in \big(\mu/\Lambda,\ (\lambda + \mu)/\Lambda\big] \\
(x',\ y' + e_{i+1} - e_i) & \text{if } a \in \big((\lambda + \mu + \sum_{j=1}^{i-1} \nu_j y_j')/\Lambda,\ (\lambda + \mu + \sum_{j=1}^{i} \nu_j y_j')/\Lambda\big] \\
& \qquad \text{for } i = 1, \ldots, k-1 \\
(x' - 1,\ y' - e_k) & \text{if } a \in \big((\lambda + \mu + \sum_{j=1}^{k-1} \nu_j y_j')/\Lambda,\ (\lambda + \mu + \sum_{j=1}^{k} \nu_j y_j')/\Lambda\big] \\
(x',\ y') & \text{otherwise.}
\end{cases}$$
For the fixed value $(x, y) \in S_m$, define $(X_0, Y_0) := (x, y)$, $(X_n, Y_n) := f((X_{n-1}, Y_{n-1}), A_n)$ for $n = 1, 2, \ldots$,
and $(X(t), Y(t)) := (X_{N(t)}, Y_{N(t)})$ for $t \ge 0$.
Note that $\{(X(t), Y(t)) : t \ge 0\}$ has the distribution of the CTMC induced by $\pi^+$, as desired.
For $n = 1, 2, \ldots$ define
$$\begin{aligned}
R_n &:= \big|\{ j \in \{1, \ldots, n\} : (X_j, Y_j) = (X_{j-1} + 1,\ Y_{j-1}) \}\big|, \\
U_n &:= \big|\{ j \in \{1, \ldots, n\} : (X_j, Y_j) = (X_{j-1},\ Y_{j-1} + e_1) \}\big|, \\
D_n &:= \big|\{ j \in \{1, \ldots, n\} : (X_j, Y_j) = (X_{j-1} - 1,\ Y_{j-1} - e_k) \}\big|,
\end{aligned}$$
so that $X_n = X_0 + R_n - D_n$ and $\bar{Y}_n = \bar{Y}_0 + U_n - D_n$.
48
(Again, $\bar{Y}_n = \sum_{i=1}^{k} Y_{n,i}$.) For $n = 1, 2, \ldots$ also define
$$\tilde{U}_n := \big|\{ j \in \{1, \ldots, n\} : A_j \in \big(\mu/\Lambda,\ (\lambda + \mu)/\Lambda\big] \}\big|.$$
Observe that $U_n \le \tilde{U}_n$ and $D_n \le \bar{Y}_0 + U_n$.
Next we construct another process that is coupled with $\{(X(t), Y(t))\}$. Consider the function
$g : \mathbb{Z}_+ \times [0, 1] \to \mathbb{Z}_+$ given by
$$g(z, a) := \begin{cases} z + 1 & \text{if } a \in \big[0,\ (\lambda + \mu)/\Lambda\big] \\ z & \text{otherwise.} \end{cases}$$
Define $Z_0 := |x| + \bar{y}$, $Z_n := g(Z_{n-1}, A_n)$ for $n = 1, 2, \ldots$, and $Z(t) := Z_{N(t)}$. Observe that
$$\le |X_0| + R_n + \bar{Y}_0 + U_n \le |x| + R_n + \bar{y} + \tilde{U}_n = Z_n.$$
Hence, $|X(t)| \le Z(t)$ for all $t \ge 0$. (21)
Next, define $c^\dagger(x) := (h + b)x$. Note that $c^\dagger(\cdot)$ is increasing on $\mathbb{Z}_+$ and that $c(x) \le c^\dagger(|x|)$. It
now follows from (21) that
$$E^{\pi^+}_{(x,y)} \int_{t=0}^{\infty} e^{-\beta t} c(X(t))\,dt = E \int_{t=0}^{\infty} e^{-\beta t} c(X(t))\,dt \le E \int_{t=0}^{\infty} e^{-\beta t} c^\dagger(|X(t)|)\,dt \le E \int_{t=0}^{\infty} e^{-\beta t} c^\dagger(Z(t))\,dt,$$
where $E$ is expectation on the probability space upon which $\{A_i\}$ and $\{E_i\}$ are defined and where
the initial state is $(x, y)$. Hence, $E \int_{t=0}^{\infty} e^{-\beta t} c^\dagger(Z(t))\,dt$ is an upper bound on $v_m^*(x, y)$.
Regardless of $m$, the process $\{Z(t)\}$ has the following "dynamics": $Z(0) = |x| + \bar{y}$ and $Z(\cdot)$
remains in state (say) $z \in \mathbb{Z}_+$ an exponential amount of time with mean $1/(\lambda + \mu)$ before moving
to state $z + 1 \in \mathbb{Z}_+$. The latter fact can be verified by conditioning on the geometric number of
transitions made from $z$ back to $z$ by the embedded process $\{Z_n\}$. Direct calculations using value
iteration [to compute the expected discounted cost accrued by $\{Z(t)\}$ through the time of its (say)
$j$-th jump to the right] and induction show that
$$E \int_{t=0}^{\infty} e^{-\beta t} c^\dagger(Z(t))\,dt = \frac{(h+b)(|x| + \bar{y})}{\lambda + \mu + \beta} \sum_{i \ge 0} \Big(\frac{\lambda + \mu}{\lambda + \mu + \beta}\Big)^i + \frac{h+b}{\lambda + \mu + \beta} \sum_{i \ge 1} i \Big(\frac{\lambda + \mu}{\lambda + \mu + \beta}\Big)^i = \frac{(h+b)(|x| + \bar{y})}{\beta} + \frac{(h+b)(\lambda + \mu)}{\beta^2} < \infty,$$
and
$$\lambda w_0(x, y + e_1) + \sum_{i=1}^{k-1} \nu_i y_i\, w_0(x, y + e_{i+1} - e_i) + \nu_k y_k\, w_0(x - 1, y - e_k) + \mu I_{\{a=1\}} w_0(x + 1, y) - \Big[\lambda + \sum_{i=1}^{k} \nu_i y_i + \mu I_{\{a=1\}}\Big] w_0(x, y) \le c_0 w_0(x, y) + b_0. \qquad (24)$$
Let $w_0(x, y) := (|x| + \bar{y})^2$ and $M_0 := \lambda + \mu + \max\{\nu_1, \ldots, \nu_k\}$. Note that (23) holds when $x = \bar{y} = 0$.
Otherwise, we have either $|x| \ge 1$ or $\bar{y} \ge 1$ (or both), and hence
$$\lambda + \sum_{i=1}^{k} \nu_i y_i + \mu \le M_0 (|x| + \bar{y}).$$
Multiplying the above by $R(x, y)$ yields (23). The left-hand side of (24) simplifies to
Therefore, C(3) holds with c0 := 2(λ + µ) and b0 := λ + µ. This completes the proof.
Proof of Theorem 4. Here, we use conditions (C1)–(C4) with $Z^k_+(m)$ and $Z^k_+(m-1)$ replaced
by $\hat{Z}^k_+$. With this substitution, we will verify that if $v \in U$, then $\hat{T}v \in U$. The rest of the proof
is similar to that of Theorem 1 and is omitted for brevity. As in the proof of Proposition 1, we
can readily verify that $\hat{T}_\lambda v, \hat{T}_\mu v \in U$ if $v \in U$. In the following, we show that if $v \in U$, then $\hat{T}_i v$
satisfies conditions (C1)–(C4) so that $\hat{T}_i v \in U$ as well. For conditions (C1) and (C4), the proof is
straightforward.
Condition (C2):
We need to show that $\Delta \hat{T}_i v(x, y + e_j) \le \Delta \hat{T}_i v(x+1, y + e_l)$ for $j = 0, \ldots, k-1$ and $l = j+1, \ldots, k$.
Consider $i = 1, \ldots, k-1$ and let $J = I_{\{i=j\}}$ and $L = I_{\{i=l\}}$.
For $y_i \ge 1$, we have
$$\le \Delta v(x+1, y + e_l + e_{i+1} - e_i) = \Delta \hat{T}_i v(x+1, y + e_l).$$
For $y_i = 0$, we have
$$\le \Delta v(x+1, y + e_l + L(e_{i+1} - e_i)) = \Delta \hat{T}_i v(x+1, y + e_l).$$
The inequalities follow from the fact that $v$ satisfies conditions (C1) and (C2).
Now we consider $\hat{T}_k$. Let $I = I_{\{l=k\}}$. For $y_k \ge 1$, we have
For $y_k = 0$, we have
The inequalities follow from the fact that $v$ satisfies conditions (C2) and (C3).
Condition (C3):
We need to show that $\Delta \hat{T}_i v(x, y + e_{j+1}) \le \Delta \hat{T}_i v(x, y + e_j)$ for $j = 0, \ldots, k-1$. For $i = 1, \ldots, k-1$
let $J = I_{\{i=j\}}$ and $K = I_{\{i=j+1\}}$.
For $y_i \ge 1$, we have
$$\le \Delta v(x, y + e_j + e_{i+1} - e_i) = \Delta \hat{T}_i v(x, y + e_j).$$
For $y_i = 0$, we have
$$\le \Delta v(x, y + e_j + J(e_{i+1} - e_i)) = \Delta \hat{T}_i v(x, y + e_j).$$
For all values of $y_i \ge 0$, the inequalities above follow from the fact that $v$ satisfies condition (C3).
For $\hat{T}_k$, let $I = I_{\{j=k-1\}}$. For $y_k \ge 1$, we have
For $y_k = 0$, we have
The inequalities follow from the fact that $v$ satisfies conditions (C2) and (C3).
Systems with IDD and Low Arrival Rate. For small enough λ, a system with no ADI will
hold no inventory and produce to order, giving an average cost of approximately C1 := λb/µ; this
expression comes from ignoring the possibility of multiple orders being present at once (which is
not unreasonable when λ is very small) and noting that jobs arrive at rate λ and incur cost at rate
b during the time it takes to produce one unit, which has mean 1/µ. For a system with ADI, it will
again be best not to hold inventory when no jobs are in the demand leadtime system. When an
order is announced, then the decision is whether or not to produce one unit in advance of the order
be coming due. Again ignoring the possibility of multiple orders being present simultaneously, if the
decision is not to produce upon announcement of an order, then the long-run average cost will again
be roughly C1 = λb/µ. If the decision is to commence production upon announcement of an order,
then we can derive an approximation for the average cost by conditioning on whether the production
is completed prior to the order becoming due [which occurs with probability µ/(µ+ν)] or not [which
occurs with probability ν/(µ + ν)]. By standard properties of exponential random variables, the
order will generate an average holding cost of h/ν conditional upon the unit completing production
prior to the order becoming due. Similarly, the order will generate an average backorder cost of
b/µ conditional upon the unit not completing production prior to the order coming due. Putting
it together, the long-run average cost is approximately $C_2 := \lambda \big[ \frac{\mu}{\mu+\nu} \cdot \frac{h}{\nu} + \frac{\nu}{\mu+\nu} \cdot \frac{b}{\mu} \big]$.
Hence, for the system with ADI, it will be better to produce upon announcement of an order
provided C2 < C1 . Rearranging terms, it follows that it is better to produce if ν > ν ∗ (µ) and
it is better to wait for the order to become due otherwise. It follows that if ν > ν ∗ (µ) then
PCR ≈ 100 × (bµν − hµ²)/(bµν + bν²) for λ small. Likewise, if ν ≤ ν*(µ) then PCR ≈ 0 for λ
small. A similar analysis is possible for general k.
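The threshold ν*(µ) = hµ/b and the two approximate costs are simple to evaluate; the short sketch below reproduces the ν = 1, µ = 1, h = 10, b = 50 numbers used in Section 6 (all values taken from that example).

    def low_lambda_costs(lam, mu, nu, h, b):
        # Approximate average costs for small lambda (at most one order at a time):
        # C1: produce only when the order is due; C2: start producing on announcement.
        c1 = lam * b / mu
        c2 = lam * (mu / (mu + nu) * h / nu + nu / (mu + nu) * b / mu)
        return c1, c2

    mu, h, b = 1.0, 10.0, 50.0
    nu_star = h * mu / b                       # produce on announcement only if nu > nu*
    c1, c2 = low_lambda_costs(lam=0.1, mu=mu, nu=1.0, h=h, b=b)
    print(nu_star, c1, c2, 100 * (c1 - c2) / c1)   # 0.2, 5.0, 3.0, 40.0 -- cf. (11)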
References
D. P. Bertsekas. Dynamic Programming and Optimal Control, Volume 2. Athena Scientific, Bel-
mont, MA, second edition, 2001.
P. Brémaud. Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues. Springer-Verlag,
New York, 1999.
J. A. Buzacott and J. G. Shanthikumar. Safety stock versus safety time in MRP controlled pro-
duction systems. Management Science, 40:1678–1689, 1994.
R. Cavazos-Cadena and L. I. Sennott. Comparing recent assumptions for the existence of average
optimal stationary policies. Operations Research Letters, 11:33–37, 1992.
F. de Véricourt, F. Karaesmen, and Y. Dallery. Optimal stock allocation for a capacitated supply
system. Management Science, 48:1486–1501, 2002.
I. Duenyas and W. J. Hopp. Quoting customer lead times. Management Science, 41:43–57, 1995.
G. Gallego and Ö. Özer. Integrating replenishment decisions with advance order information.
Management Science, 47:1344–1360, 2001.
G. Gallego and Ö. Özer. Optimal use of demand information in supply chain management. In
J. Song and D. Yao, editors, Supply Chain Structures: Coordination, Information and Optimiza-
tion, pages 119–160. Kluwer Academic Publishers, 2002.
J.-P. Gayon, S. Benjaafar, and F. de Véricourt. Using imperfect demand information in production-
inventory systems with multiple customer classes. Working paper, Ecole Centrale Paris, 2004.
S. C. Graves, H. C. Meal, S. Dasu, and Y. Qiu. Two-stage production planning in a dynamic envi-
ronment. In S. Axsäter, C. Schneeweiss, and E. Silver, editors, Multi-stage Production Planning
and Control. Springer-Verlag, Berlin, 1986.
A. Y. Ha. Inventory rationing in a make-to-stock production system with several demand classes
and lost sales. Management Science, 43:1093–1103, 1997.
R. Hariharan and P. Zipkin. Customer-order information, leadtimes, and inventories. Management
Science, 41:1599–1607, 1995.
D. C. Heath and P. L. Jackson. Modeling the evolution of demand forecasts with application to
safety-stock analysis in production/distribution systems. IIE Transactions, 26:17–30, 1994.
W. J. Hopp and M. R. Sturgis. A simple, robust leadtime-quoting policy. Manufacturing & Service
Operations Management, 3:321–336, 2001.
G. Liberopoulos, A. Chronis, and S. Koukoumialos. Base stock policies with some unreliable
advance demand information. Working paper, University of Thessaly, 2003.
S. Lippman. Applying a new device in the optimization of exponential queueing systems. Operations
Research, 23:687–710, 1975.
Ö. Özer and W. Wei. Inventory control with limited capacity and advance demand information.
Operations Research, 52:988–1000, 2004.
L. B. Schwarz, N. C. Petruzzi, and K. Wee. The value of advance-order information and the
implications for managing the supply chain: an information/control/buffer portfolio perspective.
Working paper, Purdue University, 1997.
L. I. Sennott. Stochastic Dynamic Programming and the Control of Queueing Systems. John Wiley
& Sons, New York, 1999.
R. F. Serfozo. An equivalence between continuous and discrete time Markov decision processes.
Operations Research, 27:616–620, 1979.
S. P. Sethi, H. Yan, and H. Zhang. Peeling layers of an onion: Inventory model with multiple
delivery modes and forecast updates. Journal of Optimization Theory and Applications, 108:
253–281, 2001.
S. Stidham. Analysis, design, and control of queueing systems. Operations Research, 50:197–216,
2002.
K. van Donselaar, L. R. Kopczak, and M. Wouters. The use of advance demand information in a
project-based supply chain. European Journal of Operational Research, 130:519–528, 2001.
M. H. Veatch and L. M. Wein. Scheduling a make-to-stock queue: Index policies and hedging
points. Operations Research, 44:634–647, 1996.
K. Zhu and U. W. Thonemann. Modeling the benefits of sharing future demand information.
Operations Research, 52:136–147, 2004.