Analysis of Queues - Methods and Applications (2012, CRC Press)
Natarajan Gautam
Analysis of Queues
Methods and Applications
The Operations Research Series
Series Editor: A. Ravi Ravindran
Professor, Department of Industrial and Manufacturing Engineering
The Pennsylvania State University – University Park, PA
Published Titles:
Analysis of Queues: Methods and Applications
Natarajan Gautam
Natarajan Gautam
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made
to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all
materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all
material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not
been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any
future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in
any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.
copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-
8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that
have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
Preface ... xv
Acknowledgments ... xvii
Author ... xix
List of Case Studies ... xxi
List of Paradoxes ... xxiii
1. Introduction ... 1
  1.1 Analysis of Queues: Where, What, and How? ... 2
    1.1.1 Where Is This Used? The Applications ... 2
    1.1.2 What Is Needed? The Art of Modeling ... 5
    1.1.3 How Do We Plan to Proceed? Scope and Methods ... 7
  1.2 Systems Analysis: Key Results ... 8
    1.2.1 Stability and Flow Conservation ... 10
    1.2.2 Definitions Based on Limiting Averages ... 10
    1.2.3 Asymptotically Stationary and Ergodic Flow Systems ... 11
    1.2.4 Little’s Law for Discrete Flow Systems ... 12
    1.2.5 Observing a Flow System According to a Poisson Process ... 14
  1.3 Queueing Fundamentals and Notations ... 17
    1.3.1 Fundamental Queueing Terminology ... 21
    1.3.2 Modeling a Queueing System as a Flow System ... 26
    1.3.3 Relationship between System Metrics for G/G/s Queues ... 29
      1.3.3.1 G/G/s/K Queue ... 34
    1.3.4 Special Case of M/G/s Queue ... 35
  1.4 Psychology in Queueing ... 36
  Reference Notes ... 41
  Exercises ... 42
References ... 757
Index ... 763
Preface
each chapter and leave out the remaining topics or assign them for
independent reading.
2. Furthermore, this book would be perfect when used in two courses.
In the first course, one could cover the appendices, followed by
Chapters 1 through 4. And in the second (advanced) course, one
could cover Chapters 5 through 10. In that case, it would be
sufficient to require an undergraduate course on probability as a
prerequisite to the first course in the sequence.
The analytical methods presented in this book are substanti-
ated using applications from a wide set of domains, including
production, computer, communication, information, transportation,
and service systems. This book could thus be used in courses in
programs such as industrial engineering, systems engineering, oper-
ations research, statistics, management science, operations man-
agement, applied mathematics, electrical engineering, computer
science, and transportation engineering. In addition, I sincerely
hope that this book appeals to an audience beyond students and
instructors. It would be appropriate for researchers, consultants, and
analysts that work on performance modeling or use queueing mod-
els as analysis tools. This book has evolved based on my numerous
offerings of entry-level to mid-level graduate courses on the theory
and application of queueing systems. Those courses have been my
favorite among graduate courses, and I am absolutely passionate
about the subject area. I have truly enjoyed writing this book, and I
sincerely hope you will enjoy reading it and getting value out of it.
Natarajan Gautam
College Station, Texas
Acknowledgments
I have come to realize after writing this book that just like it takes a village
to raise a child, it does so to write a book as well. I would like to take this
opportunity to express my gratitude to a small subset of that village.
I would like to begin by thanking my dissertation adviser Professor
Vidyadhar G. Kulkarni for all the knowledge, guidance, and professional
skills he has shared with me. His textbooks have been a source of inspira-
tion and a wealth of information that have been instrumental in shaping this
book. I would also like to acknowledge Professor Kulkarni’s fabulous teach-
ing style that I could only wish to emulate. Talking about excellent teachers,
I would like to thank all the fantastic teachers I have had growing up. I was
lucky to have fabulous mathematics teachers in high school—Mrs. Sarva-
mangala and Mr. Nainamalai. I am also grateful to my excellent instructors
during my undergraduate program, including Professor G. Srinivasan for
his course on operations research and Professor P.R. Parthasarathy for his
course on probability and random processes. I would also like to thank Pro-
fessor Shaler Stidham Jr., who taught me the only course on queueing that I
have ever taken as a student.
Next, I would like to express my sincerest gratitude to some of my col-
leagues. In particular, I would like to thank Professor A. Ravindran for
encouraging me to write this book and for all his tips for successfully com-
pleting it. I was also greatly motivated by the serendipitous conversation I
had with Professor Sheldon Ross when he happened to sit by me during a
bus ride at an INFORMS conference. In addition, I would also like to thank
several colleagues that have helped me with this manuscript through numer-
ous conversations, brainstorming sessions, and e-mail exchanges. They
include Professors Karl Sigman, Ward Whitt, and David Yao from Columbia
University; Dr. Mark Squillante from IBM; Professors Raj Acharya, Russell
Barton, Jeya Chandra, George Kesidis, Soundar Kumara, Anand Sivasubra-
maniam, Qian Wang, and Susan Xu from Penn State University; Professors
J.-F. Chamberland, Guy Curry, Rich Feldman, Georgia-Ann Klutke, P.R.
Kumar, Lewis Ntaimo, Henry Pfister, Don Phillips, Srinivas Shakkottai,
Alex Sprintson, and Marty Wortman from Texas A&M University; Profes-
sor Rhonda Righter from the University of California at Berkeley; Professor
Sunil Kumar from the University of Chicago; and Professors Anant Bal-
akrishnan, John Hasenbein, David Morton, and Sridhar Seshadri from the
University of Texas at Austin.
Some of the major contributions to the contents of this book are due to
my former and current students that took my courses and collaborated on
research with me. In particular, I would like to thank Vineet Aggarwal, Yiyu
List of Case Studies
List of Paradoxes
1. Introduction
For a moment imagine being on an island where you do not have to wait
for anything; you get everything the instant you want or need it! Sounds
like a dream, doesn’t it? Well, let us not have any illusions about it and state
upfront that this book is not about creating such an island, let alone creating
such a world. Wait happens! This book is about how to deal with it.
In other words, how do you analyze systems to manage the waiting
experienced by users of the system? Having said that waiting is inevitable, it is
only fair to point out that in several systems waiting has been significantly
reduced using modern technology. For example, at fast-food restaurants it
is now possible to order online and your food is ready when you show up.
At some amusement parks, especially for popular rides, you can pick up a
ticket that gives you a time to show up at the ride to avoid long lines. With
online services, waits at banks and post offices have reduced considerably.
Most toll booths these days have automated readers that let you zip through
without stopping. There are numerous such examples and it appears like
there are only a few places like airport security where the wait has gotten
longer over the years!
Before delving into managing waiting, here are some further comments
to consider:
FIGURE 1.1
Framework and scope of this book. [Figure: a real-life system is turned into a model (description, assumptions) through modeling; the performance analysis framework, the focus of this book, supports optimization, control, and what-if analysis, which in turn guide design, operation, decision making, and negotiation for the real-life system.]
FIGURE 1.2
A flow system with inputs and outputs. [Figure: a box labeled "System" with an input arrow entering and an output arrow leaving.]
buses into which people enter and exit whenever the bus stops; cash register
at a store into which money enters and exits; fuel reservoir in a gas
station where gasoline enters when a fuel tanker fills it up and it exits when
customers fill up their vehicle gas tanks; theme parks where riders arrive into
the park, spend a good portion of their day going on rides and leave. There
are many such examples of flow systems in everyday life. Not all such sys-
tems are necessarily best modeled as queueing systems. Nonetheless, there
are a large number of fundamental results that we would like to present here
with the understanding that although they are frequently used in the context
of queueing, they can also be applied in wider domains such as inventory
systems.
We describe some notations that would be used in this chapter alone. The
description is given as though the entities are discrete; however, by changing
the word “number” to “amount,” one can pretty much arrive at the same
results if the entities were continuous. Let α(t) be the number of entities that
flow into the system during the time interval [0, t]. Also, define γ(t) as the
number of entities in the system at time t with γ(0) = 0, that is, the system is
empty initially. Finally, δ(t) denotes the number of entities that flow out of
the system in time [0, t]. Due to flow conservation, we have

α(t) = γ(t) + δ(t),
which essentially states that all entities that arrived into the system during a
time period of length t either left the system or are still in the system. In other
words, entities are neither created nor destroyed. If one were careful, most
flow systems can be modeled this way by appropriately choosing the entities.
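As a tiny numerical illustration (the event sequence below is invented, not from the text), the conservation of flow, namely that every entity that has arrived by time t is either still in the system or has departed, can be checked by simple bookkeeping:

```python
# Hypothetical event log (times are invented): +1 = entity flows in,
# -1 = entity flows out.
events = [(0.5, +1), (1.2, +1), (1.9, -1), (2.4, +1), (3.1, -1)]

alpha = 0  # alpha(t): cumulative arrivals in [0, t]
delta = 0  # delta(t): cumulative departures in [0, t]
gamma = 0  # gamma(t): number in the system at time t, with gamma(0) = 0

for t, change in events:
    if change > 0:
        alpha += 1
    else:
        delta += 1
    gamma = alpha - delta
    # Conservation: every entity that arrived is either inside or has left.
    assert alpha == gamma + delta

print(alpha, gamma, delta)  # 3 1 2
```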
For example, in a system like a maternity ward in a hospital, the number
of people checking in appears to be fewer than the number checking out;
the balance can still be attained by appropriately accounting for unborn
children at the input itself. Although the previous example was said in jest,
one must be careful especially in systems with losses. In
our definition, entities that are lost are also included in the output (i.e., in the
δ(t) definition) but one has to be very careful during analysis. To illustrate
this point further, consider a system like a hotline where customers call for
help. In such systems, some customers may wait and leave without being
served, and some customers may leave without waiting (say due to a busy
signal). One has to be very careful in classifying the customers and deriving
performance measures accordingly for each class individually. The results
presented in this section are obtained by aggregating over all classes (unless
accounted for explicitly). To clarify further, consider a production system where the
raw material that flows in results in both defective and nondefective prod-
ucts. Clearly, when it comes to analysis, the emphasis we place on defective
items may be significantly different than that for nondefective items, so it
might be beneficial to derive individual performance measures. To model
the production system as a whole, it may be beneficial to consider them as
a single class. With this in mind, we next present a set of results that are
asymptotic in time, that is, as t → ∞.
A flow system is stable if

lim_{t→∞} α(t)/t = lim_{t→∞} δ(t)/t.

For a stable system, the average input rate is

λ = lim_{t→∞} α(t)/t.    (1.3)
The time-averaged number of entities in the system is defined as

H = lim_{T→∞} (1/T) ∫_0^T γ(t) dt,    (1.4)

and the long-run fraction of time during which there are exactly i entities in the system as

q_i = lim_{T→∞} ( ∫_0^T I{γ(t) = i} dt ) / T.

Notice that the numerator of the term inside the limit essentially is the
amount of time in the interval [0, T] during which there were exactly i in
the system. Verify that

H = Σ_{i=0}^∞ i q_i.
Likewise, letting τ_n be the time spent in the system by the nth entity, the average time spent in the system is

τ̄ = lim_{n→∞} (τ1 + τ2 + · · · + τn)/n.    (1.6)
Then Little’s law for the flow system states that

H = λ τ̄,    (1.7)

where λ, H, and τ̄ are defined in Equations 1.3, 1.4, and 1.6, respectively.
We do not provide a proof of the preceding result (for which it is more con-
venient if we have a stationary and ergodic system, although that is not
Problem 1
Couch-Potato is a high-end furniture store that carries a sofa set called Plush.
Customers arrive into Couch-Potato requesting a Plush sofa set according
to a Poisson process at an average rate of 1 per week. Couch-Potato’s policy
is to not accept any back orders. So if there are no Plush sofa sets available
in inventory, customers’ requests are not fulfilled. It is also Couch-Potato’s
policy to place an order from the manufacturer for “five” Plush sofa sets as
soon as the number of them in inventory goes down to “two”. The manu-
facturer of Plush has an exponentially distributed delivery time with a mean
of 1 week to deliver the set of “five” Plush sofa sets. Model the Plush sofa
set system in Couch-Potato as a flow system. Is the system stable? Compute
the average input rate λ, the time-averaged number of Plush sofa sets in
inventory (H), and the average number of weeks each Plush sofa set stays in
Couch-Potato (τ̄).
Solution
The Plush system in Couch-Potato is indeed a (discrete) flow system where
with every delivery from the manufacturer, five sofa sets flow into the sys-
tem. Also, with every fulfilled customer order, sofa sets exit the system. We
let γ(t) be the number of Plush sofa sets in the system at time t. Although
we do not need γ(0) to be zero for the analysis, assuming that would not
be unreasonable. Also, notice that for all t, γ(t) stays between “zero” and
“seven”. For example, if by the time the shipment arrived, two customers
have already ordered Plushes, then the number in inventory would become
zero. Likewise, a maximum of “seven” is because an order of “five” Plushes
are placed when the inventory reaches “two”, so if the shipment arrives
before the next customer demand, there would be “seven” Plush sofa sets
in the system. Notice that since γ(t) never exceeds “seven”, the system is
stable.
To obtain the other performance measures, we model the stochas-
tic process {γ(t), t ≥ 0} as a CTMC with state space {0, 1, 2, 3, 4, 5, 6, 7} and
corresponding infinitesimal generator matrix
Q =
⎡ −1   0   0   0   0   1   0   0 ⎤
⎢  1  −2   0   0   0   0   1   0 ⎥
⎢  0   1  −2   0   0   0   0   1 ⎥
⎢  0   0   1  −1   0   0   0   0 ⎥
⎢  0   0   0   1  −1   0   0   0 ⎥
⎢  0   0   0   0   1  −1   0   0 ⎥
⎢  0   0   0   0   0   1  −1   0 ⎥
⎣  0   0   0   0   0   0   1  −1 ⎦
[p0 p1 p2 p3 p4 p5 p6 p7 ]Q = [0 0 0 0 0 0 0 0]
and p0 + p1 + · · · + p7 = 1. We get

[p0 p1 p2 p3 p4 p5 p6 p7 ] = (1/21) [1 1 2 4 4 4 3 2].
Note that an order for “five” Plush sofa sets is placed every time the
inventory level reaches 2. So we pay attention to state 2 with correspond-
ing steady-state probability p2 = 2/21. In the long run, a fraction 2/21 of time
the system is in state 2 and on average state 2 lasts for half a week. Thus
the average rate at which orders are placed is 2 × (2/21) per week. Hence
the average input rate is λ = 2 × (2/21) × 5 = 20/21 Plush sofa sets per week.
Also, the time-averaged number of Plush sofa sets in inventory (H) can be
computed as
H = Σ_{i=0}^{7} i p_i = 85/21.
Therefore, using Little’s law, the average number of weeks each Plush sofa
set stays in Couch-Potato (τ̄) can be computed as

τ̄ = H/λ = (85/21)/(20/21) = 4.25 weeks.
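The numbers in this solution are easy to verify numerically. The following sketch (mine, not part of the book) solves the stationary equations pQ = 0 together with the normalization condition using numpy, and recovers the quantities computed above:

```python
import numpy as np

# Infinitesimal generator for the Plush inventory CTMC on states 0..7:
# a demand (rate 1) moves i -> i - 1, and the pending delivery (rate 1)
# moves i -> i + 5 from states 0, 1, 2.
Q = np.array([
    [-1,  0,  0,  0,  0,  1,  0,  0],
    [ 1, -2,  0,  0,  0,  0,  1,  0],
    [ 0,  1, -2,  0,  0,  0,  0,  1],
    [ 0,  0,  1, -1,  0,  0,  0,  0],
    [ 0,  0,  0,  1, -1,  0,  0,  0],
    [ 0,  0,  0,  0,  1, -1,  0,  0],
    [ 0,  0,  0,  0,  0,  1, -1,  0],
    [ 0,  0,  0,  0,  0,  0,  1, -1],
], dtype=float)

# Solve p Q = 0 with sum(p) = 1 by replacing one (redundant) balance
# equation with the normalization condition.
A = Q.T.copy()
A[-1, :] = 1.0
b = np.zeros(8)
b[-1] = 1.0
p = np.linalg.solve(A, b)

print(np.round(p * 21, 6))           # approx [1 1 2 4 4 4 3 2]
lam = 2 * p[2] * 5                   # state 2 is left at rate 2; each order brings 5
H = sum(i * p[i] for i in range(8))
print(lam, H, H / lam)               # approx 20/21, 85/21, and 4.25 weeks
```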
Remark 1
E[γ(tn )] → H as n → ∞,

and

P{γ(tn ) = i} → qi as n → ∞,
Problem 2
Consider a single-product inventory system with continuous review adopt-
ing what is known as the (K, R) policy, which we explain next. Demand
of one unit arrives according to a Poisson process with parameter λ per
week. Demand is satisfied using products stored in inventory, and no
backorders are allowed, that is, if a demand occurs when the inventory is empty, the
demand is not satisfied. The policy adopted is called (K, R) policy wherein
an order for K items is placed as soon as the inventory level reaches R. It
takes a random time exponentially distributed with mean 1/θ weeks for the
order to be fulfilled (this is called lead time). Assume that K > R, but both R
and K are fixed constants. Problem 1 is a special case of this single-product
inventory system adopting the (K, R) policy with K = 5, R = 2, and θ = λ = 1
per week. What would the distribution and expected value of the number
of items in inventory be the instant a demand arrives? Also, determine the
average product departure rates.
Solution
Let X(t) be the number of products in inventory at time t. Clearly, {X(t), t ≥ 0}
is a CTMC with state space S = {0, 1, . . . , R + K} and rate diagram shown in
Figure 1.3. Let pi be the steady-state probability of i items in inventory. To
obtain pi for all i ∈ [0, R + K], we use the balance equations
θ Σ_{j=0}^{i−1} p_j = λ p_i ,   i = 1, . . . , R,        θ Σ_{j=0}^{R} p_j = λ p_i ,   i = R + 1, . . . , K,

θ Σ_{j=i}^{R} p_j = λ p_{K+i} ,   i = 1, . . . , R.
FIGURE 1.3
Rate diagram for (K, R) inventory system. [Figure: states 0, 1, . . . , R + K in a row; demand transitions i → i − 1 at rate λ for i ≥ 1, and delivery transitions i → i + K at rate θ for i = 0, . . . , R.]
Then, p0 can be obtained using Σ_{i=0}^{K+R} p_i = 1, which implies (with φ = 1 + θ/λ)

p0 [ 1 + (θ/λ) Σ_{i=1}^{R} φ^{i−1} + (K − R) (θ/λ) φ^R + (θ/λ) Σ_{i=1}^{R} (φ^R − φ^{i−1}) ] = 1.
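As a sanity check (my own sketch, not from the text), one can construct the generator of this CTMC for arbitrary K, R, λ, θ, solve for the steady-state probabilities numerically, and confirm both the Problem 1 special case and the normalization identity for p0, taking φ = 1 + θ/λ:

```python
import numpy as np

def kr_inventory_probs(K, R, lam, theta):
    """Steady-state probabilities for the (K, R) inventory CTMC.

    States 0..R+K; a demand moves i -> i-1 at rate lam (if i > 0), and
    while an order is outstanding (i <= R) delivery moves i -> i+K at
    rate theta.
    """
    n = R + K + 1
    Q = np.zeros((n, n))
    for i in range(n):
        if i > 0:            # demand satisfied from inventory
            Q[i, i - 1] += lam
        if i <= R:           # outstanding order gets delivered
            Q[i, i + K] += theta
        Q[i, i] = -Q[i].sum()
    A = Q.T.copy()
    A[-1, :] = 1.0           # replace one balance equation by normalization
    b = np.zeros(n)
    b[-1] = 1.0
    return np.linalg.solve(A, b)

p = kr_inventory_probs(K=5, R=2, lam=1.0, theta=1.0)
print(np.round(p * 21, 6))   # matches Problem 1: approx [1 1 2 4 4 4 3 2]

# Check the normalization identity with phi = 1 + theta/lam:
K, R, lam, theta = 5, 2, 1.0, 1.0
phi = 1 + theta / lam
rhs = 1 + (theta / lam) * sum(phi**(i - 1) for i in range(1, R + 1)) \
        + (K - R) * (theta / lam) * phi**R \
        + (theta / lam) * sum(phi**R - phi**(i - 1) for i in range(1, R + 1))
print(abs(p[0] * rhs - 1.0) < 1e-9)  # True
```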
Having described some generic results for flow systems, we now delve
into a special type of flow system called queueing systems.
FIGURE 1.4
A single-station queueing system. [Figure: arrivals enter a waiting line feeding a set of servers; departures leave after service.] (From Gautam, N., Queueing Theory. Operations Research and Management Science Handbook, A. Ravindran (ed.), CRC Press, Taylor & Francis Group, Boca Raton, FL, pp. 9.1–9.37, 2007. With permission.)
of canonical queueing systems. For the rest of this chapter, we only consider
single-station queueing systems with one waiting line and one set of servers.
Of course one could model a multistation queueing network as a single flow
system, but in practice one typically models it in a decomposed manner where
each node in the network is a single flow system. This justifies considering a
single station. Also, at this stage, we do not make any distinctions between
classes of entities. With that in mind, we present some details of queueing
systems.
Consider a single-station queueing system as shown in Figure 1.4. This
is also called a single-stage queue. There is a single waiting line and one or
more servers (such as at a bank or post office). We will use the term “servers”
generally but sometimes for specific systems we would call them processors
or machines. We refer to the entities that arrive and flow through the queueing
system as customers, jobs, products, parts, or just entities. Arriving
customers enter the queueing system and wait in the waiting area if a server
is not free (otherwise they go straight to a server). When a server becomes
free, one customer is selected and service begins. Upon service completion,
the customer departs the system. Usually, time between arrivals and time
to serve customers are both random quantities. Therefore, to analyze queue-
ing systems one needs to know something about the arrival process and the
service times for customers. Other aspects that are relevant in terms of anal-
ysis include the number of servers, capacity of the system, and the policy
used by the servers to determine the service order. Next we describe a few key
remarks that are needed to describe some generic, albeit basic, results for
single-station queueing systems.
Remark 2
The entities that flow in the queueing system will be assumed to be discrete
or countable. In fact, the bulk of this book is based on discrete queues
with fluid queues considered in only two chapters toward the end. As
described earlier, these entities would be called customers, jobs, products,
parts, etc.
Remark 3
Unless explicitly stated otherwise, the customer inter-arrival times, that is,
the time between arrivals, are assumed to be IID. Thereby the arrival process
is generally assumed to be what is called a renewal process. Some exceptions
to that are when the arrival process is time varying or when it is correlated.
But those exceptions will only be made in subsequent chapters. Further, all
arriving customers enter the system if there is room to wait (that means
unless stated otherwise, there is no balking). Also, all customers wait till their
service is completed in order to depart (likewise, unless stated otherwise,
there is no reneging).
Remark 4
For the basic results some assumptions are made regarding the service
process. In particular, we assume that the service times are IID random
variables. Also, the servers are stochastically identical, that is, the service
times are sampled from the same distribution for all servers. In addition, the
servers adopt a work-conservation policy, that is, the server is never idle
when there are customers in the system. The last assumption means that
as soon as a service is completed for a customer, the server starts serving
the next customer instantaneously (if one is waiting for service). Thus, while
modeling, one has to appropriately define which activities are included
in a service time.
The assumptions made in the preceding remarks can and will certainly
be relaxed as we go through the book. There are many instances in the book
that do not require these assumptions. However, for the rest of this chap-
ter, unless explicitly stated otherwise, we will assume that assumptions in
Remarks 2, 3, and 4 hold. Next, using the assumptions, we will provide some
generic results that will be useful to analyze queues. However, before we
proceed to those results, recall that to analyze queueing systems one needs to
know something about the arrival process, the service times for customers,
the number of servers, capacity of the system, and the policy used by the
servers to determine the service order. We will next describe queues using a
compact nomenclature that takes all those into account.
In order to standardize description for queues we use a notation that is
accepted worldwide called Kendall notation, honoring the pioneering work by
D. G. Kendall. The notation takes the form

AP/ST/NS/Cap/SD.
TABLE 1.1
Fields in the Kendall Notation

AP    M, G, Ek , Hk , PH, D, GI, etc.
ST    M, G, Ek , Hk , PH, D, GI, etc.
NS    denoted by s, typically 1, 2, . . . , ∞
Cap   denoted by K, typically 1, 2, . . . , ∞ (default: ∞)
SD    FCFS, LCFS, ROS, SPTF, etc. (default: FCFS)

Source: Gautam, N., Queueing Theory. Operations Research and Management Science Handbook, A. Ravindran (ed.), CRC Press, Taylor & Francis Group, Boca Raton, FL, pp. 9.1–9.37, 2007. With permission.
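As a small illustration (the function and field names below are my own, not from the text), the five fields and the two defaults of Table 1.1 can be captured in a few lines:

```python
def parse_kendall(notation):
    """Expand a Kendall string AP/ST/NS[/Cap[/SD]] into all five fields,
    filling the default capacity (infinite) and discipline (FCFS)."""
    fields = notation.split("/")
    if len(fields) < 3:
        raise ValueError("need at least AP/ST/NS")
    return {
        "arrival_process": fields[0],
        "service_times": fields[1],
        "servers": fields[2],
        "capacity": fields[3] if len(fields) > 3 else "inf",    # default: infinite
        "discipline": fields[4] if len(fields) > 4 else "FCFS",  # default: FCFS
    }

print(parse_kendall("M/G/2"))
# {'arrival_process': 'M', 'service_times': 'G', 'servers': '2',
#  'capacity': 'inf', 'discipline': 'FCFS'}
```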
Define An as the time when the nth customer arrives, and thereby
An − An−1 is the nth inter-arrival time if the arrivals are not in batches. Let
Sn be the service time for the nth customer. Usually from the Kendall nota-
tions, especially when assumptions in Remarks 2, 3, and 4 hold, we typically
know both An − An−1 and Sn stochastically for all n. In other words, we know
the distributions of inter-arrival times and service times. In some sense they
and the other Kendall notation terms form the “input.” Next we describe
some terms and performance measures that can be derived once we know
the inputs.
Let Dn be the time when the nth customer departs. We denote X(t) as the
number of customers in the system at time t, Xn as the number of customers
in the system just after the nth customer departs, and Xn∗ as the number of
customers in the system just before the nth customer arrives. Although in
this chapter we would not go into details, it is worthwhile mentioning that
{X(t), t ≥ 0}, {Xn , n ≥ 0}, and {Xn∗ , n ≥ 0} are usually modeled as stochastic pro-
cesses. We also define two other variables, which are usually not explicitly
modeled. These are Wn , the waiting time of the nth customer, and W(t), the
total remaining workload at time t (this is the sum of the remaining service
time for all the customers in the system at time t). The preceding variables are
described in Table 1.2 for easy reference, where customer n denotes the nth
arriving customer. Note that if we are given A1 , A2 , . . ., as well as S1 , S2 , . . .,
we can obtain Dn , X(t), Xn , Xn∗ , Wn , and W(t). We describe that next for a
special case (note that typically we do not know the explicit realizations of
An and Sn for all n; we only know their distributions).
To illustrate the terms described in Table 1.2, consider a G/G/1 queue
where the inter-arrival times are general and service times are general with
a single server adopting FCFS and infinite space for customers to wait (refer
TABLE 1.2
Variables—Their Mathematical Notation as well as Meanings

Variable   Relation to Other Variables   Meaning
An                                       Arrival time of customer n
Sn                                       Service time of customer n
Dn                                       Departure time of customer n
X(t)                                     Number of customers in the system at time t
Xn         X(Dn +)                       Number in system just after customer n’s departure
Xn∗        X(An −)                       Number in system just before customer n’s arrival
Wn         Dn − An                       Waiting time of customer n
W(t)                                     Total remaining workload at time t

Source: Gautam, N., Queueing Theory. Operations Research and Management Science Handbook, A. Ravindran (ed.), CRC Press, Taylor & Francis Group, Boca Raton, FL, pp. 9.1–9.37, 2007. With permission.
FIGURE 1.5
Sample path of workload and number in the system for a G/G/1 queue. [Figure: the top panel plots W(t), which jumps up by Sn at each arrival time An and decreases at unit rate between events; the bottom panel plots X(t), which jumps up by 1 at each An and down by 1 at each departure time Dn , with the waiting times W1 , . . . , W7 marked.]
to Figure 1.5). Let A1 , A2 , . . . , A7 be the times that the first seven customers
arrive to the queue. The customers require a service time of S1 , S2 , . . . , S7 ,
respectively. Assume that the realizations of An and Sn are known (although
in practice we only know them stochastically). The queue is initially empty.
As soon as the first customer arrives (that happens at time A1 ) the number
in the queue jumps from 0 to 1 (note the jump in the X[t] graph). Also, the
workload in the system jumps up by S1 because when the arrival occurs
there is S1 amount of work left to be done (note the jump in the W[t] graph).
Until the next arrival or service completion, the number in the system is
going to remain a constant equal to 1. Hence the X[t] graph stays flat at
1 till the next event. However, the workload keeps reducing because the
server is working on the customer. Notice from the figure that before the
first customer’s service is completed, the second customer arrives. Hence
the number in the system (the X[t] graph) jumps up by 1 and the work-
load jumps up by S2 (the W[t] graph) at time A2 . Since there is only a
single server and we use FCFS, the second customer waits while the first
customer continues being served. Hence the number in the system (the X[t]
graph) stays flat at 2 and the workload (the W[t] graph) reduces
continuously.
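The sample-path construction just described can be sketched in code for the FCFS G/G/1 case: with the system initially empty, Dn = max(An , Dn−1 ) + Sn and Wn = Dn − An . The arrival and service times below are invented for illustration; they are not those of Figure 1.5.

```python
def gg1_fcfs(A, S):
    """Departure and sojourn times of a FCFS G/G/1 queue starting empty.

    A: increasing arrival times A1, A2, ...; S: service times S1, S2, ...
    Uses D[n] = max(A[n], D[n-1]) + S[n] and W[n] = D[n] - A[n].
    """
    D, W = [], []
    last_departure = 0.0
    for a, s in zip(A, S):
        d = max(a, last_departure) + s  # wait for the server, then get served
        D.append(d)
        W.append(d - a)
        last_departure = d
    return D, W

# Invented realizations (not those of Figure 1.5):
A = [1.0, 1.5, 4.0]
S = [2.0, 1.0, 0.5]
D, W = gg1_fcfs(A, S)
print(D)  # [3.0, 4.0, 4.5]
print(W)  # [2.0, 2.5, 0.5]
```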
As soon as the server completes service of customer-1, that customer
departs (this happens at time D1 ). Note that the time spent in the system by
customer-1 is W1 = D1 − A1 = S1 . Immediately after customer-1 departs, the
number in the system (the X[t] graph) jumps down from 2 to 1. However,
since the server adopts a work-conservation policy, it immediately starts
working on the second customer. Hence the W(t) graph has no jumps at
time D1 . From the figure notice that the next event is arrival of customer-3,
which happens while customer-2 is being served. Hence at time A3 the
Problem 3
Consider the exact same arrival times of the first seven customers
A1 , A2 , . . . , A7 as well as the exact same corresponding service time
requirements of S1 , S2 , . . . , S7 , respectively, as described earlier. However, the
system has two identical servers. Draw graphs of W(t) and X(t) across time
for the first seven customers assuming that the eighth customer arrives well
after all the seven previous customers are served. Compare and contrast the
graphs against those we saw earlier for the case of one server.
Solution
We assume that the system is empty at time t = 0. The graphs of W(t) and
X(t) versus t for this G/G/2 queue are depicted in Figure 1.6. The first
customer arrives at time A1 and one of the two servers processes this customer
and the workload process jumps up by S1 . While one server processes this
customer, another customer arrives and the workload jumps by S2 at time
FIGURE 1.6
Sample path of workload and number in the system for a G/G/2 queue. [Figure: as in Figure 1.5, the top panel plots W(t) and the bottom panel plots X(t); with two servers the workload decreases twice as fast whenever two or more customers are present, and customer 7 departs before customer 5 (D7 < D5 ).]
A2 . However, since there are two servers processing customers now, the
workload can be reduced twice as fast (hence a different downward slope
of W(t) immediately after A2 ). Then at time D1 , the first server completes
serving the first customer and becomes idle. The second server subsequently
completes serving customer-2 and the entire system becomes empty for a
short period of time between D2 and A3 when the third customer arrives.
Since the servers are identical, we do not specify which server processes the
third customer but we do know that one of them is processing the customer.
Then at time D3 , the system becomes empty. This process continues. When-
ever there are two or more customers in the system the workload reduces at
a faster rate than when there is one in the system. However, when there are
no customers in the system W(t) is 0.
Next we contrast Figures 1.5 and 1.6. The periods with the system empty
have indeed grown, which is expected when there are more servers.
It is crucial to point out that the notion of busy period is unclear
since a server could be idle but the system could have a customer served by
the other server. However, the notion of the time when the system is empty
is still consistent between the two figures (and that is when W(t) and X(t)
are zero). Another difference between the figures that we described earlier is
that in the G/G/2 case the downward slopes for the W(t) process take on two
different values depending on the number in the system. However, a crucial
difference is that the customers do not necessarily depart in the order they
arrived. For example, the seventh customer departs before the fifth customer
(i.e., D7 < D5 ) in the G/G/2 figure. For this reason, we do not call this service
discipline FIFO in this book (that means first-in-first-out) and instead stick to
FCFS. The term “FIFO” does apply to the waiting area alone (not
including the servers), and hence it is often found in the literature;
nevertheless, to avoid any confusion we say FCFS.
In a similar fashion, one could extend this to other queues and disciplines
by drawing the W(t) and X(t) processes (see the exercises at the end of the
chapter). However, typically we do not know realizations of An and Sn for
all n but we only know the distributions of the inter-arrival times and service
times. In that case can we say anything about X(t), Xn , Xn∗ , W(t), Dn , and Wn ?
We will see that next.
$$\pi_j^* = \lim_{N\to\infty} \frac{\sum_{n=1}^{N} I(X_n^* = j)}{N},$$

$$G(x) = \lim_{T\to\infty} \frac{\int_0^T I(W(t) \le x)\,dt}{T},$$

$$F(x) = \lim_{N\to\infty} \frac{\sum_{n=1}^{N} I(W_n \le x)}{N},$$

$$L = \lim_{T\to\infty} \frac{\int_0^T X(t)\,dt}{T}$$

and

$$W = \lim_{N\to\infty} \frac{W_1 + W_2 + \cdots + W_N}{N}.$$
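These long-run averages can be estimated directly from a simulated sample path. The following sketch (an illustration, not from the text) simulates a stable FCFS G/G/1 queue using the Lindley recursion $W^q_{n+1} = \max(0,\, W^q_n + S_n - (A_{n+1} - A_n))$; the exponential inter-arrival and service distributions and the rates are assumed purely for concreteness — any distributions with $\lambda < \mu$ would do.

```python
import random

def simulate_gg1(n=200_000, lam=0.8, mu=1.0, seed=7):
    """Estimate W (customer-averaged sojourn time) and L (time-averaged
    number in system) for a stable FCFS G/G/1 queue via the Lindley
    recursion. Exponential inter-arrival and service times are an
    illustrative assumption, not part of the general G/G/1 model."""
    rng = random.Random(seed)
    a = 0.0      # current arrival epoch A_n
    wq = 0.0     # queueing delay Wq_n of the current customer
    s = 0.0      # service time of the previous customer
    sum_w = 0.0  # running sum of sojourn times W_n = Wq_n + S_n
    for i in range(n):
        t = rng.expovariate(lam)       # inter-arrival time A_n - A_{n-1}
        a += t
        if i > 0:
            wq = max(0.0, wq + s - t)  # Lindley recursion
        s = rng.expovariate(mu)
        sum_w += wq + s
    W = sum_w / n   # customer average
    L = sum_w / a   # time average: integral of X(t) dt equals sum of W_n
    return W, L

W, L = simulate_gg1()
```

For an M/M/1 queue with ρ = 0.8, theory gives W = 1/(μ − λ) = 5 and L = ρ/(1 − ρ) = 4, so the returned estimates should be close to those values; note also that L ≈ λW by construction, foreshadowing Little's law.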
Problem 4
Consider the time between t = 0 and t = D7 for the G/G/1 queue in Figure 1.5.
Assume that we have numerical values for all Ai and Di for i = 1, . . . , 7. What
fraction of time between t = 0 and t = D7 were there two customers in
the system? What fraction of the seven arriving customers saw one customer
in the system? What is the time-averaged number of customers in the system
between t = 0 and t = D7 ?
Solution
From Figure 1.5, note that there are two customers in the system between
times A2 and D1 , A3 and D2 , A6 and A7 , as well as D5 and D6 . Thus the
fraction of time between t = 0 and t = D7 that there were two customers in
the system is ((D1 − A2 ) + (D2 − A3 ) + (A7 − A6 ) + (D6 − D5 ))/D7 . Notice that
the expression is identical to $\int_0^{D_7} I(X(t) = 2)\,dt \big/ D_7$.
From Figure 1.5, also note that customers 1, 4, and 5 saw zero customers
in the system when they arrived; customers 2, 3, and 6 saw one in the system
when they arrived; and customer 7 saw two customers in the system upon
arrival. Thus a fraction 3/7 of the arriving customers saw one customer in
the system. The fraction is indeed equal to $\frac{1}{7}\sum_{n=1}^{7} I(X_n^* = 1)$.
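Both kinds of empirical fractions are easy to compute from a sample path with an event sweep. The numerical arrival and departure times below are made up (the text does not give Figure 1.5's values), but they are chosen to be consistent with the structure described above: customers 1, 4, and 5 arrive to an empty system, customers 2, 3, and 6 see one, and customer 7 sees two.

```python
# Hypothetical event times with the same structure as Figure 1.5
# (illustrative values only; the book does not give them numerically).
A = [1.0, 2.0, 3.5, 6.0, 8.0, 9.0, 9.5]     # arrival times A_1..A_7
D = [3.0, 4.0, 5.0, 7.0, 10.0, 11.0, 12.0]  # departure times D_1..D_7

events = sorted([(t, +1) for t in A] + [(t, -1) for t in D])
x, last_t = 0, 0.0            # X(t) and the previous event epoch
time_at_two, area = 0.0, 0.0
for t, delta in events:
    if x == 2:
        time_at_two += t - last_t  # accumulates I(X(t) = 2) dt
    area += x * (t - last_t)       # accumulates X(t) dt
    x += delta
    last_t = t

T = D[-1]                       # = D_7
frac_two = time_at_two / T      # fraction of time with two in the system
L_avg = area / T                # time-averaged number in the system
```

For these particular numbers the sweep gives frac_two = 3.0/12 = 0.25 and L_avg = 13/12, where 13 is the sum of the sojourn times D_n − A_n.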
To obtain the time-averaged number of customers in the system between
time 0 and D7 , we use the expression $\int_0^{D_7} X(t)\,dt \big/ D_7$. Hence we have that
value as
$$L = \lim_{t\to\infty} E[X(t)]$$
and
$$W = \lim_{n\to\infty} E[W_n].$$
Since the system is asymptotically stationary and ergodic, the two definitions
of pj , πj , π∗j , G(x), F(x), L, and W would be equivalent. In fact, we would end
up using the latter definition predominantly, as we would be modeling the
queueing system as a stochastic process and performing steady-state analysis.
One of the primary objectives of analysis of queues is to obtain closed-
form expressions for the performance metrics pj , πj , π∗j , G(x), F(x), L, and
W given properties of the queueing system such as inter-arrival time dis-
tribution, service time distribution, number of servers, system capacity,
and service discipline. Although we will derive the expressions for various
settings only in future chapters, for the remainder of this section we
concentrate on describing the relationships between those measures.
We explain this relation using an example illustration, but the rigorous proof
uses what is known as a level-crossing argument. In a G/G/s queue, note
that the times customers arrive to an empty system are regenerative epochs.
In other words, starting at a regenerative epoch, the future events are inde-
pendent of the past. In Figure 1.5, times A1 , A4 , and A5 are regenerative
epochs. The process that counts the number of regenerative epochs is indeed a
renewal process, and the times between successive regenerative epochs are IID
(although for this system their distribution is not easy to compute in general).
We call the time between successive regenerative epochs a regenerative
cycle. For the regenerative process described previously in a G/G/s system,
we assume that the regenerative cycle times on average are finite and the
system is stable (stability conditions will be explained later). It is crucial to
note that within any regenerative cycle of such a G/G/s queue, the number
of arriving customers seeing j others in the system would be exactly equal to
the number of departing customers seeing j others in the system.
For example, consider the regenerative cycle [A1 , A4 ) in Figure 1.5. There
are three arrivals, two of which see one in the system (customers 2 and 3)
and one sees zero in the system (customer 1). Observe that there must be
exactly three departures (if there are three arrivals). Of the three departures,
two see one in the system (customers 1 and 2) and one sees zero (customer 3).
Similarly, in regenerative cycles [A4 , A5 ) and [A5 , A8 ) one can observe (pre-
tending A8 is somewhere beyond D7 ) that the number of arriving customers
that see j in the system (for any j) would be exactly equal to the number of
departing customers that see j in the system. Since the entire time is composed
of these regenerative cycles, by summing over an infinitely large number
of regenerative cycles we can see that the fraction of arriving customers that
see j others in the system would be exactly equal to the fraction of departing
customers that see j others in the system. Hence we get πj = π∗j . Before
proceeding, it is worthwhile to verify this for the G/G/2 case in Figure 1.6, where
the regenerative cycles are [A1 , A3 ), [A3 , A4 ), [A4 , A5 ), and [A5 , A8 ) assuming
A8 is somewhere beyond D5 . Also, this result can be generalized easily for
a finite capacity queue and any service discipline as long as arrivals occur
individually and service completions occur one by one.
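As a numerical sanity check of πj = π∗j (again an illustration, not from the text), one can simulate the embedded jump chain of an M/M/1 queue — a special G/G/1 case in which, whenever the system is nonempty, the next event is an arrival with probability λ/(λ + μ) — and compare the fraction of arrivals seeing j in the system with the fraction of departures leaving j behind. The rates and seed below are arbitrary choices.

```python
import random

def mm1_arrival_vs_departure(n_arrivals=200_000, lam=0.7, mu=1.0, seed=3):
    """Simulate the embedded jump chain of an M/M/1 queue and return
    (pi_star, pi): the empirical fractions of arrivals that see j in the
    system and of departures that leave j behind, respectively."""
    rng = random.Random(seed)
    x = 0               # number in the system
    see, leave = {}, {}
    na = nd = 0
    while na < n_arrivals:
        # If empty, the next event must be an arrival; otherwise an
        # arrival wins the exponential race w.p. lam / (lam + mu).
        if x == 0 or rng.random() < lam / (lam + mu):
            see[x] = see.get(x, 0) + 1      # arrival sees x in the system
            x += 1
            na += 1
        else:
            x -= 1
            leave[x] = leave.get(x, 0) + 1  # departure leaves x behind
            nd += 1
    pi_star = {j: c / na for j, c in see.items()}
    pi = {j: c / nd for j, c in leave.items()}
    return pi_star, pi

pi_star, pi = mm1_arrival_vs_departure()
```

For λ = 0.7 and μ = 1, both fractions at j = 0 should come out close to 1 − ρ = 0.3, and the two dictionaries should agree entry by entry up to simulation noise.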
Next we describe the relationship between L and W. For that we require
some additional terminology. Define the following for a single stage G/G/s
queue (with characteristics described in the previous paragraph):
the average inter-arrival time

$$\frac{1}{\lambda} = E[A_n - A_{n-1}],$$

the average service time

$$\frac{1}{\mu} = E[S_n],$$

and the traffic intensity

$$\rho = \frac{\lambda}{s\mu}.$$

For the queue to be stable we require that

$$\rho < 1.$$
In other words, we require that λ < sμ for the system to be stable. This is
intuitive because it says that there is enough capacity (service rate on average
offered by all servers together is sμ) to handle the arrivals. In the literature,
ρ > 1 is called an overloaded system and ρ = 1 is called a critically loaded
system. Next we present a remark for stable G/G/s queues.
Remark 5
The average departure rate from a G/G/s queue is defined as the long-run
average number of customers that depart from the queue per unit time. If
the G/G/s queue is stable, then the average departure rate is λ.
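The stability check ρ = λ/(sμ) < 1 is a one-line computation; the sketch below uses the mean inter-arrival time 0.1235 min, mean service time 1 min, and s = 9 servers from Problem 5 later in this section.

```python
def traffic_intensity(mean_interarrival, mean_service, s):
    """rho = lam / (s * mu), with lam = 1/E[A_n - A_{n-1}] and mu = 1/E[S_n]."""
    lam = 1.0 / mean_interarrival
    mu = 1.0 / mean_service
    return lam / (s * mu)

rho = traffic_intensity(0.1235, 1.0, 9)    # roughly 0.8997
assert rho < 1, "queue would be unstable"  # stable since lam < s * mu
```

Only the first two moments of nothing are needed here — stability depends on the means alone, not on the full inter-arrival or service distributions.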
This remark is an artifact of the argument made in Section 1.2.1 that since
this is a stable flow conserving system, the average input rate must be the
average output rate. With that said, we describe the relationship between
the preceding terms and then explain them subsequently.
A G/G/s queue with notation described earlier in this section satisfies the
following:
$$W = W_q + \frac{1}{\mu}, \tag{1.8}$$

$$L = \lambda W \tag{1.9}$$

and

$$L_q = \lambda W_q. \tag{1.10}$$
We will now explain these equations. Equation 1.8 is directly from the defini-
tion; the total time in the system for any customer must be equal to the time
spent waiting in the queue plus the service time; thus taking expectations we
get Equation 1.8. Equations 1.9 and 1.10 are both due to Little’s law described
in Section 1.2.4. Essentially, if one considers the entire queueing system as a
flow system and then suitably substitutes the terms in Equation 1.7, then
Equation 1.9 can be obtained. However, if one considered just the wait-
ing area as the flow system, then Equation 1.7 can be used once again to
derive Equation 1.10. The preceding equations can be applied in more gen-
eral settings than what we described. In particular, Equation 1.8 is applicable
beyond the G/G/s setting: it does not require renewal arrivals and, if
appropriately defined, it can be used for finite capacity queues as well as
some non-FCFS disciplines. Likewise, Equations 1.9 and 1.10 are applicable
in much wider contexts since Little's law can be applied to any flow system,
not just the G/G/s queue setting. In particular, it can be extended to
G/G/s/K queues (as we will see at the end of this section) by appropriately
picking λ values. Also, it is not required that the discipline be FCFS (even
work-conservation is not necessary).
The key benefit of the three equations is that if we can compute one of L,
Lq , W, or Wq , the other three can be obtained by solving the three equations
for the three remaining unknowns. It is worthwhile pointing out that λ and μ
are not unknowns. One can (usually) easily compute λ and μ from the G/G/s
description. We illustrate this using an example next.
Problem 5
A simulation of a G/G/9 queue yielded Wq = 1.92 min. The inputs to the sim-
ulation included a Pareto distribution (with mean 0.1235 min and coefficient
of variation of 2) for the inter-arrival times and a gamma distribution (with
mean 1 min and coefficient of variation of 1) for the service times. Compute
L, Lq , and W.
Solution
Based on the problem description we have a G/G/s queue with s = 9, λ = 8.1
per minute, and μ = 1 per minute. Also, Wq = 1.92 min, which is the aver-
age time a customer waits to begin service. Using Equation 1.10 we can
get Lq = λWq = 8.1 × 1.92 = 15.552 customers that can be seen waiting for
service (on average) in the system. Also, using Equation 1.8 we have
W = Wq + 1/μ = 2.92 min (which is the mean time spent by each customer in
the system). Finally, using Equation 1.9 we get L = λW = 8.1 × 2.92 = 23.652
customers in the system on average in steady state.
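The three-equation bookkeeping in this solution is mechanical and easily scripted; the sketch below simply reproduces the numbers above.

```python
# Problem 5 data: Wq comes from the simulation; lam and mu from the inputs.
lam = 8.1    # arrival rate per minute (approximately 1/0.1235)
mu = 1.0     # service rate per minute
Wq = 1.92    # simulated mean wait before service begins (min)

Lq = lam * Wq        # Equation (1.10): 15.552 customers waiting on average
W = Wq + 1.0 / mu    # Equation (1.8):  2.92 min in the system per customer
L = lam * W          # Equation (1.9):  23.652 customers in the system
```

Given any one of L, Lq, W, or Wq (plus λ and μ), the same three lines recover the other three quantities.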
$$L = L_q + \frac{\lambda}{\mu}.$$
Also, since L is the expected number of customers in the system and Lq is the
expected number of customers in the waiting area, we have λ/μ as the expected
number of customers at the servers. Therefore, the expected number of busy
servers in steady state is λ/μ = sρ. Define random variable Bi as 1 if server i
is busy in steady state and 0 otherwise (i.e., server i is idle). Since the servers
are identical we define pb as the probability that a particular server is busy,
that is, P(Bi = 1) = pb . Also, E[Bi ] = pb for all i ∈ {1, 2, . . . , s}. We saw earlier
that E[B1 + B2 + · · · + Bs ] = λ/μ since the expected number of busy servers is
λ/μ. But E[B1 + B2 + · · · + Bs ] is also spb . Hence we have the probability that
a particular server is busy pb given by
pb = ρ.
Also, for the special single server case of s = 1, that is, G/G/1 queues, the
probability that the system is empty, p0 , is

$$p_0 = 1 - \rho.$$
For a G/G/s/K queue, the effective arrival rate, that is, the average number
of customers that enter the queueing system per unit time, is $\bar\lambda = \lambda(1 - \pi_K^*)$,
since a fraction π∗K of arrivals would be turned away due to a full system.
Also, note that the average rate of departure from the system after being
served is also $\bar\lambda$. Using $\bar\lambda$, we can write down Little's law as

$$L = \bar\lambda W. \tag{1.11}$$

Likewise,

$$W = W_q + \frac{1}{\mu} \qquad \text{and} \qquad L_q = \bar\lambda W_q.$$
πj = π∗j for all j, with both being equal to zero if j > K. Having said that, it is
crucial to point out that for most of the book we will mainly concentrate on
infinite capacity queues (with some exceptions especially in the very next
chapter) due to issues of practicality and ease of analysis. From a practi-
cal standpoint, if a queue actually has finite capacity but the capacity is
seldom reached, approximating the queue as an infinite capacity queue is
reasonable.
pj = π∗j .
This result is called PASTA (Poisson arrivals see time averages) since the
arrivals are seeing time-averaged quantities as described in Section 1.2.5.
The PASTA property can be used further to obtain relations between time
averages and ensemble averages. Recall the definition of L, which can also
be written as
$$L = \sum_{j=0}^{\infty} j\, p_j.$$

Similarly, the kth factorial moment of the number in the system in steady state can
be written as

$$L^{(k)} = k! \sum_{j=k}^{\infty} \binom{j}{k} p_j.$$
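As a concrete check of these formulas (an assumed example, not from the text), consider an M/M/1 queue, for which pj = (1 − ρ)ρ^j in steady state; its kth factorial moment then works out to k! ρ^k/(1 − ρ)^k.

```python
from math import comb, factorial

rho = 0.6
# M/M/1 steady-state distribution p_j = (1 - rho) * rho**j, truncated at a
# point where the neglected tail is numerically negligible.
p = [(1 - rho) * rho**j for j in range(400)]

L = sum(j * pj for j, pj in enumerate(p))  # L = sum_j j p_j = rho/(1-rho) = 1.5

def L_k(k):
    """k-th factorial moment L^(k) = k! * sum_{j>=k} C(j,k) p_j.
    math.comb(j, k) is 0 for j < k, so the full sum is safe."""
    return factorial(k) * sum(comb(j, k) * pj for j, pj in enumerate(p))
```

Here L_k(1) recovers L = 1.5, and L_k(2) equals 2! (ρ/(1 − ρ))² = 4.5, matching the closed form above.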
Also, let W (k) be the kth moment of the waiting time in steady state, that is,
Note that W (1) = W and L(1) = L. Of course using Little’s law we have L = λW.
But Little’s law can be extended for the M/G/s queue in the following
manner. For an M/G/s queue
probably much faster, but since the customer is kept occupied it does not
appear that way).
There are many such situations where humans perceive something as tak-
ing longer when it actually might be shorter. One such example is at fast-food
restaurants. While designing the waiting area, one is faced with choosing
whether to have one long serpentine waiting line or have one line in front of
each server. We will see later in this book that one serpentine line is better
than having one line in front of each server when it comes to minimizing the
expected time spent in the system. But then why do we see fast-food restau-
rants having a line in front of each server? One reason is that most people
feel happier to be on a shorter line than a longer line! Also, for example, if
there are three servers in a fast-food restaurant taking orders, then most peo-
ple feel better being second in line behind a server than being the fourth in
a long serpentine line, although in terms of getting out of the restaurant it
is better to be fourth in a serpentine line with three servers than second in
a line with one server. Even though this appears irrational, the perception
certainly matters while making design decisions.
Another aspect that is crucial for customer satisfaction is to reduce the
anxiety involved in waiting. Providing information, estimates, and guaran-
tees, as well as reducing uncertainties, goes a long way in terms of customer
satisfaction. Getting an estimate of how long the wait is going to be can
reduce the anxiety level. In many phone-based help desks one typically gets
a message saying, “your call will be answered in approximately 5 minutes.”
By saying that one typically does not get impatient for that 5 min. In most
restaurants when one waits for a table, the greeter usually gives an estimate
of how long the wait is. In theme parks one is usually provided information
such as “wait time is 45 minutes from this point.” These days, to avoid road
rage, on many city roads one sees estimated travel times displayed for var-
ious points of interest. It is also crucial to point out that in many instances
providing this waiting information is not only useful in terms of reducing
anxiety but it also enables the customer or user to make choices (such as
considering alternative routes when a road is congested).
In many instances, providing information to customers so that they could
make informed decisions actually improves the system performance. For
example, a note on the highway (or on the radio) informing drivers of an
accident would make some of them take alternate routes. This certainly alleviates the
congestion on the highway where the accident has taken place. Another
example is at theme parks (such as at DisneyWorld) where as a guest you
have the option of standing in a long line or picking up a token that gives
you another window of time to show up for the ride (which also ensures
short wait times). This is a win–win situation because it not only reduces
the agony of waiting and improves the experience of the guests as now they
can enjoy more things, but it also reduces bursts seen during certain times of
the day by spreading them out and thereby running the systems more effi-
ciently. This also allows for classifying customers and catering to their needs,
as opposed to using FCFS for all customers. By not forcing all the customers
to wait or to show up at times specified by the system, the system is able to
satisfy customers that prefer one option versus the other.
The last comment made in the previous sentence essentially states that
putting the onus on the customers to choose the class they want to belong to
makes the system appear more fair and not skewed toward the preference
of one type. The whole notion of fairness, especially in queues with human
entities, is a critical aspect. Customer satisfaction and frustration with waiting
can get worse if there is a perception of the system being unfair. Consider a
grocery store checkout line. Sometimes when a new counter opens up while
everyone is waiting in long lines, the manager invites customers arbitrar-
ily. A lot of customers consider that as unfair. In restaurants one tends to
get annoyed when someone that arrived later gets seated first, although
that might be for practical reasons such as a table of the right size becom-
ing available. But usually when such an unfair thing occurs, the agony of
waiting worsens. Therefore, most systems adopt FCFS as it is a common
notion of being fair. But again that has been questioned by many researchers.
Unfortunately, what is considered fair is completely in the mind of the one
experiencing the situation.
Speaking of situations, the same duration of wait could be tolerable in one
situation and unbearable in another, even for the same person. There are
many reasons for that difference. One has to do with
the customer’s expectations. If one waits longer than expected in one instance
and shorter in another, then although the actual wait times are the same,
the latter leaves the customer more satisfied. In fact, that is why services most often
overestimate the wait time while informing their customers. Also, whether
a wait time is considered acceptable or not depends on what one is waiting
for. There are many things that are considered “worth the wait,” especially
something precious. Of course as we described earlier, it also depends on
what the customer is doing while waiting, an engaged customer would per-
ceive the same wait time as shorter than when the same customer is idle.
There are things that also appeal logically; for example, if it takes 5 min to
download a very large file, that is alright but the same 5 min for a small web
page is ridiculous. In summary, understanding human nature is important
while making design and control decisions.
Human nature not only plays out while considering customer satisfaction
but also in terms of behavior. Balking is the behavior of arriving customers
deciding not to join the queue. Usually, the longer the line, the greater
the chance of balking. But that may not be true all the time, as sometimes a
longer line might imply a better experience! In fact, people balk at lines saying,
“I wonder why no one is here, maybe it is awful.” So it becomes important
to understand the balking behavior and rationale. The same applies to reneging,
which is abandoning a queue after waiting for some time without service
beginning. Again, understanding the behavior associated with reneging
can help develop appropriate models. It was observed that the longer
one waits, the lesser the propensity to renege. This is not intuitive because
one expects customers to wait for some time and become impatient, so with
time the reneging rate should have increased. In fact, systems like emer-
gency management (such as 9-1-1 calls in the United States) use an LCFS
policy because customers that are under true emergency situations tend to
hang up and try again persistently with high rates of reneging. However,
other callers behave in the opposite fashion, that is, wait patiently or renege
and not retry. By understanding the behavior of true emergency callers, a
system that prioritizes such callers without actually knowing their condition
can be built.
Customer behavior and customer satisfaction go hand in hand. Satis-
fied customers behave in a certain way and unsatisfied customers behave
in other ways. In other words, customer behavior is a reaction to customer
satisfaction (or dissatisfaction). On the flip side, for organizations that pro-
vide service, understanding customer behavior and being able to model it
goes a long way in providing customer satisfaction. There are three compo-
nents of customer satisfaction: (a) quality of service (QoS), (b) availability,
and (c) cost. A service system (with its limited resources) is depicted in
Figure 1.7. Customers arrive into such a system; if resources are available
they enter the system, obtain service for which they incur a cost,
and then leave the system. We define availability as the fraction of time
arriving customers enter the system. Thereby QoS is provided only for cus-
tomers that “entered” the system. From an individual customer’s standpoint,
the customer is satisfied if the customer’s requirements over time on QoS,
availability, and cost are satisfied.
The issue of QoS (sometimes called conditional QoS as the QoS is condi-
tioned upon the ability to enter the system) versus availability needs further
discussion. Consider the analogy of visiting a medical doctor. The ability
to get an appointment translates to availability; however, once an appoint-
ment is obtained, QoS pertains to the service rendered at the clinic such as
[FIGURE 1.7: Customer satisfaction in a service system: arriving customers either enter the service system (if resources are available) or are rejected; entering customers obtain service, make a payment, and exit. (From Gautam, N., Quality of Service Metrics. Frontiers in Distributed Sensor Networks, S.S. Iyengar and R.R. Brooks (eds.), Chapman & Hall/CRC Press, Boca Raton, FL, pp. 613–628, 2004. With permission.)]
waiting time and healing quality. Another analogy is airline travel. Getting
a ticket on an airline at a desired time from a desired source to a desired des-
tination is availability. QoS measures include delay, smoothness of flight,
in-flight service, etc. One of the most critical business decisions is to find the
right balance between availability and QoS. A service provider can increase
availability by decreasing QoS and vice versa. A major factor that could
affect QoS and availability is cost. As a final set of comments regarding cus-
tomer satisfaction, consider the relationship between customer satisfaction
and demand. If a service offers excellent customer satisfaction, very soon
its demand would increase. However, as the demand increases, the service
provider would no longer be able to maintain high customer satisfaction;
satisfaction eventually deteriorates, thereby decreasing demand. This is a cycle
one has to plan for carefully.
As a final comment, although this book mainly considers the physics of
queues, that is by no means the only way to design and operate systems. As
we saw in the examples given earlier, by considering psychological issues it
is certainly possible to alleviate anxiety and stress associated with waiting. In
fact, a holistic approach would combine physics and psychology of queues
to address design, control, and operational issues.
Reference Notes
There are several excellent books and papers on various aspects of theory
and applications of queueing models. The list is continuously growing and
something static such as this book may not be the best place for that list.
However, there is an excellent site maintained by Hlynka [54], which is a
phenomenal repository of queueing materials. The website includes a won-
derful historical perspective of queues and cites several papers that touch
upon the evolution of queueing theory. It also provides a large list of queue-
ing researchers, books, course notes, and software among other things such
as queueing humor.
This chapter as well as most of this book has been influenced by a subset
of those fantastic books in Hlynka [54]. In particular, the general results
based on queueing theory are from texts such as Kleinrock [63], Wolff [108],
Gross and Harris [49], Prabhu [89], and Medhi [80]. Then, the applications of
queues to computer and communication systems have predominantly been
influenced by Bolch et al. [12], Menasce and Almeida [81], and Gelenbe and
Mitrani [45]. Likewise, applications to production systems are mostly due to
Buzacott and Shanthikumar [15]. The theoretical underpinnings of this book
in terms of stochastic processes are mainly from Kulkarni [67], which is also
the source for many of the notations used in this chapter as well as others in
this book.
Exercises
1.1 Consider a doctor’s office where reps stop by according to a renewal
process with an inter-renewal time according to a gamma distribu-
tion with mean 25 days and standard deviation 2 days. Whenever a
rep stops, he or she drops off 10 samples of a medication. If a patient
needs the medication, the doctor gives one of the free samples if
available. Patients arrive at the doctor’s office needing medication
according to a Poisson process at an average rate of 1 per day. To
make the model more tractable, assume that all samples given dur-
ing a rep visit must be used or discarded before the next visit by the
rep. Model the number of usable samples at the doctor’s office as a
flow system. Is it stable? Derive expressions for the average input
rate , the time-averaged number of free samples in the doctor’s
office H, and the average number of days each sample stays at the
doctor’s office ?
1.2 Consider a G/D/1/2 queue that is empty at time t = 0. The first eight
arrivals occur at times t = 1.5, 2.3, 2.7, 3.4, 5.1, 5.2, 6.5, and 9.3. The
service times are constant and equal to 1 time unit. Draw a graph
of W(t) and X(t) from t = 0 to t = 6.5. Make sure to indicate arrival
times and departure times. What is the time-averaged workload as
well as number in the system from t = 0 to t = 6.5?
[FIGURE 1.8: Plot of X(t), the number in the system, versus t for a G/G/1 queue.]
1.3 Figure 1.8 represents the number in the system during the first few
minutes of a G/G/1 queue that started empty. On the figure, mark
A3 and D2 . Also, what is X(D2 +)? Derive the time-averaged number
in the system between t = 0 and t = D4 in terms of Ai and Di for
i = 1, 2, 3, 4.
1.4 An assembly line consists of three stages with a single machine at
each stage. Jobs arrive to the first stage one by one and randomly
with an average of one every 30 s. After processing is completed at the first
stage, the jobs get processed at the second stage and then the third stage before
departing. The first-stage machine can process 3 jobs per minute, the
second-stage machine can process 2.5 jobs per minute, and the third-
stage machine can process 2.25 jobs per minute. The average sojourn
times (including waiting and service) at stages 1, 2, and 3 were 1, 2,
and 4 min, respectively. What is the average number of jobs in the
system for the entire assembly line and at each stage? What is the
average time spent by each job in the system? What is the x-factor
for the entire system that is defined as the ratio of the average time
in the system to the average time spent processing for any job?
1.5 Consider a manual car wash station with three bays and no room
for waiting (assume that cars that arrive when all the three bays are
full leave without getting washed there). Cars arrive according to a
Poisson process at an average rate of 1 per minute but not all cars
enter. It is known that the long-run fraction of time there were 0, 1,
2, or 3 bays full are 6/16, 6/16, 3/16, and 1/16, respectively. What is
L for this system? What about W and Wq ? What is the average time
to wash a car?
1.6 Consider a factory floor with two identical machines and jobs arrive
externally at an average rate of one every 20 min. Each job is
processed at the first available machine and it takes an average of
30 min to process a job. The jobs leave the factory as soon as process-
ing is completed. The average work-in-process in the entire system
is 24/7. Compute the steady-state average throughput (number of
processed jobs exiting the system), cycle time (i.e., mean time in the
system), and the long-run fraction of time each machine is utilized.
1.7 Consider a production system as a “black box.” The system pro-
duces only one type of a customized item. The following informa-
tion is known: the cycle time (i.e., average time between when an
order is placed and a product is produced) for any item is 2 h and the
throughput is one item every 15 min on average. It is also known that
the average time spent on processing is only 40/3 min (the rest of the
cycle time is spent waiting). In addition, the standard deviation of
the processing times is also 40/3 min. What is the steady-state aver-
age number of products in the system? When the standard deviation
of the processing time was reduced, it was observed that the aver-
age number in the system became 7, but the throughput remained
the same. What is the new value for cycle time?
1.8 Two barbers own and operate a barber shop. They provide two
chairs for customers who are waiting for a haircut, so the number
of customers in the shop varies between 0 and 4. For n = 0, 1, 2, 3, 4,
the probability pn that exactly n customers are in the barber shop
in steady-state is p0 = 1/16, p1 = 4/16, p2 = 6/16, p3 = 4/16, and
p4 = 1/16.
(a) Calculate L and Lq .
(b) Given that an average of four customers per hour arrive
according to a Poisson process to receive a haircut, determine
W and Wq .
(c) Given that the two barbers are equally fast in giving haircuts,
what is the average duration of a haircut?
1.9 Consider a discrete time queue where time is slotted in minutes.
At the beginning of each minute, with probability p, a new cus-
tomer arrives into the queue, and with probability 1 − p, there are
no arrivals at the beginning of that minute. At the end of a minute
if there are any customers in the queue, one customer departs with
probability q. Also, with probability 1−q, no one departs a nonempty
queue at the end of that minute. Let Zn be the number of customers
in the system at the beginning of the nth minute before any arrivals
occur in that minute. Model {Zn , n ≥ 0} as a discrete time Markov
chain by drawing the transition diagram. What is the condition of
stability? Assuming stability, obtain the steady-state probabilities
for the number of customers in the system as seen by an arriving
customer.
1.10 For an M/M/2/4 system with potential arrivals according to a Pois-
son process with mean rate 3 per minute and mean service time
0.25 min, it was found that p4 = 81/4457 and L = 0.8212. What are
the values of W and Wq for customers that enter the system?
2
Exponential Interarrival and Service Times:
Closed-Form Expressions
The most fundamental queueing models, and perhaps the most researched
as well, are those that can be modeled as continuous time Markov chains
(CTMC). In this chapter, we are specifically interested in such queueing mod-
els for which we can obtain closed-form algebraic expressions for various
performance measures. A common framework for all these models is that
the potential customer arrivals occur according to a Poisson process with
parameter λ, and the service time for each server is according to exp(μ). Note
that the interarrival times for potential customers have an exp(λ) distribution,
but not all arriving customers may enter the system.
The methods to analyze such queues to obtain closed-form expressions
for performance measures essentially amount to solving the CTMC balance
equations. The methods can be broadly classified into three categories. The first
category is a network graph technique that uses flow balance across arcs
called arc cuts on the CTMC rate diagram. Then, there are some instances
where it is difficult to solve the balance equations using arc cuts for which
generating functions would be more appropriate. Finally, in some instances
where neither arc cuts nor generating functions work, it may be possible
to take advantage of a property known as reversibility to obtain closed-form
expressions for the performance measures.
In the next three sections, we describe those three methods with some
examples. It is crucial to realize that the scenarios are indeed examples, but
the methodologies can be used in many more instances. In fact, the focus
of this chapter (and the entire book) is to explain the methodologies using
examples and not showcase the results for various examples of queues. In
other words, we would like to focus on the means and not the ends. There
are several wonderful texts that provide results for every possible queueing
system. Here we concentrate on the techniques used to get those results.
Since the methods in this chapter are based on CTMC analysis, we
explain the basics of that first. Consider an arbitrary irreducible CTMC
{Z(t), t ≥ 0} with state space S and infinitesimal generator matrix Q.
Sometimes Q is also called the rate matrix. In order to obtain the steady-state
probability row vector p = [p_i], we solve pQ = 0 and normalize using
Σ_{i∈S} p_i = 1. The set of equations pQ = 0 is called the balance equations and
can be written for every i ∈ S as

p_i q_{ii} + Σ_{j≠i} p_j q_{ji} = 0.

If we draw the rate diagram, then what is immediately obvious is that since
q_{ii} = −Σ_{j≠i} q_{ij}, we have

Σ_{j≠i} p_i q_{ij} = Σ_{j≠i} p_j q_{ji}.

This means that across each node i, the flow out equals the flow in.
Many times it is not straightforward to solve the balance equations
directly. The next three sections present various simplifying techniques to
solve them and also use the results obtained for various other performance
metrics besides the steady-state probability distribution.
FIGURE 2.1
Arc cut example (rate diagram on states 0, 1, . . . , 5 with transition rates α, β, γ, δ).
The balance equations pQ = 0, that is,

[p_0 p_1 p_2 p_3 p_4 p_5] ⎡ −α−γ    α      γ      0      0      0  ⎤
                          ⎢   β    −β      0      0      0      0  ⎥
                          ⎢   0     δ   −α−γ−δ    α      γ      0  ⎥
                          ⎢   0     0      β     −β      0      0  ⎥
                          ⎢   0     0      0      δ    −α−δ     α  ⎥
                          ⎣   0     0      0      0      β     −β  ⎦
= [0 0 0 0 0 0]
can be manipulated to get the flow balance across the arc cut resulting in
Equation 2.1. It is crucial to understand that the arc cut made earlier is just to
illustrate the theory. In practice, the cut chosen in the example would not be
a good one to use. The objective of these cuts is to write down relationships
between the unknowns so that the unknowns can be solved easily. Therefore,
cuts that result in two or three pi terms would be ideal to use. To illustrate
that, we present the following example problem.
Problem 6
For the CTMC with rate diagram in Figure 2.1, compute the steady-state
probabilities p0 , p1 , p2 , p3 , p4 , and p5 by making appropriate arc cuts and
solving the balance equations.
Solution
For this example, by making five suitable cuts (which ones?), we get the
following relationships
(α + γ)p_0 = βp_1
γp_0 = δp_2
(α + γ)p_2 = βp_3
γp_2 = δp_4
αp_4 = βp_5

Solving these in terms of p_0, we get

p_1 = ((α + γ)/β) p_0,
p_2 = (γ/δ) p_0,
p_3 = ((α + γ)γ/(δβ)) p_0,
p_4 = (γ²/δ²) p_0,
p_5 = (γ²α/(δ²β)) p_0.
Note how much simpler this is compared to solving the node balance
equations. Once we know p0 , we have an expression for all pi . Now, by solv-
ing for p0 using the normalizing relation p0 + p1 + p2 + p3 + p4 + p5 = 1,
we get
p_0 = 1 / [1 + (α + γ)/β + γ/δ + (α + γ)γ/(δβ) + γ²/δ² + γ²α/(δ²β)].
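As a quick numerical check (not part of the original development — the rate values and function name are illustrative), the arc cut solution can be verified against the full balance equations pQ = 0:

```python
from fractions import Fraction

def arc_cut_probs(a, b, g, d):
    """Steady-state probabilities for the CTMC of Figure 2.1, built from
    the arc cut relations (each p_i in terms of p0) and then normalized."""
    p = [Fraction(1),                       # p0 (unnormalized)
         Fraction(a + g, b),                # p1 = (alpha+gamma)/beta * p0
         Fraction(g, d),                    # p2 = gamma/delta * p0
         Fraction((a + g) * g, d * b),      # p3
         Fraction(g * g, d * d),            # p4
         Fraction(g * g * a, d * d * b)]    # p5
    total = sum(p)
    return [x / total for x in p]

# Arbitrary illustrative rates alpha, beta, gamma, delta
a, b, g, d = 2, 3, 1, 4
p = arc_cut_probs(a, b, g, d)

# Full generator matrix of the rate diagram; pQ = 0 must hold exactly.
Q = [[-a - g, a, g, 0, 0, 0],
     [b, -b, 0, 0, 0, 0],
     [0, d, -a - g - d, a, g, 0],
     [0, 0, b, -b, 0, 0],
     [0, 0, 0, d, -a - d, a],
     [0, 0, 0, 0, b, -b]]
balanced = all(sum(p[i] * Q[i][j] for i in range(6)) == 0 for j in range(6))
print(p[0], balanced)
```

Exact rational arithmetic makes the check pQ = 0 hold exactly rather than only to floating-point tolerance.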
A word of caution is that when the cut set A is separated from the state
space S, all the arcs going from A to S − A must be considered. A good
way to make sure of that is to clearly identify the cut set A as opposed to
just performing a cut arbitrarily. Next, we use the arc cut method to obtain
steady-state distributions of the number in the system for a specific class of
queueing systems.
q_{ij} = ⎧ λ     if j = i + 1 and i < K,
         ⎪ iμ    if j = i − 1 and 0 < i < s,
         ⎨ sμ    if j = i − 1 and s ≤ i ≤ K,
         ⎪ −min(1, K − i)λ − min(i, s)μ   if j = i,
         ⎩ 0     otherwise.
λp_0 = μp_1
λp_1 = 2μp_2
λp_2 = 3μp_3
⋮
λp_{s−1} = sμp_s
λp_s = sμp_{s+1}
⋮
λp_{K−1} = sμp_K
FIGURE 2.2
Rate diagram for the M/M/s/K queue (states 0, 1, . . . , K; arrivals at rate λ; service rates μ, 2μ, . . . , sμ, remaining at sμ through state K).
where p_j for any j ∈ S is the probability that there are j customers in the
system in steady state, that is, p_j = lim_{t→∞} P{X(t) = j}.
In addition, since the CTMC is ergodic, p_j is also the long-run fraction of time
the system has j customers, that is,

p_j = lim_{T→∞} (1/T) ∫_0^T I{X(t)=j} dt.
Using the normalizing equation Σ_{i=0}^{K} p_i = 1, we get
p_0 = [ Σ_{n=0}^{s} (1/n!)(λ/μ)^n + ((λ/μ)^s/s!) Σ_{n=s+1}^{K} ρ^{n−s} ]^{−1}

where

ρ = λ/(sμ).
The probability that an arriving customer in steady state is blocked and
turned away is

p_K = ((λ/μ)^K / (s! s^{K−s})) p_0,

since p_K is the probability that in steady state there are K customers in the
system and the arrivals are according to a Poisson process (due to the PASTA
property). Also note that customers are rejected at an average rate of λp_K,
and the mean rate at which customers enter the queue is λ(1 − p_K).
Another performance metric that we can quickly obtain is the long-run
average number of customers in the system that are waiting for their service
to begin (Lq ). Using the definition of Lq and the pj values, we can derive
L_q = Σ_{j=0}^{K} p_j max(j − s, 0) = [p_0 (λ/μ)^s ρ / (s!(1 − ρ)²)] [1 − ρ^{K−s} − (K − s)ρ^{K−s}(1 − ρ)].
W_q = L_q / (λ(1 − p_K)),

W = L_q / (λ(1 − p_K)) + 1/μ,

L = L_q + λ(1 − p_K)/μ,
using the standard system analysis results via Little’s law and the definitions
(see Equations 1.8 through 1.10). Note that for Little’s law, we use the enter-
ing rate λ(1 − pK ) and not the arrival rate because not all arriving customers
enter the system.
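To make these formulas concrete, here is a short computational sketch (the function name is mine); with λ = 3, μ = 4, s = 2, and K = 4 it reproduces the values quoted in Exercise 1.10, namely p_4 = 81/4457 and L ≈ 0.8212:

```python
import math

def mmsk_metrics(lam, mu, s, K):
    """Steady-state distribution and mean performance measures of an
    M/M/s/K queue, from the closed-form results derived above."""
    rho = lam / (s * mu)
    coef = [(lam / mu) ** n / math.factorial(n) for n in range(s + 1)]
    coef += [(lam / mu) ** s / math.factorial(s) * rho ** (n - s)
             for n in range(s + 1, K + 1)]
    p0 = 1.0 / sum(coef)
    p = [c * p0 for c in coef]
    Lq = sum(max(j - s, 0) * p[j] for j in range(K + 1))
    lam_eff = lam * (1 - p[K])      # entering rate: blocked arrivals are lost
    Wq = Lq / lam_eff               # Little's law on the waiting line
    W = Wq + 1 / mu
    L = Lq + lam_eff / mu
    return p, L, Lq, W, Wq

p, L, Lq, W, Wq = mmsk_metrics(lam=3.0, mu=4.0, s=2, K=4)
print(round(p[4], 6), round(L, 4))
```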
In a similar fashion, using pj values, it would be possible to derive higher
moments of the steady-state number in the system and number waiting to be
served. See Exercise 2.1 at the end of this chapter for one such computation.
However, computing higher moments of the time in the system and time in
the queue in steady state for entering customers is a little more tricky. For
that we first derive the steady-state distribution. Let Yq and Y be random
variables that respectively denote the time spent by an entering customer in
the queue and in the system (including service). To obtain the cumulative
distribution function (CDF) of Yq and Y, we require that the entities in the sys-
tem are served according to FCFS. The analysis until now did not require FCFS,
and for any work-conserving discipline, the results would continue to hold.
However, for the following analysis, we specifically take the default FCFS
condition into account. Having said that, it is worthwhile to mention that
other service disciplines can also be considered, but we will only consider
FCFS here.
To obtain the CDF of Yq that we denote as FYq (t), we first reiterate that Yq
is in fact a conditional random variable. In other words, it is the time spent
waiting to begin service for a customer that entered the system in steady
state, that is, given that there were less than K in the system upon arrival
in steady state. Since the arrivals are according to a Poisson process, due to
PASTA the probability that an arriving customer in steady state will see j in
the system is pj . Also, the probability that an entering customer in steady state
would see j others in the system is the probability that an arriving customer
would see j others in the system given that there are less than K in the system.
Using a relatively straightforward conditional probability argument, one can
show that the probability that an entering customer will see j in the system
in steady state is pj /(1 − pK ) for j = 0, 1, . . . , K − 1.
Also, if an entering customer sees j in the system, the time this customer
spends before service begins is 0 if 0 ≤ j < s (why?) and an Erlang(j − s + 1, sμ)
random variable if s ≤ j < K. The explanation for the Erlang part is that
all s servers remain busy throughout the time the entering customer waits
before service, and this waiting time corresponds to j − s + 1 service completions.
However, since each service completion corresponds to the minimum of s
random variables that are according to exp(μ), service completions occur
according to exp(sμ) (due to minimum of exponentials property). Further, since
the sum of j − s + 1 exp(sμ) random variables is an Erlang(j − s + 1, sμ)
random variable (due to the sum of independently and identically distributed
[IID] exponentials property), we get the desired result. Thus, using the defini-
tion and CDF of the Erlang random variable, we can derive the following
by conditioning on j customers in the system upon entering in steady
state:
F_{Yq}(t) = 1 − Σ_{j=s}^{K−1} [p_j/(1 − p_K)] Σ_{r=0}^{j−s} e^{−sμt} (sμt)^r / r!.
Once the CDF of Y_q is known, F_Y(t), the CDF of Y, can be obtained using
the fact that Y − Y_q is an exp(μ) random variable, corresponding to the service
time of this entering customer. Therefore, we have for K > s > 1 the CDF as

F_Y(t) = P{Y ≤ t} = ∫_0^t F_{Yq}(t − u) μe^{−μu} du = ∫_0^t (1 − e^{−μ(t−u)}) dF_{Yq}(u) + (1 − e^{−μt}) F_{Yq}(0).
Substituting the expression for F_{Yq}(·) and carrying out the integration
(each term reduces to an incomplete gamma integral of the form
∫_0^t v^r e^{−(s−1)μv} dv), the CDF simplifies to

F_Y(t) = 1 − e^{−μt} − (e^{−μt}/(s − 1)) Σ_{j=s}^{K−1} [p_j/(1 − p_K)] Σ_{r=0}^{j−s} (s/(s − 1))^r [1 − e^{−(s−1)μt} Σ_{i=0}^{r} ((s − 1)μt)^i / i!].
Note here that since Y_q is a random variable with mass at 0, one has to be
additionally careful with the convolution, realizing that F_{Yq}(0) is nonzero.
Also, when K = s, F_Y(t) = 1 − e^{−μt} since the sojourn time equals the service
time for entering customers. The case s = 1 can be addressed by rederiving the
integral using s = 1.
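The waiting-time and sojourn-time CDFs can be sanity-checked numerically. In the sketch below (parameters arbitrary with K > s > 1; all names are mine), F_Y is computed both from a closed-form expression of the type derived above and, independently, by numerically convolving F_{Yq} with the exp(μ) service time; the two evaluations should agree.

```python
import math

LAM, MU, S, K = 3.0, 4.0, 2, 4      # arbitrary example with K > s > 1

def mmsk_probs(lam, mu, s, K):
    """Steady-state probabilities p_0, ..., p_K of the M/M/s/K queue."""
    r = lam / mu
    coef = [r ** n / math.factorial(n) for n in range(s + 1)]
    coef += [r ** s / math.factorial(s) * (lam / (s * mu)) ** (n - s)
             for n in range(s + 1, K + 1)]
    p0 = 1.0 / sum(coef)
    return [c * p0 for c in coef]

P = mmsk_probs(LAM, MU, S, K)

def F_Yq(t):
    """CDF of the waiting time before service of an entering customer (FCFS)."""
    smu = S * MU
    tail = sum(P[j] / (1 - P[K]) *
               sum(math.exp(-smu * t) * (smu * t) ** r / math.factorial(r)
                   for r in range(j - S + 1))
               for j in range(S, K))
    return 1.0 - tail

def F_Y(t):
    """Closed-form sojourn time CDF (requires K > s > 1)."""
    a = (S - 1) * MU
    acc = sum(P[j] / (1 - P[K]) * (S / (S - 1)) ** r *
              (1 - math.exp(-a * t) *
               sum((a * t) ** i / math.factorial(i) for i in range(r + 1)))
              for j in range(S, K) for r in range(j - S + 1))
    return 1 - math.exp(-MU * t) - math.exp(-MU * t) / (S - 1) * acc

def F_Y_conv(t, n=2000):
    """Same CDF from the convolution of F_Yq with the exp(MU) service time."""
    h = t / n
    g = [F_Yq(t - i * h) * MU * math.exp(-MU * i * h) for i in range(n + 1)]
    return h * (0.5 * g[0] + sum(g[1:n]) + 0.5 * g[n])

print(round(F_Y(1.0), 5), round(F_Y_conv(1.0), 5))
```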
Alternatively, one can work with the Laplace–Stieltjes transform (LST) of Y,
defined as

F̃_Y(w) = E[e^{−wY}] = ∫_0^∞ e^{−wu} dF_Y(u).
Once we know the LST, there are several techniques to invert it to obtain the
CDF FY (·), for example, direct computation by looking up tables, converting
to Laplace transform (LT) and inverting it, or by numerical transform inver-
sion. However, moments of Y can easily be obtained by taking derivatives.
With that understanding, the LST can be computed using the definition of Y
(as opposed to taking the LST of FY (·)) as
F̃_Y(w) = Σ_{j=0}^{s−1} [p_j/(1 − p_K)] μ/(μ + w) + Σ_{j=s}^{K−1} [p_j/(1 − p_K)] [μ/(μ + w)] [sμ/(sμ + w)]^{j−s+1}

= Σ_{j=0}^{s−1} [p_j/(1 − p_K)] μ/(μ + w) + [p_0/(1 − p_K)] [μ/(μ + w)] (1/s!) [sμ/(sμ + w)] (λ/μ)^s × [1 − (λ/(sμ + w))^{K−s}] / [1 − λ/(sμ + w)].
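As a numerical illustration of extracting moments from the LST (a sketch with illustrative parameters; names are mine), E[Y] = −dF̃_Y/dw at w = 0 can be estimated by a central difference and compared with W obtained from L_q and Little's law:

```python
import math

LAM, MU, S, K = 3.0, 4.0, 2, 4      # arbitrary illustrative parameters

# Steady-state probabilities of the M/M/s/K queue (recomputed here so the
# sketch is self-contained).
r = LAM / MU
coef = [r ** n / math.factorial(n) for n in range(S + 1)]
coef += [r ** S / math.factorial(S) * (LAM / (S * MU)) ** (n - S)
         for n in range(S + 1, K + 1)]
P = [c / sum(coef) for c in coef]

def lst_Y(w):
    """LST of the sojourn time Y of an entering customer."""
    q = [pj / (1 - P[K]) for pj in P]
    val = sum(q[j] * MU / (MU + w) for j in range(S))
    val += sum(q[j] * (MU / (MU + w)) * (S * MU / (S * MU + w)) ** (j - S + 1)
               for j in range(S, K))
    return val

h = 1e-4
EY = (lst_Y(-h) - lst_Y(h)) / (2 * h)   # E[Y] = -dF~_Y/dw at w = 0

# W from L_q and Little's law, for comparison
Lq = sum(max(j - S, 0) * P[j] for j in range(K + 1))
W = Lq / (LAM * (1 - P[K])) + 1 / MU
print(round(EY, 6), round(W, 6))
```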
Problem 7
Using the same notations as the M/M/s/K queue earlier, derive the distribu-
tion for the number of entities in the system and also the sojourn time in the
system for the classic M/M/1 queue. Do any conditions need to be satisfied?
Obtain expressions for L, W, Lq , and Wq .
Solution
The M/M/1 queue is a special case of the M/M/s/K queue with s = 1 and
K = ∞. Most of the performance measures can be obtained by letting s = 1
and K = ∞ in the M/M/s/K analysis. Hence, unless necessary, the results
will not be derived, but the reader is encouraged to verify them. However,
there is one issue. While letting K = ∞, we need to ensure that the CTMC
that models the number of customers in the system for the M/M/1 queue is
positive recurrent. The condition for positive recurrence (and hence stability
of the system) is
ρ = λ/μ < 1.
In other words, the average arrival rate (λ) must be smaller than the aver-
age service rate (μ) to ensure stability. This is intuitive because in order for
the system to be stable, the server should be able to remove customers on
average faster than the speed at which they enter. Note that when K is finite,
instability is not an issue.
The long-run probability that the number of customers in the system is j
(for all j ≥ 0) is given by p_j = ρ^j p_0, which can be obtained by writing the
balance equations for p_j in terms of p_0. Now the normalizing equation
Σ_j p_j = 1
can be solved only when ρ < 1, and hence this is called the condition for
positive recurrence. Therefore, when ρ < 1, we have
p_0 = 1 − ρ

and

L = λ/(μ − λ),

L_q = λ²/(μ(μ − λ)),

W = 1/(μ − λ),

W_q = λ/(μ(μ − λ)).
It is crucial to realize that all these results require that ρ < 1. Also note that
while using Little’s law, one can use λ as the entering rate as no customers
are going to be turned away. Besides these performance metrics, one can also
derive higher moments of the steady-state number in the system. However,
to obtain the higher moments of the time spent in the system by a customer
arriving at steady state, one technique is to use the distribution of the time
in the system Y. By letting s = 1, K = ∞, and using pj = (1 − λ/μ)(λ/μ)j in the
M/M/s/K results, we get the LST after some algebraic manipulation as
F̃_Y(w) = (μ − λ)/(μ − λ + w),

which is the LST of an exponential random variable with parameter μ − λ; hence

Y ∼ exp(μ − λ).
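A minimal numerical sketch of the M/M/1 results (illustrative λ and μ; the function name is mine): the closed-form L should match the series Σ_j j p_j with p_j = (1 − ρ)ρ^j, and the measures should satisfy Little's law.

```python
def mm1_metrics(lam, mu):
    """Closed-form M/M/1 measures; requires rho = lam/mu < 1."""
    assert lam < mu, "unstable: requires lam < mu"
    L = lam / (mu - lam)
    Lq = lam ** 2 / (mu * (mu - lam))
    W = 1 / (mu - lam)
    Wq = lam / (mu * (mu - lam))
    return L, Lq, W, Wq

lam, mu = 3.0, 4.0
L, Lq, W, Wq = mm1_metrics(lam, mu)

# Cross-check L against the distribution p_j = (1 - rho) * rho**j
rho = lam / mu
L_series = sum(j * (1 - rho) * rho ** j for j in range(2000))
print(L, L_series)
```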
Problem 8
Using the same notations as the M/M/s/K described earlier in this section,
derive the distribution for the number of entities in the system and also the
sojourn time in the system for the multiserver M/M/s queue. What is the
stability condition that needs to be satisfied? Obtain expressions for L, W, Lq ,
and Wq .
Solution
The M/M/s queue is a special case of the M/M/s/K queue with K = ∞.
Most of the performance measures can be obtained by letting K = ∞ in the
M/M/s/K analysis; hence unless necessary the results will not be derived
but the reader is encouraged to verify them. Similar to the M/M/1 queue,
here too we need to be concerned about stability while letting K = ∞. The
condition for stability is
ρ = λ/(sμ) < 1.
In other words, the average arrival rate (λ) must be smaller than the average
service capacity (sμ) across all servers to ensure stability.
By writing down the balance equations for the CTMC or by letting K = ∞
in the M/M/s/K results, we can derive the long-run probability that the
number of customers in the system is j (when ρ < 1) as
p_j = ⎧ (1/j!)(λ/μ)^j p_0           if 0 ≤ j ≤ s − 1,
      ⎩ (1/(s! s^{j−s}))(λ/μ)^j p_0   if j ≥ s,

where

p_0 = [ Σ_{n=0}^{s−1} (1/n!)(λ/μ)^n + ((λ/μ)^s/s!) · 1/(1 − λ/(sμ)) ]^{−1}.
Either using pj from the previous equation (and using Little’s law wher-
ever needed) or by letting K = ∞ in the M/M/s/K results, we have
L_q = p_0 (λ/μ)^s λ / (s! sμ [1 − λ/(sμ)]²),

L = λ/μ + p_0 (λ/μ)^s λ / (s! sμ [1 − λ/(sμ)]²),

W = 1/μ + p_0 (λ/μ)^s / (s! sμ [1 − λ/(sμ)]²),

W_q = p_0 (λ/μ)^s / (s! sμ [1 − λ/(sμ)]²).
Likewise, letting K = ∞ in the M/M/s/K sojourn-time results, the LST of Y is

F̃_Y(w) = Σ_{j=0}^{s−1} p_j μ/(μ + w) + p_0 [μ/(μ + w)] (1/s!) [sμ/(sμ + w − λ)] (λ/μ)^s.

To invert this LST, use the partial-fraction expansion

1/[(μ + w)(sμ + w − λ)] = 1/[(μ + w)(sμ − μ − λ)] − 1/[(sμ − λ − μ)(sμ + w − λ)].

Inverting term by term (assuming (s − 1)μ ≠ λ) yields, for y ≥ 0,

F_Y(y) = Σ_{j=0}^{s−1} p_j (1 − e^{−μy}) + p_0 [(λ/μ)^s/s!] [ (sμ/((s − 1)μ − λ)) (1 − e^{−μy}) − (sμ²/((sμ − λ)[(s − 1)μ − λ])) (1 − e^{−(sμ−λ)y}) ].

The reader is encouraged to verify that E[Y] results in the expression
for W, and that by letting s = 1 we get Y ∼ exp(μ − λ), the M/M/1
sojourn time expression.
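As a numerical check of the M/M/s sojourn-time distribution (a sketch with arbitrary parameters satisfying (s − 1)μ ≠ λ; names are mine), one can integrate the tail of the CDF: since E[Y] = ∫_0^∞ (1 − F_Y(y)) dy, the result should equal W.

```python
import math

def mms_FY(y, lam, mu, s):
    """Sojourn time CDF of a stable M/M/s queue ((s-1)*mu != lam)."""
    r = lam / mu
    p0 = 1.0 / (sum(r ** n / math.factorial(n) for n in range(s)) +
                r ** s / math.factorial(s) / (1 - lam / (s * mu)))
    head = sum(r ** j / math.factorial(j) * p0 for j in range(s))  # p_0+..+p_{s-1}
    B = p0 * r ** s / math.factorial(s)
    return (head * (1 - math.exp(-mu * y))
            + B * s * mu / ((s - 1) * mu - lam) * (1 - math.exp(-mu * y))
            - B * s * mu ** 2 / ((s * mu - lam) * ((s - 1) * mu - lam))
              * (1 - math.exp(-(s * mu - lam) * y)))

lam, mu, s = 3.0, 2.0, 2          # rho = 3/4; note (s-1)*mu = 2 != lam = 3
# E[Y] = integral of the tail 1 - F_Y, by the midpoint rule
n, T = 20000, 40.0
h = T / n
EY = h * sum(1 - mms_FY((i + 0.5) * h, lam, mu, s) for i in range(n))

r = lam / mu
p0 = 1.0 / (sum(r ** k / math.factorial(k) for k in range(s)) +
            r ** s / math.factorial(s) / (1 - lam / (s * mu)))
W = 1 / mu + p0 * r ** s / (math.factorial(s) * s * mu * (1 - lam / (s * mu)) ** 2)
print(round(EY, 4), round(W, 4))
```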
Problem 9
Using the same notations as the M/M/s/K described earlier in this section,
derive the distribution for the number of entities in the system and also the
sojourn time in the system for the single-server finite capacity M/M/1/K
queue. What is the rate at which customers enter the system on an average?
Obtain expressions for L, W, Lq , and Wq .
Solution
The M/M/1/K queue is a special case of the M/M/s/K system with s = 1.
All the results presented here are obtained by letting s = 1 for the corre-
sponding M/M/s/K results. We define ρ = λ/μ; however, since K is finite ρ
can be greater than one and the system would still be stable. The steady-
state probability that there are j customers in the system (for 0 ≤ j ≤ K) is
given by
p_j = ρ^j (1 − ρ) / (1 − ρ^{K+1}).
In particular, the fraction of arrivals that are turned away due to a full
system is
p_K = ρ^K (1 − ρ) / (1 − ρ^{K+1}).
The average number of customers in the system is

L = [Kρ^{K+2} − (K + 1)ρ^{K+1} + ρ] / [(1 − ρ)(1 − ρ^{K+1})].
The LST of the sojourn time of an entering customer reduces to

F̃_Y(w) = Σ_{j=0}^{K−1} [p_j/(1 − p_K)] (μ/(μ + w))^{j+1}.
This LST can be inverted using the fact that (μ/(μ + w))^{j+1} is the LST of an
Erlang(j + 1, μ) distribution to get

F_Y(y) = P{Y ≤ y} = Σ_{j=0}^{K−1} [p_j/(1 − p_K)] (1 − e^{−μy} Σ_{r=0}^{j} (μy)^r / r!)
for y ≥ 0. The reader is encouraged to cross-check all the results with the
M/M/1 queue by letting K → ∞ and assuming ρ < 1.
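A short sketch of the M/M/1/K results (illustrative parameters and function name; note that ρ > 1 is allowed because the buffer is finite). Letting K grow with ρ < 1 should recover the M/M/1 value L = λ/(μ − λ):

```python
def mm1k_metrics(lam, mu, K):
    """M/M/1/K steady-state distribution and mean measures (rho != 1)."""
    rho = lam / mu
    norm = (1 - rho) / (1 - rho ** (K + 1))
    p = [rho ** j * norm for j in range(K + 1)]
    L = (K * rho ** (K + 2) - (K + 1) * rho ** (K + 1) + rho) \
        / ((1 - rho) * (1 - rho ** (K + 1)))
    lam_eff = lam * (1 - p[K])      # entering rate
    W = L / lam_eff                 # Little's law
    Wq = W - 1 / mu
    Lq = lam_eff * Wq
    return p, L, Lq, W, Wq

p, L, Lq, W, Wq = mm1k_metrics(lam=5.0, mu=4.0, K=10)   # rho = 1.25 > 1 is fine
print(round(p[10], 4), round(L, 4))
```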
Problem 10
Using the same notations as the M/M/s/K described earlier in this section,
derive the distribution for the number of entities in the queue-less M/M/s/s
system. What if s = ∞? Is it trivial to obtain the sojourn time distribu-
tions for customers that enter the system? Obtain expressions for L, W, Lq ,
and Wq .
Solution
The last of the special cases of the M/M/s/K system is the case when s = K,
which gives rise to the M/M/s/s system. Note that no customers wait for
service. If there is a server available, an arriving customer enters the system
otherwise the customer is turned away. This is also known as the Erlang loss
system. Similar to the previous special cases, here too one can either work
with the M/M/s/K system letting s = K or start with the CTMC.
The probability that there are j (for j = 0, . . . , s) customers in the system in
steady state is
p_j = [(λ/μ)^j / j!] / [Σ_{i=0}^{s} (λ/μ)^i / i!].
Therefore, the Erlang loss formula is the probability that a customer arriving
to the system in steady state is rejected (or the fraction of arriving customers
that are lost in the long run) and is given by
p_s = [(λ/μ)^s / s!] / [Σ_{i=0}^{s} (λ/μ)^i / i!].
Although we do not derive the result here (see Section 4.5.3), it is worthwhile
to point out the remarkable fact that the previous formulae hold even for the
M/G/s/s system with mean service time 1/μ. In other words, the steady-
state distribution of the number in the system for an M/G/s/s queue does
not depend on the distribution of the service time.
Using the steady-state number in the system, we can derive
L = (λ/μ)(1 − p_s).
Since the effective entering rate into the system is λ(1 − ps ), we get W = 1/μ.
This is intuitive because for customers that enter the system, since there is no
waiting for service, the average sojourn time is indeed the average service
time. For the same reason, the sojourn time distribution for customers that
enter the system is the same as that of the service time, that is, Y ∼ exp(μ). In
addition, since there is no waiting for service, Lq = 0 and Wq = 0.
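In practice the Erlang loss probability is usually computed with the stable recursion B(0) = 1, B(n) = aB(n − 1)/(n + aB(n − 1)) with offered load a = λ/μ, rather than with factorials, which overflow for large s. This recursion is standard but is not derived in the text; the sketch below checks it against the direct formula for a small example:

```python
import math

def erlang_b(a, s):
    """Erlang loss (blocking) probability p_s for offered load a = lam/mu,
    via the stable recursion B(0) = 1, B(n) = a*B(n-1)/(n + a*B(n-1))."""
    B = 1.0
    for n in range(1, s + 1):
        B = a * B / (n + a * B)
    return B

a, s = 8.0, 10
ps = erlang_b(a, s)
# Direct formula from the text (fine for small s; factorials overflow for large s)
direct = (a ** s / math.factorial(s)) / sum(a ** i / math.factorial(i)
                                            for i in range(s + 1))
L = a * (1 - ps)      # mean number of busy servers
print(round(ps, 6), round(L, 4))
```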
It is customary to consider a special case of the M/M/s/s system, which is
when s = ∞. We call that the M/M/∞ queue since K takes the default value of
infinity. For the M/M/∞ system, the probability that there are j customers in
the system in the long run is
p_j = (1/j!)(λ/μ)^j e^{−λ/μ}.
Having described the M/M/s/K queue and its special cases in detail, next
we move to other CTMC-based queueing systems for which arc cuts are
inadequate and we demonstrate other methodological tools.
be easily solved using arc cuts) for illustration purposes, followed by some
detailed ones.
Problem 11
Consider a CTMC with S = {0, 1, 2, 3, . . .} for which we are interested in
obtaining the steady-state probabilities p0 , p1 , . . . represented using a gen-
erating function. For all i ∈ S and j ∈ S, let the elements of the infinitesimal
generator matrix Q be
q_{ij} = ⎧ α         if j = i + 1,
         ⎪ −α − iβ   if j = i,
         ⎨ iβ        if j = i − 1 and i > 0,
         ⎩ 0         otherwise.
Obtain an expression for the generating function

Ψ(z) = Σ_{i=0}^{∞} p_i z^i.
Solution
The balance equations can be written for i > 0 as

p_i(α + iβ) = p_{i−1}α + p_{i+1}(i + 1)β,

and for i = 0 as

−p_0 α + p_1 β = 0. (2.3)
Multiplying the equation for state i by z^i and summing over all i, we get

Σ_{i=0}^{∞} p_i z^i (α + iβ) = Σ_{i=0}^{∞} (p_i z^i αz + i p_i z^{i−1} β).

If this derivation is not clear, it may be better to write down the balance equations
for i = 0, 1, 2, 3, . . ., multiply them by 1, z, z², z³, . . ., and then see how
that results in the previous equation. We can rewrite the previous equation
in terms of Ψ(z) as

αΨ(z) + βzΨ′(z) = αzΨ(z) + βΨ′(z),
where Ψ′(z) = dΨ(z)/dz. Upon rearranging terms, we get the differential
equation

Ψ′(z) = (α/β) Ψ(z),

which can be solved by dividing the equation by Ψ(z) and integrating both
sides with respect to z to get

log(Ψ(z)) = (α/β) z + c.

Using the condition Ψ(1) = 1 (why?), the constant c can be obtained as equal
to −α/β. Thus, we obtain the generating function Ψ(z) as

Ψ(z) = e^{(α/β)(z−1)},

which is the generating function of a Poisson distribution with mean α/β;
hence p_i = e^{−α/β}(α/β)^i / i! for i ≥ 0.
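The derivation yields Ψ(z) = e^{(α/β)(z−1)}, the generating function of a Poisson distribution with mean α/β. As a quick check (illustrative rates; not part of the text), the Poisson probabilities indeed satisfy the balance equations of this CTMC:

```python
import math

alpha, beta = 3.0, 2.0     # illustrative rates
m = alpha / beta           # mean of the resulting Poisson distribution

# Coefficients of Psi(z) = exp((alpha/beta)(z - 1)): Poisson(m) probabilities
p = [math.exp(-m) * m ** i / math.factorial(i) for i in range(60)]

# Balance: p_i (alpha + i*beta) = p_{i-1} alpha + p_{i+1} (i+1) beta
balanced = all(
    abs(p[i] * (alpha + i * beta)
        - (p[i - 1] * alpha + p[i + 1] * (i + 1) * beta)) < 1e-12
    for i in range(1, 40))
print(balanced)
```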
Problem 12
Consider an M/M/1 queue with arrival rate λ per hour and service rate μ
per hour. For j = 0, 1, 2, . . ., let pj be the steady-state probability that there are
j in the system. Using the balance equations, derive an expression for the
generating function
Ψ(z) = Σ_{i=0}^{∞} p_i z^i.
Solution
The node balance equations are

p_0 λ = p_1 μ
p_1(λ + μ) = p_0 λ + p_2 μ
p_2(λ + μ) = p_1 λ + p_3 μ
p_3(λ + μ) = p_2 λ + p_4 μ
⋮
and multiply the first equation by z⁰, the second by z¹, the third by z², the
fourth by z³, and so on. Then, upon adding we get

(p_0 z⁰ + p_1 z¹ + p_2 z² + p_3 z³ + ⋯)(λ + μ) − p_0 μ
= (p_0 z⁰ + p_1 z¹ + p_2 z² + p_3 z³ + ⋯)λz + (p_0 z⁰ + p_1 z¹ + p_2 z² + p_3 z³ + ⋯)(μ/z) − p_0 μ/z.
Recognizing Ψ(z) = Σ_{i=0}^{∞} p_i z^i, we get

Ψ(z) = p_0 μ(1 − z) / (μ − (λ + μ)z + λz²) = p_0 μ / (μ − λz).

Using the normalization Ψ(1) = 1 gives p_0 = 1 − λ/μ, so that, with ρ = λ/μ,

Ψ(z) = (μ − λ)/(μ − λz) = (1 − ρ)/(1 − ρz).
The mean number in the system can then be computed as L = Ψ′(1), where
Ψ′(z) = dΨ(z)/dz. Using the expression for Ψ(z), we have

Ψ′(z) = (1 − ρ)ρ / (1 − ρz)².

Therefore, we have

L = Ψ′(1) = ρ/(1 − ρ) = λ/(μ − λ).
Then, using Little's law and the definitions (Equations 1.8 through 1.10), we get

L_q = λ²/(μ(μ − λ)),

W = 1/(μ − λ),

W_q = λ/(μ(μ − λ)).
Although we have seen these results before in Problem 7, the main reason
they are presented here is to get an appreciation of the generating function as
an alternate technique. In the next few examples, the flow balance equations
may not be easily solved (also the arc cuts would not be useful) and the
power of generating functions will become clearly evident.
retries after another exp(θ) time. This process continues until the customer
is served. This system is called a retrial queue. A popular application of this
is the telephone switch. If there are s lines in a telephone switch and all of
them are busy, the customer making a call gets a message “all lines are busy
please try your call later” and the customer retries after a random time. In
the following example, we consider another application (albeit a much sim-
plified model than what is observed in practice) where s = 1, which is used
in modeling Ethernets with exponential back-off.
Problem 13
Consider a simplified model of the Ethernet (an example of a local area net-
work). Requests arrive according to a Poisson process with rate λ per unit
time on average to be transmitted on the Ethernet cable. If the Ethernet
cable is free, the request is immediately transmitted and the transmission
time is exponentially distributed with mean 1/μ. However, if the cable is
busy transmitting another request, this request waits for an exp(θ) time and
retries (this is called exponential back-off in the networking literature). Note
that every time a retrial occurs, if the Ethernet cable is busy the request
gets backed off for a further exp(θ) time. Model the system using a CTMC
and write down the balance equations. Then obtain the following steady-
state performance measures: probability that the system is empty (i.e., no
transmissions and no backlogs), fraction of time the Ethernet cable is busy
(i.e., utilization), mean number of requests in the system, and cycle time
(i.e., average time between when a request is made and its transmission is
completed).
Solution
The state of the system at time t can be modeled using two variables: X(t)
denoting the number of backlogged requests and Y(t) the number of requests
being transmitted on the Ethernet cable. The resulting bivariate stochastic
process {(X(t), Y(t)), t ≥ 0} is a CTMC with rate diagram given in Figure 2.3.
To explain this rate diagram, consider node (3,0). This state represents three
messages that have been backed off and each of them would retry after
FIGURE 2.3
Rate diagram for the retrial queue (states (i, 0) and (i, 1); new arrivals occur at rate λ, retrials from state (i, 0) at rate iθ, and transmission completions at rate μ).
exp(θ) time. Hence, the first of them would retry after exp(3θ) time result-
ing in state (2,1). Also, a new request could arrive when the system is in
state (3,0) at rate λ which would result in a new state (3,1). Note that in state
(3,0), there are no requests being transmitted. Now consider state (3,1). A
new request arrival at rate λ would be backed off resulting in the new state
(4,1); however, a transmission completion at rate μ would result in the new
state (3,0). Note that a retrial in state (3,1) would not change the state of the
system and is not included. However, even if one were to consider the event
of retrial, it would get canceled in the balance equations and hence we need
not include it.
Although one could write down the node balance equations, it is much
simpler when we consider arc cuts. Specifically, cuts around nodes (i, 0) for
all i would result in the following balance equations:
p_{0,0} λ = p_{0,1} μ
p_{1,0}(λ + θ) = p_{1,1} μ
p_{2,0}(λ + 2θ) = p_{2,1} μ
⋮
and vertical cuts on the rate diagram would result in the following balance
equations:
p_{0,1} λ = θp_{1,0}
p_{1,1} λ = 2θp_{2,0}
p_{2,1} λ = 3θp_{3,0}
p_{3,1} λ = 4θp_{4,0}
⋮
and we will leave it as an exercise for the reader to solve for p_{0,0} using the
previous set of equations. We consider using generating functions. Let Ψ_0(z)
and Ψ_1(z) be defined as follows:

Ψ_0(z) = Σ_{i=0}^{∞} p_{i,0} z^i

and

Ψ_1(z) = Σ_{i=0}^{∞} p_{i,1} z^i.
For the first set of balance equations, if we multiply the first equation by z⁰,
the second by z¹, the third by z², the fourth by z³, and so on, then upon
adding we get

λΨ_0(z) + θzΨ_0′(z) = μΨ_1(z).
Likewise, if we multiply the first equation in the second set of balance equations
by z⁰, the second equation by z¹, the third by z², the fourth by z³, and
so on, then upon adding we get

λΨ_1(z) = θΨ_0′(z).

Combining these relations, we get

Ψ_1(z) = [λ/(μ − λz)] Ψ_0(z), (2.6)

(λ/μ)Ψ_0(z) + (θz/μ)Ψ_0′(z) = (θ/λ)Ψ_0′(z). (2.7)
Using Equation 2.6 and the fact that Ψ_0(1) + Ψ_1(1) = 1, we get

Ψ_1(1) = λ/μ = 1 − Ψ_0(1)

provided λ < μ, which is the condition for stability. Therefore, if λ < μ, the
utilization or fraction of time the Ethernet cable is busy is Ψ_1(1) = λ/μ. Hence, the
fraction of time the Ethernet cable is idle in steady state is 1 − λ/μ. Also, if we
can solve for Ψ_0(z) in the (differential) Equation 2.7, then we can immediately
compute Ψ_1(z) using Equation 2.6. This is precisely what we do next.
Letting y = Ψ_0(z), we can rewrite Equation 2.7 as

(1/y) dy = [λ/μ] / [(θ/λ) − (θ/μ)z] dz.
Integrating both sides, we get

log(y) + k = −(λ/θ) log(θ/λ − θz/μ),

where k is a constant of integration. Determining k using the boundary
condition Ψ_0(1) = 1 − λ/μ, we obtain

Ψ_0(z) = (1 − λ/μ) (1/λ − z/μ)^{−λ/θ} (1/λ − 1/μ)^{λ/θ} = (1 − λ/μ)^{(λ/θ)+1} (μ/(μ − λz))^{λ/θ}.

Then, using Equation 2.6,

Ψ_1(z) = [λ/(μ − λz)] (1 − λ/μ)^{(λ/θ)+1} (μ/(μ − λz))^{λ/θ}.
Next, we obtain the performance metrics using Ψ_0(z) and Ψ_1(z). We have
already computed the utilization of the Ethernet cable. The probability that
the system is empty with no transmissions or backlogs is p_{0,0}, which is equal
to Ψ_0(0), and hence

p_{0,0} = (1 − λ/μ)^{(λ/θ)+1}.
! "
λ2 μ + θ
0
(1) + 1
(1) =
θμ μ − λ
and the mean number of requests being transmitted in the system is λ/μ
(i.e., the cable utilization). Therefore, the mean number of requests in the
system (L) is
! "
λ λ2 μ + θ λ(λ + θ)
L= + = .
μ θμ μ − λ θ(μ − λ)
Hence, using Little’s law, the cycle time (W) which is the average time
between when a request is made and its transmission is completed, is
W = (λ + θ) / (θ(μ − λ)).
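The retrial-queue formulas can be cross-checked by solving the two sets of cut equations directly (a numerical sketch with arbitrary rates and function names of mine; the backlog is truncated at a large N, which is harmless because the probabilities decay geometrically):

```python
def retrial_closed_form(lam, mu, theta):
    """Closed-form measures for the retrial queue (requires lam < mu)."""
    p00 = (1 - lam / mu) ** (lam / theta + 1)
    util = lam / mu
    L = lam * (lam + theta) / (theta * (mu - lam))
    W = (lam + theta) / (theta * (mu - lam))
    return p00, util, L, W

def retrial_numeric(lam, mu, theta, N=400):
    """Solve the cut equations p_{i,0}(lam + i*theta) = mu*p_{i,1} and
    lam*p_{i,1} = (i+1)*theta*p_{i+1,0} recursively, truncating at N."""
    p0 = [1.0] + [0.0] * N        # unnormalized p_{i,0}, starting from p_{0,0}=1
    p1 = [0.0] * (N + 1)
    for i in range(N):
        p1[i] = p0[i] * (lam + i * theta) / mu
        p0[i + 1] = lam * p1[i] / ((i + 1) * theta)
    p1[N] = p0[N] * (lam + N * theta) / mu
    total = sum(p0) + sum(p1)
    # number in system = backlogged (i) plus one if transmitting
    L = (sum(i * (p0[i] + p1[i]) for i in range(N + 1)) + sum(p1)) / total
    return p0[0] / total, sum(p1) / total, L

lam, mu, theta = 2.0, 5.0, 1.0
p00, util, L, W = retrial_closed_form(lam, mu, theta)
n00, nutil, nL = retrial_numeric(lam, mu, theta)
print(round(L, 6), round(nL, 6))
```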
2.2.3 Bulk Arrival Queues (M[X] /M/1) with a Service System Example
So far we have only considered the case of individual arrivals. However, in
practice it is not uncommon to see bulk arrivals into a system. For example,
arrivals into theme parks are usually in groups, and arrivals as well as service
in restaurants are in groups. We do not consider bulk service in this text; the
reader is referred to other books in the queueing literature on that subject.
We only discuss the single server bulk arrival queue here.
Consider an infinite-sized queue with a single server (with service times
exp(μ)). Arrivals occur according to a Poisson process with average rate λ
per unit time. Each arrival brings a random number X of customers into the
queue. The server processes the customers one by one, taking an independent
exp(μ) time for each. This system is called an M^[X]/M/1 queue. Let a_i be
the probability that an arriving batch is of size i, that is, a_i = P{X = i} for i > 0.
The generating function of the probability mass function (PMF) of X is φ(z),
which is given by
φ(z) = E[z^X] = Σ_{i=1}^{∞} P{X = i} z^i = Σ_{i=1}^{∞} a_i z^i.
Note that φ(z) is either given or can be computed since a_i is known. In addition,
we can compute E[X] = φ′(1) (i.e., the derivative of φ(z) with respect to
z at z = 1) and E[X(X − 1)] = φ′′(1).
Problem 14
Consider a single server fast-food restaurant where customers arrive in
groups according to a Poisson process with rate λ per unit time on aver-
age. The size of each group is independent and identically distributed with
a probability ai of having a batch of size i (with generating function φ(z)
described earlier). Customers are served one by one, even though they may
have arrived in batches, and it takes the server an exp(μ) time to serve each
customer. Model the system using a CTMC and write down the balance
equations. Define Ψ(z) as the generating function

Ψ(z) = Σ_{j=0}^{∞} p_j z^j,
where p_j is the steady-state probability of having j customers in the system.

Solution
The node balance equations are

p_0 λ = μp_1
p_1(λ + μ) = μp_2 + λa_1 p_0
p_2(λ + μ) = μp_3 + λa_1 p_1 + λa_2 p_0
⋮
and multiply the first equation by z⁰, the second by z¹, the third by z², the
fourth by z³, and so on. Then, upon adding we get

λΨ(z) + μΨ(z) − μp_0 = (μ/z)Ψ(z) − (μ/z)p_0 + λa_1 zΨ(z) + λa_2 z²Ψ(z) + λa_3 z³Ψ(z) + ⋯.
FIGURE 2.4
Rate diagram for the M^[X]/M/1 queue (from state n, a batch arrival of size i moves the system to state n + i at rate λa_i; services occur one at a time at rate μ).
Noting that λa_1 z + λa_2 z² + λa_3 z³ + ⋯ = λφ(z), and solving for the generating
function, we get

Ψ(z) = μp_0 (1 − z) / [μ(1 − z) + λz(φ(z) − 1)].

Defining

A(z) = (λzφ(z) − λz) / (1 − z),

this can be written as

Ψ(z) = μp_0 / (μ + A(z)).
To obtain p_0, we use the normalization condition Ψ(1) = 1 and evaluate

A(1) = lim_{z→1} (λzφ(z) − λz) / (1 − z) = −λφ′(1),

where the last equality uses L'Hôpital's rule since A(z) also results in a 0/0
form by substituting z = 1. However, we showed earlier that
φ′(1) = E[X] and hence A(1) = −λE[X]. Thus, Ψ(1) = 1 implies that

p_0 = 1 − λE[X]/μ
provided λE[X] < μ. The condition λE[X] < μ is necessary for stability, and
it is intuitive since λE[X] is the effective average arrival rate of customers.
Thus, we have
Ψ(z) = μ(1 − λE[X]/μ)(1 − z) / [μ(1 − z) + λz(φ(z) − 1)].
The mean number of customers in the system can then be obtained as

L = Ψ′(1) = lim_{z→1} dΨ(z)/dz.

Writing D(z) = μ(1 − z) + λz(φ(z) − 1), so that Ψ(z) = μp_0 (1 − z)/D(z), the
quotient rule gives

dΨ(z)/dz = μp_0 [−D(z) − (1 − z)D′(z)] / D(z)²,   with   D′(z) = −μ + λ(φ(z) − 1) + λzφ′(z).

Expanding the numerator, −D(z) − (1 − z)D′(z) = λ − λφ(z) − λz(1 − z)φ′(z), so that

L = μp_0 lim_{z→1} [λ − λφ(z) − λz(1 − z)φ′(z)] / {μ(1 − z) + λz(φ(z) − 1)}².

However, taking the limit results in a 0/0 form, so we use L'Hôpital's rule
and continue as follows:

L = μp_0 lim_{z→1} [2λ(z − 1)φ′(z) + λz(z − 1)φ′′(z)] / (2{μ(1 − z) + λz(φ(z) − 1)}{−μ + λφ(z) − λ + λzφ′(z)})
  = μp_0 [2λφ′(1) + λφ′′(1)] / (2{−μ + λE[X]}{−μ + λE[X]}).

The last equality uses the result derived earlier, namely
lim_{z→1} (φ(z) − 1)/(z − 1) = φ′(1) = E[X], so that the numerator and the factor
μ(1 − z) + λz(φ(z) − 1) in the denominator both vanish linearly in (z − 1), with
μ(1 − z) + λz(φ(z) − 1) behaving like (z − 1)(λE[X] − μ) near z = 1. Also, since
E[X] = φ′(1), E[X(X − 1)] = E[X²] − E[X] = φ′′(1), and μp_0 = μ − λE[X], we can
rewrite L as

L = (μ − λE[X]) [2λE[X] + λE[X²] − λE[X]] / (2{μ − λE[X]}{μ − λE[X]})
  = (λE[X] + λE[X²]) / (2{μ − λE[X]}).
Using Little's law with the effective entering rate λE[X], we get

W = (λE[X] + λE[X²]) / (2λE[X]{μ − λE[X]}).
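A numerical sketch of the M^[X]/M/1 results, assuming an illustrative batch-size distribution (a_1 = 0.5, a_2 = 0.3, a_3 = 0.2; all names are mine): the closed-form L is compared with the value obtained by solving the balance equations of the rate diagram directly.

```python
lam, mu = 1.0, 3.0
a = {1: 0.5, 2: 0.3, 3: 0.2}     # illustrative batch-size PMF a_i = P{X = i}
EX = sum(i * ai for i, ai in a.items())          # phi'(1)
EX2 = sum(i * i * ai for i, ai in a.items())
assert lam * EX < mu              # stability condition

# Closed-form mean number in system and mean sojourn time
L = (lam * EX + lam * EX2) / (2 * (mu - lam * EX))
W = L / (lam * EX)                # Little's law with entering rate lam*E[X]

# Direct check: p_{n+1} = ((lam+mu) p_n - lam * sum_i a_i p_{n-i}) / mu
N = 300
p = [1.0] + [0.0] * N             # unnormalized, p_0 = 1
p[1] = lam * p[0] / mu
for n in range(1, N):
    inflow = sum(lam * a.get(i, 0.0) * p[n - i] for i in range(1, n + 1))
    p[n + 1] = ((lam + mu) * p[n] - inflow) / mu
total = sum(p)
L_num = sum(n * pn for n, pn in enumerate(p)) / total
print(round(L, 6), round(L_num, 6))
```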
Problem 15
As an abstract model, we have a single server queue where customers arrive
according to a Poisson process with mean arrival rate λ. Service times are
exponentially distributed with mean 1/μ. There is infinite room for requests
to wait. The server stays “on” for a random time distributed exponentially
with mean 1/α after which a catastrophic breakdown occurs. When the
server turns off (i.e., breaks down), all customers in the system are ejected.
Note that the server can break down when there are no customers in the sys-
tem. The server stays off for a random time distributed exponentially with
mean 1/β. No requests can enter the system when the server is off (typically, after a time-out, the client browser would display something to the effect of "unable to reach server"). Model the system as a CTMC, obtain steady-state probabilities, and compute performance measures such as the average number in the system, the fraction of requests lost, and the sojourn time.

Exponential Interarrival and Service Times: Closed-Form Expressions 75
Solution
Note that the system behaves as an M/M/1 queue when the server is on, and
the system is empty when the server is off. The server toggles between on
and off states irrespective of the queueing process. We model the system as a
CTMC. Let X(t) = i (for i = 0, 1, 2, 3, . . .) imply that there are i requests in the
system and the server is on at time t. In addition, let X(t) = D denote that the
server is down (and there are no customers) at time t. Clearly, {X(t), t ≥ 0} is
a CTMC with rate diagram shown in Figure 2.5. The CTMC is ergodic, and
for j = D, 0, 1, 2, . . ., let pj denote the steady-state probability of being in state j. Balancing the probability flow between state D and the remaining states gives

α(p0 + p1 + · · · ) = βpD.
[Figure omitted: rate diagram with states D, 0, 1, 2, . . .; arrivals at rate λ, services at rate μ, breakdown transitions at rate α from every state to D, and a repair transition at rate β from D to 0.]
FIGURE 2.5
Rate diagram of the CTMC. (From Gautam, N., J. Revenue Pricing Manag., 4(1), 7, 2005.)
Multiplying the balance equations by z^j and summing over j ≥ 0 yields

βpD + μ(ψ(z) − p0)/z + λzψ(z) = (λ + α + μ)ψ(z) − μp0,

which, upon solving for ψ(z), gives

ψ(z) = (μp0 − zβpD − μzp0) / (μ + λz² − λz − αz − μz).   (2.8)
Since we already know pD = α/(α + β), the only unknown in Equation 2.8
is p0 . However, standard techniques such as ψ(0) = p0 and ψ(1) = β/(α + β)
do not yield a solution for p0 . Hence, we need to do something different to
determine p0 and thereby ψ(z).
Note that since ψ(z) is p0 + p1 z + p2 z2 + p3 z3 + p4 z4 + · · · , it is a continuous,
differentiable, bounded, and increasing function over z ∈ [0, 1]. However,
from Equation 2.8, ψ(z) is of the form A(z)/B(z), where A(z) and B(z)
are polynomials corresponding to the numerator and denominator of the
equation. If there exists a z∗ ∈ [0, 1] such that B(z∗ ) = 0, then A(z∗ ) = 0 (other-
wise it violates the condition that ψ(z) is a bounded and increasing function
over z ∈ [0, 1]). We now use the previous realization to derive a closed-form
algebraic expression for ψ(z).
By setting the denominator of ψ(z) in Equation 2.8 to zero, we get

z* = [(λ + μ + α) − √((λ + μ + α)² − 4λμ)] / (2λ)

as the unique solution such that z* ∈ [0, 1]. Setting the numerator of ψ(z) in Equation 2.8 to zero at z = z*, we get

p0 = αβz* / [(α + β)μ(1 − z*)].

Substituting for z*, this yields

p0 = [αβ / (μ(α + β))] · [λ + μ + α − √((λ + μ + α)² − 4λμ)] / [λ − μ − α + √((λ + μ + α)² − 4λμ)].   (2.9)
Thus, substituting pD = α/(α + β),

ψ(z) = [μp0(1 − z) − zαβ/(α + β)] / [λz² − (λ + μ + α)z + μ].   (2.10)
Also, the average number in the system is L = ψ′(1), which simplifies to

L = (1/α) · [λβ − μβ + p0μ(α + β)] / (α + β).
The probability that a request is eventually served, given that it arrived when the server was up, is given by (conditioning on the number of requests seen upon arrival, each of the j + 1 required service completions must occur before a breakdown, which happens with probability μ/(μ + α) each time)

Σ_{j=0}^{∞} [pj/(1 − pD)] (μ/(μ + α))^{j+1} = [μ/(μ + α)] · [1/(1 − pD)] · ψ(μ/(μ + α))

= [μ/(1 − pD)] · [β − p0(α + β)] / [λ(α + β)].
Therefore, the rate at which requests exit the queue after completing service is μβ/(α + β) − μp0, which makes sense: whenever there are one or more requests in the system, the exit rate is μ. In addition, since the drop rate (derived earlier) is αL, we can write μβ/(α + β) − μp0 = λ(1 − pD) − αL, which again makes sense since the total arrival rate when the web server is on is λ(1 − pD).
We also have a fraction pD of requests that are rejected because they arrive when the server is down. Therefore, the loss probability is (λpD + αL)/λ, and by substituting for pD we obtain P in terms of L as

P = [αL(α + β) + λα] / [λ(α + β)].
Using Little's law, we can derive W in the following manner. The expected number of requests in the system when the server is on is L/(1 − pD). In steady state, only a fraction (λ(1 − pD) − αL)/(λ(1 − pD)) of these requests will receive service. Therefore, the average sojourn time (or response time) at the server as experienced by users that receive a response is given by L/(λ(1 − pD)²), which yields W in terms of L as

W = L(α + β)² / (λβ²).
pi qij = pj qji .

Note that this requires the necessary condition (which can also be shown mathematically) that qij is nonzero if and only if qji is nonzero, since pi and pj are nonzero. It is not necessary for i to be a scalar; it just represents a possible value that X(t) can take.
It is worthwhile to note that the CTMC corresponding to the M/M/s/K queue for any s and K (as long as the queue is stable) is reversible. In essence, the condition pi qij = pj qji is identical either to the balance equation corresponding to arc cuts between successive nodes or to the case qij = qji = 0. For the same reason, any one-dimensional birth and death process that is ergodic would be reversible. However, it is a little tricky to check whether other CTMCs (that satisfy the necessary condition) are reversible. To address this shortcoming, we next explain
Remark 6
in the CTMC {Y(t), −∞ < t < ∞}. We next describe a result characterizing the
truncated CTMC {Y(t), −∞ < t < ∞}.
Remark 7
This remark essentially says that {Y(t), −∞ < t < ∞} described ear-
lier is reversible. Next is to obtain the steady-state distribution of
{Y(t), −∞ < t < ∞}. For the reversible process {X(t), −∞ < t < ∞}, we have
pi qij = pj qji for all i ∈ A and j ∈ A. Also, since qij for {X(t), −∞ < t < ∞}
and {Y(t), −∞ < t < ∞} are identical, the steady-state probability that the
CTMC {Y(t), −∞ < t < ∞} is in state j is proportional to pj . However, using
the normalizing condition that all the steady-state probabilities must add to
1, we have the steady-state probability that the CTMC {Y(t), −∞ < t < ∞} is
in state j as
pj / Σ_{k∈A} pk
for all j ∈ A.
As an illustration of this, consider the M/M/s queue. Let λ < sμ, where λ and μ are, respectively, the arrival and service rates. If X(t) is the number of customers in the system at time t, then {X(t), −∞ < t < ∞} is a reversible CTMC on state space S = {0, 1, 2, 3, . . .} with steady-state probabilities pj^{M/M/s} given in Section 2.1. Now the truncated process {Y(t), −∞ < t < ∞}, where Y(t) is the number of customers in the system in an M/M/s/K queue, is also reversible and defined on state space A = {0, 1, . . . , K}. Verify from the results in Section 2.1 that the steady-state probabilities pj^{M/M/s/K} satisfy

pj^{M/M/s/K} = pj^{M/M/s} / Σ_{i=0}^{K} pi^{M/M/s}

for j = 0, 1, . . . , K.
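This truncation identity is easy to verify numerically. The sketch below (function names are ours) computes the M/M/s/K distribution two ways: by truncating and renormalizing the unnormalized M/M/s terms, and by solving the finite birth–death CTMC directly.

```python
import numpy as np
from math import factorial

def mmsk_by_truncation(lam, mu, s, K):
    """M/M/s/K probabilities obtained by truncating the (unnormalized)
    M/M/s product-form terms at K and renormalizing."""
    a = [(lam / mu) ** j / factorial(j) for j in range(min(s, K) + 1)]
    while len(a) <= K:
        a.append(a[-1] * lam / (s * mu))   # geometric tail beyond j = s
    a = np.array(a)
    return a / a.sum()

def mmsk_by_ctmc(lam, mu, s, K):
    """The same distribution from a direct linear solve of the
    birth-death CTMC on {0, ..., K}."""
    n = K + 1
    Q = np.zeros((n, n))
    for j in range(K):
        Q[j, j + 1] = lam                  # birth (arrival)
        Q[j + 1, j] = min(j + 1, s) * mu   # death (service completion)
    np.fill_diagonal(Q, -Q.sum(axis=1))
    A = np.vstack([Q.T, np.ones(n)])
    b = np.zeros(n + 1); b[-1] = 1.0
    return np.linalg.lstsq(A, b, rcond=None)[0]

p_trunc = mmsk_by_truncation(2.0, 1.0, s=3, K=6)
p_solve = mmsk_by_ctmc(2.0, 1.0, s=3, K=6)
```

Both vectors agree to numerical precision, illustrating the truncation result.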
according to exp(μi ). During the entire duration of the connection, each class
i request uses bi kbps of bandwidth. Note that these applications are usu-
ally real-time multimedia traffic and we assume no buffering takes place. In
the traditional telephone network, we have N = 1 class, and each call uses 60
kbps with C/b1 being the number of lines a telephone switch could handle.
This problem is just a multiclass version of that.
Let Xi (t) be the number of ongoing class i connections at time t across
the bottleneck link under consideration. Clearly, there is a constraint at all
times t:

b1X1(t) + b2X2(t) + · · · + bNXN(t) ≤ C.
To apply reversibility, first consider the unconstrained processes: each {Xi^∞(t), −∞ < t < ∞} is a CTMC that is independent of the other CTMCs {Xj^∞(t), −∞ < t < ∞} for j ≠ i. In addition, {Xi^∞(t), −∞ < t < ∞} is the queue length process of an M/M/∞ queue with arrival rate λi and service rate μi for each server. The steady-state probabilities for this queue are, for i = 1, 2, . . . , N and j = 0, 1, . . .,

pij(∞) = (λi/μi)^j (1/j!) e^{−λi/μi}.
By independence, the joint steady-state distribution is

P{X1^∞(t) = x1, X2^∞(t) = x2, . . . , XN^∞(t) = xN}
= p1x1(∞) p2x2(∞) · · · pNxN(∞)
= e^{−Σ_{i=1}^{N} λi/μi} Π_{i=1}^{N} (λi/μi)^{xi} (1/xi!).
Since the joint process {(X1^∞(t), X2^∞(t), . . . , XN^∞(t)), −∞ < t < ∞} is a reversible process, its truncated process {(X1(t), X2(t), . . . , XN(t)), −∞ < t < ∞} is also reversible, with the steady-state probability of having x1 class-1 connections, x2 class-2 connections, . . ., xN class-N connections satisfying

p_{x1,x2,...,xN} ∝ e^{−Σ_{i=1}^{N} λi/μi} Π_{i=1}^{N} (λi/μi)^{xi} (1/xi!).

Absorbing the constant factor into a normalizing constant R,

p_{x1,x2,...,xN} = R Π_{i=1}^{N} (λi/μi)^{xi} (1/xi!)   (2.11)

where R is such that

Σ_{x1,x2,...,xN : b1x1 + b2x2 + ··· + bNxN ≤ C} p_{x1,x2,...,xN} = 1.
In other words,

R = [ Σ_{x1,x2,...,xN : b1x1 + b2x2 + ··· + bNxN ≤ C} Π_{i=1}^{N} (λi/μi)^{xi} (1/xi!) ]^{−1}.   (2.12)
Problem 16
Consider a channel with capacity 700 kbps on which two classes of
bandwidth-sensitive traffic can be transmitted. Class-1 uses 200 kbps band-
width and class-2 uses 300 kbps bandwidth. Let λ1 and λ2 be the parameters
of the Poisson processes corresponding to the arrivals of the two classes. Also,
let each admitted request spend exp(μi ) time holding onto the bandwidth
they require for i = 1, 2. Let X1 (t) and X2 (t) be the number of ongoing class-1
and class-2 requests at time t. Model the CTMC {(X1 (t), X2 (t)), t ≥ 0} and
obtain its steady-state probabilities.
Solution
Note that this is a special case when N = 2, C = 700, b1 = 200, and b2 = 300.
The CTMC {(X1 (t), X2 (t)), t ≥ 0} can be modeled as the rate diagram in
Figure 2.6. Since we have the constraint 200X1 (t) + 300X2 (t) ≤ 700, the
[Figure omitted: rate diagram on the states (x1, x2) with 200x1 + 300x2 ≤ 700; class-1 transitions at rates λ1 (arrivals) and x1μ1 (departures), class-2 transitions at rates λ2 and x2μ2.]
FIGURE 2.6
Arc cut example.
state space is {(0,0), (0,1), (0,2), (1,0), (1,1), (2,0), (2,1), (3,0)}. In order to fully
appreciate the power of using reversibility results, the reader is encouraged
to solve for p0,0 , p0,1 , p0,2 , p1,0 , p1,1 , p2,0 , p2,1 , and p3,0 using the node balance
equations.
Now, using Equation 2.12, we can obtain the normalizing constant as

R = 1 / [ 1 + λ1/μ1 + λ1²/(2μ1²) + λ2/μ2 + λ1³/(6μ1³) + λ2²/(2μ2²) + λ1λ2/(μ1μ2) + λ1²λ2/(2μ1²μ2) ].

Then, from Equation 2.11,

p0,0 = R,   p0,1 = (λ2/μ2)R,   p0,2 = (λ2²/(2μ2²))R,   p1,0 = (λ1/μ1)R,
p1,1 = (λ1λ2/(μ1μ2))R,   p2,0 = (λ1²/(2μ1²))R,   p2,1 = (λ1²λ2/(2μ1²μ2))R,   p3,0 = (λ1³/(6μ1³))R.
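Equations 2.11 and 2.12 are straightforward to evaluate in code. The following sketch (the function name and the sample rates are ours, not from the text) enumerates the feasible states of Problem 16 and returns the normalized probabilities.

```python
from math import factorial

def admission_probs(lam1, lam2, mu1, mu2, b1=200, b2=300, C=700):
    """Steady-state probabilities from Equations 2.11 and 2.12 for the
    two-class bandwidth-sharing link of Problem 16."""
    rho1, rho2 = lam1 / mu1, lam2 / mu2
    states = [(x1, x2)
              for x1 in range(C // b1 + 1)
              for x2 in range(C // b2 + 1)
              if b1 * x1 + b2 * x2 <= C]
    weight = {(x1, x2): rho1 ** x1 / factorial(x1) * rho2 ** x2 / factorial(x2)
              for (x1, x2) in states}
    R = 1.0 / sum(weight.values())  # Equation 2.12
    return {s: R * w for s, w in weight.items()}

# Illustrative rates (our choice): lam1 = lam2 = mu1 = mu2 = 1, so that the
# unnormalized weights sum to 17/3 and R = 3/17.
p = admission_probs(lam1=1.0, lam2=1.0, mu1=1.0, mu2=1.0)
```

The eight states produced match the state space listed in the solution, and the probabilities sum to one by construction.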
Reference Notes
The underlying theme of this chapter, as the title suggests, is queues that
can be modeled and analyzed using CTMCs. The sources for the three main
thrusts, namely, arc cuts for birth and death processes, generating functions,
and reversibility, have been different. In particular, the M/M/s/K queue
and special cases are mainly from Gross and Harris [49]. The section on
using generating functions is largely influenced by Kulkarni [67], Medhi [80],
and Prabhu [89]. The reversibility material is presented predominantly from
Kelly [59]. The exercise problems are essentially a compilation of homework
and exam questions in courses taught by the author over the last several
years. However, a large number of the exercise problems were indeed
adapted from some of the books described earlier.
Exercises
2.1 Compute the variance of the number of customers in the system in
steady state for an M/M/s/K queue.
2.2 Consider an M/M/s/K queue with λ = 10 and μ = 1.25. Write a com-
puter program to plot pK and W for various values of K from s to
s + 19. Consider two cases (i) s = 10 and (ii) s = 5.
L(2) = λ2 W2 .
2.4 Let U be a random variable that denotes the time between successive departures (in the long run) from the system in an M/M/1 queueing system. Assume that λ < μ. Show, by conditioning on whether or not a departure leaves the system empty, that U is an exponentially distributed random variable with mean 1/λ.
2.5 Feedback queue. In the M/M/1 system suppose that with probability q,
a customer who completes their service rejoins the queue for further
service. What is the stability condition for this queue? Assuming
conditions for stability hold, derive expressions for L and W.
2.6 Static control. Consider an M/M/1 queue where the objective is to
pick a service rate μ in an optimal fashion. There are two types of
costs associated: (i) a service-cost rate, c (cost per unit time per unit
of service rate) and (ii) a holding-cost rate h (cost per unit time per
customer in the system). In other words, (i) if we choose a service
rate μ, then we pay a service cost c μ per unit time; (ii) the system
incurs h i monetary units of holding cost per unit time while i cus-
tomers are present. Let C(μ) be the expected steady-state cost per
unit time, when service rate μ is chosen, that is,
the average waiting time W, decide which system you would go with: the single-queue or the two-queue system?
2.8 Consider a single-server queue with two classes. Class i customers
arrive according to PP(λi ) for i = 1,2. For both classes, the service
times are according to exp(μ). If the total number of customers (of
both classes) in the system is greater than or equal to K, class-1 cus-
tomers do not join the system, whereas class-2 customers always
join the system. Model this system as a CTMC. When is this sys-
tem stable? Under stability, what is the steady-state distribution
of the number of customers in the system? Compute the expected
sojourn time for each type of customer in steady state, assuming
they exist. Note that for type 1 customers, you are only required
to obtain the mean waiting time for those customers that join the
system.
2.9 There is a single line to order drinks at a local coffee shop. When
the number of customers in the line is three or less, only one person does the checkout as well as making beverages. This takes exp(μ1)
time. When there are more than three persons in the line, the store
manager comes in to help. In this case, the service rate increases
to μ2 > μ1 (i.e., the reduced service times now become exp(μ2 )).
Assume that the arrival process is PP(λ). Model this queue as a
CTMC.
2.10 Consider a standard M/M/1 queue with arrival rate λ and service
rate μ. The server toggles between being busy and idle. Let B and
I denote random times the server is busy and idle, respectively,
in a cycle. Obtain an expression for the ratio E(B)/E(I). Using that
relation, show that
E(B) = 1 / (μ − λ).
P′(z) = [λ / (μ(1 − qz))] P(z),

where P(z) = Σ_{i=0}^{∞} pi z^i. Then, show that the following is a solution to this differential equation:

P(z) = [(1 − q)/(1 − qz)]^{λ/(μq)}.
ψ1(z) = p00 αλ(λ(1 − z) + μ2) / [μ1μ2/z − λμ1(1 − α/z) − λμ2(1 − β/z) − λ2(1 − z)].
λp0 = μp1
λp1 = 2μp2
λp2 = 2μp3
λp3 = 2μp4
λp4 = 2μp5
⋮

Define the generating function

Φ(z) = p0 + p1z + p2z² + · · · .

Show that

Φ(z) = (2μp0 + μzp1) / (2μ − λz).
Solve for the unknowns p0 and p1 (for this do not use the results from
M/M/s queue but feel free to verify).
2.18 Solve the retrial queue steady-state equations in Section 2.2.2 and
compute p00 using the arc cut method.
2.19 Consider a post office with two queues: queue 1 for customers with-
out any financial transactions (such as waiting to pick up mail)
and queue 2 for customers requiring financial transactions (such as
mailing a parcel). For i = 1,2, queue i gets arrivals according to a
Poisson process with parameter λi , service time for each customer is
according to exp(μi ), and has i servers. Due to safety reasons, a max-
imum of four customers are allowed inside the post office at a time.
Model the system as a reversible CTMC and derive the steady-state
probabilities.
2.20 Consider a queueing system with two parallel queues and two
servers, one for each queue. Customers arrive to the system accord-
ing to PP(λ) and each arriving customer joins the queue with the
fewer number of customers. If both queues have the same number of
customers, then the arriving customer picks either with equal proba-
bility. The service times are exponentially distributed with mean 1/μ
at either server. When a service is completed at one queue and the
other queue has two more customers than this queue, then the cus-
tomer at the end of the line instantaneously switches to the shorter
queue to balance the queues. Let X1 (t) and X2 (t) be the number of
customers in queues 1 and 2, respectively, at time t. Assuming that
X1 (0) = X2 (0) = 0, we have |X1 (t) − X2 (t)| ≤ 1 for all t. Model the
bivariate stochastic process {(X1 (t), X2 (t)), t ≥ 0} as a CTMC by writ-
ing down the state space and drawing the rate diagram. Assuming
stability, let
(i.e., when there are two or more customers when a shuttle arrives).
Model the number of customers in the queueing system at time t as a
CTMC and write down the balance equations. Obtain the generating
function of the number of customers in the system in steady state for
the special case λ = μ. Compute L and W for this queueing system.
2.22 Consider an M/M/1 queue where customers in queue (but not the
one in service) may get discouraged and leave without receiving ser-
vice. Each customer who joins the queue will leave after an exp(γ)
time, if the customer does not enter service by that time. Assume
FCFS.
(a) What fraction of arrivals is served? Hence, what are the average departure rates both after service and without service?
(b) Suppose an arrival finds one customer in the system. What is the
probability that this customer is served?
(c) On an average, how long do customers that get served wait in
the queue before beginning service?
2.23 For an M[X] /M/2 queue with batch arrival rate λ, constant batch size 4, exp(μ) service time, and traffic intensity ρ = 2λ/μ < 1, show that the generating function P(z) = Σ_{n=0}^{∞} pn z^n for the distribution of the number of customers in the system is
2.24 Justify using a brief reasoning whether each of the following is TRUE
or FALSE.
(a) Consider two M/M/1 queues: one has arrival rate λ and service
rate μ, while the other has arrival and service rates as 2λ and 2μ,
respectively. Is the following statement TRUE or FALSE? On an
average, both queues have the same number of customers in the
system in steady state.
(b) Consider two stable queues: one is an M/M/1 queue with arrival
rate λ and service rate μ, while the other is an M/M/2 queue
with arrival rate λ and service rate μ for EACH server. Is the
following statement TRUE or FALSE? On an average, twice as
many entities depart from the M/M/2 queue as compared to the
M/M/1 queue in steady state.
(c) Consider an M/M/1 queue with reneging. The arrival rate is λ,
the service rate is μ, but the reneging rate is also equal to μ (i.e.,
θ = μ). Note that the birth and death process is identical to that of
an M/M/∞ queue. Is the following statement TRUE or FALSE?
For this M/M/1 queue with reneging, we have Lq = 0.
(d) Consider two stable queues: one is an M[X] /M/1 queue with
batch arrival rate λ, constant batch size N (i.e., P(X = N) = 1), and
service rate μ, while the other is an M/M/1 queue with arrival
rate Nλ and service rate μ. Note that both queues have the same
effective entity-arrival rate. Is the following statement TRUE or
FALSE? On an average, entities spend more time in the system in
the M[X] /M/1 queue as compared to the M/M/1 queue in steady
state.
(e) Consider a stable M/M/1 queue that uses processor sharing dis-
cipline (see Section 4.5.2). Arrivals are according to PP(λ), and it
would take exp(μ) time to process an entity if it were the only
one in the system. Is the following statement TRUE or FALSE?
The average workload in the system at an arbitrary point in
steady state is λ/(μ(μ − λ)).
3
Exponential Interarrival and Service Times:
Numerical Techniques and Approximations
for all i ∈ S and j ∈ S. Essentially with every transition, the stochastic process
jumps to a state one value higher (birth) or one value lower (death). These
are also called skip-free CTMCs because it is not possible to go from state i
to state j without going through every state in between (i.e., no skipping of
states is allowed). The rates λ0 , λ1 , . . . are known as birth rates and the rates
μ1 , μ2 , . . . are known as death rates. For the M/M/1 queue, all birth rates are
equal to λ and all death rates are equal to μ. The steady-state distribution of
the one-dimensional birth and death process (or chain) is easy to compute
using arc cuts.
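The arc-cut computation amounts to a running product of birth-to-death rate ratios. A minimal sketch (function name is ours; an upper limit N is used for a finite numerical illustration):

```python
def birth_death_stationary(birth, death, N):
    """Stationary distribution of a birth-death chain on {0, ..., N} using
    the arc-cut balance p_j * birth(j) = p_{j+1} * death(j+1)."""
    p = [1.0]
    for j in range(N):
        p.append(p[-1] * birth(j) / death(j + 1))
    total = sum(p)
    return [x / total for x in p]

# M/M/1-type rates: birth rate lam in every state, death rate mu
lam, mu, N = 1.0, 2.0, 20
p = birth_death_stationary(lambda j: lam, lambda j: mu, N)
```

State-dependent rates (e.g., min(j, s)·μ for M/M/s) are handled by simply passing a different `death` function.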
However, in the multidimensional case, it is not as easy unless one
can use the reversibility argument discussed toward the end of the previous chapter. In this chapter (particularly in this section), we will show
numerical techniques to obtain the steady-state distribution of the multi-
dimensional birth and death chains that are not reversible. For that, we
first define a multidimensional birth and death chain. An n-dimensional
CTMC {(X1(t), X2(t), . . . , Xn(t)), t ≥ 0} is a multidimensional birth and death chain if, with every transition, the CTMC jumps to a state one value higher
or lower in exactly one of the dimensions. In other words, if X(t) is an n-
dimensional row vector [X1 (t) X2 (t) . . . Xn (t)] and ei is an n-dimensional unit
(row) vector (i.e., one in the ith dimension and zero everywhere else), then
the next state the CTMC {X(t), t ≥ 0} goes to from X(t) is either X(t) + ei
or X(t) − ei for some i ∈ [1, 2, . . . , n]. It is worthwhile to point out that
the discrete time version of this is called a multidimensional random walk,
although sometimes the terms “random walk” and “birth–death” are used
interchangeably.
In the remainder of this section, we first motivate the need to study mul-
tidimensional birth–death chains using an example in optimal control. Then,
we provide an efficient algorithm to obtain the steady-state probabilities.
Finally, we provide an example based on energy conservation in data centers
where this approach comes in handy.
Problem 17
Consider two single server queues that work in parallel. Both queues have
finite waiting rooms of size Bi , and the service times are exponentially dis-
tributed with mean 1/μi in queue i (for i = 1, 2). Arrivals occur into this
two-queue system according to a Poisson process with mean rate λ. At every
arrival, a scheduler observes the number in the system in each queue and
decides to take one of three control actions: reject the arrival, send the arrival
to queue 1, or send the arrival to queue 2. Assume that the control actions
happen instantaneously and customers cannot jump from one queue to the
other or leave the system before their service is completed. The system earns
a reward r dollars for every accepted customer and incurs a holding cost hi
dollars per unit time per customer held in queue i (for i = 1, 2). Assume that
the reward and holding cost values are such that the scheduler rejects an
arrival only if both queues are full. Describe the structure of the scheduler’s
optimal policy.
Solution
The system is depicted in Figure 3.1. For i = 1, 2, let Xi (t) be the number of
customers in the system in queue i at time t (including any customers at the
servers). If an arrival occurs at time t, the scheduler looks at X1 (t) and X2 (t)
to decide whether the arrival should be rejected or sent to queue 1 or queue 2.
Note that because of the assumption that the scheduler rejects an arrival only
if both queues are full, the scheduler’s action in terms of whether to accept or
reject a customer is already made. Also, if only one of the queues is full, then
the assumption requires sending the arrival to the nonfull queue. Therefore,
the problem is simplified so that the control action is essentially which queue
to send an arrival to when there is space in both (we also call this routing
policy, i.e., decision to send to queue 1 or 2 depending on the number in
each queue).
Intuitively, the optimal policy when there is space in both queues is to
send an arriving request to queue i if it is “shorter” than queue 3 − i for
i = 1, 2. If μ1 = μ2, B1 = B2, and h1 = h2, then it can be shown that routing to the shorter queue is optimal.

[Figure omitted: arrivals PP(λ) reach a scheduler that either rejects them or routes them to one of two single-server queues with exp(μ1) and exp(μ2) service.]
FIGURE 3.1
Schematic for scheduler's options at arrivals.
[Figure omitted: the (X1(t), X2(t)) state space, with axes up to B1 and B2, split by a switching curve into Region 1 (send arrival to queue 1) and Region 2 (send arrival to queue 2).]
FIGURE 3.2
Structure of the optimal policy given arrival at time t.
To show the previous set of results, we need to first formulate the prob-
lem as a semi-Markov decision process (SMDP) and then investigate the
optimal policy in various states. The reader is encouraged to read any stan-
dard text on stochastic dynamic programming or Markov decision processes
for a thorough understanding of this material. We first define the value func-
tion V(x), which is the maximal expected total discounted net benefit over
an infinite horizon, starting from state x, that is, (x1 , x2 ). Note that although
x is a vector, V(x) is a scalar. We also use the term “discounted” because we
consider a discount factor α and V(x) denotes the expected present value.
It is customary in the SMDP literature to pick appropriate time units so that
α + λ + μ1 + μ2 = 1.

The holding-cost rate when the system is in state x is

h(x) = h1x1 + h2x2.
Let a+ denote max{a, 0} if a is a scalar, and let x+ = (x1+, x2+). Now, we are in a position to write down the optimality or Bellman equation.
The value function V(x) satisfies the following optimality equation: for x1 ∈ [0, B1) and x2 ∈ [0, B2),

V(x) = −h(x) + λ max{r + V(x + e1), r + V(x + e2)} + μ1V((x − e1)+) + μ2V((x − e2)+).   (3.1)
We will not derive this optimality equation (the reader is encouraged to refer
to any standard text on stochastic dynamic programming or Markov deci-
sion processes). However, there is merit in going over the equation itself.
When the system is in state x, a holding cost of h(x) is incurred per unit time
(the negative sign in front of h(x) is because it is a cost as opposed to a bene-
fit). If an arrival occurs (at rate λ), a revenue of r is obtained and depending
on whether the arrival is sent to queue 1 or 2, the new state becomes x + e1
or x + e2 , respectively. From the new state x + ei for i = 1, 2, the net benefit is
V(x + ei ). It is quite natural to select queue 1 or 2 depending on which has
a higher net benefit, hence the maximization. Instead of the arrival, if the
next event is a service completion at queue i (for i = 1, 2), then the new state
becomes (x−ei )+ at rate μi and the value function is V((x−ei )+ ). In summary,
the left-hand side (LHS) V(x) equals the (negative of) holding cost incurred
in state x, plus the net benefit that depends on the next event (arrival routed
to queue 1 or 2, service completion at queue 1, and service completion at
queue 2), which would lead to a new state. The reason it appears as if the
units do not match in Equation 3.1 is that in reality, the entire right-hand
side (RHS) should be multiplied by 1/(α + λ + μ1 + μ2), and in our case, that is equal
to 1. As a matter of fact, if xi = 0 for i = 1 or 2, then the actual equation is
(α+λ+μ3−i )V(x) = −h(x)+λ max{r+V(x+e1 ), r+V(x+e2 )}+μ3−i V((x−e3−i )+ )
since server i cannot complete service as there are no customers. How-
ever, we add μi V(x) (since it is equal to μi V((x − ei )+ )) to both sides to
get V(x)(α + λ + μ1 + μ2 ) = − h(x) + λ max{r + V(x + e1 ), r + V(x + e2 )} +
μ1 V((x − e1 )+ ) + μ2 V((x − e2 )+ ), which is identical to Equation 3.1 since
α+λ+μ1 +μ2 = 1. A similar argument can be made for V(x) when x1 = x2 = 0,
and hence we do not have to be concerned about that either.
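Equation 3.1 can be solved by value iteration. The sketch below is our rendering of the uniformized recursion, not code from the book; the parameter values are illustrative and chosen so that α + λ + μ1 + μ2 = 1. With symmetric parameters, the computed value function is symmetric, consistent with the shorter-queue routing structure.

```python
import numpy as np

def value_iteration(lam, mu1, mu2, r, h1, h2, B1, B2, alpha, iters=2000):
    """Value iteration for the uniformized optimality equation (3.1)."""
    assert abs(alpha + lam + mu1 + mu2 - 1.0) < 1e-9
    V = np.zeros((B1 + 1, B2 + 1))
    for _ in range(iters):
        Vn = np.empty_like(V)
        for x1 in range(B1 + 1):
            for x2 in range(B2 + 1):
                go1 = r + V[x1 + 1, x2] if x1 < B1 else -np.inf
                go2 = r + V[x1, x2 + 1] if x2 < B2 else -np.inf
                if x1 == B1 and x2 == B2:
                    arr = V[x1, x2]      # reject only when both queues full
                else:
                    arr = max(go1, go2)  # route arrival to the better queue
                Vn[x1, x2] = (-(h1 * x1 + h2 * x2) + lam * arr
                              + mu1 * V[max(x1 - 1, 0), x2]
                              + mu2 * V[x1, max(x2 - 1, 0)])
        V = Vn
    return V

V = value_iteration(lam=0.3, mu1=0.25, mu2=0.25, r=5.0,
                    h1=1.0, h2=1.0, B1=4, B2=4, alpha=0.2)
```

Since λ + μ1 + μ2 = 1 − α < 1, the iteration is a contraction and converges to the unique fixed point of Equation 3.1.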
Also, by looking at the states for which Equation 3.1 holds, we still need
the value function at the boundaries, that is, x1 = B1 or x2 = B2 , which we
present next. For x1 = B1 and x2 < B2 where arrivals are routed to queue 2
+ μ2 Vn ((x − e2 )+ ). (3.5)
≥ r + Vn(x + e1 + e2) − r − Vn(x + 2e1 + e2) − r − Vn(x + e1) + r + Vn(x + 2e1)
= Vn(x + e1 + e2) − Vn(x + 2e1 + e2) − Vn(x + e1) + Vn(x + 2e1) ≥ 0.

≥ r + Vn(x + e1 + e2) − r − Vn(x + e1 + 2e2) − r − Vn(x + e1) + r + Vn(x + e1 + e2)
= Vn(x + e1 + e2) − Vn(x + e1 + 2e2) − Vn(x + e1) + Vn(x + e1 + e2) ≥ 0.
≥ r + Vn(x + 2e1 + e2) − r − Vn(x + 3e1) − r − Vn(x + e1 + e2) + r + Vn(x + 2e1)
= g(x + e1 + e2) − r − Vn(x + 2e1 + e2) − r − Vn(x + e1 + e2) + g(x + e1)
≥ r + Vn(x + 2e1 + e2) − r − Vn(x + 2e1 + e2) − r − Vn(x + e1 + e2) + r + Vn(x + e1 + e2)
≥ 0.
explains the structure of the optimal policy but not the optimal policy itself.
For example, if numerical values for the preceding problem are given, where
does the optimal line that separates region 1 from 2 lie? In other words,
can we draw Figure 3.2 precisely for a given set of numerical values? The
answer is yes. For every candidate switching curve, we can model the result-
ing system as a CTMC and evaluate its performance using the steady-state
probabilities. For example, one algorithm would be to start with the switch-
ing curve being the straight line from (0, 0) to (B1 , B2 ) and evaluate the
expected discounted net benefit (via the steady-state probabilities). Then try
all possible neighbors to determine the optimal switching curve that would
maximize the expected discounted net benefit. We explain the algorithm in
detail and provide a numerical example at the end of the next subsection.
However, we first need a method to quickly compute steady-state probabil-
ities of such CTMCs so that we can efficiently search through the space of
switching curves swiftly, which is the objective of the next subsection.
Problem 18
Two classes of requests arrive to a computer system according to a Poisson
process with rate λi per second for class i (for i = 1, 2). The number of bytes
of processing required for class i requests are exponentially distributed with
mean 1/μi MB. Assume that there is a 1 MB/s single processor that simulta-
neously processes all the requests using a full processor-sharing regime. In
other words, if there are two class-1 requests and three class-2 requests cur-
rently running, then each of the five requests get 200 kB/s (assuming there
are 1000 kB in 1 MB, which is not technically correct as there ought to be
1024 kB in 1 MB). However, in practice, the processor cycles through the five
requests processing each for a tiny amount of time called time-quantum, and
this is approximated as a full processor-sharing discipline. Further, there is
a restriction that a maximum of four requests of class 1 and three of class 2
can be simultaneously served at any given time. Model the system as a two-
dimensional birth and death process and obtain the steady-state distribution
of the number of customers of each class in the system.
Solution
Let Xi (t) be the number of class i customers in the system at time t (for
i = 1, 2). Then the stochastic process {(X1 (t), X2 (t)), t ≥ 0} is a CTMC on
state space S = {(0, 0), (0, 1), (0, 2), (0, 3), (1, 0), . . . , (4, 2), (4, 3)} and infinitesi-
mal generator matrix Q such that
Q − Diag(Q) is the 20 × 20 matrix of transition rates: from state (i, j) (with i class-1 and j class-2 requests), the chain moves to (i + 1, j) at rate λ1 (if i < 4), to (i, j + 1) at rate λ2 (if j < 3), to (i − 1, j) at rate iμ1/(i + j) (if i > 0), and to (i, j − 1) at rate jμ2/(i + j) (if j > 0), the departure rates reflecting processor sharing among the i + j requests. Listing the states in lexicographic order, Q has the block-tridiagonal form

Q =
⎡ Q00  Q01  0    0    0   ⎤
⎢ Q10  Q11  Q12  0    0   ⎥
⎢ 0    Q21  Q22  Q23  0   ⎥
⎢ 0    0    Q32  Q33  Q34 ⎥
⎣ 0    0    0    Q43  Q44 ⎦
where
0 is the 4 × 4 matrix of zeros,

Q00 = [ −λ1−λ2   λ2          0           0
        μ2       −λ1−λ2−μ2   λ2          0
        0        μ2          −λ1−λ2−μ2   λ2
        0        0           μ2          −λ1−μ2 ],

Q01 = Q12 = λ1 I (with I the 4 × 4 identity matrix),

Q10 = diag(μ1, μ1/2, μ1/3, μ1/4),

Q11 = [ −λ1−λ2−μ1   λ2                    0                     0
        μ2/2        −λ1−λ2−μ1/2−μ2/2     λ2                    0
        0           2μ2/3                −λ1−λ2−μ1/3−2μ2/3     λ2
        0           0                    3μ2/4                 −λ1−μ1/4−3μ2/4 ],
106 Analysis of Queues
Q21 = diag(μ1, 2μ1/3, 2μ1/4, 2μ1/5),

Q22 = [ −λ1−λ2−μ1   λ2                     0                       0
        μ2/3        −λ1−λ2−2μ1/3−μ2/3     λ2                      0
        0           2μ2/4                 −λ1−λ2−2μ1/4−2μ2/4      λ2
        0           0                     3μ2/5                   −λ1−2μ1/5−3μ2/5 ],

Q23 = Q34 = λ1 I,

Q32 = diag(μ1, 3μ1/4, 3μ1/5, 3μ1/6),

Q33 = [ −λ1−λ2−μ1   λ2                     0                       0
        μ2/4        −λ1−λ2−3μ1/4−μ2/4     λ2                      0
        0           2μ2/5                 −λ1−λ2−3μ1/5−2μ2/5      λ2
        0           0                     3μ2/6                   −λ1−3μ1/6−3μ2/6 ],
Exponential Interarrival and Service Times: Numerical Methods 107
Q43 = diag(μ1, 4μ1/5, 4μ1/6, 4μ1/7),

and

Q44 = [ −λ2−μ1   λ2                  0                     0
        μ2/5     −λ2−4μ1/5−μ2/5     λ2                    0
        0        2μ2/6              −λ2−4μ1/6−2μ2/6       λ2
        0        0                  3μ2/7                 −4μ1/7−3μ2/7 ].
One way to obtain the steady-state probabilities p = [p0,0 p0,1 . . . p4,3] is to solve pQ = [0 0 . . . 0] together with the normalizing condition

∑_{i=0}^{4} ∑_{j=0}^{3} pi,j = 1.
However, that process gets computationally intensive for large state spaces.
Therefore, we describe an alternate procedure to obtain p, which is essen-
tially the Servi algorithm that we would subsequently describe for a general
two-dimensional birth and death process.
Instead of obtaining the 1 × 20 row vector p directly, we write
p = a[R0 R1 R2 R3 R4 ] where a is a 1 × 4 row vector and Ri is a 4 × 4 matrix
(for i = 0, 1, 2, 3, 4). The vector a and matrix Ri (for i = 0, . . . , 4) are unknown
and need to be obtained recursively. Since pQ = [0 . . . 0], using the block-tridiagonal form of Q and equating each block column of pQ to the zero vector, we have

a(R0 Q00 + R1 Q10) = 0,
a(R0 Q01 + R1 Q11 + R2 Q21) = 0,
a(R1 Q12 + R2 Q22 + R3 Q32) = 0,
a(R2 Q23 + R3 Q33 + R4 Q43) = 0,
a(R3 Q34 + R4 Q44) = 0.

The following set of Ri (for i = 0, . . . , 4) and a values would ensure that the previous set of equations is satisfied:

R0 = I,
R1 = −R0 Q00 Q10^{−1},
R_{i+1} = −(R_{i−1} Q_{i−1,i} + R_i Q_{i,i}) Q_{i+1,i}^{−1} for i = 1, 2, 3,

with a chosen as a nonzero solution of a(R3 Q34 + R4 Q44) = 0, normalized so that

∑_{i=0}^{4} ∑_{j=0}^{3} pi,j = 1.
In general, for a two-dimensional birth and death process, Q has a similar block-tridiagonal form in which 0 is a (b2 + 1) × (b2 + 1) matrix of zeros and, for all i, j, the Qi,j are (b2 + 1) × (b2 + 1) matrices. Assuming that the CTMC is irreducible (i.e., it is possible to go from every state to every other state in one or more transitions), our objective is to determine the steady-state probabilities pi,j for 0 ≤ i ≤ b1 and 0 ≤ j ≤ b2, where ∑_{i=0}^{b1} ∑_{j=0}^{b2} pi,j = 1.
Problem 19
Consider a bilingual customer service center where there are two finite-capacity queues: one for English-speaking customers and the other for Spanish-speaking customers. A maximum of three Spanish-speaking customers can
be in the system at any time. Likewise, a maximum of four English-
speaking customers can be in the system simultaneously. Spanish-speaking
and English-speaking customers arrive into their respective queues according to Poisson processes with respective rates 4 and 6 per hour. There is a Spanish-speaking server who takes on average 12 min to serve each of his customers, and there is an English-speaking server who takes on average 7.5 min to serve each of her customers. Assume that none of the customers are bilingual; however, the manager who oversees the two servers can speak English and Spanish. Whenever one of the queues has two or more customers more than the other queue, the manager helps out the server with the longer queue, thereby increasing the
service rate by 2 per hour. Assume that all service times are exponentially
distributed.
Model the bilingual customer service center system using a two-
dimensional birth and death process. Then use the Servi algorithm to obtain
the steady-state probabilities of the number of customers in the system
speaking Spanish and English. What fraction of customers of each type is
rejected without service? Determine the average time spent by each type of
accepted customer in the system.
Solution
Let X1 (t) be the number of Spanish-speaking customers in the system at
time t and X2 (t) be the number of English-speaking customers in the sys-
tem at time t. From the problem description, b1 = 3 and b2 = 4. Clearly,
{(X1 (t), X2 (t)), t ≥ 0} is a finite-state CTMC that can be modeled as a two-
dimensional birth and death process with 0 ≤ X1 (t) ≤ 3 and 0 ≤ X2 (t) ≤ 4
for all t. The state space is S = { (0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (1, 0),
(1, 1), (1, 2), (1, 3), (1, 4), (2, 0), (2, 1), (2, 2), (2, 3), (2, 4), (3, 0), (3, 1), (3, 2),
(3, 3), (3, 4) }. Note that when the Spanish-speaking server is by himself,
the service rate is 5 per hour, and when the English-speaking server is
by herself, the service rate is 8 per hour. However, when the manager
comes to assist, the service rate of the Spanish-speaking server becomes
7 per hour and that of the English-speaking server becomes 10 per hour.
Using that and the arrival rates, the Q matrix (in the order of states in S)
is given as
Q =

[ −10    6    0    0    0    4    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    8  −18    6    0    0    0    4    0    0    0    0    0    0    0    0    0    0    0    0    0
    0   10  −20    6    0    0    0    4    0    0    0    0    0    0    0    0    0    0    0    0
    0    0   10  −20    6    0    0    0    4    0    0    0    0    0    0    0    0    0    0    0
    0    0    0   10  −14    0    0    0    0    4    0    0    0    0    0    0    0    0    0    0
    5    0    0    0    0  −15    6    0    0    0    4    0    0    0    0    0    0    0    0    0
    0    5    0    0    0    8  −23    6    0    0    0    4    0    0    0    0    0    0    0    0
    0    0    5    0    0    0    8  −23    6    0    0    0    4    0    0    0    0    0    0    0
    0    0    0    5    0    0    0   10  −25    6    0    0    0    4    0    0    0    0    0    0
    0    0    0    0    5    0    0    0   10  −19    0    0    0    0    4    0    0    0    0    0
    0    0    0    0    0    7    0    0    0    0  −17    6    0    0    0    4    0    0    0    0
    0    0    0    0    0    0    5    0    0    0    8  −23    6    0    0    0    4    0    0    0
    0    0    0    0    0    0    0    5    0    0    0    8  −23    6    0    0    0    4    0    0
    0    0    0    0    0    0    0    0    5    0    0    0    8  −23    6    0    0    0    4    0
    0    0    0    0    0    0    0    0    0    5    0    0    0   10  −19    0    0    0    0    4
    0    0    0    0    0    0    0    0    0    0    7    0    0    0    0  −13    6    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    7    0    0    0    8  −21    6    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    5    0    0    0    8  −19    6    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    5    0    0    0    8  −19    6
    0    0    0    0    0    0    0    0    0    0    0    0    0    0    5    0    0    0    8  −13 ].
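As a check, this generator can be built programmatically from the transition rules (arrival rates 4 and 6; base service rates 5 and 8, raised to 7 and 10 when the manager assists the queue that is longer by two or more). A minimal numpy sketch, with helper names that are my own:

```python
import numpy as np

l1, l2 = 4.0, 6.0        # Spanish and English arrival rates (per hour)
b1, b2 = 3, 4            # capacities
idx = lambda i, j: (b2 + 1) * i + j   # state (i, j) -> row index, order of S

n = (b1 + 1) * (b2 + 1)
Q = np.zeros((n, n))
for i in range(b1 + 1):
    for j in range(b2 + 1):
        s = idx(i, j)
        mu1 = 5.0 + (2.0 if i - j >= 2 else 0.0)   # Spanish service rate
        mu2 = 8.0 + (2.0 if j - i >= 2 else 0.0)   # English service rate
        if i < b1:
            Q[s, idx(i + 1, j)] = l1
        if j < b2:
            Q[s, idx(i, j + 1)] = l2
        if i > 0:
            Q[s, idx(i - 1, j)] = mu1
        if j > 0:
            Q[s, idx(i, j - 1)] = mu2
        Q[s, s] = -Q[s].sum()               # diagonal makes each row sum to 0
```

The entries reproduce the matrix displayed above (e.g., the diagonal entry for state (2, 0) is −17, since the manager raises the Spanish service rate to 7 there).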
In block form,

Q = [ Q0,0  Q0,1  0     0
      Q1,0  Q1,1  Q1,2  0
      0     Q2,1  Q2,2  Q2,3
      0     0     Q3,2  Q3,3 ].
Using the Servi algorithm described earlier, we recursively obtain

R0 = I = [ 1 0 0 0 0
           0 1 0 0 0
           0 0 1 0 0
           0 0 0 1 0
           0 0 0 0 1 ],
R1 = [  2     −1.2    0      0      0
       −1.6    3.6   −1.2    0      0
        0     −2      4     −1.2    0
        0      0     −2      4     −1.2
        0      0      0     −2      2.8 ],
R2 = [  5.0857   −7.9200    1.4400     0          0
       −7.5429   19.6000   −9.8400     1.4400     0
        2.2857  −15.6000   22.4000   −10.8000     1.4400
        0         3.2000  −17.2000    24.0000    −9.3600
        0         0         4.0000   −15.6000    12.2400 ],
R3 = [  20.2596   −31.3420    16.1280    −1.7280    0
       −39.8041    80.0539   −70.1280    18.4320   −1.7280
        23.3796   −77.6735   135.8400   −78.4800   18.4320
        −3.6571    30.1714  −119.7600   146.5600  −63.4080
         0         −4.5714    43.3600   −99.4400   62.9920 ].
Solving a(R2 Q2,3 + R3 Q3,3) = 0 with the normalizing condition yields the steady-state probabilities pij = P(X1 = i, X2 = j). From these, the fraction of Spanish-speaking customers rejected is P(X1 = 3) = 0.1357 and the fraction of English-speaking customers rejected is P(X2 = 4) = 0.0748, so the accepted arrival rates are 4(1 − 0.1357) = 3.4572 and 6(1 − 0.0748) = 5.5512 per hour, respectively. Also, the mean numbers in the system are

∑_{i=0}^{3} ∑_{j=0}^{4} i pij = 1.1122

and

∑_{i=0}^{3} ∑_{j=0}^{4} j pij = 1.2926.
Hence, from Little’s law, the average time spent by accepted Spanish- and
English-speaking customers is 1.1122/3.4572 = 0.3217 h and 1.2926/5.5512 =
0.2329 h, respectively.
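The recursive computation of R1, R2, R3 and the vector a can be reproduced numerically. In this sketch (helper names and the SVD-based null-vector step are my own choices), the 5 × 5 blocks are taken from the Q matrix displayed earlier:

```python
import numpy as np

I5 = np.eye(5)
Q00 = np.array([[-10, 6, 0, 0, 0], [8, -18, 6, 0, 0], [0, 10, -20, 6, 0],
                [0, 0, 10, -20, 6], [0, 0, 0, 10, -14]], float)
Q11 = np.array([[-15, 6, 0, 0, 0], [8, -23, 6, 0, 0], [0, 8, -23, 6, 0],
                [0, 0, 10, -25, 6], [0, 0, 0, 10, -19]], float)
Q22 = np.array([[-17, 6, 0, 0, 0], [8, -23, 6, 0, 0], [0, 8, -23, 6, 0],
                [0, 0, 8, -23, 6], [0, 0, 0, 10, -19]], float)
Q33 = np.array([[-13, 6, 0, 0, 0], [8, -21, 6, 0, 0], [0, 8, -19, 6, 0],
                [0, 0, 8, -19, 6], [0, 0, 0, 8, -13]], float)
Q01 = Q12 = Q23 = 4 * I5
Q10 = np.diag([5., 5., 5., 5., 5.])
Q21 = np.diag([7., 5., 5., 5., 5.])
Q32 = np.diag([7., 7., 5., 5., 5.])

# Servi recursion: R0 = I, R_{i+1} = -(R_{i-1} Q_{i-1,i} + R_i Q_{i,i}) Q_{i+1,i}^{-1}
R0 = I5
R1 = -R0 @ Q00 @ np.linalg.inv(Q10)
R2 = -(R0 @ Q01 + R1 @ Q11) @ np.linalg.inv(Q21)
R3 = -(R1 @ Q12 + R2 @ Q22) @ np.linalg.inv(Q32)

# a solves a (R2 Q23 + R3 Q33) = 0; take a left null vector via SVD
M = R2 @ Q23 + R3 @ Q33
a = np.linalg.svd(M.T)[2][-1]
a /= a @ (R0 + R1 + R2 + R3) @ np.ones(5)   # normalize so p sums to 1

p = np.vstack([a @ R0, a @ R1, a @ R2, a @ R3])   # p[i, j] = P(X1=i, X2=j)
mean_spanish = sum(i * p[i].sum() for i in range(4))
mean_english = sum(j * p[:, j].sum() for j in range(5))
```

The computed R1, R2, R3 match the matrices displayed above, and the two means recover the 1.1122 and 1.2926 used in the Little's law step.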
Problem 20
Laurie’s Truck Repair offers emergency services for which they have two
facilities: one in the north end of town and the other in the south end. All
trucks that require an emergency repair call a single number to schedule
their repair. When a call is received, the operator must determine whether
to accept the repair request. If a repair is accepted, the operator must also
determine whether to send it to the north or south facility. The company has
installed a sophisticated system that can track the status of all the repairs (this
means the operator can know how many outstanding repairs are in progress
at each facility). Due to space restrictions to park the trucks, the north-side
facility can handle at most three requests at a time, whereas the south-side
facility can only handle two simultaneous requests. Use the following infor-
mation to determine the routing strategy: calls for repair arrive according
to a Poisson process with mean rate of four per day; the service times are
exponentially distributed at both facilities; however, the north-side facility can
repair three trucks per day on average, whereas the south-side facility can
repair two per day on average. The average revenue earned per truck repair
is $100. The holding cost per truck per day is $20 in the north side and $10 in
the south side (the difference is partly due to the cost of insurance in the two
neighborhoods). Assume the following: the time to take a truck to a repair
facility is negligible compared to the service time; decisions to accept/reject
and route are made instantaneously and are based only on the number of
committed outstanding repairs at each facility; once accepted at a facility,
the truck does not leave it until the repair is complete; the operator would
never reject a call if there is space in at least one of the facilities to repair; at
either location, trucks are repaired one at a time; the revenue earned for a
truck repair is independent of the time to repair, location, and type of repair.
Solution
For notational convenience, we use subscript “1” to denote the north side
and subscript "2" for the south side. Note that the problem description is almost
identical to that of Problem 17 with B1 = 3, B2 = 2, r = $100, μ1 = 3 per day,
μ2 = 2 per day, λ = 4 per day, h1 = $20 per truck per day, and h2 = $10 per
truck per day. However, one key difference between Problem 17 and this
one is that here the solution needs to be the actual policy (not the structure as
required in Problem 17). In other words, if a request for service is made when
there are i trucks in the north side and j in the south side, should we accept
the request, and if we do accept, should it be sent to the north or south side?
Let X1 (t) and X2 (t) denote the number of trucks under repair in the north-
and south-side facilities, respectively. Ignoring the time to schedule as well
as to drive to the appropriate facility, the system can be modeled as a two-
dimensional birth and death process {(X1 (t), X2 (t)), t ≥ 0} with state space
S = {(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2), (3, 0), (3, 1), (3, 2)}.
The action in state (3, 2) is to reject requests for repairs. When the system
is in states (0, 2), (1, 2), and (2, 2), the optimal action is to route to facility 1
(i.e., north) since there is no space in facility 2. Likewise, when the system
is in states (3, 0) and (3, 1), the optimal action is to route to facility 2 (i.e.,
south) since there is no space in facility 1. Thus, we only need to determine
the actions in the six states (0, 0), (0, 1), (1, 0), (1, 1), (2, 0), and (2, 1).
Although there are 2^6 = 64 possible sets of actions in the six states together
where we need to determine the optimal action (route to 1 or 2), from the
solution to Problem 17, we know the optimal solution is a monotonic switch-
ing curve. Therefore, we are reduced to only 10 different sets of actions that
we need to consider, which are summarized in Table 3.1. Let Aij be the action
in state (i, j) such that Aij = 1 implies routing to 1 and Aij = 2 implies routing
to 2. Therefore, there is considerable computation and time savings from 64
possible alternatives to 10. The only possible concern is that in Problem 17,
the objective is in terms of long-run average discounted cost, whereas here
it is the long-run average cost per unit time. As it turns out, the average cost
per unit time case also yields the same structure of the optimal policy.
Each one of the 10 alternative actions in Table 3.1 yields a two-
dimensional birth and death process. As described earlier, for all the 10
alternatives, X1 (t) and X2 (t) denote the number of trucks under repair in
the north- and south-side facilities, respectively, and {(X1 (t), X2 (t)), t ≥ 0}
would be a two-dimensional birth and death process with state space
S = {(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2), (3, 0), (3, 1), (3, 2)}.
The key difference among the 10 alternatives would be the Q matrix. There-
fore, although notationally we have the same set of pi,j values, they would
depend on the Q matrix. The objective is to determine the optimal one among
the 10 alternatives that would maximize the expected net revenue per unit
TABLE 3.1
Alternatives for Actions in the 6 States
Where They Are to Be Determined
A00 A01 A10 A11 A20 A21
1 1 1 1 1 1
1 1 1 1 2 1
1 1 1 1 2 2
1 1 2 1 2 1
1 1 2 1 2 2
2 1 2 1 2 1
1 1 2 2 2 2
2 1 2 1 2 2
2 1 2 2 2 2
2 2 2 2 2 2
time. For a given alternative, if the steady-state probabilities pi,j can be com-
puted for all i and j such that (i, j) ∈ S, the steady-state expected net revenue
per unit time is
rλ(1 − p3,2) − ∑_{i=0}^{3} ∑_{j=0}^{2} (ih1 + jh2) pi,j
dollars per day. This is due to the fact that a fraction (1 − p3,2 ) of requests
are accepted (which arrive at rate λ on average) and every request on aver-
age brings a revenue of r dollars; hence, the average revenue is rλ(1 − p3,2 ).
The remaining term is the average holding cost expenses that need to be sub-
tracted from the revenue. Since at any given time there are i trucks in location
1 and j trucks in location 2 with probability pi,j, by conditioning on the number of trucks in each location we obtain the expected holding cost per unit time: the cost rate is ih1 + jh2 when there are i trucks in location 1 and j in location 2.
For each of the candidate alternate actions, to evaluate the objective func-
tion, that is, the steady-state expected net revenue per unit time, we solve for
pi,j and compute the objective function. To speed up the process to obtain pi,j ,
we use the Servi algorithm in Section 3.1.2 but do not present the details here.
Using the numerical values for r, h1 , h2 , λ, μ1 , and μ2 , we obtain the optimal
set of actions as A00 = 1, A01 = 1, A10 = 2, A11 = 1, A20 = 2, and A21 = 2 with an
objective function value of $320.8905 per day. This optimal action set yields
a two-dimensional birth and death process with rate diagram depicted in
Figure 3.3. For this system, obtaining the steady-state probabilities using the Servi algorithm is left as an exercise.
There are many such queueing control problems where the objective
is to determine the optimal control actions in each state. In a majority of
FIGURE 3.3
Two-dimensional birth and death process corresponding to optimal action.
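The policy-evaluation step described above can be sketched numerically. The code below (variable names are my own) builds the generator induced by the optimal action set, solves for the steady-state probabilities, and evaluates the net-revenue objective; under the stated modeling assumptions it should agree with the $320.8905 per day reported in the text:

```python
import numpy as np

lam, mu1, mu2 = 4.0, 3.0, 2.0          # arrival and service rates (per day)
r, h1, h2 = 100.0, 20.0, 10.0          # revenue and holding-cost parameters
B1, B2 = 3, 2                          # capacities (north, south)
# optimal actions from the text: 1 = route north, 2 = route south
A = {(0, 0): 1, (0, 1): 1, (1, 0): 2, (1, 1): 1, (2, 0): 2, (2, 1): 2}

states = [(i, j) for i in range(B1 + 1) for j in range(B2 + 1)]
ix = {s: k for k, s in enumerate(states)}
Q = np.zeros((len(states), len(states)))
for (i, j) in states:
    s = ix[(i, j)]
    if (i, j) != (B1, B2):             # in (3, 2) the request is rejected
        if i == B1:
            dest = (i, j + 1)          # north full: forced to the south
        elif j == B2:
            dest = (i + 1, j)          # south full: forced to the north
        else:
            dest = (i + 1, j) if A[(i, j)] == 1 else (i, j + 1)
        Q[s, ix[dest]] += lam
    if i > 0:
        Q[s, ix[(i - 1, j)]] += mu1    # repairs complete one at a time
    if j > 0:
        Q[s, ix[(i, j - 1)]] += mu2
    Q[s, s] = -Q[s].sum()

M = Q.T.copy()
M[-1, :] = 1.0                         # normalization replaces one balance eq.
p = np.linalg.solve(M, np.eye(len(states))[-1])
obj = r * lam * (1 - p[ix[(B1, B2)]]) - sum(
    (h1 * i + h2 * j) * p[ix[(i, j)]] for (i, j) in states)
```

Looping this evaluation over the 10 alternatives of Table 3.1 and keeping the largest objective value is then a few more lines.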
Problem 21
Consider a queueing system with two parallel queues and two servers, one
for each queue. Customers arrive to the system according to PP(λ), and each
arriving customer joins the queue with the smaller number of customers. If both
queues have the same number of customers, then the arriving customer picks
either with equal probability. The service times are exponentially distributed
with parameter μi at server i for i = 1, 2. When a service is completed at one queue
and the other queue has two more customers than this queue, then the cus-
tomer at the end of the line instantaneously switches to the shorter queue
(this is called jockeying) to balance the queues. Let X1 (t) and X2 (t) be the
number of customers in queues 1 and 2, respectively, at time t. Assuming
that X1 (0) = X2 (0) = 0, we have |X1 (t) − X2 (t)| ≤ 1 for all t. Model the bivari-
ate stochastic process {(X1 (t), X2 (t)), t ≥ 0} as a QBD by obtaining A0 , A1 , A2 ,
B0 , B1 , and B2 .
Solution
The bivariate CTMC {(X1 (t), X2 (t)), t ≥ 0} has a state space
S = {(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2), (2, 2), (3, 2), (2, 3), . . .}.
By writing down the Q matrix of the CTMC and considering sets of three
states as a “level” with three “phases,” it is easy to verify that Q has a QBD
structure with
⎛ ⎞
0 0 0
A0 = B0 = ⎝ λ 0 0 ⎠
λ 0 0
⎛ ⎞
0 μ2 μ1
A2 = B2 = ⎝ 0 0 0 ⎠
0 0 0
⎛ ⎞
−λ − μ1 − μ2 λ/2 λ/2
A1 = ⎝ μ1 + μ2 −λ − μ1 − μ2 0 ⎠ and
μ1 + μ2 0 −λ − μ1 − μ2
⎛ ⎞
−λ λ/2 λ/2
B1 = ⎝ μ2 −λ − μ2 0 ⎠.
μ1 0 −λ − μ1
Care should be taken to write the states in the order (i, i), then (i + 1, i), fol-
lowed by (i, i + 1) for all i = 0, 1, 2, . . . and use those three states as part of
a level. Note that the previous CTMC does not resemble a birth and death
process at all.
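A quick numerical sanity check on these blocks (using illustrative values λ = 2, μ1 = 1, μ2 = 1.5, which are my own choices and not from the text) is that each block row of the resulting QBD generator sums to zero:

```python
import numpy as np

lam, mu1, mu2 = 2.0, 1.0, 1.5   # illustrative rates (not from the text)
A0 = B0 = np.array([[0, 0, 0], [lam, 0, 0], [lam, 0, 0]])
A2 = B2 = np.array([[0, mu2, mu1], [0, 0, 0], [0, 0, 0]])
A1 = np.array([[-lam - mu1 - mu2, lam / 2, lam / 2],
               [mu1 + mu2, -lam - mu1 - mu2, 0],
               [mu1 + mu2, 0, -lam - mu1 - mu2]])
B1 = np.array([[-lam, lam / 2, lam / 2],
               [mu2, -lam - mu2, 0],
               [mu1, 0, -lam - mu1]])

boundary = np.hstack([B1, B0])          # level-0 block row of Q
interior = np.hstack([A2, A1, A0])      # block row for levels 1, 2, ...
```

If any row failed to sum to zero, a rate would have been misplaced between the level-up, level-down, and within-level blocks.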
Problem 22
Consider a single server queue with infinite waiting room where the service
times are exp(μ). Into this queue, arrivals occur according to a Poisson pro-
cess with one of three possible rates λ0 , λ1 , or λ2 . The rates are governed
by a CTMC {Z(t), t ≥ 0}, called the environment process, on states {0, 1, 2} with infinitesimal generator matrix

[ −α1−α2   α1        α2
  β0       −β0−β2    β2
  γ0       γ1        −γ0−γ1 ].

That is, arrivals occur at rate λi whenever Z(t) = i. Let X(t) be the number of customers in the system at time t. Model the bivariate stochastic process {(X(t), Z(t)), t ≥ 0} as a QBD by obtaining A0, A1, A2, B0, B1, and B2.

Solution

The bivariate CTMC {(X(t), Z(t)), t ≥ 0} has a state space

S = {(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2), . . .}.
By writing down the Q matrix of the CTMC and considering sets of three
states as a “level” with three “phases,” it is easy to verify that Q has a QBD
structure with
⎛ ⎞
λ0 0 0
A0 = B0 = ⎝ 0 λ1 0 ⎠
0 0 λ2
⎛ ⎞
μ 0 0
A2 = B2 = ⎝ 0 μ 0 ⎠
0 0 μ
⎛ ⎞
−λ0 − μ − α1 − α2 α1 α2
A1 = ⎝ β0 −λ1 − μ − β0 − β2 β2 ⎠ and
γ0 γ1 −λ2 − μ − γ0 − γ1
⎛ ⎞
−λ0 − α1 − α2 α1 α2
B1 = ⎝ β0 −λ1 − β0 − β2 β2 ⎠.
γ0 γ1 −λ2 − γ0 − γ1
Note that the levels correspond to the number in the system and phase
corresponds to the state of the environment CTMC {Z(t), t ≥ 0}. In this
example, the phases and levels have a true meaning unlike the previous
example. In fact, historically these types of queues were first studied as QBDs
and hence some of that terminology remain. Further, such types of time-
varying arrival processes are called Markov-modulated Poisson processes.
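To make the Markov-modulated structure concrete, here is a small sketch with illustrative parameter values (all numbers are my own, not from the text): it verifies that A0 + A1 + A2 recovers the environment generator, and computes the time-averaged arrival rate ∑ πi λi of the MMPP.

```python
import numpy as np

lams = np.array([1.0, 4.0, 9.0])   # lambda_0, lambda_1, lambda_2 (illustrative)
mu = 5.0
a1, a2, b0, b2, g0, g1 = 0.3, 0.2, 0.4, 0.1, 0.5, 0.6   # illustrative switch rates
Qenv = np.array([[-a1 - a2, a1, a2],
                 [b0, -b0 - b2, b2],
                 [g0, g1, -g0 - g1]])

A0 = np.diag(lams)                  # level-up: arrival at the current rate
A2 = mu * np.eye(3)                 # level-down: service completion
A1 = Qenv - np.diag(lams) - mu * np.eye(3)
assert np.allclose(A0 + A1 + A2, Qenv)   # the blocks sum to the env. generator

# stationary distribution of the environment, and the mean modulated rate
M = Qenv.T.copy()
M[-1, :] = 1.0
pi = np.linalg.solve(M, np.array([0.0, 0.0, 1.0]))
mean_rate = pi @ lams
```

The mean rate lies between the smallest and largest λi, weighted by how long the environment spends in each state.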
Problem 23
Customers arrive at a single server facility according to PP(λ). An arriving
customer, independent of everything else, belongs to class-1 with probabil-
ity α and class-2 with probability β = 1 − α. The service time of class i (for
i = 1, 2) customers are IID exp(μi ) such that μ1 = μ2 . The customers form a
single line and are served according to first come first served (FCFS). Let
X(t) be the total number of customers in the system at time t, and Y(t) be the
class of the customer in service (with Y(t) = 0 if there are no customers in the
system). Model the bivariate stochastic process {(X(t), Y(t)), t ≥ 0} as a QBD
by obtaining A0 , A1 , A2 , B0 , B1 , and B2 .
Solution
The bivariate CTMC {(X(t), Y(t)), t ≥ 0} has a state space
S = {(0, 0), (1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2), . . .}.
By writing down the Q matrix of the CTMC and considering sets of two
states as a “level” with two “phases,” note that Q has a QBD structure with
ℓ = 3 and m = 2, where

A0 = [ λ  0
       0  λ ],

B0 = [ 0  0
       λ  0
       0  λ ],

A2 = [ αμ1  βμ1
       αμ2  βμ2 ],

B2 = [ 0  αμ1  βμ1
       0  αμ2  βμ2 ],

A1 = [ −λ−μ1    0
        0      −λ−μ2 ],  and

B1 = [ −λ    αλ       βλ
        μ1   −λ−μ1    0
        μ2    0       −λ−μ2 ].
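Since ℓ ≠ m here, the boundary blocks are rectangular. A short check (with illustrative numbers of my own choosing) confirms that the block rows of Q are still conservative:

```python
import numpy as np

lam, alpha, mu1, mu2 = 3.0, 0.4, 2.0, 5.0   # illustrative, with mu1 != mu2
beta = 1 - alpha
A0 = lam * np.eye(2)
B0 = np.array([[0, 0], [lam, 0], [0, lam]])
A2 = np.array([[alpha * mu1, beta * mu1], [alpha * mu2, beta * mu2]])
B2 = np.array([[0, alpha * mu1, beta * mu1], [0, alpha * mu2, beta * mu2]])
A1 = np.diag([-lam - mu1, -lam - mu2])
B1 = np.array([[-lam, alpha * lam, beta * lam],
               [mu1, -lam - mu1, 0],
               [mu2, 0, -lam - mu2]])

row0 = np.hstack([B1, B0])        # boundary block row: 3 x 5
row1 = np.hstack([B2, A1, A0])    # level-1 block row: 2 x 7
row2 = np.hstack([A2, A1, A0])    # block row for levels 2, 3, ...
```

Note how the departing customer's class resets the phase: a completion hands the server to a class-1 customer with probability α and a class-2 customer with probability β, which is exactly what the columns of A2 and B2 encode.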
p0 B1 + p1 B2 = 0
p0 B0 + p1 A1 + p2 A2 = 0
p1 A0 + p2 A1 + p3 A2 = 0
p2 A0 + p3 A1 + p4 A2 = 0
p3 A0 + p4 A1 + p5 A2 = 0
.. .. ..
. . .
These equations are satisfied by a solution of the form pi = p1 R^{i−1} for all i ≥ 1.
This is known as the matrix geometric solution due to the matrix geometric relation between the stationary probabilities. Clearly, for that to be a solution, we need A0 + RA1 + R^2 A2 = 0, where R is an unknown matrix. In fact, the crux of the MGM is in computing the R that satisfies A0 + RA1 + R^2 A2 = 0. The matrix R is known as the auxiliary matrix. Then, once R is known, we can obtain p0 in terms of p1 so that it satisfies both p0 B1 + p1 B2 = 0 and p0 B0 + p1 A1 + p1 RA2 = 0. Note that this results in ℓ + m equations with ℓ unknowns for p0 and m unknowns for p1. However, as with any CTMC, the ℓ + m equations are not linearly independent. Thus, we would have to drop one of them and use the normalizing condition that the elements of p sum to one. For this, it is convenient to write down all the pi terms in terms of p1. The normalizing condition can be written as
p0 1 + (p1 + p2 + p3 + · · · )1 = 1,
p0 1 + (p1 + p1 R + p1 R^2 + · · · )1 = 1,
p0 1 + p1 (I + R + R^2 + · · · )1 = 1,
p0 1 + p1 (I − R)^{−1} 1 = 1
provided that all eigenvalues of R are in the open interval between −1 and 1.
This is also sometimes written as the spectral radius of R should be less than
1. An intuitive way to think about that result is that if x is an eigenvector of R and k the corresponding eigenvalue, then Rx = kx and R^i x = k^i x. Thus, the sum (I + R + R^2 + · · · )x can be written as (1 + k + k^2 + · · · )x, which converges provided |k| < 1. Thus, the sum (I + R + R^2 + · · · ) converges to (I − R)^{−1} if |k| < 1 for every eigenvalue k. Since the spectral radius of R is the largest |k|, by ensuring it is less than 1, all eigenvalues are between −1 and 1.
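This convergence fact can be illustrated numerically with a small matrix whose spectral radius is below 1 (the matrix below is my own illustration, not from the text):

```python
import numpy as np

R = np.array([[0.4, 0.1],
              [0.2, 0.3]])
assert max(abs(np.linalg.eigvals(R))) < 1   # spectral radius below 1

# partial sum I + R + R^2 + ... + R^199
S = np.zeros_like(R)
term = np.eye(2)
for _ in range(200):
    S += term
    term = term @ R

inverse = np.linalg.inv(np.eye(2) - R)      # closed form (I - R)^{-1}
```

With spectral radius about 0.5 here, the partial sum agrees with (I − R)^{−1} to machine precision after a couple hundred terms.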
Summarizing, we obtain the steady-state probabilities p = [p0 p1 p2 . . .]
by solving for p0 and p1 in the following set of equations (after dropping one of the linearly dependent equations):
p0 B1 + p1 B2 = 0
p0 B0 + p1 A1 + p1 RA2 = 0
p0 1 + p1 (I − R)−1 1 = 1
where
1 is a column vector of ones of the appropriate size (ℓ × 1 or m × 1)
R is the minimal nonnegative solution to the equation
R^2 A2 + RA1 + A0 = 0.
FIGURE 3.4
Schematic representation. (From Mahabhashyam, S., and Gautam, N., Queueing Syst. Theory
Appl., 51(1–2), 89, 2005. With permission.)
⎛ ⎞
B1 B0 0 0 0 0 ...
⎜ B2 A1 A0 0 0 0 ... ⎟
⎜ ⎟
⎜ ... ⎟
⎜ 0 A2 A1 A0 0 0 ⎟
Q=⎜ ⎟
⎜ 0 0 A2 A1 A0 0 ... ⎟
⎜ ⎟
⎜ 0 0 0 A2 A1 A0 ... ⎟
⎝ ⎠
.. .. .. .. .. .. ..
. . . . . . .
where
⎛ ⎞
λ 0 0 ... 0
⎜ 0 λ 0 ... 0 ⎟
⎜ ⎟
⎜ ⎟
A0 = B0 = ⎜ 0 0 λ ... 0 ⎟
⎜ ⎟
⎜ .. .. .. .. .. ⎟
⎝ . . . . . ⎠
0 0 ... 0 λ
⎛ ⎞
θ1 0 0 ... 0
⎜ 0 θ2 0 ... 0 ⎟
⎜ ⎟
⎜ ⎟
A2 = B2 = ⎜ 0 0 θ3 ... 0 ⎟
⎜ ⎟
⎜ .. .. .. .. .. ⎟
⎝ . . . . . ⎠
0 0 ... 0 θm
⎛ ⎞
s(1) q1,2 q1,3 ... q1,m
⎜ q2,1 s(2) q2,3 ... q2,m ⎟
⎜ ⎟
⎜ ⎟
A1 = ⎜ q3,1 q3,2 s(3) q3,4 ... ⎟
⎜ ⎟
⎜ .. .. .. .. .. ⎟
⎝ . . . . . ⎠
qm,1 qm,2 . . . qm,m−1 s(m)
such that for i = 1, . . . , m, s(i) = qi,i − λ − θi. Recall that the qi,j values in these matrices are from Qz. Note that A0, A1, and A2 are square matrices of size m × m. In addition, B0, B1, and B2 are also of size m × m; in essence, the value of ℓ corresponding to the Bi matrices for i = 0, 1, 2 equals m. The zeros in the Q matrix are also of size m × m.
In summary, {(Z(t), X(t)), t ≥ 0} is a level-independent infinite-level QBD
process. Next, we use MGM to obtain the steady-state probabilities. Since
Qz = A0 + A1 + A2 and {Z(t), t ≥ 0} is an irreducible CTMC, we satisfy the
requirement that A0 + A1 + A2 is irreducible. Let π be the stationary prob-
ability for the CTMC {Z(t), t ≥ 0}. The 1 × m row vector π = [π1 . . . πm ] can
be obtained by solving π(A0 + A1 + A2 ) = [0 0 . . . 0] and π1 = 1, where 1 is
an m × 1 column vector. Since the condition for the CTMC to be stable is
μ ∑_{i=1}^{m} πi bi > λ.
Note that this condition implies that the average arrival rate must be smaller
than the average service rate for stability. However, an interesting observa-
tion is that the average service time experienced by a customer is not the
reciprocal of the time-averaged service rate (described earlier).
Having described the stability condition, assuming it is met, the next
step is to obtain the steady-state probabilities p of the QBD process with rate
matrix Q. As described in the MGM analysis, we write p as [p0 p1 p2 . . . ],
where p0 , p1 , p2 , . . ., are 1 × m row vectors. We obtain the steady-state prob-
abilities p = [p0 p1 p2 . . . ] by solving for p0 and p1 in the following set of
equations:
p0 B1 + p1 B2 = 0
p0 B0 + p1 A1 + p1 RA2 = 0
p0 1 + p1 (I − R)^{−1} 1 = 1

where R is the minimal nonnegative solution to R^2 A2 + RA1 + A0 = 0.
Therefore, letting L denote the time-averaged number of jobs in the system computed from p, the expected waiting (including service) time of a job in the system can be calculated using Little's law as W = L/λ, where W is the average sojourn time in the system for an arbitrary arrival in steady state.
Problem 24
Consider a web server that streams video traffic at different bandwidths.
This is very common in websites that broadcast sports over the Internet.
The users are given an option to select one of the two bandwidths offered
depending on their connection speed. Let us denote the two bandwidths by
r1 = 0.265 Mbps (low bandwidth) and r2 = 0.350 Mbps (high bandwidth). Let
the processing capacity of the web server be C = 0.650 Mbps. Requests for the two bandwidths (low and high, respectively) arrive according to Poisson processes with rates λ1 = 1 per second and λ2 = 2 per second. The holding times are exponentially distributed with parameters μ1 = 2 per second and μ2 = 3 per second for the two bandwidths, respectively. Note
that service rate corresponds to the holding time that the streaming request
stays connected streaming traffic at its bandwidth. Besides the streaming
traffic, there is also elastic traffic, which is usually data. Elastic-traffic requests arrive according to a Poisson process with rate λ = 3 per second, and their file sizes are exponentially distributed with parameter μ = 8 per MB (note the unit of the file size parameter). The
elastic traffic uses whatever remaining capacity (out of C) the processor has.
Compute mean number of elastic traffic requests in the system in steady state
as well as the steady-state response time they experience.
Solution
Let the state of the environment be a two-dimensional vector denoting the
number of low- and high-bandwidth streaming requests in the system at
time t. Note that the low- and high-bandwidth requirements are r1 = 0.265
Mbps and r2 = 0.350 Mbps, respectively, with total capacity C = 0.650 Mbps.
Hence, the possible states for the environment are (0, 0), (1, 0), (2, 0),
(0, 1), (1, 1), where the first tuple represents the number of ongoing low-
bandwidth requests and the second one represents the number of ongoing
high-bandwidth requests. Without loss of generality, we map the states
(0, 0), (1, 0), (2, 0), (0, 1), and (1, 1) to 1, 2, 3, 4, and 5, respectively. Therefore,
the environment process {Z(t), t ≥ 0} is a CTMC on state space {1, 2, 3, 4, 5}.
The corresponding available bandwidths (in Mbps) for the elastic traffic
in those five states are b1 = 0.650, b2 = 0.385, b3 = 0.120, b4 = 0.300, and
b5 = 0.035. Further, the infinitesimal generator matrix Qz for the irreducible
CTMC {Z(t), t ≥ 0} is given by
Qz = [ −λ1−λ2   λ1            0      λ2        0
       μ1       −μ1−λ1−λ2    λ1     0         λ2
       0        2μ1          −2μ1   0         0
       μ2       0            0      −μ2−λ1    λ1
       0        μ2           0      μ1        −μ1−μ2 ]

   = [ −3    1    0    2    0
        2   −5    1    0    2
        0    4   −4    0    0
        3    0    0   −4    1
        0    3    0    2   −5 ].
Now, consider the elastic traffic. Elastic traffic requests arrive into a single
server queue according to a Poisson process with mean rate λ = 3 per second.
Each arriving request brings a certain amount of work distributed exponen-
tially with mean 1/μ = 0.125 Mb that is processed at varying rates, depending
on the available capacity left over by the streaming traffic. In particular, if
the environment process is in state i, the request (if any) in process is served
at rate θi = μbi for i = 1, 2, 3, 4, 5. Let X(t) be the number of elastic requests
in queue at time t. Let Z(t) be the state of the environment process, which
governs the server speed at time t and {Z(t), t ≥ 0} is an irreducible finite-
state CTMC with m = 5 states and infinitesimal generator matrix Qz = [qi,j ]
for i ∈ {1, 2, . . . , m} and j ∈ {1, 2, . . . , m} described earlier. When the state of
the environment process Z(t) = i, the service speed available is bi bytes per
unit time, which is also given earlier for i = 1, . . . , 5. Clearly, the bivariate
stochastic process {(Z(t), X(t)), t ≥ 0} is a two-dimensional CTMC on state
space {(1, 0), (2, 0), . . . , (5, 0), (1, 1), (2, 1), . . . , (5, 1), (1, 2), (2, 2), . . . , (5, 2), . . .}
where

A0 = B0 = diag(λ, λ, λ, λ, λ) = λI,   A2 = B2 = diag(θ1, θ2, θ3, θ4, θ5),
A1 = [ −λ1−λ2−λ−θ1   λ1                0            λ2              0
       μ1            −μ1−λ1−λ2−λ−θ2   λ1           0               λ2
       0             2μ1              −2μ1−λ−θ3    0               0
       μ2            0                0            −μ2−λ1−λ−θ4     λ1
       0             μ2               0            μ1              −μ1−μ2−λ−θ5 ].
We first verify that the stability condition

μ ∑_{i=1}^{5} πi bi > λ

is satisfied. Since

μ ∑_{i=1}^{5} πi bi = 3.2585 > 3 = λ,

the system is stable. Next, we compute the minimal nonnegative solution R to

R^2 A2 + RA1 + A0 = 0,

which yields
R = [ 0.4551   0.0845   0.0131   0.1456   0.0400
      0.2526   0.4187   0.0600   0.1273   0.1210
      0.2585   0.2917   0.4400   0.1291   0.0907
      0.2959   0.0905   0.0145   0.4774   0.0830
      0.2958   0.2373   0.0363   0.2367   0.4574 ].
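The stability check and the computation of R above can be reproduced by successive substitution, a standard iteration for the minimal nonnegative solution. The sketch below (helper names are my own) builds the blocks from the numeric Qz and the θi = μbi values:

```python
import numpy as np

lam, mu = 3.0, 8.0
b = np.array([0.650, 0.385, 0.120, 0.300, 0.035])
theta = mu * b
Qz = np.array([[-3, 1, 0, 2, 0], [2, -5, 1, 0, 2], [0, 4, -4, 0, 0],
               [3, 0, 0, -4, 1], [0, 3, 0, 2, -5]], float)

# stationary distribution of the environment CTMC
M = Qz.T.copy()
M[-1, :] = 1.0
pi = np.linalg.solve(M, np.eye(5)[-1])
drift = mu * pi @ b                     # should be 3.2585 > lam, so stable

A0 = lam * np.eye(5)
A2 = np.diag(theta)
A1 = Qz - lam * np.eye(5) - np.diag(theta)

# successive substitution: R <- -(A0 + R^2 A2) A1^{-1}, starting from R = 0
R = np.zeros((5, 5))
A1inv = np.linalg.inv(A1)
for _ in range(100000):
    Rnew = -(A0 + R @ R @ A2) @ A1inv
    if np.abs(Rnew - R).max() < 1e-12:
        R = Rnew
        break
    R = Rnew
```

Because the system is close to its stability boundary (3/3.2585 ≈ 0.92), the iteration converges only linearly and needs a few hundred passes.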
p0 B1 + p1 B2 = 0
p0 B0 + p1 A1 + p1 RA2 = 0
p0 1 + p1 (I − R)−1 1 = 1
Q = [ −4    1    2    1
       1   −3    2    0
       2    2   −6    2
       3    0    0   −3 ].    (3.8)
Problem 25
Obtain the left eigenvector of the Q matrix in Equation 3.8 corresponding to
eigenvalue 0 and normalize it so that it adds to 1 to obtain p.
Solution
The left eigenvectors of Q are [−0.6528 −0.4663 −0.3730 −0.4663],
[−0.5 0.5 −0.5 0.5], [−0.0000 0.4082 −0.8165 0.4082], and [0.4126 −0.7220
−0.2063 0.5157]. They correspond to eigenvalues 0, −6, −7, and −3,
respectively. The left eigenvector corresponding to eigenvalue of 0 is
[−0.6528 −0.4663 −0.3730 −0.4663]. Normalizing by dividing each element by the sum of the elements of this eigenvector, we get p = [0.3333 0.2381 0.1905 0.2381].
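This computation takes only a few lines, since the left eigenvectors of Q are the eigenvectors of its transpose:

```python
import numpy as np

Q = np.array([[-4, 1, 2, 1], [1, -3, 2, 0], [2, 2, -6, 2], [3, 0, 0, -3]], float)

w, v = np.linalg.eig(Q.T)      # columns of v are left eigenvectors of Q
k = np.argmin(abs(w))          # pick the eigenvalue closest to 0
p = np.real(v[:, k])
p = p / p.sum()                # normalize so the elements add to 1
```

The normalization also fixes the arbitrary sign and scale that the eigenvector routine returns.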
Problem 26
For the Q matrix in Equation 3.8, write down p in matrix form using Q so
that it can be used to solve for p.
Solution
Since the steady-state probabilities p = [p1 p2 p3 p4 ] can be obtained by
solving for pQ = [0 0 0 0] and p1 + p2 + p3 + p4 = 1, we have
[p1 p2 p3 p4] [ −4    1    2    1
                 1   −3    2    0
                 2    2   −6    2
                 3    0    0   −3 ] = [0 0 0 0]
and
[p1 p2 p3 p4] [ 1
                1
                1
                1 ] = 1.
The transient state probabilities satisfy p(t) = p(0) exp(Qt), where

exp(Qt) = I + ∑_{j=1}^{∞} (Qt)^j / j!.
Therefore, by using a large enough value of t, one could obtain p using the
previous equations (the reason for it being a one-line code is that in most
mathematical software packages, exponential of a matrix is a built-in func-
tion, namely, in MATLAB, the command is expm(Q*t)). We show this via
an example.
Problem 27
For the Q matrix in Equation 3.8, obtain p as the limit of a transient analysis.
Solution
Since the steady-state probabilities p = [p1 p2 p3 p4] can be obtained as the limit of p(0) exp(Qt) as t grows, picking a large t (say t = 50) and any initial distribution p(0) again yields p = [0.3333 0.2381 0.1905 0.2381].
year 2006). In a study conducted in January 2006, almost half the Fortune 500 IT executives identified power and cooling as problems in their data centers. A study identified that a 100,000 ft² data center would cost about $44 million
per year just to power the servers and $18 million per year to power the
cooling infrastructure.
One of the biggest concerns for data centers is to find a way to signif-
icantly reduce the energy consumed. Although there are several strategic
initiatives to design green data centers, energy consumption in data centers can also be significantly reduced by controlling their operations. For example,
instead of running one application per server, collect a set of applications
and run them on multiple servers. If the load for a collection of applications
is low, then one could turn off a few servers. For example, if we have 8 appli-
cations a1 , a2 , . . ., a8 and 10 servers s1 , s2 , . . ., s10 , then a possible assignment
is as follows: the set of applications {a1 , a3 , a6 } are assigned to each of servers
s3 , s5 , s7 , and s10 ; applications {a2 , a5 , a7 , a8 } are assigned to each of servers
s1 , s2 , s8 , and s9 ; and application {a4 } is assigned to each of servers s4 and
s6 . Then, if the load for the set of applications {a1 , a3 , a6 } is low, medium, or
high, then one could perhaps turn off two, one, or zero servers, respectively,
from the set of servers assigned to this application {s3 , s5 , s7 , s10 }.
Two of the difficulties for turning off servers are that (a) turning servers
on and off frequently causes wear and tear and reduces their lifetime in addi-
tion to spending energy for powering on and off; and (b) it takes a few minutes
for a server to be powered on, and therefore any unforeseen spikes in load
would cause degradation in service to the clients. To address concern (a), in
the model that we develop there would be a cost to boot a server (also popu-
larly known as switching cost in the queueing control literature). Further,
a strategy for concern (b) is to perform what is known as dynamic volt-
age/frequency scaling. In essence, the speed at which the server processes
information (which is related to the CPU frequency) can be reduced by scaling
down the voltage dynamically. By doing so, the CPU not only consumes
less energy (which is proportional to the cube of the frequency), but also
can switch instantaneously to a higher voltage when a spike occurs in the
load. However, even at the lowest frequency (similar to hibernation on a
laptop), the server consumes energy and therefore the best option is still to
turn servers off.
Next, we describe a simple model to develop a strategy for controlling the
processing speed of the servers as well as powering servers on and off. We
assume that applications are already assigned to servers and consider a sin-
gle collection of applications, all of which are loaded on K identical servers.
At any time, a subset of the K servers is on (the rest are off) and all servers
that are on run at the same frequency. There is a single queue for the set of K
servers and requests arrive according to a Poisson process with mean rate λ.
The number of bytes to process for each request is assumed to be IID with an
exponential distribution. There are a finite number of possible frequencies at
which to run the servers. Therefore, we assume that at any time, the service
times are exponentially distributed, with the rate determined by the frequency
currently in use.
FIGURE 3.5
Schematic for hysteretic policy. (The horizontal axis is the number in the system X(t), the
vertical axis is the number of servers on Z(t), and each cell indicates the service rate μ1, μ2,
or μ3 used in that state.)
For the number of servers, the optimal policy is of hysteretic type: there
are two thresholds (let us call them θ1 and θ2 such that θ1 ≤ θ2 ), and the hys-
teretic policy suggests that if the queue length grows larger than θ2 , then switch
on a server, but do not switch it off until the queue length goes below θ1 .
However, for the frequencies, the optimal policy is of threshold type.
Next, we illustrate the hysteretic policy for the number of servers and
threshold policy for the service rate using Figure 3.5. Let X(t) denote the
number of customers in the system at time t and let Z(t) be the number of
servers on at time t. For the purpose of illustration, let there be three servers
(i.e., K = 3) and the number of possible service rates is also three (hence
the three rates μ1 , μ2 , and μ3 ). We represent the state of the system at time t
using the two-tuple (X(t), Z(t)). Since there are two actions in each state, the
actions depicted in Figure 3.5 for each state need some explanation. When
the system is empty and one server is running (this corresponds to state (0,1)
in the figure), the server runs at rate μ1 . Now, if a new customer arrives in
this state (which is the only possible event), the action based on the policy
is to continue with one server and run the server in this new state (1,1) at
rate μ1 . In state (1,1) if an arrival occurs, the policy is to continue with one
server; however, the server will run at rate μ2 in the new state (2,1). Whereas,
if a service is completed in state (1,1), the policy is to stick with one server
and run the server at rate μ1 in the new state (0,1). As long as there are less
than or equal to four requests in the system, only one server will run at rate
μi , which will depend on the number in the system (i.e., μ1 for 0 or 1 in the
system, μ2 for 2 or 3 in the system, and μ3 for 4 in the system).
Now, in state (4,1), if an arrival occurs, from the policy in Figure 3.5, a
new server is added to the system as the first action. For the second action,
we observe the new state after arrival as (5,2), where the action is to run both
servers at rate μ2 (note that prior to this in state (4,1), the single server was
running at μ3 and that is slowed down). Note that in state (5,2), if a service is
completed, we do not immediately go back to 1 server but wait till the num-
ber of customers goes below 2. In other words, once there are two servers
running and the number of customers are between 2 and 7, the action with
respect to number of servers is to do nothing (i.e., no addition or subtraction).
The action in terms of service rates of both servers is to use μ1 for 2 or 3 in
the system, μ2 for 4 or 5 in the system, and μ3 for 6 or 7 in the system. In state
(2,2), if a service is completed, the action is to turn off a server (of course the
natural choice is to select the server, which completed the service). Also, in
the new state (1,1), the single server would run at rate μ1 . Likewise, if an
arrival occurs in state (7,2), then the action is to power on a new server and
in the new state (8,3), all three servers would run at μ3 . As long as there are
three or more customers in the system, all the three servers would continue
running at rates μ1 for 3 in the system, μ2 for 4 in the system, and μ3 for 5
or more in the system. When the number in the system reaches 3 with three
servers running and one service completes, one of the servers is turned off
and the remaining two servers in the new state (2,2) run at rate μ1 . All this is
depicted in the policy schematic in Figure 3.5. The policy is also described in
Table 3.2.
Although the optimal policy has a structure as described earlier (hys-
teretic for number of servers and threshold for service rates), it is not clear
what the exact optimal policy is. For that, we need to search across all such
policies and determine the one that results in the minimal long-run average
cost per unit time. To do so, we first explain how to compute the long-run average
cost per unit time for a given policy. For any given policy, we can describe
a CTMC {(X(t), Z(t)), t ≥ 0} with state space S. As an example, consider the
policy in Table 3.2 for a system with K = 3 and three possible service rates. The rate diagram of the
corresponding CTMC is depicted in Figure 3.6. Assume that for this CTMC,
we can obtain the steady-state probabilities pi,j for all (i, j) ∈ S. Then, the
long-run average cost per unit time is

Bλ(p4,1 + p7,2 ) + Σ_{(i,j)∈S} j (C0 + C μ(i,j)³) pi,j + Σ_{(i,j)∈S} h i pi,j ,

where μ(i,j) denotes the service rate used in state (i,j), B is the cost of booting
a server, C0 + Cμ(i,j)³ is the power cost per unit time of a running server
(recall that energy consumption is proportional to the cube of the frequency),
and h is the holding cost per customer per unit time.
TABLE 3.2
Hysteretic Policy in Tabular Form
Current State Server New Rate
(X(t), Z(t)) Event Action State Action
(0,1) Arrival Do nothing (1,1) μ1
(1,1) Arrival Do nothing (2,1) μ2
Departure Do nothing (0,1) μ1
(2,1) Arrival Do nothing (3,1) μ2
Departure Do nothing (1,1) μ1
(3,1) Arrival Do nothing (4,1) μ3
Departure Do nothing (2,1) μ2
(4,1) Arrival Add 1 server (5,2) μ2
Departure Do nothing (3,1) μ2
(2,2) Arrival Do nothing (3,2) μ1
Departure Remove 1 (1,1) μ1
(3,2) Arrival Do nothing (4,2) μ2
Departure Do nothing (2,2) μ1
(4,2) Arrival Do nothing (5,2) μ2
Departure Do nothing (3,2) μ1
(5,2) Arrival Do nothing (6,2) μ3
Departure Do nothing (4,2) μ2
(6,2) Arrival Do nothing (7,2) μ3
Departure Do nothing (5,2) μ2
(7,2) Arrival Add 1 server (8,3) μ3
Departure Do nothing (6,2) μ3
(3,3) Arrival Do nothing (4,3) μ2
Departure Remove 1 (2,2) μ1
(4,3) Arrival Do nothing (5,3) μ3
Departure Do nothing (3,3) μ1
(5,3) Arrival Do nothing (6,3) μ3
Departure Do nothing (4,3) μ2
(6,3) Arrival Do nothing (7,3) μ3
Departure Do nothing (5,3) μ3
(7,3) Arrival Do nothing (8,3) μ3
Departure Do nothing (6,3) μ3
(8,3) Arrival Do nothing (9,3) μ3
Departure Do nothing (7,3) μ3
(9,3) Arrival Do nothing (10,3) μ3
Departure Do nothing (8,3) μ3
(i,3), i ≥ 10 Arrival Do nothing (i+1,3) μ3
Departure Do nothing (i−1,3) μ3
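The policy of Table 3.2 can also be encoded compactly as a transition function. The sketch below hardcodes the thresholds implicit in the table (a server is added on an arrival in states (4,1) and (7,2), removed on a departure in (2,2) and (3,3), with per-server-count rate steps); it is an illustration of the table, not general-purpose code.

```python
# Thresholds read off Table 3.2.
ADD_AT = {1: 4, 2: 7}      # add a server on an arrival at this x with z servers
REMOVE_AT = {2: 2, 3: 3}   # remove a server on a departure at this x
# With z servers on, the rate is mu1 below the first entry, mu2 below the
# second entry, and mu3 at or above the second entry.
RATE_STEPS = {1: (2, 4), 2: (4, 6), 3: (4, 5)}

def next_state(x, z, event):
    """Given state (x, z) and an event, return the new state and the
    service rate label used in the new state, per Table 3.2."""
    if event == "arrival":
        z2 = z + 1 if z < 3 and x == ADD_AT[z] else z
        x2 = x + 1
    else:  # departure
        z2 = z - 1 if z > 1 and x == REMOVE_AT[z] else z
        x2 = x - 1
    lo, hi = RATE_STEPS[z2]
    rate = "mu1" if x2 < lo else ("mu2" if x2 < hi else "mu3")
    return (x2, z2), rate
```

For example, an arrival in state (4,1) yields (5,2) with rate μ2, matching the table.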
FIGURE 3.6
Rate diagram corresponding to hysteretic policy.
Problem 28
Using the notation described so far in this section, consider a small exam-
ple of K = 3 servers and two different service rates (such that μ1 = 10 and
μ2 = 15). For this system, obtain optimal values of l2 , l3 , u1 , u2 , m12 , m22 , and
m32 so that the long-run average cost per unit time is minimized subject to
the following constraints:
2 ≤ m12 ≤ u1 ,
2 ≤ l2 ≤ m22 ≤ u2 ,
l2 ≤ u1 + 1 ≤ u2 ,
l2 + 1 ≤ l3 ≤ m32 ,
l3 ≤ u2 + 1.
1. For each l2 , l3 , u1 , u2 , m12 , m22 , and m32 that satisfies the following
constraints
2 ≤ m12 ≤ u1
2 ≤ l2 ≤ m22 ≤ u2
l2 ≤ u1 + 1 ≤ u2
l2 + 1 ≤ l3 ≤ m32
l3 ≤ u2 + 1
S = {(0, 1), (1, 1), . . . , (u1 − 1, 1), (u1 , 1), (l2 , 2), (l2 + 1, 2), . . . , (u2 − 1, 2),
(u2 , 2), (l3 , 3), (l3 + 1, 3), . . .}.
Although there are many ways to numerically obtain pi,j for all
(i, j) ∈ S (especially by writing all pi,j values in terms of pu2 +1,3 ),
we use a finite state approximation by truncating the state space.
In particular, we pick a large M such that M is much larger than
m32 . Since λ = 30 and 3μ2 = 45, the system is stable and the steady-state
probabilities of states beyond M are negligibly small.
FIGURE 3.7
Rate diagram corresponding to optimal solution.
pi,j = pi,j if (i, j) ∈ S, and pi,j = 0 otherwise.
4. Now that we have an expression for pi,j for all (i, j) ∈ S given l2 , l3 , u1 ,
u2 , m12 , m22 , and m32 , the next step is to obtain the objective function.
Let us denote the objective function as f (l2 , l3 , u1 , u2 , m12 , m22 , m32 ).
Using the expression described prior to the problem state-
ment for the long-run average cost per unit time, the objective
function is
Bλ(pu1,1 + pu2,2 )
+ Σ_{j=1}^{3} Σ_{i=lj}^{mj2 −1} j (C0 + C μ1³) pi,j
+ Σ_{j=1}^{3} Σ_{i=mj2}^{uj} j (C0 + C μ2³) pi,j
+ Σ_{j=1}^{3} Σ_{i=lj}^{uj} h i pi,j ,

with the conventions l1 = 0 and u3 = M (the truncation level).
It is worthwhile to point out that the various pi,j values are them-
selves functions of l2 , l3 , u1 , u2 , m12 , m22 , and m32 . In particular, we
write pi,j = gij (l2 , l3 , u1 , u2 , m12 , m22 , m32 ) although we do not know
gij (·) explicitly, and this is done purely to write down a mathematical
program. Therefore, the relationship between the decision variables
and the objective function does not exist in closed form. The math-
ematical programming formulation to optimally select integers l2 , l3 ,
u1 , u2 , m12 , m22 , and m32 is
Minimize

Bλ(pu1,1 + pu2,2 )
+ Σ_{j=1}^{3} Σ_{i=lj}^{mj2 −1} j (C0 + C μ1³) pi,j
+ Σ_{j=1}^{3} Σ_{i=mj2}^{uj} j (C0 + C μ2³) pi,j
+ Σ_{j=1}^{3} Σ_{i=lj}^{uj} h i pi,j

Subject to
2 ≤ m12 ≤ u1
2 ≤ l2 ≤ m22 ≤ u2
l2 ≤ u1 + 1 ≤ u2
l2 + 1 ≤ l3 ≤ m32
l3 ≤ u2 + 1
pi,j = gij (l2 , l3 , u1 , u2 , m12 , m22 , m32 )   ∀ (i, j) : j = 1, 2, 3; lj ≤ i ≤ uj .
Reference Notes
The material presented in this chapter is rather unusual in a queueing
theory text, and in fact in most universities this material is not part of a grad-
uate level course on queueing. However, there are two topics in this chapter
that have received a tremendous amount of attention in the literature: matrix
analytical methods and control of queues. There are several excellent books
on matrix analytical methods, and one of the pioneering works is by Neuts
[85]. Other books include Stewart [99], and Latouche and Ramaswami [73].
The essence of matrix analytical methods is to use numerical and iterative
methods for Markov chains, and it is general enough to be used in a vari-
ety of settings beyond what is considered in this chapter. We will visit this
technique in the next chapter as well to get a full appreciation.
The topic of control of queues is also widespread. However, the litera-
ture on performance analysis and control is quite distinct. There is a clear
optimization flavor in control of queues and the use of stochastic dynamic
programming. A recent book by Stidham [100] meticulously covers the topic
of optimization in queues by design and control. The crucial point made
in this chapter is that stochastic dynamic programming only provides the
structure of the optimal policy (such as threshold, switching curve, and hys-
teretic), whereas to obtain the exact optimal policy, one needs to explore the
space of solutions that satisfy the given policy structure. This chapter, especially the
first and third parts, explicitly explores that.
Exercises
3.1. Refer to the notation in Problem 17 in Section 3.1.1. First show that
−h(x) satisfies conditions (3.2), (3.3), and (3.4). Then, assuming that
Vn (x) is nonincreasing and satisfies the conditions (3.2), (3.3), and
(3.4), show that Vn ((x − ei )+ ) for i = 1, 2 also satisfy the conditions
(3.2), (3.3), and (3.4).
3.2. A finite two-dimensional birth and death process on a rectangular
lattice for some given a and b has a state space given by S = {(i, j) :
0 ≤ i ≤ a, 0 ≤ j ≤ b}, that is, all integer points on the XY plane such
that the x points are between 0 and a and the y points are between
0 and b. The rate of going from state (i, j) to (i + 1, j) is αi for i < a
and the rate of going from state (i, j) to (i, j + 1) is γj for j < b such
that (i, j) ∈ S. The rates αi and γj are known as birth rates. Likewise,
define death rates βi and δj such that they are the rates of going
from (i, j) to (i − 1, j) and (i, j − 1), respectively, when i > 0 and j > 0.
Show that this generic two-dimensional birth and death process has
Q matrix of the block tridiagonal form (as described before the Servi
algorithm in Section 3.1.2) by writing it down in that form.
3.3. Consider Problem 20 in Section 3.1.3. The two-dimensional birth and
death process corresponding to the optimal action is described in
Figure 3.3. Using the numerical values stated in that problem, com-
pute the steady-state probabilities for that two-dimensional birth
and death process using the Servi algorithm. Also, compute the
long-run average cost per unit time.
3.4. Consider an M[X] /M/1 batch arrival queue with individual service where
batches arrive according to PP(λ) and each batch is of size 1, 2, 3, or 4
with probability 0.4, 0.3, 0.2, and 0.1, respectively. If the service rate
is μ, model the number of customers in the system as a QBD process.
Obtain the condition for stability and the steady-state probabilities
using MGM. For the special case of λ = 1 and μ = 2.5, obtain numeri-
cal values for the steady-state probabilities and the average number
in the system in steady state.
3.5. Consider a CPU of a computer that processes tasks from a software
agent as well as other tasks on the computer in parallel by sharing
the computer’s processor. The software agent submits tasks accord-
ing to a Poisson process with parameter λ and each task has exp(μ)
work (in terms of kilobytes) in it that the CPU has to perform. If the
only process running on the CPU is that of the agent, it receives all
the CPU speed. However, if there are a few other processes running
on the CPU, only a fraction of the CPU speed is available, depending
on how many processes are running at the same time. Model the system
as a queue with time-varying service rates according to an external
environment process (the other processes that run on the CPU). For
this, let the available capacity vary according to a CTMC {Z(t), t ≥ 0}
with generator matrix Qz such that at time t the available processing
speed for the agent tasks is bZ(t) (kilobytes per second). There are
five possible server speeds, that is, Z(t) takes values 1–5. They are
b1 = 1, b2 = 2, b3 = 3, b4 = 4, and b5 = 5. The infinitesimal generator
matrix Qz is a 5×5 matrix given by
⎡ −6   2   1   2   1  ⎤
⎢  1  −7   3   2   1  ⎥
Qz = ⎢  3   2  −8   2   1  ⎥.
⎢  2   1   1  −5   1  ⎥
⎣  3   4   1   2  −10 ⎦
The mean arrival rate λ = 2.5 and the mean task size 1/μ = 1. Com-
pute the mean response time for jobs posted by the software agent
to the CPU. Use a similar framework as Problem 24 in Section 3.2.3.
and
QY = ⎡ −θ   θ ⎤
     ⎣  ν  −ν ⎦.
This means that across each node i, the flow out equals the flow in just like
in the CTMCs. Notice that the preceding expression does not include pii ;
however, if that is preferred, we could add πi pii to both sides of the equation
to get
Σ_{j∈S} πi pij = Σ_{j∈S} πj pji .
Since the balance equations along with the normalizing conditions are
exactly analogous to those of the CTMC (essentially replacing qij by pij
G(t) = P{Si ≤ t}
for all i. Service times are nonnegative and hence G(t) = 0 for all t < 0. How-
ever, there are no other restrictions on the random variables Si ; in fact, they
could be discrete, continuous, or a mixture of discrete and continuous. For
the sake of notational convenience, we let the mean and variance of service
times be 1/μ and σ², respectively, such that for all i,

E[Si ] = 1/μ   and   Var[Si ] = σ².
Since the mean and variance of the service times can easily be derived from
the CDF G(t), sometimes μ and σ are not specified while describing the
service times.
We have almost everything in place to call the preceding system an
M/G/1 queue. The only aspect that remains is the service discipline.
Although Kendall notation specifies that the default discipline is FCFS and
we will derive all our results assuming FCFS, it is worthwhile to discuss
the generality of the analysis. For most of the analysis, we only require that
if there is an entity in the system, useful work is always performed and at
most one entity in the queue can have incomplete service. This deserves
some explanation. Firstly, the server cannot be idle when there are customers
in the system. That also means that if there are customers waiting, as soon
as service completes for a customer, the service for the next customer starts
instantaneously. This is an aspect that is frequently overlooked while collect-
ing service time data. The simplest way to fix the problem is to add any time
spent between service of customers to the customers’ service time. Secondly,
since at most one customer can have incomplete service, this precludes
disciplines involving preemption or processor sharing. However, schemes
such as LCFS without preemption and random order of service can be
analyzed.
Having described the setting for the M/G/1 queue, we next model and
analyze the system. Notice that unless the service times are according to
some mixture of exponential distributions, we cannot model the system
using a CTMC. In fact, we will see that even for the mixture of exponential
case, modeling as a DTMC provides some excellent closed-form algebraic
expressions that the CTMC model may fail to provide. With that said, we
begin modeling the system. The first question that comes to mind is when to
observe the system so that the resulting process is a DTMC. We can immedi-
ately rule out observing the system any time in the middle of a service since
the remaining service time would now depend on history and Markov prop-
erty would not be satisfied. Therefore, the only options are to observe at the
beginning and/or end of service times. Although it may indeed be possible
to model the system as a DTMC by observing both at the beginning and at
the end of a service, we will see that it is sufficient if we observed the system
at the end of a service. In other words, we will observe the system whenever
a customer departure occurs. The next question is that during these depar-
ture epochs, the number in the system goes down by one—so should we
observe before or after a departure? Although either case would work, we
will observe immediately after the departure so that the departing customer
is not included in the state.
With that in place, we let Xn be the number of customers in the system
immediately after the nth departure. The state space, that is, set of all pos-
sible values of Xn , for all n is {0, 1, 2, 3, . . .}. For some arbitrary n, let Xn = i
such that i > 0. We now derive a distribution for Xn+1 , given Xn = i. If Xn = i
immediately after the nth departure, we will have one customer at the server
and i − 1 waiting. When this customer at the service completes service, we
observe the system next, that is, the n+1st departure. So Xn+1 would be equal
to i − 1 plus all the customers that arrived during the service time that just
completed. In other words, Xn+1 would be i − 1 + j with probability aj , where
aj is the probability that j customers arrive during a service (for j = 0, 1, 2, . . .).
Hence, we write mathematically
P{Xn+1 = i − 1 + j|Xn = i} = aj

for all j ≥ 0. Next, consider Xn = 0, that is, the system is empty immediately
after the nth departure. The next event must be an arrival; when that arrival
occurs, during this customer's service if j arrivals occur, then when this n+1st
customer departs, there would be j in the system. In other words, if Xn = 0,
Xn+1 would be j with probability aj , where aj once again is the probabil-
ity that j customers arrive during a service (for j = 0, 1, 2, . . .). This we write
mathematically as
P{Xn+1 = j|Xn = 0} = aj
for all j ≥ 0. This deserves some explanation, as it differs slightly from the
case Xn > 0, where the time between observations was equal to a single service
time. Notice that when Xn = 0, the next observation is after two events, one
arrival and one service, in that order. Clearly, Xn+1 would be equal to the
number of customers that arrive during the second phase, that is, a service.
Thus, we are able to use the same notation aj . Of course, we do not have an
expression for aj and would need to derive it. We will do that after modeling
the system as a DTMC.
From the earlier description, to determine Xn+1 we only need to be given
Xn and not the history. Also the probability of transitioning from Xn to Xn+1
does not depend on n. Therefore, we can model {Xn , n ≥ 0} as a DTMC with
state space {0, 1, 2, . . .} and transition probability matrix
⎡ a0  a1  a2  a3  . . . ⎤
⎢ a0  a1  a2  a3  . . . ⎥
⎢ 0   a0  a1  a2  . . . ⎥
P = ⎢ 0   0   a0  a1  . . . ⎥
⎢ 0   0   0   a0  . . . ⎥
⎣ .   .   .   .   .     ⎦
where

aj = ∫_0^∞ e^{−λt} ((λt)^j / j!) dG(t).
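For a concrete service-time distribution the aj can be computed numerically. The sketch below does this for exponential service, where the closed form aj = (μ/(λ+μ))(λ/(λ+μ))^j is available as a check, and assembles a truncated version of the transition matrix P.

```python
import math
import numpy as np

lam, mu = 1.0, 2.0     # arrival rate; exponential service with rate mu

def a(j, t_max=40.0, n_grid=200_001):
    """a_j = int_0^inf e^{-lam t} (lam t)^j / j! dG(t), by trapezoidal
    quadrature, with dG(t) = mu e^{-mu t} dt for exponential service."""
    t = np.linspace(0.0, t_max, n_grid)
    f = (np.exp(-lam * t) * (lam * t) ** j / math.factorial(j)
         * mu * np.exp(-mu * t))
    dt = t[1] - t[0]
    return float(np.sum(0.5 * (f[1:] + f[:-1])) * dt)

def a_exact(j):
    """Closed form for exponential service: a geometric distribution."""
    return (mu / (lam + mu)) * (lam / (lam + mu)) ** j

def build_P(n):
    """Truncated n x n transition matrix of the embedded M/G/1 chain."""
    av = [a(j) for j in range(n)]
    P = np.zeros((n, n))
    P[0, :] = av
    for i in range(1, n):
        P[i, i - 1:] = av[: n - i + 1]
    return P
```

The early rows of the truncated P should sum to nearly 1, with the small deficit being truncation error.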
The limiting distribution π = (π0 π1 . . .) can be obtained by solving π = πP
and Σj πj = 1. To solve the balance equations (i.e., the equations that correspond
to π = πP for each node), we use the generating function approach
seen in Chapter 2. The balance equations are
π0 = a0 π0 + a0 π1
π1 = a1 π0 + a1 π1 + a0 π2
π2 = a2 π0 + a2 π1 + a1 π2 + a0 π3
π3 = a3 π0 + a3 π1 + a2 π2 + a1 π3 + a0 π4
.. .. ..
...
We multiply the first equation by z0 , the next by z1 , the next by z2 , and so on.
Upon summing we get
π0 z0 + π1 z1 + π2 z2 + π3 z3 + · · · = π0 (a0 z0 + a1 z1 + a2 z2 + · · · )
+ π1 (a0 z0 + a1 z1 + a2 z2 + · · · )
+ π2 (a0 z1 + a1 z2 + a2 z3 + · · · )
+ π3 (a0 z2 + a1 z3 + a2 z4 + · · · ) + · · ·
π0 z0 + π1 z1 + π2 z2 + π3 z3 + · · · = π0 (a0 z0 + a1 z1 + a2 z2 + · · · )
+ π1 (a0 z0 + a1 z1 + a2 z2 + · · · )
+ π2 z(a0 z0 + a1 z1 + a2 z2 + · · · )
+ π3 z2 (a0 z0 + a1 z1 + a2 z2 + · · · ) + · · · .
π0 z0 + π1 z1 + π2 z2 + π3 z3 + · · ·
= (a0 z0 + a1 z1 + a2 z2 + · · · )(π0 + π1 + π2 z + π3 z2 + · · · ).
Since we are going to use generating functions, we can rewrite the preceding
equation as
π0 z0 + π1 z1 + π2 z2 + π3 z3 + · · · = (a0 z0 + a1 z1 + a2 z2 + · · · )
    × [π0 + (1/z)(−π0 + π0 z0 + π1 z1 + π2 z2 + π3 z3 + · · · )].
Define the generating functions

φ(z) = Σ_{i=0}^{∞} πi z^i   and   A(z) = Σ_{j=0}^{∞} aj z^j.
Note that A(z) can be computed based on the inputs λ and G(t). Hence, by
rewriting the preceding equation, we get φ(z) as
φ(z) = π0 A(z)(1 − z) / (A(z) − z).    (4.2)
The only unknown on the RHS of Equation 4.2 is π0 . To obtain that, we first
need to write down some properties for A(z).
From the definition of A(z), we have A(1) = 1 since a0 + a1 + · · · = 1.
However, to get other properties, we first write A(z) in the simplest pos-
sible form in terms of the parameters in the problem definition, namely, λ
and G(t). By using the definition of aj , we get
A(z) = Σ_{j=0}^{∞} aj z^j = Σ_{j=0}^{∞} ∫_0^∞ e^{−λt} ((λt)^j / j!) z^j dG(t)
     = ∫_0^∞ e^{−λt} ( Σ_{j=0}^{∞} (zλt)^j / j! ) dG(t)
     = ∫_0^∞ e^{−λt} e^{zλt} dG(t)
     = ∫_0^∞ e^{−(1−z)λt} dG(t) = G̃((1 − z)λ)
where the last expression G̃((1 − z)λ) is the LST of G(t) at (1 − z)λ. By defi-
nition, if S is a random variable corresponding to the service times, then the
LST of G(t) at u is
E[e^{−uS}] = ∫_0^∞ e^{−ut} dG(t) = G̃(u).
Also, using the properties of LSTs, G̃(0) = 1, G̃′(0) = −E[S] = −1/μ, and
G̃″(0) = E[S²] = 1/μ² + σ². Therefore, from the earlier results and the relation
A(z) = G̃((1 − z)λ), we get

A′(1) = −λG̃′(0) = λ/μ,    (4.3)

A″(1) = λ²G̃″(0) = λ²/μ² + λ²σ².    (4.4)
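These LST properties are easy to check numerically with finite differences. The sketch below does so for exponential service, where G̃(u) = μ/(μ + u), so that G̃′(0) = −1/μ and G̃″(0) = 2/μ², which indeed equals 1/μ² + σ² since σ² = 1/μ² here.

```python
lam, mu = 1.0, 2.0

def G_lst(u):
    """LST of an exp(mu) service time: E[e^{-uS}] = mu / (mu + u)."""
    return mu / (mu + u)

h = 1e-4
d1 = (G_lst(h) - G_lst(-h)) / (2 * h)                  # ~ G'(0) = -1/mu
d2 = (G_lst(h) - 2 * G_lst(0.0) + G_lst(-h)) / h**2    # ~ G''(0) = 2/mu^2
A1 = -lam * d1                                         # ~ A'(1) = lam/mu
```

The central differences agree with the stated derivatives to high accuracy.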
Now we get back to Equation 4.2. To obtain π0 we first try φ(0) and that
gives φ(0) = π0 , which is true but does not help us to get π0 . Next we try
φ(1) = lim_{z→1} π0 A(z)(1 − z) / (A(z) − z) = π0 A(1) lim_{z→1} (1 − z) / (A(z) − z).
Using A(1) = 1 and realizing that the limit is of the type 0/0, we use
L’Hospital’s rule to get
φ(1) = π0 lim_{z→1} (−1) / (A′(z) − 1) = π0 / (1 − A′(1)).
Since φ(1) = 1 and A′(1) = λ/μ, this yields

π0 = 1 − λ/μ.

For π0 to be positive, that is, for the system to be stable, we require

λ < μ.

As usual, we define the utilization

ρ = λ/μ.
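The result π0 = 1 − ρ can be verified from the embedded DTMC itself. The sketch below does this for deterministic service (M/D/1), where aj = e^{−ρ}ρ^j/j!, by solving a truncated version of π = πP; the truncation level N is an assumption.

```python
import math
import numpy as np

lam, d = 0.5, 1.0    # arrival rate; deterministic service time (M/D/1)
rho = lam * d        # utilization; stability requires rho < 1
N = 200              # truncation level for the embedded DTMC

# a_j = e^{-rho} rho^j / j!, computed recursively to avoid overflow
a = np.empty(N)
a[0] = math.exp(-rho)
for j in range(1, N):
    a[j] = a[j - 1] * rho / j

P = np.zeros((N, N))
P[0, :] = a
for i in range(1, N):
    P[i, i - 1:] = a[: N - i + 1]
P /= P.sum(axis=1, keepdims=True)   # renormalize mass lost to truncation

# Solve pi = pi P together with sum(pi) = 1 as a linear system
A = P.T - np.eye(N)
A[-1, :] = 1.0
rhs = np.zeros(N); rhs[-1] = 1.0
pi = np.linalg.solve(A, rhs)
```

Here pi[0] should be very close to 1 − ρ = 0.5, and the mean of pi should match the M/D/1 value of L derived next.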
Problem 29
Consider a stable M/G/1 queue with PP(λ) arrivals, mean service time 1/μ,
and variance of service time σ2 . Compute L, the long-run time-averaged
number of entities in the system. Using that obtain W the average sojourn
time spent by an entity in the system in steady state.
Solution
Recall that πj is the long-run fraction of time a departing customer sees j
others in the system. It is also known as departure-point steady-state proba-
bility. However, to compute L, we need the time-averaged (and not as seen
by departing customers) fraction of time spent in state j, which we call pj . For
that, we know from PASTA (Poisson arrivals see time averages) described in
Section 1.3.4 that pj must be equal to the long-run fraction of time an arriving
customer sees j others in the system, that is, π∗j . In other words, pj = π∗j . But
we also know from Section 1.3.3 that π∗j = πj for any G/G/s queue and hence
pj = πj for all j.
Using that logic, the average number of customers in the system is
L = Σ_{j=0}^{∞} j pj = Σ_{j=0}^{∞} j πj = φ′(1).
Differentiating the expression for φ(z) (Equation 4.2 with π0 = 1 − ρ and
A(z) = G̃(λ − λz)), we get

φ′(z) = [(1 − ρ)G̃(λ − λz) + (1 − ρ)(1 − z)λG̃′(λ − λz) − (1 + λG̃′(λ − λz))φ(z)]
        / [z − G̃(λ − λz)].
Earlier in this section, we saw that G̃(0) = 1 and G̃′(0) = −1/μ. Using
those (and also realizing φ(1) = 1) to compute φ′(1) by taking the limit as z
approaches one, we get a 0/0 expression. Therefore, we apply L'Hospital's rule;
notice that both sides of the resulting equation have φ′(z) terms, so we solve
for φ′(1). Now, by taking the limits as z → 1 using G̃(0) = 1, G̃′(0) = −1/μ,
G̃″(0) = 1/μ² + σ², and φ(1) = 1, we get

φ′(1) = ρ + λ²(σ² + 1/μ²) / (2(1 − ρ)).
Since L = φ′(1), we have

L = ρ + λ²(σ² + 1/μ²) / (2(1 − ρ)).    (4.6)
Then, using Little's law, W = L/λ, we get

W = 1/μ + λ(σ² + 1/μ²) / (2(1 − ρ)).
Notice that the preceding equation for W as well as Equation 4.6 for L
are in terms of λ, μ, and σ only. The entire service time distribution G(t) is
not necessary for those expressions. From a practical standpoint, this is very
useful because μ and σ can be estimated more robustly compared to G(t).
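The two expressions translate directly into a small calculator; setting σ = 1/μ recovers the M/M/1 values, a convenient sanity check.

```python
def mg1_L_W(lam, mu, sigma):
    """L from Equation 4.6 and W = L/lam for a stable M/G/1 queue with
    arrival rate lam, mean service time 1/mu, service std dev sigma."""
    rho = lam / mu
    assert rho < 1, "stability requires lam < mu"
    L = rho + lam**2 * (sigma**2 + 1.0 / mu**2) / (2.0 * (1.0 - rho))
    return L, L / lam
```

For example, with λ = 1, μ = 2, and σ = 1/μ = 0.5 we get L = ρ/(1 − ρ) = 1 and W = 1/(μ − λ) = 1, the M/M/1 answers; with σ = 0 the M/D/1 values emerge.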
Next, having discussed the distribution of the queue length, it is quite
natural to consider the distribution of the sojourn time in the system that
we call waiting time. For this, we require FCFS and it is the first time we
truly require FCFS discipline. Up until now, all the results can be derived
for any work conserving discipline with a maximum of one customer hav-
ing incomplete service at any given time. We describe the next result on the
sojourn time distribution as a problem, by recognizing that we already know
the mean sojourn time from the previous problem.
Problem 30
Let Y be the sojourn time in the system for a customer arriving into a stable
M/G/1 queue in steady state. If the service is FCFS, then show that the LST
of the CDF of Y is
E[e^{−sY}] = (1 − ρ) s G̃(s) / (s − λ(1 − G̃(s))).
Solution
As before, Xn denotes the number of customers in the M/G/1 queueing
system as seen by the nth departing customer. Let Bn be the number of cus-
tomers that arrive during the nth customer’s sojourn, which includes time
spent waiting (if any) and time for service. Since the service discipline is
FCFS, we have
Xn = Bn .

Hence,

E[z^{Xn}] = E[z^{Bn}].

Also, by definition of the generating function of the limiting distribution,

φ(z) = lim_{n→∞} Σ_{i=0}^{∞} P{Xn = i} z^i = lim_{n→∞} E[z^{Xn}].
Therefore, from the equality E[zXn ] = E[zBn ] and the earlier expression, we
have
lim_{n→∞} E[z^{Bn}] = φ(z).    (4.7)
Let Wn denote the sojourn time of the nth customer, with CDF Hn (·). We
can compute E[z^{Bn}] by conditioning on Wn as follows:

E[z^{Bn}] = ∫_0^∞ E[z^{Bn} | Wn = w] dHn(w)
          = ∫_0^∞ Σ_{i=0}^{∞} e^{−λw} ((λw)^i / i!) z^i dHn(w)
where the last equation uses the fact that the arrivals are PP(λ) and the
probability of getting i arrivals in time w is e−λw (λw)i /i!. Therefore, we have
Therefore, we have

E[z^{Bn}] = ∫_0^∞ e^{−λw} e^{λwz} dHn(w) = ∫_0^∞ e^{−(1−z)λw} dHn(w) = H̃n((1 − z)λ),

where H̃n(·) denotes the LST of Hn(·), that is,

E[e^{−sWn}] = H̃n(s).
Taking the limits as n → ∞ of the preceding expression, and using the fact
that Wn → Y as n → ∞, we get

E[e^{−sY}] = H̃(s).
Substituting s = (1 − z)λ, that is, z = 1 − s/λ, the preceding gives

E[e^{−sY}] = H̃(s) = lim_{n→∞} E[(1 − s/λ)^{Bn}].

However, from Equation 4.7,

lim_{n→∞} E[(1 − s/λ)^{Bn}] = φ(1 − s/λ).

Hence,

E[e^{−sY}] = H̃(s) = φ(1 − s/λ).    (4.9)
Using the expression for φ(z) in Equation 4.5 in terms of z, λ, G̃(·), and μ, we
have by letting z = 1 − s/λ
E[e^{−sY}] = (1 − ρ) s G̃(s) / (s − λ(1 − G̃(s)))
where ρ = λ/μ.
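For exponential service, the LST above should collapse to (μ − λ)/(μ − λ + s), the LST of an exp(μ − λ) random variable, which is the familiar M/M/1 sojourn time; the sketch below checks this numerically.

```python
def sojourn_lst(s, lam, mu):
    """E[e^{-sY}] for a stable M/G/1 FCFS queue, specialized here to
    exponential service, whose service-time LST is mu / (mu + s)."""
    rho = lam / mu
    G = mu / (mu + s)
    return (1.0 - rho) * s * G / (s - lam * (1.0 - G))

def exp_lst(s, theta):
    """LST of an exp(theta) random variable."""
    return theta / (theta + s)
```

Evaluating both at a few values of s shows they agree to machine precision.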
Problem 31
Derive the relationship between higher moments of the steady-state sojourn
time against those of the number in the system for an M/G/1 queue with
FCFS service.
Solution
Let E[Yr ] be the rth moment of the steady-state sojourn time, for r = 1, 2, 3, . . .,
which can be computed as
E[Y^r] = lim_{s→0} (−1)^r d^r H̃(s) / ds^r .
Likewise, let L(r) be the rth factorial moment of the steady-state number in
the system. Note that for a discrete random variable X on 0, 1, 2, . . ., the rth
factorial moment is defined as E[X(X − 1)(X − 2) . . . (X − r + 1)]. Therefore,
L(1) is L itself. Then, L(2) can be used to compute the variance of the number
in the system as L(2) + L − L2 . Likewise, higher moments of the number in
the system can be obtained. However, notice that
L(r) = lim_{z→1} d^r φ(z) / dz^r .
There are many results like the one in the preceding text that can be eas-
ily derived for the M/G/1 queue but are sometimes not even mentioned
while discussing the special case M/M/1, although methodologically they
would require quite different approaches. Having said that, the next result
is one that would typically be analyzed using identical methods for M/G/1
and M/M/1, followed by a curious paradox. That result is presented as a
problem.
Problem 32
In a single-server queue, a busy period is defined as a consecutive stretch of
time when the server is busy serving customers. A busy period starts when a
customer arrives into an empty single-server queue and ends when the sys-
tem becomes empty for the first time after that. With that definition, obtain
the LST of the busy period distribution of an M/G/1 queue.
Solution
Let Z be a random variable denoting the busy period initiated by an arrival
into an empty queue in steady state. Also, let S be the service time of this
customer that just arrived. Remember that we are only going to consider
nonpreemptive schemes, although this result would not alter if we consid-
ered preemption, as long as it is work conserving. But the proof would have
to be altered, hence the assumption. Let N be the number of customers that
arrive during the service of this “first” customer, that is, in time S. Of course,
if N is zero, then the busy period is S itself. Let us remember this case but
for now assume N > 0. We keep these N customers aside in what we call ini-
tial pool. Take one from the initial pool and serve that customer and in the
mean time if any new customers arrive serve them one by one till there are
no customers in the system except the N − 1 in the initial pool. It is critical to
realize that the time to serve the first customer in the initial pool, and all the
customers that arrived subsequently, until the queue has no customers outside
the initial pool, is stochastically equal to a busy period.
We call this time Z1 . Next, pick the second customer (if any) from the initial
pool and spend a busy period (of length Z2 ) serving that customer and all
that arrive until the queue only has customers from the initial pool. Repeat
the process until there are no customers left in the initial pool. We use this to
write down the conditional relation for some u ≥ 0:
E[e^{−uZ} | S = t, N = n] = E[e^{−u(t+Z1 +Z2 +···+Zn )}].
But this conditional relation also works when N = 0. So from now on, we
remove restriction on N and say that the preceding is true for all N ≥ 0.
Unconditioning the earlier equation using P{N = n|S = t} = e−λt (λt)n /n!,
we get
E[e^{−uZ}] = ∫_0^∞ e^{−ut} Σ_{n=0}^{∞} (E[e^{−uZ}])^n e^{−λt} ((λt)^n / n!) dG(t),    (4.11)
since Z, Z1 , Z2 , . . ., are IID random variables. We use the notation F̃Z (u) as
the LST of the CDF of Z that is defined mathematically as
F̃Z(u) = E[e^{−uZ}] = ∫_0^∞ e^{−uz} dFZ(z)
where FZ (z) = P{Z ≤ z}. Notice that we do not know FZ (z) and are trying to
obtain it via the LST F̃Z (u). Rewriting Equation 4.11 in terms of the LST of
the busy period distribution, we get
F̃Z(u) = ∫_0^∞ e^{−ut} ( Σ_{n=0}^{∞} [F̃Z(u)λt]^n / n! ) e^{−λt} dG(t).

Summing the series, this yields

F̃Z(u) = ∫_0^∞ e^{−ut} e^{F̃Z(u)λt} e^{−λt} dG(t) = G̃(u + λ − λF̃Z(u)).
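Although FZ(z) is not available in closed form, the functional equation can be solved numerically by fixed-point iteration F ← G̃(u + λ − λF). The sketch below does this for exponential service and recovers the mean busy period from the derivative of the LST at 0, which should equal 1/(μ − λ).

```python
lam, mu = 1.0, 2.0   # must satisfy lam < mu for stability

def G_lst(u):
    return mu / (mu + u)          # LST of the exp(mu) service time

def busy_lst(u, iters=500):
    """Solve F = G(u + lam - lam*F) by fixed-point iteration."""
    F = 0.5                        # any starting point in (0, 1]
    for _ in range(iters):
        F = G_lst(u + lam - lam * F)
    return F

h = 1e-5
mean_busy = -(busy_lst(h) - busy_lst(-h)) / (2.0 * h)
```

At u = 0 the iteration converges to 1, as any LST must, and the numerical derivative matches 1/(μ − λ) = 1 here.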
Remark 8
For an M/G/1 queue with σ > 1/μ, since the mean busy period is E[Z] = 1/
(μ − λ) and the mean sojourn time is W = 1/μ + λ(σ² + 1/μ²)/(2(1 − ρ)), we
get E[Z] < W. In other words, the mean busy period is smaller than the mean
waiting time when σ > 1/μ. However, this appears like a paradox because if
you take any busy period, the waiting time of a customer that entered and
left during this busy period is always smaller than the busy period itself. But
is the expected waiting time greater than the expected busy period? How
could that be?
The reason is that a long busy period has many customers stuck in it and,
averaging over customers, the sojourn times would end up being long on
average. A simulation might help with the
intuition and the reader is encouraged to try one out. Having described this,
we wrap up the topic M/G/1 queue and move on to its counterpart, the
G/M/1 queue.
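Following up on the simulation suggestion above, here is a minimal sketch. The parameter values are hypothetical: an M/G/1 queue with Poisson(λ = 1) arrivals and hyperexponential service chosen so that σ > 1/μ (here E[S] = 0.75, ρ = 0.75, so theory gives E[Z] = 3 and W = 4).

```python
import random

def simulate_mg1(n=200_000, seed=1):
    """Estimate mean sojourn time and mean busy period for an M/G/1 queue
    with hyperexponential service (so that sigma > 1/mu)."""
    rng = random.Random(seed)
    lam = 1.0                     # arrival rate (hypothetical)
    mu1, mu2 = 4.0, 0.8           # E[S] = 0.75, rho = 0.75, SCOV > 1

    def service():
        return rng.expovariate(mu1 if rng.random() < 0.5 else mu2)

    A = 0.0        # arrival time of current customer
    Wq = 0.0       # its waiting time before service (Lindley's recursion)
    last_dep = 0.0
    bp_start = None
    sojourns, busy_periods = [], []
    for _ in range(n):
        S = service()
        if Wq == 0.0:                    # arrival finds the server idle:
            if bp_start is not None:     # previous busy period just ended
                busy_periods.append(last_dep - bp_start)
            bp_start = A                 # a new busy period begins
        sojourns.append(Wq + S)
        last_dep = A + Wq + S
        T = rng.expovariate(lam)         # next interarrival time
        Wq = max(0.0, Wq + S - T)        # Lindley's recursion
        A += T
    return sum(sojourns) / len(sojourns), sum(busy_periods) / len(busy_periods)

mean_W, mean_Z = simulate_mg1()
print(mean_W, mean_Z)   # theory: W = 4 and E[Z] = 3 for these parameters
```

The run confirms the intuition in the remark: the customer-averaged sojourn time exceeds the mean busy period, even though every individual sojourn fits inside its busy period.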
{X_n∗, n ≥ 0} is a DTMC with state space S = {0, 1, 2, . . .}, and

        | b0  a0  0   0   · · · |
        | b1  a1  a0  0   · · · |
    P = | b2  a2  a1  a0  · · · |
        | b3  a3  a2  a1  · · · |
        | b4  a4  a3  a2  · · · |
        |  .   .   .   .    .   |

where

    b_j = Σ_{i=j+1}^∞ a_i  for all j ≥ 0,
    a_j = ∫_0^∞ e^{−μt} (μt)^j/j! dG(t).
Notice that the preceding is derived using exactly the same argument as that
in the M/G/1 queue (the reader is encouraged to verify that). Also, the preceding
equation assumes the interarrival times are purely continuous.
We will continue to treat the interarrival times that way with the under-
standing that if there were discrete-valued point masses, then the Riemann
integral would be replaced by the Lebesgue integral.
The case not considered in the preceding text is when there are actually
k = i + 1 departures, where i is the number of customers in the system when
the previous arrival occurred. Then, when the next customer arrives, there
would be no other customer in the system. However, the probability of going
from i to 0 in the DTMC is not ai+1 . This is because ai+1 denotes the probabil-
ity there are exactly i+1 departures during one interarrival time interval. But,
the i + 1 departures would have occurred before the interarrival time period
ended and if there were more in the system, perhaps there could have been
more departures. Hence, if there were an abundant number of customers in
the system, then during the interarrival time interval, there would be i + 1
or more departures. Hence, the probability of transitioning from state i to 0
in the DTMC is ai+1 + ai+2 + ai+3 + · · · , which we call bi as defined earlier.
Notice that the rows add to 1 in the P matrix and this is a lower Hessenberg
matrix.
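To make the structure concrete, the following sketch builds a truncated version of P for a hypothetical choice of interarrival distribution, Erlang-2 with rate θ = 2λ in each phase, for which a_j has the closed form a_j = (j + 1)θ²μ^j/(μ + θ)^{j+2} (the Erlang choice and this closed form are our own, not from the text), and checks that each row sums to 1:

```python
from math import isclose

def gm1_transition_matrix(lam, mu, N=15):
    """Truncated transition matrix of the G/M/1 arrival-point DTMC with
    Erlang-2(theta) interarrival times, theta = 2*lam (so arrival rate lam).
    For this choice a_j = (j+1) theta^2 mu^j / (mu+theta)^(j+2)."""
    theta = 2.0 * lam
    a = [(j + 1) * theta**2 * mu**j / (mu + theta)**(j + 2) for j in range(N)]
    b = [1.0 - sum(a[:j + 1]) for j in range(N)]   # b_j = sum_{i > j} a_i
    P = [[0.0] * N for _ in range(N)]
    for i in range(N - 1):                          # lower Hessenberg rows
        P[i][0] = b[i]
        for j in range(1, i + 2):
            P[i][j] = a[i + 1 - j]
    return P

P = gm1_transition_matrix(lam=1.0, mu=1.5, N=15)
print([isclose(sum(row), 1.0, abs_tol=1e-12) for row in P[:5]])
```

Each row i contains b_i followed by a_i, a_{i−1}, . . . , a_0, so the row sum b_i + Σ_{k=0}^i a_k equals 1 exactly, as the check confirms.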
Having modeled the G/M/1 queue as a DTMC, next we analyze the
steady-state behavior and derive performance measures. Let π∗j be the
limiting probability that in the long run an arriving customer sees j in
the system, that is,
    π∗_j = lim_{n→∞} P{X_n∗ = j}.

The limiting distribution π∗ = [π∗_0 π∗_1 . . .], if it exists, can be obtained by
solving π∗ = π∗P and Σ_j π∗_j = 1. The balance equations that arise out of
solving π∗ = π∗P are

    π∗_0 = b_0π∗_0 + b_1π∗_1 + b_2π∗_2 + b_3π∗_3 + · · ·
    π∗_1 = a_0π∗_0 + a_1π∗_1 + a_2π∗_2 + a_3π∗_3 + · · ·
    π∗_2 = a_0π∗_1 + a_1π∗_2 + a_2π∗_3 + · · ·
    π∗_3 = a_0π∗_2 + a_1π∗_3 + · · ·
      .
      .
and we solve them using a technique we have not used before in this text. Since
there is a unique solution to the balance equations (if it exists), we try some
common forms for the steady-state probabilities. In particular, we try the
form π∗_i = (1 − α)α^i for i = 0, 1, 2, . . ., where α is to be determined. The justification
for that choice is that for the M/M/1 system, π∗_i is of that form. The
very first equation from the earlier set, π∗_0 = b_0π∗_0 + b_1π∗_1 + b_2π∗_2 + b_3π∗_3 + · · · ,
is a little tricky, but all the others are straightforward. Plugging in π∗_i = (1 − α)α^i
and b_i = a_{i+1} + a_{i+2} + a_{i+3} + · · · for i = 0, 1, 2, . . . (and canceling a factor
1 − α, using Σ_{i=0}^{k−1} α^i = (1 − α^k)/(1 − α)), we get

    1 − α = a_0(1 − α^0) + a_1(1 − α^1) + a_2(1 − α^2) + a_3(1 − α^3) + · · ·
          = (a_0 + a_1 + a_2 + a_3 + · · ·) − Σ_{i=0}^∞ a_i α^i,

and since a_0 + a_1 + a_2 + · · · = 1, this reduces to

    α = Σ_{i=0}^∞ a_i α^i.
The remaining balance equations, π∗_1 = a_0π∗_0 + a_1π∗_1 + · · · ,
π∗_2 = a_0π∗_1 + a_1π∗_2 + · · · , and so on, reduce, respectively, to

    α = Σ_{i=0}^∞ a_i α^i,  α² = α Σ_{i=0}^∞ a_i α^i,  α³ = α² Σ_{i=0}^∞ a_i α^i,  . . .

which are all satisfied if α is the solution to α = Σ_{i=0}^∞ a_i α^i. Let us first write
down the condition α = Σ_{i=0}^∞ a_i α^i in terms of the variables in the G/M/1
queue:
    α = Σ_{i=0}^∞ a_i α^i = Σ_{i=0}^∞ α^i ∫_0^∞ e^{−μt} (μt)^i/i! dG(t) = ∫_0^∞ e^{−μt} Σ_{i=0}^∞ (αμt)^i/i! dG(t)
      = ∫_0^∞ e^{−(1−α)μt} dG(t) = G̃((1 − α)μ) = E[e^{−(1−α)μT_j}]
where G̃(s) is the LST of G(t) at some s ≥ 0. In summary, the limiting proba-
bility π∗i exists for i = 0, 1, 2, . . ., and is equal to π∗i = (1 − α)αi if there exists a
solution to α = G̃((1 − α)μ) such that α ∈ (0, 1). Next, we check when
α = G̃((1 − α)μ) has such a solution. As it turns out, that
would be the stability condition for the DTMC {X_n∗, n ≥ 0}.
We use a graphical method to describe the condition for stability for the
G/M/1 queue, which is the same as the condition for positive recurrence
for the DTMC {X_n∗, n ≥ 0}. We write G̃((1 − α)μ) as G̃(μ − αμ). Note
from the definition that G̃(μ − αμ) = ∫_0^∞ e^{−(1−α)μt} dG(t), where α appears
only in the exponent, and G̃(μ − αμ) is a nondecreasing convex function of α.
Also, G̃(0) = 1 and hence one solution to α = G̃((1 − α)μ) is indeed α = 1.
With these properties of G̃(μ − αμ) in mind, refer to Figure 4.1. We plot
G̃(μ − αμ) versus α as well as the 45° line, that is, the function f(α) = α. Since
G̃(μ − αμ) is nondecreasing and convex, it would intersect the 45° line at two
points. If the slope of G̃(μ − αμ) at α = 1 is greater than 1, then G̃(μ − αμ)
would intersect the 45° line once at some α ∈ (0, 1). This is depicted on the
LHS of Figure 4.1. However, if the slope is less than 1, then the intersection
occurs at some α ≥ 1, depicted on the RHS of Figure 4.1. In fact, if the slope is
exactly 1, then the two points of intersection merge into one point and the
45° line just becomes a tangent (we do not show this in Figure 4.1). Therefore,
FIGURE 4.1
Two possibilities for G̃(μ − μα) vs. α (both panels plot G̃(μ − μα) against α from 0 to 1, together with the 45° line).
the condition for stability is that the slope of G̃(μ − αμ) at α = 1 must be
greater than 1. For this, we compute dG̃(μ − αμ)/dα, let α = 1, and require that
−μG̃′(0) > 1. However, G̃′(0) = −1/λ since the first moment, the mean interarrival
time, is 1/λ. Therefore, the condition for stability of the G/M/1 queue, or
equivalently the condition for positive recurrence of the irreducible DTMC
{X_n∗, n ≥ 0}, is

    ρ = λ/μ < 1.
When the queue is stable, the limiting distribution exists and is given by

    π∗_j = (1 − α)α^j,

which is the unique solution to the DTMC balance equations, where α is the
solution in (0, 1) to

    α = G̃(μ − μα).
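In practice α can be computed by successive substitution on α = G̃(μ − μα), starting from 0; the map is increasing, so the iterates climb monotonically to the root in (0, 1) when the queue is stable. This sketch (with an exponential-interarrival check against the M/M/1 result, our own) illustrates:

```python
def solve_alpha(G_lst, mu, tol=1e-12, max_iter=10_000):
    """Fixed-point iteration for alpha = G~(mu*(1 - alpha)), started at 0
    so that it converges to the root in (0, 1) for a stable queue."""
    alpha = 0.0
    for _ in range(max_iter):
        new = G_lst(mu * (1.0 - alpha))
        if abs(new - alpha) < tol:
            return new
        alpha = new
    return alpha

lam, mu = 1.0, 1.5
exp_lst = lambda s: lam / (lam + s)     # LST of exponential(lam) interarrivals
alpha = solve_alpha(exp_lst, mu)
print(alpha)   # should be close to lam/mu = 2/3 (M/M/1 sanity check)
```

Once α is known, quantities such as W = 1/(μ(1 − α)) below follow immediately.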
Problem 33
For a stable G/M/1 queue with FCFS service policy, using the preceding
results derive the distribution for the sojourn time spent by a customer in
the system.
Solution
Let Y be the sojourn time experienced by an arbitrary arrival into the system
in steady state. This is also referred to as the waiting time or time in the sys-
tem. Since we know that in steady state the probability an arriving customer
sees j in the system is π∗j , we can obtain the LST of the distribution of Y by
conditioning on the number in the system as seen by an arrival. Therefore,
we have the LST as

    E[e^{−sY}] = Σ_{j=0}^∞ E[e^{−sY} | X∗_∞ = j] π∗_j.
Given that a customer sees j others in the system upon arrival, the sojourn
time for this customer is the sum of the service times of the j customers ahead
as well as the customer's own service time. Hence, the conditional sojourn
time is the sum of j + 1 exponentials with parameter μ (i.e., distributed
according to Erlang(j + 1, μ)). Thus, we have
    E[e^{−sY}] = Σ_{j=0}^∞ E[e^{−sY} | X∗_∞ = j] π∗_j = Σ_{j=0}^∞ (μ/(μ + s))^{j+1} π∗_j
              = Σ_{j=0}^∞ (μ/(μ + s))^{j+1} (1 − α)α^j
              = (1 − α)(μ/(μ + s)) Σ_{j=0}^∞ (μα/(μ + s))^j = μ(1 − α)/(s + μ(1 − α))
where the last equation uses the fact that since 0 < α < 1 and 0 < μ/(μ + s) < 1,
the infinite geometric sum converges. Therefore, Y ∼ exp(μ(1 − α)), that is,
the sojourn time is exponentially distributed with parameter μ(1 − α).
Using the preceding results, we can immediately see that the average
time in the system (sojourn time or waiting time) is
    W = E[Y] = 1/(μ(1 − α)).
Then, from Little’s law we have the average number of customers in the
system as
    L = λ/(μ(1 − α)).
Problem 34
Consider a stable G/M/1 queue and let X(t) be the number of customers in
the system at time t. Define pj as the probability that there are j in the system
in steady state, that is, p_j = lim_{t→∞} P{X(t) = j}. Show that

    p_0 = 1 − ρ,
    p_j = ρπ∗_{j−1} for j > 0,

where ρ = λ/μ.
Solution
Let A_n be the time of the nth arrival into the system, with A_0 = 0. The bivariate
stochastic process {(X_n∗, A_n), n ≥ 0} is a Markov renewal sequence with
kernel K(x) = [K_ij(x)] such that K_ij(x) = P{X∗_{n+1} = j, A_{n+1} − A_n ≤ x | X_n∗ = i}.
Actually, the expression for the kernel is not necessary; all we need is to
realize that π∗ = [π∗_j] satisfies π∗ = π∗K(∞), and we already have π∗_j. Also,
since the arrivals are renewal, we have the conditional expected sojourn
times β_j = E[A_{n+1} − A_n | X_n∗ = j] = 1/λ for all j. Now, the stochastic process
{X(t), t ≥ 0} is a Markov regenerative process (MRGP) with embedded
Markov renewal sequence {(X_n∗, A_n), n ≥ 0}. Therefore, from MRGP theory
(see Section B.3.3), we have
    p_j = (Σ_{i=0}^∞ π∗_i γ_ij) / (Σ_{i=0}^∞ π∗_i β_i)
where γij is the expected time spent in state j between time An and An+1
given that Xn∗ = i. We first compute γij and then derive pj . For that, we use the
indicator function IA , which is one if A is true and zero if A is false. Using the
definition of γij , the indicator function, and properties of Poisson processes,
we can derive the following:
    γ_ij = E[∫_{A_n}^{A_{n+1}} I_{X(t)=j} dt | X_n∗ = i]
         = ∫_0^∞ E[∫_0^u I_{X(t)=j} dt | X_0∗ = i, A_1 = u] dG(u)
         = ∫_0^∞ ∫_0^u P{X(t) = j | X(0) = i + 1} dt dG(u)
         = ∫_0^∞ ∫_0^u e^{−μt} (μt)^{i+1−j}/(i + 1 − j)! dt dG(u)

(just after the nth arrival there are i + 1 customers in the system)
for i + 1 ≥ j and j ≥ 1. For a given j such that j ≥ 1, from MRGP theory (and
using the fact that βi = 1/λ and γij = 0 if i + 1 < j), we have
    p_j = (Σ_{i=0}^∞ π∗_i γ_ij) / (Σ_{i=0}^∞ π∗_i β_i)
        = [Σ_{i=j−1}^∞ π∗_i ∫_0^∞ ∫_0^u e^{−μt} (μt)^{i+1−j}/(i + 1 − j)! dt dG(u)] / [Σ_{i=0}^∞ π∗_i (1/λ)]
        = λ ∫_0^∞ ∫_0^u e^{−μt} Σ_{i=j−1}^∞ (1 − α)α^i (μt)^{i+1−j}/(i + 1 − j)! dt dG(u)
        = λ ∫_{u=0}^∞ ∫_{t=0}^u e^{−μt} (1 − α)α^{j−1} e^{αμt} dt dG(u)
        = (λ/μ) ∫_{u=0}^∞ ∫_{t=0}^u e^{−(1−α)μt} (1 − α)μ α^{j−1} dt dG(u)
        = (λ/μ) ∫_{u=0}^∞ [1 − e^{−(1−α)μu}] α^{j−1} dG(u)
        = (λ/μ) α^{j−1} (1 − G̃(μ(1 − α)))
        = (λ/μ) α^{j−1} (1 − α) = (λ/μ) π∗_{j−1}

where the last line uses the fact that α = G̃(μ(1 − α)). Therefore, for j ≥ 1,
p_j = ρπ∗_{j−1} where ρ = λ/μ. From this, p_0 can be easily computed as
p_0 = 1 − Σ_{j≥1} p_j = 1 − ρ.
Problem 35
Using the results derived for the G/M/1 queue, obtain α, π∗j , pj , P{Y ≤ y}, L,
and W when G(t) = 1 − e−λt for t ≥ 0 when the queue is stable.
Solution
Note that the interarrival times are exponentially distributed; therefore,
some of our results can be verified using those of the M/M/1 queue and
the reader is encouraged to do that. The LST of the interarrival time is
G̃(s) = λ/(λ + s). Therefore, we solve for α in α = G̃(μ − αμ), that is,
α = λ/(λ + (1 − α)μ). We get two solutions to that quadratic equation. Since
we require α ∈ (0, 1), we do not consider the solution α = 1. However, we
know the queue is stable (hence λ/μ < 1) and thus we have

    α = λ/μ.
Hence,

    π∗_j = (1 − α)α^j = (1 − λ/μ)(λ/μ)^j,

and

    p_0 = 1 − ρ = π∗_0,
    p_j = ρπ∗_{j−1} = (1 − ρ)ρ^j = π∗_j for j > 0.
Therefore, pj = π∗j for all j, which is not surprising due to PASTA. Also notice
that pj is identical to what was derived in the M/M/1 queue analysis. Further,
    W = 1/(μ − λ),
    L = λ/(μ − λ).
In a similar manner, one can obtain the preceding expressions for other
interarrival time distributions as well, some of which are given in the
exercises at the end of the chapter. Before wrapping up this section on
DTMC-based analysis, it is worthwhile to describe one more example. This
is the G/M/2 queue. It is crucial to point out that the generic G/M/s queue
can be analyzed in a similar fashion. However, analyzing the M/G/s
queue using a DTMC is quite intractable for s ≥ 2. The reason is that the
G/M/s queue, if observed at arrivals, is Markovian, whereas the M/G/s
queue observed at departures is not Markovian. Now to the G/M/2 queue.
Problem 36
Consider a G/M/2 queue. Let Xn∗ be the number of customers just before
the nth arrival. Show that {X_n∗, n ≥ 0} is a DTMC by computing the transition
probability matrix. Derive the condition for stability and the limiting
distribution for X_n∗.
Solution
We begin with some notation. Let aj be the probability that j departures occur
between two arrivals when both servers are working throughout the time
between the two arrivals. Then
    a_j = ∫_0^∞ e^{−2μt} (2μt)^j/j! dG(t).
Let cj be the probability that j departures occur between two arrivals where
both servers are working until the jth departure, after which only one server
is working but does not complete service. Then
    c_0 = ∫_0^∞ e^{−μt} dG(t) = G̃(μ)
for j > 0,
    c_j = ∫_0^∞ ∫_0^t e^{−μ(t−s)} e^{−2μs} (2μs)^{j−1}/(j − 1)! 2μ ds dG(t)
        = 2^j ∫_0^∞ e^{−μt} {1 − e^{−μt} Σ_{i=0}^{j−1} (μt)^i/i!} dG(t),

and b_j is given by

    b_j = 1 − c_j − Σ_{i=1}^{j} a_{i−1}.
Then

        | b0  c0  0   0   · · · |
        | b1  c1  a0  0   · · · |
    P = | b2  c2  a1  a0  · · · |
        | b3  c3  a2  a1  · · · |
        | b4  c4  a3  a2  · · · |
        |  .   .   .   .    .   |
Let π∗j be the limiting probability that in steady state an arriving customer
sees j in the system, that is,
    π∗_j = lim_{n→∞} P{X_n∗ = j}.

Proceeding as in the G/M/1 analysis, one can show the following:
1. The solution π∗_j = βα^j works for j > 0 if there is a unique solution
   to α = G̃(2μ(1 − α)) such that α ∈ (0, 1); the existence of such a
   solution is the condition for stability and can be written as λ/(2μ) < 1.
2. Also, π∗_0 = βα[1 − 2G̃(μ)]/[(1 − 2α)G̃(μ)].
3. Therefore, using Σ_{j=0}^∞ π∗_j = 1, we can derive that
   β = (1 − α)(1 − 2α)G̃(μ)/[α(1 − α) − αG̃(μ)]. This can be used to compute
   π∗ = [π∗_0 π∗_1 . . .].
It is worthwhile to point out that one can obtain the distribution of the
sojourn time in the system using an analysis similar to that in Problem 33.
That is left as an exercise for the reader.
    L = lim_{n→∞} E[X_n],
    U = lim_{n→∞} E[U_n].
Having described the notation, now we are ready for MVA. We first write
down a relation between Xn and Xn+1 . If Xn > 0, then Xn+1 = Xn − 1 + Un+1
because the number in the system as seen by the n + 1st departure is equal
to what the nth departure sees (which is Xn and also includes the n + 1st
customer since Xn > 0) plus all the customers that arrived during the service
of the n + 1st customer minus one (since only the number remaining in the
system is described in Xn+1 ). However, if Xn = 0, then Xn+1 = Un+1 since
when the nth customer departs, the system becomes empty, and then the
n + 1st customer arrives and starts getting served immediately, and all the
customers that showed up during that service would remain when the n+1st
customer departs. Thus, we can write down the following relation:

    X_{n+1} = (X_n − 1)I_{X_n > 0} + U_{n+1}.   (4.12)

Taking expected values on both sides, and writing

    E[(X_n − 1)I_{X_n > 0}] = E[(X_n − 1)^+]
        = E[(X_n − 1)^+ | X_n > 0]P(X_n > 0) + E[(X_n − 1)^+ | X_n = 0]P(X_n = 0)
        = E[X_n] − P(X_n > 0),

we get E[X_{n+1}] = E[X_n] − P(X_n > 0) + E[U_{n+1}]. Letting n → ∞, the E[X_n]
terms cancel, leaving

    U = 1 − π_0,

where π_0 = lim_{n→∞} P(X_n = 0).
Also,

    U = lim_{n→∞} E[U_n] = lim_{n→∞} E[E[U_n | S_n]] = lim_{n→∞} E[λS_n] = λ/μ,

and hence

    π_0 = 1 − ρ.
Notice that the preceding equation was derived in the M/G/1 analysis using
DTMCs. Also, the condition 0 < π0 < 1 implies ρ < 1, which is the stability
condition, and thus L < ∞.
Continuing with the MVA by squaring Equation 4.12, we get

    X²_{n+1} = (X_n − 1)² I_{X_n > 0} + 2U_{n+1}(X_n − 1)I_{X_n > 0} + U²_{n+1},

where I_{X_n > 0} is an indicator function that is one if X_n > 0 and zero otherwise.
Taking the expected value of the preceding equation, we get

    E[X²_{n+1}] = E[(X²_n − 2X_n + 1)I_{X_n > 0}] + 2E[U_{n+1}]E[(X_n − 1)I_{X_n > 0}] + E[U²_{n+1}]   (4.13)
since U_{n+1} is independent of (X_n − 1)I_{X_n > 0}. We derive each term of the RHS
of Equation 4.13 separately, starting from the right extreme. Conditioning on
the service time of the n + 1st customer, we get

    E[U²_{n+1}] = E[E[U²_{n+1} | S_{n+1}]] = E[Var[U_{n+1} | S_{n+1}] + {E[U_{n+1} | S_{n+1}]}²]
               = E[λS_{n+1} + λ²S²_{n+1}] = λE[S_{n+1}] + λ²E[S²_{n+1}] = ρ + λ²σ² + ρ².

Using an argument identical to the one used earlier to compute E[(X_n − 1)^+]
(see the expressions following Equation 4.12), we have the middle term
E[(X_n − 1)I_{X_n > 0}] = E[X_n] − P(X_n > 0). Of course, we also saw
earlier that E[Un+1 ] = ρ, which leaves us with the first expression that can be
derived as follows:
    E[(X²_n − 2X_n + 1)I_{X_n > 0}]
        = E[(X²_n − 2X_n + 1)I_{X_n > 0} | X_n > 0]P(X_n > 0) + E[(X²_n − 2X_n + 1)I_{X_n > 0} | X_n = 0]P(X_n = 0)
        = E[X²_n − 2X_n + 1 | X_n > 0]P(X_n > 0)
        = E[X²_n − 2X_n | X_n > 0]P(X_n > 0) + P(X_n > 0)
        = E[X²_n − 2X_n | X_n > 0]P(X_n > 0) + E[X²_n − 2X_n | X_n = 0]P(X_n = 0) + P(X_n > 0)
        = E[X²_n − 2X_n] + P(X_n > 0),

where the second-to-last step uses E[X²_n − 2X_n | X_n = 0] = 0.
Putting the three terms together,

    E[X²_{n+1}] = E[X²_n] − 2E[X_n] + P(X_n > 0) + 2ρE[X_n] − 2ρP(X_n > 0) + ρ + λ²σ² + ρ².

Taking the limit as n → ∞, canceling the LHS with the first term on the RHS,
and rearranging the terms, we get

    L = ρ + (λ²σ² + ρ²)/(2(1 − ρ)).
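The Pollaczek–Khinchine mean-value formula just derived is straightforward to evaluate; a small sketch with an M/M/1 sanity check (parameter values are ours):

```python
def mg1_L(lam, ES, VarS):
    """Mean number in system for a stable M/G/1 queue:
    L = rho + (lam^2 * sigma^2 + rho^2) / (2 (1 - rho))."""
    rho = lam * ES
    assert rho < 1.0, "queue must be stable (rho < 1)"
    return rho + (lam**2 * VarS + rho**2) / (2.0 * (1.0 - rho))

# M/M/1 check: with VarS = 1/mu^2 the formula collapses to rho/(1-rho).
lam, mu = 1.0, 1.5
print(mg1_L(lam, 1.0 / mu, 1.0 / mu**2))   # rho = 2/3, so L should be 2
```

For deterministic service (VarS = 0) the same function gives the smaller M/D/1 value, illustrating how service-time variability inflates congestion.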
    λ = 1/E[T_n],
    μ = 1/E[S_n],
TABLE 4.1
Notation for the G/G/1 MVA
An Time of nth arrival
Tn+1 = An+1 − An The n + 1st interarrival time
Sn Service time of the nth customer
Wn Time spent (sojourn) in the system by the nth customer
Dn = An + Wn Time of nth departure
In+1 = (An+1 − An − Wn )+ Idle time between nth and n + 1st service
    C_a² = Var[T_n]/(E[T_n])² = λ² Var[T_n],
    C_s² = Var[S_n]/(E[S_n])² = μ² Var[S_n],

    W = lim_{n→∞} E[W_n],
    I_d = lim_{n→∞} E[I_n],
    I(2) = lim_{n→∞} E[I_n²].
Using all the preceding definitions, we carry out MVA by first writing down
the sojourn time of the n + 1st customer: W_{n+1} equals that customer's
service time plus any time the customer spent waiting for service to begin
(this happens if the customer arrived before the previous one departed). In
other words,

    W_{n+1} = S_{n+1} + (W_n − T_{n+1})^+.

Using the definitions of D_n, T_n, and I_n in Table 4.1, and the identity
x^+ = x + (−x)^+, we can write down the following set of equations:

    I_{n+1} = (T_{n+1} − W_n)^+,
    W_{n+1} = S_{n+1} + W_n − T_{n+1} + I_{n+1}.

Now, by letting n → ∞ and using the notation defined earlier, we see that

    W = 1/μ + W − 1/λ + I_d.
Hence

    I_d = 1/λ − 1/μ = (1 − ρ)/λ.

To proceed, note the following facts:
1. Recall the definitions of Wn+1 , Sn+1 , and In+1 . Notice that Wn+1 −
Sn+1 is the time the n + 1st customer waits for service to begin and
In+1 is the idle time between serving the nth and n + 1st customers.
Based on that, we have (Wn+1 − Sn+1 )In+1 = 0, since when there is a
nonzero idle time, the n + 1st customer does not wait for service and
vice versa.
2. The time a customer waits for service to begin is independent of the
service time of that customer, hence (Wn+1 − Sn+1 ) is independent
of Sn+1 .
3. Also, the sojourn time of the nth customer is independent of the
time between the nth and n + 1st arrivals. Hence, we have Wn
independent of Tn+1 .
Now, squaring the relation W_{n+1} − S_{n+1} − I_{n+1} = W_n − T_{n+1}, using the
facts that (W_{n+1} − S_{n+1})I_{n+1} = 0, that (W_{n+1} − S_{n+1}) and S_{n+1} are
independent, and that W_n is independent of T_{n+1}, and taking expected
values, we get

    E[W²_{n+1}] − 2E[W_{n+1} − S_{n+1}]E[S_{n+1}] − E[S²_{n+1}] + E[I²_{n+1}]
        = E[W²_n] − 2E[W_n]E[T_{n+1}] + E[T²_{n+1}].
Notice that E[T_{n+1}] = 1/λ, E[S_{n+1}] = 1/μ, E[T²_{n+1}] = (C_a² + 1)/λ², and
E[S²_{n+1}] = (C_s² + 1)/μ². Making those substitutions and taking the limit as
n → ∞ (so that the E[W²] terms cancel), we get

    W = [1 + C_a² + ρ²(C_s² − 1) − λ² I(2)] / (2λ(1 − ρ)),   (4.14)

where ρ = λ/μ.
The only unknown quantity in the preceding expression for W is I(2) .
Therefore, suitable bounds and approximations for W can be obtained by
cleverly bounding and approximating I(2) . Section 4.3 is devoted to bounds
and approximations for queues, and to obtain some of those bounds, we will
use Equation 4.14. However, for the sake of completing this analysis, we
present a simple upper bound for W. Since the variance of the idle time for a
server between customers must be nonnegative, we have I(2) ≥ (I_d)² = (1 − ρ)²/λ².
Thus, we have −λ²I(2) ≤ −(1 − ρ)², and plugging into Equation 4.14, we get

    W ≤ 1/μ + (ρ²C_s² + C_a²)/(2λ(1 − ρ)).
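This bound can be coded directly from λ, μ, C_a², and C_s²; a sketch (the M/M/1 comparison is our own):

```python
def gg1_W_upper_bound(lam, mu, Ca2, Cs2):
    """Upper bound on mean sojourn time W for a stable G/G/1 queue,
    using only the first two moments of interarrival and service times."""
    rho = lam / mu
    assert rho < 1.0, "queue must be stable (rho < 1)"
    return 1.0 / mu + (rho**2 * Cs2 + Ca2) / (2.0 * lam * (1.0 - rho))

# For M/M/1 (Ca2 = Cs2 = 1) the exact value is W = 1/(mu - lam) = 2,
# while the bound evaluates to 1/mu + (rho^2 + 1)/(2 lam (1 - rho)) = 17/6.
print(gg1_W_upper_bound(1.0, 1.5, 1.0, 1.0))
```

The gap between 17/6 ≈ 2.83 and the exact 2 illustrates the remark below that this bound is quite weak.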
A key point to notice is that the preceding bound only depends on the mean
and variance of the interarrival times and service times. Therefore, we really
do not need the entire distribution information. Of course, the preceding
bound for W is quite weak, and one can obtain much better bounds and
approximations, which we describe in Section 4.3; these too use only λ, μ,
C_a², and C_s². However, before that we present another result for the
G/G/1 queue using the MVA results.
    lim_{n→∞} E(V_{n+1}) = I_d + 1/μ = 1/λ − 1/μ + 1/μ = 1/λ.
This is not a surprising result, as when the queue is stable the average depar-
ture rate is the same as the average arrival rate as no customers are created
or destroyed in the queue (see conservation law in Section 1.2.1).
The SCOV of the interdeparture times C2d is a little more involved. For
that, we go back to Equation 4.15. Since In+1 is independent of Sn+1 , tak-
ing variance on both sides of Equation 4.15 we get Var(Vn+1 ) = Var(In+1 ) +
Var(Sn+1 ). By letting n → ∞ we obtain
    lim_{n→∞} Var(V_{n+1}) = I(2) − I_d² + C_s²/μ².
However, using the definition of C_d² and substituting for I(2) from
Equation 4.14, we get

    C_d² = lim_{n→∞} Var(V_{n+1})/(E[V_{n+1}])² = λ²(I(2) − I_d²) + ρ²C_s²
         = C_a² + 2ρ²C_s² + 2ρ(1 − ρ) − 2λ(1 − ρ)W.
The reason this is written in terms of W is that now we only need a good
approximation or bound for W. Once we have that, we get a good bound
or approximation for C2d automatically. Hence, in the next section we mainly
focus on obtaining bounds and approximations for only W.
We begin with the single server G/G/1 queue and continue from where we
left off in the previous section. Then we show bounds and approximations
for multiserver queues for the remainder of this section.
    L = λW,
    L_q = λW − ρ,
    W_q = W − 1/μ.
    W ≤ 1/μ + (ρ²C_s² + C_a²)/(2λ(1 − ρ)).
    h(x) = F′(x)/(1 − F(x))

where F(x) = P{X ≤ x} is the CDF and F′(x) its derivative. Some random
dom variables are such that h(x) increases with x and they are called
IFR (increasing failure rate) random variables. There are also some
random variables such that h(x) decreases with x and they are called
DFR (decreasing failure rate) random variables. Of course, there are
many random variables that are neither IFR nor DFR, and the following
result cannot be used for those. For two positive-valued random
variables Y and Z with Y independent of Z, if Y is IFR, we have

    E[{(Y − Z)^+}²] ≤ (E[Y²]/E[Y]) E[(Y − Z)^+],

with the inequality reversed if Y is DFR.
Actually, the preceding results do not require IFR or DFR but a much
weaker condition (that they be decreasing or increasing mean residual
life, respectively). However, we use the stronger requirement of IFR or
DFR. Applying the result with Y = T_{n+1} and Z = W_n (so that
I_{n+1} = (Y − Z)^+), and letting n → ∞, we get I(2) ≤ (C_a² + 1)I_d/λ if
interarrival times are IFR, and I(2) ≥ (C_a² + 1)I_d/λ if interarrival times
are DFR. Plugging into Equation 4.14, we get

    W ≥ [ρ(C_a² − 1 + ρ) + ρ²C_s²]/(2λ(1 − ρ)) + 1/μ   if interarrival times are IFR,
    W ≤ [ρ(C_a² − 1 + ρ) + ρ²C_s²]/(2λ(1 − ρ)) + 1/μ   if interarrival times are DFR.
Approximation       W

    1.  W ≈ [ρ²(1 + C_s²)/(2λ(1 − ρ))] · [(C_a² + ρ²C_s²)/(1 + ρ²C_s²)] + 1/μ
    2.  W ≈ [ρ(1 + C_s²)/(2λ(1 − ρ))] · [(ρ(2 − ρ)C_a² + ρ²C_s²)/(2 − ρ + ρC_s²)] + 1/μ
    3.  W ≈ ρ²(C_a² + C_s²)/(2λ(1 − ρ)) + (1 − C_a²)C_a²ρ/(2λ) + 1/μ
There are other approximations for heavy-traffic queues that we will see
in the G/G/s setting where one can use s = 1 and get G/G/1 approximations.
The reader is encouraged to review those approximations as well. The lit-
erature also has several empirical approximations. Care must be taken to
ensure that the test cases that were used to obtain the empirical approxi-
mations and conclusions are identical to those considered by the reader. It is
worthwhile to point out that the steady-state mean waiting time and number
in the system can also be obtained using simulations that we use for testing
our approximations. In fact, one does not even need sophisticated simulation
software for that; we explain this next using an example.
Problem 37
For a G/G/1 queue, develop an algorithm to simulate and obtain the mean
number in the system in steady-state, given the CDF of interarrival times F(t)
and service times G(t).
Solution
Clearly from the problem description, for all n ≥ 0, F(t) = P{T_n ≤ t} and
G(t) = P{S_n ≤ t}. Using U_n and V_n as uniform(0, 1) random variables,
which come standard in any computational package, we can obtain samples
of T_n and S_n as F^{−1}(U_n) and G^{−1}(V_n), respectively. Notice that F^{−1}(·)
is the inverse of the function F(·); for example, if F(t) = 1 − e^{−λt}, then
T_n = F^{−1}(U_n) = (−1/λ)log_e(1 − U_n). Now we describe the following
algorithm using T_n and S_n for all n:
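A minimal version of such an algorithm might look as follows. This particular implementation is a sketch of our own: it uses the FCFS recursion D_n = max(A_n, D_{n−1}) + S_n for departure times, and the fact that the area under X(t) equals the sum of the customers' sojourn times, so the time-average number in system is that area divided by the elapsed time.

```python
import random, math

def simulate_L(Finv, Ginv, n=200_000, seed=7):
    """Estimate the steady-state mean number in system L of a G/G/1 queue.
    Finv and Ginv are the inverse CDFs of interarrival and service times
    (inverse-transform sampling: T = Finv(U), S = Ginv(V))."""
    rng = random.Random(seed)
    A = 0.0      # arrival time of current customer
    D = 0.0      # departure time of previous customer
    area = 0.0   # accumulated area under X(t)
    for _ in range(n):
        A += Finv(rng.random())        # T_n = F^{-1}(U_n)
        S = Ginv(rng.random())         # S_n = G^{-1}(V_n)
        D = max(A, D) + S              # FCFS, single server
        area += D - A                  # this customer's sojourn time
    return area / A                    # A is (approximately) elapsed time

# Sanity check with M/M/1 (lam = 1, mu = 2): exact L = rho/(1-rho) = 1.
lam, mu = 1.0, 2.0
Finv = lambda u: -math.log(1.0 - u) / lam
Ginv = lambda u: -math.log(1.0 - u) / mu
Lhat = simulate_L(Finv, Ginv)
print(Lhat)
```

Any interarrival or service distribution with a computable inverse CDF can be plugged in for `Finv` or `Ginv`, which is exactly what Problem 38 below requires for the Pareto case.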
Problem 38
Consider a G/G/1 queue where the service times are IID uniform random
variables between 0 and 2/μ. Obtain the mean waiting time using simu-
lations when interarrival times (Tn ) are according to a Pareto distribution
whose CDF is
    P(T_n ≤ x) = 1 − (K/x)^β if x ≥ K, and 0 otherwise,

with parameters β = 1 + √2 and K = (β − 1)/(βλ). Show that C_a² = 1 and then
compare the mean waiting time with that of the M/G/1 queue. Use λ = 10
and μ = 15.
Solution
First let us analyze the arrival process. Using the CDF we can compute

    E[T_n] = Kβ/(β − 1) = 1/λ

when β > 1 (which is needed for the first equality and is the case here), and

    Var[T_n] = K²β/((β − 1)²(β − 2)) = 1/λ²

when β > 2 (also the case here, since β = 1 + √2 ≈ 2.414). Hence
C_a² = Var[T_n]/(E[T_n])² = 1/(β(β − 2)) = 1/((1 + √2)(√2 − 1)) = 1.
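The moment computations can be checked numerically by plain evaluation of the formulas (a quick sketch):

```python
from math import sqrt

beta = 1.0 + sqrt(2.0)
lam = 10.0
K = (beta - 1.0) / (beta * lam)

ET = K * beta / (beta - 1.0)                            # should be 1/lam
VarT = K**2 * beta / ((beta - 1.0)**2 * (beta - 2.0))   # needs beta > 2
Ca2 = VarT / ET**2                                      # = 1/(beta*(beta-2))

print(ET, VarT, Ca2)   # expect 0.1, 0.01, and 1 respectively
```

The SCOV equals 1 because (1 + √2)(√2 − 1) = 1 exactly, which is precisely why this value of β was chosen for the comparison with the M/G/1 queue.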
Notice that I(2) is the second moment of the server idle time between successive
arrivals. However, under heavy traffic, the server experiences nonzero
idle time only for a small fraction (1 − ρ) of the time. Therefore, a reasonable
heavy-traffic approximation as ρ approaches one for the G/G/1 queue is

    W ≈ 1/μ + (ρ²C_s² + C_a²)/(2λ(1 − ρ)).
    W_q ≈ W_{q,G/G/1} (W_{q,M/M/s} / W_{q,M/M/1})
done. So the owner of TravHelp decided to call one of his former classmates
from Wharton who runs a professional consulting firm ProCon.
FIGURE 4.2
Histogram of interarrival times of customer calls (frequency vs. interarrival time in hours, ×10⁻³).
immediately realized that this was because of the 32 separate queueing sys-
tems. He soon recalled a homework problem from his queueing theory class,
where it was shown that a single queue would be more efficient than having
multiple parallel lines.
for G/G/s queues in there that would be appropriate to use. The first was a
heavy-traffic approximation

    W_q ≈ (ρ²C_s² + C_a²)/(2λ(1 − ρ)),

and the second was

    W_q ≈ (α_s/μ) (1/(1 − ρ)) (C_a² + C_s²)/(2s),

where α_s = (ρ^s + ρ)/2 when ρ > 0.7 (in this case, ρ is well above 0.7). Plugging
in the numbers, he got the average waiting time for each customer to
speak to a rep as 0.29 min and 0.13 min using the first and second formulas,
respectively.
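The two approximations Jacob used can be sketched as follows. The packaging into functions, the assumption ρ = λ/(sμ), and the M/M/1 check are our own; the call-center parameter values are not reproduced in the text, so the example below uses illustrative numbers.

```python
def wq_heavy_traffic(lam, mu, s, Ca2, Cs2):
    """Heavy-traffic approximation for Wq in a G/G/s queue (rho = lam/(s mu))."""
    rho = lam / (s * mu)
    return (rho**2 * Cs2 + Ca2) / (2.0 * lam * (1.0 - rho))

def wq_alpha_s(lam, mu, s, Ca2, Cs2):
    """Second approximation: Wq ~ (alpha_s/mu)(1/(1-rho))(Ca2+Cs2)/(2s),
    with alpha_s = (rho^s + rho)/2, suggested for rho > 0.7."""
    rho = lam / (s * mu)
    alpha_s = (rho**s + rho) / 2.0
    return (alpha_s / mu) * (1.0 / (1.0 - rho)) * (Ca2 + Cs2) / (2.0 * s)

# For s = 1 and Ca2 = Cs2 = 1 the second formula reduces to the exact
# M/M/1 value Wq = lam/(mu(mu - lam)):
print(wq_alpha_s(1.0, 2.0, 1, 1.0, 1.0))   # 0.5, the exact M/M/1 Wq
```

Note that the two approximations generally disagree, which is consistent with the 0.29 min versus 0.13 min figures Jacob obtained.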
Although Jacob realized these were approximations, he felt that they
were clearly lower than the 2 min wait that the current system customers
experience on average. Jacob wondered what would happen with fewer
reps. He checked for s = 116 reps (a reduction of 6 reps) and the average
wait times for customers to speak to a rep became 0.82 and 0.41 min using the
first and second formulas, respectively. Jacob was thrilled; he looked at his
watch and there was enough time to grab a quick latte from a nearby coffee
shop before his meeting. At the brainstorming meeting, Jacob presented his
recommendation, which is to consolidate the 32 clusters into a single large
cluster. When a customer service call arrives, it would go to a free rep if one
is available, otherwise the call would be put on hold until a rep became free.
Jacob also suggested reducing the number of reps to 116. His colleagues liked
the idea. One of them also added another recommendation: to use a monitor
to display the number of customers on hold that all the reps can see. That
way, if the number of customers on hold is zero, then the reps who are busy
can spend more time talking to their customers, projecting greater concern,
whereas if there is a large number of calls on hold, reps can quickly wrap
up their calls. Jacob liked the idea, and when he saw his latte he realized
his coffee shop also adopts a similar notion where if there is a long line, the
orders are taken quickly and if it is empty, the workers spend time chatting
with customers.
Jacob rechecked all his calculations to make sure everything was alright.
Then he proceeded to TravHelp. He made the recommendations that were
discussed. TravHelp decided to adopt them but continue with the 122 reps.
It was an easy redesign for them and they assigned calls to available reps
in some fair round-robin fashion. TravHelp also decided to use monitors
that were spread throughout the call center and reps knew how long the
lines were at all times. TravHelp monitored their system as usual and also
collected data electronically as they had done before. Jacob told TravHelp he
would return in a week to see how things were going and analyze the data
to see the actual improvements.
the managers in TravHelp that the geographic clusters were mainly for per-
sonnel reasons (reps started at different times and to accommodate that they
used different time zones). So this time Jacob carefully redesigned the system
with two layers of reps. In the first layer, he recommended a set of 20 reps
that made the initial contact to determine the appropriate cluster to forward
the call. These calls lasted less than 30 s each. At the second layer, there were
23 specialized clusters each with 3–5 reps, as well as a large pool of 40 reps.
The specialized small clusters were for the large volume of quick calls that
were either client specific or for a single service type. The remaining calls
were being handled by the large pool.
To develop this design, Jacob had to perform several what-if analyses
and used queueing approximations to obtain quick results. He also worked
closely with the managers and reps at TravHelp to understand the implica-
tions and estimate quantities for service times. Finally, before implementing
the solution, Jacob developed a simulation of the system. It revealed that the
average wait time of customers (not including the time they spend speaking
to reps) was less than a minute. However, if one were to classify customers
into groups, those that require longer service times and those that require
short ones, then the wait times were larger for the former set of customers.
But this was in line with customers’ expectations. Thus, this differentiated
service was palatable to customers as well as clients and the reps’ morale was
restored. TravHelp implemented the new system and the results matched
those predicted by Jacob’s simulations. Jacob was delighted about that. He
was also appreciative of the use of queueing approximations for doing quick
analysis. And last but not least, he realized the importance of considering
behavioral aspects while making decisions, the criticality of talking to
the individuals involved to understand the situation better, and finally that
customers' perceptions are a crucial thing to consider.
way. However, for a simple system like a G/G/s queue, it is fairly compu-
tationally intensive. But this is a powerful technique that can be effectively
used in a wide variety of applications beyond queueing.
where T is an m × m matrix, T∗ an m × 1 vector, and 0 a 1 × m vector of zeros.
The density is

    f(y) = dF(y)/dy = p exp(Ty) T∗.
Notice the use of the exponential of a matrix, which is defined for a square
matrix M as

    exp(M) = I + M + M²/2! + M³/3! + · · ·
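A sketch of this definition in code, applied to the phase-type density f(y) = p exp(Ty)T∗ for a hypoexponential (Erlang-2) example of our choosing. The truncated power series is adequate for the small matrices here; a production code would use a scaling-and-squaring routine such as SciPy's `expm`.

```python
import numpy as np

def expm_series(M, terms=60):
    """exp(M) via its defining power series I + M + M^2/2! + ...
    (fine for small matrices with modest norm)."""
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k     # M^k / k!
        out = out + term
    return out

# Hypoexponential (Erlang-2) phase-type representation, rate 3 per phase:
p = np.array([1.0, 0.0])
T = np.array([[-3.0, 3.0],
              [0.0, -3.0]])
Tstar = -T @ np.ones((2, 1))    # T* = -T 1

def f(y):
    """Phase-type density f(y) = p exp(Ty) T*."""
    return (p @ expm_series(T * y) @ Tstar).item()

print(f(1.0))   # should match the Erlang(2,3) density 9 y e^{-3y} at y = 1
```

With m = 2 phases this reproduces the Erlang(2, 3) density exactly, a quick consistency check on both the series and the phase-type representation.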
The key idea is that if a positive-valued random variable X with CDF G(·)
needs to be approximated as a phase-type distribution Y with CDF F(·),
then there exists at least one m, p, and Q that would ensure that F(y) is
arbitrarily close to G(y) for all y ≥ 0. However, in practice, choosing or
finding the appropriate m, p, and Q is nontrivial. To alleviate that concern
of over-parameterization, one typically considers the following special types
of phase-type distributions (with much fewer parameters to estimate):
Before concluding this section and analyzing phase-type queues, here are
a few words in terms of fitting a positive-valued random variable X with CDF
G(·) as a phase-type distribution Y with CDF F(·). There are several papers
that discuss the selection of m, p, and Q so that the resulting phase-type dis-
tribution fits well. A recent paper (Fackrell [34]) nicely summarizes various
    L ⊗ M = [l_ij M].

For example, if L is a 3 × 3 matrix and M is a 2 × 2 matrix, then we have

            | l11m11  l11m12  l12m11  l12m12  l13m11  l13m12 |
            | l11m21  l11m22  l12m21  l12m22  l13m21  l13m22 |
    L ⊗ M = | l21m11  l21m12  l22m11  l22m12  l23m11  l23m12 |
            | l21m21  l21m22  l22m21  l22m22  l23m21  l23m22 |
            | l31m11  l31m12  l32m11  l32m12  l33m11  l33m12 |
            | l31m21  l31m22  l32m21  l32m22  l33m21  l33m22 |

The Kronecker sum is defined as

    L ⊕ M = L ⊗ I_M + I_L ⊗ M,

where I_M and I_L are identity matrices of the same orders as M and L, respectively.
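Both operations are one-liners with numpy. The example matrices and the eigenvalue check (every eigenvalue of L ⊕ M is a sum of an eigenvalue of L and one of M) are our own illustration:

```python
import numpy as np

L = np.array([[1.0, 2.0],
              [3.0, 4.0]])
M = np.array([[0.0, 5.0],
              [6.0, 7.0]])

kron_prod = np.kron(L, M)                        # L (x) M = [l_ij M]
I_L = np.eye(L.shape[0])
I_M = np.eye(M.shape[0])
kron_sum = np.kron(L, I_M) + np.kron(I_L, M)     # L (+) M

# Check: eigenvalues of L (+) M are all pairwise sums of eigenvalues.
eigs = sorted(np.linalg.eigvals(kron_sum).real)
expected = sorted((a + b).real for a in np.linalg.eigvals(L)
                              for b in np.linalg.eigvals(M))
print(np.allclose(eigs, expected))   # True
```

This eigenvalue property is what makes Kronecker sums the natural tool for generators of independent CTMCs evolving in parallel, which is exactly how they are used for the PH/PH/s queue below.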
(although if X(t) = 0, then the state is just (X(t), Z_1(t), . . . , Z_K(t))) is a CTMC.
The CTMC has lexicographically ordered states with an infinitesimal generator
(i.e., Q) matrix of the QBD block-tridiagonal form:
The matrices, Bi,j , correspond to transition rates from states where the num-
ber in the system is i to states where the number in the system is j for i, j ≤ s.
Also, A0 , A1 , and A2 are identical to those in the QBD description and valid
when there are more than s customers in the system. We determine the
matrices A0 , A1 , A2 , and Bi,j using Kronecker sums and products as follows
(for i = 1, . . . , s):
A2 = I_{mA,1 mA,2 ··· mA,K} ⊗ ⊕^s(T∗S ⊗ pS),

A1 = TA,1 ⊕ TA,2 ⊕ · · · ⊕ TA,K ⊕ ⊕^s(TS),

A0 = (T∗A,1 ⊗ pA,1 ⊕ T∗A,2 ⊗ pA,2 ⊕ · · · ⊕ T∗A,K ⊗ pA,K) ⊗ I_{(mS)^s},

where ⊕^j(M) denotes the Kronecker sum of matrix M with itself j times, that is,

⊕^2(M) = M ⊕ M and ⊕^3(M) = M ⊕ M ⊕ M.
Having modeled the system as a QBD, the next step is to calculate
the steady-state probabilities using MGM. Notice that this is just a minor
extension of the MGM described in Chapter 3 and we will just use the
same analysis here. The reader is encouraged to read Section 3.2.2 before
proceeding. First, we assume that A0 + A1 + A2 is an irreducible infinitesimal generator with stationary probability vector π (a 1 × m row vector) such that π(A0 + A1 + A2) = 0 and π1 = 1, where, for i = 0, 1, i denotes a column vector of i's.
The irreducibility assumption is automatically satisfied for phase-type dis-
tributions such as hypoexponential, hyperexponential, and Coxian. Once π
is obtained, the condition for the PH/PH/s queue to be stable is πA0 1 < πA2 1, and this usually corresponds to the total mean service rate due to all s servers being larger than the arrival rate on average.
If the queue is stable, then the next step is to find R that is the minimal
nonnegative solution to the equation
A0 + RA1 + R2 A2 = 0.
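A common way to compute R is successive substitution: rewrite the equation as R = −(A0 + R²A2)A1⁻¹ and iterate starting from R = 0. A sketch follows; the M/M/1 special case at the end (with illustrative rates) is only a sanity check, since there R reduces to the scalar ρ = λ/μ:

```python
import numpy as np

def solve_R(A0, A1, A2, tol=1e-12, max_iter=100000):
    """Minimal nonnegative solution of A0 + R A1 + R^2 A2 = 0 by successive
    substitution R <- -(A0 + R^2 A2) A1^{-1}, starting from R = 0."""
    A1_inv = np.linalg.inv(A1)
    R = np.zeros_like(A0)
    for _ in range(max_iter):
        R_new = -(A0 + R @ R @ A2) @ A1_inv
        if np.max(np.abs(R_new - R)) < tol:
            return R_new
        R = R_new
    raise RuntimeError("R iteration did not converge")

# Sanity check on the M/M/1 queue viewed as a QBD with 1 x 1 blocks:
# A0 = [lambda], A1 = [-(lambda + mu)], A2 = [mu]; minimal solution R = lambda/mu.
lam, mu = 1.0, 2.0
R = solve_R(np.array([[lam]]), np.array([[-(lam + mu)]]), np.array([[mu]]))
print(R)  # approximately [[0.5]]
```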
p0 B0,0 + p1 B1,0 = 0
p0 B0,1 + p1 B1,1 + p2 B2,1 = 0
p1 B1,2 + p2 B2,2 + p3 B3,2 = 0
p2 B2,3 + p3 B3,3 + p4 B4,3 = 0
⋮
ps−2 Bs−2,s−1 + ps−1 Bs−1,s−1 + ps Bs,s−1 = 0
ps−1 Bs−1,s + ps A1 + ps RA2 = 0
p0 1 + p1 1 + · · · + ps−1 1 + ps (I − R)−1 1 = 1
lim_{t→∞} P{X(t) = j} = pj 1.

Lq = Σ_{i=1}^{∞} i p_{s+i} 1 = Σ_{i=1}^{∞} i ps R^i 1 = ps R(I − R)^{−2} 1.
Using λ as the mean arrival rate and μ as the mean service rate for each
server (both of which can be computed from the phase-type distributions),
we can write down Wq = Lq /λ, W = Wq + 1/μ, and L = λW. However, what
is not particularly straightforward is the sojourn time distribution. For that
we need to know the arrival point probabilities in steady state, that is, the
distribution of the state of the system when an entity arrives into the system.
Once that is known, by conditioning on the arrival point probabilities, computing the LST of the conditional sojourn time for the arriving customer, and then unconditioning, we can obtain the sojourn time distribution.
Even for a simple example, this computation is fairly tedious and hence not
presented here. In the next section, we consider an example application to
illustrate the results seen in this section.
Likewise, the second arrival stream as well as the service times are two-phase Coxian distributions. In particular, the second arrival stream has parameters mA,2 = 2, pA,2 = [1 0],

TA,2 =
[ −γ1   βγ1 ]
[ 0     −γ2 ]

and

T∗A,2 =
[ (1 − β)γ1 ]
[ γ2 ].

Each entity requires a service at one of the two identical servers, so that the service time is according to a two-phase Coxian distribution with parameters mS = 2, pS = [1 0],

TS =
[ −μ1   δμ1 ]
[ 0     −μ2 ]

and

T∗S =
[ (1 − δ)μ1 ]
[ μ2 ].
Problem 39
A semiconductor wafer fab has a bottleneck workstation with two iden-
tical machines. Products arrive into the workstation from two sources
and they wait in a line to be processed by one of the two identical
machines. Data suggests that the arrival streams as well as service times
can be modeled as two-phase Coxian distributions described earlier. In
particular, for the first arrival stream (λ1 , α, λ2 ) = (20, 0.25, 5), for the sec-
ond arrival stream (γ1 , β, γ2 ) = (9.091, 0.9, 10), and for the service times
(μ1 , δ, μ2 ) = (10, 0.3333, 20). Model the system as a CTMC by writing down
the infinitesimal generator in QBD form using Kronecker sums and products.
Then solve for the steady-state probabilities to obtain the average number of
products in the system in the long run.
Solution
As defined earlier, X(t) is the number of products in the system, Zi(t) is the phase of the ith arrival process, and Ui(t) is the phase of the ith service process if there is a product in service, all at time t and for i = 1, 2. Then the multidimensional stochastic process
where

A2 = I4 ⊗ [(T∗S ⊗ pS) ⊕ (T∗S ⊗ pS)]

A1 = TA,1 ⊕ TA,2 ⊕ TS ⊕ TS

A0 = (T∗A,1 ⊗ pA,1 ⊕ T∗A,2 ⊗ pA,2) ⊗ I4

B1,0 = I4 ⊗ T∗S
Notice that we have numerical values for TA,i and pA,i for i = 1, 2 as well as TS and pS. Therefore, the preceding matrices can be computed. However, since some of the matrices are too large to display here (e.g., A0, A1, and A2 are 16 × 16 matrices), we only show a few computations to illustrate the Kronecker product and sum calculations. In particular, verify that
B0,0 = TA,1 ⊕ TA,2 =
[ −29.091    8.182     5          0      ]
[  0        −30        0          5      ]
[  0         0        −14.091     8.182  ]
[  0         0         0         −15     ],
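This block is easy to verify numerically. The following sketch recomputes B0,0 = TA,1 ⊕ TA,2 from the Problem 39 data (illustrative, using numpy); the printed matrix matches the one displayed above up to rounding:

```python
import numpy as np

def kron_sum(L, M):
    """Kronecker sum L (+) M = L (x) I_M + I_L (x) M."""
    return np.kron(L, np.eye(M.shape[0])) + np.kron(np.eye(L.shape[0]), M)

# Two-phase Coxian generator blocks from Problem 39:
# first arrival stream (lambda1, alpha, lambda2) = (20, 0.25, 5),
# second arrival stream (gamma1, beta, gamma2) = (9.091, 0.9, 10)
TA1 = np.array([[-20.0, 0.25 * 20.0],
                [0.0, -5.0]])
TA2 = np.array([[-9.091, 0.9 * 9.091],
                [0.0, -10.0]])

B00 = kron_sum(TA1, TA2)
print(np.round(B00, 3))
```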
A0 + RA1 + R2 A2 = 0.
p0 B0,0 + p1 B1,0 = 0
p0 B0,1 + p1 B1,1 + p2 B2,1 = 0
p1 B1,2 + p2 A1 + p2 RA2 = 0
p0 1 + p1 1 + p2 (I − R)−1 1 = 1
lim_{t→∞} P{X(t) = j} = pj 1.

Lq = Σ_{i=1}^{∞} i p_{2+i} 1 = Σ_{i=1}^{∞} i p2 R^i 1 = p2 R(I − R)^{−2} 1.
τ = 1/μ). We will typically consider the mean service time to be finite, that
is, τ < ∞. Also, for nontriviality we assume τ > 0. It is crucial to note that
there is no restriction in terms of the service time random variable; it could
be discrete, continuous, or a mixture of discrete and continuous (however,
our analysis is in terms of continuous, for the others the Riemann integral
must be replaced by the Lebesgue integral).
Some of the performance measures are relatively straightforward; for example, the sojourn time distribution is identical to the service time distribution. Therefore, the key analysis is to obtain the distribution of the number in the system. Define X(t) as the number (of entities) in the system at time t. The objective of transient and steady-state analysis is to obtain a probability distribution of X(t) for finite t and as t → ∞, respectively. We first present the transient analysis and then take the limit for steady-state analysis.
Transient analysis typically depends on the initial state of the queue. To
obtain simple closed-form expressions, we need to make one of the following
three assumptions: (i) the queue is empty at t = 0, that is, X(0) = 0; (ii) the
queue started empty in the distant past, that is, X(−∞) = 0, and hence the
stochastic process {X(t), t ≥ 0} is stationary; and (iii) if X(0) > 0, then the service for all the X(0) entities begins at t = 0. If none of the preceding three assumptions is satisfied, we will have to know the times when service began for each of the X(0) customers in order to obtain a distribution for X(t). For the transient analysis here, we make the first assumption, that is, X(0) = 0. Hence, we are interested in computing pj(t) defined as

pj(t) = P{X(t) = j | X(0) = 0}.
Note that if we made the second assumption, then X(t) would be according
to the steady state distribution for all t ≥ 0. However, if we made the third
assumption, then
P{X(t) = j | X(0) = i, Bi} = Σ_{k=0}^{min(i,j)} C(i, k) [1 − G(t)]^k [G(t)]^{i−k} p_{j−k}(t),

where C(i, k) = i!/(k!(i − k)!) is the binomial coefficient,
with the event Bi denoting that service for all i initial customers begins at
t = 0. Since this is straightforward once pj (t) is known, we continue with
obtaining pj (t) by making the first assumption.
Notice that {X(t), t ≥ 0} is a regenerative process with regeneration epochs
corresponding to when the queueing system becomes empty. To obtain pj (t),
consider an arbitrary arrival at time x such that x ≤ t. Let qx be the probability
that this arriving entity is in the system at time t. Clearly,
qx = 1 − G(t − x)
since it is the probability that this entity’s service time is larger than t − x. Let
{N(t), t ≥ 0} be a Poisson process with parameter λ that counts the number of
arrivals in time (0, t] for all t ≥ 0. Now consider a nonhomogeneous Bernoulli
splitting of the Poisson (arrival) process {N(t), t ≥ 0} such that with probabil-
ity qx an entity arriving at time x will be included in the split process. Since
the split process counts the number in the M/G/∞ system at time t, we have
pj(t) = exp{−λ ∫_0^t qx dx} (λ ∫_0^t qx dx)^j / j!.
The proof is described in Kulkarni [67], Gross and Harris [49], and Wolff [108]. It is based on conditioning on the number of arrivals in time t and using the fact that each of the given arrivals occurs uniformly in [0, t]. The argument is similar to the derivation of the number of departures from the M/G/∞ queue in time t in Problem 41.
Next we write down pj(t) in terms of the entities given in the model, namely, λ and G(t). Using qx = 1 − G(t − x) and a change of variables, we get

pj(t) = exp{−λ ∫_0^t qx dx} (λ ∫_0^t qx dx)^j / j!

= exp{−λ ∫_0^t [1 − G(t − x)] dx} (λ ∫_0^t [1 − G(t − x)] dx)^j / j!

= exp{−λ ∫_0^t [1 − G(u)] du} (λ ∫_0^t [1 − G(u)] du)^j / j!.
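This formula is easy to evaluate numerically for any service CDF G. A sketch in pure Python follows (trapezoidal integration; the exponential G below is only an illustrative choice, which doubles as an M/M/∞ sanity check since then λ ∫_0^t [1 − G(u)] du = (λ/μ)(1 − e^{−μt})):

```python
import math

def p_j_t(lam, G, t, j, n=2000):
    """Transient P{X(t) = j} for an M/G/infinity queue starting empty:
    Poisson pmf with mean m(t) = lam * integral_0^t [1 - G(u)] du."""
    h = t / n
    # Trapezoidal rule for integral_0^t [1 - G(u)] du
    integral = h * (((1 - G(0.0)) + (1 - G(t))) / 2
                    + sum(1 - G(k * h) for k in range(1, n)))
    m = lam * integral
    return math.exp(-m) * m ** j / math.factorial(j)

# Illustrative values: lam = 3, exponential service at rate mu = 2, t = 1.5
lam, mu, t = 3.0, 2.0, 1.5
G = lambda u: 1 - math.exp(-mu * u)
print(p_j_t(lam, G, t, 2))
```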
E[X(t)] = λ ∫_0^t [1 − G(u)] du

and

Var[X(t)] = λ ∫_0^t [1 − G(u)] du.
pj = e^{−λτ} (λτ)^j / j!,

where

τ = ∫_0^∞ [1 − G(u)] du.
In addition, the mean and variance of the number in the system in steady state
are both λτ since X is a Poisson random variable with parameter λτ. Note
that the mean and variance of the sojourn times correspond, respectively,
to the mean and variance of the service times since there is no waiting for
service to begin. Before concluding this section, it is worthwhile observing that the steady-state probabilities pj are identical to those of the M/M/∞ system, and thus pj does not depend on the CDF G(·) but only on the mean service time.
Problem 40
Obtain the distribution of the busy period, that is, the continuous stretch of
time when there are one or more entities in the system beginning with the
arrival of an entity into an empty system.
Solution
As we described earlier, {X(t), t ≥ 0} is a regenerative process with regen-
eration epochs corresponding to times when the number in the system
goes from 1 to 0. Each regeneration time corresponds to one idle period
followed by one busy period (time when there are one or more entities
in the M/G/∞ system). Let U be the regeneration time and U = I + B,
where the idle time I ∼ exp(λ) (i.e., time for next arrival in a Poisson pro-
cess), and the busy period B has a CDF H(·). We need to determine H(·).
For that, we develop the following explanation based on Example 8.17 in
Kulkarni [67].
Let F(·) be the CDF of U. Using a renewal argument by conditioning
on U = u, we can write down a renewal-type equation for p0 (t) = P{X(t) =
0|X(0) = 0} as
p0(t) = ∫_0^t p0(t − u) dF(u) + ∫_t^∞ P{I > t | U = u} dF(u).

Since I ≤ U, we have P{I > t | U = u} = 0 whenever u ≤ t, and hence the second integral can be extended to all of [0, ∞). Therefore,

p0(t) = ∫_0^t p0(t − u) dF(u) + ∫_0^∞ P{I > t | U = u} dF(u)

= ∫_0^t p0(t − u) dF(u) + P{I > t}

= p0 ∗ F(t) + e^{−λt},
where G(·) is the CDF of the service times. One way to obtain the unknown F(t) is to numerically solve for F(t) in p0(t) = p0 ∗ F(t) + e^{−λt} and similarly solve another convolution equation to get H(t). An alternate approach, standard when there are convolutions, is to use transforms. Therefore, taking the LST on both sides of the equation p0(t) = p0 ∗ F(t) + e^{−λt}, we get
p̃0(s) = p̃0(s) F̃(s) + s/(s + λ).
to obtain

H̃(s) = 1 + s/λ − s/(λ p̃0(s)).
In the most general case, the preceding LST is not easy to invert to obtain H(t). Another challenge is to obtain p̃0(s) from p0(t), which is not trivial. However, there are several software packages available (such as MATLAB)
E[B] = −H̃′(0) = (e^{λτ} − 1)/λ.
Another way to obtain it is to use regenerative process results and solve for
E[B] in 1 − p0 = E[B]/(E[B] + 1/λ).
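A quick numerical check of this consistency, with illustrative values of λ and τ:

```python
import math

# Verify that E[B] = (e^{lam*tau} - 1)/lam satisfies the regenerative-process
# identity 1 - p0 = E[B]/(E[B] + 1/lam), where p0 = e^{-lam*tau} in steady state.
lam, tau = 2.0, 0.7
EB = (math.exp(lam * tau) - 1) / lam
p0 = math.exp(-lam * tau)
print(1 - p0)                 # long-run fraction of time the system is busy
print(EB / (EB + 1 / lam))    # the same quantity from the regeneration cycle
```

The two printed quantities agree, since EB/(EB + 1/λ) simplifies algebraically to 1 − e^{−λτ}.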
Problem 41
Compute the distribution of the interdeparture times both in the transient
case and in the steady-state case.
Solution
Since the output from an M/G/∞ may flow into some other queue, it is crit-
ical to analyze the departure process. Let D(t) be the number of departures
from the M/G/∞ system in time [0, t] given that X(0) = 0. For any arbitrary t,
we seek to obtain the distribution of the random variable D(t) and thereby
characterize the stochastic process {D(t), t ≥ 0}. Similar to the analysis for the
number in the system, here too we first consider transient and then describe
steady-state results. The results follow the analysis in Gross and Harris [49].
However, it is crucial to point out that there are many other elegant ways
of analyzing departures from M/G/∞ queues and extending them, some of
which we will see toward the end of this section.
If we are given that n arrivals occurred in time [0, t], then using standard Poisson process results we know that the arrival time of any of the n arrivals is uniformly distributed over [0, t], independent of the other arrival times. Therefore, consider one of the n arrivals that occurred in time [0, t]. The probability θ(t) that this entity would have departed before time t can be obtained by conditioning on the time of arrival and unconditioning as
θ(t) = (1/t) ∫_0^t G(t − x) dx = (1/t) ∫_0^t G(u) du.
In addition, the probability that out of the n arrivals in time [0, t], exactly i of those departed before time t is C(n, i)[θ(t)]^i[1 − θ(t)]^{n−i}, where C(n, i) is the binomial coefficient.
Now, to compute the distribution of D(t), we condition on N(t), which is
the number of arrivals in time [0, t]. In order to remind us that we do make
the assumption that X(0) = 0, we include this condition in the expressions.
Therefore, we have
P{D(t) = i | X(0) = 0} = Σ_{n=i}^{∞} P{D(t) = i | N(t) = n, X(0) = 0} e^{−λt} (λt)^n / n!

= Σ_{n=i}^{∞} C(n, i) [θ(t)]^i [1 − θ(t)]^{n−i} e^{−λt} (λt)^n / n!

= ([θ(t)]^i / i!) e^{−λt} (λt)^i Σ_{n=i}^{∞} (λt)^{n−i} [1 − θ(t)]^{n−i} / (n − i)!

= [λtθ(t)]^i e^{−λtθ(t)} / i!.
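In other words, D(t) is Poisson distributed with mean λtθ(t). A numerical sketch (pure Python; the exponential service distribution is only an illustrative choice):

```python
import math

def departure_pmf(lam, G, t, i, n=2000):
    """P{D(t) = i} for an M/G/infinity queue starting empty: Poisson pmf with
    mean lam * t * theta(t), where theta(t) = (1/t) integral_0^t G(u) du."""
    h = t / n
    # Trapezoidal rule for theta(t)
    theta = (h * ((G(0.0) + G(t)) / 2 + sum(G(k * h) for k in range(1, n)))) / t
    m = lam * t * theta
    return math.exp(-m) * m ** i / math.factorial(i)

lam, mu, t = 3.0, 2.0, 1.5
G = lambda u: 1 - math.exp(-mu * u)
print(departure_pmf(lam, G, t, 1))
```

For the exponential G above, θ(t) has the closed form 1 − (1 − e^{−μt})/(μt), which can be used as a sanity check.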
Problem 42
Consider an extension to the M/G/∞ queue. The arrival process is Poisson,
however, the parameter of the Poisson process is time varying. The average
arrival rate at time t (for all t in (−∞, ∞)) is a deterministic function of t rep-
resented as λ(t). Hence, the arrival process is defined as a nonhomogeneous
Poisson process. Everything else is the same as the regular M/G/∞ queue.
We call such a system an Mt /G/∞ queue. Perform transient analysis for this
system.
Solution
This summary of results for the Mt /G/∞ queue is based on Eick, Massey,
and Whitt [27]. Recall that we need to make one of the three assumptions for
initial condition, otherwise we would need to know when service started for
Ge(x) = P{Se ≤ x} = (1/τ) ∫_0^x [1 − G(u)] du.
with E[X(t)] = Var[X(t)] = μ(t), where μ(t) = E[λ(t − Se)]τ. It is important to note that λ(·) is the function defined in this section, and not to make the mistake of thinking that at t = 0 the average number in the system is negative! In fact, observe that the number in the system at any time depends on the arrival rate Se time units ago. Further, the departure process also has a similar time-lag effect, where the average departure rate at time t is E[λ(t − S)], and the resulting process is a nonhomogeneous Poisson process.
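Based on Eick, Massey, and Whitt [27], the mean number in the system is E[X(t)] = τ E[λ(t − Se)]. A numerical sketch of this lag effect follows, with an illustrative sinusoidal λ(·) and exponential service (whose stationary-excess distribution is again exponential); a constant λ recovers E[X(t)] = λτ:

```python
import math

def mean_number(lam_fn, ge_density, tau, t, upper=50.0, n=4000):
    """E[X(t)] = tau * E[lam(t - Se)] for an Mt/G/infinity queue in equilibrium,
    where Se has the stationary-excess density ge(x) = (1 - G(x))/tau.
    Midpoint-rule integration over [0, upper]."""
    h = upper / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        total += lam_fn(t - x) * ge_density(x) * h
    return tau * total

# Exponential service at rate mu: its stationary excess is again exp(mu)
mu = 2.0
tau = 1.0 / mu
ge = lambda x: mu * math.exp(-mu * x)
lam_fn = lambda u: 5.0 + 2.0 * math.sin(u)
print(mean_number(lam_fn, ge, tau, t=3.0))
```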
For u > 0 and any t,

Cov[X(t), X(t + u)] = E[λ(t − ((S − u)⁺)e)] E[(S − u)⁺],

where the notation (y)⁺ denotes max(y, 0) and ((S − u)⁺)e is distributed according to the stationary excess of (S − u)⁺. In fact, this result can also be derived for the homogeneous case (which we have not done earlier but is extremely useful, especially in computer-communication traffic with long-range dependence). For an M/G/∞ queue where λ(t) = λ, it reduces to Cov[X(t), X(t + u)] = λ E[(S − u)⁺].
There are several results for networks of Mt /G/∞ queues. Since the entities
do not interact and departure processes are Poisson, the analysis is fairly
convenient. The reader is referred to Eick et al. [27] as well as the references
therein for further results.
Notice that the methods used to analyze the M/G/∞ system and its extensions are significantly different from the others in this book. In fact, even the related M/G/s/s queue will be analyzed differently in Section 4.5.3, using a multidimensional continuous-state Markov process. To describe that method, we first explain the M/G/1 queue with a special discipline called processor sharing, and then use the same technique for the M/G/s/s queue.
the same as that of the M/G/1 queue with FCFS service discipline since both
queues are work conserving.
Now we model the M/G/1 processor sharing queue to obtain perfor-
mance measures such as distribution of the number in the system and mean
sojourn time. Let X(t) be the number of customers in the system at time t and
Ri (t) be the remaining service time for the ith customer in the system. The
multidimensional stochastic process {(X(t), R1 (t), R2 (t), . . . , RX(t) (t)), t ≥ 0}
satisfies the Markov property (since to predict the future states we only
need the present state and nothing from the past) and hence it is a
Markov process. However, notice that most of the elements in the state
space are continuous, unlike the discrete ones we have seen before. Typ-
ically such Markov processes are difficult to analyze unless they have a
special structure like this one (and the M/G/s/s queue we will see in
Section 4.5.3).
Define Fn(t, y1, y2, . . . , yn) as the following joint probability:

Fn(t, y1, y2, . . . , yn) = P{X(t) = n, R1(t) ≤ y1, R2(t) ≤ y2, . . . , Rn(t) ≤ yn},

with the corresponding density

fn(t, y1, y2, . . . , yn) = ∂^n Fn(t, y1, y2, . . . , yn) / (∂y1 ∂y2 · · · ∂yn).
lim_{h→0} o(h)/h = 0.
fn(t + h, y1, y2, . . . , yn)
= (1 − λh) fn(t, y1 + h/n, . . . , yn + h/n)
+ (1 − λh) Σ_{i=0}^{n} ∫_0^{h/(n+1)} fn+1(t, y1 + h/(n+1), . . . , yi−1 + h/(n+1), y, yi + h/(n+1), . . . , yn + h/(n+1)) dy
+ λh Σ_{i=1}^{n} (G′(yi)/n) fn−1(t, y1 + h/(n−1), . . . , yi−1 + h/(n−1), yi+1 + h/(n−1), . . . , yn + h/(n−1)) + o(h). (4.17)
The preceding equation perhaps deserves some explanation. Since the ser-
vice discipline is processor sharing, if there is yi amount of service remaining
at time t + h, then at time t there would have been yi + h/n service remaining
when there are n customers in the system during time t to t+h. The probabil-
ity that there are no arrivals in a time-interval h units long is (1−λh)+o(h) and
the probability of exactly one arrival is λh + o(h). First consider the case that
there are no new arrivals in time t to t + h, then one of two things could have
happened: no service completions during that time interval (first expression
in the preceding equation) or one service completion such that at time t there
are n + 1 customers and the one with less than h/(n + 1) service remaining
would complete (second expression in the preceding equation). Therefore,
the first term is straightforward, and the second term incorporates, via the integral, the probability of having less than h/(n + 1) service remaining in any of the (n + 1) spots around the n customers at time t + h. The third term considers the case of exactly one arrival. This arrival could have been customer i with workload yi. Notice that G′(yi) is the PDF of the service times at yi; however, this could be any of the n customers with probability 1/n, and hence the summation.
To simplify Equation 4.17, we use the following Taylor-series expansion:

fn(t, y1 + h/n, y2 + h/n, . . . , yn + h/n) = fn(t, y1, y2, . . . , yn) + (h/n) Σ_{i=1}^{n} ∂fn(t, y1, y2, . . . , yn)/∂yi + o(h),
∫_0^{h/(n+1)} fn+1(t, y1 + h/(n+1), . . . , yi−1 + h/(n+1), y, yi + h/(n+1), . . . , yn + h/(n+1)) dy
= (h/(n+1)) fn+1(t, y1, . . . , yi−1, 0, yi, . . . , yn) + o(h).
fn(t + h, y1, y2, . . . , yn)
= (1 − λh) fn(t, y1, y2, . . . , yn) + (1 − λh)(h/n) Σ_{i=1}^{n} ∂fn(t, y1, y2, . . . , yn)/∂yi
+ (1 − λh)(h/(n+1)) Σ_{i=0}^{n} fn+1(t, y1, . . . , yi−1, 0, yi, . . . , yn)
+ λh Σ_{i=1}^{n} (G′(yi)/n) fn−1(t, y1 + h/(n−1), . . . , yi−1 + h/(n−1), yi+1 + h/(n−1), . . . , yn + h/(n−1)) + o(h).
[fn(t + h, y1, y2, . . . , yn) − fn(t, y1, y2, . . . , yn)]/h
= −λ fn(t, y1, y2, . . . , yn)
+ (1 − λh)(1/n) Σ_{i=1}^{n} ∂fn(t, y1, y2, . . . , yn)/∂yi
+ (1 − λh)(1/(n+1)) Σ_{i=0}^{n} fn+1(t, y1, . . . , yi−1, 0, yi, . . . , yn)
+ λ Σ_{i=1}^{n} (G′(yi)/n) fn−1(t, y1 + h/(n−1), . . . , yi−1 + h/(n−1), yi+1 + h/(n−1), . . . , yn + h/(n−1)) + o(h)/h.
Taking the limit as h → 0, we obtain

∂fn(t, y1, y2, . . . , yn)/∂t
= −λ fn(t, y1, y2, . . . , yn) + (1/n) Σ_{i=1}^{n} ∂fn(t, y1, y2, . . . , yn)/∂yi
+ (1/(n+1)) Σ_{i=0}^{n} fn+1(t, y1, . . . , yi−1, 0, yi, . . . , yn)
+ λ Σ_{i=1}^{n} (G′(yi)/n) fn−1(t, y1, . . . , yi−1, yi+1, . . . , yn).
0 = −λ fn(y1, y2, . . . , yn) + (1/n) Σ_{i=1}^{n} ∂fn(y1, y2, . . . , yn)/∂yi
+ (1/(n+1)) Σ_{i=0}^{n} fn+1(y1, . . . , yi−1, 0, yi, . . . , yn)
+ λ Σ_{i=1}^{n} (G′(yi)/n) fn−1(y1, . . . , yi−1, yi+1, . . . , yn).
One way is to solve the balance equations by trying various n values start-
ing from 0. Another way is to find a candidate solution and check if it satisfies
the balance equation. We try the second approach realizing that if we have
a solution it is the unique solution. In particular, we consider the M/M/1
queue with processor sharing. There we can show (left as an exercise for the
reader) that
fn(y1, y2, . . . , yn) = (1 − ρ) λ^n ∏_{i=1}^{n} [1 − G(yi)].
As a first step, we check whether this solution satisfies the balance equations for the M/G/1 processor sharing case. In fact, when fn(y1, y2, . . . , yn) = (1 − ρ)λ^n ∏_{i=1}^{n} [1 − G(yi)], it would imply that

(1/(n+1)) Σ_{i=0}^{n} fn+1(y1, . . . , yi−1, 0, yi, . . . , yn) = λ fn(y1, y2, . . . , yn)

since fn+1(y1, . . . , yi−1, 0, yi, . . . , yn) = (1 − ρ)λ^{n+1}[1 − G(0)] ∏_{i=1}^{n} [1 − G(yi)] = λ fn(y1, y2, . . . , yn), because G(0) = 0. In addition, if fn(y1, y2, . . . , yn) = (1 − ρ)λ^n ∏_{i=1}^{n} [1 − G(yi)], then
∂fn(y1, y2, . . . , yn)/∂yi = −λ G′(yi) fn−1(y1, . . . , yi−1, yi+1, . . . , yn)

since

∂fn(y1, y2, . . . , yn)/∂yi = (1 − ρ)λ^n (−G′(yi)) ∏_{j=1}^{i−1} [1 − G(yj)] ∏_{j=i+1}^{n} [1 − G(yj)]
= −λ (1 − ρ) G′(yi) λ^{n−1} ∏_{j=1}^{i−1} [1 − G(yj)] ∏_{j=i+1}^{n} [1 − G(yj)]
= −λ G′(yi) fn−1(y1, . . . , yi−1, yi+1, . . . , yn).
Thus, fn(y1, y2, . . . , yn) = (1 − ρ)λ^n ∏_{i=1}^{n} [1 − G(yi)] satisfies the balance equation

0 = −λ fn(y1, y2, . . . , yn) + (1/n) Σ_{i=1}^{n} ∂fn(y1, y2, . . . , yn)/∂yi
+ (1/(n+1)) Σ_{i=0}^{n} fn+1(y1, . . . , yi−1, 0, yi, . . . , yn)
+ λ Σ_{i=1}^{n} (G′(yi)/n) fn−1(y1, . . . , yi−1, yi+1, . . . , yn).
Finally, one can verify the normalization

Σ_{n=0}^{∞} ∫_0^∞ ∫_0^∞ · · · ∫_0^∞ fn(y1, y2, . . . , yn) dyn · · · dy2 dy1 = 1

for fn(y1, y2, . . . , yn) = (1 − ρ)λ^n ∏_{i=1}^{n} [1 − G(yi)], and hence it is the steady-state solution.
Now, to obtain the performance measures, let pi denote the steady-state probability that there are i in the system. Since

∫_0^∞ [1 − G(yj)] dyj = ∫_0^∞ yj G′(yj) dyj = 1/μ,

we have

pi = ∫_0^∞ ∫_0^∞ · · · ∫_0^∞ fi(y1, y2, . . . , yi) dyi · · · dy2 dy1 = (1 − ρ)ρ^i.
Notice that this is identical to the number in the system for an M/M/1 queue with FCFS discipline. Thus, L = ρ/(1 − ρ) and W = 1/(μ − λ). It is also possible to obtain the expected conditional sojourn time for a customer arriving in steady state with a workload S as S/(1 − ρ). It uses the fact that the expected number of customers in the system throughout the sojourn time (due to stationarity of the stochastic process) is one plus the average number, that is, 1 + ρ/(1 − ρ) = 1/(1 − ρ). Hence, a workload of S would take S/(1 − ρ) time to complete processing at a processing rate of 1.
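A quick check, with an illustrative ρ, that the geometric distribution pi = (1 − ρ)ρ^i indeed gives L = ρ/(1 − ρ):

```python
# Steady-state number in an M/G/1 processor sharing queue is geometric:
# p_i = (1 - rho) * rho^i, so L = sum_i i * p_i should equal rho/(1 - rho).
rho = 0.6
L = sum(i * (1 - rho) * rho ** i for i in range(1, 500))  # truncated sum
print(L, rho / (1 - rho))
```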
call, this amounts to an arrival to the switch. If the caller hears a dial tone, it
means a line is available and the caller punches the number he or she wishes
to call. If a line is not available, the caller would get a tone stating all lines
are busy (these are also quite common in cellular phones where messages
such as “the network is busy” are received). The telephone switch has s lines
and each line is held for a random time S by a caller and this time is also
frequently known as holding times. The pioneering work by A. K. Erlang
resulted in the computation of the blocking probability (or the probability a
potential caller is rejected).
For this, let X(t) be the number of customers in the system at time
t and Ri (t) be the remaining service time at the ith busy server. The
multidimensional stochastic process {(X(t), R1 (t), R2 (t), . . . , RX(t) (t)), t ≥ 0}
satisfies the Markov property (since to predict the future states we only need
the present state and nothing from the past) and hence it is a Markov pro-
cess. It is worthwhile to make two observations here. First of all, this analysis
is almost identical to that of the M/G/1 processor sharing queue seen in the
previous section. Some of the terms used here such as o(h) have been defined
in that section and the reader is encouraged to go over that. Second, it is
possible to model the system via the remaining service time in each of the s servers. However, additional constraints on whether or not each server is busy impose more bookkeeping. Hence, we just stick to the X(t) busy servers at time t, with the understanding that the alternative formulation could also be used.
Define Fn(t, y1, y2, . . . , yn) as the following joint probability:

Fn(t, y1, y2, . . . , yn) = P{X(t) = n, R1(t) ≤ y1, R2(t) ≤ y2, . . . , Rn(t) ≤ yn},

with density

fn(t, y1, y2, . . . , yn) = ∂^n Fn(t, y1, y2, . . . , yn) / (∂y1 ∂y2 · · · ∂yn).
fn(t + h, y1, y2, . . . , yn)
= (1 − λh) fn(t, y1 + h, y2 + h, . . . , yn + h)
+ (1 − λh) Σ_{i=0}^{n} ∫_0^h fn+1(t, y1 + h, . . . , yi−1 + h, y, yi + h, . . . , yn + h) dy
+ λh Σ_{i=1}^{n} (G′(yi)/n) fn−1(t, y1 + h, . . . , yi−1 + h, yi+1 + h, . . . , yn + h) + o(h). (4.18)
fn(t, y1 + h, y2 + h, . . . , yn + h) = fn(t, y1, y2, . . . , yn) + h Σ_{i=1}^{n} ∂fn(t, y1, y2, . . . , yn)/∂yi + o(h),

and

∫_0^h fn+1(t, y1 + h, . . . , yi−1 + h, y, yi + h, . . . , yn + h) dy = h fn+1(t, y1, . . . , yi−1, 0, yi, . . . , yn) + o(h).
fn(t + h, y1, y2, . . . , yn)
= (1 − λh) fn(t, y1, y2, . . . , yn) + (1 − λh) h Σ_{i=1}^{n} ∂fn(t, y1, y2, . . . , yn)/∂yi
+ (1 − λh) h Σ_{i=0}^{n} fn+1(t, y1, . . . , yi−1, 0, yi, . . . , yn)
+ λh Σ_{i=1}^{n} (G′(yi)/n) fn−1(t, y1 + h, . . . , yi−1 + h, yi+1 + h, . . . , yn + h) + o(h).
[fn(t + h, y1, y2, . . . , yn) − fn(t, y1, y2, . . . , yn)]/h
= −λ fn(t, y1, y2, . . . , yn) + (1 − λh) Σ_{i=1}^{n} ∂fn(t, y1, y2, . . . , yn)/∂yi
+ (1 − λh) Σ_{i=0}^{n} fn+1(t, y1, . . . , yi−1, 0, yi, . . . , yn)
+ λ Σ_{i=1}^{n} (G′(yi)/n) fn−1(t, y1 + h, . . . , yi−1 + h, yi+1 + h, . . . , yn + h) + o(h)/h.
Taking the limit as h → 0, we obtain

∂fn(t, y1, y2, . . . , yn)/∂t
= −λ fn(t, y1, y2, . . . , yn) + Σ_{i=1}^{n} ∂fn(t, y1, y2, . . . , yn)/∂yi
+ Σ_{i=0}^{n} fn+1(t, y1, . . . , yi−1, 0, yi, . . . , yn)
+ λ Σ_{i=1}^{n} (G′(yi)/n) fn−1(t, y1, . . . , yi−1, yi+1, . . . , yn).
that. Since the stochastic process {X(t), R1 (t), R2 (t), . . . , RX(t) (t)} is a stable
Markov process, in steady-state the stochastic process converges to a station-
ary process. In other words as t → ∞, we have ∂fn (t, y1 , y2 , . . . , yn )/∂t = 0 and
fn (t, y1 , y2 , . . . , yn ) converges to the stationary distribution fn (y1 , y2 , . . . , yn ),
that is, fn (t, y1 , y2 , . . . , yn ) → fn (y1 , y2 , . . . , yn ). Therefore, from the preceding
equation as we let t → ∞, we get the following balance equation:
0 = −λ fn(y1, y2, . . . , yn) + Σ_{i=1}^{n} ∂fn(y1, y2, . . . , yn)/∂yi
+ Σ_{i=0}^{n} fn+1(y1, . . . , yi−1, 0, yi, . . . , yn)
+ λ Σ_{i=1}^{n} (G′(yi)/n) fn−1(y1, . . . , yi−1, yi+1, . . . , yn).
fn(y1, y2, . . . , yn) = K (λ^n/n!) ∏_{i=1}^{n} [1 − G(yi)],
Σ_{i=0}^{n} fn+1(y1, . . . , yi−1, 0, yi, . . . , yn) = λ fn(y1, y2, . . . , yn)

since fn+1(y1, . . . , yi−1, 0, yi, . . . , yn) = K [λ^{n+1}/(n + 1)!][1 − G(0)] ∏_{i=1}^{n} [1 − G(yi)] = [λ/(n + 1)] fn(y1, y2, . . . , yn) with G(0) = 0. In addition, if fn(y1, y2, . . . , yn) = K (λ^n/n!) ∏_{i=1}^{n} [1 − G(yi)], then

∂fn(y1, y2, . . . , yn)/∂yi = −(λ/n) G′(yi) fn−1(y1, . . . , yi−1, yi+1, . . . , yn)
since

∂fn(y1, y2, . . . , yn)/∂yi = K (λ^n/n!)(−G′(yi)) ∏_{j=1}^{i−1} [1 − G(yj)] ∏_{j=i+1}^{n} [1 − G(yj)]
= −(λ/n) K [λ^{n−1}/(n − 1)!] G′(yi) ∏_{j=1}^{i−1} [1 − G(yj)] ∏_{j=i+1}^{n} [1 − G(yj)]
= −(λ/n) G′(yi) fn−1(y1, . . . , yi−1, yi+1, . . . , yn).
Thus, fn(y1, y2, . . . , yn) = K (λ^n/n!) ∏_{i=1}^{n} [1 − G(yi)] satisfies the balance equation

0 = −λ fn(y1, y2, . . . , yn) + Σ_{i=1}^{n} ∂fn(y1, y2, . . . , yn)/∂yi
+ Σ_{i=0}^{n} fn+1(y1, . . . , yi−1, 0, yi, . . . , yn)
+ λ Σ_{i=1}^{n} (G′(yi)/n) fn−1(y1, . . . , yi−1, yi+1, . . . , yn).
Σ_{n=0}^{s} ∫_0^∞ ∫_0^∞ · · · ∫_0^∞ fn(y1, y2, . . . , yn) dyn · · · dy2 dy1 = 1

for fn(y1, y2, . . . , yn) = K (λ^n/n!) ∏_{i=1}^{n} [1 − G(yi)] to get

K = 1 / [Σ_{j=0}^{s} (1/j!)(λ/μ)^j].
Since

∫_0^∞ [1 − G(yj)] dyj = ∫_0^∞ yj G′(yj) dyj = 1/μ,

we have

pi = ∫_0^∞ ∫_0^∞ · · · ∫_0^∞ fi(y1, y2, . . . , yi) dyi · · · dy2 dy1 = K λ^i/(i! μ^i).
Therefore, for 0 ≤ i ≤ s,

pi = [(λ/μ)^i / i!] / [Σ_{k=0}^{s} (λ/μ)^k / k!].
The Erlang loss formula due to A. K. Erlang is the probability that an arriving customer is rejected (or the fraction of arriving customers that are lost in steady state) and is given by

ps = [(λ/μ)^s / s!] / [Σ_{i=0}^{s} (λ/μ)^i / i!].
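In practice this probability is usually computed with a numerically stable recursion rather than the factorial form directly. A sketch (with an illustrative offered load a = λ/μ):

```python
import math

def erlang_b(s, a):
    """Erlang loss formula p_s = (a^s/s!)/sum_{i=0}^s a^i/i!, computed via the
    standard recursion B(0) = 1, B(k) = a B(k-1) / (k + a B(k-1))."""
    B = 1.0
    for k in range(1, s + 1):
        B = a * B / (k + a * B)
    return B

a, s = 8.0, 10   # offered load a = lambda/mu, and s lines
ps = erlang_b(s, a)
print(ps)            # blocking probability
print(a * (1 - ps))  # L, the mean number of busy lines
```

The recursion avoids overflow in s! and a^s for large systems, which is why it is preferred over evaluating the formula term by term.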
Notice that the distribution of the number in the system in steady state for
an M/G/s/s queue does not depend on the distribution of the service time.
Using the steady-state number in the system, we can derive
L = (λ/μ)(1 − ps).
Since the effective entering rate into the system is λ(1 − ps), we get W = 1/μ. This is intuitive: since there is no waiting for service for customers that enter the system, the average sojourn time is indeed the average service time.
For the same reason, the sojourn time distribution for customers that enter
the system is same as that of the service time. In addition, since there is no
waiting for service, Lq = 0 and Wq = 0. We conclude by making a remark
without proof.
Remark 9
in steady state is in fact a reversible process. In simple terms, that means if the
process is recorded and viewed backward, it would be stochastically identi-
cal to running it forward. One of the artifacts of reversibility is the existence
of product-form solutions such as the expression for fn (y1 , y2 , . . . , yn ). Further,
because of reversibility, the departures from the original system correspond
to arrivals in the reversed system. Therefore, the departure process from the
M/G/s/s queue is a Poisson process with rate (1 − ps )λ departures per unit
time on average.
Reference Notes
Unlike most of the other chapters in this book, this chapter is a hodgepodge
of techniques applied to a somewhat common theme of nonexponential
interarrival and/or service times. We start with DTMC methods, then gravi-
tate toward MVA, develop bounds and approximations, then present CTMC
models, and finally, some special-purpose models. For that reason, it has
been difficult to present the complete details of all the methods. In that light, we have provided references along with the descriptions so that the readers can immediately get to the source to find out the missing steps. These
include topics such as G/G/s and PH/PH/s queues, phase type distribu-
tions and fitting, M/G/∞ queue, M/G/s/s queue, and M/G/1 with processor
sharing. Leaving out some of the details was a difficult decision to make con-
sidering that most textbooks on queues also typically leave those out. But
perhaps there is a good reason for doing so. Nevertheless, thanks are due to Prof. Don Towsley's class notes for all the details on M/G/1 processor sharing queues, which were immensely useful here.
The approximations and bounds presented in this chapter using MVA are
largely due to Buzacott and Shanthikumar [15]. All the empirical approxima-
tions are from Bolch et al. [12]. However, topics such as M/G/1 and G/M/1
queues have been treated in a similar vein as Gross and Harris [49]. Many
of the results presented on those topics have also been heavily influenced by Kulkarni [67]. Many of these results are explained far more crisply and succinctly in Wolff [108]. Further, there is a rich literature on using fluid and diffusion approximations as well as methodologies to obtain tail distributions. The main reason for
leaving them out in this chapter is that those techniques lend themselves
Exercises
4.1 For a stable M/G/1 with FCFS service, derive the average sojourn
time in the system
W = 1/μ + λ(σ² + 1/μ²) / (2(1 − ρ))
the server begins service. Successive server vacations are IID ran-
dom variables. Let ψ(z) be the generating function of the number
of arrivals during a vacation (the vacation length may depend upon
the arrival process during the vacation). Let Xn be the number of
customers in the system after the nth service completion. Show
that {Xn , n ≥ 0} is a DTMC by describing the transition probability
matrix. Then:
(a) Show that the system is stable if ρ = λ/μ < 1.
(b) Assuming that ρ < 1, show that the generating function φ(z) of
the steady-state distribution of Xn is given by

φ(z) = [(1 − ρ) G̃(λ − λz) (ψ(z) − 1)] / [m (z − G̃(λ − λz))],

and that the mean number in the system is

L = ρ + ρ²(1 + σ²μ²) / [2(1 − ρ)] + m^(2)/(2m),

where m = ψ′(1) and m^(2) = ψ″(1) are the mean and the second factorial
moment of the number of arrivals during a vacation.
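To see the decomposition at work numerically, here is a small sketch (the helper name and example values are ours): it adds the vacation term m^(2)/(2m) to the Pollaczek–Khintchine mean.

```python
def mg1_vacation_L(lam, mu, sigma2, m, m2):
    """Mean number in system for an M/G/1 queue with server vacations,
    using the decomposition L = L_PK + m2/(2*m).

    sigma2 -- variance of the service time
    m, m2  -- mean (psi'(1)) and second factorial moment (psi''(1)) of
              the number of arrivals during a vacation
    """
    rho = lam / mu
    if rho >= 1:
        raise ValueError("unstable")
    L_pk = rho + rho**2 * (1 + sigma2 * mu**2) / (2 * (1 - rho))
    return L_pk + m2 / (2 * m)

# Hypothetical example: exponential service, vacations generating on
# average 1 arrival with second factorial moment 1
print(mg1_vacation_L(0.5, 1.0, 1.0, 1.0, 1.0))  # -> 1.5
```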
4.7 Consider a G/M/1 queue with interarrival distribution
where 0 < r < 1 and λi > 0 for i = 1, 2. Find the stability condition and
derive an expression for pj , the steady-state probability that there are
j customers in the system.
4.8 A service station is staffed with two identical servers. Customers
arrive according to a PP(λ). The service times are IID exp(μ). Con-
sider the following two operational policies used to maintain two
separate queues:
(a) Every customer is randomly assigned to one of the two servers
with equal probability.
(b) Customers are alternately assigned to the two servers.
Compute the expected number of customers in the system in steady
state for both cases. Which operating policy is better?
4.9 Requests arrive to a web server according to a renewal process. The
interarrival times are according to an Erlang distribution with mean
10 s and standard deviation √50 s. Assume that there is infinite
waiting room available for the requests to wait before being pro-
cessed by the server. The processing time (i.e., service time) for the
server is according to a Pareto distribution with CDF
G(x) = 1 − (K/x)^β , if x ≥ K.
The mean service time is Kβ/(β − 1), if β > 1, and the variance of the
service time is K2 β/[(β − 1)2 (β − 2)] if β > 2. Use K = 5 and β = 2.25
so that the mean and standard deviation of the service times are
9 and 12 s, respectively. Using the results for the G/G/1 queue,
obtain bounds as well as approximations for the average response
time (i.e., waiting time in the system including service) for an arbi-
trary request in the long run. Pick any bound or approximation from
the ones given in this chapter.
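For concreteness, the exercise's numbers can be plugged into one standard choice: Kingman's upper bound on the G/G/1 mean queueing delay and a common two-moment approximation (variable names are ours; any of the bounds or approximations from the chapter could be substituted).

```python
# Exercise 4.9 data: Erlang interarrivals (mean 10 s, variance 50 s^2),
# Pareto service (mean 9 s, variance 144 s^2)
lam = 1 / 10.0               # arrival rate (per second)
mean_A, var_A = 10.0, 50.0   # interarrival mean and variance
ES, var_S = 9.0, 144.0       # service time mean and variance
rho = lam * ES               # traffic intensity = 0.9

# Kingman's upper bound on the mean wait in queue for a G/G/1 queue
Wq_bound = lam * (var_A + var_S) / (2 * (1 - rho))

# A common two-moment approximation: scale the M/M/1 delay by (Ca^2+Cs^2)/2
Ca2, Cs2 = var_A / mean_A**2, var_S / ES**2
Wq_approx = (Ca2 + Cs2) / 2 * rho * ES / (1 - rho)

print(ES + Wq_bound)   # bound on mean response time W   -> 106.0 s
print(ES + Wq_approx)  # approximate mean response time  -> 101.25 s
```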
4.10 Consider a stable M/G/1 queue with PP(λ) arrivals and G̃(w) as the
LST of the CDF of the service times such that G̃′(0) = −1/μ. Write
down the LST of the interdeparture times (between two successive
departures picked arbitrarily in steady state) in terms of λ, μ, and
G̃(w). (Note that w is used instead of s in the LST to avoid confusion
with S, the service time.)
4.11 Consider a stable G/M/1 queue with traffic intensity ρ and parame-
ter α, which is a solution to α = G̃(μ − μα), where G(t) is the CDF of
the interarrival time and μ is the mean service rate. Derive an expres-
sion for the generating function Φ(z) of the number of entities in the
system in steady state as a closed-form expression in terms of ρ, α,
and z.
4.12 Answer the following multiple choice questions:
(i) For a stable G/M/1 queue with G̃(s) being the LST of the inter-
arrival time CDF and service rate μ, which of the following
statements are not true?
(a) There is a unique solution for α in (0, 1) to the equation
α = G̃(μ − μα).
(b) In steady state, the time spent by an arbitrary arrival in the
system before service begins is exponentially distributed.
(c) If G̃(s) = λ/(λ + s), then the average total time in the system
(i.e., W) is 1/(μ − λ).
(d) The fraction of time the server is busy in the long run is
−1/[G̃′(0)μ].
(ii) Consider a stable M/G/1 queue and the notation given in this
chapter. Which of the following statements are not true?
W = τ + σ²_A / [2(a − τ)].
(b) Consider a stable M/M/1 queue that uses processor sharing dis-
cipline. Arrivals are according to PP(λ) and it would take exp(μ)
time to process an entity if it were the only one in the system. Is
the following statement TRUE or FALSE? The average workload
in the system at an arbitrary point in steady state is λ/[μ(μ − λ)].
(c) The Pollaczek–Khintchine formula to compute L in M/G/1
queues requires the service discipline to be FCFS. TRUE or
FALSE?
4.14 Compare an M/E2 /1 and E2 /M/1 queue’s L values for the case when
both queues have the same ρ. The term E2 denotes an Erlang-2
distribution.
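A sketch of the comparison (both helpers are ours): the M/E2/1 mean comes from the Pollaczek–Khintchine formula, while the E2/M/1 mean uses L = ρ/(1 − α) after solving the fixed-point equation α = G̃(μ − μα) by bisection.

```python
def me21_L(lam, mu):
    """Mean number in system for M/E2/1 via Pollaczek-Khintchine
    (Erlang-2 service has squared coefficient of variation 1/2)."""
    rho = lam / mu
    return rho + rho**2 * (1 + 0.5) / (2 * (1 - rho))

def e2m1_L(lam, mu, tol=1e-12):
    """Mean number in system for E2/M/1 (Erlang-2 interarrivals).

    Solves alpha = Gtilde(mu - mu*alpha) with Gtilde(s) =
    (2*lam/(2*lam + s))**2 by bisection, then uses L = rho/(1 - alpha).
    """
    rho = lam / mu
    assert rho < 1
    g = lambda a: (2 * lam / (2 * lam + mu - mu * a)) ** 2 - a
    lo, hi = 0.0, 1.0 - 1e-9   # g > 0 at lo, g < 0 just below 1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if g(mid) > 0 else (lo, mid)
    return rho / (1 - (lo + hi) / 2)

lam = 0.8
print(me21_L(lam, 1.0), e2m1_L(lam, 1.0))  # E2/M/1 has the smaller L
```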
4.15 Which is better: an M/G/1 queue with PP(λ) and processing speed of
1 or one with PP(2λ) and processing speed 2? By processing speed,
we mean the amount of work the server can process per unit time,
so if there is x amount of work brought by an arrival and the pro-
cessing speed is c, then the service time would be x/c. Assume that
the amount of work has a finite mean and finite variance. Also, use
mean sojourn time to compare the two systems.
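The two systems can be compared directly with the Pollaczek–Khintchine formula; the sketch below (our own helper, with arbitrary hypothetical work moments) shows the outcome that the pooled faster server halves the mean sojourn time.

```python
def mg1_W_with_speed(lam, EX, EX2, c):
    """Mean sojourn time in an M/G/1 queue where each arrival brings work X
    (mean EX, second moment EX2) and the server works at speed c, so the
    service time is X/c."""
    ES, ES2 = EX / c, EX2 / c**2
    rho = lam * ES
    assert rho < 1
    return ES + lam * ES2 / (2 * (1 - rho))

lam, EX, EX2 = 0.4, 1.0, 3.0                 # hypothetical work moments
W1 = mg1_W_with_speed(lam, EX, EX2, 1)       # PP(lam), speed 1
W2 = mg1_W_with_speed(2 * lam, EX, EX2, 2)   # PP(2*lam), speed 2
print(W1, W2)  # W2 = W1/2: the faster shared server is better
```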
4.16 Consider an M/G/1 queue with mean service time 1/μ and variance
1/(3μ2 ). The interarrival times are exponentially distributed with
mean 1/λ and the service times are according to an Erlang distri-
bution. Let Wn and Sn , respectively, be the time in the system and
service time for the nth customer. Define a random variable called
slowdown for customer n as Wn /Sn . Compute the mean and variance
of slowdown for a customer arriving in steady state, that is, compute
E[Wn /Sn ] and Var[Wn /Sn ] as n → ∞. Assume stability. Note that the
term x-factor defined in the exercises of Chapter 1 is E[Wn ]/E[Sn ],
whereas the mean slowdown is E[Wn /Sn ].
4.17 Consider a G/M/2 queue which is stable. Obtain the cumulative
distribution function (CDF) of the time in the system for an arbi-
trary customer in steady state by first deriving its Laplace Stieltjes
transform (LST) and then inverting it. Also, based on it, derive
expressions for W and thereby L.
4.18 Consider an M/G/1 queue with mean service time 1/μ and second
moment of service time E[S2 ]. In addition, after each service comple-
tion, the server takes a vacation of random length V with probability
q or continues to serve other units in the queue with probability
p (clearly p = 1 − q). However, the server always takes a vacation
of length V as soon as the system is empty; at the end of it, the
server starts service, if there is any unit waiting, and otherwise it
waits for units to arrive. Let Ṽ(s) = E[e−sV ] be the LST of the vaca-
tion time distribution. Use MVA to derive the following results for
p0 , the long-run probability the system is empty, and L, the long-run
average number of units in the system:
p0 = [1 − (λ/μ) − qλE(V)] / [Ṽ(λ) + pλE(V)],

L = λ/μ + (λ²/2) [E[S²] + qE(V²) + (2q/μ)E(V)] / [1 − (λ/μ) − λqE(V)]
    + (λ²/2) pE(V²) / [Ṽ(λ) + pλE(V)].
Assume stability.
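The stated results are straightforward to evaluate numerically; the sketch below (the helper name and the exponential-vacation example are ours) simply plugs into the formulas above.

```python
def mg1_bernoulli_vacation(lam, mu, ES2, q, EV, EV2, V_lst_at_lam):
    """p0 and L for the M/G/1 queue of Exercise 4.18, plugging into the
    stated MVA results.

    ES2          -- second moment of the service time
    q            -- probability of a vacation after a service (p = 1 - q)
    EV, EV2      -- first two moments of the vacation length V
    V_lst_at_lam -- Vtilde(lam) = E[exp(-lam*V)]
    """
    p = 1 - q
    p0 = (1 - lam / mu - q * EV * lam) / (V_lst_at_lam + p * EV * lam)
    L = (lam / mu
         + (lam**2 / 2) * (ES2 + q * EV2 + (2 * q / mu) * EV)
           / (1 - lam / mu - lam * q * EV)
         + (lam**2 / 2) * p * EV2 / (V_lst_at_lam + p * EV * lam))
    return p0, L

# Hypothetical example: exp(1) service (ES2 = 2) and exp(5) vacations
theta = 5.0
p0, L = mg1_bernoulli_vacation(0.5, 1.0, 2.0, 0.5,
                               1 / theta, 2 / theta**2,
                               theta / (theta + 0.5))
print(p0, L)
```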
4.19 Let X(t) be the number of entities in the system in an M/M/1 queue
with processor sharing. The arrival rate is λ and the amount of ser-
vice requested is according to exp(μ) so that the traffic intensity is
ρ = λ/μ. Model {X(t), t ≥ 0} as a birth and death process and obtain
the steady-state distribution. Using that and the properties of the
lim_{t→∞} Fn (t, y1 , y2 , . . . , yn ),

where

and Ri (t) is the remaining service for the ith customer in the system.
From that result, show that

fn (y1 , y2 , . . . , yn ) = (1 − ρ) λ^n ∏_{i=1}^{n} [1 − G(yi )],
5 Multiclass Queues under Various Service Disciplines

In the models considered in the previous chapters, note that there was only
a single class of customers in the system. However, there are several
applications where customers can be differentiated into classes and each class has
its own characteristics. For example, consider a hospital emergency ward.
The patients can be classified into emergency, urgent, and regular cases with
varying arrival rates and service requirements. The question to ask is: In
what order should the emergency ward serve its patients? It is natural to
give highest priority to critical cases and serve them before others. But how
does that impact the quality of service for each class? We seek to address
such questions in this chapter. We begin by describing some introductory
remarks, then evaluate performance measures of various service disciplines,
and finally touch upon the notion of optimal policies to decide the order
of service.
5.1 Introduction
The scenario considered in this chapter is an abstract system into which
multiple classes of customers enter, get “served,” and depart. Why are
there multiple classes? First, the system might be naturally classified into
various classes because there are inherently different items requiring their
own performance measures (e.g., in a flexible manufacturing system, if a
machine produces three different types of parts, and it is important to mea-
sure the in-process inventory of each of them individually, then it makes
sense to model the system using three classes). Second, when the service
times are significantly different for different customers, then it might be ben-
eficial to classify the customers based on service times (e.g., in most grocery
stores there are special checkout lines for customers that have fewer items).
Third, due to physical reasons of where the customers arrive and wait, it
might be practical to classify customers (e.g., at fast-food restaurants cus-
tomers can be classified as drive-through and in-store depending on where
they arrive).
Next, given that there are multiple classes, how should the customers
be served? In particular, we need to determine a service discipline to serve
the customers. For that one typically considers measures such as cost, fair-
ness, performance, physical constraints, goodwill, customer satisfaction,
etc. Although we will touch upon the notion of optimal service disciplines
toward the end of this chapter, we will assume until that point that we have
a system with a given service discipline and evaluate the performance expe-
rienced by each class in that system. This is with the understanding that in
many systems one may be restricted to using a particular type of service pol-
icy. To get a better feel for that, in the next section we present examples of
several systems with multiple classes. Before proceeding with that, as a final
comment, it is worth mentioning that this work typically falls in the literature
under the umbrella of stochastic scheduling.
These examples are meant to motivate the reader but are by no means an
indication of the full variety of circumstances in which multiclass queueing systems
occur. While presenting various service disciplines, we will describe some
more examples to put things in better perspective. In fact, many of our fol-
lowing examples would fall under service systems (such as hospitals), which
is another application domain that has received a lot of attention recently.
Next, we present some results that are applicable to any multiclass queueing
system with “almost” any reasonable service discipline.
Li = λi Wi .
Of course we can aggregate across all K classes of customers and state the
usual Little’s law as
L = λW
where
L is the total number of customers in the system on average in steady state
across all classes
W is the sojourn time averaged over all customers (of all classes)
λ is the aggregate arrival rate
Multiclass Queues under Various Service Disciplines 245
Since the system is flow conserving and class switching is not permitted,
we have
λ = λ1 + λ2 + · · · + λK ,
L = L1 + L2 + · · · + LK .
Li = λi Wi .
Also, similar results can be derived for Liq and Wiq which, respectively,
denote the average number waiting in queue (not including customers at
servers) and average time spent waiting before service. In particular, for all
i ∈ [1, K]
Liq = λi Wiq ,
Wi = Wiq + 1/μi ,
Li = Liq + ρi
where ρi = λi /μi .
Note that ρi here is not the traffic intensity offered by class i when s > 1.
But we will mainly consider the case s = 1 for most of this chapter which
would result in ρi being the traffic intensity and will use the previous results.
In addition, L and W are the overall mean number of customers and mean
sojourn time averaged over all classes. Recall that L = L1 + L2 + · · · + LK
and if λ = λ1 + λ2 + · · · + λK , the net arrival rate, then W = L/λ. For the
G/G/1 case with multiple classes, more results can be derived for a spe-
cial class of scheduling policies called work-conserving disciplines which we
describe next.
FIGURE 5.1
W(t) vs. t for FCFS and LCFS-PR. (The figure shows sample paths of the workload W(t) with arrival times A1 , A2 , A3 , service requirements S1 , S2 , S3 , and the resulting departure times D1 , D2 , D3 under each discipline.)
decreases at unit rate. A sample path of the workload in the system at time t,
W(t), is described in Figure 5.1 with An denoting the time of the nth arrival
and Sn its service time requirement for n = 1, 2, 3. The figure gives depar-
ture times Dn for the nth arriving customer using two service disciplines,
FCFS and LCFS-PR (with preemptive resume) across all classes. Note that
although the departure times are different under the two service disciplines,
the workload W(t) is identical for all t. In other words, the workload W(t)
at time t is conserved. But that does not always happen. If the server were
to idle when W(t) > 0 or if we considered LCFS with preemptive repeat
(and Sn is not exponentially distributed), the workload would not have been
conserved.
In this and the next section, we only consider the class of service dis-
ciplines that result in the workload being conserved. The essence of work-
conserving disciplines is that the system workload at every instant of time
remains unchanged over all work-conserving service-scheduling disciplines.
Intuitively this means that the server never idles whenever there is work
to do, and the server does not do any wasteful work. The server continu-
ously serves customers if there are any in the system. For example, FCFS,
LCFS, and ROS are work conserving. Certain priority policies that we will
see later such as nonpreemptive and preemptive resume policies are also
work conserving. Further, disciplines such as processor sharing, shortest
expected processing time first, and round-robin policies are also work con-
serving when the switch-over times are zero. There are policies that are
nonwork conserving such as preemptive repeat (unless the service times
248 Analysis of Queues
are exponential) and preemptive identical (i.e., the exact service time is
repeated as opposed to preemptive repeat where the service time is resam-
pled). Usually when the server takes a vacation from service or if there
is a switch-over time (or setup time) during moving from classes, unless
those can be explicitly accounted for in the service times, those servers are
nonwork conserving.
Having described the concept of work conservation, next we present
some results for queues with such service disciplines. Note that across all
work-conserving service-scheduling disciplines, not only is W(t) identical
for all t, but also the busy period and idle time sample paths are identical.
Therefore, all the results that depend only on the busy period distribution
can be derived for all work-conserving disciplines. We present some of those
next. Consider the notation used earlier in this section as well as those in
Section 5.1.2. Define ρ, the overall traffic intensity, as

ρ = ∑_{i=1}^{K} ρi .

The queue is stable if

ρ < 1.
This result can be derived using the fact that ρ is just the overall arrival
rate times the average service time across all classes. It is also equal to the
ratio of the mean busy period to the mean busy period plus idle period. The
busy period and idle period are identical across all work-conserving disci-
plines, and we know the previous result works for the single class G/G/1
queue with FCFS. Therefore, the FCFS across all classes is essentially a single
class FCFS with traffic intensity ρ. This result works for all work-conserving
service-scheduling disciplines. In a similar manner, we can show that when
a G/G/1 system is work conserving, the probability that the system is
empty is 1 − ρ.
In the next section, we describe a few more results for multi-class G/G/1
queues with work-conserving scheduling disciplines. However, we require
an additional condition that rules out some of the work-conserving schemes.
For that we also need additional notation. Let Si be the random variable
denoting the service time of a class-i customer (this is different from Sn
we defined earlier which is the service time realization of the nth arriving
customer). It is also crucial to point out that since the service times for all
customers of particular class are IID, we use a generic random variable, such
as Si for class i. Using that notation, the second moment of the overall service
time is
E[S²] = (1/λ) ∑_{i=1}^{K} λi E[Si²].
Remark 10
We now explain the method used here. For that the first step is to ensure
that the stochastic system is stable. That can be checked fairly quickly; in
fact for the multiclass G/G/1 queue, all we need to check is if ρ < 1. If
the system is stable, the key idea is to observe the system at an arbitrary
time in steady state. Then the observation probabilities correspond to the
steady-state probabilities. For example, in a stable G/G/1 queue with mul-
tiple classes, the probability that an arbitrary observation in steady state
would result in an empty system is 1 − ρ, the steady-state probability of
having no customers in the system. Further, if the system is stationary and
ergodic, then the observation just needs to be made at an arbitrary time and
the results would also indicate time-averaged behavior. Thus, the expected
number in the system during this arbitrary observation in steady state is L
(the steady-state mean number in the system). This should not be confused
with customer-stationary process results such as Poisson arrivals see time
averages (PASTA). Note that here we are considering just one observation
and the observation does not correspond to a customer arrival or departure.
For the next step of the analysis, recall that our system is a G/G/1 queue
with K classes and a service discipline that is work conserving with at most
one partially completed service allowed. We can divide this system into two
subsystems, one the waiting area and the other the service area. The work-
load in the system at any time is the sum of the workload in the service area
and that in the waiting area. Note that all arriving customers go to the wait-
ing area (although they may spend zero time there), then go to the service
area and exit the system (without going back to the waiting area). With that
in mind, we present two results that are central to such special cases of work-
conserving disciplines. These results were not presented for the single class
case (but they are easily doable by letting the number of classes K = 1). We
present these results as two problems.
Problem 43
Consider a G/G/1 queue with K classes and a service discipline that is
work conserving with at most one partially completed service allowed.
Assume that the queue is stable, that is, ρ < 1. If the system is observed
at an arbitrary time in steady state, then show that for that observation,
the expected workload at the server (i.e., expected remaining service time)
is λE[S2 ]/2.
Solution
Let C be the class of the customer in service when the observation is made at
an arbitrary time in steady state with C = 0 implying there is no customer
in service. Also, let R be the remaining service time when this observa-
tion is made (again R is indeed the workload at the server). We need to
compute E[R].
Note that P{C = 0} = 1 − ρ and for i = 1, . . . , K, P{C = i} = ρi .
Although P{C = 0} has been mentioned earlier in this section, P{C = i}
deserves some explanation. First consider P{C = i|C > 0}, which would
be the probability that, if all the service times were arranged back to back
at the server and an arbitrary time point was picked, the point would fall
within a class-i service time. For this, consider a long stick made up of ni
small sticks of random lengths sampled from Si (the service times of class i)
for i = 1, . . . , K. If a point is selected uniformly on this stick, then the point
would be on a class-i small stick with probability ni E[Si ]/(∑j nj E[Sj ]).
The number of sticks ni corresponds to the number of class-i customers
sampled. As the sample size grows, ni /∑j nj → λi /∑j λj
E[R|C = i] = E[Si²] / (2E[Si ]).
Unconditioning, we get

E[R] = ∑_{i=0}^{K} E[R|C = i] P{C = i}

     = 0 + ∑_{i=1}^{K} [E[Si²]/(2E[Si ])] ρi

     = ∑_{i=1}^{K} [E[Si²]/(2E[Si ])] λi E[Si ]

     = ∑_{i=1}^{K} λi E[Si²]/2

     = λE[S²]/2,

since E[S²] = (1/λ) ∑_{i=1}^{K} λi E[Si²].
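The chain of equalities above can be verified numerically for any class data; the two-class moments below are hypothetical.

```python
# Numerical check of E[R] = lam * E[S^2] / 2 for a multiclass queue
# (hypothetical two-class data: rates and service-time moments)
lams = [0.2, 0.3]
ES = [1.0, 0.5]     # E[S_i]
ES2 = [2.0, 0.5]    # E[S_i^2]

rho = [l * s for l, s in zip(lams, ES)]
# condition on the class in service, then uncondition over C
ER = sum(r * m2 / (2 * m1) for r, m1, m2 in zip(rho, ES, ES2))

lam = sum(lams)
ES2_agg = sum(l * m2 for l, m2 in zip(lams, ES2)) / lam
print(ER, lam * ES2_agg / 2)  # the two quantities agree
```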
In the previous problem, we did not use all the power of work conserva-
tion. In particular, the workload at some time t, W(t) is conserved over all
work-conserving disciplines. The next problem discusses the second subsys-
tem, namely the waiting area, and uses the fact that W(t) does not depend
on the service discipline.
Problem 44
Consider a stable G/G/1 queue with K classes and a service discipline that is
work conserving with at most one partially completed service allowed. Let
Wiq be the average waiting time in the queue (not including service) for a
class-i customer. Let the system be observed at an arbitrary time in steady
state. Show that the expected workload in the waiting area (not including
any at the server) at that observation is
∑_{i=1}^{K} ρi Wiq .
This result only tells us the average workload in the waiting area subsystem
for any given work-conserving discipline where at most one customer can be
at the server. However, we are yet to show that this quantity when computed
for any such service discipline would be a constant. For that consider the
basic work conservation result that states that the amount of work in the
system at a given time point (here we consider an arbitrary time point in
steady state) is the same irrespective of the service discipline, as long as it
is work conserving. Naturally the expected value of the amount of work is
λE[S²]/2 + ∑_{i=1}^{K} ρi Wiq    (5.1)
and is conserved across disciplines. However, the term λE[S2 ]/2 remains a
constant across disciplines. Therefore
∑_{i=1}^{K} ρi Wiq

is also a constant across all such disciplines.
(Section 5.3) and knowledge of service times (Section 5.4). In this section
where we consider classification based on customer type, we also assume
that there is no switch-over time from customer to customer or class to
class. In addition, we assume that the arrival and service times are
nonanticipative; specifically, the realized service time is known only upon service
completion.
Before describing the model and analysis technique for such service dis-
ciplines, we describe some physical examples of multiclass queues with clas-
sification based on types. The emergency ward situation that we described
earlier and a case study we will address later is a canonical example of such
a system. In particular, depending on the urgency of the patient, they can
be classified as critical, urgent, or regular. Their needs in terms of queue
performance are significantly different. Although in this example the service
performed could be different across classes, the next example is one where the services are similar. Many
fast-food restaurants nowadays accept online orders. Typically those cus-
tomers are classified differently compared to the ones that stand in line and
order physically. The needs of the online orders in terms of sojourn times
are certainly different but the service times are no different from the other
class. In addition there are many examples in computer systems, network-
ing, transportation, and manufacturing where the analysis to follow can be
applied.
Although there are several applications, we present a fairly generic
model for the system. Consider a special case of the G/G/1 queue with
K classes where the arrival process is PP(λi ) for class i (i = 1, 2, . . . , K).
The service times are IID with mean E[Si ] = 1/μi , second moment E[S2i ],
CDF Gi (·), and ρi = λi /μi for class i (for i = 1, 2, . . . , K). There is a
single server with infinite waiting room. From an analysis standpoint, it
does not matter whether there is a queue for each class or whether all
customers are clubbed into one class as long as the class of each cus-
tomer and the order of arrival are known to the server. We present
results for three work-conserving disciplines: FCFS, nonpreemptive pri-
ority, and preemptive resume priority. Note that the case of preemptive
repeat (identical or random) is not considered but is available in the liter-
ature. Other schemes such as round-robin will be dealt with in subsequent
sections.
G(t) = P(S ≤ t) = (1/λ) ∑_{i=1}^{K} λi Gi (t),

E[S] = 1/μ = (1/λ) ∑_{i=1}^{K} λi E[Si ],

E[S²] = σ² + 1/μ² = (1/λ) ∑_{i=1}^{K} λi E[Si²],

ρ = λE[S].
Note that the average aggregate service rate is μ and the variance of the
aggregate service time is σ2 . Also, ρ = ρ1 + · · · + ρK .
Assume that the system is stable, that is, ρ < 1. Note that the system is
identical to that of a single class M/G/1 queue with PP(λ) arrivals and service
times with CDF G(t), mean 1/μ, and variance σ2 . Then using Pollaczek–
Khintchine formula (see Equation 4.6) for a single class M/G/1 queue, we
get the following results:
L = ρ + (1/2) λ²E[S²]/(1 − ρ),

W = L/λ,

Wq = W − 1/μ,

Lq = (1/2) λ²E[S²]/(1 − ρ).
Now we need to derive the performance measures for each class i for i =
1, . . . , K. The key result that enables us to derive performance measures for
each class is

Wiq = Wq = (1/2) λE[S²]/(1 − ρ),

since under FCFS the waiting time before service does not depend on the
class of the customer. Using that, for each class i,

Liq = λi Wiq ,
Li = ρi + Liq ,
Wi = Wiq + 1/μi .

Equivalently,

Li = (λi /λ)L − λi /μ + ρi .
between classes and priorities simple, we let class-1 to be the highest pri-
ority and class K the lowest (we will see later how to do this optimally).
Also, service discipline within a class is FCFS. Therefore, the server upon
a service completion always starts serving a customer of the highest class
among those waiting for service, and the first customer that arrived within
that class. Of course, if there are no customers waiting, the server selects the
first customer that arrives subsequently. However, it is important to clar-
ify that the server completes serving a customer before considering whom
to serve next. The meaning of nonpreemptive priority is that a customer in
service does not get preempted (or interrupted) by another customer of
higher priority (essentially, priority applies only in the waiting room).
Recall that we are considering an M/G/1 queue with K classes where the
arrival process is PP(λi ) for class i (i = 1, 2, . . . , K); the service times are IID
with mean E[Si ] = 1/μi , second moment E[S2i ], CDF Gi (·), and ρi = λi /μi
for class i (for i = 1, 2, . . . , K). There is a single server with infinite waiting
room. From an analysis standpoint, it does not matter whether there is a
queue for each class or whether all customers are clubbed into one class as
long as the class of each customer and the order of arrival are known to the
server. However, from a practical standpoint it is easiest to create at least a
“virtual” queue for each class and pick from the head of the nonempty line
with highest priority. Further, we assume that the system is stable, that is,
ρ < 1. For the rest of this section, we will use the terms class and priority
interchangeably. With that said, we are ready to analyze the system and
obtain steady-state performance measures.
To analyze the system, we consider a class-i customer that arrives into
the system in steady state for some i ∈ {1, . . . , K}. We reset the clock and call
that time as 0. Another way to do this is to assume the system is stationary at
time 0 and a customer of class i arrives. Consider the random variables defined
in Table 5.1 (albeit with some abuse of notation). It is crucial to note that
the terms in Table 5.1 would perhaps have other meanings in the rest of
this book.
When a customer of class i arrives in the stationary queue at time 0, this
customer first waits for anyone at the server to be served (i.e., for a random
time U). Then the customer also waits for all customers that are in the system
TABLE 5.1
Random Variables Used for Nonpreemptive and Preemptive Resume Cases
Wi^q   Waiting time in the queue (not including service) for customer of class i
       (note that this is a random variable and not the expected value)
U      Remaining service time of the customer in service (this is zero if the server is idle)
Rj     Time to serve all customers of type j who are waiting in the queue at time 0 (for 1 ≤ j ≤ i)
Tj     Time to serve all customers of type j who arrive during the interval [0, Wi^q ] (for 1 ≤ j < i)
of equal or higher priority at time 0 (i.e., for a random time ∑_{j=1}^{i} Rj ). Note
that during the time this customer waits in the system to begin service (i.e.,
Wi^q ), there could be other customers of higher priority that may have arrived
and been served before this customer. Thus, this customer waits a further
∑_{j=1}^{i−1} Tj (with the understanding that the term is zero if i = 1) before
service begins. Therefore, we have

Wi^q = U + ∑_{j=1}^{i} Rj + ∑_{j=1}^{i−1} Tj .
E[Wi^q ] = E[U] + ∑_{j=1}^{i} E[Rj ] + ∑_{j=1}^{i−1} E[Tj ].    (5.2)
We need to derive expressions for E[U], E[Rj ], and E[Tj ] which we do next.
Problem 45
Derive the following results (for the notations described in Table 5.1):
E[U] = (λ/2) E[S²],

E[Rj ] = ρj E[Wj^q ],

E[Tj ] = ρj E[Wi^q ],

where ρi = λi E[Si ].
Solution
Recall from Problem 43 that if a stable G/G/1 queue with K classes is
observed at an arbitrary time in steady state, then the expected remaining
service time is λE[S2 ]/2. Of course this requires a service discipline that
is work conserving with at most one partially completed service allowed,
which is true here. However, since the arrivals are Poisson, due to PASTA
and M/G/1 system being ergodic, an arriving class-i customer in steady
state would observe an expected remaining service time of λE[S2 ]/2. In
other words, the arrival-point probability is the same as the steady-state
probability. Thus, from the definition of U we have
E[U] = (λ/2) E[S²].
Once again because of PASTA this arriving customer at time 0 will see Ljq
customers of class j waiting for service to begin. Therefore, by the definition
of Rj (time to serve all customers of type j waiting in the queue at time 0), we
have E[Rj ] = E[E[Rj |Nj ]] = E[Nj /μj ] = Ljq /μj , where Nj is a random variable
denoting the number of class-j customers in steady state waiting for service
to begin. Next, using Little’s law Ljq = λj Wjq , we have

E[Rj ] = (1/μj ) λj Wjq = ρj E[Wj^q ].
Now, to compute E[Tj ], note from the definition that it is the time to serve
all customers of type j who arrive during the interval [0, Wi^q ] for any j < i.
Clearly, the expected number of type-j arrivals in time t is λj t because the
arrivals are according to a Poisson process, and each of those arrivals requires
1/μj service time on average. Hence, we have E[Tj |Wi^q ] = λj Wi^q /μj . Taking
expectations we get

E[Tj ] = ρj E[Wi^q ].
Substituting these expressions into Equation 5.2, we get

E[Wi^q ] = (λ/2) E[S²] + ∑_{j=1}^{i} ρj E[Wj^q ] + E[Wi^q ] ∑_{j=1}^{i−1} ρj .
We rewrite this equation using the notation that we have used earlier for the
average waiting time before service, that is, Wiq = E[Wi^q ]. Thus, we have for
all i ∈ {1, . . . , K}

Wiq = (λ/2) E[S²] + ∑_{j=1}^{i} ρj Wjq + Wiq ∑_{j=1}^{i−1} ρj .
Solving these equations one class at a time (starting with i = 1) yields, for
all i ∈ {1, . . . , K},

Wiq = [(1/2) ∑_{j=1}^{K} λj E[Sj²]] / [(1 − αi )(1 − αi−1 )],

where αi = ρ1 + ρ2 + · · · + ρi with α0 = 0.
Now, using Wiq we can derive the other performance measures as follows
for all i ∈ {1, . . . , K}:
Liq = λi Wiq ,
Wi = Wiq + E[Si ],
Li = Liq + ρi .
L = L1 + L2 + · · · + LK ,
W = L/λ,
Wq = W − 1/μ,
Lq = λWq .
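These per-class formulas can be sketched as follows (the helper name is ours; index 0 of the input lists is the highest-priority class). As a check, the weighted sum ∑ρi Wiq should match its FCFS value, as work conservation requires.

```python
def mg1_nonpreemptive_priority(lams, ES, ES2):
    """Mean queueing delays Wiq for a multiclass M/G/1 queue under
    nonpreemptive priority, using
    Wiq = (lam*E[S^2]/2) / ((1 - alpha_i)(1 - alpha_{i-1}))."""
    rho = [l * s for l, s in zip(lams, ES)]
    if sum(rho) >= 1:
        raise ValueError("unstable")
    EU = sum(l * m2 for l, m2 in zip(lams, ES2)) / 2  # lam * E[S^2] / 2
    Wq, alpha = [], 0.0
    for r in rho:
        a_prev, alpha = alpha, alpha + r
        Wq.append(EU / ((1 - alpha) * (1 - a_prev)))
    return Wq

# Two classes with identical exp(1) service, lambda = 0.3 and 0.2
Wq = mg1_nonpreemptive_priority([0.3, 0.2], [1.0, 1.0], [2.0, 2.0])
print(Wq)  # the higher-priority class waits less
```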
So far we have assumed that we are given which class should get the
highest priority, second highest, etc. This may be obvious in some settings
such as a hospital emergency ward. However, in other settings such as a
manufacturing system we may need to determine an optimal way of assign-
ing priorities. To do that, consider there are K classes of customers and it
costs the server Cj per unit time a customer of class j spends in the system
(this can be thought of as the holding cost for class j customer). It turns out
(we will show that in a problem next) if the objective is to minimize the total
expected cost per unit time in the long run, then the optimal priority assign-
ment is to give class i higher priority than class j if Ci μi > Cj μj (for all i, j
such that i ≠ j). In other words, sort the classes in the decreasing order of the
product Ci μi and assign first priority to the largest Ci μi and the last priority
to the smallest Ci μi over all K classes. This is known as the Cμ rule. Also
note that if all the Ci values were equal, then this policy reduces to “serve the
customer with the shortest expected processing time first.” We derive the
optimality of the Cμ rule next.
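As a quick illustration (a sketch with made-up numbers), the Cμ ordering is simply a sort on the products C_i μ_i:

```python
def cmu_order(C, mu):
    """Indices sorted by decreasing C_i * mu_i; the first index
    gets the highest priority (the C-mu rule)."""
    return sorted(range(len(C)), key=lambda i: -C[i] * mu[i])

# The products C*mu are 2.0, 5.0, and 3.0, so class index 1 comes first:
order = cmu_order([2.0, 1.0, 3.0], [1.0, 5.0, 1.0])  # [1, 2, 0]
```

Note that with equal C_i values this reduces to sorting by μ_i, that is, shortest expected processing time first, as remarked above.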
Problem 46
Consider an M/G/1 queue with K classes with notations described earlier in
this section and service discipline being nonpreemptive priority. Further, it
costs the server Cj per unit time a customer of class j spends in the system
and the objective is to minimize the total expected cost per unit time in the
long run. Show that the optimal priority assignment is to give class i higher
priority than class j if Ci μi > Cj μj (provided i ≠ j).
Solution
Let TC be the average cost incurred per unit time if the priorities are
1, 2, . . . , K from highest to lowest for the system under consideration. Since
a cost Cn is incurred during the sojourn of a class n customer, the total cost
incurred per class n customer on average in steady state is Cn Wn (the reason
for not using i or j but n is that i and j are reserved for something else). Also,
class n customers arrive at rate λn resulting in an average cost per unit time
due to class n customers being λn Cn Wn . Thus, we have
$$TC = \sum_{n=1}^{K} \lambda_n C_n W_n.$$

Since W_n = W_{nq} + 1/μ_n and, from the expression for W_{nq},

$$\frac{\lambda_n}{(1-\alpha_n)(1-\alpha_{n-1})} = \frac{\mu_n}{1-\alpha_n} - \frac{\mu_n}{1-\alpha_{n-1}},$$

we can write

$$\lambda_n C_n W_n = \frac{1}{2} C_n \left(\sum_{k=1}^{K} \lambda_k E[S_k^2]\right)\left(\frac{\mu_n}{1-\alpha_n} - \frac{\mu_n}{1-\alpha_{n-1}}\right) + \frac{\lambda_n C_n}{\mu_n}.$$
Next, note that while computing TC − TC^e, where TC^e is the corresponding total cost when the priorities of the two adjacent classes i and j = i + 1 are exchanged, all the terms except the ith and jth terms in Σ_n λ_n C_n W_n would be identical and cancel out. Using these results,
$$\frac{TC - TC^e}{\frac{1}{2}\sum_{n=1}^{K}\lambda_n E[S_n^2]} = \sum_{r=1}^{K}\left(\frac{C_r\mu_r}{1-\alpha_r} - \frac{C_r\mu_r}{1-\alpha_{r-1}}\right) - \sum_{r=1}^{i-1}\left(\frac{C_r\mu_r}{1-\alpha_r} - \frac{C_r\mu_r}{1-\alpha_{r-1}}\right) - \frac{C_j\mu_j}{1-\alpha_{i-1}-\rho_j} - \frac{C_i\mu_i}{1-\alpha_{i-1}-\rho_j-\rho_i} - \sum_{r=i+2}^{K}\left(\frac{C_r\mu_r}{1-\alpha_r} - \frac{C_r\mu_r}{1-\alpha_{r-1}}\right) + \frac{C_j\mu_j}{1-\alpha_{i-1}} + \frac{C_i\mu_i}{1-\alpha_{i-1}-\rho_j}$$

$$= \frac{C_i\mu_i}{1-\alpha_i} + \frac{C_j\mu_j}{1-\alpha_i-\rho_j} - \frac{C_i\mu_i}{1-\alpha_{i-1}} - \frac{C_j\mu_j}{1-\alpha_i} - \frac{C_j\mu_j}{1-\alpha_{i-1}-\rho_j} - \frac{C_i\mu_i}{1-\alpha_{i-1}-\rho_j-\rho_i} + \frac{C_j\mu_j}{1-\alpha_{i-1}} + \frac{C_i\mu_i}{1-\alpha_{i-1}-\rho_j}$$

$$= (C_i\mu_i - C_j\mu_j)\left(\frac{1}{1-\alpha_i} - \frac{1}{1-\alpha_{i-1}} + \frac{1}{1-\alpha_{i-1}-\rho_j} - \frac{1}{1-\alpha_i-\rho_j}\right)$$

$$= \frac{\rho_i\,\rho_j\,(\alpha_{i-1}+\alpha_j-2)(C_i\mu_i - C_j\mu_j)}{(1-\alpha_i)(1-\alpha_{i-1})(1-\alpha_{i-1}-\rho_j)(1-\alpha_i-\rho_j)}.$$
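Since every factor in the denominator is positive and (α_{i−1} + α_j − 2) < 0, the sign of TC − TC^e is opposite to that of C_iμ_i − C_jμ_j, so the interchange argument can also be checked numerically. Below is a small sketch (parameter values are arbitrary) that evaluates TC for two priority orderings of a two-class example:

```python
import numpy as np

def total_cost(lam, ES, ES2, C, order):
    """TC = sum_n lambda_n C_n W_n for a nonpreemptive priority M/G/1
    queue when classes are served in the given priority order."""
    lam, ES, ES2, C = (np.asarray(a, float)[list(order)]
                       for a in (lam, ES, ES2, C))
    rho = lam * ES
    alpha = np.concatenate(([0.0], np.cumsum(rho)))
    Wq = 0.5 * np.sum(lam * ES2) / ((1 - alpha[1:]) * (1 - alpha[:-1]))
    return float(np.sum(lam * C * (Wq + ES)))

# Class 0 has the larger C*mu (4.0 vs 0.8), so serving it first should cost less:
lam, ES, ES2, C = [0.3, 0.3], [0.5, 1.0], [0.5, 2.0], [2.0, 0.8]
tc_good = total_cost(lam, ES, ES2, C, [0, 1])
tc_bad = total_cost(lam, ES, ES2, C, [1, 0])
assert tc_good < tc_bad
```

Swapping the priorities of the two classes raises the long-run cost rate, exactly as the sign analysis predicts.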
and service begins for this new higher priority customer. When the pre-
empted customer returns to service, service resumes from where it was
preempted. This is a work-conserving discipline (however, if the service
has to start from the beginning which is called preemptive repeat, then it
is not work conserving because the server wasted some time serving). As
we described earlier, if the service times are exponential, due to memo-
ryless property, preemptive resume and preemptive repeat are the same.
However, there is another case called preemptive identical which requires
that the service that was interrupted is repeated with an identical service
time (in the preemptive repeat mechanism, the service time is sampled again
from a distribution). We do not consider those here and only concentrate on
preemptive resume priority.
All the other preliminary materials for the nonpreemptive case also hold
here for the preemptive resume policy (namely, multiclass M/G/1 with class-
1 being highest priority and class K lowest). Also, customers within a class
will be served according to FCFS policy. But a server will serve a customer
of a particular class only when there is no customer of higher priority in the
system. Upon arrival, a customer of class i can preempt a customer of class j in service if j > i. Also, the total service time is unaffected by the interruptions, if any. Assume that the system is stable, that is, ρ < 1. Note that
there could be more than one customer with unfinished (but started) service.
Therefore, the results of Section 5.1.4 cannot be applied here. However, the
service discipline is still work conserving and we will take advantage of that
in our analysis. Further, note that the sojourn time of customers of class i is
unaffected by customers of class j if j > i.
With that thought we proceed with our analysis. We begin by consid-
ering class-1 customers. Clearly, as far as class-1 customers are concerned,
they can be oblivious of the lower class customers. Therefore, class-1 cus-
tomers effectively face a standard single class M/G/1 system with arrival rate
λ1 and service time distribution G1 (·). Class-1 customers get served upon
arrival if there are no other class-1 customers in the system, and they will
wait only for other class-1 customers for their service to begin. Thus, from
the Pollaczek–Khintchine formula in Equation 4.6, we get the sojourn time of
class-1 customers as
$$W_1 = \frac{1}{\mu_1} + \frac{\lambda_1 E[S_1^2]}{2(1-\rho_1)}.$$
workload in the system with the first i classes alone. In addition, due to
PASTA, W(i) will also be the average workload in the preemptive resume
M/G/1 queue as seen by an arriving class-i customer. This in turn is also
equal to the average workload due to the first i classes in the K-class system.
Now consider an M/G/1 queue with all K classes where a customer of class i
is about to enter in steady state. Then the sojourn time in the system for this
customer depends only on the customers of classes 1 to i in the system upon
arrival. Therefore, Wi can be computed by solving
$$W_i = W(i) + \frac{1}{\mu_i} + \alpha_{i-1} W_i$$
as the mean sojourn time is equal to the expected workload upon arrival
from all customers of classes 1 to i plus the mean service time of this class
i customer plus the average service time of all the customers of classes 1 to
i − 1 that arrived during the sojourn time. Substituting the expression for
W(i) and rearranging terms, we have
$$W_i = \frac{1}{\mu_i(1-\alpha_{i-1})} + \frac{\sum_{j=1}^{i} \lambda_j E[S_j^2]}{2(1-\alpha_i)(1-\alpha_{i-1})}.$$
From W_i the remaining class-i measures follow:

$$W_{iq} = W_i - E[S_i], \quad L_i = \lambda_i W_i, \quad L_{iq} = L_i - \rho_i.$$
The results for the individual classes can be used to obtain aggregate
performance measures as follows:
$$L = L_1 + L_2 + \cdots + L_K, \quad W = \frac{L}{\lambda}, \quad W_q = W - \frac{1}{\mu}, \quad L_q = \lambda W_q.$$
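The preemptive resume sojourn-time formula can likewise be evaluated directly; a minimal sketch (the function name is ours):

```python
import numpy as np

def preemptive_resume_sojourn(lam, ES, ES2):
    """Sojourn times W_i for a multiclass M/G/1 queue under preemptive
    resume priority; index 0 is the highest priority class."""
    lam, ES, ES2 = (np.asarray(a, float) for a in (lam, ES, ES2))
    rho = lam * ES
    alpha = np.concatenate(([0.0], np.cumsum(rho)))
    assert alpha[-1] < 1, "system must be stable (rho < 1)"
    cum2 = np.cumsum(lam * ES2)   # sum_{j<=i} lambda_j E[S_j^2]
    return ES / (1 - alpha[:-1]) + cum2 / (2 * (1 - alpha[1:]) * (1 - alpha[:-1]))

# For the highest class this reduces to the single-class M/G/1 sojourn time:
Wi = preemptive_resume_sojourn([0.5, 0.2], [1.0, 1.0], [2.0, 2.0])
```

Here W_1 = 1 + (0.5)(2)/(2)(0.5) = 2, matching the Pollaczek–Khintchine value since class 1 is oblivious to the lower class.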
variations to the models we have seen in this section. After that, we will
move on to other policies in subsequent sections.
adieu, Jenna went about gathering all the information she could as well as
collected the necessary data for her analysis. The first thing she found out
was that the billboard wait times were updated every 15 min (through an
automatic RSS feed) and the displayed value was the average wait time over
the past 2 h.
Jenna wanted to know how they computed wait times and she was told
that the wait time for a patient is the time from when the patient checks in
until when the patient is called by a clinical professional. Jenna immediately
realized that it did not include the time the clinical professional spends see-
ing the patient. So it represented the waiting time in the queue and not the
sojourn time. Jenna found out that the entire time spent by patients in the ER
could even be several hours if a complicated surgery needs to be performed.
However, she was glad that she did not have to focus on those issues. But
what was concerning for her was whether someone with a heart failure had
to wait on average for 14 min to see a clinical professional. She was reassured
that when patients arrive at the emergency ward, they are immediately seen
by a triage nurse. The nurse would determine the severity of the patient’s
illness or injury to determine if they would have to be seen immediately
by a clinical professional. Priority was given to patients with true emergen-
cies (this does not include life-threatening cases, pregnancies, etc., where
patients are not seen by a clinical professional but are directly admitted to
the hospital).
Upon speaking with the triage nurse, Jenna found out that there are
essentially two classes of patients. One class is the set of patients with true
emergencies and the second class is the remaining set of patients. Within a
class, patients were served according to FCFS; however, the patients with
true emergencies were given preemptive priority over those that did not
have a true emergency. The triage nurse also whispered to Jenna that she
would much rather have three classes instead of two. It was hard to talk a
lot to the triage nurse because she was always busy. But Jenna managed to
also find out that there are always two doctors (i.e., clinical professionals) at
the emergency ward, and like Jenna saw during her Master’s thesis days, the
most crowded times were early evenings. Jenna next stopped at the informa-
tion technology office to get historical data of the patients. A quick analysis
revealed that patients arrived according to a Poisson process and the time
a doctor took to see a patient was exponentially distributed. Interestingly, the distribution of the time a doctor spent with a patient did not differ between the two classes of patients.
Jenna looked at her textbook for her course on waiting line models. She
distinctly remembers studying preemptive queues. However, when she saw
the book, she did not see anything about two-server systems (note that since
the ward has two doctors, that would be the case here). Further, the book
only had results for mean wait times and not distributions, which is some-
thing she thought was needed for her analysis. Nonetheless, she decided to
go ahead and read that chapter carefully so that she gets ideas to model the
system and analyze it. Jenna also checked the simulation software packages
she was familiar with and none had an in-built preemptive priority option
(all of them only had nonpreemptive). At this time Jenna realized that her
only option was to model the system from scratch. She wondered if the two-
server system was even work conserving. But she did feel there was hope
since the interarrival times and service times were exponentially distributed.
Also, the service times were class independent. “How hard can that be to
analyze,” she thought to herself.
Jenna started to model the system. Based on her data she wrote down
that class-1 patients (with true emergencies) arrived to the emergency ward
according to PP(λ1 ) and class-2 patients arrived according to PP(λ2 ). The
service time for either class is exp(μ). There are two servers that use FCFS
within a class and class-1 has preemptive resume priority over class-2. In
the event that there are two class-2 patients being served when a class-1
arrives, Jenna assumed that with equal probability one of the class-2 patients
was selected to be preempted. Jenna first started to model the system as
a CTMC {(X1 (t), X2 (t)), t ≥ 0} where for i = 1, 2, Xi (t) is the number of
class-i patients in the system. Then Jenna realized that there must be an
easier way to model the system. She recalled how the M/G/1 queue with
preemptive priority was modeled in her textbook. An idea immediately
dawned on her.
$$W_{1q} = \frac{\rho_1^2}{1-\rho_1^2}\,\frac{1}{\mu}$$
$$E\left[e^{-sY_q}\right] = p_0 + p_1 + \sum_{i=2}^{\infty} p_i \left(\frac{2\mu}{2\mu+s}\right)^{i-1}$$
i=2
where pi is the steady-state probability that the M/M/2 queue has i in the
system. From the M/M/2 analysis, p0 = (1 − ρ1 )/(1 + ρ1 ) and for i ≥ 1,
p_i = 2p_0 ρ_1^i. Then she obtained
$$E\left[e^{-sY_q}\right] = p_0\left(1 + 2\rho_1 + \frac{2\rho_1\lambda_1}{2\mu-\lambda_1+s}\right)$$
which says that more than 1% of the patients with a true emergency would
wait over 30 min to see a clinical professional. Jenna thought to herself that
this could mean quite a few patients a month that would wait over 30 min,
and it was not surprising to her that many would be blogging about it.
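The LST above corresponds to an atom at zero plus an exponential tail, so P(Y_q > t) can be written in closed form. A sketch of that computation follows; the numerical rates are placeholders of our own, since the chapter's full data set is not reproduced in this excerpt:

```python
import math

def class1_wait_tail(lam1, mu, t):
    """P(Y_q > t) obtained by inverting the LST
    p0*(1 + 2*rho1 + 2*rho1*lam1/(2*mu - lam1 + s)):
    an atom at 0 plus an exp(2*mu - lam1) distributed tail."""
    rho1 = lam1 / (2 * mu)
    p0 = (1 - rho1) / (1 + rho1)
    mass = 2 * p0 * rho1 * lam1 / (2 * mu - lam1)   # P(Y_q > 0)
    return mass * math.exp(-(2 * mu - lam1) * t)

# Illustrative rates only: lam1 in patients/min, mean service 1/mu = 17.5 min.
p = class1_wait_tail(lam1=0.02, mu=1 / 17.5, t=30.0)
```

Evaluating the tail at t = 30 min is exactly the kind of calculation behind the "more than 1% wait over 30 min" observation, though the percentage depends on the actual arrival rate.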
urgent care facilities. In other words, Jenna felt that it is important to have a
low overall average wait time so that based on the billboard display, some
nonemergency patients would be lured to the emergency ward as opposed
to visiting an urgent care facility. So she proceeded to compute the average
time spent by the stable patients (i.e., ones without a true emergency) waiting
before they see a clinical professional.
To model that, Jenna let X(t) be the total number of patients in the system
at time t; these include those that do and do not have a true emergency.
The “system” in the previous sentence includes patients that are waiting to
see a clinical professional as well as those that are being seen by a clinical
professional. Since the service times for both types of patients are identically
distributed exp(μ), X(t) would be stochastically identical to the number of
customers in an M/M/2 queue with FCFS service, PP(λ1 + λ2 ) arrivals, and
exp(μ) service. For such an M/M/2 queue with FCFS service, the steady-state average number in the system is (λ1 + λ2)/μ + (ρ²/(1 − ρ²))((λ1 + λ2)/μ), where ρ = (λ1 + λ2)/(2μ). Thus, it is also equal to the expected value of the
total number of patients in the system in steady state, L1 + L2 , where Li for
i = 1, 2 is the mean number of class-i patients in the system in steady state.
Therefore,
$$L_1 + L_2 = \frac{\lambda_1+\lambda_2}{\mu} + \frac{\rho^2}{1-\rho^2}\,\frac{\lambda_1+\lambda_2}{\mu}.$$
where ρ1 = λ1 /2μ using the W1q result discussed earlier and the fact that
L1 = λ1 /μ + λ1 W1q . Thus, she calculated L2 as
$$L_2 = \frac{2\rho}{1-\rho^2} - \frac{2\rho_1}{1-\rho_1^2}.$$
Using that she wrote down, W2q , the average time spent by class-2 patients
waiting as
$$W_{2q} = \frac{L_2 - \rho_2}{\lambda_2}$$
where ρ2 = λ2 /μ. Based on Jenna’s data, the arrival rate was about 2.5 class-
2 patients per hour (λ2 = 0.0417 per minute) and the average service time
of 17.5 min (1/μ = 17.5 min) resulting in ρ2 = 0.3646. Plugging into this
formula for W2q , Jenna computed W2q = 31.2759 min. Although this would
mean that UTH cannot guarantee an “average” wait time of less than 30 min
for stable patients, across all patients the average wait time was a little over
18 min which sounded reasonable. Jenna felt it was important to clarify that
W2q included time spent while being preempted by a class-1 patient, and not
just the time to see a clinical professional for the first time.
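Jenna's calculation chains together the M/M/2 aggregate result and the class-1 measures; a sketch of that chain is below (λ1 is a placeholder value of our own, since the class-1 arrival rate is not stated in this excerpt):

```python
def class2_wait(lam1, lam2, mu):
    """W_2q from L2 = 2*rho/(1 - rho^2) - 2*rho1/(1 - rho1^2) and
    W_2q = (L2 - lam2/mu)/lam2, following the chapter's M/M/2 argument."""
    rho = (lam1 + lam2) / (2 * mu)     # total load on the two servers
    rho1 = lam1 / (2 * mu)             # class-1 load
    L2 = 2 * rho / (1 - rho**2) - 2 * rho1 / (1 - rho1**2)
    return (L2 - lam2 / mu) / lam2     # subtract class-2 patients in service

# lam2 = 2.5/hour = 0.0417/min and 1/mu = 17.5 min as in the text;
# lam1 = 0.02/min is a placeholder.
W2q = class2_wait(lam1=0.02, lam2=0.0417, mu=1 / 17.5)
```

With the true λ1 this reproduces the 31.28 min figure reported in the text.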
FIGURE 5.2
Schematic of a polling system with a single server: K queues fed by Poisson arrival streams PP(λ1), PP(λ2), PP(λ3), . . . , PP(λK), all attended by one server.
The commonly studied service disciplines are: (1) exhaustive discipline where the server stays at a queue and leaves it only when it becomes empty; (2) gated discipline where the
server serves all the customers that arrived to that queue prior to the server
arrival in that cycle; (3) limited discipline where a maximum fixed number
can be served during each poll. Naturally, if the switch-over times are large
(such as a setup time to manufacture a class of jobs), then one may favor
the exhaustive discipline. Whereas if they are small, then it makes sense to
consider a limited discipline (such as even a maximum of one customer per
visit to the queue). The gated discipline falls in between. We assume that the
server can see the contents of a queue only upon arrival. Hence, even if a
queue is empty, the server would still spend the time to switch in and out of
that queue. We also would like to point out that customers in a single queue
(i.e., of a single class) are served according to FCFS.
In addition to this notation, we also require that not all mean switch-over
times can be zero. That provides us with a result quite unique to polling
systems which we describe next. Let ρ = ρ1 + ρ2 + · · · + ρK with K ≥ 2. If
the system is stable (we will describe the conditions later), then the long-run
fraction of time the server is attending to customers is ρ. Likewise, (1 − ρ)
is the fraction of time spent switching in the long run. This is a relatively
straightforward observation since the server is never at a queue idling when
that queue is empty. However, to prove it rigorously one needs to consider
times when the system regenerates and use results from regenerative pro-
cesses to show that. Using that we can state the following result: if E[C] is the
expected time to complete a cycle (including service as well as switch-over
times) in steady state, then
$$\frac{\sum_{i=1}^{K} E[D_i]}{E[C]} = 1 - \rho. \qquad (5.3)$$
Although one is tempted to let all the switch-over times go to zero by taking limits, one has to be careful because of the close tie with ρ. It turns out the zero switch-over time case is more complicated
to handle than the one with at least one nonzero switch-over time. Next we
analyze each of the types of queue emptying policies, that is, exhaustive,
gated, and limited.
$$C_i^n = \sum_{j=1}^{K} D_j + \sum_{j=1}^{i} B_j^n + \sum_{j=i+1}^{K} B_j^{n-1}$$
since it is essentially the sum of the time the server spends in each queue
plus the time switching. However, note that we have been careful to use
n − 1 for queues greater than i and n for others so that the index n − 1 or n
appropriately denotes the cycle number with respect to the server.
Using these definitions, we can immediately derive the following results.
Given Cni , the expected number of arrivals of class-i customers during that
cycle time is λi Cni . All these arrivals would be served before this cycle time is
completed, and each one of them on average requires 1/μi amount of service
time. Therefore, the average amount of time spent by the server in queue i in
the nth cycle, given Cni , is
$$E\left[B_i^n \mid C_i^n\right] = \frac{\lambda_i C_i^n}{\mu_i} = \rho_i C_i^n.$$
From this, the stability condition works out to be ρ < 1.
$$L_i = \rho_i + \frac{1}{2}\,\frac{\rho_i^2}{1-\rho_i}\left(1 + \sigma_i^2\mu_i^2\right) + \frac{m_i^{(2)}}{2m_i},$$

where m_i and m_i^{(2)} are, respectively, the mean and second factorial moment of the number of arrivals during a vacation. Since the arrivals are according to a Poisson process with rate λ_i into queue i, we can write down m_i = λ_i E[V_i] and m_i^{(2)} = λ_i^2 E[V_i^2]. Plugging that into L_i and writing in terms of W_{iq} we get

$$W_{iq} = \frac{1}{2}\,\frac{\rho_i/\mu_i}{1-\rho_i}\left(1 + \sigma_i^2\mu_i^2\right) + \frac{E[V_i^2]}{2E[V_i]}.$$
As n → ∞, we get
$$E[V_i] = (1-\rho_i)E[C] = \frac{1-\rho_i}{1-\rho}\left(\sum_{j=1}^{K} E[D_j]\right)$$
where the last equality is from Equation 5.3. However, obtaining E[Vi2 ] is
fairly involved. In the interest of space, we merely state the results from
Takagi [101] without describing the details.
Let Tin be the station time for the server in queue i defined as
$$T_i^n = B_i^n + D_i.$$
In other words, this is the time between when the server leaves queue i − 1
and queue i. Define bij as the steady-state covariance of the station times of queues
i and j during consecutive visits, with the understanding that if i = j, it would
be the variance. In other words, for all i and j
$$b_{ij} = \lim_{n\to\infty} \begin{cases} \mathrm{Cov}\left(T_i^n, T_j^n\right) & \text{if } j > i,\\ \mathrm{Cov}\left(T_i^{n-1}, T_j^n\right) & \text{if } j \le i. \end{cases}$$
Using the results in Takagi [101], we can obtain bij by solving the following
sets of equations:
$$b_{ij} = \frac{\rho_i}{1-\rho_i}\left(\sum_{k=i+1}^{K} b_{jk} + \sum_{k=1}^{j-1} b_{jk} + \sum_{k=j}^{i-1} b_{kj}\right), \quad \text{for } j < i,$$

$$b_{ij} = \frac{\rho_i}{1-\rho_i}\left(\sum_{k=i+1}^{j-1} b_{jk} + \sum_{k=j}^{K} b_{kj} + \sum_{k=1}^{i-1} b_{kj}\right), \quad \text{for } j > i,$$

$$b_{ii} = \frac{\mathrm{Var}(D_i)}{(1-\rho_i)^2} + \frac{\rho_i}{1-\rho_i}\left(\sum_{j=1}^{i-1} b_{ij} + \sum_{j=i+1}^{K} b_{ij}\right) + \frac{\lambda_i E[S_i^2]\, E[V_i]}{(1-\rho_i)^3}.$$
These sets of equations can be solved using a standard matrix solver by writ-
ing down these equations in matrix form [bij ]. Assuming that can be done,
we can write down Var(Vi ) as
$$\mathrm{Var}[V_i] = \mathrm{Var}[D_i] + \frac{1-\rho_i}{\rho_i}\left(\sum_{j=1}^{i-1} b_{ij} + \sum_{j=i+1}^{K} b_{ij}\right).$$
Using that we can obtain Wiq . Thereby we can also immediately write down
Liq = λi Wiq , Wi = Wiq + 1/μi , and Li = λi Wi . Of course we can also obtain
the total number in the entire system in steady-state L as L1 + L2 + · · · + LK .
Using that we could get metrics such as W, Wq , and Lq .
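Treating queue i as an M/G/1 queue with vacations, the chain of computations can be sketched as follows; the second vacation moments E[V_i^2], which require solving the b_ij equations from Takagi, are taken as given inputs here (the function name is ours):

```python
def exhaustive_polling_wiq(lam, ES, ES2, ED, EVi2):
    """W_iq for each queue of an exhaustive polling system, treating the
    server's absence as a vacation with E[V_i] = (1 - rho_i) * E[C].
    EVi2 holds the second vacation moments E[V_i^2], assumed already
    obtained from the b_ij equations."""
    rho = [l * s for l, s in zip(lam, ES)]
    EC = sum(ED) / (1 - sum(rho))            # mean cycle time, Equation 5.3
    Wiq = []
    for i in range(len(lam)):
        EVi = (1 - rho[i]) * EC              # mean vacation seen by queue i
        Wiq.append(lam[i] * ES2[i] / (2 * (1 - rho[i])) + EVi2[i] / (2 * EVi))
    return Wiq
```

Note that (ρ_i/μ_i)(1 + σ_i²μ_i²) = λ_i E[S_i²], which is why the first term above matches the W_{iq} expression derived earlier.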
$$C_i^n = \sum_{j=1}^{K} D_j + \sum_{j=1}^{i-1} B_j^n + \sum_{j=i}^{K} B_j^{n-1}$$
since it is essentially the sum of the time the server spends in each queue
plus the time switching. However, note that we have been careful to use
n − 1 for queues greater than or equal to i and n for others so that the
index n − 1 or n appropriately denotes the cycle number with respect to the
server.
Using these definitions, we can immediately derive the following results.
Given Cni , the expected number of arrivals of class i customers during that
cycle time is λi Cni . All these arrivals would be served during the server’s
sojourn in queue i, and each one of them on average requires 1/μi amount
of service time. Therefore, the average amount of time spent by the server in
queue i in the nth cycle, given Cni , is
$$E\left[B_i^n \mid C_i^n\right] = \frac{\lambda_i C_i^n}{\mu_i} = \rho_i C_i^n.$$
Here too, the stability condition works out to be ρ < 1.
All the results derived so far are identical to those of exhaustive service
policies. However, this is enabled only by a careful selection of how Cni is
defined. It may hence be worthwhile to note the subtle differences in both
policies. Now we are in a position to derive expressions for the performance
measures of the system assuming it is stable.
Problem 47
Consider an arbitrary queue, say i (such that 1 ≤ i ≤ K). Let E[Ci ] and
E[C2i ], respectively, denote the steady-state mean and second moment of
the cycle time Cni . Write down an expression for Wiq in terms of E[Ci ]
and E[C2i ].
Solution
Let a customer arrive into queue i in steady state. From results of renewal
theory, the remaining time for completion as well as the elapsed time
since the start of the cycle in progress are both according to the equilib-
rium distribution of Ci . Therefore, the expected value of both the elapsed
time since the start of the cycle in progress as well as the remaining time
for the cycle in progress to end are equal to E[C2i ]/(2E[Ci ]). The customer
in question would have to wait for the cycle in progress to end plus the
service times of all the customers that arrived since the cycle in progress
began. Therefore, the average waiting time in the queue for this customer
is E[C2i ]/(2E[Ci ]) + λi E[C2i ]/(2E[Ci ]μi ). The second term uses the fact that
λi E[C2i ]/(2E[Ci ]) customers would have arrived on average since the cycle
in progress began, and each of them requires on average 1/μi service time.
Thus, we have
$$W_{iq} = \frac{(1+\rho_i)\,E[C_i^2]}{2E[C_i]}.$$
There are certainly other ways to derive Wiq , one of which is given as an
exercise problem.
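A one-line check of this expression (the values are illustrative):

```python
def gated_wiq(rho_i, ECi, ECi2):
    """W_iq = (1 + rho_i) * E[C_i^2] / (2 * E[C_i]) for a gated polling
    queue, given the first two moments of the cycle time."""
    return (1 + rho_i) * ECi2 / (2 * ECi)

# Deterministic cycle of length 10 (so E[C_i^2] = 100) and rho_i = 0.2:
w = gated_wiq(0.2, 10.0, 100.0)
```

For a deterministic cycle the customer waits half a cycle plus the service of the λ_i·5 customers that arrived before it, giving 5(1 + ρ_i) = 6 here.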
$$E[C_i] = E[C] = \frac{1}{1-\rho}\left(\sum_{j=1}^{K} E[D_j]\right)$$
where the last equality is from Equation 5.3. However, obtaining E[C2i ] is
fairly involved. In the interest of space, we merely state the results from
Takagi [101] without describing the details.
Let T_i^n be the station time for the server in queue i defined (slightly different from the exhaustive polling case) as

$$T_i^n = B_i^n + D_{i+1}.$$

In other words, this is the time between when the server enters queue i and
queue i + 1. Define bij as the steady-state covariance of the station times of queues i
and j during consecutive visits, with the understanding that if i = j, it would
be the variance. In other words, for all i and j
$$b_{ij} = \lim_{n\to\infty} \begin{cases} \mathrm{Cov}\left(T_i^n, T_j^n\right) & \text{if } j > i,\\ \mathrm{Cov}\left(T_i^{n-1}, T_j^n\right) & \text{if } j \le i. \end{cases}$$
Using the results in Takagi [101], we can obtain bij by solving the following
sets of equations:
$$b_{ij} = \rho_i\left(\sum_{k=i}^{K} b_{jk} + \sum_{k=1}^{j-1} b_{jk} + \sum_{k=j}^{i-1} b_{kj}\right), \quad \text{for } j < i,$$

$$b_{ij} = \rho_i\left(\sum_{k=i}^{j-1} b_{jk} + \sum_{k=j}^{K} b_{kj} + \sum_{k=1}^{i-1} b_{kj}\right), \quad \text{for } j > i,$$

$$b_{ii} = \mathrm{Var}(D_{i+1}) + \rho_i\left(\sum_{j=1}^{i-1} b_{ij} + \sum_{j=i+1}^{K} b_{ij}\right) + \rho_i^2 \sum_{j=1}^{K} b_{ji} + \lambda_i E[S_i^2]\, E[C].$$
These sets of equations can be solved using a standard matrix solver by writ-
ing down these equations in matrix form [bij ]. Assuming that can be done,
we can write down E(C2i ) as
$$E\left[C_i^2\right] = \{E[C]\}^2 + \frac{1}{\rho_i}\left(\sum_{j=1}^{i-1} b_{ij} + \sum_{j=i+1}^{K} b_{ij}\right) + \sum_{j=1}^{K} b_{ji}.$$
Using that we can obtain Wiq . Thereby we can also immediately write down
Liq = λi Wiq , Wi = Wiq + 1/μi , and Li = λi Wi . Of course we can also obtain
the total number in the entire system in steady-state L as L1 + L2 + · · · + LK .
Using that we could get metrics such as W, Wq , and Lq .
whole distribution even for the small case of K = 2. In that light, we will
make several simplifying assumptions. First we take the service limit to be one customer per poll, that is, at each poll
if a queue is empty, the server immediately begins its journey to the next
queue; otherwise the server serves one customer and then begins its journey
to the next queue.
Let Bni be the random time the server spends at queue i in the nth cycle
serving customers that were in the queue when it arrived. Of course Bni will
either be equal to zero or equivalent of one class-i customer’s service time.
We drop the superscript n by either considering a stationary system or letting
n → ∞. Thus, Bi is the random variable corresponding to the time spent
serving customers in queue i during a server visit in stationary or steady
state. Recall that Di is the random variable associated with the time to switch
from queue i−1 to i. Let E[D] = E[D1 ]+E[D2 ]+· · ·+E[DK ] so that E[D] is the
average time spent switching in each cycle. Since ρi is the long-run fraction
of time the server spends in queue i, we have
$$\rho_i = \frac{E[B_i]}{E[D] + \sum_{j=1}^{K} E[B_j]}.$$
Summing over all i and rearranging, we get

$$\sum_{j=1}^{K} E[B_j] = E[D]\,\frac{\rho}{1-\rho}.$$
The stability condition works out to be λ_i E[D] < 1 − ρ for all i ∈ {1, 2, . . . , K}. Further, the mean cycle time E[C], as stated in
Equation 5.3, can be verified from previous equation as
$$E[C] = E[D] + \sum_{j=1}^{K} E[B_j] = \frac{E[D]}{1-\rho}.$$
The next step is to obtain performance metrics such as Wiq . It turns out
that unlike the exhaustive and gated cases where we could write down Wiq
in terms of just the first two moments of the service times and switch-over
times, here in the limited case we do not have the luxury. Except for K = 2,
the exact analysis is quite intractable. However, we can still develop some
relations between the various Wiq values known as pseudo-conservation law.
Note that this is equivalent to the work conservation result for multiclass
queues with at most one partially complete service. Note that although
here too we have at most one partially complete service, because of the
switch-overs the system is not work conserving. However, by suitably
adjusting for the time spent switching, we can show that the amount of
workload in the queue for the limited polling policy is the same as that of
an equivalent M/G/1 queue with FCFS. The resulting pseudo-conservation
law yields
$$\sum_{i=1}^{K} \rho_i\left(1 - \frac{\lambda_i E[D]}{1-\rho}\right) W_{iq} = \frac{\rho}{2(1-\rho)} \sum_{i=1}^{K} \lambda_i E[S_i^2] + \frac{\rho}{2E[D]} \sum_{i=1}^{K} \mathrm{Var}[D_i] + \frac{E[D]}{2(1-\rho)}\left(\rho + \sum_{i=1}^{K} \rho_i^2\right).$$
Using this expression, the only case we can obtain Wiq is when the system
is symmetric, that is, the parameters associated with each queue and switch-
over time are identical to those of the others. Instead of using the subscript i, we
use sym to indicate the symmetric case. Thus, the average time spent waiting
in the queue before service in the symmetric case is
$$W_{sym,q} = \frac{K\lambda_{sym} E[S_{sym}^2] + E[D_{sym}](K + \rho) + \mathrm{Var}[D_{sym}]\,K\lambda_{sym}}{2\left(1 - \rho - \lambda_{sym} K E[D_{sym}]\right)} + \frac{\mathrm{Var}[D_{sym}]}{2E[D_{sym}]}.$$
Thereby we can also immediately write down Lsym,q = λsym Wsym,q , Wsym =
Wsym,q + 1/μsym , and Lsym = λsym Wsym . Of course we can also obtain the total
number in the entire system in steady-state L as KLsym . Using that we could
get metrics such as W, Wq , and Lq .
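The symmetric formula is easy to evaluate; a sketch with arbitrary illustrative parameters:

```python
def limited_symmetric_wq(K, lam_s, ES2_s, ED_s, VarD_s, ES_s):
    """W_sym,q for a symmetric 1-limited polling system: all K queues
    share the same arrival rate, service moments, and switch-over
    moments (subscript 'sym' in the text, '_s' here)."""
    rho = K * lam_s * ES_s
    # stability for the limited policy: lam_i * E[D] < 1 - rho
    assert lam_s * K * ED_s < 1 - rho, "unstable parameter choice"
    num = K * lam_s * ES2_s + ED_s * (K + rho) + VarD_s * K * lam_s
    return num / (2 * (1 - rho - lam_s * K * ED_s)) + VarD_s / (2 * ED_s)

Wq_sym = limited_symmetric_wq(K=2, lam_s=0.1, ES2_s=2.0, ED_s=0.5,
                              VarD_s=0.25, ES_s=1.0)
```

The assertion mirrors the stability condition λ_i E[D] < 1 − ρ stated earlier, since E[D] = K E[D_sym] in the symmetric case.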
that nonanticipative). However, here the service times are declared upon
arrival (which we call anticipative). In applications such as web servers, this
is reasonable since we would know the file size of an arriving request, and
hence its service time. Also in many flexible manufacturing systems, the pro-
cessing times can be calculated as soon as the specifications are known from
the request. Therefore, based on the knowledge of service times we consider
each customer to belong to a different class indexed by the service time.
For analytical tractability, we assume that the service time is a continuous
random variable without any point masses. Thus, it results in a multiclass
system with an uncountably infinite number of classes, where each class
corresponds to service time.
Although the overall system is indeed a single class system, we treat it
as a multiclass system by differentiating the classes based on their service
time requirement. An arrival would be classified as class x if its service time
requirement is x amount of time. Analogous to the mean waiting time before
service for a discrete class-i customer defined as Wiq in the previous sections,
here we define Wxq . The quantity Wxq is the time a customer of class x would
wait in the queue on average, not including the service time (which is x).
Likewise, Wx would indicate the corresponding sojourn time for this cus-
tomer with x as the amount of service. Of course the quantities Wxq and Wx
would depend on the scheduling policy. The overall sojourn time (W) as well
as the overall time waiting in the queue (Wq ) for the various policies can be
computed as
$$W = \int_0^{\infty} W_x \, dG(x), \qquad W_q = \int_0^{\infty} W_{xq} \, dG(x).$$
service time information and have at most one job with partially completed service, we have (from Equation 4.6)

$$W_q = W_{xq} = W_x - x = \frac{\lambda E[S^2]}{2(1 - \lambda E[S])}, \qquad W = E[S] + W_q.$$
$$W_x = \frac{x}{1 - \lambda E[S]}.$$
α_{x−dx}. For an infinitesimal dx, note that λ_x = λ (dG(x)/dx) dx. This is because λ_x corresponds to the arrival rate of customers with service time in the interval (x, x + dx), which equals λ times the probability that an arrival has service time in the interval (x, x + dx), and that is exactly the PDF of service times at x multiplied by dx. Also, as dx → 0, we have E[S_x^2] → x^2 (since the service time in the interval (x, x + dx) converges to a deterministic quantity x as dx → 0). Finally, as dx → 0, we need to compute α_x. By definition, if x were countable, then α_x = Σ_{z=0}^{x} λ_z E[S_z], which by letting dx → 0 and using the result for λ_z gives, in the uncountable case, α_x = ∫_0^x λz dG(z), realizing that E[S_z] → z. Therefore, we have as dx → 0, α_x → α_{x−dx} → ∫_0^x λt dG(t).
Using the results for W_{iq} in Section 5.2.2 for class i corresponding to service time in (x, x + dx), we get by letting dx → 0

$$W_{xq} = \frac{\frac{1}{2}\int_{y=0}^{\infty} \lambda y^2\, dG(y)}{\left(1 - \int_0^x \lambda t\, dG(t)\right)^2} = \frac{\frac{1}{2}\lambda E[S^2]}{(1-\rho(x))^2},$$

where

$$\rho(x) = \int_0^x \lambda t\, dG(t).$$
$$W = \int_0^{\infty} W_x \, dG(x), \qquad W_q = \int_0^{\infty} W_{xq} \, dG(x).$$
the one in service. It is crucial to realize that the server only uses the initially
declared service times for determining priorities but resumes from where the
service was completed. To implement this, the server stores jobs in the queue
by sorting according to the total service time and always serving customers
on the top of the list. Therefore, this policy is the continuous analog of the
preemptive resume priority considered in Section 5.2.3. In fact, we merely
use those results to derive Wx here which we do next as a problem.
Problem 48
Derive an expression for the sojourn time for a request with service time x
under PSJF.
Solution
Consider the preemptive resume priority discipline analyzed in Section 5.2.3.
First, let the number of classes K in that setting go to infinity. To map from
class i in that setting to class x here, if the service time is in the interval (x, x +
dx), that customer belongs to class x. Note that if x + dx < y, then class
x is given higher preemptive priority than y which is consistent with PSJF
requirements. We need to derive the expected time a customer with service
time x spends in the system, Wx . Recall from the corresponding expression
in Section 5.2.3, we need expressions for λ_x, μ_x, E[S_x^2], α_x, and α_{x−dx}. For an infinitesimal dx, note that λ_x = λ (dG(x)/dx) dx. This is because λ_x corresponds to the arrival rate of customers with service time in the interval (x, x + dx), which equals λ times the probability that an arrival has service time in the interval (x, x + dx), and that is exactly the PDF of service times at x multiplied by dx. As dx → 0, μ_x → 1/x since the service time would just be x. Also, as dx → 0, we have E[S_x^2] → x^2 (since the service time in the interval (x, x + dx) converges to a deterministic quantity x as dx → 0). Finally, as dx → 0, we need to compute α_x. By definition, α_x = Σ_{z=0}^{x} λ_z E[S_z], which by letting dx → 0 and using the result for λ_z gives α_x = ∫_0^x λz dG(z), realizing that E[S_z] → z. Therefore, we have as dx → 0, α_x → α_{x−dx} → ∫_0^x λt dG(t).
Using the results for Wx in Section 5.2.3 for class i corresponding to
service time being in (x, x + dx), we get by letting dx → 0
x
λy2 dG(y) 1
x y=0 x 2 λ(x)
Wx = x +
x 2 = 1 − ρ(x) + ,
1− z=0 λzdG(z) 2 1− λzdG(z) (1 − ρ(x))2
z=0
where
x
(x) = y2 dG(y)
0
Multiclass Queues under Various Service Disciplines 287
and

ρ(x) = ∫_0^x λt dG(t).
Unconditioning over the service-time distribution G(·), the overall mean
sojourn time and mean waiting time are

W = ∫_0^∞ Wx dG(x),

Wq = ∫_0^∞ Wxq dG(x).
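To make these formulas concrete, here is a small numerical sketch (not from the text: the M/M/1 instance with λ = 0.5, μ = 1 and the trapezoidal-rule integrator are assumed purely for illustration) that evaluates Wx under PSJF and then unconditions to get W:

```python
from math import exp

# Assumed illustration (not from the text): M/M/1 with lam = 0.5, mu = 1,
# so G(x) = 1 - exp(-mu*x) and the service-time PDF is g(x) = mu*exp(-mu*x).
lam, mu = 0.5, 1.0
g = lambda t: mu * exp(-mu * t)

def integrate(f, a, b, n=600):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))

def rho(x):
    # rho(x) = lam * int_0^x t dG(t)
    return lam * integrate(lambda t: t * g(t), 0.0, x)

def Lam(x):
    # Lambda(x) = int_0^x y^2 dG(y)
    return integrate(lambda y: y * y * g(y), 0.0, x)

def W_psjf(x):
    """Conditional mean sojourn time of a size-x job under PSJF."""
    return x / (1.0 - rho(x)) + 0.5 * lam * Lam(x) / (1.0 - rho(x)) ** 2

# Uncondition over G(.) to get the overall mean sojourn time W.
W = integrate(lambda x: W_psjf(x) * g(x), 0.0, 40.0)
print(W)  # comes out below the FCFS value 1/(mu - lam) = 2
```

For this instance the unconditioned mean sojourn time falls below the FCFS value of 2, illustrating the benefit of favoring short jobs.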
Wx = Vx + Rx ,
288 Analysis of Queues
where
Vx is the expected time for an arriving customer with service time x to
begin processing by the server (note that until that time, the remaining
processing time is equal to service time x)
Rx is the expected time from when this customer enters the server for the
first time until service is completed (during this time, the server could
get preempted by arriving customers with service time smaller than the
remaining processing time for the customer in question)
As before,

ρ(x) = ∫_0^x λu dG(u).
Further, the probability that an arriving customer will see the server with a
customer whose remaining service time is less than x is β(x), given by

β(x) = ρ(x) + λx(1 − G(x)),

where λx(1 − G(x)) is the long-run fraction of time the server serves customers
with remaining processing time less than x although initially they
had more than x service time to begin with. A busy period of type x is defined
as the continuous stretch of time during which the server only processes
customers with remaining processing time less than x. It is crucial to point
out that if a class x customer arrives during a busy period of type x (note
that this happens with probability β(x)), that customer waits till the busy
period ends to begin service. Of course if this class x customer arrives at
a time other than during a busy period of type x, the customer immedi-
ately gets served by the server. Therefore, by conditioning on whether or
not an arriving class x customer encounters a busy period of type x and then
unconditioning, we get
Vx = β(x)E[Be (x)]
where Be (x) is the remaining time left in the busy period of type x.
Since Be (x) is the equilibrium random variable corresponding to B(x), the
length of the busy period of type x, from renewal theory we can write down
E[Be (x)] as
E[Be(x)] = E[B(x)²] / (2E[B(x)]).
Thus, all we need to compute are E[B(x)2 ] and E[B(x)]. For this we need
another notation τ(x), the remaining service time of the job that initiated a
type x busy period. Note that when a busy period of type x is initiated, there
would be exactly one customer in the system with remaining processing time
not greater than x, and this customer initiates the busy period. This can hap-
pen in two ways: (1) if a customer with service time t such that t < x arrives
when a busy period of type x is not in progress, then this will start a busy
period of type x with τ(x) = t so that the probability that a given customer
initiates a busy period of type x and the initiating customer has remaining
processing time between t and t + dt is (1 − β(x))dG(t); (2) a customer with
an original processing time greater than x initiates a type x busy period as
soon as that customer's remaining time reaches x, so that the probability that
a given customer initiates a busy period of type x and the initiating customer
has remaining processing time exactly x is (1 − G(x)). Therefore, the probability that
a customer initiates a busy period of type x is
(1 − β(x)) ∫_0^x dG(t) + (1 − G(x)) = 1 − G(x)β(x).
We also have

E[τ(x)] = [(1 − β(x)) ∫_0^x t dG(t) + x(1 − G(x))] / (1 − G(x)β(x)),

E[τ(x)²] = [(1 − β(x)) ∫_0^x t² dG(t) + x²(1 − G(x))] / (1 − G(x)β(x)).
The remaining customers that are served in a busy period of type x arrive
after the busy period is initialized and have service times less than x. Let Sx
be the service time of one such customer; then clearly we have

E[Sx] = ∫_0^x t dG(t) / G(x),

E[Sx²] = ∫_0^x t² dG(t) / G(x).
Problem 49
Using the notation and description from the previous text, show that

E[B(x)] = E[τ(x)] / (1 − ρ(x)) = β(x) / (λ(1 − G(x)β(x))),

E[B(x)²] = E[τ(x)²] / (1 − ρ(x))² + λG(x)E[τ(x)] E[Sx²] / (1 − ρ(x))³.
Solution
The busy period B(x) can be computed by selecting an appropriate schedul-
ing discipline. First serve the customer that initializes the busy period and
this takes τ(x) time. During this time τ(x), say N(τ(x)) new customers arrive
with service times smaller than x according to a Poisson process with param-
eter λG(x). After τ(x) time, we serve the first of the N(τ(x)) customers (if
there is one) and all the customers that arrive during this service time that
have service times smaller than x. Note that this time is identical to that of
the busy period of an M/G/1 queue with PP(λG(x)) arrivals and service times
according to Sx . Once this “mini” busy period is complete we serve the sec-
ond (if any) of the N(τ(x)) for another mini busy period and then continue
until all the N(τ(x)) customers’ mini busy periods are complete. Thus, we can
write down
B(x) = τ(x) + Σ_{i=1}^{N(τ(x))} bi(x),    (5.4)
where bi (x) is the ith mini busy period of an M/G/1 queue with PP(λG(x))
arrivals and service times according to Sx . Note that bi (x) over all i are IID
random variables. Based on one of the exercise problems of Chapter 4 on
computing the moments of the busy period of an M/G/1 queue, we have
E[bi(x)] = E[Sx] / (1 − λG(x)E[Sx]),

E[bi(x)²] = E[Sx²] / (1 − λG(x)E[Sx])³.
Therefore, conditioning on τ(x),

E[B(x)|τ(x)] = τ(x) + E[N(τ(x))]E[bi(x)] = τ(x) / (1 − ρ(x)),
E[B(x)²|τ(x)] = τ(x)² + E[N(τ(x))]E[bi(x)²] + 2τ(x)E[N(τ(x))]E[bi(x)]
  + E[N(τ(x)){N(τ(x)) − 1}]{E[bi(x)]}²

= τ(x)² + λG(x)τ(x) E[Sx²] / (1 − λG(x)E[Sx])³
  + 2τ(x)² λG(x) E[Sx] / (1 − λG(x)E[Sx])
  + {λG(x)τ(x)}² {E[Sx] / (1 − λG(x)E[Sx])}²

= τ(x)² / (1 − ρ(x))² + λG(x)τ(x) E[Sx²] / (1 − ρ(x))³.
Unconditioning with respect to τ(x), we obtain

E[B(x)] = E[τ(x)] / (1 − ρ(x)),

E[B(x)²] = E[τ(x)²] / (1 − ρ(x))² + λG(x)E[τ(x)] E[Sx²] / (1 − ρ(x))³.
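As a numerical sanity check (an assumed example, not from the text: λ = 0.5 with exp(1) service times), one can verify that the two expressions for E[B(x)] claimed in Problem 49, namely E[τ(x)]/(1 − ρ(x)) and β(x)/(λ(1 − G(x)β(x))), agree:

```python
from math import exp

# Assumed example (not from the text): lam = 0.5, exp(1) service times,
# so G(t) = 1 - exp(-t) and g(t) = exp(-t).
lam = 0.5
G = lambda t: 1.0 - exp(-t)
g = lambda t: exp(-t)

def integrate(f, a, b, n=4000):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))

def rho(x):      # rho(x) = lam * int_0^x t dG(t)
    return lam * integrate(lambda t: t * g(t), 0.0, x)

def beta(x):     # fraction of time serving jobs with remaining time < x
    return rho(x) + lam * x * (1.0 - G(x))

def E_tau(x):    # mean remaining service time of the busy-period initiator
    num = (1.0 - beta(x)) * integrate(lambda t: t * g(t), 0.0, x) + x * (1.0 - G(x))
    return num / (1.0 - G(x) * beta(x))

x = 1.5
lhs = E_tau(x) / (1.0 - rho(x))                  # E[B(x)] via E[tau(x)]
rhs = beta(x) / (lam * (1.0 - G(x) * beta(x)))   # E[B(x)] via beta(x)
print(lhs, rhs)  # the two expressions agree
```

The agreement is exact (not just up to quadrature error), because the equality reduces algebraically to the definition β(x) = ρ(x) + λx(1 − G(x)).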
Wx = Vx + Rx
   = λ[∫_0^x t² dG(t) + x²(1 − G(x))] / (2(1 − ρ(x))²) + ∫_0^x dt / (1 − ρ(t)).
W = ∫_0^∞ Wx dG(x).
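The expressions above can also be evaluated numerically; this sketch (an assumed M/M/1 instance with λ = 0.5, μ = 1 and grid-based trapezoidal integration, not from the text) computes the conditional sojourn time for a size-x job and unconditions to get W:

```python
from math import exp

# Assumed illustration (not from the text): M/M/1 with lam = 0.5, mu = 1.
lam = 0.5
g = lambda t: exp(-t)           # service-time PDF
G = lambda t: 1.0 - exp(-t)     # service-time CDF

N, X = 4000, 40.0               # integration grid
h = X / N
ts = [i * h for i in range(N + 1)]

rho = [0.0] * (N + 1)           # rho(t) = lam * int_0^t u dG(u)
m2  = [0.0] * (N + 1)           # m2(t)  = int_0^t u^2 dG(u)
res = [0.0] * (N + 1)           # res(t) = int_0^t du / (1 - rho(u))
for i in range(1, N + 1):
    a, b = ts[i - 1], ts[i]
    rho[i] = rho[i - 1] + lam * h * (a * g(a) + b * g(b)) / 2.0
    m2[i]  = m2[i - 1] + h * (a * a * g(a) + b * b * g(b)) / 2.0
    res[i] = res[i - 1] + h * (1.0 / (1.0 - rho[i - 1]) + 1.0 / (1.0 - rho[i])) / 2.0

def W_srpt(i):
    """Mean sojourn time of a job of size ts[i], per the formula above."""
    x = ts[i]
    wait = lam * (m2[i] + x * x * (1.0 - G(x))) / (2.0 * (1.0 - rho[i]) ** 2)
    return wait + res[i]

# Uncondition over G(.) to get the overall mean sojourn time W.
W = sum(h * (W_srpt(i - 1) * g(ts[i - 1]) + W_srpt(i) * g(ts[i])) / 2.0
        for i in range(1, N + 1))
print(W)  # below the FCFS value 1/(mu - lam) = 2
```

As expected for a policy that favors jobs with short remaining work, the resulting W lies below the FCFS value for this instance.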
for good reasons. But it is not just the strictly preemptive policies that are
included; policies such as processor sharing (which are only partially pre-
emptive) should also be considered in that group. Further, when the service
times are revealed upon arrival, it is sometimes called anticipative (although
anticipative could include a much broader class of policies, not just reveal-
ing service times upon arrival). Therefore, as examples of the four classes of
policies we have:
cashier; presumably that is why there is a separate line for customers with
fewer items in a grocery store). Thus, when the service times are known
upon arrival, it appears that a fair thing to do is to make the sojourn times
proportional to the service times. For example, when there is a single class (i.e.,
K = 1), the mean sojourn time for a customer with service time x under
processor sharing scheme is x/(1 − ρ). Thus, the mean sojourn time is pro-
portional to the service time, and it would not be terribly unreasonable to
consider processor sharing as a “fair” policy. That is why many computer
system CPUs adopt roughly a processor sharing regime where the CPU
spends a small time (called time quantum) for each job and switches con-
text to the next job. Interestingly, the preemptive LCFS policy also has mean
sojourn time for a customer with service time x as x/(1 − ρ). However, it
is unclear if preemptive LCFS would be considered “fair” by the customers
although it certainly is for the service provider. Therefore, while determining
an optimal policy for a queueing system (which we will see next), it becomes
crucial to consider the perspectives of both the customers and the service
providers.
The optimal service-scheduling policy in a multiclass queueing system
depends greatly on the choice of objective function. The issue of fair-
ness becomes extraordinarily important if the customers can observe the
queues, in which case FCFS or anything considered fair by the users must
be adopted. However, for the rest of this chapter we assume that the
customers cannot observe the queue; however, the service provider has real-
time access to the queue (usually number in the system, sometimes service
time requirements and amount of service complete). The service provider
thus uses a scheduling policy that would optimize a performance measure
that would strike a balance between the needs of the customers and the
service provider. We consider one such objective function, which is to minimize
the mean number of customers in the system. In particular, let L_i^π
represent the average number of class-i customers in the system in steady
state under policy π (for i = 1, . . . , K). Then, the objective is to
determine the optimal service schedule among all work-conserving policies Π
that minimizes the total number in the system. Hence, our objective
function is

min_{π∈Π} Σ_{i=1}^K L_i^π.
This is indeed the hazard rate function when x = 0. We will use this
relation subsequently. Define the Gittins index κ(a) = max_{x≥0} k(a, x),
and let x∗ = arg max_{x≥0} k(a, x), that is, the value of x that maximizes
k(a, x). Then, the Gittins index policy works as follows: at every
instant of time, serve the customer with the largest κ(a). To imple-
ment this, select the customer with the largest κ(a) and serve this
customer until whichever happens first among the following: (1) the
customer is served for time x∗ , (2) the customer’s service is com-
plete, or (3) a new customer arrives with a higher Gittins index. The
proof that the Gittins index policy is fairly detailed is not presented
here. However, the key idea is to target the application with the
highest probability of completing service within a time x and at the
same time have a low expected remaining service time. Although
not exactly, the k(a, x) measure roughly captures that. We saw ear-
lier that if the service times are known upon arrival, then SRPT is the
best, and when they are unknown, we do what best we can based on
the information we have (which is what Gittins index policy does).
Two special cases of Gittins index policy are discussed subsequently
when the hazard rate functions are monotonic.
3. Service times unknown upon arrival; only one job with partially complete
service allowed: Here we consider the case where it is not possible to
know the service time of each customer upon arrival (i.e., nonan-
ticipative), and it is not possible to have more than one partially
complete service. Interestingly, in this restricted framework every
policy would yield the same W. Therefore, all work-conserving
policies in this restricted framework (such as FCFS, nonpreemptive
LCFS, random order of service, etc.) are optimal. Thus, a policy such
as FCFS which is generally fair would be ideal.
Problem 50
When the service times are unknown upon arrival and many jobs with par-
tially complete service are allowed, then Gittins index policy is optimal.
Show that if the service times are IFR, then the Gittins index policy reduces
to FCFS-like policies that do not allow more than one job to be partially
complete (although that is not a requirement). Also describe the special case
optimal policy when the service times are DFR.
Solution
Refer to Aalto et al. [1] for a rigorous proof; we just provide an outline
based on that paper here. From the definition of k(a, x) given in Equation 5.5,
we have
k(a, x) ∫_0^x (1 − G(a + y)) dy = ∫_0^x g(a + y) dy,
where g(y) is the PDF of the service times, that is, dG(y)/dy. By taking
derivative with respect to x of this equation, we get
[∂k(a, x)/∂x] ∫_0^x (1 − G(a + y)) dy + k(a, x)[1 − G(a + x)] = g(a + x).
We can rewrite this expression in terms of the hazard (or failure) rate
function h(y) defined as
h(y) = [dG(y)/dy] / (1 − G(y)).
clearly the RHS is ≤ (or ≥, respectively) h(a+x) if the service times are IFR (or
DFR, respectively). Therefore, if the service times are IFR, k(a, x) ≤ h(a + x),
and if they are DFR, k(a, x) ≥ h(a + x). However, we showed earlier that
k(a, x) would increase (or decrease, respectively) with respect to x if h(a + x)
is greater than (or less than, respectively) k(a, x). Hence, we can conclude
that if the service times are IFR (or DFR, respectively), k(a, x) would increase
(or decrease, respectively) with respect to x. Thus, we can compute κ(a) =
maxx {k(a, x)} when the service times are IFR and DFR. In particular, we can
show the following:
κ(a) = k(a, ∞) = (1 − G(a)) / ∫_0^∞ [1 − G(a + t)] dt.
This can be used to show that κ(a) increases with a (using a very
similar derivative argument, left as an exercise for the reader). Since
Gittins index policy always picks the job with the largest κ(a), the
optimal policy is to select the job with the largest attained service a.
Consider a queue that is empty; since the discipline is work-conserving,
an arriving customer is immediately served. This customer
is never interrupted because we always serve the customer that has
the largest attained service. When this customer completes service,
if there are many jobs to choose from, any of them can be picked
(since they all have a = 0) and served. Again this customer is never
interrupted until service is complete. Thus, any FCFS-like policy that
does not allow more than one job to be partially complete is optimal.
Note that all of these FCFS-like policies yield the same L or W.
• If service times are DFR
however, the former would result in going back and forth between
the two customers that have equally attained service. Hence the
name FB.
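The two monotone special cases can be sketched numerically (the Erlang(2) and hyperexponential distributions below are assumed examples, not from the text): for IFR service times κ(a) = k(a, ∞) increases with the attained service a, while for DFR service times the index reduces to the hazard rate h(a), which decreases:

```python
from math import exp

# Assumed illustrative distributions: Erlang(2, 1) is IFR; a balanced
# hyperexponential (rates 0.5 and 2, equal mix) is DFR.
Sbar_erl = lambda y: exp(-y) * (1.0 + y)                        # 1 - G(y), Erlang-2
Sbar_hyp = lambda y: 0.5 * exp(-0.5 * y) + 0.5 * exp(-2.0 * y)  # 1 - G(y), hyperexp
g_hyp = lambda y: 0.25 * exp(-0.5 * y) + exp(-2.0 * y)          # hyperexp PDF

def kappa_ifr(Sbar, a, T=100.0, n=20000):
    """IFR case: kappa(a) = k(a, inf) = (1 - G(a)) / int_0^inf (1 - G(a+t)) dt,
    with the integral truncated at T (trapezoidal rule)."""
    h = T / n
    denom = h * (Sbar(a) / 2 + sum(Sbar(a + i * h) for i in range(1, n)) + Sbar(a + T) / 2)
    return Sbar(a) / denom

def kappa_dfr(g, Sbar, a):
    """DFR case: kappa(a) = k(a, 0+) = h(a), the hazard rate."""
    return g(a) / Sbar(a)

print(kappa_ifr(Sbar_erl, 0.0), kappa_ifr(Sbar_erl, 2.0))  # increasing in a
print(kappa_dfr(g_hyp, Sbar_hyp, 0.0), kappa_dfr(g_hyp, Sbar_hyp, 2.0))  # decreasing
```

For Erlang(2, 1) the index works out to (1 + a)/(2 + a), increasing from 1/2 toward 1, which is why serving the job with the largest attained service (an FCFS-like rule) is optimal in the IFR case.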
we saw that the Gittins index policy minimizes L in the case consid-
ered. Would that work here too for a general K? As it turns out, the
functions used in the Gittins index (such as k(a, x) and κ(a)) are class-
dependent. Therefore, one has to be careful in writing down the
Gittins index parameter for each customer in the system. However,
if we are able to do that, then indeed the Gittins index policy would
be optimal (see Theorems 1 and 2 in Aalto et al. [1] where what we
refer to as Gittins index policy is what they call Gittins index quan-
tum policy). The idea of the proof is similar to that when there is a
single class and the reader is encouraged to refer to that. Now we
explain the policy in the general K class case. Let a be the amount of
attained service (i.e., amount of completed service) for an arbitrary
customer in a queue. Then define ki (a, x) for a class-i customer in the
system as
ki(a, x) = ∫_a^{a+x} dGi(y) / ∫_a^{a+x} (1 − Gi(y)) dy,

and xi∗ = arg max_{x≥0} ki(a, x), that is, the value of x that maximizes
ki(a, x), with κi(a) = ki(a, xi∗). Then, the Gittins index policy works as follows: at every
instant of time, serve the customer with the largest κi (a) over all cus-
tomers of all classes i. To implement this, from all the customers
in the system (belonging to various classes) select the one with the
largest κi (a). Say this customer is of class j. Serve this class-j customer
until whichever happens first among the following: (1) the cus-
tomer is served for time x∗j , (2) the customer’s service is complete, or
(3) a new customer arrives with a higher Gittins index.
Now we briefly discuss two special cases of Gittins index policy,
that is, when the hazard rate functions are monotonic: (1) If the ser-
vice time distributions of all K classes are DFR, then the Gittins index
policy reduces to serving the job with the highest failure rate hi (a).
Since the service times are DFR, within a class we always use LAS.
However, across classes we need to compare the hazard rate (or fail-
ure rate) functions of the least attained service customer in each class
and serve the one with the highest failure rate. An interesting case
is when the failure rate functions hi (x) do not overlap, then we can
order the classes according to hazard rate and use a preemptive pri-
ority policy (and LAS within a class) that assigns highest priority
to the class with the highest hazard rate function. (2) If the service
time distributions of all K classes are IFR, then the Gittins index
Σ_{i=1}^K ρi Wiq

polyhedron formed by Σ_{i=1}^K ρi Wiq = Kc, where Kc is a constant that
can be computed for, say, FCFS. This polyhedral feasible region is
the achievable region described earlier. The proof mainly shows
that the nonpreemptive policy is indeed one of the corner points, and
the one that minimizes the objective function.
Reference Notes
Analysis of queues with multiple classes of customers can be approached
from many angles as evident from the literature, namely, based on applica-
tions, based on objectives such as performance analysis versus optimization,
and also theory versus practice. However, a common thread across the
angles is the objective where the server needs to decide which class of cus-
tomer to serve. For that reason, this is also frequently referred to as stochastic
scheduling. The topic of stochastic scheduling has recently received a lot
of attention after a surge of possibilities in computer network applications.
Although the intention of this chapter was to provide a quick review of
results from the last 50 years of work in single-station, single-server, mul-
ticlass queues, a large number of excellent pieces of work had to be left out.
The main focus of this chapter has been to present analytical expressions
for various performance measures under different service-scheduling poli-
cies. These are categorized based on how the customers are classified, that
is, depending on type, location, or service times.
This chapter brings together some unique aspects of queues and the the-
oretical underpinnings for those can be found in several texts. For example,
the fundamental notion of work conservation has been greatly influenced
Exercises
5.1 Consider a repair shop that undertakes repairs for K different
types of parts. Parts of type i arrive to the repair shop accord-
ing to a Poisson process with mean arrival rate λi parts per hour
(i = 1, . . . , K). At a time only one part can be repaired in the
shop with a given mean repair time τi hours and a given stan-
dard deviation of σi hours (i = 1, . . . , K). There is a single waiting
room for all parts. This system can be modeled as a standard
M/G/1 multiclass queue. Use K = 5 and the following numerical
values:
i λi τi σi
1 0.2 1.2 1.0
2 0.3 0.3 0.6
3 0.1 1.5 0.9
4 0.4 1.0 1.0
5 0.2 0.3 0.8
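As a preliminary sanity check on these numbers (a sketch, not part of the exercise statement), one can verify that the shop is stable and compute the mean residual work seen by an arrival, a quantity that appears in the multiclass M/G/1 waiting-time formulas:

```python
# A quick stability check (a sketch; values taken from the table above).
lam = [0.2, 0.3, 0.1, 0.4, 0.2]   # arrival rates lambda_i (parts/hour)
tau = [1.2, 0.3, 1.5, 1.0, 0.3]   # mean repair times tau_i (hours)
sig = [1.0, 0.6, 0.9, 1.0, 0.8]   # repair-time standard deviations sigma_i (hours)

rho = sum(l * t for l, t in zip(lam, tau))       # overall utilization
ES2 = [t * t + s * s for t, s in zip(tau, sig)]  # second moments E[S_i^2]
R = sum(l * e2 for l, e2 in zip(lam, ES2)) / 2   # mean residual work at an arrival
print(rho, R)  # rho = 0.94 < 1, so the system is stable
```

Note that the utilization of 0.94 is high, so waiting times will be quite sensitive to the choice of scheduling policy.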
Φ(z) = Σ_{i=0}^∞ πi z^i.

Please note that you are only asked for Φ(z) and NOT the
individual πj values.
5.4 Consider an M/M/1 queue with nonpreemptive LCFS service dis-
cipline. Nonpreemptive means that a customer in service does not
get replaced by a newly arriving customer. Show that the LST of
the sojourn time in the system in terms of the LST of the busy
period distribution B̃(s) is
E[e^{−sY}] = [(1 − ρ) + ρ B̃(s)] μ/(s + μ)
at time t = T, the system becomes empty for the first time. Then,
T is a random variable known as the busy period. Then, B(·) is
the CDF of the busy period. It is crucial to realize that the busy
period does not depend on the service discipline as long as it is
work conserving. So if we know B̃(s) for FCFS discipline, then we
can compute E[e−sY ] for the nonpreemptive LCFS.]
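One way to sanity-check the displayed LST (a sketch under assumed parameters λ = 0.5, μ = 1, using the standard closed-form busy-period LST of the M/M/1 queue) is to differentiate it numerically at s = 0 and recover the M/M/1 mean sojourn time 1/(μ − λ):

```python
from math import sqrt

# Assumed check values: lam = 0.5, mu = 1.  B_tilde is the standard
# closed-form busy-period LST for the M/M/1 queue.
lam, mu = 0.5, 1.0
rho = lam / mu

def B_tilde(s):
    return (lam + mu + s - sqrt((lam + mu + s) ** 2 - 4.0 * lam * mu)) / (2.0 * lam)

def Y_tilde(s):
    """The displayed LST of the nonpreemptive-LCFS sojourn time."""
    return ((1.0 - rho) + rho * B_tilde(s)) * mu / (s + mu)

h = 1e-6
EY = (Y_tilde(0.0) - Y_tilde(h)) / h  # E[Y] = -d/ds of the LST at s = 0
print(EY)  # approximately 1/(mu - lam) = 2
```

The mean works out to E[S] + ρE[B] = 1 + 0.5 × 2 = 2, matching the fact that all work-conserving nonpreemptive disciplines share the same mean sojourn time.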
5.5 Consider an M/G/1 queue with K classes. Using the expressions
for Wiq for both FCFS and nonpreemptive priority service disci-
plines, show that Σ_{i=1}^K ρi Wiq results in the same expression. In
other words, this verifies that the amount of work in the waiting
area is conserved and is equal to the previous expression.
Further, for the special case of exponential service times for all
K classes, show that the preemptive resume policy also yields
the same expression for Σ_{i=1}^K ρi Wiq by rewriting the expression
for FCFS and nonpreemptive priority using exponential service
times. Although the preemptive resume policy does not satisfy
the condition that there can be at most only one customer with
partially complete service, why does the result hold?
5.6 Answer the following multiple choice questions:
(i) Consider a stable G/G/1 queue with four classes (average
arrival and service rates are λi and μi , respectively, for class
i) using preemptive resume priority. What fraction of time in
the long run is the server busy?
(a) (λ1 + λ2 + λ3 + λ4)/(μ1 + μ2 + μ3 + μ4)
(b) λ1/μ1 + λ2/μ2 + λ3/μ3 + λ4/μ4
(c) 1 − (λ1 + λ2 + λ3 + λ4)/(μ1 + μ2 + μ3 + μ4)
(d) 1/(λ1/μ1 + λ2/μ2 + λ3/μ3 + λ4/μ4)
(ii) Consider an M/G/1 queue with four classes (call them classes
A, B, C, D) such that the average service times for them are 2, 3,
1, and 5 min, respectively. Also, the holding costs for retaining
a customer of classes A, B, C, and D are, respectively, 3, 5, 2,
and 5 dollars per item per minute. If we use a nonpreemptive
priority, what should be the order of priority from highest to
lowest?
(a) D, A, B, C
(b) C, A, B, D
(c) D, B, A, C
(d) C, B, A, D
where C̃i (·) is the LST of the cycle time associated with queue i
in steady state and all other variables are described in Section 5.3.
Using the previous expression, show that
Wiq = (1 + ρi) E[Ci²] / (2E[C]).
5.11 For an M/G/1 queue with a single class, show that if service times
are IFR, then the Gittins index policy parameter κ(a) increases
with a using the expression
κ(a) = (1 − G(a)) / ∫_0^∞ [1 − G(a + t)] dt.
FIGURE 6.1
Example of an acyclic network (nodes 0 through 5).

FIGURE 6.2
Example of a cyclic network (nodes 0 through 5).
Exact Results in Network of Queues: Product Form 313
TABLE 6.1
Conditions to be Satisfied at Node i

No. of Servers    Capacity    Service Time Distribution    Stability
si                ∞           Exponential                  Required
si                si          General                      Not applicable
∞                 ∞           General                      Not applicable
node i and gets served, then upon service completion, the customer joins
node j with probability pij and leaves the network with probability ri . The
queue service in node i must satisfy one of the categories given in Table 6.1
(although we do not consider them in this section, the results also hold if node
i is a single-server queue with processor sharing discipline or LCFS with
preemptive resume policy). If N = 1, the single node case, these corre-
spond to M/M/s, M/G/s/s, and M/G/∞ cases in the order provided in the
table. With that in mind, our first step to analyze the acyclic queueing net-
work is to characterize the output (or departure) process from the M/M/s,
M/G/s/s, and M/G/∞ queues, which would potentially act as input for a
downstream node.
Problem 51
Consider an M/M/1 queueing system with PP(λ) arrivals and exp(μ) ser-
vice time distribution. Assume that λ < μ. Let U be a random variable that
denotes the time between two arbitrarily selected successive departures from
the system in steady state. Show, by conditioning on whether or not the first
of those departures has left the system empty, that U is an exponentially
distributed random variable with mean 1/λ.
Solution
Say a departure just occurred from the M/M/1 queue in steady state. Let
X denote the time of the next arrival and Y denote the service time of the
next customer. We would like to obtain the CDF of U, the time of the next
departure. Define F(x) = P(U ≤ x). Also, let Z be a random variable such that
Z = 0 if there are no customers in the system currently (notice that a depar-
ture just occurred), and Z = 1 otherwise. If Z = 0 then U = X + Y, otherwise
U = Y. Recall that πj = π∗j = pj for all j, that is, the probability there are j in
the system as observed by a departing customer in steady state would be the
same as that of an arriving customer as well as the steady-state probability
that there are j in the system. We also know that p0 = 1 − ρ where ρ = λ/μ.
Therefore, we have the LST of F(x) by conditioning on Z as
pi qij = pj qji
which is a direct artifact of the balance equations resulting from the arc cuts
for consecutive nodes i and j (otherwise qij = 0).
One of the implications of reversible processes is that if the system is
observed backward in time, then one cannot tell the difference in the queue
length process. Thus the departure epochs would correspond to the arrival
epochs in the reversed process and vice versa. Therefore, the departure
process would be stochastically identical to the arrival process, which is
a Poisson process. In fact, for the M/G/∞ and M/G/s/s queues as well,
the departures are according to a Poisson process. We had indicated in
Chapter 4 that if we define an appropriate Markov process for the M/G/s/s
queue (note that when s = ∞ we get the M/G/∞ queue, so the same result
holds), then that process is reversible. Due to reversibility, the departures
from the original system correspond to arrivals in the reversed system.
Therefore, the departure process from the M/G/s/s queue is a Poisson pro-
cess with rate (1 − ps )λ departures per unit time on average, where ps is
the probability that there are s in the system in steady state (i.e., zero when
s = ∞), that is, the probability of a potential arrival is rejected.
It is indeed strange that for the stable M/M/s queue and the M/G/∞
queue, the output process is not affected by the service process. Of course,
this is incredibly convenient in terms of analysis. Poisson processes have
other extremely useful properties, such as superpositioning and splitting,
that are conducive for analysis, which we will see next. Before that it is
worthwhile to point out that in the M/G/s/s case, through ps the departure
process does depend on the mean service rate. However, the distribution
of service times has no effect on the departure process in steady state. Hav-
ing said that, except for a small example we will consider in the next section,
until we reach Section 6.4.4 on loss networks, we will only consider infinite
capacity stable queues and not consider any rejections.
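A quick Monte Carlo check of Problem 51 (the parameters λ = 0.5, μ = 1 and the event loop below are assumptions for illustration): in steady state, the interdeparture times of an M/M/1 queue should look like exp(λ) draws, with mean 1/λ and variance 1/λ²:

```python
import random

# Simulate an M/M/1 queue (assumed lam = 0.5, mu = 1) and examine the
# steady-state interdeparture times; Problem 51 says they are exp(lam).
random.seed(1)
lam, mu = 0.5, 1.0
t, n = 0.0, 0                      # clock and number in system
next_arr = random.expovariate(lam)
next_dep = float('inf')
deps = []                          # departure epochs
for _ in range(200_000):
    if next_arr < next_dep:        # next event: arrival
        t = next_arr
        n += 1
        if n == 1:                 # server was idle; start a service
            next_dep = t + random.expovariate(mu)
        next_arr = t + random.expovariate(lam)
    else:                          # next event: departure
        t = next_dep
        deps.append(t)
        n -= 1
        next_dep = t + random.expovariate(mu) if n > 0 else float('inf')

gaps = [b - a for a, b in zip(deps, deps[1:])][5_000:]   # drop warm-up
mean = sum(gaps) / len(gaps)
var = sum((x - mean) ** 2 for x in gaps) / len(gaps)
print(mean, var)  # should be near 1/lam = 2 and 1/lam^2 = 4
```

Note that gaps spanning idle periods are deliberately included; it is precisely the mixture of "busy" and "idle" gaps that produces the memoryless exp(λ) interdeparture distribution.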
FIGURE 6.3
Merge, flow through a stable queue with s exponential servers, and split: streams PP(λ1), . . . , PP(λn) merge into PP(λ); the departure process is again PP(λ) and splits with probabilities q1, . . . , qk into streams PP(λq1), . . . , PP(λqk).
Problem 52
Consider the acyclic network in Figure 6.1. Say customers arrive externally at
nodes 0, 2, and 4 according to Poisson processes with mean rates 13, 12, and
15 customers per hour, respectively. After service at nodes 3, 4, and 5, a cer-
tain fraction of customers exit the network. After service at all nodes i (such
that 0 ≤ i ≤ 5), customers choose with equal probabilities among the options
available. For example, after service at node 4, with an equal probability of
1/3, customers choose nodes 3 or 5 or exit the network (while customers that
complete service at nodes 0, 1, and 2 do not immediately exit the network).
Further, node 0 has two servers but a capacity of 2 and generally distributed
service times. Nodes 1, 3, and 5 are single-server nodes with exponentially
distributed service times and infinite capacity. Node 2 is a two-server node
with exponentially distributed service times and infinite capacity. Node 4
is an infinite-server node with generally distributed service times. Assume
that the mean service times (in hours) at nodes 0, 1, 2, 3, 4, and 5 are 1/26,
1/15, 1/10, 1/30, 1/7, and 1/20, respectively. Compute the average number
of customers in each node in steady state as well as the overall number in the
network.
Solution
We consider node by node and derive Lj , the average number of customers
in node j (for 0 ≤ j ≤ 5).
Node 0: Arrivals to node 0 are according to PP(13). Since there are two
servers and capacity of 2, this node can be modeled as an M/G/2/2 queue.
The probability that this node is full is p2 and is given by
p2 = [(1/2)(13/26)²] / [1 + (13/26) + (1/2)(13/26)²] = 1/13.
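The computation of p2 is an instance of the Erlang loss (Erlang-B) formula; here is a minimal sketch (the function name and the mean-number formula L0 = a(1 − p2), i.e., accepted load, are additions for illustration):

```python
from math import factorial

def erlang_b(s, a):
    """Blocking probability in an M/G/s/s queue with offered load a = lam/mu."""
    terms = [a ** k / factorial(k) for k in range(s + 1)]
    return terms[-1] / sum(terms)

a = 13.0 / 26.0           # offered load at node 0
p2 = erlang_b(2, a)       # = 1/13, as computed above
L0 = a * (1.0 - p2)       # mean number in node 0: accepted throughput x E[S]
print(p2, L0)
```

The same function covers any of the loss nodes that satisfy the M/G/s/s row of Table 6.1, since the blocking probability depends on the service times only through their mean.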
Interestingly, the above example does not fully illustrate the properties
of Poisson-based acyclic queueing networks in all its glory. In fact, one of
the most unique properties that these acyclic networks satisfy and that is not
satisfied by the cyclic networks (that we will discuss in the next section) is
given in the next remark.
Remark 11
Consider an acyclic network that has N nodes with external arrivals accord-
ing to a Poisson process and conditions in Table 6.1 satisfied. Assume that
the network is in steady state at time 0. Let Xi (t) be the number of customers
in node i at time t for 1 ≤ i ≤ N and t ≥ 0. Then for any j and u such that
1 ≤ j ≤ N, j ≠ i, and u ≥ 0, the two random quantities Xi(t) and Xj(u) are
independent.
This remark enables us to decompose the queueing network and study one
node at a time (of course, the right order should be picked) without wor-
rying about dependence between them. Further, if one were to derive the
sojourn time distribution for a particular customer, then it would just be
the sum of sojourn times across each node in its path, which are all inde-
pendent. For example, a customer that enters node 2 in Problem 52, goes
to node 3, then to 5, and exits the network would have a total sojourn time
T equal to the sum of the times in nodes 2, 3, and 5, say T2 , T3 , and T5 ,
respectively. Then since T2 , T3 , and T5 are independent, we can compute the
LST of T as
E[e^{−sT}] = E[e^{−s(T2+T3+T5)}] = E[e^{−sT2}] E[e^{−sT3}] E[e^{−sT5}].
Remark 12
Consider a stable G/G/m queue where many servers are busy. The arrivals
are according to a renewal process and the servers are identical with ser-
vice times according to a general distribution. Whitt [104] shows that the
departures from such a queue are approximately a Poisson process.
he worried about was to make sure that the expectations of the parent com-
pany from which he franchised the motels were being met. However, in this
day and age where customers can easily check reviews on their smart phones
while they are traveling, it has become extremely important to provide excel-
lent service individually (not just overall). Cleve thought to himself that a
good number of his prospective clients are going to search on their smart-
phones and it would be of paramount importance to have good reviews.
Subsequently, Cleve brainstormed with Vineet opportunities for enhanc-
ing customer satisfaction. Based on that, three main ideas emerged: (1) per-
form a complementary multipoint inspection at the end of a service/repair
for all customers (a vehicle inspector would need to be hired for that);
(2) offer guarantees such as if you do not get your vehicle back within some τ
minutes, the service is free; and (3) install an additional bay and hire an addi-
tional automotive technician. It was not clear to Cleve, what the benefits of
these improvements would be. He knew for sure that idea (1) needs to be
done because all the dealership service stations are providing that inspec-
tion, and to stay competitive, Carfix must also offer it. Cleve told Vineet that
once he figures out the best option, he would be ready for another round
of golf with Vineet to discuss what kind of discounts he could provide his
customers to stay at one of Vineet’s motels.
One of Cleve’s nieces, Lauren, was a senior in Industrial Engineering who
Cleve thought could help him analyze the various options. When Lauren
heard about the problem, she was excited. She talked to her professor to find
out if he would allow her and a couple of her friends to work on Cleve’s
problem for their capstone design course. The professor agreed, in fact
he was elated because it would solve the mismatch he was encountering
between the number of students and projects. When Lauren and her friends
arrived at Carfix they found out that there was no historical data, so they
spent a few hours collecting data. Interestingly, when the students were at
Carfix, Cleve also had two inspector-candidates who were going to inter-
view for the multipoint inspection position. For the interview, Cleve asked
the two candidates to perform inspections on a few vehicles.
to be greeted. That is because if the greeter was busy when a new vehicle
arrived, the cashier or Cleve himself would go and greet.
At the second stage, technicians would pick up the parked vehicle in the
order they arrived and take them to their bay. There were four technicians,
each with his or her own bay. The bays were equipped to handle all repairs
and service operations. After completing the repair, the technicians would
return the vehicles back to the parking area. The time between when a tech-
nician would pick up a vehicle from where it is parked till he or she would
drop that vehicle off is exponentially distributed with mean about 36 min.
Given the different types of services and repairs as well as the types of vehicles,
it did not surprise Lauren and her friends that a high-variability distribution
such as the exponential fit well. Then in the third stage, the inspector would
perform a multipoint inspection at the parking lot and send a report.
Among the two inspectors interviewed, inspector A had a mean of 6 min
and a standard deviation of 4.24 min, whereas inspector B had a mean of
7 min and a standard deviation of 3.5 min. Although the number of sample
points was small, Lauren and her friends felt comfortable using a gamma
distribution for the inspection times. They also realized that it was not
necessary to include the time at the cashier because what the customers really
cared about was the time between when they dropped off the car and when
they heard from Cleve or another staff member (on days Cleve was out golfing)
that the repair or service was complete. So Lauren and friends decided on using
a three-stage tandem system to represent Carfix’s shop.
Since there was never a queue buildup, Lauren and friends modeled
the first stage as an M/G/∞ queue with PP(λ) arrivals and Unif (a, b) service
times, where λ = 4.8 per hour, a = 1 min, and b = 2 min. Clearly, the
departures from this queue would be PP(λ) and would act as arrivals to the
second stage. They modeled the second stage as an M/M/4 queue with PP(λ)
arrivals and exp(μ) service times, where μ = 5/3 per hour. Since the departure
process from a stable M/M/s queue is a Poisson process, they modeled the third
stage as an M/G/1 queue with PP(λ) arrivals, and mean and variance of
service times depending on whether inspector A or B is used.
The sojourn time Y1 at stage 1 is simply the Unif (1, 2) service time (in minutes), whose LST is

E[e^{−s Y1}] = (e^{−s} − e^{−2s})/s = e^{−s} (1 − e^{−s})/s.
322 Analysis of Queues
At this point, Lauren and friends were not sure if the LST was necessary
but felt that it was good to keep it in case they were to compute the LST
of the total sojourn time in the system. Then, using the fact that λ = 4.8 per
hour, the average number of vehicles in stage 1 is λ ∗ 1.5/60 = 0.12. Thus the
steady-state probability that there are no more than two vehicles in stage 1
is (1 + 0.12 + (0.12)2 /2!)e−0.12 = 0.9997, which clearly justifies the use of the
M/G/∞ model. In fact, it shows how rarely Cleve would have to greet a
customer (although it appears that the cashier would have to greet only about
one in 100 vehicles on average, since the probability of zero or one vehicle at
stage 1 is 0.9934, in reality it was a lot more often because the greeter was
called upon to run odd jobs from time to time).
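As a quick check of these numbers, recall that the steady-state number in an M/G/∞ queue is Poisson distributed with mean λE[S] regardless of the service distribution; a minimal sketch using only the values above:

```python
import math

# M/G/infinity: steady-state number in system is Poisson with mean lam*E[S].
lam = 4.8 / 60.0            # arrival rate per minute
mean_service = 1.5          # E[S] for Unif(1, 2) min service
load = lam * mean_service   # = 0.12 vehicles on average in stage 1

def poisson_cdf(k, mean):
    """P(N <= k) for N ~ Poisson(mean)."""
    return math.exp(-mean) * sum(mean**n / math.factorial(n) for n in range(k + 1))

print(round(poisson_cdf(2, load), 4))  # P(no more than 2 vehicles) -> 0.9997
print(round(poisson_cdf(1, load), 4))  # P(zero or one vehicle)     -> 0.9934
```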
Lauren and friends next considered stage 2, which they modeled as an
M/M/4 queue with PP(λ) arrivals and exp(μ) service. Plugging in λ = 4.8
per hour and μ = 5/3 per hour, they got a traffic intensity ρ = λ/(4μ) = 0.72 at
stage 2. Clearly, this is stable, albeit not a low traffic intensity. It does appear
that adding a new bay (thus an M/M/5 queue) would significantly reduce the
traffic intensity to 0.576, resulting in better customer service. Starting with
the present M/M/4 system, Lauren and friends calculated the mean sojourn
time in stage 2 as W2 = 50.79 min (then for the M/M/5 system they calculated
that it would be 39.52 min). Also, with s servers, the sojourn time Y2 had a
CDF F2 (y) = P{Y2 ≤ y} given by

F2 (y) = Σ_{j=0}^{s−1} p_j (1 − e^{−μy}) + p_0 ((λ/μ)^s / s!) [ (sμ / ((s − 1)μ − λ)) (1 − e^{−μy}) − (sμ^2 / ((sμ − λ)[(s − 1)μ − λ])) (1 − e^{−(sμ−λ)y}) ]

where p_j denotes the steady-state probability of j customers in the M/M/s queue and

p_0 = [ Σ_{n=0}^{s−1} (1/n!) (λ/μ)^n + ((λ/μ)^s / s!) (1 / (1 − λ/(sμ))) ]^{−1}.
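The W2 values above can be reproduced from the standard M/M/s (Erlang-C) formulas; a sketch using only λ, μ, and s from the text:

```python
import math

def mms_sojourn_mean(lam, mu, s):
    """Mean sojourn time W = Wq + 1/mu for a stable M/M/s queue."""
    r = lam / mu
    rho = r / s
    p0 = 1.0 / (sum(r**n / math.factorial(n) for n in range(s))
                + r**s / (math.factorial(s) * (1 - rho)))
    lq = p0 * r**s * rho / (math.factorial(s) * (1 - rho)**2)  # mean queue length
    return lq / lam + 1 / mu                                   # Wq + service time

lam, mu = 4.8, 5.0 / 3.0                            # per hour
print(round(60 * mms_sojourn_mean(lam, mu, 4), 2))  # M/M/4 -> 50.79 min
print(round(60 * mms_sojourn_mean(lam, mu, 5), 2))  # M/M/5 -> 39.52 min
```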
Next, for stage 3, the M/G/1 sojourn time Y3 has LST (with G̃(s) denoting the LST of the inspection time distribution and time measured in minutes)

E[e^{−s Y3}] = (1 − ρ3) s G̃(s) / (s − (λ/60)(1 − G̃(s)))

where the traffic intensity ρ3 = 0.1λ since the mean inspection time is 6 min,
that is, 0.1 h. Inverting the LST, Lauren and friends computed the stage-3
sojourn time (in min) CDF as

P{Y3 ≤ y} = 1.3724 (1 − e^{−0.1252y}) − 0.3724 (1 − e^{−0.4615y}).
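Although the text inverts the LST to obtain the full distribution, the mean stage-3 sojourn time for each inspector follows directly from the Pollaczek-Khinchine formula; a sketch using the inspector data from the text:

```python
# Pollaczek-Khinchine formula for the mean sojourn time in an M/G/1 queue:
# W = E[S] + lam * E[S^2] / (2 * (1 - rho)), with rho = lam * E[S].
def mg1_sojourn_mean(lam, mean_s, sd_s):
    es2 = sd_s**2 + mean_s**2        # second moment of the service time
    rho = lam * mean_s
    return mean_s + lam * es2 / (2 * (1 - rho))

lam = 4.8 / 60.0                       # arrivals per minute
WA = mg1_sojourn_mean(lam, 6.0, 4.24)  # inspector A
WB = mg1_sojourn_mean(lam, 7.0, 3.5)   # inspector B
print(round(WA, 2), round(WB, 2))      # -> 10.15 12.57
```

Note that inspector B, despite the lower standard deviation, yields a larger mean sojourn time because of the longer mean inspection time.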
Using these results, Lauren and friends carried out the required what-if analysis
so that Cleve could use a cost-benefit analysis to determine the best alternatives.
Eventually, Cleve ended up implementing all the recommendations Lauren and
friends made. As expected, this improved customer satisfaction as well as
increased demand. However, Cleve was well positioned for that higher demand.
To analyze the open Jackson network, one of the preliminary results is flow
conservation and stability, which we describe next.
Exact Results in Network of Queues: Product Form 325
Let aj denote the total (effective) arrival rate into node j. Then flow conservation gives

a_j = λ_j + Σ_{i=1}^{N} a_i p_ij,  ∀ j = 1, 2, . . . , N.   (6.1)
This result is due to the fact that the total arrival rate into node j equals the
external arrival rate λj plus the sum of the departure rates from each node i
(for i = 1, . . . , N) times the fraction that are routed to j (i.e., pij ). Therefore,
let a = (a1 , a2 , . . . , aN ) be the resulting row vector that we need to obtain.
We can rewrite Equation 6.1 for the aj in matrix form as a = λ + aP, where
λ = (λ1 , λ2 , . . . , λN ) is a row vector. Then a can be solved using

a = λ(I − P)^{−1}.   (6.2)

Note that for this result we require (I − P) to be invertible. Unlike the acyclic
networks where aj values are easy to compute, when the networks are cyclic,
one may have to rely on Equation 6.2.
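For a cyclic network where inverting (I − P) by hand is inconvenient, the traffic equations a = λ + aP can also be solved by successive substitution, which converges whenever every customer eventually leaves (P substochastic with spectral radius below one). A minimal sketch; the one-node feedback example is hypothetical, chosen so that Equation 6.2 gives a = λ/p:

```python
# Solve a = lam + a P by fixed-point iteration.
def traffic_rates(lam, P, iters=2000):
    N = len(lam)
    a = list(lam)
    for _ in range(iters):
        a = [lam[j] + sum(a[i] * P[i][j] for i in range(N)) for j in range(N)]
    return a

# Single node whose customers reenter with probability 1 - p = 0.4,
# so the effective arrival rate should be lam / p = 2 / 0.6.
a = traffic_rates([2.0], [[0.4]])
print(round(a[0], 6))  # -> 3.333333
```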
Now that aj values can be obtained for all j, we are in a position to state
the stability condition. For any j (such that 1 ≤ j ≤ N), node j is stable if
aj < sj μj .
Then we check if each queue is stable. If all queues are stable then we can
conclude that the flow rates through node j are indeed aj . For the open Jack-
son network described earlier, the objective is to derive the joint distribution
as well as marginal distribution (and moments) of the number of customers
in each queue. We do that next and describe a product form for the joint
distribution and thereby derive the marginals.
To obtain p(x) we consider the balance equations (flow out equals flow
in) for x = (x1 , x2 , . . . , xN ) just like we would do for any CTMC. To make
our notation crisp, we write the balance equations in terms of ei , which is
a unit vector with one as the ith element and zeros everywhere else. For
example, if N = 4, then e2 = (0, 1, 0, 0). Also for notational convenience we
denote p(x) as zero if any xj < 0. Thus, the generic balance equation takes
the form
p(x) [ Σ_{i=1}^{N} λ_i + Σ_{i=1}^{N} min(x_i , s_i ) μ_i ]

 = Σ_{i=1}^{N} p(x − e_i ) λ_i + Σ_{i=1}^{N} p(x + e_i ) r_i min(x_i + 1, s_i ) μ_i + Σ_{j=1}^{N} Σ_{i=1}^{N} p(x + e_i − e_j ) p_ij min(x_i + 1, s_i ) μ_i .
To explain this briefly, note that the LHS includes all the transitions out of
state x, which include any external arrivals or service completions. Likewise,
the RHS includes all the transitions into state x, that is, external arrivals
as well as service completions that lead to exiting the network or joining
other nodes. If this is not clear, it may be worthwhile for the reader to try an
example with a small number of nodes before proceeding further.
It is mathematically intractable to directly solve the balance equations
to get p(x) for all x except for special cases. However, since we know that
there is a unique solution to the balance equations, if we find a solution then
that is the solution. In that spirit, consider an acyclic open Jackson network
for which from an earlier section we can compute p(x). For the acyclic open
Jackson network, node j (such that 1 ≤ j ≤ N) would be an M/M/sj queue
with PP(aj ) arrivals, exp(μj ) service and sj servers (if the stability condition
at each node j is satisfied, that is, aj < sj μj ). Hence, it is possible to obtain the
steady-state probability of having n customers in node j, which we denote as
φj (n). Using the M/M/s queue results in Chapter 2, we have
φ_j (n) = (1/n!) (a_j /μ_j )^n φ_j (0)   if 0 ≤ n ≤ s_j − 1,
φ_j (n) = (1/(s_j ! s_j^{n−s_j})) (a_j /μ_j )^n φ_j (0)   if n ≥ s_j ,   (6.3)

where

φ_j (0) = [ Σ_{n=0}^{s_j −1} (1/n!) (a_j /μ_j )^n + ((a_j /μ_j )^{s_j} / s_j !) (1/(1 − a_j /(s_j μ_j ))) ]^{−1} .
Our guess for the generic open Jackson network is the product form p(x) = φ1 (x1 )φ2 (x2 ) . . . φN (xN ),
where φj (n) is given by Equation 6.3. We need to verify that this p(x) satisfies
the balance equation
p(x) [ Σ_{i=1}^{N} λ_i + Σ_{i=1}^{N} min(x_i , s_i ) μ_i ]

 = Σ_{i=1}^{N} p(x − e_i ) λ_i + Σ_{i=1}^{N} p(x + e_i ) r_i min(x_i + 1, s_i ) μ_i + Σ_{j=1}^{N} Σ_{i=1}^{N} p(x + e_i − e_j ) p_ij min(x_i + 1, s_i ) μ_i .
For that, notice first of all that if p(x) = φ1 (x1 )φ2 (x2 ) . . . φN (xN ), then

p(x)/p(x ± e_i ) = φ_i (x_i )/φ_i (x_i ± 1)   and   p(x + e_i − e_j )/p(x) = (φ_i (x_i + 1) φ_j (x_j − 1)) / (φ_i (x_i ) φ_j (x_j ))

for all i and j. In addition, from the definition of φi (n) in Equation 6.3, we can
obtain the following:
Σ_{i=1}^{N} λ_i + Σ_{i=1}^{N} min(x_i , s_i ) μ_i

 = Σ_{i=1}^{N} (min(x_i , s_i ) μ_i / a_i ) λ_i + Σ_{i=1}^{N} a_i r_i + Σ_{j=1}^{N} Σ_{i=1}^{N} min(x_j , s_j ) μ_j (a_i / a_j ) p_ij

 = Σ_{i=1}^{N} (min(x_i , s_i ) μ_i / a_i ) λ_i + Σ_{i=1}^{N} a_i r_i + Σ_{j=1}^{N} (min(x_j , s_j ) μ_j / a_j ) Σ_{i=1}^{N} a_i p_ij

 = Σ_{i=1}^{N} (min(x_i , s_i ) μ_i / a_i ) λ_i + Σ_{i=1}^{N} a_i r_i + Σ_{j=1}^{N} (min(x_j , s_j ) μ_j / a_j ) (a_j − λ_j )

 = Σ_{i=1}^{N} a_i r_i + Σ_{j=1}^{N} min(x_j , s_j ) μ_j

where the third equality can be derived using Σ_{i=1}^{N} a_i p_ij = a_j − λ_j ,
which is directly from Equation 6.1. Since the other two terms cancel,
p(x) = φ1 (x1 )φ2 (x2 ) . . . φN (xN ) is the solution to the balance equation if

Σ_{i=1}^{N} λ_i = Σ_{i=1}^{N} a_i r_i .
This equation is true because (using the notation e as a column vector of ones)
from Equation 6.2 we have

λe = a(I − P)e,

⇒ Σ_{i=1}^{N} λ_i = Σ_{i=1}^{N} a_i (1 − Σ_{j=1}^{N} p_ij ),

⇒ Σ_{i=1}^{N} λ_i = Σ_{i=1}^{N} a_i r_i .
Thus, p(x) = φ1 (x1 )φ2 (x2 ) . . . φN (xN ) satisfies the balance equations for a
generic open Jackson network. In other words, the steady-state joint proba-
bility distribution of having x1 in node 1, x2 in node 2, . . ., xN in node N is
equal to the p(x), which is the product of the φj (xj ) values for all j. Hence,
this result is known as product form. Now to get the marginal distribution
that queue j has xj customers in steady state for some j such that 1 ≤ j ≤ N,
all we have to do is sum over all x keeping xj a constant. Thus the marginal
probability that node j has xj in steady state is given by φj (xj ). Notice that the
joint probability is the product of the marginals. From a practical standpoint,
this is extremely convenient because we can model each node j as though it
is an M/M/sj queue (although in reality it may not be) with PP(aj ) arrivals
and exp(μj ) service times. Then we can obtain φj (xj ) and then get p(x). Also,
the steady-state expected number in node j, Lj , can be computed using the
M/M/sj results as well. Then L can be computed by adding over all j from
1 to N. Similarly, it is possible to obtain performance measures at node j
such as the average waiting time (Wj ), time in queue not including service (Wjq ),
and number in queue not including service (Ljq ), using the single-station
M/M/s queue analysis in Chapter 2. However, while computing something
like sojourn time distribution (across a single node or the network), one has
to be more careful. In the next remark we illustrate some of the issues that
have been nicely described in Disney and Kiessler [26].
Remark 13
Consider an open Jackson network (with cycles) that is stationary (i.e., in
steady state) at time 0. For this network the following results hold:
6.2.3 Examples
In this section, we present four examples to illustrate the approach to obtain
performance measures in open Jackson networks, discuss design issues, and
present two well-known paradoxes.

Problem 53
Bay-Gull Bagels is a bagel store in downtown College Taste-on. See
Figure 6.4 for a schematic representation of the store as well as numerical
values used for arrivals, service, as well as routing probabilities. Assume
all arrival processes are Poisson and service times are exponential. Aver-
age arrival rates and average service times for each server are given in the
figure. Assume there are infinite servers whenever a station says self-service.
Also assume all queues have infinite capacity. Model the system as an open
Jackson network and obtain the average number of customers in each of
the five stations. Then state how many customers are in the system on an
average.
Solution
The system can be modeled as an open Jackson network with N = 5 nodes or
stations. Let the set of nodes be {B, S, D, C, E} (as opposed to numbering them
from 1 to 5) denoting the five stations: bagels, smoothies, drinks, cashier,
and eat-in. Note that external arrival processes are Poisson and they occur
at stations B, S, and D. The service times are exponentially distributed. Note
the cycle D–C–E–D. Thus we have an open Jackson network with a cycle.
There are three servers at node B, two servers at node S, ∞ servers at node
FIGURE 6.4
Schematic of Bay-Gull Bagels: the five stations (bagels: three servers; special smoothies: two servers, 4.5 min; drinks, namely, coffee: self-service, 2 min; cashier; eat-in: self-service, 20 min) with their mean service times, external arrival rates, and routing percentages.
D, two servers at node C, and ∞ servers at node E. The external arrival rate
vector λ = [λB λS λD λC λE ] is given by the rates in Figure 6.4 (with λC = λE = 0,
since external arrivals occur only at stations B, S, and D).
We assume that after getting served at the bagel queue, each customer
chooses node S, C, and D with probabilities pBS = 0.2, pBC = 0.5, and
pBD = 0.3, respectively. Likewise, after getting served at C, customers go to
node E or exit the system with equal probability, that is, pCE = rC = 0.5. Similarly,
after service at node E, customers enter the drinks node with probability
pED = 0.05 or exit the network with probability rE = 0.95. Also, pSC = 1 and pDC = 1.
Thus the routing probabilities in the order {B, S, D, C, E} are
P =
[ 0    0.2  0.3   0.5  0
  0    0    0     1    0
  0    0    0     1    0
  0    0    0     0    0.5
  0    0    0.05  0    0   ].
Say, in the previous problem, we were to find the probability that there
are 3 customers in node B, 2 in S, 1 in D, 4 in C, and 20 in E. Then that
is equal to the product φB (3)φS (2)φD (1)φC (4)φE (20) and can be computed
using Equation 6.3. Note that the φj (n) computation is indeed equal to the
probability that there are n in a steady-state M/M/sj queue with PP(aj ) arrivals
and exp(μj ) service. Having described that, we next present an example that
discusses some design issues by comparing various ways of setting up systems
with multiple stages and servers.
Problem 54
Consider a system into which customers arrive according to a Poisson pro-
cess with parameter λ. Each customer needs N stages of service and each
stage takes exp(μ) amount of time. There are N servers in the system and N
buffers for customers to wait. Assume that the buffers have infinite capacity
and λ < μ. There are two design alternatives to consider:
1. Serial system: Each set of buffer and server is placed serially in a tan-
dem fashion as described in Figure 6.5. Each node corresponds to a
different stage of service. Customers arrive at the first node accord-
ing to PP(λ). There is a single server that takes exp(μ) time to serve
after which the customer goes to the second node. At the second
node there is a single server that takes exp(μ) time. This continues
until the Nth node and then the customer exits.
FIGURE 6.5
System of N buffers and servers in series (arrivals PP(λ) into the first node).
2. Parallel system: The N buffers and servers are placed in parallel as
described in Figure 6.6. Each arriving customer is routed to one of
the N nodes with probability 1/N, so that each node sees PP(λ/N)
arrivals. The single server at a node performs all N stages of service
for that customer, one stage after another, each taking exp(μ) time,
after which the customer exits.

FIGURE 6.6
System of N buffers and servers in parallel (the PP(λ) arrival stream is split with probability 1/N to each node, yielding PP(λ/N) arrivals at each node).
By comparing the mean sojourn time for an arbitrary customer in steady state
in both systems, determine whether the serial or parallel system is better.
Solution
Note that although the figures appear to be somewhat different, the
resources of the system and service needs of customers are identical. In other
words, in both systems we have N single-server queues each with an infinite
buffer. Also, in both systems the customers experience N stages of service,
each taking an exp(μ) time. We now analyze the system in the same order
they were presented.
1. The serial system is an open Jackson network (in fact an acyclic net-
work) with N single-server nodes back to back. Service time at each
node is exp(μ). Such a serial system is called a pipeline system in the
computer science literature and tandem network in the manufactur-
ing literature. Clearly, each queue is an M/M/1 queue with PP(λ)
arrivals and exp(μ) service. The average time in each node is thus
1/(μ − λ). Thus, the mean sojourn time in the system is
W_series = N/(μ − λ).

2. For the parallel system, each node is an M/G/1 queue with PP(λ/N)
arrivals and service consisting of N back-to-back exp(μ) stages (an
Erlang service time with mean N/μ), so using the Pollaczek-Khinchine
formula,

W_parallel = N/μ + λ(N + 1)/(2μ(μ − λ)) = (2Nμ − Nλ + λ)/(2μ(μ − λ)).
Comparing Wseries and Wparallel , if N > 1 then Wseries > Wparallel ; however, if
N = 1 then Wseries = Wparallel . Thus the parallel system is better.
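The comparison can be sketched numerically (parameter values assumed for illustration):

```python
def w_series(N, lam, mu):
    # N M/M/1 queues in tandem, each seeing PP(lam) arrivals and exp(mu) service
    return N / (mu - lam)

def w_parallel(N, lam, mu):
    # N M/G/1 queues, each with PP(lam/N) arrivals and N back-to-back exp(mu) stages
    return N / mu + lam * (N + 1) / (2 * mu * (mu - lam))

N, lam, mu = 4, 0.8, 1.0   # assumed values with lam < mu
print(round(w_series(N, lam, mu), 2), round(w_parallel(N, lam, mu), 2))  # -> 20.0 14.0
assert w_parallel(N, lam, mu) < w_series(N, lam, mu)
```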
In practice, though, a fully parallel arrangement (in which every server performs
all N tasks) may be undesirable from a training and material handling standpoint
as well (or too expensive if the time consumption is to be reduced). Therefore, in
practice, sometimes a combination of serial and parallel tasking is used. Usually
servers are trained in two to four tasks
that they perform together. This not only improves the system performance
but also reduces monotonous conditions. Having described that, the next
two examples are paradoxes that further help understand issues in queueing
networks.
Problem 55
Braess’ paradox: In a network, does adding extra capacity always improve
the system in terms of performance? Although intuition suggests it should,
adding extra capacity to a network in which the moving entities selfishly choose
their routes can in some cases worsen the overall performance! Illustrate this
using an example.
Solution
Consider a network with nodes A, B, C, and D. There are directed arcs from
A to B, B to D, A to C, and C to D. Customers arrive into node A according
to a Poisson process with mean rate 2λ. The customers need to reach node D
and they have two paths, one through B and the other through C, as shown
in Figure 6.7. Along the arc from A to B there is a single-server queue with
exponentially distributed service times (and mean 1/μ). Likewise, there is an
identical queue along the arc from C to D. In addition, it takes a deterministic
time of 2 units to traverse arcs AC and BD. Assume that
μ > λ + 1.
FIGURE 6.7
Travel times along arcs in equilibrium (A→B and C→D: single-server queues with mean travel time 1/(μ − λ); B→D and A→C: deterministic travel time 2; arrivals PP(2λ) into A).

In equilibrium, the arriving customers split equally between the two paths, so
each queue sees PP(λ) arrivals and the mean travel time along either path is

2 + 1/(μ − λ).
Note that the constant time of 2 units can be modeled as an M/G/∞ queue
with deterministic service time of 2 units; the output from that queue is then
still PP(λ). Another way of seeing that would be that the departure process
after spending 2 time units would be identical to the entering process, just
shifted by 2 time units. Hence, it would have to be Poisson. Thus the time
across either path is 2 + 1/(μ − λ), where the second term is the sojourn time
of an M/M/1 queue with PP(λ) arrivals and exp(μ) service.
Now a new path from B to C is constructed along which it would take a
deterministic time of 1 unit to traverse. For the first customer that arrives into
this new system described in Figure 6.8, this would be a shortcut because the
new expected travel time would be 1 + 2/(μ − λ), which is smaller than the
old expected travel time given earlier under the assumption μ > λ + 1. Soon,
the customers would selfishly choose their routes so that in equilibrium, all
three paths A − B − D, A − C − D, and A − B − C − D have identical mean
travel times. Actually, the equilibrium splits need not be calculated; instead,
notice that each of the three routes would take 3 time units to traverse on
average (this is the only way the three paths would have identical travel times).
But the old travel time before the new capacity was added,
2 + 1/(μ − λ), is actually less than 3 units under the assumption μ > λ + 1.
Thus adding extra capacity has actually worsened the average travel
times!
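A one-line numerical check of the paradox (λ and μ assumed, satisfying μ > λ + 1):

```python
# Braess' paradox: mean travel time per path before vs after the shortcut.
lam, mu = 1.0, 2.5               # assumed values with mu > lam + 1
before = 2 + 1 / (mu - lam)      # either original path in equilibrium
after = 3.0                      # all three paths equalize at 3 time units
print(round(before, 3), after)   # -> 2.667 3.0: travel times got worse
assert before < after
```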
FIGURE 6.8
New travel times along arcs in equilibrium (the added B→C arc has deterministic travel time 1; in equilibrium each of the three routes takes 3 time units).
Problem 56
Can the computation of waiting times in a queueing system depend on
the method? Consider a stable queue that gets customer arrivals externally
according to a Poisson process with mean rate λ. There is a single server
and infinite waiting room. The service times are exponentially distributed
with mean 1/μ. At the end of service each customer exits the system with
probability p and reenters the queue with probability (1 − p). The system
is depicted in Figure 6.9, for now ignore A, B, C, and D. We consider two
models:
1. If the system is modeled as a birth and death process with birth rates
λ and death rates pμ, then L = λ/(pμ − λ) and W = L/λ = 1/(pμ − λ).
2. If the system is modeled as a Jackson network with 1 node and effec-
tive arrival rate λ/p and service rate μ, then L = (λ/p)/(μ − λ/p) and
W = L/(λ/p) = p/(pμ − λ).
Clearly, the two methods give the same L but the W values are different!
Explain.
Solution
Although this appears to be a paradox, that is really not the case. Let us
revisit Figure 6.9 but now let us consider A, B, C, and D. The W from the
first method (birth and death model) is measured between A and D, which
is the total time spent by a customer in the system (going through one or
more rounds of service). The W from the second method (Jackson network
model) is measured between B and C, which is the time spent by a customer
from the time he or she entered the queue until one round of service is com-
pleted. Note that the customer does a geometric number of such services
(with mean 1/p). Therefore, the total time spent on average would indeed be
the same in either method if we used the same points of reference, that is,
A and D.
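The resolution can be checked numerically (sample parameters assumed):

```python
# Problem 56's two W's measure different intervals: A-to-D covers a geometric
# number of service rounds (mean 1/p), while B-to-C covers exactly one round.
lam, mu, p = 1.0, 3.0, 0.5       # hypothetical stable parameters (lam < p*mu)
W_bd = 1 / (p * mu - lam)        # birth and death model: time from A to D
W_jackson = p / (p * mu - lam)   # Jackson network model: time from B to C
mean_rounds = 1 / p              # mean of the geometric number of rounds
assert abs(W_bd - mean_rounds * W_jackson) < 1e-12
```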
FIGURE 6.9
Points of reference (arrivals PP(λ) enter at A, join the queue at B, complete exp(μ) service at C, and exit at D with probability p or return to the queue with probability 1 − p).
The exercises at the end of the chapter describe several more examples of
open Jackson networks. Next, we consider closed Jackson networks.
3. The service rate at node i when there are n customers in that node is
μi (n) with μi (0) = 0 and μi (n) > 0 for 1 ≤ n ≤ C. The service times are
exponentially distributed.
Analogous to the open Jackson network case, the generic balance equation takes the form

p(x) Σ_{i=1}^{N} μ_i (x_i ) = Σ_{j=1}^{N} Σ_{i=1}^{N} p(x + e_i − e_j ) p_ij μ_i (x_i + 1).
In this balance equation, the LHS describes the total rate of all the transitions
out of state x that includes all possible service completions. Likewise, the
RHS includes all the transitions into state x, that is, service completions from
various states that result in x.
Except for small C and N, solving the balance equations directly is diffi-
cult. However, like we saw in the open Jackson network case, here too we
will guess a p(x) solution and check if it satisfies the balance equations. If it
does, then we are done since there is only one solution to the balance equa-
tions. As an initial guess for p(x), we try the open-queueing network result
itself. For that, recall from Equation 6.2 that a(I − P) = λ; however, λ is a vec-
tor of zeros since there are no external arrivals. Hence, we define a as the
solution to a(I − P) = 0, in other words a solution to
a = aP.
Using such a solution a, define

φ_j (n) = Π_{k=1}^{n} (a_j / μ_j (k))   for n ≥ 1,   (6.4)

with φ_j (0) = 1.
Then our guess for the joint distribution is

p(x) = G(C) φ1 (x1 ) φ2 (x2 ) . . . φN (xN ) = G(C) Π_{i=1}^{N} φ_i (x_i ),

where the normalizing constant G(C) is chosen so that

Σ_{x: x1 +x2 +···+xN =C} p(x) = 1.
Next, we need to verify if the p(x) here satisfies the balance equation
p(x) Σ_{i=1}^{N} μ_i (x_i ) = Σ_{j=1}^{N} Σ_{i=1}^{N} p(x + e_i − e_j ) p_ij μ_i (x_i + 1).
For that, first of all if p(x) = G(C)φ1 (x1 )φ2 (x2 ) . . . φN (xN ), then

p(x + e_i − e_j )/p(x) = (φ_i (x_i + 1) φ_j (x_j − 1)) / (φ_i (x_i ) φ_j (x_j ))

for all i and j. In addition, from the definition of φi (n) in Equation 6.4, we can
obtain the following:
Σ_{i=1}^{N} μ_i (x_i ) = Σ_{j=1}^{N} Σ_{i=1}^{N} (p(x + e_i − e_j )/p(x)) μ_i (x_i + 1) p_ij

 = Σ_{j=1}^{N} Σ_{i=1}^{N} ((φ_i (x_i + 1) φ_j (x_j − 1)) / (φ_i (x_i ) φ_j (x_j ))) μ_i (x_i + 1) p_ij

 = Σ_{j=1}^{N} Σ_{i=1}^{N} (a_i / μ_i (x_i + 1)) (μ_j (x_j ) / a_j ) μ_i (x_i + 1) p_ij

 = Σ_{j=1}^{N} Σ_{i=1}^{N} (μ_j (x_j ) / a_j ) a_i p_ij

 = Σ_{j=1}^{N} (μ_j (x_j ) / a_j ) Σ_{i=1}^{N} a_i p_ij

 = Σ_{j=1}^{N} (μ_j (x_j ) / a_j ) a_j = Σ_{j=1}^{N} μ_j (x_j )
where the penultimate equality can be derived using Σ_{i=1}^{N} a_i p_ij = a_j , which
is directly from a = aP. Thus, p(x) = G(C)φ1 (x1 )φ2 (x2 ) . . . φN (xN ) satisfies the
balance equations for a closed Jackson network.
In other words, the steady-state joint probability distribution of having
x1 in node 1, x2 in node 2, . . ., xN in node N is equal to p(x), which is
the product of the φj (xj ) values for all j times a normalizing constant G(C).
Hence, this result is also a product form. Note that for this result, similar to the
other product-form cases that we will consider subsequently, the difficulty
arises in computing the normalizing constant G(C). In general, it is not
computationally trivial. However, once G(C) is obtained, one can compute the
marginal distribution that queue j has xj customers in steady state for some j
such that 1 ≤ j ≤ N: all we have to do is sum p(x) over all x keeping xj constant.
We proceed by first explaining a simple example.
Problem 57
Consider a closed Jackson network with three nodes and five customers,
that is, N = 3 and C = 5. The network structure is depicted in Figure 6.10.
Essentially all five customers behave in the following fashion: upon com-
pleting service at node 1, a customer rejoins node 1 with probability 0.5, or
joins node 2 with probability 0.1, or joins node 3 with probability 0.4; upon
completing service in node 2 or 3, a customer always joins node 1. Node 1
has a single server that serves at rate i if there are i customers at the node.
Node 2 has two servers each with service rate 1. Node 3 has one server with
service rate 2. Determine the joint as well as marginal probability distribu-
tion of the number of customers at each node in steady state. Also compute
the average number in each node as well as the network in steady state.
Solution
Although it is not explicitly stated, the service times are exponentially dis-
tributed (since it is a closed Jackson network). For such a system, to compute
the joint distribution of the steady-state number in each node we first solve
for vector a in
a = aP
where P is the routing probability matrix that can be obtained from the
problem statement as
P =
[ 0.5  0.1  0.4
  1    0    0
  1    0    0   ].
FIGURE 6.10
Closed Jackson network with C = 5 customers (from node 1, customers rejoin node 1 with probability 0.5, join node 2 with probability 0.1, and join node 3 with probability 0.4; nodes 2 and 3 always route back to node 1).
Solving a = aP yields, up to a multiplicative constant, a = [1 0.1 0.4] (the
scaling does not matter since the constant gets absorbed into G(C) upon
normalization). Next we obtain the state-dependent service rates. The single
server at node 1 serves at rate n when there are n customers; thus, μ1 (n) = n
for all n, which is the service rate when there are n customers
in node 1. Likewise, since node 2 has two servers each serving at rate 1, if
there is only 1 customer in node 2, the service rate is 1; however, if there are
2 or more customers in that node, the net service rate at the node (which is
the rate at which customers exit that node) is 2. Hence we have μ2 (1) = 1 and
μ2 (n) = 2 for all n ≥ 2. Since there is a single server serving at rate 2 in node 3,
the service rate when there are n customers in node 3 is μ3 (n) = 2 for all n ≥ 1.
Using these, from Equation 6.4, for j = 1, 2, 3 and n ≥ 1,

φ_j (n) = Π_{k=1}^{n} (a_j / μ_j (k)).
Writing q(x1 , x2 , x3 ) = φ1 (x1 )φ2 (x2 )φ3 (x3 ), the normalizing constant G(C)
satisfies

1/G(C) = Σ_{x1 ,x2 ,x3 : x1 +x2 +x3 =5} q(x1 , x2 , x3 ).

The resulting joint probabilities p(x1 , x2 , x3 ) = G(C) q(x1 , x2 , x3 ) are given in
Table 6.2.
TABLE 6.2
Example Joint Probability Distribution
x1 x2 x3 p(x1 , x2 , x3 ) x1 x2 x3 p(x1 , x2 , x3 ) x1 x2 x3 p(x1 , x2 , x3 )
0 0 5 0.0077 0 1 4 0.0039 0 2 3 0.0010
0 3 2 0.0002 0 4 1 0.0001 0 5 0 0.00002
1 0 4 0.0386 1 1 3 0.0193 1 2 2 0.0048
1 3 1 0.0012 1 4 0 0.0003 2 0 3 0.0964
2 1 2 0.0482 2 2 1 0.0121 2 3 0 0.0030
3 0 2 0.1607 3 1 1 0.0803 3 2 0 0.0201
4 0 1 0.2009 4 1 0 0.1004 5 0 0 0.2009
The marginal distributions can be computed as

p1 (x1 ) = Σ_{x2 =0}^{5} Σ_{x3 =0}^{5} p(x1 , x2 , x3 ),

p2 (x2 ) = Σ_{x1 =0}^{5} Σ_{x3 =0}^{5} p(x1 , x2 , x3 ),

p3 (x3 ) = Σ_{x1 =0}^{5} Σ_{x2 =0}^{5} p(x1 , x2 , x3 ).
For the preceding numerical values, the marginal probability vectors pi (for
i = 1, 2, 3) are approximately

p1 = [0.0129 0.0642 0.1597 0.2611 0.3013 0.2009],
p2 = [0.7051 0.2521 0.0379 0.0045 0.0004 0.0000],
p3 = [0.3247 0.2945 0.2140 0.1167 0.0424 0.0077],

where pi = [pi (0) pi (1) pi (2) pi (3) pi (4) pi (5)]. Let Li be the average number
of customers in node i in steady state for i = 1, 2, 3. We can compute Li as
Li = Σ_n n pi (n). Hence, we have L1 = 3.3764, L2 = 0.3429, and L3 = 1.2807.
The total number in the network is L1 + L2 + L3 = 5, as it must be since there
are C = 5 customers.
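The entire computation in this problem can be reproduced by brute-force enumeration of the 21 feasible states; a sketch:

```python
# Product form for Problem 57: enumerate states with x1 + x2 + x3 = 5,
# compute q(x) = phi1(x1)*phi2(x2)*phi3(x3), and normalize.
a = [1.0, 0.1, 0.4]                 # a solution of a = aP, scaled so a1 = 1

def mu(node, n):                    # state-dependent service rates
    if node == 0:
        return n                    # node 1: single server at rate n
    if node == 1:
        return min(n, 2)            # node 2: two servers of rate 1 each
    return 2                        # node 3: one server of rate 2

def phi(node, n):
    out = 1.0
    for k in range(1, n + 1):
        out *= a[node] / mu(node, k)
    return out

C = 5
states = [(x1, x2, C - x1 - x2) for x1 in range(C + 1)
          for x2 in range(C + 1 - x1)]
q = {s: phi(0, s[0]) * phi(1, s[1]) * phi(2, s[2]) for s in states}
G = 1.0 / sum(q.values())           # normalizing constant G(C)
p = {s: G * v for s, v in q.items()}
L = [sum(s[i] * p[s] for s in states) for i in range(3)]
print([round(x, 4) for x in L])     # -> [3.3764, 0.3429, 1.2807]
```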
where
y = [y1 , y2 , . . . , yN ] is such that y1 + · · · + yN = C − 1,
o(h) is a set of terms of the order h such that o(h)/h → 0 as h → 0,
pC (x + ei ) is the usual p(x + ei ) with C used for clarity to denote the total
number of customers, and
Σ_{i=1}^{N} p_ij a_i = a_j follows from aP = a.
The preceding result is called ASTA because the RHS of the equation is
the time-averaged probabilities and the LHS is as seen by arriving customers.
In fact, to be more precise, if one were to obtain the distribution of the system
state by averaging across those seen by arriving customers (note that arriving
customers do not include themselves in the system state), then this is iden-
tical to a time-averaged distribution of the system state when there is one
less customer. Furthermore, if one were to insert a “dummy” customer in
the system to obtain the system state every time this customer enters a node,
then it is possible to get the system state distribution without this dummy
customer. Sometimes, one is not necessarily interested in the entire vector
of states but just that of the entering node. This is the essence of the next
remark, sometimes also known as arrival theorem.
Remark 14
In a closed Jackson network with C customers, for any n, the probability that
there are n customers in node i, as seen at the time of arrival of an arbitrary
customer to that node, is equal to the probability that there are n customers
at this node with one less job in the network (i.e., C − 1).
This remark can be immediately derived from πj (x) = pC−1 (x) by summing
over all xj such that j ≠ i. For example, if we were to modify Problem
57 so that there are C = 6 customers, then the probability that an arriving
customer will see two customers in node 3 is 0.214. This can be obtained
by considering a network of C = 5 customers (as done in Problem 57),
computing p3 , the probability distribution of the number in node 3,
and then using the term corresponding to two customers in that node.
Problem 58
Consider a single-server queue where it takes exp(μ) amount of time to
serve a customer. Unlike most of the systems in the previous chapters, here
we assume that there is a finite population of C customers. Each customer
after completion of service returns to the queue after spending exp(λ) time
outside. First model the system as a birth and death process, and obtain
the steady-state probabilities. Then compute the arrival point probabilities.
Subsequently, model the system as a closed Jackson network and compare
the corresponding results.
Solution
The finite population single-server queue model is depicted in Figure 6.11.
The top of the figure is the queue under consideration and the box in the
bottom denotes the idle time before customers return to the queue. There
are several applications of such a system. One example is the client–server
model with C clients that submit requests to a server: once the server sends
a response, after a think time the client submits another request, and so on. The
request service time at the server is exp(μ) and different requests contend
for the server. The think times of the clients are exp(λ) distributed. Another
example is a bank that has C customers in total. Each customer visits the bank
(with a single teller, although the model and analysis can easily be extended
to multiple tellers), waits for service, gets served for exp(μ) time, and revisits
the bank after spending exp(λ) time outside.
To model this system as a birth and death process, let X(t) denote the
number of customers in the queue (including any at the server) at time t.
When X(t) = n, there are C − n customers outside the queue, and hence the
next arrival occurs when the first of those C − n customers returns to the
queue. Since each customer spends exp(λ) time outside, the arrival rate
when X(t) = n is (C − n)λ. Likewise, if there is a customer at the queue, the
service rate is μ. Therefore, we can show the {X(t), t ≥ 0} process is a birth
and death process with birth parameters λn = (C − n)λ for 0 ≤ n ≤ C − 1 and
death parameters μn = μ for 1 ≤ n ≤ C.
FIGURE 6.11
Finite population queue with C customers (exp(μ) service at the queue; each customer spends exp(λ) time outside before returning).
Solving the balance equations of this birth and death process yields the
steady-state probability of i customers in the queue:

p_i = (C choose i) i! (λ/μ)^i / [ Σ_{j=0}^{C} (C choose j) j! (λ/μ)^j ] .   (6.5)
To obtain the arrival point probabilities, that is, the probability π∗j that an
arriving customer sees j others in the queue, we use Bayes’ rule and compute

π∗_j = (C − j) λ p^C_j / Σ_{i=0}^{C} (C − i) λ p^C_i ,

where p^C_i denotes the p_i of Equation 6.5 with C used for clarity to denote
the total number of customers.
Using Equation 6.5 for both C and C − 1 customers, we have

p^C_j / p^{C−1}_j = (C / (C − j)) (p^C_0 / p^{C−1}_0).

Substituting this in the expression for π∗j ,

π∗_j = (C − j) p^C_j / Σ_{i=0}^{C} (C − i) p^C_i = [ C (p^C_0 / p^{C−1}_0) / Σ_{i=0}^{C} (C − i) p^C_i ] p^{C−1}_j = k p^{C−1}_j ,

where k does not depend on j. Since both π∗_j and p^{C−1}_j sum to one over j,
we must have k = 1. Hence

π∗_j = p^{C−1}_j ,   (6.6)
that is, the probability that there are j in the queue as seen by an arriving
customer is the same as the probability that there are j in the queue for a
similar system with one less customer.
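Equations 6.5 and 6.6 can be verified numerically; the sketch below (with assumed values C = 6, λ = 1, μ = 10) computes the time-stationary and arrival point probabilities and checks the arrival theorem:

```python
import math

def steady_probs(C, lam, mu):
    """Equation 6.5: p_i for the finite population single-server queue."""
    w = [math.comb(C, i) * math.factorial(i) * (lam / mu)**i for i in range(C + 1)]
    total = sum(w)
    return [x / total for x in w]

C, lam, mu = 6, 1.0, 10.0           # hypothetical parameter values
pC = steady_probs(C, lam, mu)
# Arrival point probabilities: state j is entered at rate (C - j)*lam*p_j
w = [(C - j) * lam * pC[j] for j in range(C + 1)]
pi_star = [x / sum(w) for x in w]
# Arrival theorem (Equation 6.6): same as stationary probabilities with C - 1
pC_minus_1 = steady_probs(C - 1, lam, mu)
assert all(abs(pi_star[j] - pC_minus_1[j]) < 1e-12 for j in range(C))
```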
Next, we model the system in the problem as a closed Jackson network
with N = 2 nodes and C customers. We denote the single-server queue as
node 1 and outside of the queue as node 2. The service rate at node 1 when
there are n (such that n > 0) customers in it is μ, hence μ1 (n) = μ. The service
rate at node 2 when there are n customers in it is nλ (which is essentially
the rate at which a departure occurs when there are n in node 2). Thus,
μ2 (n) = nλ. The routing probability matrix P can be obtained from the
problem statement as
P = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.
Solving a = aP, the visit ratios satisfy a_1 = a_2, and we can take a_1 = a_2 = 1. Then

\phi_1(n) = \prod_{k=1}^{n} \frac{a_1}{\mu_1(k)} = \frac{1}{\mu^n}

and

\phi_2(n) = \prod_{k=1}^{n} \frac{a_2}{\mu_2(k)} = \frac{1}{n!\,\lambda^n}.
Also for j = 1, 2 we have φj (0) = 1. Further, since we only have two nodes,
if one node has x1 , then necessarily the other node must have x2 = C − x1 .
Therefore, the joint probability distribution
p(x_1, C - x_1) = G(C)\,\phi_1(x_1)\,\phi_2(C - x_1) = G(C)\, \frac{1}{(C - x_1)!\, \lambda^{C - x_1}\, \mu^{x_1}}
= \frac{G(C)}{C!\,\lambda^C}\, \binom{C}{x_1}\, x_1!\, \left(\frac{\lambda}{\mu}\right)^{x_1}.

The normalizing constant (so that the probabilities sum to one) is

G(C) = \frac{C!\,\lambda^C}{\sum_{j=0}^{C} \binom{C}{j}\, j!\, (\lambda/\mu)^j}.

Hence, we have

p(x_1, C - x_1) = \frac{\binom{C}{x_1}\, x_1!\, (\lambda/\mu)^{x_1}}{\sum_{j=0}^{C} \binom{C}{j}\, j!\, (\lambda/\mu)^j},

which agrees with Equation 6.5.
We do not have an expression for any of the preceding measures and the
objective is to obtain them iteratively. However, before describing the itera-
tive algorithm, we first explain the relationship between those parameters.
On the basis of the arrival theorem described in Remark 14, in a network
with k customers (such that 1 ≤ k ≤ C), the expected number of customers that
an arrival to node i (for any i ∈ {1, . . . , N}) would see is Li (k − 1). Note that
Li (k − 1) is the steady-state expected number of customers in node i when
there are k − 1 customers in the system. Thereby, the net mean sojourn time experienced by that arriving customer in steady state is the average time to serve all those present upon arrival plus the customer's own average service time. Since the average service time at node i is 1/μ_i, we have

W_i(k) = \frac{1}{\mu_i}\,[1 + L_i(k-1)].
Let a be the solution to a = aP as usual with the only exception that the aj
values sum to one here. Thus, the aj values describe the fraction of visits
that are made into node j. The aggregate sojourn time weighted across the network using the fraction of visits is given by \sum_{i=1}^{N} a_i W_i(k) when there are
k customers in the network. One can think of an aggregate sojourn time as
the sojourn time for a customer about to enter a node. Hence, by conditioning on the node of entry as i (which happens with probability a_i) where the mean sojourn time is W_i(k), we can get the result \sum_{i=1}^{N} a_i W_i(k). Thereby we
derive the average flow in the network using Little’s law across the entire
network as
\lambda(k) = \frac{k}{\sum_{i=1}^{N} a_i W_i(k)}
when there are k customers in the network. Essentially, λ(k) is the average
rate at which service completion occurs in the entire network, taken as a
whole. Thereby applying Little’s law across each node i we get
L_i(k) = \lambda(k)\, a_i\, W_i(k),

since the arrival rate into node i is a_i \lambda(k). Thus, starting with L_i(0) = 0 for all i, the following can be iterated for k = 1, 2, \ldots, C:

W_i(k) = \frac{1}{\mu_i}\,[1 + L_i(k-1)], \quad \lambda(k) = \frac{k}{\sum_{i=1}^{N} a_i W_i(k)}, \quad L_i(k) = \lambda(k)\, a_i\, W_i(k).
FIGURE 6.12
Three-tier architecture for e-business websites (a user's browser connects over the Internet to the website's web server, application server, and database server).
their browsers by connecting to the first tier, namely, the web server. The
web server provides web pages and forms for users to enter requests or
information. When the users enter the information and send back to the
web server, it passes the information on to the application server (which is the second tier). The application server processes the information and communicates with the database server (in the third tier). The database server then searches its database and responds to the application server, which in turn passes it on to the web server, which transmits it to the user. For
example, consider running a website for a used car dealership (with URL
www.usedcar.com). When a user types www.usedcar.com on their browser,
the request goes to the first tier for which the web server responds with the
relevant web page. Say the user fills out a set of makes and models as well as
desirable years of manufacture. When the user submits this form expecting
to see the set of used cars available that meets his or her criteria, the web
server passes the set of criteria to the application server (second tier). The
application server processes this set of criteria to check if the form is filled
with all the required fields and then submits to the database server (third
tier). The database server queries the database of all cars available in the deal-
ership that meet the criteria and then responds with the appropriate set. Hav-
ing described some background for websites, we now describe a problem.
Problem 59
The bottleneck in many three-tier architecture websites is the database server
that does not scale up to handle a large number of users simultaneously. Let
us say that the database server can handle at most C connections simultane-
ously. During peak periods, one typically sees the database server handling
its full capacity of C connections at every instant of time. In practice, every
time one of the C connections is complete, instantaneously a new connection
is added to the database server thus maintaining C connections through-
out the peak period. Hence, we can model the database server system as
a closed queueing network with C customers. The database server system
consists of a processor and four disks as shown in Figure 6.13. All five nodes
are single-server queues with exponential service times. Each customer after
being processed at the processor goes to any of the four disks with equal
probability. The average service time (in milliseconds) at the processor is
6 and at the four disks are 17, 28, 13, and 23, respectively. For C = 16 use
the preceding algorithm to determine the expected number at each of the
five nodes as well as the throughput of the database–server system. What
happens to those metrics when C is 25, 50, 75, and 100?
Solution
Note that we have a Jackson network with N = 5 nodes and C = 16 customers
(we will later consider other C values). The P matrix corresponding to the
processor node and the four disks is
FIGURE 6.13
Closed queueing network inside database server (node 1 is the processor; nodes 2 through 5 are the disks).
P = \begin{bmatrix}
0 & 0.25 & 0.25 & 0.25 & 0.25 \\
1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0
\end{bmatrix}.

Solving a = aP with the a_i summing to one gives visit ratios a = (1/2, 1/8, 1/8, 1/8, 1/8); with service rates 1/6, 1/17, 1/28, 1/13, and 1/23 per millisecond, the iterative algorithm yields Table 6.3 for C = 16, and Table 6.4 for larger values of C.
TABLE 6.3
The Single-Server Closed Jackson Network Iterations
k L1 (k) L2 (k) L3 (k) L4 (k) L5 (k) λ(k)
1 0.2286 0.1619 0.2667 0.1238 0.2190 0.0762
2 0.4631 0.3102 0.5570 0.2294 0.4403 0.1256
3 0.7018 0.4452 0.8714 0.3195 0.6621 0.1599
4 0.9433 0.5674 1.2102 0.3962 0.8829 0.1848
5 1.1860 0.6776 1.5737 0.4615 1.1013 0.2034
6 1.4284 0.7765 1.9620 0.5173 1.3158 0.2178
7 1.6692 0.8650 2.3754 0.5649 1.5255 0.2291
8 1.9073 0.9439 2.8138 0.6057 1.7294 0.2382
9 2.1414 1.0142 3.2772 0.6406 1.9266 0.2455
10 2.3706 1.0766 3.7657 0.6706 2.1165 0.2515
11 2.5940 1.1321 4.2790 0.6964 2.2985 0.2565
12 2.8109 1.1812 4.8169 0.7187 2.4723 0.2607
13 3.0207 1.2246 5.3792 0.7379 2.6376 0.2642
14 3.2228 1.2631 5.9654 0.7546 2.7942 0.2672
15 3.4167 1.2970 6.5752 0.7690 2.9421 0.2697
16 3.6023 1.3270 7.2080 0.7815 3.0812 0.2719
TABLE 6.4
When C Is Increased to 25, 50, 75, and 100
C L1 L2 L3 L4 L5 λ
25 4.8794 1.4786 13.8299 0.8418 3.9704 0.2818
50 5.9284 1.5435 37.0910 0.8660 4.5711 0.2856
75 5.9972 1.5454 61.9915 0.8667 4.5992 0.2857
100 5.9999 1.5455 86.9880 0.8667 4.6000 0.2857
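The iteration that produces Tables 6.3 and 6.4 takes only a few lines. In the sketch below, the visit ratios a = (1/2, 1/8, 1/8, 1/8, 1/8) come from solving a = aP with the a_i summing to one, and the service rates are per millisecond:

```python
def mva(a, mu, C):
    """Mean value analysis for a closed Jackson network of single-server
    queues: a = visit ratios (summing to 1), mu = service rates, C = customers.
    Returns the mean queue lengths L_i(C) and the throughput lambda(C)."""
    N = len(a)
    L = [0.0] * N
    for k in range(1, C + 1):
        W = [(1.0 + L[i]) / mu[i] for i in range(N)]       # W_i(k)
        lam = k / sum(a[i] * W[i] for i in range(N))       # Little's law, whole network
        L = [lam * a[i] * W[i] for i in range(N)]          # Little's law, node i
    return L, lam

# Problem 59: processor (6 ms) and four disks (17, 28, 13, 23 ms), C = 16.
a = [0.5, 0.125, 0.125, 0.125, 0.125]
mu = [1 / 6, 1 / 17, 1 / 28, 1 / 13, 1 / 23]
L, lam = mva(a, mu, 16)
# Last row of Table 6.3, to the printed precision:
assert abs(lam - 0.2719) < 5e-4
assert abs(L[0] - 3.6023) < 5e-4
assert abs(L[2] - 7.2080) < 5e-4
```

As C grows, λ(C) saturates at μ_3/a_3 = (1/28)/(1/8) ≈ 0.2857 per millisecond, the rate of the bottleneck disk, which is exactly the limit visible in Table 6.4.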
b_j = u_j + \sum_{i=1}^{N} b_i\, p_{ij}.
In matrix form, b = u[I − P]^{-1}, where u is a row vector of the u_j values, b is a row vector of the b_j values, I is the identity matrix, and P is the routing matrix.
Next, define the following for all i ∈ {1, . . . , N}: φ_i(0) = 1 and

\phi_i(n) = \prod_{j=1}^{n} \frac{b_i}{\mu_i(j)} \quad \text{for } n \ge 1.
Define \hat{x} = \sum_{i=1}^{N} x_i. Using this notation it is possible to show that the steady-state probability p(x) is given by

p(x) = c \prod_{i=1}^{N} \phi_i(x_i) \prod_{j=0}^{\hat{x}} \lambda(j),

where c is the normalizing constant.
This can be used to get the distribution of the number of customers in the
system as well as the mean (and higher moments). Then using Little’s law,
the mean sojourn times (across a node and the network itself) can also be
obtained.
An immediate extension to this model is to allow service rates to depend
on the number of customers in each node of the network. Therefore, the
service rate at node i when there are x1 , . . . , xN customers, respectively, at
nodes 1, . . . , N instead of being μi (xi ), is now μi (x). A network with that
extension is known as a Whittle network for which Serfozo [96] describes con-
ditions for a product-form solution. In the next few sections, we will describe
other networks where the steady-state probabilities can be represented as
product form.
Let y_i denote the total number of customers at node i summed across all routes, that is,

y_i = \sum_{r=1}^{R} x_{ri}
(when y_i is zero, the service completion rate is zero as well). All three service disciplines mentioned earlier satisfy that condition, and other disciplines that do can also be included in the list of allowed service disciplines. With that understanding, we will next just describe p(x) as a product form without going into details of the derivation.
Define the following for all i ∈ {1, . . . , N}: φi (0, 0, . . . , 0) = βi and
\phi_i(n_1, n_2, \ldots, n_R) = \beta_i\, n! \left(\prod_{j=1}^{n} \frac{1}{\mu_i \min(j, s_i)}\right) \left(\prod_{r=1}^{R} \frac{\lambda_r^{n_r}}{n_r!}\right) \quad \text{for } n \ge 1
where

n = \sum_{r=1}^{R} n_r \quad \text{and} \quad \beta_j^{-1} = 1 + \sum_{n=1}^{\infty} \prod_{k=1}^{n} \frac{\sum_{r=1}^{R} \lambda_r M_{rj}}{\mu_j \min(k, s_j)}.
Using this notation it is possible to show that the steady-state probability p(x)
is given by
p(x) = \begin{cases} \prod_{i=1}^{N} \phi_i(x_{1i}, x_{2i}, \ldots, x_{Ri}) & \text{if } x \in E, \\ 0 & \text{otherwise.} \end{cases}
One can consider several extensions to this model. In fact, Kelly net-
works, on which the preceding analysis is based, are a lot more general. The
reader is referred to the end of the next section on multiclass networks for a
description of some of the possible extensions as they are common to both
sections. After all what we saw here is just a special type of multiclass net-
work. Before forging ahead, we present a small example to illustrate these
results.
Problem 60
Consider the four-node network in Figure 6.14. There are three routes
described using three types of arrows. Route-1 uses path 1-3-4-2, route-2 uses
4-2-1, and route-3 uses 2-3. Route-1 customers arrive according to a Poisson
process with mean rate 4 per hour. Likewise, route-2 and route-3 customers
arrive according to Poisson processes with mean rates of 2 and 3 per hour,
respectively. Nodes 1, 2, and 3 have a single server that serves according
to an exponential distribution with rates 8, 10, and 9 per hour, respectively.
Node 4 has two servers each serving at rate 4 per hour. The service disci-
pline is FCFS in nodes 1, 2, and 4 but processor sharing in node 3. What is
the probability that there is one route-1 customer in each of the four nodes,
two route-2 customers in node 4, two in node 2 and one in node 1, and three
route-3 customers in node 2 and four in node 3?
Solution
The problem illustrates an example of a queueing network with fixed
routes. Using the notation of this section, N = 4 nodes and R = 3 routes.
Also, (λ1 , λ2 , λ3 ) = (4, 2, 3), (μ1 , μ2 , μ3 , μ4 ) = (8, 10, 9, 4), and (s1 , s2 , s3 , s4 ) =
(1, 1, 1, 2). In the problem, the vector of xrj values for route r and
node j describes x given by x = (x11 , x21 , x31 , x12 , x22 , x32 , x13 , x23 , x33 , x14 ,
x24 , x34 ). Using the numerical values in the problem, we have x = (1, 1, 0, 1, 2,
3, 1, 0, 4, 1, 2, 0) for which we need to compute p(x). Using the results in this
section we have
\phi_1(1, 1, 0) = \beta_1\, (2!) \left(\prod_{j=1}^{2} \frac{1}{\mu_1}\right) (\lambda_1 \lambda_2) = \frac{\beta_1}{4}
FIGURE 6.14
Queueing network with fixed routes (nodes 1 through 4; routes 1, 2, and 3 shown with different arrow types).
\phi_2(1, 2, 3) = \beta_2\, (6!) \left(\prod_{j=1}^{6} \frac{1}{\mu_2}\right) \left(\frac{\lambda_1 \lambda_2^2 \lambda_3^3}{12}\right) = \frac{81\,\beta_2}{3125}

\phi_3(1, 0, 4) = \beta_3\, (5!) \left(\prod_{j=1}^{5} \frac{1}{\mu_3}\right) \left(\frac{\lambda_1 \lambda_3^4}{24}\right) = \frac{20\,\beta_3}{729}

\phi_4(1, 2, 0) = \beta_4\, (3!) \left(\prod_{j=1}^{3} \frac{1}{\min(j, 2)\,\mu_4}\right) \left(\frac{\lambda_1 \lambda_2^2}{2}\right) = \frac{3\,\beta_4}{16}.
Thus, the only thing left is to obtain the βj values for j = 1, 2, 3, 4. Although
it is possible to directly use the formula, it is easier if we realize that βj is the
probability that node j is empty in steady state. Since nodes 1, 2, and 3 are
effectively M/M/1 queues with arrival rates 6, 9, and 7 as well as service
rates 8, 10, and 9, respectively, we have β1 = 1/4, β2 = 1/10, and β3 = 2/9.
Likewise, node 4 is effectively an M/M/2 queue with arrival rate 6 and
service rate 4 for each server. Hence, we have β_4 = 1/7. Thus, the probability that there is one route-1 customer in each of the four nodes, two route-2 customers in node 4, two in node 2 and one in node 1, and three route-3 customers in node 2 and four in node 3 is β_1 β_2 β_3 β_4 (1/30000) = 1/37800000.
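The product-form arithmetic can be verified directly. The sketch below recomputes each φ from the general formula and, as an independent sanity check on the two-server node, confirms that aggregating φ_4 over all splits of three customers between routes 1 and 2 reproduces the M/M/2 queue-length probability:

```python
from math import factorial

def phi(beta, mu, s, lam, counts):
    """phi_i(n_1,...,n_R) = beta_i * n! * prod_{j=1}^{n} 1/(mu_i min(j, s_i))
    * prod_r lambda_r^{n_r}/n_r!, for one node of the fixed-route network."""
    n = sum(counts)
    val = beta * factorial(n)
    for j in range(1, n + 1):
        val /= mu * min(j, s)
    for lr, nr in zip(lam, counts):
        val *= lr ** nr / factorial(nr)
    return val

lam = (4.0, 2.0, 3.0)                    # route arrival rates
mus = (8.0, 10.0, 9.0, 4.0)              # per-server service rates, nodes 1-4
servers = (1, 1, 1, 2)
betas = (1 / 4, 1 / 10, 2 / 9, 1 / 7)    # P(node empty): M/M/1 and M/M/2 results
counts = [(1, 1, 0), (1, 2, 3), (1, 0, 4), (1, 2, 0)]
phis = [phi(betas[i], mus[i], servers[i], lam, counts[i]) for i in range(4)]
p = phis[0] * phis[1] * phis[2] * phis[3]

# Sanity check: summing phi_4 over all splits of 3 customers between routes 1
# and 2 must give the M/M/2 (arrival rate 6, rate 4 per server) P(3 in node).
mm2_p3 = betas[3] * 6 ** 3 / (4 * 8 * 8)
agg = sum(phi(betas[3], 4.0, 2, lam, (n1, 3 - n1, 0)) for n1 in range(4))
assert abs(agg - mm2_p3) < 1e-12
assert abs(p * 37800000 - 1) < 1e-9
```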
6. The service discipline is one of the following: FCFS (in which case we require μ_{ki} to be independent of k, i.e., all K classes have the same service rate, which we call μ_i), processor sharing, or LCFS with preemptive resume.
7. When a class k customer completes service at node i, the customer departs the network with probability r_{ki} or joins the queue at node j as a class ℓ customer with probability p_{ki,jℓ}. The routing of a customer does not depend on the state of the network.
8. There is infinite waiting room at each node and the stability condition is satisfied at every node i.
As earlier, here we obtain the visit ratios a_{ℓj} for class ℓ customers into node j (i.e., the effective arrival rate of class ℓ customers into node j). We solve the following set of simultaneous equations:

a_{ℓj} = \lambda_{ℓj} + \sum_{i=1}^{N} \sum_{k=1}^{K} a_{ki}\, p_{ki,jℓ}

for all ℓ (such that 1 ≤ ℓ ≤ K) and j (such that 1 ≤ j ≤ N). For i = 1, . . . , N and
k = 1, . . . , K, let Xki (t) be the number of customers belonging to class k in node
i at time t. Let X(t) be a vector that captures a snapshot of the state of the
network at time t and is given by X(t) = [X11 (t), X21 (t), . . . , Xki (t), . . . , XKN (t)].
Let p(x) be the steady-state probability of being in state x = (x_{11}, x_{21}, \ldots, x_{KN}), that is, p(x) = \lim_{t \to \infty} P\{X(t) = x\}.
Note that the stochastic process {X(t), t ≥ 0} is not a CTMC if the discipline is FCFS (although for the other disciplines mentioned earlier, it would be a CTMC), since the state information does not include the class of the customer in service, and hence the transition rates would differ depending on the history. For the FCFS case we would have to keep track of the class of the customer in each position of every queue in the network, which would form a CTMC. But any permutation within a queue would result in the same probability. Thus, adding across all permutations we can obtain p(x). With that understanding in place, next we describe p(x), which would be a product form, without going into details of the derivation.
Let \Lambda_i be the total arrival rate into node i aggregated over all classes, that is,

\Lambda_i = \sum_{k=1}^{K} a_{ki}.
Likewise, define the aggregate service rate \bar{\mu}_i at node i through

\frac{1}{\bar{\mu}_i} = \sum_{k=1}^{K} \frac{a_{ki}}{\mu_{ki}\, \Lambda_i}.

Note that if the discipline is FCFS, \bar{\mu}_i = \mu_{ki} for all k since \mu_{ki} does not change with k. Next, define the following for all i ∈ {1, . . . , N}:
\phi_i(n_1, n_2, \ldots, n_K) = (1 - \rho_i) \left(\sum_{k} n_k\right)! \prod_{k=1}^{K} \frac{a_{ki}^{n_k}}{n_k!\, \mu_{ki}^{n_k}},

where \rho_i = \sum_{k=1}^{K} a_{ki}/\mu_{ki}.
Using this notation it is possible to show that the steady-state probability p(x)
is given by
p(x) = \prod_{i=1}^{N} \phi_i(x_{1i}, x_{2i}, \ldots, x_{Ki}).
Problem 61
Consider an open-queueing network with K = 2 classes and N = 3 nodes.
Node 1 uses processor sharing, while nodes 2 and 3 use LCFS preemptive
resume policy. The service rates μki for class k customers in node i are μ11 = 8,
μ21 = 24, μ12 = 12, μ22 = 32, μ13 = 16, μ23 = 36. Arrivals for both classes occur
externally into node 1 at rate 1 per unit time. The routing probabilities are
p12,11 = 0.6, p22,21 = 0.7, p11,12 = 0.4, p12,13 = 0.4, p21,22 = 0.3, p22,23 = 0.3,
p11,13 = 0.3, p13,11 = 0.5, p21,23 = 0.6, p23,21 = 0.4, p13,12 = 0.5, p23,22 = 0.6,
r11 = 0.3, and r21 = 0.1. Note that class switching is not allowed in the net-
work. Compute the probability that there are one class-1 and two class-2
customers in node 1, one class-1 and one class-2 customers in node 2, and
zero class-1 and one class-2 customer in node 3.
Solution
Note that since the external arrival rate is 1 into node 1 for both classes,
we have λ11 = 1, λ21 = 1, λ12 = 0, λ22 = 0, λ13 = 0, and λ23 = 0. Using
those and solving for the simultaneous equations for the visit ratios,
we get a11 = 3.3333, a21 = 10, a12 = 2.2917, a22 = 8.0488, a13 = 1.9167, and
a_{23} = 8.4146. In fact, since no class switching is allowed, we can solve the simultaneous equations one class at a time. Therefore, we can compute the net arrival rate into node i aggregated over all the classes as \Lambda_1 = 13.3333, \Lambda_2 = 10.3404, and \Lambda_3 = 10.3313. Likewise, we can obtain \bar{\mu}_1 = 16, \bar{\mu}_2 = 23.3684, and \bar{\mu}_3 = 29.2231. Using the formulae for \phi_i(n_1, n_2),
we get \phi_1(1, 2) = 0.0362, \phi_2(1, 1) = 0.0536, and \phi_3(0, 1) = 0.1511. Note that \phi_i(n_1, n_2) actually gives us the probability that in node i there are n_1 class-1 customers and n_2 class-2 customers. Of course, the answer to the question given in the problem is the joint probability, which is the product form p(1, 2, 1, 1, 0, 1) = \phi_1(1, 2)\, \phi_2(1, 1)\, \phi_3(0, 1) = 0.00029271.
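These numbers can be reproduced with a short sketch; solving the visit-ratio equations by fixed-point iteration is an implementation choice here, not something the text prescribes:

```python
from math import factorial

def visit_ratios(lam_ext, P, iters=2000):
    """Solve a_j = lam_j + sum_i a_i P[i][j] by fixed-point iteration;
    P is sub-stochastic (customers eventually exit), so this converges."""
    a = list(lam_ext)
    n = len(a)
    for _ in range(iters):
        a = [lam_ext[j] + sum(a[i] * P[i][j] for i in range(n)) for j in range(n)]
    return a

P1 = [[0.0, 0.4, 0.3], [0.6, 0.0, 0.4], [0.5, 0.5, 0.0]]   # class-1 routing
P2 = [[0.0, 0.3, 0.6], [0.7, 0.0, 0.3], [0.4, 0.6, 0.0]]   # class-2 routing
mu = [[8.0, 12.0, 16.0], [24.0, 32.0, 36.0]]                # mu[k][i]
a1 = visit_ratios([1.0, 0.0, 0.0], P1)
a2 = visit_ratios([1.0, 0.0, 0.0], P2)

def phi(i, n1, n2):
    """phi_i(n1, n2) = (1 - rho_i)(n1 + n2)! rho_1i^n1/n1! * rho_2i^n2/n2!"""
    r1, r2 = a1[i] / mu[0][i], a2[i] / mu[1][i]
    return ((1 - r1 - r2) * factorial(n1 + n2)
            * r1 ** n1 / factorial(n1) * r2 ** n2 / factorial(n2))

p = phi(0, 1, 2) * phi(1, 1, 1) * phi(2, 0, 1)
assert abs(a1[0] - 3.3333) < 1e-3 and abs(a2[2] - 8.4146) < 1e-3
assert abs(phi(0, 1, 2) - 0.0362) < 1e-4
assert abs(p - 0.00029271) < 1e-6
```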
One can consider several extensions to the preceding model that would still give us product-form solutions. As a matter of fact, BCMP networks, when they were first introduced, considered a few more generalizations. For example, if a node uses the FCFS discipline, any number of servers is allowed; infinite-server queues with general service times are an option (not just FCFS, processor sharing, and LCFS with preemptive resume); general service times are allowed for processor-sharing and LCFS-with-preemptive-resume nodes; and closed-queueing networks are also analyzable. Subse-
quently, several research studies further generalized the BCMP networks
to include state-dependent (local and network-wide) arrivals and service,
state-dependent routing, networks with negative customers, networks with
blocking, open networks with a limited total capacity, etc. Refer to some
recent books on queueing networks such as by Serfozo [96], Chao et al.
[18], and Chen and Yao [19] for results under those general cases. In fact,
both multiclass queueing networks with fixed routing (Kelly networks) and
BCMP networks are combined into a single framework (by using routing
probabilities of 0 or 1 for Kelly networks). It is also worthwhile to con-
sider algorithms for product-form networks (especially in the extensions, we
would require the use of normalizing constants that are harder to obtain).
That said, we conclude the product-form network analysis by describing loss
networks next.
network example. Say each accepted telephone call takes 1 unit of arc capac-
ity (this is about 60 kbps). Then on arc j you can have at most Cj calls
simultaneously. Let R be the set of routes in the telephone network such
that a route r is described by a set of arcs that are traversed. In this manner,
we only focus on the arcs and not on the nodes. For all r ∈ R, telephone calls
requesting route r arrive according to a Poisson process with mean rate λr .
A call requesting route r is blocked and lost if there is no capacity available
on any of the links in the route. If the call is accepted, it uses up 1 unit of
capacity in each of the arcs in the route. The holding time for accepted calls
of class r is generally distributed with mean 1/μr .
Define Xr (t) as the number of calls in progress on route r at time t,
for all r ∈ R. Let R = |R|, the number of routes in the network. Let the
R-dimensional vector X(t) be X(t) = (X1 (t), . . . , Xr (t), . . . , XR (t)). Then the
steady-state distribution of the stochastic process {X(t), t ≥ 0} can be com-
puted as a product form (note that the process is reversible). Let p(x) be the
steady-state probability of being in state x = (x_1, x_2, \ldots, x_R), that is, p(x) = \lim_{t \to \infty} P\{X(t) = x\}.
Let us define set E as the set of all possible x satisfying the criterion that
the total number of calls in each link is less than or equal to the capacity.
Therefore, p(x) > 0 if x ∈ E and p(x) = 0, otherwise. We can write down p(x)
as a product form given by
p(x) = G(C_1, \ldots, C_J) \prod_{r=1}^{R} \frac{1}{x_r!} \left(\frac{\lambda_r}{\mu_r}\right)^{x_r}, \quad \forall x \in E,

where G(C_1, \ldots, C_J) is the normalizing constant.
Problem 62
Consider the six-node network in Figure 6.15. There are four routes in the
network. Route-1 is through nodes A–C–D–E, route-2 is through nodes
A–C–D–F, route-3 is through nodes B–C–D–E, and route-4 is through nodes
B–C–D–F. The capacities of the five arcs are described below the arcs. Each
call on a route uses one unit of the capacity of all the arcs on the route. Calls
on routes 1, 2, 3, and 4 arrive according to a Poisson process with respective
rates 10, 16, 15, 20 per hour. The average holding time for all calls is 3 min.
FIGURE 6.15
Loss network (nodes A through F; arc capacities 2, 3, 4, 2, and 3 for arcs 1 through 5, respectively).
What is the probability that in steady state there is one call on each of the
four routes?
Solution
For r = 1, 2, 3, 4, let Xr (t) be the number of calls on route r. Let
x = (x1 , x2 , x3 , x4 ), and we would like to compute p(x). First we describe E,
the set of all feasible x values. Then
E = {(x1 , x2 , x3 , x4 ) : x1 + x2 ≤ 2, x1 + x2 + x3 + x4 ≤ 4, x3 + x4 ≤ 3,
x1 + x3 ≤ 2, x2 + x4 ≤ 3}.
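Because the feasible set E is small, the product form can be normalized by brute-force enumeration (the text would instead carry this out symbolically). The sketch below uses the offered loads λ_r/μ_r implied by the 3 min (= 1/20 h) holding time:

```python
from itertools import product
from math import factorial

# Offered loads lambda_r/mu_r: arrival rates 10, 16, 15, 20 per hour, each
# with mean holding time 1/20 hour.
loads = [10 / 20, 16 / 20, 15 / 20, 20 / 20]

def feasible(x):
    x1, x2, x3, x4 = x
    return (x1 + x2 <= 2 and x1 + x2 + x3 + x4 <= 4
            and x3 + x4 <= 3 and x1 + x3 <= 2 and x2 + x4 <= 3)

def weight(x):
    """Unnormalized product-form weight prod_r (lam_r/mu_r)^{x_r}/x_r!."""
    w = 1.0
    for xr, rho in zip(x, loads):
        w *= rho ** xr / factorial(xr)
    return w

E = [x for x in product(range(5), repeat=4) if feasible(x)]
total = sum(weight(x) for x in E)          # reciprocal of G(C_1, ..., C_J)
p = {x: weight(x) / total for x in E}
assert abs(sum(p.values()) - 1.0) < 1e-12
assert (2, 1, 0, 0) not in p               # infeasible: x1 + x2 > 2
print("p(1,1,1,1) =", p[(1, 1, 1, 1)])
```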
Reference Notes
We began this chapter with acyclic networks as well as open and closed
Jackson networks. Most of the results here were adapted from Kulkarni [67].
In fact, many standard texts on queues would also typically contain a few
chapters on these topics. The main emphasis of this chapter is product-form
solutions. There is a strong connection between product-form queueing networks, the notion of reversibility, and insensitivity to the service time distribution. Note that all our product-form results use only the mean interarrival time and mean service time at every node but not the entire distribution. Of
course, the link in itself is quasi-reversibility that results in partial balance
equations. We have not gone into any of those details in this chapter but
Exercises
6.1 Consider a queueing network of single-server queues shown in Figure 6.16. Note that the external arrival process is Poisson and service times are exponential. Derive the stability condition and compute (1) the expected number of customers in the network in steady state and (2) the fraction of time the network is completely empty in steady state.
6.2 Consider a seven-node single-server Jackson network where nodes
2 and 4 get input from the outside (at rate 5 per minute each on an
average). Nodes 1 and 2 have service rates of 85, nodes 3 and 4 have
service rates of 120, node 5 has a rate of 70, and nodes 6 and 7 have
rates of 20 (all in units per minute). The routing matrix is given by
P = [p_{ij}] = \begin{bmatrix}
1/3 & 1/4 & 0 & 1/4 & 0 & 1/6 & 0 \\
1/3 & 0 & 1/4 & 0 & 1/3 & 0 & 0 \\
0 & 0 & 1/3 & 1/3 & 1/3 & 0 & 0 \\
1/3 & 0 & 1/3 & 0 & 1/3 & 0 & 0 \\
0 & 0 & 0 & 4/5 & 0 & 0 & 1/6 \\
1/6 & 0 & 1/6 & 1/6 & 1/6 & 1/6 & 0 \\
0 & 1/6 & 1/6 & 1/6 & 1/6 & 0 & 1/6
\end{bmatrix}.
FIGURE 6.16
Single-server queueing network (external arrivals at rate λ; service rates μ_1, μ_2, . . . , μ_N; a customer exits with probability 1 − p or is fed back with probability p).
Find the average number in the network and the mean delay at
each node.
6.3 Consider a closed-queueing network of single-server stations. Let
ρi = ai /μi . Show that the limiting joint distribution is given by
p(x) = \frac{1}{\gamma(C)} \prod_{i=1}^{N} \rho_i^{x_i} \quad \text{when } \sum_{i=1}^{N} x_i = C,

where \gamma(C) is the normalizing constant, whose generating function is

\tilde{G}(z) = \sum_{C=0}^{\infty} \gamma(C)\, z^C = \prod_{i=1}^{N} \frac{1}{1 - \rho_i z}.
Next, define

B_j(z) = \prod_{i=1}^{j} \frac{1}{1 - \rho_i z} = \sum_{n=0}^{\infty} b_j(n)\, z^n, \quad j = 1, 2, \ldots, N,
and show that, since B_j(z) = B_{j-1}(z)(1 - \rho_j z)^{-1}, the coefficients satisfy

b_j(n) = b_{j-1}(n) + \rho_j\, b_j(n-1).

Then one can use this recursion to compute \gamma(C) = b_N(C). Thus, \gamma(C) can be computed in O(NC) time.
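Since B_j(z) = B_{j−1}(z)/(1 − ρ_j z), the coefficients satisfy b_j(n) = b_{j−1}(n) + ρ_j b_j(n−1); the following sketch implements this O(NC) computation (the ρ values used in the sanity checks are made up):

```python
def gamma_coeffs(rhos, C):
    """Coefficients gamma(0..C) via b_j(n) = b_{j-1}(n) + rho_j * b_j(n-1),
    starting from B_0(z) = 1; overall O(NC) time."""
    b = [1.0] + [0.0] * C                 # coefficients of B_0(z) = 1
    for rho in rhos:
        for n in range(1, C + 1):         # one in-place pass turns b_{j-1} into b_j
            b[n] += rho * b[n - 1]
    return b

# Sanity check, N = 1: gamma(C) = rho^C.
assert abs(gamma_coeffs([0.5], 6)[6] - 0.5 ** 6) < 1e-12
# Sanity check, N = 2, C = 5: compare with direct enumeration over (x1, C - x1).
direct = sum(0.5 ** x * 0.8 ** (5 - x) for x in range(6))
assert abs(gamma_coeffs([0.5, 0.8], 5)[5] - direct) < 1e-12
```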
FIGURE 6.17
Schematic of a six-station network.
FIGURE 6.18
Schematic of a repair shop.
(c) Consider a closed Jackson network with two nodes and C cus-
tomers. Let X(t) be the number of customers in one of the nodes
(the other node would have C − X(t) customers). Is the following
statement TRUE or FALSE? The CTMC {X(t), t ≥ 0} is reversible.
6.12 Answer the following multiple-choice questions:
(a) In a stable open Jackson network, which of the following is the
average time spent by an entity in the network?
"N "N
(i) i = 1 Li / i = 1 λi
"N
(ii) i = 1 Li /λi
"N "N
(iii) i = 1 Li / i = 1 ai
"N
(iv) i = 1 Li /ai
(b) Consider an open Jackson network with two nodes. Arrivals
occur externally according to PP(λ) into each node. There is a
single server that takes exp(μ) service time and there is infinite
waiting room at each node. At the end of service at a node, each
customer chooses the other node or exits the system, both with
probability 1/2. What is the average number of customers in the
entire network in steady state, assuming stability?
(i) 2λ/(μ − λ)
(ii) 2λ/(μ − 3λ)
(iii) 4λ/(μ − 2λ)
(iv) 4λ/(μ − 4λ)
6.13 Consider a series system of two single-server stations. Customers
arrive at the first station according to a PP(λ) and require exp(μ1 )
service time. Once service is completed, customers go to the second
station where the service time is exp(μ2 ), and exit the system after
service. Assume the following: both queues are of infinite capacity;
λ < μ1 < μ2 ; no external arrivals into the second station; and both
queues use FCFS and serve one customer at a time. Compute the
LST of the CDF of the time spent in the entire series system.
6.14 Consider an open Jackson network with N nodes and a single-server
queue in each node. We need to determine the optimal service
rate μi at each node i for i ∈ {1, . . . , N} subject to the constraint
μ1 + · · · + μN ≤ C. The total available capacity is C, which is a given
constant. Essentially we need to determine how to allocate the avail-
able capacity among the nodes. Use the objective of minimizing the
total expected number in the system in steady state. Formulate a
nonlinear program and solve it to obtain optimal μi values in terms
of the net arrival rates aj (for all j), which are also constants.
6.15 Consider a feed-forward tandem Jackson network of N nodes and
the arrivals to the first node is PP(λ). The service rate at node i is
exp(μi ). We have a pool of S workers that need to be assigned one
time to the N nodes. Formulate an optimization problem to determine s_i, the number of servers in node i (for i ∈ {1, . . . , N}) so that s_1 + · · · + s_N ≤ S, if the objective is to minimize the total expected number in the system in steady state. Describe an algorithm to derive the
optimal allocation.
7
Approximations for General Queueing Networks
Jackson networks and their extensions that we saw in the previous chapter lent themselves very nicely to performance analysis. In particular, they resulted in a product-form solution that enabled us to decompose the queueing network so that individual nodes can be analyzed in isolation. However,
a natural question to ask is: What if the conditions for Jackson networks are
not satisfied? In this chapter, we are especially interested in analyzing gen-
eral queueing networks where the interarrival times or service times or both
can be according to general distributions. In particular, how do you ana-
lyze open queueing networks if each node cannot be modeled as an M/M/s,
M/G/c/c, or M/G/∞ queue? For example, the departure process from an
M/G/1 FCFS queue is not even a renewal process, let alone a Poisson pro-
cess. So if this set of customers departing from an M/G/1 queue join another
queue, we would not know how to analyze that queue because we do not
have results for queues where the arrivals are not according to a renewal
process.
So how do we analyze general queueing networks? In the most general
case, our only resort is to develop approximations. It is worthwhile to men-
tion that one way is to use discrete-event simulations. There are several
computer-simulation software packages that can be used to obtain queue-
ing network performance measures numerically. At the time of writing this
book, the commonly used packages are Arena and ProModel especially
for manufacturing systems applications. However, although simulation
methodology is arguably the most popular technique in the industry, it is not
ideal for developing insights and intuition, performing quick what-if anal-
ysis, obtaining symbolic expressions that can be used for optimization and
control, etc. For those reasons we will mainly consider analytical models that
can suitably approximate general queueing networks. However, one of the
objectives of the analytical approximations is that they must possess under-
lying theory, must be exact under special cases or asymptotic conditions and
reasonably accurate under other conditions, and must be relatively easy to
implement.
In that spirit we will consider, for example, approximations based on
reflected Brownian motion (we will define and characterize this subse-
quently). One of the major benefits of reflected Brownian motion is that it
can be modeled using just the mean and variance of the interarrival time as
to make a scaling argument appropriately as done in Chen and Yao [19] (see
Chapter 8). However, before we proceed with the Brownian approximation,
we first need to write down the relevant performance measures for the queue
in terms of A(t) and S(t). We do that next.
We first describe some notation. Let X(t) denote the number of customers
in the G/G/1 queue at time t with X(0) = x0 , a given finite constant number
of customers initially. To write down X(t) in terms of A(t) and S(t), it is
important to know how long the server was busy and idle during the time
period 0 to t. For that, let B(t) and I(t), respectively, denote the total time the
server has been busy and idle from time 0 to t. We emphasize that the server is work conserving, which means the server would be idle if and only if there are no customers in the system. Note that
B(t) + I(t) = t.
Further,

X(t) = x_0 + A(t) - S(B(t)), \quad (7.1)

since the total number in the system at time t equals all the customers that were present at time 0, plus all those that arrived in time 0 to t, minus those that departed in time 0 to t. Note that while writing the number of departures we need to be careful to use only the time the server was busy. Hence we get the preceding result.
Equation 7.1 is not conducive to obtaining an expression for X(t). Hence we rewrite it as follows:

X(t) = U(t) + V(t),

where

U(t) = x_0 + A(t) - \mu t + \mu B(t) - S(B(t)) \quad \text{and} \quad V(t) = \mu I(t).
Verify that the preceding result yields Equation 7.1 realizing that
B(t) + I(t) = t. Note that we are ultimately interested in the steady-state dis-
tribution of X(t); however, to do that we start by computing the expected
value and variance of U(t), for large t. Thus we have for large t
E[U(t)] = x_0 + (\lambda - \mu)t,
Var[U(t)] \approx \lambda C_a^2\, t + \lambda C_s^2\, t,
since we have from renewal theory E[A(t)] = \lambda t and E[S(B(t)) | B(t)] = \mu B(t) for any B(t). However, the variance result is a lot more subtle. Note that for a large t, the total busy period can be approximated as B(t) \approx (\lambda/\mu)t since \lambda/\mu is the fraction of time the server would be busy. Hence we write down B(t) = (\lambda/\mu)t in the expression for U(t) and then take the variance. However, since we know that Var[A(t)] = \lambda C_a^2 t and Var[S((\lambda/\mu)t)] = \lambda C_s^2 t, we get the preceding approximate result for Var[U(t)]. Note that E[U(t)] is exact for any t; however, Var[U(t)] is reasonable only for large t, that is, in the asymptotic case.
It is straightforward to see that for large t, if A(t) and S(t) are normally distributed random variables, then U(t) is normally distributed with mean x_0 + (\lambda - \mu)t and variance \lambda(C_a^2 + C_s^2)t. Therefore, if {A(t), t ≥ T} and {S(t), t ≥ T} can be approximated as Brownian motions for some large T, then from the description of U(t), {U(t), t ≥ T} is also a Brownian motion with initial state x_0, drift \lambda - \mu, and variance \lambda(C_a^2 + C_s^2).
Next we seek to answer the question: If {U(t), t ≥ 0} is a Brownian motion, then what about {X(t), t ≥ 0}? To answer this we observe that the following relations ought to hold for all t ≥ 0:

X(t) \ge 0, \quad (7.2)

\frac{dV(t)}{dt} \ge 0 \text{ with } V(0) = 0, \quad (7.3)

and

X(t)\, \frac{dV(t)}{dt} = 0. \quad (7.4)
Problem 63
Given U(t), show that there exists a unique pair X(t) and V(t) such that X(t) = U(t) + V(t), which satisfy conditions (7.2 through 7.4). Also show that the unique pair X(t) and V(t) can be written in terms of U(t) as follows:

V(t) = \sup_{0 \le s \le t} \max\{-U(s), 0\}, \quad (7.5)

X(t) = U(t) + \sup_{0 \le s \le t} \max\{-U(s), 0\}. \quad (7.6)
Solution
We first show that X(t) and V(t) are a unique pair. For that, consider another pair X̂(t) and V̂(t) such that given U(t) for all t, X̂(t) = U(t) + V̂(t) and the pair X̂(t) and V̂(t) satisfy conditions (7.2 through 7.4). Hence X̂(t) ≥ 0, dV̂(t)/dt ≥ 0 with V̂(0) = 0, and X̂(t)dV̂(t)/dt = 0. We show that the only way this can happen is if X(t) = X̂(t) (and hence V(t) = V̂(t), because U(t) = X(t) − V(t) = X̂(t) − V̂(t)). For that, consider (1/2){X(t) − X̂(t)}² and write it as follows (the first equation is an artifact of integration and uses the fact that X(0) = X̂(0) = x_0; the second equation is due to substituting X(u) − X̂(u) by V(u) − V̂(u) since U(u) = X(u) − V(u) = X̂(u) − V̂(u) for all u; the last equation can be derived using condition (7.4), i.e., X(u)dV(u) = 0 and X̂(u)dV̂(u) = 0):
1 t
{X(t) − X̂(t)}2 = {X(u) − X̂(u)}d{X(u) − X̂(u)},
2
0
t
= {X(u) − X̂(u)}d{V(u) − V̂(u)},
0
t t
=− X(u)dV̂(u) − X̂(u)dV(u).
0 0
However, based on conditions (7.2) and (7.3), X(u), dV̂(u), X̂(u), and dV(u)
are all ≥ 0. Thus we have

(1/2){X(t) − X̂(t)}² ≤ 0.

But the LHS is nonnegative. So the only way this result holds is if
X(t) = X̂(t). Hence the pair X(t) and V(t) such that X(t) = U(t) + V(t), which
satisfy conditions (7.2) through (7.4), is unique.
Having shown that X(t) and V(t) is unique, we now proceed to show that
V(t) and X(t) defined in Equations 7.5 and 7.6 satisfy X(t) = U(t) + V(t) and
the conditions (7.2 through 7.4). Subtracting Equation 7.5 from Equation 7.6,
we get X(t) = U(t) + V(t). Since max{−U(s), 0} ≥ 0 for all s, from the defi-
nition of V(t), we have V(t) ≥ 0. Thus if U(t) ≥ 0, X(t) = U(t) + V(t) ≥ 0.
Now, if U(t) < 0, from the definition of the supremum we have V(t) ≥ −U(t)
since V(t) ≥ −U(s) for all s such that 0 ≤ s ≤ t based on Equation 7.5. Since
V(t) ≥ −U(t), U(t)+V(t) ≥ 0, hence X(t) ≥ 0. Thus condition (7.2) is verified.
Next, to show condition (7.3) is satisfied, we first note that since U(0) = x0,
which is nonnegative, we have V(0) = max{−U(0), 0} = 0. Also, for any
dt ≥ 0 we have V(t + dt) ≥ V(t), since the supremum over time 0 to t + dt
must be greater than or equal to the supremum over any interval within 0 to
t + dt, in particular, 0 to t. Thus we have dV(t)/dt ≥ 0, verifying condition
(7.3). Finally, condition (7.4) also holds: whenever dV(t)/dt > 0, the supremum
in Equation 7.5 is attained at time t, so V(t) = −U(t) and hence
X(t) = U(t) + V(t) = 0; conversely, while X(t) > 0, the supremum does not
change, so dV(t)/dt = 0.
Based on the characteristics of this result, X(t) is called the reflected pro-
cess of U(t) and V(t) the regulator of U(t). From the expression for X(t) in
Equation 7.6, we can conclude that if {U(t), t ≥ 0} is a Brownian motion
with initial state x0 , drift θ, and variance σ2 , then {X(t), t ≥ 0} is a reflected
Brownian motion (sometimes also called Brownian motion with reflecting
barrier on the x-axis). To illustrate the Brownian motion and the reflected
Brownian motion, we simulated a single sample path of U(t) for 1000 time
units sampled at discrete time points 1 time unit apart. Using numerical val-
ues for initial state x0 = 6, drift −0.01, and variance 0.09, a sample path of
U(t) is depicted in Figure 7.1. Using the relation between U(t) and X(t) in
Equation 7.6, we generated X(t) values corresponding to U(t). Although this
is only a sample path, note from Figure 7.2, the reflected Brownian motion
starts at x0 and then keeps getting reflected at the origin and behaves like
a Brownian motion at other points. Since the drift is negative, the reflected
Brownian motion hits the origin (i.e., X(t) = 0) infinitely often. In the spe-
cific case of the G/G/1 queue, we showed that the {U(t), t ≥ 0} process
especially for large t is a Brownian motion with drift (λ − μ) and vari-
ance λ(C2a + C2s ). Then the {X(t), t ≥ 0} process is a corresponding reflected
Brownian motion.
Having described an approximation for the number in the system process
{X(t), t ≥ 0} as a reflected Brownian motion, we remark that it is rather awk-
ward to approximate a discrete quantity X(t) by a continuous process such
FIGURE 7.1
Simulation of Brownian motion {U(t), t ≥ 0}, plotted for 0 ≤ t ≤ 1000.

FIGURE 7.2
Generated reflected Brownian motion {X(t), t ≥ 0}, plotted for 0 ≤ t ≤ 1000.
as a reflected Brownian motion. Bolch et al. [12] get around this by map-
ping the probability density function of the reflected Brownian motion to a
probability mass function of the number in the system in steady state. That
is certainly an excellent option. However, here we follow the literature on
diffusion approximations or heavy-traffic approximations. In particular, we
approximate the workload process in terms of the number in the system as

W(t) ≈ X(t)/μ.
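As a quick numerical illustration, the reflected path of Figure 7.2 can be regenerated from a simulated Brownian path by applying the regulator of Equation 7.5 directly. The following is a minimal sketch (using numpy; the parameter values mirror the text's illustration, x0 = 6, drift −0.01, variance 0.09, unit time steps):

```python
import numpy as np

# Generate a Brownian path U(t) and its reflection X(t) via the regulator
# V(t) = sup_{0<=s<=t} max(-U(s), 0), i.e., Equations 7.5 and 7.6.
rng = np.random.default_rng(seed=1)
x0, drift, variance, n = 6.0, -0.01, 0.09, 1000

steps = rng.normal(loc=drift, scale=np.sqrt(variance), size=n)
U = np.concatenate(([x0], x0 + np.cumsum(steps)))   # U(0), U(1), ..., U(n)
V = np.maximum.accumulate(np.maximum(-U, 0.0))      # regulator, Equation 7.5
X = U + V                                           # reflected process, Equation 7.6

print(X.min() >= 0.0, X[0] == x0)   # X stays nonnegative and starts at x0
```

Note that while V(t) = 0 (i.e., before the path first threatens to go negative), X(t) coincides with U(t), exactly as in Figures 7.1 and 7.2.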
Problem 64
Show that for any reflected Brownian motion {W(t), t ≥ 0} with drift θ (such
that θ < 0) and variance σ2 , the steady-state distribution is exponential with
parameter −2θ/σ2 .
Solution
Consider a reflected Brownian motion {W(t), t ≥ 0} with initial state w0, drift
θ, and variance σ². Let the cumulative distribution function F(t, x; w0) be
defined as F(t, x; w0) = P{W(t) ≤ x | W(0) = w0}. It satisfies the diffusion
equation

∂F(t, x; w0)/∂t = −θ ∂F(t, x; w0)/∂x + (σ²/2) ∂²F(t, x; w0)/∂x². (7.7)

In steady state, F(t, x; w0) converges to a limit F(x) that does not depend on
t or w0, so that ∂F/∂t → 0 and Equation 7.7 reduces to the ordinary
differential equation

−θ dF(x)/dx + (σ²/2) d²F(x)/dx² = 0
which can be solved by integrating once with respect to x and then using
standard differential equation techniques to yield

F(x) = a e^{2θx/σ²} − c/θ

for some constants a and c that are to be determined. Using the boundary
condition F(0) = 0 and the CDF property F(∞) = 1, we get c = −θ and a = −1.
Thus we have

F(x) = 1 − e^{2θx/σ²}.
Therefore, any reflected Brownian motion with drift θ (such that θ < 0) and
variance σ2 has a steady-state distribution that is exponential with parameter
−2θ/σ2 .
For the G/G/1 queue described earlier, since we approximated the workload
process {W(t), t ≥ 0} as a reflected Brownian motion with initial state
w0 = x0/μ, drift θ = (λ − μ)/μ, and variance σ² = λ(C²a + C²s)/μ², we have the
expected workload in steady state (using the preceding problem, where we
showed the steady-state distribution is exponential with mean −σ²/(2θ)) as

Wq = λ(C²a + C²s) / (2(1 − ρ)μ²). (7.8)
Remark 15
The expression for Wq in Equation 7.8 is exact when the arrivals are Poisson.
In other words, if we had an M/G/1 queue, then based on the preceding
result

Wq = λ(1 + C²s) / (2(1 − ρ)μ²).
Problem 65
Simulate a G/G/1 queue with mean arrival rate λ = 1, with interarrival
times and service times both following Pareto distributions. Generate 100
replications and in each run use 1 million customer departures to obtain the
time in the system for various values of ρ, C²a, and C²s. Compare the sim-
ulation results against the approximation for W that can be derived from
Equation 7.8.
Solution
For this problem, we are given λ, ρ, C²a, and C²s. Using Equation 7.8, we
can derive an analytical expression (in terms of those four quantities) for the
expected time in the system in steady state as

W = ρ/λ + ρ²(C²a + C²s) / (2(1 − ρ)λ).

To match the mean interarrival time 1/λ and SCOV C²a, we use a Pareto
distribution for the interarrival times with CDF F(x) = 1 − (ka/x)^βa for
x ≥ ka, whose parameters are

βa = 1 + √(1 + 1/C²a),

ka = (βa − 1)/(λβa).
Next, we can easily obtain the inverse distribution for the CDF as
F⁻¹(u) = ka(1 − u)^{−1/βa}. In a similar manner, one can obtain ks, βs, and the
inverse of the service time CDF by changing all the arrival subscripts from a
to s and λ to μ = λ/ρ.
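This parameter matching and inversion sampling can be sketched as follows (function names are ours, not from the text):

```python
import numpy as np

# Choose Pareto parameters (beta, k) to match a target rate lam (mean 1/lam)
# and SCOV C2, then sample using F^{-1}(u) = k (1 - u)^{-1/beta}.
def pareto_params(lam, C2):
    beta = 1.0 + np.sqrt(1.0 + 1.0 / C2)
    k = (beta - 1.0) / (lam * beta)
    return beta, k

def pareto_sample(lam, C2, size, rng):
    beta, k = pareto_params(lam, C2)
    return k * (1.0 - rng.random(size)) ** (-1.0 / beta)

rng = np.random.default_rng(seed=3)
x = pareto_sample(lam=1.0, C2=0.49, size=200_000, rng=rng)
print(x.mean())   # should be close to the target mean 1/lam = 1
```

For heavy-tailed cases such as C² = 2 (shape βa ≈ 2.22), sample averages converge very slowly, which foreshadows the wide confidence intervals discussed next.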
We perform 100 replications of the simulation algorithm for each set of
λ, ρ, C2a , and C2s values. In each replication, we run the simulation till 1 mil-
lion customers are served to obtain the average time in the system over the
1 million customers. Using the 100 sample averages we obtain a confidence
TABLE 7.1
Comparing Simulation Confidence Interval against Analytical Approximation

ρ     C²a    C²s    Confidence Interval for W (via Simulations)    Analytical Approx. for W
0.9   1.00   2.00   (8.0194, 12.2170)                              13.0500
0.6   1.00   2.00   (1.2347, 1.5313)                                1.9500
0.9   0.49   2.00   (1.1135, 22.0137)                              10.9845
0.9   2.00   2.00   (9.1947, 11.0142)                              17.1000
0.6   2.00   2.00   (0, 7.7464)                                     2.4000
0.6   0.49   2.00   (1.0702, 1.7480)                                1.7205
0.6   0.49   0.49   (0.7926, 0.8577)                                1.0410
0.9   0.49   0.49   (3.8001, 3.9382)                                4.8690
0.9   4.00   0.49   (5.8341, 5.9503)                               19.0845
0.6   4.00   0.49   (0.8837, 0.9007)                                2.6205
interval (three standard deviations on each side of the grand average across
the 100 sample averages). We tabulate the results in Table 7.1. Notice that
λ = 1 in all cases.
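The replication experiments above rest on simulating the G/G/1 queue customer by customer. A minimal single-run sketch is the Lindley recursion, a standard approach we adopt here (the text does not prescribe a particular method); it is sanity-checked on M/M/1, where the exact mean sojourn time is W = 1/(μ − λ):

```python
import random

# A minimal G/G/1 simulation via the Lindley recursion
#   Wq_{n+1} = max(Wq_n + S_n - A_{n+1}, 0),  sojourn_n = Wq_n + S_n.
# The two sampler arguments stand in for any interarrival/service generators
# (e.g., the Pareto samplers of this problem).
def gg1_mean_sojourn(interarrival, service, n_customers, seed=0):
    rng = random.Random(seed)
    wq, total = 0.0, 0.0
    for _ in range(n_customers):
        s = service(rng)
        total += wq + s                          # this customer's sojourn
        wq = max(wq + s - interarrival(rng), 0.0)
    return total / n_customers

# Sanity check on M/M/1 with lam = 1, mu = 2, where exactly W = 1/(mu - lam) = 1:
w = gg1_mean_sojourn(lambda r: r.expovariate(1.0),
                     lambda r: r.expovariate(2.0),
                     200_000)
print(w)   # close to 1
```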
From the results of the previous problem it appears that in most cases
the analytically predicted W does not even fall within the confidence inter-
val, leave alone being close to the grand average. However, it is not clear
from the preceding text whether that is because of the simulation with Pareto
distribution or the accuracy of the approximation. Looking at how wide
the confidence intervals are, there is an indication that it might be an inaccuracy
in the simulation. It is worthwhile to investigate this issue further by
considering an M/G/1 queue where the service times are according to Pareto
distribution where we know the exact steady-state mean sojourn time. For a
similar situation where we run 100 replications of 1 million customers in
each replication for ρ = 0.9, λ = 1, C2s = 2 with Pareto distribution for ser-
vice times, we get a confidence interval of (10.2088, 14.2815). Although the
exact result using the Pollaczek–Khintchine formula in Equation 4.6 yields
W = 13.05, which is within the confidence interval, the grand average (of
12.2452) using the simulation runs is still significantly away considering it
is averaged over as many as 100 million customers. Another thing to notice
is that the grand average is smaller than the expected analytical value. One
reason for this is that the extremely rare event of seeing a humongous ser-
vice time has not been realized in the simulations but has been accounted
for in the analytical models. There are some research papers that have
been addressing similar problems of determining how many simulation
runs would be needed to predict performance under Pareto distributions
with reasonable accuracy. Having said that, we will move along with the approximation.
7.1.2.1 Superposition
With that we begin with the first step, namely superposition of flows. Con-
sider m flows with known characteristics that are superposed into a single
flow that acts as the arrival stream to a queue. For i = 1, . . . , m, let θi and C²i be the
average arrival rate of customers as well as the squared coefficient of
variation of interarrival times on flow i. Likewise, let θ and C² be the effective
arrival rate as well as the effective squared coefficient of variation of inter-
arrival times obtained as a result of superposition of the m flows. Given
these, let Ni(t) denote the number of arrivals on flow i during (0, t], and let
N(t) = N1(t) + ··· + Nm(t) for all t.

FIGURE 7.3
Modeling a node of the network: superposition of inflows, flow through the queue, and splitting of outflows.

FIGURE 7.4
Superposition of flows (θ1, C²1), . . . , (θm, C²m) into a single flow (θ, C²).

For large t, we know from renewal theory that Ni(t) for all
i ∈ {1, . . . , m} is normally distributed with mean θi t and variance θi C²i t. Also,
since N(t) is the sum of m independent normally distributed quantities
(for large t), N(t) is also normally distributed. Taking the expectation and
variance of N(t) we get
E[N(t)] = E[N1(t)] + ··· + E[Nm(t)] = Σ_{i=1}^{m} θi t,

Var[N(t)] = Var[N1(t)] + ··· + Var[Nm(t)] = Σ_{i=1}^{m} θi C²i t.

Matching these with the counting process of a renewal process whose interrenewal
times have rate θ and SCOV C², we get

θ = θ1 + ··· + θm,

C² = Σ_{i=1}^{m} (θi/θ) C²i.
Notice how this is derived by going backward, that is, originally we started
with a renewal process with a mean and squared coefficient of variation of
interrenewal times, and then derived the distribution of the counting process
N(t) for large t; but here we reverse that. It is crucial to notice that the actual
interevent times of the aggregated superposed process are not IID, and
hence the process is not truly a renewal process. However, we use the results
as an approximation, treating the superposed process as a renewal
process.
θd = θa .
Note that we have derived an expression for C²d in Chapter 4 using mean
value analysis (MVA) in Equation 4.16. We now rewrite that expression
in terms of θa, θs, ρ = θa/θs, C²a, and C²s as

C²d = (1 − ρ²)C²a + ρ²C²s.

FIGURE 7.5
Flow through a queue: arrivals (θa, C²a) served at (θs, C²s) produce departures (θd, C²d).
FIGURE 7.6
Bernoulli splitting of a flow (θ, C²) into n streams, where stream i is selected with probability pi and has parameters (θi, C²i).
If each customer of a flow with parameters (θ, C²) is independently routed to
stream i with probability pi, then stream i has

θi = pi θ,

C²i = pi C² + 1 − pi.
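The three building blocks can be collected into small helper functions (a sketch; the function names are ours). A useful sanity check is that superposing independent Poisson flows, flowing through a queue with exponential service, and Bernoulli splitting all preserve an SCOV of 1:

```python
# Building blocks of Section 7.1.2: superposition of renewal flows,
# flow through a G/G/1 queue, and Bernoulli splitting.
def superpose(rates, scovs):
    theta = sum(rates)
    C2 = sum(r * c for r, c in zip(rates, scovs)) / theta
    return theta, C2

def flow_through_queue(Ca2, Cs2, rho):
    # SCOV of interdeparture times (Equation 4.16 rewritten)
    return (1.0 - rho**2) * Ca2 + rho**2 * Cs2

def split(theta, C2, p):
    return p * theta, p * C2 + 1.0 - p

# Poisson in, exponential service, Bernoulli split -- all SCOVs stay 1:
print(superpose([1.0, 2.0], [1.0, 1.0]))      # (3.0, 1.0)
print(flow_through_queue(1.0, 1.0, 0.8))      # 1.0
print(split(3.0, 1.0, 0.4))                   # (1.2, 1.0) up to rounding
```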
For the open queueing network, the effective arrival rates aj into node j satisfy

aj = λj + Σ_{i=1}^{N} pij ai. (7.9)

The condition for stability of node i is ai < μi, and the traffic intensity of
node i is

ρi = ai/μi.
Once the ai ’s are obtained, the only parameter left to compute to use in the
G/G/1 result is the squared coefficient of variation of the interarrival times
into node i that we denote as C2a,i . We use an approximation that the net
arrivals into node i (for all i) is according to a renewal process and obtain
an approximate expression for C2a,i using the results for superposition, flow
through a queue, and splitting that we saw in the previous section. But we
require a feed-forward network for that so that we can perform superposi-
tion, flow, and splitting as we go forward in the network. However, as an
approximation, we consider any generic network and show (see following
problem) that for all j such that 1 ≤ j ≤ N,
C²a,j = (λj/aj) C²Aj + Σ_{i=1}^{N} (ai pij/aj) [1 − pij + pij((1 − ρ²i)C²a,i + ρ²i C²Si)]. (7.10)
Therefore, there are N such equations for the N unknowns C2a,1 , . . . , C2a,N ,
which can be solved either by writing as a matrix form or by iterating start-
ing with an arbitrary initial C2a,j for each j. Once they are solved, we can
Note that this result is exact for the single-server Jackson network. Before pro-
gressing further, we take a moment to derive the expression for C2a,j described
in Equation 7.10 as a problem.
Problem 66
Using the terminology, notation, and expressions derived until Equa-
tion 7.10, show that for a feed-forward network
C²a,j = (λj/aj) C²Aj + Σ_{i=1}^{N} (ai pij/aj) [1 − pij + pij((1 − ρ²i)C²a,i + ρ²i C²Si)].
Solution
Consider the results for superposition of flows, flow through a queue, as well
as Bernoulli splitting, all described in Section 7.1.2. Based on that we know
that if node i has renewal interarrival times with mean 1/ai and squared
coefficient of variation C2a,i (and service times with mean 1/μi and squared
coefficient of variation C2Si ), then the interdeparture times have mean 1/ai
and squared coefficient of variation (1 − ρ²i)C²a,i + ρ²i C²Si. Since the probability
that a departing customer from node i will join node j is pij, the interarrival
times of customers going from node i to node j have mean 1/(ai pij) and squared
coefficient of variation 1 − pij + pij[(1 − ρ²i)C²a,i + ρ²i C²Si]. Since the aggregate
arrivals to node j are from all such nodes i as well as external arrivals, the
effective interarrival times into node j have a squared coefficient of variation
(defined as C²a,j) given by

C²a,j = (λj/aj) C²Aj + Σ_{i=1}^{N} (ai pij/aj) [1 − pij + pij((1 − ρ²i)C²a,i + ρ²i C²Si)]
probabilities pij . The algorithm to obtain Lj for all j ∈ {1, 2, . . . , N} given mean
and squared coefficient of variation of IID service times (i.e., 1/μi and C2Si ) as
well as mean and squared coefficient of variation of IID external interarrival
times into node i (i.e., 1/λi and C2Ai ) for all i = 1, 2, . . . , N is as follows:
Problem 67
For a single-server open queueing network described in Figure 7.7, cus-
tomers arrive externally only into node 0 and exit the system only from
node 5. Assume that the interarrival times for customers coming externally
has a mean 1 min and standard deviation 2 min. The mean and stan-
dard deviation of service times (in minutes) at each node is described in
Table 7.2. Likewise, the routing probabilities pij from node i to node j are provided

FIGURE 7.7
Single-server open queueing network with nodes 0 through 5.
TABLE 7.2
Mean and Standard Deviation of Service Times (min)

Node i                           0     1      2       3    4    5
Mean 1/μi                        0.8   1.25   1.875   1    1    0.5
Standard deviation √(C²Si/μ²i)   1     1      1       1    1    1
TABLE 7.3
Routing Probabilities from Node on Left to Node on Top
0 1 2 3 4 5
0 0 0.5 0.5 0 0 0
1 0 0 0 1 0 0
2 0 0.2 0 0.3 0.5 0
3 0 0 0 0 0 1
4 0 0 0 0.2 0 0.8
5 0 0 0 0 0.4 0
in Table 7.3. Using this information compute the steady-state average num-
ber of customers at each node of the network as well as the mean sojourn
time spent by customers in the network.
Solution
Since this is an open queueing network of single-server queues, to solve the
problem we use the algorithm described earlier (with the understanding that
the network is not a feed-forward network). Note the slight change from the
original description where nodes moved from 1 to N; however, here it is
from 0 to N − 1, where N = 6. From the problem description we can directly
obtain P from Table 7.3 for nodes ordered {0, 1, 2, 3, 4, 5} as
P =
[ 0  0.5  0.5  0    0    0
  0  0    0    1    0    0
  0  0.2  0    0.3  0.5  0
  0  0    0    0    0    1
  0  0    0    0.2  0    0.8
  0  0    0    0    0.4  0 ]
Also, the external arrival rate is 1 per minute for node 0 and zero for all
other nodes. Based on Equation 7.9 we have a = [a0 a1 a2 a3 a4 a5 ] =
[λ0 λ1 λ2 λ3 λ4 λ5 ][I − P]−1 with λ0 = 1, and with λi = 0 for i > 0, we
have a = [1.0000 0.6000 0.5000 0.9333 0.9167 1.6667] effective arrivals per
minute into the various nodes. Using the service rates μi for every node i
described in Table 7.2 we can obtain the traffic intensities for various nodes
as [ρ0 ρ1 ρ2 ρ3 ρ4 ρ5 ] = [0.8000 0.7500 0.9375 0.9333 0.9167 0.8333]. Clearly,
all the nodes are stable; however, note how some nodes have fairly high
traffic intensities.
Next we obtain the squared coefficients of variation for the effective inter-
arrival times. For that, from the problem description we have the squared
coefficients of variation for external arrivals as

[C²A0 C²A1 C²A2 C²A3 C²A4 C²A5] = [4 0 0 0 0 0]
and those of the service times we can easily compute from Table 7.2 as

[C²S0 C²S1 C²S2 C²S3 C²S4 C²S5] = [1.5625 0.6400 0.2844 1.0000 1.0000 4.0000].
Thereby we use Equation 7.10 for all j to derive C²a,j using the following steps:
First obtain the row vector ψ = [ψj] for j ∈ {0, 1, 2, 3, 4, 5} as

ψj = λj C²Aj + Σ_i ai pij(1 − pij) + Σ_i ai ρ²i C²Si p²ij

so that Equation 7.10 reduces to the linear system
C²a,j = ψj/aj + Σ_i (ai p²ij/aj)(1 − ρ²i)C²a,i. Using the preceding computation
we can derive

[ψj/aj] = [4.0000 0.9750 1.0000 0.5461 1.4149 0.8716]

and, solving the linear system,

[C²a,j] = [4.0000 1.5819 1.7200 1.0107 1.5349 1.0308].
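These computations can be reproduced numerically. The sketch below (using numpy) solves the traffic equations and iterates Equation 7.10 to a fixed point rather than forming a linear system explicitly; both routes give the same solution:

```python
import numpy as np

# Reproducing the single-class computation for Problem 67: solve the traffic
# equations a = lam0 (I - P)^{-1}, then iterate Equation 7.10 to a fixed point
# for the interarrival-time SCOVs.
P = np.array([[0, .5, .5, 0, 0, 0],
              [0, 0, 0, 1., 0, 0],
              [0, .2, 0, .3, .5, 0],
              [0, 0, 0, 0, 0, 1.],
              [0, 0, 0, .2, 0, .8],
              [0, 0, 0, 0, .4, 0]])
lam0 = np.array([1.0, 0, 0, 0, 0, 0])            # external arrival rates
mu = 1.0 / np.array([0.8, 1.25, 1.875, 1, 1, 0.5])
C2A = np.array([4.0, 0, 0, 0, 0, 0])             # external interarrival SCOVs
C2S = mu ** 2                                    # std dev of service is 1 min

a = lam0 @ np.linalg.inv(np.eye(6) - P)          # Equation 7.9
rho = a / mu
C2a = np.ones(6)
for _ in range(100):                             # fixed point of Equation 7.10
    bracket = (1 - rho**2) * C2a + rho**2 * C2S
    total = ((a[:, None] * P) * (1 - P + P * bracket[:, None])).sum(axis=0)
    C2a = (lam0 * C2A + total) / a
print(np.round(rho, 4))   # [0.8    0.75   0.9375 0.9333 0.9167 0.8333]
print(np.round(C2a, 4))   # [4.     1.5819 1.72   1.0107 1.5349 1.0308]
```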
single server. There are C customers in total in the network, with no external
arrivals or departures. For i = 1, . . . , N, service times of customers at node
i are IID random variables with mean 1/μi and squared coefficient of varia-
tion C2Si . When a customer completes service at node i, the customer joins the
queue at node j with probability pij so that the routing matrix P = [pij ] has all
rows summing to one. We present two algorithms based on Bolch et al. [12],
one for large C called bottleneck approximation and the other for small C that
uses MVA. There are other algorithms such as maximum entropy method
(see Bolch et al. [12]) that are not described here.
The visit ratios vj satisfy

vj = Σ_{i=1}^{N} pij vi (7.12)

for all j. Let b denote the bottleneck node, that is, the node with the largest
vi/μi value among all i. Define λ as λ = μb/vb. We obtain the traffic intensities
ρj (for all j) using the following approximation:

ρj = λvj/μj.
Of course that would result in ρb = 1 for the bottleneck node and hence we
have to be careful as we will see subsequently.
We can also immediately obtain ai = λvi for all i. Then, the only param-
eter left to compute to use in the G/G/1 result is the squared coefficient of
variation of the interarrival times into node i that we denote as C2a,i for all i.
Since the external arrival rate is zero, this would just be a straightforward
adjustment of Equation 7.10 for all j such that 1 ≤ j ≤ N as:
C²a,j = Σ_{i=1}^{N} (ai pij/aj) [1 − pij + pij((1 − ρ²i)C²a,i + ρ²i C²Si)]. (7.13)
Here too there are N such equations for the N unknowns C2a,1 , . . . , C2a,N , which
can be solved either by writing as a matrix form or by iterating starting with
an arbitrary initial C2a,j for each j. Once they are solved, we can use Equa-
tion 7.8 to obtain an approximate expression for the steady-state average
number of customers in node j for all j ≠ b as

Lj ≈ ρj + ρ²j (C²a,j + C²Sj) / (2(1 − ρj)). (7.14)
Note that if we used this equation for node b, the denominator would go to
infinity. However, since the total number in the entire network is C, we can
easily obtain Lb using
Lb = C − Σ_{j≠b} Lj.
1. Obtain visit ratios vj into node j by solving Equation 7.12 for all j.
2. Identify the bottleneck node b as the node with the largest vi /μi value
among all i ∈ [1, N].
3. Let λ = μb /vb and obtain aggregate customer arrival rate aj into node
j as aj = λvj for all j.
4. Using aj , obtain the traffic intensity ρj at node j as ρj = aj /μj .
5. Solve Equation 7.13 for all j and derive C2a,j using the N simultaneous
equations.
6. Using the derived values of ρj and C²a,j for all j ≠ b, plug them into
Equation 7.14 and obtain Lj approximately.
7. Finally, Lb = C − Σ_{i≠b} Li.
• Wi (k): Average sojourn time in node i when there are k customers (as
opposed to C) in the closed queueing network
• Li (k): Average number in node i when there are k customers (as
opposed to C) in the closed queueing network
• λ(k): Measure of average flow (sometimes also referred to as
throughput) in the closed queueing network when there are k
customers (as opposed to C) in the network
We do not have an expression for any of the preceding metrics, and the objec-
tive is to obtain them iteratively. However, before describing the iterative
algorithm, we first explain the relationship between those parameters.
As a first approximation, we assume that the arrival theorem described
in Remark 14 holds here too. Thus in a network with k customers (such that
1 ≤ k ≤ C) the expected number of customers that an arrival to node i (for
any i ∈ {1, . . . , N}) would see is Li (k − 1). Note that Li (k − 1) is the steady state
expected number of customers in node i when there are k − 1 customers in
the system. Further, the net mean sojourn time experienced by that arriving
customer in steady state is the average time to serve all those in the system
upon arrival plus that of the customer. Note that the average service time is
1/μi for all customers waiting and (1 + C2Si )/(2μi ) for the customer in service
(using the remaining time for an event in steady state for a renewal process).
Thus we have

Wi(k) = 1/μi + Li(k − 1)(1 + C²Si)/(2μi).

Applying Little's law across the whole network with visit ratios vi, the
throughput is

λ(k) = k / Σ_{i=1}^{N} vi Wi(k)

when there are k customers in the network. Thereby applying Little's law
across each node i we get

Li(k) = λ(k) vi Wi(k).
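The recursion can be sketched as follows, assuming the update Wi(k) = 1/μi + Li(k − 1)(1 + C²Si)/(2μi) as reconstructed above; note that with C²Si = 1 (exponential service) the update collapses to exact MVA for product-form networks, which provides a check:

```python
# Approximate MVA for a closed single-server network (a sketch).
def approximate_mva(mu, v, C2S, C):
    N = len(mu)
    L = [0.0] * N                 # L_i(0) = 0
    lam = 0.0
    for k in range(1, C + 1):
        W = [1.0 / mu[i] + L[i] * (1.0 + C2S[i]) / (2.0 * mu[i])
             for i in range(N)]
        lam = k / sum(v[i] * W[i] for i in range(N))
        L = [lam * v[i] * W[i] for i in range(N)]   # Little's law per node
    return lam, L

# Balanced two-node cyclic network, exponential service, C = 2 customers;
# exact MVA gives throughput 2/3 and one customer at each node.
lam, L = approximate_mva(mu=[1.0, 1.0], v=[1.0, 1.0], C2S=[1.0, 1.0], C=2)
print(lam, L)
```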
Problem 68
Consider a manufacturing system with five machines numbered 1, 2, 3, 4,
and 5. The machines are in three stages as depicted in Figure 7.8. Four types
of products are produced in the system, 32% are processed on machines 1,
2, and 4; 8% on machines 1, 2, and 5; 30% on machines 1, 3, and 4; and
FIGURE 7.8
Single-server closed queueing network of machines 1 through 5 with the routing probabilities indicated.
TABLE 7.4
Mean and Standard Deviation of Service Times (min)

Machine i                        1   2   3   4   5
Mean 1/μi                        3   4   5   6   2
Standard deviation √(C²Si/μ²i)   6   2   1   3   2
and the MVA (assuming C is small). From the problem description we can
directly obtain P ordered {1, 2, 3, 4, 5} as
P =
[ 0  0.4  0.6  0    0
  0  0    0    0.8  0.2
  0  0    0    0.5  0.5
  1  0    0    0    0
  1  0    0    0    0 ]
Next we obtain the squared coefficients of variation for the effective inter-
arrival times. For that, we use the squared coefficients of variation of the
service times, which we can easily compute from Table 7.4 as

[C²S1 C²S2 C²S3 C²S4 C²S5] = [4.0000 0.2500 0.0400 0.2500 1.0000].

Thereby we use Equation 7.13 for all j to derive C²a,j using the following steps:
First obtain the row vector ψ = [ψj] for j ∈ {1, 2, 3, 4, 5} as

ψj = Σ_i ai pij(1 − pij) + Σ_i ai ρ²i C²Si p²ij

so that Equation 7.13 reduces to the linear system
C²a,j = ψj/aj + Σ_i (ai p²ij/aj)(1 − ρ²i)C²a,i. Using the preceding computation,
we can derive

[ψj/aj] = [0.1709 1.6406 1.9609 0.3706 0.5754]
and, solving the linear system,

[C²a,j] = [0.5056 1.7113 2.0669 1.1213 0.9194].
Now, using the values of ρj , C2a,j , and C2Sj for j = 1, 2, 3, 5 in Equation 7.14,
we can obtain the row vector of the mean number in each node in steady
state as L1 = 8.3764, L2 = 0.7484, L3 = 4.3464, and L5 = 0.2546. We can
obtain L4 = C − L1 − L2 − L3 − L5 = 16.2742.
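The bottleneck-approximation numbers above can be reproduced; a sketch follows (using numpy). Note that the population size C is not restated in this excerpt, so C = 30 is inferred from the reported L4 = C − L1 − L2 − L3 − L5 = 16.2742:

```python
import numpy as np

# Reproducing Problem 68's bottleneck approximation.
P = np.array([[0, .4, .6, 0, 0],
              [0, 0, 0, .8, .2],
              [0, 0, 0, .5, .5],
              [1., 0, 0, 0, 0],
              [1., 0, 0, 0, 0]])
mean_service = np.array([3.0, 4.0, 5.0, 6.0, 2.0])
mu = 1.0 / mean_service
C2S = np.array([6.0, 2, 1, 3, 2]) ** 2 / mean_service ** 2
C = 30                                      # inferred population size

v = np.array([1.0, 0.4, 0.6, 0.62, 0.38])   # visit ratios solving v = vP, v1 = 1
b = int(np.argmax(v / mu))                  # bottleneck node (machine 4, index 3)
lam = mu[b] / v[b]
a = lam * v                                 # arrival rates, so that rho[b] = 1
rho = a / mu
C2a = np.ones(5)
for _ in range(200):                        # fixed-point iteration of (7.13)
    bracket = (1 - rho**2) * C2a + rho**2 * C2S
    C2a = ((a[:, None] * P) * (1 - P + P * bracket[:, None])).sum(axis=0) / a
nb = np.arange(5) != b                      # Equation 7.14 at non-bottleneck nodes
L = np.zeros(5)
L[nb] = rho[nb] + rho[nb]**2 * (C2a[nb] + C2S[nb]) / (2 * (1 - rho[nb]))
L[b] = C - L[nb].sum()
print(np.round(L, 4))   # approx. [ 8.3764  0.7484  4.3464 16.2742  0.2546]
```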
is for two reasons. The first reason is to be able to derive expressions for
the second moment of the interdeparture time from a queue (which is why
Section 7.1 also used FCFS). The second reason is that it would enable us to
analyze each queue as an aggregated single-class queue. As described in Section
5.2.1 for multiclass M/G/1 queues, here too we can aggregate customers of
all classes in a node in a similar fashion. Further, as we saw in Section 7.1.4,
the approximations for general closed queueing networks were either rather
naive or available only for networks with single-server queues. Therefore,
we restrict our attention to only open queueing networks. With these intro-
ductory remarks, we proceed to analyze multiclass and multiserver open
queueing networks with FCFS discipline.
1. There are N service stations (or nodes) in the open queueing net-
work. The outside world is denoted by node 0 and the other nodes
are 1, 2, . . . , N. It is critical to point out that node 0 is used purely for
notational convenience and we are not going to model it as a “node”
for the purposes of analysis.
2. There are mi servers at node i (such that 1 ≤ mi ≤ ∞), for all i
satisfying 1 ≤ i ≤ N.
3. The network has multiple classes of traffic and class switching is not
allowed. Let R be the total number of classes in the entire network
and each class has its unique external arrival process, service times
at each node, as well as routing probabilities. They are explained
next.
4. Externally, customers of class r (such that r ∈ {1, . . . , R}) arrive at
node i according to a renewal process such that the interarrival time
has a mean 1/λ0i,r and a squared coefficient of variation (SCOV) of
C20i,r . All arrival processes are independent of each other, the service
times and the class.
Σ_{j=0}^{N} pij,r = 1
The preceding notations are summarized in Table 7.5 for easy reference.
Our objective is to develop steady-state performance measures for such a
TABLE 7.5
Parameters Needed as Input for Multiserver and Multiclass Open Queueing
Network Analysis
N Total number of nodes
R Total number of classes
i Node index with i = 0 corresponding to external world, otherwise i ∈ {1, . . . , N}
j Node index with j = 0 corresponding to external world, otherwise j ∈ {1, . . . , N}
r Class index with r ∈ {1, . . . , R}
pij,r Fraction of traffic of class r that exits node i and join node j
mi Number of servers at node i for i ≥ 1
μi,r Mean service rate of class r customers at node i
C²Si,r SCOV of service time of class r customers at node i
λ0i,r Mean external arrival rate of class r customers at node i
C²0i,r SCOV of external interarrival time of class r customers at node i
Wiq ≈ (αmi/μi) (1/(1 − ρi)) (C²Ai + C²Si)/(2mi), (7.15)

where

αmi = (ρi^mi + ρi)/2   if ρi > 0.7,
αmi = ρi^((mi+1)/2)    if ρi < 0.7.
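A sketch of Equation 7.15 as a function (interpreting μ as the per-server service rate, so that ρ = λ/(mμ)): with m = 1 the expression reduces to ρ/(μ(1 − ρ)) · (C²A + C²S)/2, which is exact for M/M/1, giving a ready sanity check:

```python
# Multiserver waiting-time approximation of Equation 7.15 (a sketch).
def ggm_wq(lam, mu, m, C2A, C2S):
    rho = lam / (m * mu)
    if rho > 0.7:
        alpha = (rho ** m + rho) / 2.0
    else:
        alpha = rho ** ((m + 1) / 2.0)
    return (alpha / mu) / (1.0 - rho) * (C2A + C2S) / (2.0 * m)

# M/M/1 checks against the exact Wq = rho / (mu - lam):
print(ggm_wq(1.0, 2.0, 1, 1.0, 1.0))   # 0.5
print(ggm_wq(0.9, 1.0, 1, 1.0, 1.0))   # 9.0 (approximately)
```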
Problem 69
Consider a single G/G/m queue with interarrival and service times according
to gamma distributions. Compare against simulations the expression for Wq
given by

Wq ≈ (αm/μ) (1/(1 − ρ)) (C²A + C²S)/(2m),

where αm = (ρ^m + ρ)/2 if ρ > 0.7 and αm = ρ^((m+1)/2) if ρ < 0.7.
TABLE 7.6
Comparison of Simulation’s 95% Confidence Interval against Analytical
Approximation for Wq
Experiment m ρ C2A C2S Approximation Simulation
C²Di = 1 + ρ²i (C²Si − 1)/√mi + (1 − ρ²i)(C²Ai − 1)
where ρi = λi/(mi μi). Note that the result is identical to that of a G/G/1 queue
given in Section 7.1 if we let mi = 1. Also, since the departures from M/G/∞
and M/M/mi queues are Poisson, we can verify the formula for those two
queues: using C²Ai = 1 together with mi → ∞ for the first, and C²Ai = 1
together with C²Si = 1 for the second, we can show that C²Di is one in both
cases.
ρi,r = λi,r/(mi μi,r)

since it is nothing but the ratio of the class r arrival rate to the net service
rate. Thus we can aggregate over all classes and obtain the effective traffic
intensity of node i, ρi, as
ρi = Σ_{r=1}^{R} ρi,r.
It is worthwhile to point out that the condition for stability of node i is given
by ρi < 1. We can also immediately obtain the aggregate mean arrival rate
into node i, λi, as the sum of the arrival rates over all classes. Hence

λi = Σ_{r=1}^{R} λi,r.
Also, we can obtain the SCOV of the aggregate arrivals into node i (by
aggregating over all classes) as

C²Ai = (1/λi) Σ_{r=1}^{R} C²Ai,r λi,r.
This result can be derived directly from the SCOV of a flow as a result of
superpositioning described in Section 7.1.2.
Having obtained all the expressions for the input to queue i, next we
obtain the aggregate service parameters and the split output from node i. In
particular, μi , the aggregate mean service rate of node i, can be obtained from
its definition using
μi = [Σ_{r=1}^{R} (λi,r/λi) (1/(mi μi,r))]⁻¹ = λi/ρi.
This result and the next one on the aggregate SCOV of service times across all
classes at a node can be derived using that in Section 5.2.1 for M/G/1 queue
with FCFS service discipline. Thus the aggregate SCOV of service time of
node i, C2Si , is given by
C²Si = −1 + Σ_{r=1}^{R} (λi,r/λi) (μi/(mi μi,r))² (C²Si,r + 1).
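The aggregation formulas for a single node can be sketched as follows (the function name is ours); as a sanity check, identical classes must collapse to the common single-class parameters:

```python
# Class aggregation at one node: aggregate lam_i, rho_i, mu_i = lam_i/rho_i,
# and the aggregate service-time SCOV across R classes.
def aggregate_node(lam_r, mu_r, C2S_r, m):
    lam = sum(lam_r)
    rho = sum(l / (m * u) for l, u in zip(lam_r, mu_r))
    mu = lam / rho                       # aggregate service rate of the node
    C2S = -1.0 + sum((l / lam) * (mu / (m * u)) ** 2 * (c + 1.0)
                     for l, u, c in zip(lam_r, mu_r, C2S_r))
    return lam, rho, mu, C2S

# Two identical classes collapse to the single-class values:
print(aggregate_node([1.0, 1.0], [4.0, 4.0], [1.5, 1.5], m=1))   # (2.0, 0.5, 4.0, 1.5)
```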
Finally, the average flow rate for departures from node i into node j using
splitting of flows can be computed. As defined earlier, λij,r is the mean
departure rate of class r customers from node i that end up in node j. Since
pij,r is the fraction of traffic of class r that depart from node i join node j,
we have
λi,r = λ0i,r + Σ_{j=1}^{N} λj,r pji,r.
In fact, we would have to solve N such equations to obtain λi,r for all i. In the
single-class case recall that we solved that by inverting I − P and multiplying
that by external arrivals. Although something similar can be done here, care
must be taken to ensure that only the set of nodes that class r traffic traverses
is considered (otherwise the generic I − P is not invertible). Further,
using the superposition of flows result in Section 7.1.2, we can derive C2Ai,r ,
the SCOV of class r interarrival times into node i as
C²Ai,r = (1/λi,r) Σ_{j=0}^{N} C²ji,r λj,r pji,r.
To obtain the SCOV of aggregate interarrival times into node i, C2Ai , we once
again use the superposition result in Section 7.1.2 to get
C²Ai = (1/λi) Σ_{r=1}^{R} C²Ai,r λi,r.
We had seen earlier that we can obtain the effective SCOV of the departures
from node i using C2Si , the aggregate SCOV of service time of node i (also
derived earlier), as
C²Di = 1 + ρ²i (C²Si − 1)/√mi + (1 − ρ²i)(C²Ai − 1).
Thus we can get C2ij,r , which is the SCOV of time between two consecutive
class r customers going from node i to node j in steady state as
C²ij,r = 1 + pij,r (C²Di − 1).
This result is directly from the splitting of flows departing a queue described
in Section 7.1.2. With that we have defined all the parameters necessary for
the algorithm to obtain steady-state performance measures for all classes of
customers at all nodes.
TABLE 7.7
Parameters Obtained as Part of the QNA Algorithm
μi Aggregate mean service rate of node i
C²Si Aggregate SCOV of service time of node i
λij,r Mean arrival rate of class r customers from node i to node j
λi,r Mean class r arrival rate to node i (or mean departure rate from node i)
λi Mean aggregate arrival rate to node i
ρi,r Traffic intensity of node i due to customers of class r
ρi Traffic intensity of node i across all classes
C²ij,r SCOV of time between two consecutive class r customers going from node i to node j
C²Ai,r SCOV of class r interarrival times into node i
C²Ai Aggregate SCOV of interarrival times into node i
C²Di Aggregate SCOV of interdeparture times from node i
Li,r Expected number of class r customers in node i in steady state
Wiq Expected time waiting before service in node i in steady state
Likewise, we assume that upon service completion, there is only one stream
that gets split into multiple streams. The following are the three basic steps
in the algorithm.
Step 1: Calculate the mean arrival rates, utilizations, and aggregate service
rate parameters using the following:
λi,r = λ0i,r + Σ_{j=1}^{N} λj,r pji,r
C²Si = −1 + Σ_{r=1}^{R} (λi,r/λi) (μi/(mi μi,r))² (C²Si,r + 1).
r=1
(1) Superposition:

C²Ai,r = (1/λi,r) Σ_{j=0}^{N} C²ji,r λj,r pji,r

C²Ai = (1/λi) Σ_{r=1}^{R} C²Ai,r λi,r

(2) Flow:

C²Di = 1 + ρ²i (C²Si − 1)/√mi + (1 − ρ²i)(C²Ai − 1)

(3) Splitting:

C²ij,r = 1 + pij,r (C²Di − 1).
λi
Note that the splitting formula is exact if the departure process is a renewal
process. However, the superposition and flow formulae are approximations.
Several researchers have provided other expressions for the flow and super-
position. As mentioned earlier, the preceding is QNA, described in Whitt
[103].
Step 3: Obtain performance measures such as mean queue lengths and
mean waiting times by treating each node as an independent G/G/m queue.
Choose α_{m_i} such that
\alpha_{m_i} = \begin{cases} \dfrac{\rho_i^{m_i} + \rho_i}{2} & \text{if } \rho_i > 0.7 \\[6pt] \rho_i^{(m_i+1)/2} & \text{if } \rho_i < 0.7. \end{cases}

Then the mean waiting time for class r customers in the queue (not including
service) of node i is approximately

W_{iq} \approx \frac{\alpha_{m_i}}{\mu_i}\, \frac{1}{1 - \rho_i}\, \frac{C^2_{A_i} + C^2_{S_i}}{2}.
Then the expected number of class r customers in node i in steady state is

L_{i,r} = \frac{\lambda_{i,r}}{\mu_{i,r}} + \lambda_{i,r}\, W_{iq}.
Problem 70
Consider an e-commerce system where there are three stages of servers. In
the first stage there is a single queue with four web servers; in the second
stage there are four application servers, two of which are on the same node
and share a queue; and in the third stage there are three database servers (two
on one node sharing a queue and one on another node). The e-commerce
system caters to two classes of customers but serves them in an FCFS manner
(both across classes and within a class). This e-commerce system at the
server end can be modeled as an N = 6 node and R = 2 class open queueing
network with multiple servers. This multiserver and multiclass open queue-
ing network is described in Figure 7.9. There are two classes of customers
and both classes arrive externally only into node 1. Class-1 customers exit the
system only from node 5 and class-2 customers only from node 6. Assume
that the interarrival times for customers coming externally have a mean 1/3
units of time and standard deviation 2/3 time units for class-1 and a mean 1/6
units of time and standard deviation 1/4 units of time for class-2. The mean
and standard deviation of service times (in the same time units as arrivals)
at each node for each class are described in Table 7.10. Likewise, the routing
probabilities pij,r from node i to j are provided in Table 7.8 for r = 1 (i.e.,
class-1) and Table 7.9 for r = 2 (i.e., class-2). Using this information compute
the steady-state average number of each class of customers at each node.
Solution
Based on the problem description we first cross-check to see that all input
metrics described in Table 7.5 are given. Clearly, we have N = 6 and R = 2.
FIGURE 7.9
Multiserver and multiclass open queueing network.
TABLE 7.8
Class-1 Routing Probabilities [pij,1 ] from Node
on Left to Node on Top
1 2 3 4 5 6
1 0 0.8 0.2 0 0 0
2 0 0 0 0 1 0
3 0 0 0 0 1 0
4 0 0 0 0 0 0
5 0 0 0.6 0 0 0
6 0 0 0 0 0 0
TABLE 7.9
Class-2 Routing Probabilities [pij,2 ] from Node
on Left to Node on Top
1 2 3 4 5 6
1 0 0 0.1 0.9 0 0
2 0 0 0 0 0 0
3 0 0 0 0 0 1
4 0 0 0 0 0 1
5 0 0 0 0 0 0
6 0 0 0.75 0 0 0
TABLE 7.10
Number of Servers, Mean Service Rates, and SCOVs of Service Times for
Each Class

Node i                             1    2     3     4     5     6
No. of servers mi                  4    1     2     1     1     2
Class-1 service rate μi,1          2    2.5   6     N/A   8     N/A
SCOV class-1 service C²Si,1        2    0.64  1.44  N/A   0.81  N/A
Class-2 service rate μi,2          4    N/A   20    6     N/A   15
SCOV class-2 service C²Si,2        4    N/A   2     0.49  N/A   0.64
The routing probabilities (for all i and j) pij,1 and pij,2 are provided in
Tables 7.8 and 7.9, respectively. Also, Table 7.10 lists mi , μi,1 , μi,2 , C2Si,1 , and
C2Si,2 for i = 1, 2, 3, 4, 5, 6. Finally, λ01,1 = 3, C201,1 = 4, λ01,2 = 6, and C201,2 = 2.25,
with all other λ0i,r = 0 and C20i,r = 0. Now, we go through the three steps of
the algorithm.
Step 1: To solve

\lambda_{i,r} = \lambda_{0i,r} + \sum_{j=1}^{N} \lambda_{j,r}\, p_{ji,r}
for all i ∈ [1, . . . , 6] and r = 1, 2, we can simply consider the subset of nodes
that class r customers traverse. Then using the approach followed in Jackson
networks we can get the λi,r values. However, in this example since there is
only one loop in the network, one can obtain λi,r in a rather straightforward
fashion. In particular, we get

[ λ1,1 λ2,1 λ3,1 λ4,1 λ5,1 λ6,1 ] = [ 3 2.4 5.1 0 7.5 0 ]

and

[ λ1,2 λ2,2 λ3,2 λ4,2 λ5,2 λ6,2 ] = [ 6 0 18.6 5.4 0 24 ].

Using these results we can immediately obtain, for all i ∈ {1, . . . , 6},
j ∈ {1, . . . , 6}, and r = 1, 2,

\lambda_{ij,r} = \lambda_{i,r}\, p_{ij,r}

using the preceding λi,r values. Thus the aggregate arrival rate into node i
across all classes, λi, can be obtained by summing over λi,r for r = 1, 2. Hence
we have

[ λ1 λ2 λ3 λ4 λ5 λ6 ] = [ 9 2.4 23.7 5.4 7.5 24 ].
For all i ∈ [1, . . . , 6] and r = 1, 2, we can write down ρi,r = λi,r/(mi μi,r) and
thereby obtain ρi = Σ²r=1 ρi,r as

[ ρ1 ρ2 ρ3 ρ4 ρ5 ρ6 ] = [ 0.75 0.96 0.89 0.9 0.9375 0.8 ].
Clearly, since ρi < 1 for all i ∈ {1, 2, 3, 4, 5, 6}, all queues are stable. Further,
the last computation in step 1 of the algorithm is to obtain the aggregate
service rate and SCOV of service times at node i, which can be computed
using
\mu_i = \frac{\lambda_i}{\rho_i}

C^2_{S_i} = -1 + \sum_{r=1}^{R} \frac{\lambda_{i,r}}{\lambda_i} \left(\frac{\mu_i}{m_i\, \mu_{i,r}}\right)^2 \left(C^2_{S_{i,r}} + 1\right).

Thus we get
[ μ1 μ2 μ3 μ4 μ5 μ6 ] = [ 12 2.5 26.6292 6 8 30 ]
and
[ C²S1 C²S2 C²S3 C²S4 C²S5 C²S6 ] = [ 3.125 0.64 2.6291 0.49 0.81 0.64 ].
For step 2 of the algorithm we initialize all C2ij,r = 1. Then, for all
i ∈ [1, . . . , 6] and r = 1, 2, we obtain
C^2_{A_{i,r}} = \frac{1}{\lambda_{i,r}} \sum_{j=0}^{N} C^2_{ji,r}\, \lambda_{j,r}\, p_{ji,r}

C^2_{A_i} = \frac{1}{\lambda_i} \sum_{r=1}^{R} C^2_{A_{i,r}}\, \lambda_{i,r}

C^2_{D_i} = 1 + \frac{\rho_i^2\,(C^2_{S_i} - 1)}{\sqrt{m_i}} + (1 - \rho_i^2)\,(C^2_{A_i} - 1).

Iterating these until the SCOV values converge, we obtain

[ C²A1 C²A2 C²A3 C²A4 C²A5 C²A6 ] = [ 2.8333 2.1198 1.0449 2.2598 1.5487 1.6753 ].
Finally, in step 3 of the algorithm, we obtain α_{m_i} = (ρ_i^{m_i} + ρ_i)/2 since
ρi > 0.7 for all i. Then using the approximation for Wiq, namely

W_{iq} \approx \frac{\alpha_{m_i}}{\mu_i}\, \frac{1}{1 - \rho_i}\, \frac{C^2_{A_i} + C^2_{S_i}}{2},

we can compute Wiq at every node, and thereby L_{i,r} = \lambda_{i,r}/\mu_{i,r} + \lambda_{i,r} W_{iq},
the steady-state average number of each class of customers at each node.
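The three steps above are mechanical enough to script. The following sketch (our own code, not from the book; variable names are ours, and the flow-balance equations are solved by fixed-point iteration rather than matrix inversion) carries out steps 1 through 3 on the data of Problem 70:

```python
# QNA sketch for Problem 70 (N = 6 nodes, R = 2 classes).
# Service entries marked N/A in Table 7.10 are encoded as None.
N, R = 6, 2
m = [4, 1, 2, 1, 1, 2]
mu = [[2, 2.5, 6, None, 8, None],
      [4, None, 20, 6, None, 15]]
c2s_cls = [[2, 0.64, 1.44, None, 0.81, None],
           [4, None, 2, 0.49, None, 0.64]]
lam0 = [[3, 0, 0, 0, 0, 0], [6, 0, 0, 0, 0, 0]]    # external rates
c20 = [[4, 0, 0, 0, 0, 0], [2.25, 0, 0, 0, 0, 0]]  # external SCOVs
P = [[[0.0] * N for _ in range(N)] for _ in range(R)]
for (i, j, pr) in [(0, 1, 0.8), (0, 2, 0.2), (1, 4, 1), (2, 4, 1), (4, 2, 0.6)]:
    P[0][i][j] = pr                                 # class-1 routing (Table 7.8)
for (i, j, pr) in [(0, 2, 0.1), (0, 3, 0.9), (2, 5, 1), (3, 5, 1), (5, 2, 0.75)]:
    P[1][i][j] = pr                                 # class-2 routing (Table 7.9)

# Step 1: flow balance lam = lam0 + lam P (as a fixed point), utilizations,
# and the aggregate service rate and service-time SCOV at every node.
lam = [row[:] for row in lam0]
for _ in range(300):
    lam = [[lam0[r][i] + sum(lam[r][j] * P[r][j][i] for j in range(N))
            for i in range(N)] for r in range(R)]
lam_i = [sum(lam[r][i] for r in range(R)) for i in range(N)]
rho = [sum(lam[r][i] / (m[i] * mu[r][i]) for r in range(R) if mu[r][i])
       for i in range(N)]
mu_i = [lam_i[i] / rho[i] for i in range(N)]
c2s = [-1 + sum((lam[r][i] / lam_i[i]) * (mu_i[i] / (m[i] * mu[r][i])) ** 2
                * (c2s_cls[r][i] + 1) for r in range(R) if mu[r][i])
       for i in range(N)]

# Step 2: iterate superposition, flow, and splitting until convergence.
c2ij = [[[1.0] * N for _ in range(N)] for _ in range(R)]   # initialize to 1
for _ in range(100):
    c2a_cls = [[0.0] * N for _ in range(R)]
    for r in range(R):
        for i in range(N):
            if lam[r][i] > 1e-12:       # superposition (j = 0 term is external)
                tot = c20[r][i] * lam0[r][i] + sum(
                    c2ij[r][j][i] * lam[r][j] * P[r][j][i] for j in range(N))
                c2a_cls[r][i] = tot / lam[r][i]
    c2a = [sum(c2a_cls[r][i] * lam[r][i] for r in range(R)) / lam_i[i]
           for i in range(N)]
    c2d = [1 + rho[i] ** 2 * (c2s[i] - 1) / m[i] ** 0.5
           + (1 - rho[i] ** 2) * (c2a[i] - 1) for i in range(N)]   # flow
    for r in range(R):                  # splitting
        for i in range(N):
            for j in range(N):
                c2ij[r][i][j] = 1 + P[r][i][j] * (c2d[i] - 1)

# Step 3: G/G/m approximation for waiting times and mean queue lengths.
alpha = [(rho[i] ** m[i] + rho[i]) / 2 if rho[i] > 0.7
         else rho[i] ** ((m[i] + 1) / 2) for i in range(N)]
wq = [alpha[i] / mu_i[i] / (1 - rho[i]) * (c2a[i] + c2s[i]) / 2
      for i in range(N)]
L = [[(lam[r][i] / mu[r][i] if mu[r][i] else 0) + lam[r][i] * wq[i]
      for i in range(N)] for r in range(R)]
```

Running this reproduces the intermediate quantities reported above, for example [ρ1 ... ρ6] = [0.75 0.96 0.89 0.9 0.9375 0.8], C²S1 = 3.125, and C²A2 ≈ 2.1198.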
Before moving ahead with other policies for serving multiclass traffic in
queueing networks, we present a case study. This case study is based on
the article by Chrukuri et al. [20] and illustrates an application of FCFS mul-
ticlass network approximation where many of the conditions required for
the analysis presented earlier are violated. In particular, the system (a) is a
multiclass queueing network with class switching, (b) is a polling system with
a limited service discipline, and (c) has finite-capacity queues with blocking.
However, since these features are not at the bottleneck node, the results are
not terribly affected if we continue to use QNA. Further, the presentation of
this case study is fairly different from the previous case studies in some
ways.
FIGURE 7.10
Components of a Myrinet VIA NIC: the LANai processor and the SRAM (memory)
reside on the NIC, which connects to the host computer or workstation through
an HDMA engine and to the network through Net Send and Net Receive DMA engines.

Among these components are a Net Send DMA (NSDMA) engine to transfer data
from the SRAM onto the network and a Net Receive DMA (NRDMA) engine to
transfer data onto the SRAM from the network. An example of such a VIA NIC
is depicted in Figure 7.10 and its functioning is described next.
The LANai goes through the following operations cyclically: polling the
doorbell queue to know if there is data that needs to be transferred, polling
the descriptor queue on SRAM that associates a doorbell with its data, and
polling the data queue. In addition, it programs NSDMA and NRDMA to
send and receive the data to and from the network, respectively. LANai polls
the doorbell queue and makes them available for HDMA to obtain the cor-
responding descriptors. Polled doorbells wait in a queue at HDMA to get
serviced on an FCFS basis. They are processed by the HDMA and the corre-
sponding descriptors are stored in the descriptor queue on the SRAM. The
descriptors in this queue are polled by LANai and it makes them available
for HDMA to obtain the corresponding data. In the case of a send descrip-
tor, LANai initiates the transfer of data from the host memory on to the data
queue on SRAM using HDMA. In the case of a receive descriptor, LANai
initiates the transfer of data from the network queue at NRDMA to the data
queue on SRAM using the NRDMA. LANai polls the data queue and if the
polled data is of type “send,” it checks whether NSDMA is busy. If not, it
initiates the transfer of send data from SRAM data queue to NSDMA. If the
polled data is of type “receive,” it initiates the transfer of data from SRAM
data queue to host memory using HDMA.
In summary, the operation of a VIA NIC can be modeled as a multi-
class queueing network. An experiment was performed where only the send
messages were considered (without any data received from the network) to
measure interarrival times and service times. Based on this, the send process
is depicted in Figure 7.11. There are three stations in the queueing network
corresponding to the LANai, HDMA, and NSDMA. At the LANai there are
three queues. Entities arrive externally into one of the queues according to
PP(λ), where λ is in units of per microsecond. It takes 22 μs to serve those
FIGURE 7.11
Multiclass queueing network model of an NIC with send: entities arrive
according to PP(λ) at the LANai station (polling, with service times 22, 0.12,
and 10 μs at its three queues), visit the HDMA station (FCFS, with service
times 21 and 68.3 μs on the first and second visits), and exit through the
NSDMA station (blocking, with service time 52.7 μs).
entities. Then the entities go into the HDMA. Although there are two queues
presented in the figure for illustration, there is really only one queue and
entities are served according to FCFS. When an entity arrives for the first
time it takes 21 μs to serve it and the entities go back to the LANai station
where they are served in just 0.12 μs and they return to the HDMA for a sec-
ond time, this time to be served in 68.3 μs. The entities return to the LANai.
The LANai would spend 10 μs serving the entity if the NSDMA is idle, oth-
erwise the LANai would continue polling its other queues. Note that the
LANai knows whether the NSDMA is idle because it also polls the NSDMA,
although that polling time is negligible. Once the entity reaches the idle NSDMA,
it takes 52.7 μs to process and it exits the system. Note that all service times
are deterministic.
In summary, the model of the system is that of a reentrant line. The first
station LANai uses a limited polling policy where it polls each queue, serves
at most one entity, and moves to the next queue. Also, entities in one of the
queues (with 10 μs service time) can begin service only if the NSDMA sta-
tion is idle. Thus the LANai would serve zero entities in that queue if the
NSDMA is busy. Then the second station is HDMA, which uses a pure FCFS
strategy. And the third station, NSDMA has no buffer, so it would get an
entity only if it is idle, in other words, it blocks an entity in the correspond-
ing LANai queue. Therefore, the system is a multiclass queueing network
with reentrant lines (or class switching and deterministic routing). It uses
a polling system with limited service discipline as opposed to FCFS at one
of the nodes. There is a finite-capacity node that blocks one of the queues.
Thus several of the conditions we saw in this section are violated. How-
ever, a quick glance would reveal that the bottleneck station is the HDMA.
In particular, the utilizations of the NSDMA and HDMA are 52.7λ and 89.3λ,
respectively. The utilization of the LANai is trickier to compute because the
LANai could be idling due to being blocked by the NSDMA. Nonetheless,
the fraction of time the LANai would have one or more entities would be
only a little over 32.12λ.
Thus undoubtedly, the HDMA station would be the bottleneck. To ana-
lyze the system considering the significant difference in utilizations at the
1. There are N service stations (or nodes) in the open queueing network
indexed 1, 2, . . . , N.
2. There is one server at node i, for all i satisfying 1 ≤ i ≤ N.
3. The network has multiple classes of traffic and class switching is not
allowed. Let R be the total number of classes in the entire network.
There is a global priority order across the entire network with class-
1 having highest priority and class R having lowest priority. Each
class has its unique external arrival process, service times at each
node, as well as routing probabilities. They are explained next.
4. Externally, customers of class r (such that r ∈ {1, . . . , R}) arrive at
node i according to a Poisson process such that the interarrival time
has a mean 1/λi,r . All arrival processes are independent of each
other, the service times, and the class.
5. Service times of class r customers at node i are IID exponential ran-
dom variables with mean 1/μi,r . They are independent of service
times at other nodes.
6. The service discipline at all nodes is a static and global preemptive
resume priority (with class a having higher priority than class b if
a < b). Within a class the service discipline is FCFS.
7. There is infinite waiting room at each node and stability condition is
satisfied at every node.
8. When a customer of class r completes service at node i, the customer
joins the queue at node j (such that j ∈ {1, . . . , N}) with probability
pij,r . We require that pii,r = 0, although we eventually remark that this
requirement can be relaxed. Since a class r customer may also exit the network
upon completing service at node i, we only require

\sum_{j=1}^{N} p_{ij,r} \leq 1.
Step 1: Calculate the mean effective entering rates ai,r and utilizations ρi,r
using the following for all i ∈ [1, . . . , N] and r ∈ [1, . . . , R]:

a_{i,r} = \lambda_{i,r} + \sum_{j=1}^{N} a_{j,r}\, p_{ji,r}

\rho_{i,r} = \frac{a_{i,r}}{\mu_{i,r}}.
To solve the first set of equations, if we select the subset of nodes that class
r traffic visits, then we can create a traffic matrix P̂r for that subset of nodes.
Then we can obtain the entering rate vector for that subset of nodes (âr ) in
terms of the external arrival rate vector at those nodes (λ̂r ) as âr = λ̂r (I− P̂r )−1 ,
where I is the corresponding identity matrix. Thereby we can obtain ai,r for
all i ∈ [1, . . . , N] and r ∈ [1, . . . , R]. Also, the condition for stability is that
\sum_{r=1}^{R} \rho_{i,r} < 1
for every i.
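The matrix computation âr = λ̂r(I − P̂r)⁻¹ amounts to solving the linear system (I − P̂rᵀ) a = λ̂r. Here is a small self-contained sketch (our own code, with a hand-rolled Gaussian elimination so that no libraries are needed), using as data the class-1 inputs of Problem 71 later in this section (λi,1 = 0.5 at every node and the routing matrix of Table 7.12):

```python
def entering_rates(lam, P):
    """Solve a_i = lam_i + sum_j a_j p_ji, i.e., (I - P^T) a = lam,
    by Gaussian elimination with partial pivoting."""
    n = len(lam)
    # augmented matrix [I - P^T | lam]
    A = [[(1.0 if i == j else 0.0) - P[j][i] for j in range(n)] + [lam[i]]
         for i in range(n)]
    for col in range(n):                       # forward elimination
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= f * A[col][c]
    a = [0.0] * n                              # back substitution
    for i in reversed(range(n)):
        a[i] = (A[i][n] - sum(A[i][j] * a[j]
                              for j in range(i + 1, n))) / A[i][i]
    return a

# Class-1 routing probabilities of Problem 71 (Table 7.12)
P1 = [[0, 0.2, 0.2, 0.2],
      [0.25, 0, 0.25, 0.25],
      [0.25, 0.25, 0, 0.25],
      [0.2, 0.2, 0.2, 0]]
a1 = entering_rates([0.5] * 4, P1)
```

For these inputs a1 works out to [1.5625, 1.5, 1.5, 1.5625], which can be confirmed by substituting back into the flow-balance equations.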
Step 2: Sequentially compute for each node i (from i = 1 to i = N) and each
r (from r = 1 to r = R)
L_{i,r} = \rho_{i,r} + a_{i,r} \sum_{k=1}^{r} \frac{L_{i,k}}{\mu_{i,k}} + L_{i,r} \sum_{k=0}^{r-1} \rho_{i,k}, \qquad (7.16)
where ρi,0 = 0 for all i. Notice that it is important to solve for the r val-
ues sequentially because for the case r = 1 it is possible to derive Li,1 using
Equation 7.16, then for r = 2 to derive Li,2 one needs Li,1 in Equation 7.16,
and so on.
Before illustrating the preceding algorithm using an example, we first
explain the derivation of Equation 7.16 and also make a few remarks. Define
Wi,r as the sojourn time for a class r customer during a single visit to node i.
Of course, due to Little’s law we have for all i ∈ [1, . . . , N] and r ∈ [1, . . . , R]
W_{i,r} = \sum_{k=1}^{r} \frac{L_{i,k}}{\mu_{i,k}} + W_{i,r} \sum_{k=0}^{r-1} \frac{a_{i,k}}{\mu_{i,k}} + \frac{1}{\mu_{i,r}}.
Using Little’s law Li,r = ai,r Wi,r we can rewrite this expression in terms of Li,k
and obtain Equation 7.16. Having explained the algorithm, next we present
a couple of remarks and then illustrate it using an example.
Remark 16
The algorithm is exact for the special cases of N = 1 with any R and R = 1
with any N. That is because for N = 1 and any R it reduces to the single-station
multiclass M/G/1 queue considered in Section 5.2.3. Also, for R = 1 and
any N we get a Jackson network. It may be a worthwhile exercise to check
the results for the preceding two special cases. Further, notice that under
those special cases due to PASTA, arriving customers do see time-averaged
number in the system.
Remark 17
Although we required that pii,r be zero for every i and r, the algorithm can
certainly be used as an approximation even when pii,r > 0. Also, the algo-
rithm can be seamlessly extended to non-preemptive priorities and closed
queueing networks by suitably approximating what arrivals see. The results
can be found in Bolch et al. [12].
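Before the example, the sequential solution of Equation 7.16 at a single node can be sketched in a few lines (our own illustrative code, with a hypothetical function name); rearranging Equation 7.16 to isolate Li,r on the left gives the update used below:

```python
def mva_node(a, mu):
    """Sequentially solve Eq. (7.16) at one node for classes 1..R.
    a[r], mu[r] are the entering rate and service rate of class r+1,
    with class 0 (index 0) the highest priority."""
    rho = [a[r] / mu[r] for r in range(len(a))]
    L = []
    for r in range(len(a)):
        higher = sum(L[k] / mu[k] for k in range(r))   # classes above r
        # Eq. (7.16) rearranged:
        # L_r (1 - rho_r - sum_{k<r} rho_k) = rho_r + a_r sum_{k<r} L_k/mu_k
        L.append((rho[r] + a[r] * higher) / (1 - rho[r] - sum(rho[:r])))
    return L
```

For R = 1 this returns ρ/(1 − ρ), the M/M/1 mean, as Remark 16 below indicates; for a two-class preemptive M/M/1 queue with equal service rates it reproduces the known results L1 = ρ1/(1 − ρ1) and L2 = ρ2/((1 − ρ1)(1 − ρ1 − ρ2)).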
Problem 71
Consider a small Internet service provider that can be modeled as a network
with N = 4 nodes. There are R = 3 classes of traffic. Class-1 traffic essentially
is control traffic that monitors the network states, and it is given highest pri-
ority with an external arrival rate of λi,1 = 0.5 at every node i. Class-2 traffic
arrives at node 1 at rate 3, then gets served at node 2, and exits the net-
work through node 4. Likewise, class-3 traffic arrives into node 1 at rate 2,
then gets served at node 3, and exits the network after being served in node
4. Assume that all nodes have a single server, infinite waiting room, and the
priority order is class-1 (highest) to 2 (medium) to 3 (lowest) at all nodes.
The policy for priority is preemptive resume. Assume that external arrivals
are according to Poisson processes and service times are exponentially dis-
tributed. The service rates (in the same time units as arrivals) at each node
for each class are described in Table 7.11. Likewise, the routing probabilities
pij,r from node i to j are provided in Table 7.12 for r = 1 (i.e., class-1), Table
TABLE 7.11
Mean Service Rates for Each Class at Every Node
Node i 1 2 3 4
Class-1 service rate μi,1 10 8 8 10
Class-2 service rate μi,2 8 5 N/A 6
Class-3 service rate μi,3 7 N/A 4 8
TABLE 7.12
Class-1 Routing Probabilities [pij,1 ] from Node
on Left to Node on Top
1 2 3 4
1 0 0.2 0.2 0.2
2 0.25 0 0.25 0.25
3 0.25 0.25 0 0.25
4 0.2 0.2 0.2 0
TABLE 7.13
Class-2 Routing Probabilities [pij,2 ] from Node
on Left to Node on Top
1 2 3 4
1 0 1 0 0
2 0 0 0 1
3 0 0 0 0
4 0 0 0 0
TABLE 7.14
Class-3 Routing Probabilities [pij,3 ] from Node
on Left to Node on Top
1 2 3 4
1 0 0 1 0
2 0 0 0 0
3 0 0 0 1
4 0 0 0 0
7.13 for r = 2 (i.e., class-2), and Table 7.14 for r = 3 (i.e., class-3). Using
this information compute the steady-state average number of each class of
customers at each node.
Solution
To solve the problem we go through the two steps of the algorithm. For
step 1, note that class-1 customers go through nodes 1, 2, 3, and 4; class-2
customers use nodes 1, 2, and 4; whereas class-3 customers use nodes 1, 3,
a_{i,r} = \lambda_{i,r} + \sum_{j=1}^{N} a_{j,r}\, p_{ji,r}

and obtain

[ a1,1 a2,1 a3,1 a4,1 ] = [ 1.5625 1.5 1.5 1.5625 ],

[ a1,2 a2,2 a3,2 a4,2 ] = [ 3 3 0 3 ],

[ a1,3 a2,3 a3,3 a4,3 ] = [ 2 0 2 2 ].

For all i ∈ [1, . . . , 4] and r = 1, 2, 3, we can write down ρi,r = ai,r/μi,r.
Hence we can verify that the stability condition

\sum_{r=1}^{R} \rho_{i,r} < 1

holds for every i. Then, sequentially applying Equation 7.16 at each node
yields the Li,r values. Upon running simulations with 50 replications, it was
found that the results matched exactly for class-1, that is,
[ L1,1 L2,1 L3,1 L4,1 ] = [ 0.1852 0.2308 0.2308 0.1852 ], since for class-1 the
system is a standard open Jackson network. For classes 2 and 3, since the
results are approximations, the 95% confidence interval for the simulations
yielded
[ L1,2 L2,2 L3,2 L4,2 ] = [ 0.9423 ± 0.0022 3.4073 ± 0.1613 0 1.7050 ± 0.0041 ]
and
[ L1,3 L2,3 L3,3 L4,3 ] = [ 3.1390 ± 0.0178 0 2.3621 ± 0.0108 9.4006 ± 0.0894 ].
Step 1: Calculate the mean effective entering rates ai,r and utilizations ρi,r
using the following for all i ∈ [1, . . . , N] and r ∈ [1, . . . , R]:
a_{i,r} = \lambda_{i,r} + \sum_{j=1}^{N} a_{j,r}\, p_{ji,r}

\rho_{i,r} = \frac{a_{i,r}}{\mu_{i,r}}.
To solve the first set of equations, if we select the subset of nodes that class
r traffic visits, then we can create a traffic matrix P̂r for that subset of nodes.
Then we can obtain the entering rate vector for that subset of nodes (âr ) in
terms of the external arrival rate vector at those nodes (λ̂r ) as âr = λ̂r (I −
P̂r )−1 , where I is the corresponding identity matrix. Thereby we can obtain
ai,r for all i ∈ [1, . . . , N] and r ∈ [1, . . . , R]. Also, the condition for stability
is that
\sum_{r=1}^{R} \rho_{i,r} < 1
for every i. Note that this step is identical to step 1 in Section 7.3.1.
Step 2: Sequentially compute for each node i (from i = 1 to i = N) and each
r (from r = 1 to r = R)
L_{i,r} = \rho_{i,r} + a_{i,r} \sum_{k=1}^{r} \frac{a_{i,k}}{2}\left(\sigma^2_{i,k} + \frac{1}{\mu^2_{i,k}}\right) + a_{i,r} \sum_{k=1}^{r} \frac{L_{i,k} - \rho_{i,k}}{\mu_{i,k}} + L_{i,r} \sum_{k=0}^{r-1} \rho_{i,k}, \qquad (7.17)
where ρi,0 = 0 for all i. Note that it is important to solve for the r val-
ues sequentially because for the case r = 1 it is possible to derive Li,1 using
Equation 7.17, then for r = 2 to derive Li,2 one needs Li,1 in Equation 7.17,
and so on.
Before illustrating the preceding algorithm using an example, we first
explain the derivation of Equation 7.17. Recall from Section 7.3.1 that Wi,r is
the sojourn time for a class r customer during a single visit to node i. For all
i ∈ [1, . . . , N] and r ∈ [1, . . . , R] we have

W_{i,r} = \sum_{k=1}^{r} \frac{a_{i,k}}{2}\left(\sigma^2_{i,k} + \frac{1}{\mu^2_{i,k}}\right) + \sum_{k=1}^{r} \frac{L_{i,k} - \rho_{i,k}}{\mu_{i,k}} + W_{i,r} \sum_{k=0}^{r-1} \rho_{i,k} + \frac{1}{\mu_{i,r}}.

Using Little's law L_{i,r} = a_{i,r} W_{i,r} we can rewrite this expression in terms of Li,k
and obtain Equation 7.17.
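As a quick consistency check on Equation 7.17 (a sketch in our own notation, not the book's code): applying the recursion at a single node with a single class must recover the Pollaczek-Khinchine mean L = ρ + a²E[S²]/(2(1 − ρ)) of the M/G/1 queue, where E[S²] = σ² + 1/μ².

```python
def mva_node_general(a, mu, sigma):
    """Sequentially solve Eq. (7.17) at one node with general service times
    (rates mu[r], standard deviations sigma[r]) under preemptive resume."""
    R = len(a)
    rho = [a[r] / mu[r] for r in range(R)]
    L = []
    for r in range(R):
        resid = sum(a[k] / 2 * (sigma[k] ** 2 + 1 / mu[k] ** 2)
                    for k in range(r + 1))                      # k = 1..r
        lower = sum((L[k] - rho[k]) / mu[k] for k in range(r))  # k = 1..r-1
        # Eq. (7.17) with the k = r term of the middle sum moved left:
        # L_r (1 - rho_r - sum_{k<r} rho_k)
        #     = rho_r - rho_r^2 + a_r (resid + lower)
        L.append((rho[r] - rho[r] ** 2 + a[r] * (resid + lower))
                 / (1 - rho[r] - sum(rho[:r])))
    return L

# Single class, exponential service (sigma = 1/mu): M/M/1, L = rho/(1 - rho)
L_mm1 = mva_node_general([0.5], [1.0], [1.0])
# Single class, deterministic service (sigma = 0): M/D/1, P-K formula
L_md1 = mva_node_general([0.5], [1.0], [0.0])
```

With ρ = 0.5 the exponential case gives L = 1.0 and the deterministic case gives L = 0.75, exactly the M/M/1 and Pollaczek-Khinchine values.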
Problem 72
Consider an emergency ward of a hospital with four stations: reception
where all arriving patients check in; triage where a nurse takes health-
related measurements; lab area where blood work, x-rays, etc. are done;
and an operating room. At each of the four stations, only one patient
can be served at any time. Thus the system can be modeled as a single-
server queueing network with N = 4 nodes. There are R = 2 classes of
patients: class-1 corresponds to critical cases (and hence given preemp-
tive priority) and class-2 corresponds to stable cases (hence lower priority).
Both classes of patients arrive according to a Poisson process straight to
node 1, that is, the reception. Note that node 2 is triage, node 3 is lab,
and node 4 is the operating room. External arrival rates for class-1 and class-2
patients are 0.001 and 0.08, respectively. The service rates (in the same time
units as arrivals) and standard deviation of service time at each node for
each class are described in Table 7.15. Likewise, the routing probabilities
pij,r from node i to j are provided in Table 7.16 for r = 1 (i.e., class-1)
and Table 7.17 for r = 2 (i.e., class-2). Using this information compute
the steady-state average number of each class of customers in the system
as a whole.
TABLE 7.15
Mean Service Rates and Standard Deviation of Service Times for
Each Class at Every Node
Node i 1 2 3 4
Class-1 service rate μi,1 1 0.2 0.05 0.01
Class-1 service time std. dev. σi,1 1 2.5 10 20
Class-2 service rate μi,2 0.5 0.1 0.025 0.02
Class-2 service time std. dev. σi,2 1 8 10 40
TABLE 7.16
Class-1 Routing Probabilities [pij,1 ] from Node
on Left to Node on Top
1 2 3 4
1 0 1 0 0
2 0 0 0.4 0.3
3 0 0 0 0.5
4 0 0 0 0
TABLE 7.17
Class-2 Routing Probabilities [pij,2 ] from Node
on Left to Node on Top
1 2 3 4
1 0 1 0 0
2 0 0 0.2 0.1
3 0 0 0 0.1
4 0 0 0 0
Solution
To solve the problem we go through the two steps of the algorithm. For
step 1, note that both classes of customers have the possibility of going
through all four nodes. Thus we solve
a_{i,r} = \lambda_{i,r} + \sum_{j=1}^{N} a_{j,r}\, p_{ji,r}

and obtain

[ a1,1 a2,1 a3,1 a4,1 ] = [ 0.001 0.001 0.0004 0.0005 ],

[ a1,2 a2,2 a3,2 a4,2 ] = [ 0.08 0.08 0.016 0.0096 ].

For all i ∈ [1, . . . , 4] and r = 1, 2, we can write down ρi,r = ai,r/μi,r.
Using that we can write down

[ ρ1,1 + ρ1,2  ρ2,1 + ρ2,2  ρ3,1 + ρ3,2  ρ4,1 + ρ4,2 ] = [ 0.161 0.805 0.648 0.53 ].
The stability condition

\sum_{r=1}^{R} \sum_{k} \rho_{i,r,k} < 1

must be met for every i. It is critical to note that this condition may not be
sufficient for stability.
Step 2: At node i let qi (r, k) be the priority given to class r traffic at node i
when it enters it for the kth time. Thus if qi (r, k) < qi (s, n), then class r traffic
entering node i for the kth time is given higher priority than class s traffic that
enters it for the nth time. Sequentially compute for each node i (from i = 1 to
i = N) each r and appropriate k (in the exact priority order qi (r, k))
L_{i,r,k} = \rho_{i,r,k} + a_{i,r,k} \sum_{j=1}^{q_i(r,k)} \frac{L_{i,q_i^{-1}(j)}}{\mu_{i,q_i^{-1}(j)}} + L_{i,r,k} \sum_{j=0}^{q_i(r,k)-1} \rho_{i,q_i^{-1}(j)} \qquad (7.18)

where ρi,0 = 0 for all i and q_i^{-1}(j) is the inverse function of q_i(\cdot,\cdot) such that
q_i^{-1}(j) = (s, n) if q_i(s, n) = j. Note that it is important to solve for the (r, k)
values in a sequence corresponding to the priority order qi (r, k) at each node i.
than one entry (with different service times) of each class into a node, we do
not need the k subscript we used for the MVA-based algorithm. However,
the total number of classes may have increased. We can obtain ai,r and ρi,r
using the QNA in step 1. Now, we switch back to the preemptive resume
priority policy. Let ρi,r̂ be the sum of traffic intensities of all classes strictly
higher priority than r at node i. We consider the following two cases:
• If r is not the lowest priority in node i, then Li,r ≈ ρi,r /(1 − ρi,r̂ ) due
to the state-space collapse assumption.
• If r is the lowest priority in node i, then Li,r ≈ ρi,r + ai,r Wiq /(1 − ρi,r̂ )
since due to state-space collapse, we assume that all the workload
belongs to this lowest class.
Problem 73
Consider a manufacturing system with three single-server workstations A,
B, and C, as described in Figure 7.12. There are three classes of traffic. Class-
1 jobs arrive externally into node A according to PP(λA,1 ) with λA,1 = 5 jobs
per day. They get served in node A, then at node B, and then they come back
to node A for another round of service before exiting the network. Class-2
jobs arrive externally into node B according to PP(λB,2 ) with λB,2 = 4 jobs per
day. After service in node B, with probability 0.75 a class-2 job joins node C
for service and then exits the network, whereas with probability 0.25 some
FIGURE 7.12
Single-server queueing network with local priorities: class-1 jobs enter node A
at rate 5 (service rates 10 and 20 at node A on the first and second visits, and
15 at node B); class-2 jobs enter node B at rate 4 (service rate 24 at node B
and 5 at node C, which they join with probability 0.75, exiting otherwise with
probability 0.25); class-3 jobs enter node C at rate 3 (service rates 10 at node
C, 15 at node A, and 8 at node B).
class-2 jobs exit the network after service in node B. Class-3 jobs arrive exter-
nally into node C according to PP(λC,3 ) with λC,3 = 3 jobs per day. They
get served in node C, then at node A, and then node B before exiting the
network. The service times to process a job at every node is exponentially
distributed. The service rates are described in Figure 7.12 in units of number
of jobs per day. In particular, the server in node A serves class-1 jobs dur-
ing their first visit at rate μA,1,1 = 10 and second visit at rate μA,1,2 = 20,
whereas it serves class-3 jobs at rate μA,3,1 = 15. Likewise, from the figure, at
node B, we have μB,1,1 = 15, μB,2,1 = 24, and μB,3,1 = 8, and at node C, we
have μC,2,1 = 5 and μC,3,1 = 10. The server at each node uses a preemptive
resume priority scheme with priority order determined using the shortest-
expected-processing-time-first rule. Thus each server gives highest priority
to the highest μ·,·,· . For such a system compute the steady-state expected
number of each class of customer at every node.
Solution
We will stick to the notation used throughout this section, although it is
worthwhile to point out that it may be easier to map the eight three-tuples
of (node, class, visit number) to eight single-dimension quantities as done
in most texts and articles. Note that the priority order (highest to lowest)
is (A, 1, 2), (A, 3, 1), and (A, 1, 1) in node A; (B, 2, 1), (B, 1, 1), and (B, 3, 1) in
node B; and (C, 3, 1) and (C, 2, 1) in node C. Since the flows are relatively sim-
ple in this example we can quickly compute the effective entering rates as
aA,1,2 = aA,1,1 = aB,1,1 = 5, aA,3,1 = aB,3,1 = aC,3,1 = 3, aB,2,1 = 4, and aC,2,1 = 3
(due to the Bernoulli splitting only 75% of class-2 reach node C). Since the
utilization (or relative traffic intensities) ρi,r,k can be computed as ai,r,k /μi,r,k ,
we have ρA,1,2 = 0.25, ρA,3,1 = 0.2, ρA,1,1 = 0.5, ρB,2,1 = 1/6, ρB,1,1 = 1/3,
ρB,3,1 = 0.375, ρC,3,1 = 0.3, and ρC,2,1 = 0.6 (they are presented so that the
traffic intensities at the same node are together and within each node they
are presented from the highest to lowest priority). Note that the necessary
condition for stability is satisfied since the effective traffic intensity at nodes
A, B, and C are ρA = 0.95, ρB = 0.875, and ρC = 0.9, respectively. Now, we
proceed using the two different algorithms.
MVA-based algorithm: Note that we have already completed step 1 of the
algorithm. In step 2, we just explain the q_i(r, k) and q_i^{-1}(j) but do not use
them explicitly. For example, for node A, q_A(1, 2) = 1, being the highest pri-
ority at node A. Likewise, q_A(3, 1) = 2 and q_A(1, 1) = 3. Also, for node B,
q_B^{-1}(1) = (2, 1), q_B^{-1}(2) = (1, 1), and q_B^{-1}(3) = (3, 1). Thus using Equation 7.18,
we get

L_{A,1,2} = \frac{\rho_{A,1,2}}{1 - \rho_{A,1,2}} = 0.3333,

L_{A,3,1} = \frac{\rho_{A,3,1} + a_{A,3,1}\, L_{A,1,2}/\mu_{A,1,2}}{1 - \rho_{A,1,2} - \rho_{A,3,1}} = 0.4545,
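The node-by-node computation can be scripted; below is a sketch in our own notation (not from the text), where each node is described by its (entering rate, service rate) pairs listed from highest to lowest priority, and Equation 7.18 is applied down the priority order:

```python
def mva_priority_node(entries):
    """Apply Eq. (7.18) at one node.  entries = [(a, mu), ...] lists the
    (class, visit) pairs served at the node from highest to lowest
    priority; returns the mean number present for each pair."""
    rho, L = [], []
    for a, mu in entries:
        rho.append(a / mu)
        higher = sum(L[j] / entries[j][1] for j in range(len(L)))
        # Rearranged Eq. (7.18):
        # L (1 - sum of rho down to own priority) = rho + a * higher
        L.append((rho[-1] + a * higher) / (1 - sum(rho)))
    return L

# Problem 73 data, in priority order: (A,1,2), (A,3,1), (A,1,1) at node A;
# (B,2,1), (B,1,1), (B,3,1) at node B; and (C,3,1), (C,2,1) at node C.
LA = mva_priority_node([(5, 20), (3, 15), (5, 10)])
LB = mva_priority_node([(4, 24), (5, 15), (3, 8)])
LC = mva_priority_node([(3, 10), (3, 5)])
```

The first two node-A values reproduce L_{A,1,2} = 0.3333 and L_{A,3,1} = 0.4545 computed above; the remaining entries complete the MVA-based part of the computation at all three nodes.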
Reference Notes
One of the key foundations of this chapter is approximations based on mod-
eling nodes in a queueing network using a reflected Brownian motion. We
began this chapter by giving a flavor for how the Brownian motion argu-
ment is made in a single node G/G/1 setting and derived an approximation
for the number in the system. The analysis relies heavily on the excellent
exposition in Chen and Yao [19] as well as Whitt [105]. A terrific resource
for key elements of Brownian motion (that have been left out in this chap-
ter) is Harrison [52]. Subsequently, we extend the single node to an open
network of single-server nodes assuming each queue behaves as if it were
a reflected Brownian motion. Then we provide approximations based on
Bolch et al. [12] for closed queueing networks. The second portion of this
chapter on multiclass and multiserver general open queueing networks is
entirely from Whitt [103]. Finally, the topic of queueing networks with pri-
orities is mainly based on Chen and Yao [19]. It is worth pointing out that
there are several research studies focused mainly on aspects of stability;
they will be dealt with in the next chapter. The semi-martingale
reflected Brownian motion offers the ability to derive approximations for the
performance measures including the state-space collapse.
Exercises
7.1 Consider a stable G/G/1 queue with arrival rate 1 per hour, traf-
fic intensity 0.8, C2a = 1.21, and C2s = 1.69. Obtain Wq using the
reflected Brownian motion approximation in Equation 7.8 and
also using the approximations in Chapter 4. Compare the approx-
imations against simulations. For the simulations use either
gamma or hyperexponential distribution.
7.2 Consider the queueing network of single-server queues shown in
Figure 7.13 under a special case of N = 3 nodes, p = 0.6, and λ = 1
per minute. Note that external arrival is Poisson. Service times at
node i are according to gamma distribution with mean 1/(i + 2)
minutes and SCOV i − 0.5. Compute an approximate expression
for the expected number of customers in each node of the network
in steady state.
7.3 Consider the queueing network of single-server queues shown in
Figure 7.13. Consider a special case of N = 3 nodes, p = 1, λ = 0
(hence a closed queueing network), and C = 50 customers. Service
times at node i are according to gamma distribution with mean
1/(i + 2) minutes and SCOV i − 0.5. Compute an approximate
expression for the expected number of customers in each node of
the network in steady state using both the bottleneck approxima-
tion (for large C) and the MVA approximation (for small C).
7.4 A queueing network with six single-server stations is depicted in
Figure 7.14. Externally, arrivals occur into node A according to a
renewal process with average rate 24 per hour and SCOV 2. The
service time at each station is generally distributed with mean (in
minutes) 2, 3, 2.5, 2.5, 2, and 2, and SCOV 1.5, 2, 0.25, 0.36, 1, and
1.44, respectively, at stations A, B, C, D, E, and F. A percent-
age on any arc (i, j) denotes the probability that a customer after
completing service in node i joins the queue in node j. Compute
the average number of customers in each node of the network in
steady state.
7.5 Consider a seven-node single-server queueing network where
nodes 2 and 4 get arrivals from the outside (at rate 5 per minute
each on an average). Nodes 1 and 2 have service rates of 85,
FIGURE 7.13
Single-server queueing network: external arrivals at rate λ enter the first
node; nodes with service rates μ1, μ2, . . . , μN−1, μN are in series; after the
last node a customer returns to the first node with probability p and exits
the network with probability 1 − p.
FIGURE 7.14
Schematic of six-station network: stations A through F, with the percentage on
each arc (i, j) denoting the probability that a customer completing service at
station i joins station j.
TABLE 7.18
Mean and SCOV of Service Times for Each Class
Node A B C D E F
Class-1 mean service time 0.2 N/A 0.1 0.05 0.22 N/A
SCOV class-1 service time 0.25 N/A 0.64 2 0.75 N/A
Class-2 mean service time N/A 0.15 0.05 0.1 N/A 0.14
SCOV class-2 service time N/A 0.36 0.49 2.25 N/A 0.81
TABLE 7.19
Mean and SCOV of Service Times for Each Class
Node i 1 2 3 4 5 6
Class-1 service time 2 2 3 1 2 4
SCOV class-1 service 1 1.44 1.69 1 0.81 0.49
Class-2 service time 3 1 2 3 4 2
SCOV class-2 service 0.25 1 0.81 0.49 2 0.64
TABLE 7.20
Class-1 Routing Probabilities [pij,1 ] from Node on Left to
Node on Top
1 2 3 4 5 6
1 0 0.1 0.1 0.1 0.1 0.1
2 0.2 0 0.2 0.2 0.2 0.2
3 0.3 0.4 0 0.1 0.1 0.1
4 0.1 0.1 0.1 0 0.2 0.5
5 0.2 0.1 0.5 0.1 0 0.1
6 0.3 0.3 0.2 0.1 0.1 0
TABLE 7.21
Class-2 Routing Probabilities [pij,2 ] from Node on Left to
Node on Top
1 2 3 4 5 6
1 0 0.5 0 0 0 0
2 0 0 1 0 0 0
3 0 0 0 1 0 0
4 0 0 0 0 1 0
5 0 0 0 0 0 1
6 1 0 0 0 0 0
FIGURE 7.15
Rybko–Stolyar–Kumar–Seidman-type network.
exit the network. Class-2 jobs arrive externally into node B accord-
ing to PP(λB ) with λB = 4 jobs per hour. After service in node
B, class-2 jobs get served at node A and then exit the network.
The service rates are described in Figure 7.15 in units of num-
ber of jobs per hour. In particular, the server in node A serves
class-1 jobs at rate μA,1 = 14, whereas it serves class-2 jobs at
rate μA,2 = 16. Likewise, from the figure, at node B, we have
μB,1 = 21 and μB,2 = 9. There is a single server at each node.
The servers use a preemptive resume priority scheme with prior-
ity order determined using shortest expected processing time first
rule. Thus each server gives highest priority to the highest μ·,· in
that node. For such a system compute the steady-state expected
number of each class of customer at every node. Use both MVA
and state-space collapse techniques. Note: The necessary condi-
tions for stability of this network are that λA /μA,1 + λB /μA,2 < 1
and λA /μB,1 + λB /μB,2 < 1. However, the sufficient condition
(assuming class-2 has high priority in node A and class-1 has high
priority in node B) is that λA /μB,1 + λB /μA,2 < 1.
8
Fluid Models for Stability, Approximations,
and Analysis of Time-Varying Queues
In this chapter and in the next two chapters, we will consider the notion
of fluid models or fluid queues. However, there is very little commonality
between what is called fluid queues here and what we will call fluid queues
in the next two chapters. In fact they have evolved in the literature rather
independently, although one could fathom putting them together in a uni-
fied framework. We will leave them in separate chapters in this book with
the understanding that in this chapter we are interested in the fluid limit of
a discrete queueing network whereas in the next two chapters we will directly
consider queueing networks with fluid entities (as opposed to discrete entities)
flowing through them. Another key distinction is that the resulting fluid
network in this chapter is deterministic, lending itself to straightforward ways
of determining stability of queueing networks, developing performance measures
approximately, and studying transient and time-varying queues. In the
next two chapters, we will study stochastic fluid networks.
Deterministic fluid models have been applied to many other systems
besides queueing networks. In pure mathematics the deterministic fluid
models are called hydrodynamic limits and in physics they fall under mean
field theory. The key idea is to study systems using only mean values by scal-
ing metrics appropriately. We begin by considering a single queue with a
single server to explain the deterministic fluid model concept and flesh out
details such as the functional strong law of large numbers, a concept central
to the theory developed. Subsequently, we will use fluid limits in a network
setting to analyze stability, obtain performance metrics approximately, and
finally to study nonstationary queues under transient conditions.
448 Analysis of Queues
λ = lim_{t→∞} A(t)/t   and   μ = lim_{t→∞} S(t)/t.
With that description we are now ready to take the fluid limits of the
discrete system described here. Define An (t) as
An(t) = A(nt)/n
Problem 74
Consider a G/G/1 queue with interarrival times as well as service times
according to Pareto distributions. The coefficient of variation for interarrival
times is 5 and for the service time it is equal to 2. The average arrival rate is
1 per unit time and the average service rate is 1.25 per unit time. The depar-
tures from this queue act as arrivals to a downstream queue. Let A(t) be the
number of entities that arrive at the downstream node during (0, t]. For t = 0
to 10 time units, graph three sample paths of An (t) = A(nt)/n versus t for
n = 1, 10, 100, and 1000.
Solution
It is crucial to note that the A(t) process is the arrivals to the downstream
node which is the same as the departures from the G/G/1 node described in
the question. Also the average arrival rate is λ = 1. By writing a simulation
using the algorithm in Problem 37 in Chapter 4, we can obtain sample paths
of the output process from the G/G/1 queue, in particular the number of
departures during any interval of time. Using this for various values of n = 1,
10, 100, and 1000, we can plot three sample paths of An (t) = A(nt)/n versus t
as shown in Figure 8.1(a)–(d).
From the figure, note that in (a) where n = 1, the sample paths are piece-
wise constant graphs for the number of arrivals until that time. We expect
to get about 10 arrivals during t = 10 time units. Also notice in (a) that the
sample paths are quite variable. Now, when n = 10, as seen in (b), the sample
paths are still piecewise constant, but they are closer together than in case (a).
This trend is more prominent in case (c), where the sample paths have closed
in and the piecewise constant graph has started to look more like a straight
line. Finally, in (d), when n = 1000, which for this example is sufficiently
large, the sample paths merge with one another, and thereby the entire stochastic
process converges in the limit to a deterministic one.
Notice that the numerical example is for small t; the convergence would
only be faster if we used larger t values. Thus we can conclude that An(t)
converges to λt as n grows to infinity. In addition, if we were to choose a
smaller coefficient of variation, we would see much faster convergence.
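Problem 74 specifies Pareto interarrival and service times; as a rough sketch of the same scaling experiment, the code below substitutes exponential times (an M/M/1 queue with λ = 1 and μ = 1.25), which displays the same fluid-scaling behavior. The function names and the Lindley-style recursion are our illustrative choices, not from the text.

```python
import random

def mm1_departure_times(lam, mu, horizon, rng):
    """Departure epochs of an M/M/1 queue over (0, horizon] using the
    recursion D_k = max(A_k, D_{k-1}) + S_k (arrival or previous departure,
    whichever is later, plus the service time)."""
    t_arr, dep_prev, deps = 0.0, 0.0, []
    while True:
        t_arr += rng.expovariate(lam)      # next arrival epoch
        if t_arr > horizon:
            return deps
        dep_prev = max(t_arr, dep_prev) + rng.expovariate(mu)
        deps.append(dep_prev)

def scaled_count(deps, n, t):
    """A_n(t) = A(nt)/n for the departure-counting process A."""
    return sum(1 for d in deps if d <= n * t) / n
```

For n = 1000 and t = 10, `scaled_count` lands close to λt = 10, mirroring the merging of sample paths in Figure 8.1(d); repeating the run with different seeds shows the spread shrinking as n grows.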
FIGURE 8.1
Sample paths of scaled arrival process An (t). (a) n = 1, (b) n = 10, (c) n = 100, and (d) n = 1000.
8.1.2 Functional Strong Law of Large Numbers and the Fluid Limit
One version of the well-known strong law of large numbers (SLLN) is that
if X1 , X2 , . . . are IID random variables with mean m, then Sn = (X1 + X2 + · · · + Xn )/n
converges to m almost surely as n → ∞.
Stability, Fluid Approximations, and Non-stationary Queues 451
FIGURE 8.2
Snapshot of a discrete stochastic queue and a scaled deterministic fluid queue.
Thus we have Xn (0) = x0 /n which in the limit goes to zero. All the results in
this section use 0 ≤ x0 < ∞.
To derive the scaled process we use a reflection mapping argument
where the steps are identical to that in Section 7.1.1 with similar notation
as well. Let B(t) and I(t) denote the total time the server has been busy and
idle, respectively, from time 0 to t. The corresponding fluid limits by def-
inition are Bn (t) = B(nt)/n and In (t) = I(nt)/n. Of course B(t) + I(t) = t and
Bn (t) + In (t) = t. We can apply scaling to Equation 7.1 and obtain
Xn(t) = x0 /n + An(t) − Sn(Bn(t)) = Un(t) + Vn(t),

where

Un(t) = x0 /n + (λ − μ)t + (An(t) − λt) − (Sn(Bn(t)) − μBn(t)),   (8.1)
Vn(t) = μ In(t).
Then, by scaling the expressions for U(t) and V(t) defined in Section 7.1.1,
we know that given Un(t), there exists a unique pair Xn(t) and Vn(t) such that
Xn(t) = Un(t) + Vn(t), which satisfies the following three conditions (obtained
by rewriting conditions (7.2), (7.3), and (7.4) by scaling for any t ≥ 0):
Xn(t) ≥ 0,

dVn(t)/dt ≥ 0 with Vn(0) = 0, and

Xn(t) dVn(t)/dt = 0.
We also showed in Section 7.1.1 that the unique pair Xn(t) and Vn(t) can be
written in terms of Un(t) as
Vn(t) = sup_{0≤s≤t} max(−Un(s), 0),   (8.2)

Xn(t) = Un(t) + sup_{0≤s≤t} max(−Un(s), 0).   (8.3)
But how do we use this fluid limit? We will see that in the remainder of this
chapter especially in the context of stability of networks, approximations,
and analyzing time-varying systems.
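Equations (8.2) and (8.3) are the one-sided reflection (Skorokhod) map applied to the netput Un. On a sampled path, the supremum reduces to tracking a running minimum; the sketch below is our own illustrative code, not from the text.

```python
def reflection_map(u):
    """One-sided reflection map on a sampled path: given u[k] = U(t_k),
    return (v, x) with v[k] = max(0, -min_{j<=k} u[j]) and x[k] = u[k] + v[k],
    matching equations (8.2) and (8.3) on a discrete time grid."""
    v, x, running_min = [], [], 0.0
    for uk in u:
        running_min = min(running_min, uk)  # tracks inf of U over [0, t_k]
        vk = max(-running_min, 0.0)         # pushing process keeping x >= 0
        v.append(vk)
        x.append(uk + vk)
    return v, x
```

As a check, for the fluid limit U(t) = x0 + (λ − μ)t with x0 = 2 and λ − μ = −1 on the grid t = 0, 1, . . . , 5, the input u = [2, 1, 0, −1, −2, −3] yields x = [2, 1, 0, 0, 0, 0], that is, max(x0 + (λ − μ)t, 0), exactly the drained fluid level one expects.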
FIGURE 8.3
Reentrant line example.
Consider a multiclass network depicted in Figure 8.3 (Dai [25]). Jobs enter
node A at average rate λ, after service (mean service time of 1/μ1 ) they go to
node B for a service that takes 1/μ2 time on average. Then they reenter node
A for another round of service (with mean 1/μ3 ) and then go to node B again
for a service that takes an average time of 1/μ4 . They finally visit node A
for a service that takes an average 1/μ5 time before exiting the system. Such
a system is called a queueing network with reentrant lines and is typical in
semiconductor manufacturing facilities. Although there is only a single flow,
for ease of explanation we call the first visit to node A as class-1, first visit to
node B as class-2, second visit to node A as class-3, second visit to node B as
class-4, and final visit to node A as class-5. Notice that the subscripts of the
service rates match the respective classes.
There is a single server at node A that uses a priority order class-5 (high-
est) then class-3 and then class-1 (lowest priority). Likewise, there is a single
server at node B as well that gives higher priority to class-2 than class-4.
However, at all nodes jobs within a class are served FCFS. Also, the priori-
ties are preemptive resume (see Section 5.2.3 for a definition) type. We are not
specifying the probability distributions for the interarrival times or service
times for two reasons: (i) stability can be determined by just knowing their
means; (ii) we do not want to give the impression that the interarrival times
or the service times are IID. Having said that, when we run simulations we
do need to specify distributions and for that reason we would do so in the
examples.
The stability conditions in terms of traffic intensities at nodes A and B are
λ/μ1 + λ/μ3 + λ/μ5 < 1   and   λ/μ2 + λ/μ4 < 1   (8.5)
Problem 75
Consider the reentrant line in Figure 8.3. Let the arrivals be according to a
Poisson process with mean λ = 1. Also all service times are exponentially
distributed with μ1 = 10, μ2 = 2, μ3 = 8, μ4 = 2.5, and μ5 = 1.5. Verify that
conditions in (8.5) are satisfied. Then simulate the system for about 2000 time
units with an initially empty system to obtain the number of jobs in each of
the two nodes A and B over time. Also state whether either server is
underutilized for the duration of the simulation.
Solution
We can immediately verify that conditions in (8.5) are satisfied because
λ/μ1 + λ/μ3 + λ/μ5 = 0.89167 < 1   and   λ/μ2 + λ/μ4 = 0.9 < 1.
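These two node loads can be checked in a couple of lines, together with a third quantity whose significance the rest of this discussion establishes: the combined load of classes 2 and 5, which (as argued later) can never be in service simultaneously. The variable names below are ours.

```python
# Loads for Problem 75: lam = 1, service rates indexed by class
lam = 1.0
mu = {1: 10.0, 2: 2.0, 3: 8.0, 4: 2.5, 5: 1.5}

rho_A = lam/mu[1] + lam/mu[3] + lam/mu[5]  # node A serves classes 1, 3, 5
rho_B = lam/mu[2] + lam/mu[4]              # node B serves classes 2, 4
rho_virtual = lam/mu[2] + lam/mu[5]        # classes 2 and 5 combined
```

Both node loads are below 1 (about 0.8917 and 0.9), yet the combined class-2/class-5 load is about 1.167 > 1, which is the quantity that dooms the system.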
We start with an empty system and simulate arrivals and service accord-
ing to the description given. By keeping track of the number of jobs in each
of the nodes, we plot Figure 8.4(a) and (b). In Figure 8.4a, notice how the
number of jobs in node A rises and then falls, then rises higher and then
crashes to zero with the high queue length periods increasing in size. A sim-
ilar trend can also be observed in Figure 8.4b. However, a curious finding
is the fact that when there are jobs in one node, the other is more or less
empty. In other words, notice that the number in node A and that in node B
are negatively correlated. Further, if we add the number of jobs in nodes A
and B, we can plot the total number of jobs in the entire system and that is
given in Figure 8.5. Although it is true from Figure 8.4a and b that the number
in each queue hits zero often, if one were to add the number of jobs
in node A to that in node B, the total number of jobs shows an increasing
trend (Figure 8.5). We can hence conclude that the system is indeed unstable
because the total number in the entire system shows a rising trend over
time. Also, in terms of the utilization during the course of the simulation, we find
the following. Although the utilization of node A is close to the traffic intensity
of 0.89167, the utilization of node B is only 0.7836, which is significantly
lower than the traffic intensity of 0.9 that we expect to see.
Next we investigate why, in Problem 75, the system becomes unstable
and the nodes do not reach their expected utilizations. Let Xi (t) be the number of class-i jobs in the system
at time t for i = 1, 2, 3, 4, 5. In particular, consider jobs of class-2 and class-5.
Could node A be working on a class-5 job at the same time when node B is
working on a class-2 job? Say that is possible and there is one job of class-2
and one job of class-5 in the system. Since class-2 and class-5 get preemp-
tive priorities at their respective nodes, both must be in service. However,
the service times could not have started simultaneously. So let us say that
FIGURE 8.4
Sample path of number of jobs over time in each node of the reentrant line example. (a) Number
of jobs in A versus t. (b) Number of jobs in node B versus t.
the class-2 job was in the system when the class-5 job entered and began
service (the argument would not change if we went the other way around).
But that is impossible because the class-5 job would have been a class-4 job
that completed service (but the server would be processing class-2 and hence
a class-4 cannot have completed). In other words, before becoming a class-
2 and a class-5 job, they were class-1 and class-4 jobs, respectively, and a
FIGURE 8.5
Total number in the entire system in the reentrant line example.
class-1 job cannot be completed when there is a class-5 job in the system and a
class-4 job cannot be completed when there is a class-2 job in the system.
Hence we make the crucial observation that

X2 (t)X5 (t) = 0

for all t, that is, class-2 and class-5 jobs are never in service at the same
time. In Problem 75, notice that

λ/μ2 + λ/μ5 = 1.167 > 1.
The system cannot spend a fraction λ/μ2 time serving class-2 and another
λ/μ5 fraction of time serving class-5 since both cannot be served simultane-
ously. Thus a crucial condition for stability is
λ/μ2 + λ/μ5 < 1.   (8.6)
Remark 18
The conditions for the network represented in Figure 8.3 (with priority policy
described earlier) to be stable are
λ/μ2 + λ/μ5 < 1,

λ/μ1 + λ/μ3 + λ/μ5 < 1, and

λ/μ2 + λ/μ4 < 1.
At this juncture, a natural question to ask is whether reentrant lines or the
priority policy is needed to observe such virtual stations and have additional
conditions for stability. In the next two examples we will relax one of those
two conditions. First we present the example that is popularly known as
Kumar–Seidman–Rybko–Stolyar network in the literature. Sometimes it is
also referred to as Rybko–Stolyar–Kumar–Seidman network and is depicted
in Figure 8.6. Around the same time Kumar and Seidman as well as Rybko
and Stolyar wrote articles considering the network in Figure 8.6. The only
difference is that Kumar and Seidman considered a deterministic system
whereas Rybko and Stolyar a stochastic system.
Since we are only interested in the average rates, the deterministic and
the stochastic versions are identical from a stability standpoint. Class-1 jobs
enter node A externally at an average rate of λ1 per unit time. They get served
at node A for an average time of 1/μ1 and then go to node B where they are
called class-2. Class-2 jobs get served for an average time of 1/μ2 and exit
the system. Class-3 jobs arrive externally at an average rate of λ3 per unit
time into node B. They require an average processing time of 1/μ3 and upon
completion they go to node A and get served for an average time of 1/μ4 (as
class-4) before exiting the network. There is a single server at node A and a
FIGURE 8.6
Rybko–Stolyar–Kumar–Seidman network.
single server at node B. Notice that this is not a reentrant line. However, like
the previous example, here too we consider a preemptive resume priority
scheme. Class-2 and class-4 jobs are given higher priority at their respective
nodes. This is natural because by giving priority to them, we could purge
jobs out of the system (with the hope that it would reduce the number of
jobs in the system).
The stability conditions in terms of traffic intensities at nodes A and B are
λ1 /μ1 + λ3 /μ4 < 1   and   λ1 /μ2 + λ3 /μ3 < 1   (8.7)
Problem 76
Consider the network in Figure 8.6. Let the arrivals be according to a Poisson
process with mean λ1 = λ3 = 1. Also all service times are exponentially dis-
tributed with μ1 = 5, μ2 = 10/7, μ3 = 4, and μ4 = 4/3. Verify that conditions
in (8.7) are satisfied. Then simulate the system for about 4000 time units with
an initially empty state to obtain the number of jobs in each of the two nodes
A and B over time.
Solution
It is relatively straightforward to verify that conditions in (8.7) are satisfied
since
λ1 /μ1 + λ3 /μ4 = 0.95 < 1   and   λ1 /μ2 + λ3 /μ3 = 0.95 < 1.
The reason for instability is very similar to that we saw in Problem 75,
although here too the traffic intensity conditions (8.7) are satisfied. Let X2 (t)
FIGURE 8.7
Number in each node of Kumar–Seidman–Rybko–Stolyar network example. (a) Number of jobs
in node A versus t. (b) Number of jobs in node B versus t.
and X4 (t) be the number of class-2 and class-4 jobs, respectively, in the sys-
tem at time t. It is impossible for node A to be working on a class-4 job at
the same time when node B is working on a class-2 job. This is because with
respect to the class-2 and class-4 jobs, if they were both being served simultaneously,
the previous event would have been the start of a class-2 job or the start of a
class-4 job. For that, it would be necessary for a class-1 or a class-3 job, respectively,
to have been completed. But that is impossible because the respective
FIGURE 8.8
Number of jobs in the entire Kumar–Seidman–Rybko–Stolyar network.
nodes would be working on the higher priority jobs. In other words, before
becoming a class-2 job, a job would have been a class-1 job that would have
just completed. But for a class-1 job to be complete there could be no class-4
jobs in the system. So there would be a class-2 job in the system only if there
is no class-4 job in the system. Likewise, we can see using an identical argu-
ment that there would be a class-4 job in the system only if there are no
class-2 jobs in the system.
Hence we conclude that
X2 (t)X4 (t) = 0
for all t if we started with an empty system at t = 0. This means that the
system as a whole cannot process a class-2 and a class-4 job simultaneously.
Therefore, if the load brought by class-2 and class-4 jobs is too high, then the
system will not be stable. In Problem 76, notice that
λ1 /μ2 + λ3 /μ4 = 1.45 > 1.
The system cannot spend a fraction λ1 /μ2 time serving class-2 and another
λ3 /μ4 fraction of time serving class-4 since both cannot be served simultane-
ously. Thus a crucial condition for stability is
λ1 /μ2 + λ3 /μ4 < 1.   (8.8)
This condition is as though there exists a virtual station into which class-2
and class-4 flow and that station also needs to have a traffic intensity of less
than 1.
Remark 19
The conditions for the network represented in Figure 8.6 (with priority policy
described earlier) to be stable are
λ1 /μ2 + λ3 /μ4 < 1,

λ1 /μ1 + λ3 /μ4 < 1, and

λ1 /μ2 + λ3 /μ3 < 1.
To illustrate that the network is stable if the conditions in Remark 19
are satisfied, we consider a set of numerical values different from those in
Problem 76. Although λ1 = λ3 = 1, we have μ1 = 2, μ2 = 2.5, μ3 = 20/11,
and μ4 = 20/9. Notice that the conditions in Remark 19 are satisfied. In
particular, similar to the numerical values in Problem 76, here too
λ1 /μ1 + λ3 /μ4 = 0.95   and   λ1 /μ2 + λ3 /μ3 = 0.95.
However,
λ1 /μ2 + λ3 /μ4 = 0.85.
For this set of numerical values we simulate the system and obtain the total
number of customers in this stable system over time in Figure 8.9. By con-
trasting with that of the unstable network in Figure 8.8, notice how the
number in the system does not blow up and keeps hitting zero from time
to time. Thus clearly the standard traffic intensity conditions are only necessary
but not sufficient. Having made a case for that, we present one final
example of an unstable network.
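The three conditions of Remark 19 are easy to check mechanically for any parameter set. The helper below (our own sketch; the function name is assumed, not from the text) evaluates them for both sets of numerical values used above.

```python
def ksrs_stability(lam1, lam3, mu1, mu2, mu3, mu4):
    """Check the three stability conditions of Remark 19 for the
    Kumar-Seidman-Rybko-Stolyar network with preemptive priority
    to class-4 at node A and class-2 at node B."""
    node_a = lam1 / mu1 + lam3 / mu4    # load on the server at node A
    node_b = lam1 / mu2 + lam3 / mu3    # load on the server at node B
    virtual = lam1 / mu2 + lam3 / mu4   # virtual station of classes 2 and 4
    return max(node_a, node_b, virtual) < 1

ksrs_stability(1, 1, 5, 10/7, 4, 4/3)      # -> False: virtual load is 1.45
ksrs_stability(1, 1, 2, 2.5, 20/11, 20/9)  # -> True: all three loads at most 0.95
```

Both parameter sets load each station at 0.95; only the second also keeps the virtual station below 1, matching the stable sample path in Figure 8.9.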
We present as a last example an FCFS network with reentrant lines. This
network is depicted in Figure 8.10 and is identical to the example considered
in Chen and Yao [19]. Chen and Yao [19] describe this network as a Bramson
network since it is a simplification of the network considered by Bramson
[13]. Like the previous examples here too we only describe the network in
terms of the average rates. Also, the deterministic and the stochastic versions
FIGURE 8.9
Number of jobs in a stable Kumar–Seidman–Rybko–Stolyar network.
are identical from a stability standpoint. Class-1 jobs enter node A externally
at an average rate of λ per unit time. They get served at node A for an average
time of 1/μ1 and then go to node B where they are called class-2. Class-2 jobs
get served for an average time of 1/μ2 and go for another round of service
at node B as class-3 jobs. Class-3 jobs take an average 1/μ3 time for service
and convert to class-4 jobs at the end of service. Class-4 jobs are also served
at node B at an average rate of μ4 per unit time. Upon service completion,
class-4 jobs convert to class-5 and get served at node A before exiting the
system. Average class-5 service time is 1/μ5 . There is a single server at node
A and a single server at node B, and each server uses the FCFS discipline
at its node. Notice that this is indeed a reentrant line. However, the
main difference is that the discipline here is FCFS (and not a priority scheme
as in the two previous examples).
FIGURE 8.10
Network with FCFS at all stations.
The stability conditions in terms of traffic intensities at nodes A and B are

λ/μ1 + λ/μ5 < 1   and   λ/μ2 + λ/μ3 + λ/μ4 < 1   (8.9)
Problem 77
Consider the network in Figure 8.10. Let the arrivals be according to a
Poisson process with mean rate λ = 1. Also all service times are exponen-
tially distributed with 1/μ1 = 0.02, 1/μ2 = 0.8, 1/μ3 = 0.05, 1/μ4 = 0.05, and
1/μ5 = 0.88. Verify that conditions in (8.9) are satisfied. Then simulate the
system for about 50,000 time units with an initially empty state to obtain the
number of jobs in each of the two nodes A and B over time.
Solution
It is relatively straightforward to verify that conditions in (8.9) are satisfied
since
λ/μ1 + λ/μ5 = 0.9 < 1   and   λ/μ2 + λ/μ3 + λ/μ4 = 0.9 < 1.
The reason for instability is very similar to that we saw in the previous
two problems, that is, Problems 75 and 76. However, the virtual station
FIGURE 8.11
Sample path for number in each node of FCFS network example. (a) Number of jobs in node A
versus t. (b) Number of jobs in node B versus t.
condition is a lot more subtle and hard to explain. However, here too the
traffic intensity conditions (8.9) are satisfied. But the servers end up idling
for longer than they can afford and then keep trying to catch up. As that
happens, the queues pile up, causing a cascading effect. Nonetheless, it is not easy
to write down explicit sufficient conditions for stability. As one would
expect, for larger networks it would be even more complicated to test for
stability using virtual stations. Hence we use fluid models to analyze
stability, which is the focus of the remainder of this section.
FIGURE 8.12
Total number in the entire system in the FCFS network example.
fluid network we will address only subsequently (and contrast it against the
notion of “weakly stable”). At this time we just say “stable” to not get dis-
tracted by technical details. There are some excellent texts and monographs
that interested readers are encouraged to consider for a fully rigorous treat-
ment of this subject. They include Dai [25], Meyn [82], Chen and Yao [19],
and Bramson [13], to name a few.
We first describe the network setting. It is crucial to realize that the nota-
tion is somewhat different from those in the previous chapters. The setting
as well as converting from a discrete network to a fluid network has been
adapted from Meyn [82]. Consider a network with many single-server nodes
or stations. Henceforth we will use the terms node and station interchange-
ably. There could be one or more queues or buffers at each station (we use the
terms buffers and queues interchangeably). The key difference in the nota-
tion in this section is that the flow, routing, and service are with respect to
the buffers and not the nodes unlike previous chapters. However, as always,
the flow in this network is discrete and stochastic in terms of arrivals and
service. But the routing from buffer to buffer is deterministic. Next we explic-
itly characterize these networks and describe the inputs for our analysis as
follows:
Problem 78
Consider a network with single servers in each node that has buffers as
depicted in Figure 8.13. There are three products that flow in the network.
The buffers have infinite size. One product has a deterministic route of
buffers 1, 2, and 3 before exiting the network; another goes through buffers
4, 5, and then 6; and the last one enters buffer-7, gets served, and exits after
being served at buffer-8. The external arrival rates and the service rates are
provided. Say the servers at each node use a preemptive priority policy giv-
ing highest priority to the shortest expected processing time among all types
of jobs waiting at its node. Assume that μi < μj if i < j. Can this system be
modeled using the network setting described earlier?
FIGURE 8.13
Network with nodes with multiple buffers and deterministic routes.
Solution
The system can be modeled using the network setting described as follows.
The network has N = 3 service stations (or nodes) called node 1, node 2, and
node 3. There is one server at each of the nodes 1, 2, and 3. There are ℓ = 8
buffers in the entire network and at least one in each node. Notice that ℓ > N.
Using the buffer numbers we can see that C11 = C13 = C17 = 1 since node 1 has
buffers 1, 3, and 7. Likewise, we have C22 = C24 = 1 and C35 = C36 = C38 = 1
for the same reason. Thus we have the C matrix as
C = [ 1 0 1 0 0 0 1 0
      0 1 0 1 0 0 0 0
      0 0 0 0 1 1 0 1 ].
The service rates at the buffers are specified in Figure 8.13. Also, the service
discipline is described in the problem statement (although we would not use
that here, we just verify that it is nonidling). There is infinite waiting room
at each buffer.
External arrival rate of customers into buffers 1, 2, 3, 4, 5, 6, 7, and 8 are
λ1 , 0, 0, λ4 , 0, 0, λ7 , and 0, respectively. The routing matrix from buffer to
buffer is given by
R = [ 0 1 0 0 0 0 0 0
      0 0 1 0 0 0 0 0
      0 0 0 0 0 0 0 0
      0 0 0 0 1 0 0 0
      0 0 0 0 0 1 0 0
      0 0 0 0 0 0 0 0
      0 0 0 0 0 0 0 1
      0 0 0 0 0 0 0 0 ].
network with deterministic routing and reentrant lines. Another way to con-
sider this network is that there is a resource constraint that forces each server
to work on multiple buffers. For example, each queue could be correspond-
ing to a buffer of a machine and each node an operator. So each operator is
responsible for a set of machines and the operator switches between jobs on
all the machines he or she is assigned to work on. Thus the whole node can
be thought of as either a machine or a resource.
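To make the setting concrete, the incidence matrix C and routing matrix R of Problem 78 can be encoded directly, and the effective arrival rate into each buffer, that is, the solution a of the flow balance a = λ + aR used later in the chapter, computed by iteration. This is our own sketch; the external rates chosen below are illustrative placeholders, not values from the text.

```python
# Node-buffer incidence matrix C (rows: nodes 1-3; columns: buffers 1-8)
C = [[1, 0, 1, 0, 0, 0, 1, 0],
     [0, 1, 0, 1, 0, 0, 0, 0],
     [0, 0, 0, 0, 1, 1, 0, 1]]

# Deterministic buffer-to-buffer routes: 1->2->3 (exit), 4->5->6 (exit), 7->8 (exit)
R = [[0] * 8 for _ in range(8)]
for i, j in [(1, 2), (2, 3), (4, 5), (5, 6), (7, 8)]:
    R[i - 1][j - 1] = 1

def effective_rates(lam, R):
    """Solve the flow balance a = lam + a R by repeated substitution;
    since the routes are acyclic, the iteration reaches a fixed point
    after at most len(lam) steps."""
    n = len(lam)
    a = lam[:]
    for _ in range(n):
        a = [lam[j] + sum(a[i] * R[i][j] for i in range(n)) for j in range(n)]
    return a

# Hypothetical external rates into buffers 1, 4, and 7 (others zero)
lam = [1.0, 0, 0, 1.5, 0, 0, 2.0, 0]
effective_rates(lam, R)   # -> [1.0, 1.0, 1.0, 1.5, 1.5, 1.5, 2.0, 2.0]
```

Each product's route carries its external rate unchanged through every buffer it visits, as one expects for deterministic routing with no merging flows.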
With that motivation, the next question to ask is how we convert
such a discrete and stochastic network into a fluid and deterministic network
by scaling it appropriately. It turns out that the procedure is relatively
straightforward: we decompose the network into individual buffers
and replace the discrete arrivals by fluids, with valves for emptying the buffers.
Thus, for the fluid model, all we need to state is the fluid entering rate and
emptying rate for every buffer at all times. That would specify our fluid
model. We explain next what a deterministic fluid model of a stochastic
discrete network looks like. For that we first consider a small example and
subsequently generalize it to any network. Recall the Rybko–
Stolyar–Kumar–Seidman network in Figure 8.6. Priority is given to buffer-4
at node A and buffer-2 at node B. The fluid model of that network would
be constructed in the following manner. Fluid would arrive at constant
rates λ1 and λ3 continuously into buffers 1 and 3, respectively. If buffer-2
is nonempty, then node B would drain it at rate μ2 . Notice that if buffer-2
is empty, that does not mean it is not getting any inputs; it is just that the
input rate is smaller than μ2 . So if buffer-2 is empty, then whatever capacity
buffer-2 is not using will be used to drain buffer-3. Likewise, at node A,
if buffer-4 is nonempty then all of the node’s capacity will be used to drain
buffer-4. However, if buffer-4 is empty, then the node will offer just the nec-
essary amount of capacity to buffer-4 to ensure it continues to be empty, and
the remaining capacity to drain out buffer-1.
We formalize that mathematically. Let ζj (t) be the processing capacity
allocated to buffer j for j = 1, 2, 3, 4 for the network in Figure 8.6 (we will
subsequently define ζj (t) more precisely for a generic network). For exam-
ple, if at time t node A is draining a nonempty buffer-4, then ζ4 (t) = 1
and ζ1 (t) = 0. However, at time t if buffer-4 is empty but it gets arrivals at
rate a4 and buffer-1 is nonempty, then ζ4 (t) = a4 /μ4 and ζ1 (t) = 1 − a4 /μ4
(we need the condition a4 /μ4 < 1 for buffer-4 to be empty). Finally if both
buffers 1 and 4 are empty at time t and arrival rates into them are λ1 and
a4 , respectively, then ζ4 (t) = a4 /μ4 and ζ1 (t) = λ1 /μ1 (we need the condition
λ1 /μ1 + a4 /μ4 < 1 for both buffers 1 and 4 to be empty). In a similar fash-
ion, one could consider node B and describe ζ2 (t) and ζ3 (t). With that one
could decompose the network into individual buffers and write down the
arrival as well as emptying rates for each buffer at time t, as described in
Table 8.1.
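The allocation rules just described can be turned into a small deterministic simulation by Euler integration. The sketch below is our own rough implementation (the function name, time step, sequential update of the two nodes, and the one-step lag in the node coupling are all our choices); it reproduces the qualitative behavior seen earlier: the Problem 76 rates leave a growing fluid level, while the stable rates from the Remark 19 illustration drain it.

```python
def ksrs_fluid(x0, lam1, lam3, mu, T, dt=0.001):
    """Euler integration of the KSRS fluid model with preemptive
    priority to buffer-4 at node A and buffer-2 at node B."""
    m1, m2, m3, m4 = mu
    x1, x2, x3, x4 = (float(v) for v in x0)
    z3 = 0.0  # node B's allocation to buffer-3 from the previous step
    for _ in range(int(T / dt)):
        # Node A: buffer-4 first; a4 is its inflow rate mu3 * zeta3
        a4 = m3 * z3
        if x4 > 0:
            z4, z1 = 1.0, 0.0
        else:
            z4 = min(a4 / m4, 1.0)
            z1 = (1.0 - z4) if x1 > 0 else min(lam1 / m1, 1.0 - z4)
        # Node B: buffer-2 first; a2 is its inflow rate mu1 * zeta1
        a2 = m1 * z1
        if x2 > 0:
            z2, z3 = 1.0, 0.0
        else:
            z2 = min(a2 / m2, 1.0)
            z3 = (1.0 - z2) if x3 > 0 else min(lam3 / m3, 1.0 - z2)
        # Fill/drain each buffer per the rates in Table 8.1, clamping at zero
        x1 = max(x1 + (lam1 - m1 * z1) * dt, 0.0)
        x2 = max(x2 + (m1 * z1 - m2 * z2) * dt, 0.0)
        x3 = max(x3 + (lam3 - m3 * z3) * dt, 0.0)
        x4 = max(x4 + (m3 * z3 - m4 * z4) * dt, 0.0)
    return x1 + x2 + x3 + x4
```

Starting from one unit of fluid in each buffer, the rates (5, 10/7, 4, 4/3) of Problem 76 leave a large and growing total at T = 50, whereas the rates (2, 2.5, 20/11, 20/9) satisfying Remark 19 empty the network.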
TABLE 8.1
Arrival and Drainage Rates in Fluid-Scaled Network
Buffer   Arrival Rate at Time t   Draining Rate at Time t
1        λ1                       μ1 ζ1 (t)
2        μ1 ζ1 (t)                μ2 ζ2 (t)
3        λ3                       μ3 ζ3 (t)
4        μ3 ζ3 (t)                μ4 ζ4 (t)

Therefore, notice that it is relatively straightforward to convert a discrete
stochastic network into a fluid deterministic one. Now we formalize
that for a generic discrete stochastic network with N nodes, ℓ buffers, node-
buffer incidence matrix C, and buffer-to-buffer routing matrix R. Also, for all
j ∈ {1, . . . , ℓ}, λj is the external average arrival rate into buffer j and 1/μj is the
average service time for a job in buffer j. This can be converted into a fluid
deterministic network and decomposed into individual buffers so that all we
need to specify is the input and drainage rate of each buffer. To explain the
conversion process, we use some extra notation to keep the presentation less
cumbersome. Let Ji be the set of buffers in node i, that is, Ji = {j : Cij = 1} for
all i ∈ {1, . . . , N}. In Figure 8.13, for example, J1 = {1, 3, 7}, J2 = {2, 4}, and
J3 = {5, 6, 8}. Likewise, let s(j) be the node where buffer j resides. Again,
in the example in Figure 8.13, s(3) = 1 since buffer-3 is in node 1, and
s(8) = 3 since buffer-8 is in node 3. This gives us a mapping between buffers
and nodes.
For a given buffer j such that j ∈ {1, . . . , ℓ}, let zj(t) be the cumulative time
allocated by node s(j) to process buffer j in time (0, t]. For all t ≥ 0, let ζj(t) be
the right derivative of zj(t), written as

ζj(t) = d⁺zj(t)/dt.

Since a node cannot allocate more than its full capacity, we require

Σ_{j ∈ Ji} ζj(t) ≤ 1

for all i ∈ {1, . . . , N}. This inequality would be an equality if at least one of
the buffers in the set Ji of node i is nonempty.
Stability, Fluid Approximations, and Non-stationary Queues 473
Xj(nt)/n → xj(t) as n → ∞.
This is what we meant in the previous paragraph: not only do the arrival
and service processes converge to their fluid limits, but so does the number in
each buffer. Notice that xj(t) is a deterministic quantity. In some articles xj(t)
is also written as X̄j(t) to specifically denote the fluid limit. Next we define
various “degrees” of stability for the discrete as well as the fluid network in
terms of Xj(t) and xj(t) for all j ∈ {1, . . . , ℓ}.
For the discrete stochastic network, we define two “degrees” of stability
as follows:
• Stable: A discrete stochastic network is called stable if Σj Xj(t) < ∞
for all t, especially as t → ∞. For that one typically shows that the
stochastic process {X(t), t ≥ 0} is positive Harris recurrent, where
X(t) = (X1(t), . . . , Xℓ(t)).
• Rate stable: A discrete stochastic network is called rate-stable if for
every buffer j, the steady-state departure rate equals the steady-
state “effective” arrival rate obtained by solving the flow balance.
To mathematically state that, let Dj (t) be the number of jobs that
depart buffer j in time (0, t]. Also let a = [a1 . . . aℓ] be a row vector
The reason we presented the degrees of stability for the discrete and fluid
networks in a “corresponding” fashion is that as the title of this section states,
the discrete network is stable if the fluid network is stable in a corresponding
manner. We formalize this in the next remark.
Remark 20
If the fluid deterministic network is weakly stable, then the discrete stochastic
network is rate stable. Also, if the fluid deterministic network is stable, then
the discrete stochastic network is positive Harris recurrent (hence stable).
To explain this remark as well as stability notions, let us consider the sim-
plest example of a single buffer on a single node, as done in Section 8.1. The
deterministic fluid model has an inflow rate λ and an orifice capacity μ. If
λ < μ, no matter how much fluid there was in the system initially, as long
as it was finite, the buffer would empty in a finite time. Therefore, the fluid
model is stable if λ < μ. Remark 20 states that if the fluid model is stable then
the original discrete queue is stable. This can be easily verified because we
know that the discrete stochastic system is stable (or positive Harris recur-
rent) if λ < μ. Now if λ = μ, the fluid queue would remain at the initial level
at all times. Thus if there is a nonzero initial fluid level, then the time to
empty is infinite. But if the initial fluid level is zero, then it would remain
zero throughout. Hence when λ = μ, the queue is only weakly stable but not
stable. Thus when λ = μ the discrete queue is only rate stable. Of course if
λ > μ the fluid queue is unstable and so is the discrete queue.
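The single-buffer fluid argument above reduces to a two-line computation (a sketch of ours with illustrative rates; not code from the book):

```python
def fluid_level(x0, lam, mu, t):
    """Fluid level at time t: inflow rate lam, drain capacity mu, initial level x0.
    The level changes at net rate lam - mu until it hits zero, then stays at zero."""
    return max(0.0, x0 + (lam - mu) * t)

def time_to_empty(x0, lam, mu):
    """Finite iff lam < mu (stable fluid model); infinite when lam >= mu and x0 > 0."""
    return x0 / (mu - lam) if lam < mu else float("inf")

print(time_to_empty(10.0, 1.0, 1.25))   # stable case: empties at t = 40
print(time_to_empty(10.0, 1.0, 1.0))    # lam = mu: weakly stable only -> inf
```

The λ = μ branch returning infinity is exactly the weakly-stable-but-not-stable case discussed above.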
Remark 21
The corresponding fluid networks in Problems 75, 76, and 77 are all weakly
stable since if we started with an empty system they would remain empty.
But they are not stable since if we have a finite nonzero amount of fluid in
the buffer initially, then the time to empty becomes infinite. As evident from
the simulations, the discrete stochastic networks are not positive Harris
recurrent, but they are all rate stable.
Usually, if these necessary conditions are satisfied, then the fluid model is
at least weakly stable with allocation rates ζj(t) = ρj at buffer j for all t. But
those conditions may not be sufficient to ensure stability (beyond weak stability).
We will address the sufficient conditions later, but first we explain the
necessary conditions with an example problem.
Problem 79
Consider the networks in Figures 8.3, 8.6, and 8.10. For all three networks,
derive the necessary conditions for stability that would result in the fluid
models being at least weakly stable, if not stable.
Solution
For each of the Figures 8.3, 8.6, and 8.10 using their respective R and C, as
well as λj and μj values for each buffer j, we derive the conditions for the
fluid models to be weakly stable in the following manner.
        ⎡ 0 1 0 0 0 ⎤
        ⎢ 0 0 1 0 0 ⎥
    R = ⎢ 0 0 0 1 0 ⎥
        ⎢ 0 0 0 0 1 ⎥
        ⎣ 0 0 0 0 0 ⎦

and

    C = ⎡ 1 0 1 0 1 ⎤
        ⎣ 0 1 0 1 0 ⎦ .
λ/μ1 + λ/μ3 + λ/μ5 < 1 and λ/μ2 + λ/μ4 < 1.
The effective arrival rate vector is a = λ(I − R)^{−1} = [λ1 λ1 λ3 λ3]. Thus
we can obtain ρ = [λ1/μ1 λ1/μ2 λ3/μ3 λ3/μ4]. The conditions for the fluid model
to be at least weakly stable are Cρᵀ < ê, which results in

λ1/μ1 + λ3/μ4 < 1 and λ1/μ2 + λ3/μ3 < 1.
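This computation mechanizes easily; the sketch below redoes it with numpy (the numeric rates are our own choices — only the structure, routes 1→2 and 3→4 with node A serving buffers 1 and 4 and node B serving buffers 2 and 3, follows the text):

```python
import numpy as np

R = np.array([[0, 1, 0, 0],      # buffer 1 feeds buffer 2
              [0, 0, 0, 0],
              [0, 0, 0, 1],      # buffer 3 feeds buffer 4
              [0, 0, 0, 0]], dtype=float)
C = np.array([[1, 0, 0, 1],      # node A serves buffers 1 and 4
              [0, 1, 1, 0]])     # node B serves buffers 2 and 3
lam = np.array([1.0, 0.0, 1.0, 0.0])    # external arrivals to buffers 1 and 3 only
mu = np.array([4.0, 3.0, 4.0, 3.0])

a = lam @ np.linalg.inv(np.eye(4) - R)  # effective rates: [lam1, lam1, lam3, lam3]
rho = a / mu
weakly_stable = bool(np.all(C @ rho < 1))   # necessary conditions C rho^T < e
print(a, np.round(rho, 4), weakly_stable)
```

For these rates, Cρᵀ = [0.5833, 0.5833], so both necessary conditions hold.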
For example, if we started with an initial fluid level of 1 in buffer-1 and zero in all other buffers, then the
fluid system would never empty if λ1/μ2 + λ3/μ4 > 1. In fact we will
show subsequently that the sufficient condition to ensure stability is
λ1/μ2 + λ3/μ4 < 1.
• For the network in Figure 8.10, we have N = 2, ℓ = 5, λ1 = λ, and
λ2 = λ3 = λ4 = λ5 = 0. Thus λ = [λ 0 0 0 0]. Also, the routing matrix is

        ⎡ 0 1 0 0 0 ⎤
        ⎢ 0 0 1 0 0 ⎥
    R = ⎢ 0 0 0 1 0 ⎥
        ⎢ 0 0 0 0 1 ⎥
        ⎣ 0 0 0 0 0 ⎦

Proceeding as earlier, the conditions for the fluid model to be at least
weakly stable are

λ/μ1 + λ/μ5 < 1 and λ/μ2 + λ/μ3 + λ/μ4 < 1.
Having described the necessary conditions for stability, our next goal is
to obtain the sufficient conditions. Unfortunately, unlike the necessary con-
ditions, the sufficient conditions cannot be stated in a generic fashion and
would have to be addressed on a case-by-case basis. However, knowledge
of the dynamics of the network would certainly aid in the process of obtain-
ing some conditions and all we need to do is to check if those conditions are
sufficient. As an example, recall the virtual station conditions described in
Section 8.2.1. Are those virtual station conditions sufficient to ensure stability
or would more conditions be needed? To answer that question, we consider
a specific example, namely the Kumar–Seidman–Rybko–Stolyar network in Figure 8.6.
λ1/μ1 + λ3/μ4 < 1 and λ1/μ2 + λ3/μ3 < 1
otherwise the condition λ1 /μ2 + λ3 /μ4 < 1 would always be satisfied if the
necessary conditions are satisfied. That is because if μ1 ≤ μ2 then λ1 /μ2 +
λ3 /μ4 ≤ λ1 /μ1 + λ3 /μ4 < 1 (similarly when μ3 ≤ μ4 ). Hence we make that
assumption to avoid the trivial solution.
Without loss of generality we assume that all four buffers are nonempty
initially with the understanding that other cases can be handled in a simi-
lar fashion. Node 2 would drain buffer-2 at rate μ2 and node 1 would drain
buffer-4 at rate μ4 since buffers 2 and 4 have priority. This would continue
until one of buffers 2 or 4 becomes empty. Say that is buffer-4 (the argu-
ment would not be different if it was buffer-2). Now that buffer-4 is empty
and is not receiving any inputs from buffer-3 to process, buffer-1 can now
be drained at rate μ1 and at the same time buffer-2 is being drained at μ2 .
Notice that buffers 1 and 3 have been getting input fluids at rates λ1 and λ3 ,
respectively, since time t = 0. Also currently, buffer-2 is getting input at rate
μ1 . Since μ2 < μ1 , contents in buffer-2 would only grow while that in buffer-1
would shrink until buffer-1 becomes empty. Now we have buffers 1 and 4
empty and the other two nonempty.
However, since buffer-1 is empty and gets input at rate λ1, its departure rate is also λ1. Since
buffer-2 now has a smaller input rate than its output rate, it will drain out all its fluid
and become empty. Thus buffers 1, 2, and 4 are now empty. Now buffer-
3 would start draining at rate μ3 . Since μ3 > μ4 , buffer-4 would now start
building up. Because of that buffer-1 would stop draining and it would also
start accumulating. But buffer-2 would continue to remain empty. Thus the
next event is buffer-3 would empty out. At this time buffers 1 and 4 would
be nonempty. But buffer-4 would now receive input only at rate λ3 from
buffer-3, which would result in buffer-4 draining out but buffer-1 would
continue building up. This would continue till buffer-4 becomes empty at
which time the only nonempty buffer would be buffer-1. At this time buffer-
1 would start draining at rate μ1 into buffer-2 which in turn would drain
at a slower rate μ2 . Thus buffer-1 would drain out, buffer-4 would remain
empty, while buffers 2 and 3 would accumulate. This would continue till
FIGURE 8.14
Cycling through buffer conditions in fluid model of Rybko–Stolyar–Kumar–Seidman network.
buffer-1 becomes empty. Thus buffers 2 and 3 are nonempty while buffers
1 and 4 are empty. This is the same situation as the beginning of this para-
graph. In essence this process would cycle through until all buffers empty,
as depicted in Figure 8.14.
Notice that irrespective of the initial finite amount of fluid in the four
buffers, the system would reach one of the four conditions in Figure 8.14.
Then it would cycle through them. A natural question to ask is: would the
cycle continue indefinitely or would it eventually lead to an empty system
and stay that way? Since this is a deterministic system, if we could show that
if in every cycle the total amount of fluid strictly reduces, then the system
would eventually converge to an empty one. Therefore, all we need to show
is if we started in one of the four conditions in Figure 8.14, then the next time
we reach it there would be lesser fluid in the system. Say we start in the state
where buffer-3 is nonempty (with a units of fluid) and buffers 1, 2, and 4
are empty. If we show that the next time we reach that same situation, the
amount of fluid in buffer-3 would be strictly less than a, then the condition
that enables that would be sufficient for the fluid model to be stable. Using
that argument we present the next problem which can be used to show that
the condition λ1 /μ2 +λ3 /μ4 < 1 is sufficient for the fluid model of the network
in Figure 8.6 to be stable.
Problem 80
Consider the fluid model of the network in Figure 8.6. Assume that the
necessary conditions λ1 /μ1 + λ3 /μ4 < 1 and λ1 /μ2 + λ3 /μ3 < 1 are satis-
fied. Also assume that μ1 > μ2 and μ3 > μ4 . Let the initial fluid levels be
x1 (0) = x2 (0) = x4 (0) = 0 and x3 (0) = a for some a > 0. Further, let T be the first
passage time defined as the next time that buffers 1, 2, and 4 are empty and
buffer-3 is nonempty.
λ3(T − t2) < a
⇒ λ3λ1t2/(μ2 − λ1) < a,
⇒ λ3λ1t1(μ3 − λ3)/((μ2 − λ1)(μ4 − λ3)) < a,
⇒ λ3λ1a/((μ2 − λ1)(μ4 − λ3)) < a.
If we cancel out a, which is positive, on both sides and rewrite the expression,
we get

λ3λ1/((μ2 − λ1)(μ4 − λ3)) < 1,

which is equivalent to λ1/μ2 + λ3/μ4 < 1. Furthermore, the first passage time is

T = αa

where

α = μ2/((μ2 − λ1)(μ4 − λ3))

for any a > 0. So if we started with a amount of fluid in buffer-3 and all other
buffers empty, then after time T = αa we would have βa amount of fluid in
buffer-3 and all other buffers empty, where

β = λ3λ1/((μ2 − λ1)(μ4 − λ3)).
Now if we started with βa, then after time αβa we would have β2 a amount
of fluid in buffer-3 and all other buffers empty. In this manner if we were to
continue, then the total time to empty the system (that started with a amount
of fluid in buffer-3 and all other buffers empty) is
αa + βαa + β²αa + β³αa + · · · = αa/(1 − β) = μ2a/(μ2μ4 − λ1μ4 − λ3μ2).
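The geometric-series bookkeeping above is easy to verify numerically (the rates below are our own illustrative choices, picked to satisfy the stated assumptions μ1 > μ2, μ3 > μ4 and λ1/μ2 + λ3/μ4 < 1):

```python
lam1, lam3 = 1.0, 1.0
mu1, mu2, mu3, mu4 = 4.0, 3.0, 4.0, 3.0

alpha = mu2 / ((mu2 - lam1) * (mu4 - lam3))
beta = lam1 * lam3 / ((mu2 - lam1) * (mu4 - lam3))
assert lam1 / mu2 + lam3 / mu4 < 1 and beta < 1   # sufficient condition holds

a = 5.0
series_total = alpha * a / (1 - beta)             # alpha*a + beta*alpha*a + ...
closed_form = mu2 * a / (mu2 * mu4 - lam1 * mu4 - lam3 * mu2)
print(series_total, closed_form)                  # the two expressions agree
```

Note that β < 1 is exactly the condition λ1/μ2 + λ3/μ4 < 1 after cross-multiplying.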
This shows that if the condition λ1 /μ2 + λ3 /μ4 < 1 is satisfied, then the
amount of fluid in the system converges to zero in a finite time. Thus we
can see that if we started with some arbitrary amount of fluid x1 (0), x2 (0),
x3 (0), and x4 (0) in buffers 1, 2, 3, and 4, respectively, then there exists a finite
time δ after which the system would remain empty. Therefore, under that
condition the fluid network is stable. That guarantees that the correspond-
ing stochastic discrete network originally depicted in Figure 8.6 would also
be stable. In a similar manner one could derive the sufficient conditions for
stability of other deterministic fluid networks and thereby the corresponding
stochastic discrete network on a case-by-case basis.
Having said that, it is important to point out that there are other ways
to derive the conditions for stability. In particular, Lyapunov functions pro-
vide an excellent way to check if fluid networks are stable. Although we
do not go into details of Lyapunov functions in this book, it is worthwhile
describing them for the sake of completeness. Lyapunov functions have been
used extensively to study the stability of deterministic dynamical systems.
out that they are all based on diffusion approximations, which is the main
technique we consider in this section. What is interesting is that in many
situations it is more appealing to use the G/G/s approximation than the exact
result even for an M/M/s queue! The reason is that the exact M/M/s result
is not easy to use. For example, if one were to design (or control) the number
of servers, the mean sojourn time formula is such a complicated expression in
terms of s that one would rather use the simpler G/G/s approximation. To
add to the mix if we were to also consider abandonments, retrials, and server
breakdowns, diffusion approximations may be the only alternative even for
Markovian systems.
With that motivation in the next few sections we present a brief intro-
duction to diffusion approximations without delving into great detail with
respect to all the technical aspects. There is a rich literature with some
excellent books and articles on this topic. The objective of this section is to
merely provide a framework, perhaps some intuition and also fundamental
background for the readers to access the vast literature on diffusion approxi-
mation (which is also sometimes referred to as heavy-traffic approximations
especially in queues). For technical details on weak convergence, which is
the foundation of diffusion approximations, readers are referred to Glynn
[46] and Whitt [105]. We merely present the scaling procedure which results
in what is called diffusion limit. Similar to the fluid limit we presented in
Section 8.1, next we present the diffusion limit which is based on Chen
and Yao [19]. Subsequently, we will describe diffusion approximations for
multiserver queues.
λ = lim_{t→∞} A(t)/t.
Recall that to obtain the fluid limit of the discrete arrival process {A(t), t ≥ 0},
we defined An (t) as
An(t) = A(nt)/n.

Now define the diffusion-scaled version

Ân(t) = √n (An(t) − λt) = (A(nt) − nλt)/√n
for any n > 0 and t ≥ 0. We would like to study Ân (t) as n → ∞ which we
will call the diffusion scaling (because the resulting process is a diffusion
process). We first illustrate the diffusion scaling using the same example as
in Section 8.1. Recall that to illustrate the strength of the results we con-
sider (i) an arrival process with an extremely high coefficient of variation,
(ii) a fairly small t, and (iii) analyze arrivals to the second node of a tandem
network (hence arrivals are not IID).
Problem 81
Consider a G/G/1 queue with interarrival times as well as service times
according to Pareto distributions. The coefficient of variation for interarrival
times is 5 and for the service time it is equal to 2. The average arrival rate is
1 per unit time and the average service rate is 1.25 per unit time. The depar-
tures from this queue act as arrivals to a downstream queue. Let A(t) be the
number of entities that arrive at the downstream node during (0, t]. For t = 0
to 10 time units, graph three sample paths of Ân(t) = (A(nt) − nλt)/√n versus t for
n = 1, 10, 100, and 1000.
Solution
It is crucial to note that the A(t) process is the arrivals to the downstream
node which is the same as the departures from the G/G/1 node described
in the question. Also the average arrival rate is λ = 1. By writing a simula-
tion using the algorithm in Problem 37 in Chapter 4, we can obtain sample
paths of the output process from the G/G/1 queue, in particular the number
of departures during any interval of time. Using this for various values of
n = 1, 10, 100, and 1000, we can plot three sample paths of Ân(t) = (A(nt) − nλt)/√n
versus t, as shown in Figure 8.15(a)–(d).
From the figure, note that in (a) where n = 1, the sample paths are similar
to the workload process with jumps and constant declining sample paths,
except the values go below zero. When n = 10 as seen in (b), the sample paths
are still similar but they are closer than in case (a) because we have about
100 arrivals as opposed to 10 in case (a). We see this trend more prominent
in case (c) where the sample paths have closed in and the jumps are not so
prominent. Finally in (d) when n = 1000 which for this example is sufficiently
large, the sample paths essentially look like Brownian motions. There are a
couple of things to notice. Unlike the fluid limits, the diffusion limit does not
go to a deterministic value but it appears to be a normal random variable
(and the whole process converges to a Brownian motion). Also, the range
FIGURE 8.15
Sample paths of scaled arrival process Ân (t). (a) n = 1. (b) n = 10. (c) n = 100. (d) n = 1000.
of values for all four cases is more or less the same. In other words, the
variability does not appear to depend on n.
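The scaling behavior seen in the figure can be reproduced for a plain renewal process (a simulation sketch of ours: we substitute gamma interarrivals with a chosen SCOV for the Pareto tandem setup of the problem, and check the marginal distribution of Ân(t) at one time point):

```python
import numpy as np

def scaled_count(rng, lam, scv, n, t):
    """One sample of A_hat_n(t) = (A(nt) - n*lam*t)/sqrt(n) for a renewal
    process with rate lam and interarrival SCOV scv (gamma interarrivals)."""
    k, scale = 1.0 / scv, scv / lam     # gamma shape/scale: mean 1/lam, SCOV scv
    horizon = n * t
    m = int(lam * horizon + 10.0 * np.sqrt(scv * lam * horizon) + 100)
    arrivals = np.cumsum(rng.gamma(k, scale, size=m))
    count = np.searchsorted(arrivals, horizon)   # A(nt)
    return (count - n * lam * t) / np.sqrt(n)

rng = np.random.default_rng(0)
lam, scv, n, t = 1.0, 4.0, 400, 5.0
samples = np.array([scaled_count(rng, lam, scv, n, t) for _ in range(4000)])
# FCLT prediction: A_hat_n(t) is approximately Normal(0, lam * scv * t) for large n
print(samples.mean(), samples.var(), lam * scv * t)
```

The sample mean comes out near 0 and the sample variance near λC²a t = 20, independent of n once n is large — matching the observation above that the variability does not depend on n.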
While this result appears to be reasonable for a general arrival process, for
the description of the key results we only consider renewal processes (refer
to Whitt [105] for a rigorous description in the more general case of arrival
processes, not necessarily renewal). Now, let {A(t), t ≥ 0} be a renewal pro-
cess with average interrenewal time 1/λ and squared coefficient of variation
of interarrival time C2a . We know that for a large n and any t, A(nt) would be
approximately normally distributed with mean λnt and variance λC2a nt. Thus
for a large n, Ân(t) = (A(nt) − nλt)/√n would be normally distributed with mean 0
and variance λC2a t. Also, based on the previous example, we can conjecture
that the stochastic process {Ân (t), t ≥ 0} converges to a Brownian motion with
drift 0 and variance term λC2a as n → ∞. The theory that supports this conjec-
ture is an extension of the well-known central limit theorem to functionals.
Sn = Z1 + Z2 + · · · + Zn

for any n ≥ 1. The central limit theorem essentially states that as n → ∞,
(Sn − mn)/(σ√n) converges to a standard normal random variable. In practice, for large n
one approximates Sn as a normal random variable with mean nm and vari-
ance nσ2 . Donsker’s theorem essentially generalizes this to functionals thus
resulting in the FCLT. Define Yn (t) as
Yn(t) = (S⌊nt⌋ − m⌊nt⌋)/(σ√n) = (1/(σ√n)) Σ_{i=1}^{⌊nt⌋} (Zi − m)
Rn(t) = (N(nt) − nt/m)/((σ/m)√(n/m))
for any t ≥ 0. Chen and Yao [19] show that by applying Donsker’s theorem
and random change theorem as n → ∞, the stochastic process {Rn (t), t ≥ 0}
also converges to the standard Brownian motion. To develop an intuition
it may be worthwhile to show that for large t, N(t) is a normal random
variable with mean t/m and variance σ2 t/m3 (see Exercises at the end of
the chapter). In summary, if {B(t), t ≥ 0} is a Brownian motion with drift 0
and variance term 1, then as n → ∞, {Rn (t), t ≥ 0} converges in distribution
to {B(t), t ≥ 0}. Now, we put this in perspective with respect to the arrival
process {A(t), t ≥ 0} which is a renewal process with average interrenewal
time 1/λ and squared coefficient of variation of interarrival time C2a . Using
the preceding result we can verify our conjecture that {Ân (t), t ≥ 0} defined
earlier converges to a Brownian motion with drift 0 and variance term λC2a
as n → ∞.
It is not difficult to see that similar to the arrival process, the service
time process when scaled in a similar fashion also converges to a Brownian
motion. Thus the next natural step is to use the results in a G/G/1 setting
where the average arrival rate is λ and the SCOV of the interarrival times
is C2a , and the service rate is μ with service time SCOV C2s . The analysis
would be identical to that in Section 7.1.1. There we showed the results
using the normal approximation which would follow in a very similar fash-
ion, albeit more rigorous, if we modeled the underlying stochastic processes
as Brownian motions. For sake of completeness we simply restate those
results here. As the traffic intensity ρ (recall that ρ = λ/μ) approaches 1, the
workload in the system converges to a reflected Brownian motion with drift
(λ − μ)/μ and variance term λ(C²a + C²s)/μ². Thus the steady-state distribution
of the workload is exponential with parameter γ per unit time, where

γ = 2(1 − ρ)μ² / (λ(C²a + C²s)).

Since an arriving customer in steady state would wait for a
time equal to the workload for service to begin, the waiting time before
service is also distributed as exp(γ) when ρ ≈ 1.
Although we did not explicitly state it in Section 7.1.1, this is an extremely
useful result. For example, we could answer questions such as: what is the
probability that the service for an arriving customer would begin within
the next t0 time (answer: 1 − e−γt0 ). This is also extremely useful in design-
ing the system. For example if the quality-of-service metric is that not more
than 5% of the customers must wait longer than 5 time units (e.g., minutes),
then we can write that constraint as e−5γ ≤ 0.05. Thus it is possible to obtain
approximate expressions for the distribution of waiting times and sojourn
times using the diffusion approximation when the traffic intensity is close
to one (for that reason these approximations are also referred to as heavy-
traffic approximations). That said, in the next two sections we will explore
the use of diffusion approximations in multiserver queue settings. However,
the approach, scaling, and analysis are significantly different from what was
considered for the G/G/1 case.
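The heavy-traffic waiting-time approximation above translates into a few lines of code (a sketch of ours; the parameter values are illustrative, and the comparison against the exact M/M/1 rate is our own sanity check):

```python
import math

def gamma_param(lam, mu, ca2, cs2):
    """Exponential parameter of the heavy-traffic (diffusion) approximation to
    the steady-state waiting time before service in a G/G/1 queue."""
    rho = lam / mu
    return 2.0 * (1.0 - rho) * mu**2 / (lam * (ca2 + cs2))

lam, mu, ca2, cs2 = 0.95, 1.0, 1.0, 1.0   # nearly saturated M/M/1-like case
g = gamma_param(lam, mu, ca2, cs2)
print(g)                      # compare: the exact M/M/1 rate is mu - lam = 0.05
print(math.exp(-5.0 * g))     # P(wait > 5), the kind of QoS constraint above
```

With ρ = 0.95 the approximate rate 0.0526 is already close to the exact M/M/1 value 0.05, illustrating why the approximation is trusted near saturation.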
scaled by a factor “n” across time and “√n” across “space” so that we define
Ẑn(t) as

Ẑn(t) = (Z(nt) − Z̄(nt))/√n
for any n > 0 and t ≥ 0. The term Z̄(nt) is the deterministic fluid model
(potentially different from the fluid scaling we saw earlier in this chapter)
of the stochastic process {Z(t), t ≥ 0}. Usually, Z̄(nt) = E[Z(nt)] or a heuristic
approximation for it. However, if that is not possible, then the usual fluid
scaling (via a completely different scale) can be applied, that is, Z̄(nt) is taken
to be the usual fluid limit, where we let the fluid scale go to ∞.
Assuming that the deterministic fluid model Z̄(nt) can be computed,
the main objective here is to study Ẑn(t). In particular, by applying the
scaling “n,” the aim is to show that as n → ∞, the stochastic process
{Ẑn (t), t ≥ 0} converges to a diffusion process (that is the reason this method
is called diffusion approximation or diffusion scaling). A diffusion process is
a continuous-time stochastic process with almost surely continuous sample
paths and satisfies the Markov property. Examples of diffusion processes
are Brownian motion, Ornstein–Uhlenbeck process, Brownian bridge pro-
cess, branching process, etc. It is beyond the scope of this book to show the
convergence of the stochastic process {Ẑn (t), t ≥ 0} as n → ∞ to a diffusion
process {Ẑ∞ (t), t ≥ 0}. However, we do provide an intuition and interested
readers are referred to Whitt [105] for technical details. The key idea of diffu-
sion approximation is to start by using the properties of {Ẑ∞ (t), t ≥ 0}, such as
the distribution of Ẑ∞ (∞). Then for large n, Ẑn (∞) is approximately equal in
distribution to Ẑ∞ (∞). Thereby, we can approximately obtain a distribution
for Z(∞) using
Z(∞) = Z̄(∞) + √n Ẑn(∞)
interested readers are referred to the literature, especially Whitt [105], for the
G/G/s case. In M/M/s queues, the Markov property leads to diffusion processes;
however, in the G/G/s case, although the marginal distribution at any time in
steady state converges to Gaussian, the process itself may not be a diffusion
(since the Markov property would not be satisfied). Nonetheless there is merit in
considering the M/M/s case. Thus for the remainder of this section we only
consider M/M/s queues, that is, Poisson arrivals (at rate λ) and exponential
service times (with mean 1/μ at every server).
For such an M/M/s queue, let X(t) be the number of customers in the sys-
tem at time t. We are interested in applying diffusion scaling to the stochastic
process {X(t), t ≥ 0}. Further, define X̂n (t) as
X̂n(t) = (X(nt) − X̄(nt))/√n
L = λ/μ + p0 (λ/μ)^s λ / (s! sμ [1 − λ/(sμ)]²)

where

p0 = [ Σ_{n=0}^{s−1} (λ/μ)^n/n! + ((λ/μ)^s/s!) (1/(1 − λ/(sμ))) ]^{−1}.
X̂n(t) = (X(nt) − L)/√n
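The exact M/M/s quantities just displayed can be evaluated directly; this sketch computes p0 and L from the formulas above (the function name is ours, and the parameter values echo the λ = 3.6, μ = 1, s = 4 case used in Figure 8.16):

```python
from math import factorial

def mms_p0_L(lam, mu, s):
    """Steady-state p0 and mean number in system L for M/M/s (requires lam < s*mu)."""
    r = lam / mu                  # offered load
    rho = lam / (s * mu)          # traffic intensity
    p0 = 1.0 / (sum(r**n / factorial(n) for n in range(s))
                + r**s / (factorial(s) * (1.0 - rho)))
    L = r + p0 * r**s * rho / (factorial(s) * (1.0 - rho)**2)
    return p0, L

print(mms_p0_L(3.6, 1.0, 4))   # lambda = 3.6, mu = 1, s = 4 (rho = 0.9)
```

As a sanity check, with s = 1 the function reduces to the familiar M/M/1 values p0 = 1 − ρ and L = ρ/(1 − ρ).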
FIGURE 8.16
Sample paths of scaled process X̂n (t/n) vs. t for μ = 1 and s = 4. (a) λ = 3.2, ρ = 0.8, n = 25.
(b) λ = 3.6, ρ = 0.9, n = 100. (c) λ = 3.8, ρ = 0.95, n = 400. (d) λ = 3.96, ρ = 0.99, n = 10, 000.
X̂n(t/n) = (X(t) − L)/√n
versus t for various increasing values of λ in Figure 8.16. Notice that the
diffusion-scaled process is a little different and not scaled across time (we use
t/n as opposed to t). From Figure 8.16 it is clear that the {X̂n(t/n), t ≥ 0} process
converges to a diffusion process as n is scaled. This would be more pow-
erful if we were to have scaled time as well, that is, plotted X̂n (t) instead of
X̂n (t/n). One could use this scaling when the system has high traffic intensity
but not a large number of servers.
FIGURE 8.17
Sample paths of scaled process X̂n (t/n) vs. t for μ = 1 and ρ = 0.9. (a) λ = 3.6, s = 4, n = 4.
(b) λ = 14.4, s = 16, n = 16. (c) λ = 36, s = 40, n = 40. (d) λ = 90, s = 100, n = 100.
X̂n(t/n) = (X(t) − L)/√n
versus t for various increasing values of λ and s in Figure 8.17. Notice that
the diffusion-scaled process is a little different and not scaled across time
(we use t/n as opposed to t). From Figure 8.17 it is clear that the {X̂n(t/n), t ≥ 0}
process converges to a diffusion process as n is scaled. This would be more
powerful if we were to have scaled time as well, that is, plotted X̂n (t) instead
of X̂n (t/n). One could use this scaling when the system has a large number
of servers but not a very high traffic intensity.
we consider the Halfin–Whitt regime (due to Halfin and Whitt [50]) in which
ρ → 1 but β is held a constant where
β = (1 − ρ)√s.
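One common reading of holding β fixed (our gloss, not stated in the text here) is square-root safety staffing: with offered load R = λ/μ, the server count satisfies s = R + β√s ≈ R + β√R for large s. The (λ, s) quadruples used in Figure 8.18 do keep β constant, as a quick check confirms:

```python
import math

def halfin_whitt_beta(lam, mu, s):
    """beta = (1 - rho) * sqrt(s) with rho = lam / (s * mu)."""
    return (1.0 - lam / (s * mu)) * math.sqrt(s)

# (lambda, s) pairs from Figure 8.18, with mu = 1: beta stays at 0.2 throughout
for lam, s in [(0.8, 1), (3.6, 4), (15.2, 16), (62.4, 64)]:
    print(s, round(halfin_whitt_beta(lam, 1.0, s), 10))
```

Every pair yields β = 0.2 even as ρ climbs from 0.8 to 0.975, which is precisely the Halfin–Whitt scaling.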
We use the scale n = s (the choice of n = 1/(1 − ρ)2 would have also worked)
so that n increases as s increases. We plot
X̂n(t/n) = (X(t) − L)/√n
versus t for various increasing values of λ and s in Figure 8.18. Notice that the
diffusion-scaled process is a little different and not scaled across time (we use
t/n as opposed to t). From Figure 8.18 it is clear that the {X̂n(t/n), t ≥ 0} process
converges to a diffusion process as n is scaled. This would be more powerful
if we were to have scaled time as well, that is, plotted X̂n(t) instead of
X̂n(t/n). One could use this scaling when the system has both high traffic
intensity and a large number of servers.
FIGURE 8.18
Sample paths of scaled process X̂n (t/n) vs. t for μ = 1 and β = 0.2. (a) λ = 0.8, ρ = 0.8, n = s = 1.
(b) λ = 3.6, ρ = 0.9, n = s = 4. (c) λ = 15.2, ρ = 0.95, n = s = 16. (d) λ = 62.4, ρ = 0.975, n = s = 64.
X̂n(t) = (X(nt) − s)/√n
P{X(∞) ≥ s} → θ
following the results in Whitt [106]. In fact the references in Whitt [106] point
to articles that consider the other two regimes.
To obtain the diffusion limits for the ED regime of the M/M/s + M model,
we begin with the Erlang-A M/M/s/K+M model. The waiting space K−s will
be chosen to be large enough and scaled in a manner that would approach
infinity. Thus we consider a sequence of M/M/s/K + M queues indexed by s,
the number of servers which we would use to scale. In particular, let λs and
Ks be the scaled arrival rate and system capacity. However the service rate
μ and abandonment rate α are not scaled. Also, the traffic intensity ρ is not
scaled and remains fixed for the entire sequence of queues with ρ > 1 (since
the regime is ED). Let
μ(ρ − 1)
q= . (8.10)
α
λs = ρsμ (8.11)
Ks = s(η + 1) (8.12)
X̂s(t) = (Xs(t) − X̄s(t))/√s

by making the realization that X̄s(t) must be greater than s (as ρ > 1 results
in λs > min{i, s}μ for any i ≥ 0). Thus we have

X̄s(t) = (λs − sμ)/α + s = (1 + q)s    (8.13)
where the last equality is by substituting for λs from Equation 8.11 and using
Equation 8.10 for q. Thus we represent the diffusion term as

X̂s(t) = (Xs(t) − s(1 + q))/√s    (8.14)
for all t ≥ 0.
Whitt [106] shows that the stochastic process {X̂s (t), t ≥ 0} as s → ∞ con-
verges to an Ornstein–Uhlenbeck diffusion process. In state x, the infinites-
imal mean or state-dependent drift of the Ornstein–Uhlenbeck process is
−αx and infinitesimal variance 2μρ. Further, the steady-state distribution
of X̂s (∞) converges to a normal distribution with mean 0 and variance
ρμ/α. Next we explain that briefly. We showed earlier in this section via
simulations how processes like {X̂s (t), t ≥ 0} converge to diffusion processes
(hence the term diffusion limit) as s → ∞. Thus that is not a surprising result.
Further, it is possible to show a weak convergence of the birth and death
process {Xs(t), t ≥ 0} to an Ornstein–Uhlenbeck process by appropriate scaling
(akin to how a birth and death process with constant rates converges to a
Brownian motion). That is because beyond state s(1 + q), since the death rate exceeds
the birth rate, the process gets pulled back toward s(1 + q). Likewise, below state
s(1 + q), where the birth rate exceeds the death rate, the process gets pushed
up toward s(1 + q). This results in a convergence to the Ornstein–Uhlenbeck
process centered around s(1 + q). Also, the steady-state distribution of an
Ornstein–Uhlenbeck diffusion process centered at zero (with mean drift rate
−m and infinitesimal variance v) is zero-mean normal with variance equal
to v/(2m). Notice that the drift rate of −m implies that the drift in state x is
−mx. That said, the only things remaining to be shown are that the drift rate
for our process is m = α and infinitesimal variance equal to 2μρ. This is the
focus of the next problem.
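Before the formal argument, the claimed centering s(1 + q) and limiting variance sρμ/α can be checked against the exact birth–death stationary distribution of an M/M/s/K + M queue (a numerical sketch of ours; the rates and the ad hoc capacity K = 3s are our own choices):

```python
import numpy as np

def stationary_dist(lam, mu, alpha, s, K):
    """Stationary distribution of the M/M/s/K+M birth-death chain: birth rate lam,
    death rate min(i, s)*mu + max(i - s, 0)*alpha in state i (log-space for safety)."""
    log_p = np.zeros(K + 1)
    for i in range(1, K + 1):
        death = min(i, s) * mu + max(i - s, 0) * alpha
        log_p[i] = log_p[i - 1] + np.log(lam / death)
    p = np.exp(log_p - log_p.max())
    return p / p.sum()

s, mu, alpha, rho = 200, 1.0, 0.5, 1.2    # ED regime: rho > 1
lam = rho * s * mu                         # Equation 8.11
q = mu * (rho - 1) / alpha                 # Equation 8.10
p = stationary_dist(lam, mu, alpha, s, K=3 * s)
states = np.arange(3 * s + 1)
mean = p @ states
var = p @ (states - mean) ** 2
print(mean, s * (1 + q))           # mean is close to s(1+q) = 280
print(var, s * rho * mu / alpha)   # variance is close to s*rho*mu/alpha = 480
```

Already at s = 200 the exact mean and variance sit within a fraction of a percent of the diffusion predictions.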
Problem 82
Show that the Ornstein–Uhlenbeck diffusion process that results from scal-
ing {X̂s (t), t ≥ 0} as s → ∞ has a drift of m(x) = −αx and infinitesimal variance
v(x) = 2μρ for any feasible state x.
Solution
Notice that since Xs(t) takes on any integer value k ≥ 0, X̂s(t) would correspondingly take on discrete values [k − s(1 + q)]/√s for k = 0, 1, 2, . . . but we
are ultimately interested in any real-valued state x. For that we first consider
an arbitrary real value x and a sequence of xs values for X̂s (t) for each s so
that xs → x as s → ∞. Assuming that s is sufficiently large, we can consider
the following choice for xs so that the preceding condition is met:
498 Analysis of Queues
xs = (⌊s(1 + q) + x√s⌋ − s(1 + q))/√s.
For any s, the infinitesimal mean (corresponding to the drift) ms(xs) can
be computed as follows (with the first equation being the definition):

ms(xs) = lim_{h→0} E[(X̂s(t + h) − X̂s(t))/h | X̂s(t) = xs]
= lim_{h→0} E[(Xs(t + h) − Xs(t))/(h√s) | Xs(t) = √s xs + s(1 + q)]
= lim_{h→0} [λs h − μsh − αh(√s xs + sq) + o(h)]/(h√s)
= (λs − μs − αsq)/√s − αxs = −αxs,

where the last equality uses λs = sρμ (Equation 8.11) and q = μ(ρ − 1)/α (Equation 8.10), which together give λs − μs − αsq = 0. Since xs → x as s → ∞, we get

m(x) = −αx.
It is worthwhile pointing out that this expression was derived assuming that
xs ≥ −q√s. But what if xs < −q√s? It turns out that for sufficiently large s it
is not even feasible to reach states xs that are smaller than −q√s. Even if one
were to reach such a state, the calculation would result in m(x) = ∞, which
would imply an instantaneous drift to a higher x. That is the reason
the problem is worded as m(x) = −αx for any feasible state x. Whitt [106] shows
that P{Xs(∞) ≤ s} → 0 as s → ∞ using a fluid model, which implies that in
steady state there is no chance for Xs(∞) to be less than s, that is, for xs to be
less than −q√s.
Next we consider the infinitesimal variance. For any s, the infinitesimal
variance vs(xs) can be computed as follows (with the first equation being the
definition):

vs(xs) = lim_{h→0} E[(X̂s(t + h) − X̂s(t))²/h | X̂s(t) = xs]
= lim_{h→0} E[(Xs(t + h) − Xs(t))²/(hs) | Xs(t) = √s xs + s(1 + q)]
= lim_{h→0} [λs h + μsh + αh(√s xs + sq) + o(h)]/(hs),

Stability, Fluid Approximations, and Non-stationary Queues 499

for any xs ≥ −q√s, where o(h) is a collection of terms of order less than h such
that o(h)/h → 0 as h → 0 but different from the o(h) defined in ms(xs).
By taking the limit h → 0 and substituting for λs using Equation 8.11
we get
vs(xs) = μρ + μ + αxs/√s + αq = 2ρμ + αxs/√s,
where the second equality uses expressions for q from Equation 8.10. Now
we let s → ∞ resulting in vs (xs ) → v(x) such that
v(x) = 2μρ.
Thus the steady-state distribution of the diffusion process, that is, X̂s (∞)
as s → ∞ converges to a normal distribution with mean 0 and variance ρμ/α.
This involves a rigorous argument taking stochastic process limits appropri-
ately (see Whitt [106] and Whitt [105] for further details). Therefore, as an
approximation we can use for fairly large s values that X̂s (∞) is approxi-
mately normally distributed with mean 0 and variance ρμ/α. Hence using
Equation 8.14 we can state that Xs (∞) is approximately normally distributed
with mean s(1 + q) and variance sρμ/α. Assuming a reasonably significant abandonment rate α so that μ/α << s, we can see that Xs(∞) would be
greater than s with a very high probability (approximately 1). Hence we
can write down Lq ≈ L − s using our usual definition of L and Lq being the
steady-state number of customers in the system and in the queue waiting for
service to begin, respectively. Thus we have Lq ≈ sq since L ≈ s(1 + q). Now
define Pab as the probability that an arriving customer in steady state would
abandon without service. Using Little’s law for abandoning customers,
we have
Lq = (1/α) λs Pab.
Using the fact that Lq ≈ sq = sμ(ρ − 1)/α and λs = sρμ, we can compute
Pab ≈ (ρ − 1)/ρ.
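This approximation is easy to sanity-check. The sketch below (our illustration; the parameter values are arbitrary choices with ρ > 1) computes the exact stationary distribution of the birth–death chain for the M/M/s queue with abandonment — birth rate λs = sρμ in every state, death rate min{k, s}μ + max{k − s, 0}α in state k — and compares the exact Pab = αLq/λs (Little's law, as above) with the approximation (ρ − 1)/ρ:

```python
# Exact stationary analysis of the M/M/s queue with abandonment: birth
# rate lam = s*rho*mu in every state, death rate min(k, s)*mu +
# max(k - s, 0)*alpha in state k. Parameter values are illustrative.
s, rho, mu, alpha = 100, 1.25, 1.0, 0.5
lam = s * rho * mu

# Birth-death stationary distribution, truncated far above the fluid
# level s(1 + q); detailed balance gives pi[k] up to normalization.
K = 600
pi = [1.0]
for k in range(1, K + 1):
    death = min(k, s) * mu + max(k - s, 0) * alpha
    pi.append(pi[-1] * lam / death)
total = sum(pi)
pi = [p / total for p in pi]

Lq = sum(p * max(k - s, 0) for k, p in enumerate(pi))
Pab = alpha * Lq / lam        # Little's law for abandoning customers
approx = (rho - 1) / rho      # diffusion-based approximation
print(round(Pab, 4), round(approx, 4))
```

For s = 100 the exact and approximate abandonment probabilities already agree closely.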
waiting to begin service, the variance of Xs (∞) must be equal to the vari-
ance of the number of customers waiting to begin service. Using that we
can obtain for the preceding numerical example that the standard deviation of the number of customers waiting to begin service is approximately
√(sρμ/α) = 33.1662, which is remarkably close to the exact result of 33.1 customers. The key point to make is that these approximations are conducive
to use in design and control (as opposed to the exact results). This is typical
for most diffusion approximations where the results are surprisingly simple
although the process to obtain them is fairly intensive. That said, in the
next section we leverage upon both fluid and diffusion approximations for
transient analysis.
The other set of events that are responsible for the dynamics of an Mt /M/st
queue are the departures. We let Nd (·) denote the departure process,
which is also a Poisson process that is not only nonhomogeneous in time but also
dependent on the state of the system.
Thus, for this Mt /M/st queue, we can write down X(t) in terms of the
initial state of the system X(0), as well as the two nonhomogeneous Poisson
processes Na (·) and Nd (·) as
X(t) = X(0) + Na(∫₀ᵗ λu du) − Nd(∫₀ᵗ min{X(u), su}μ du).  (8.15)
To this we now add two additional situations: (i) customers renege after
exp(β) time if their service does not start; (ii) there is a new stream of
customers that arrive according to a homogeneous Poisson process with
parameter α but could balk upon arrival resulting in a queue joining prob-
ability qX(t) if the customer arrives at time t and sees X(t) others in the
system. Clearly the reneging occurs according to a nonhomogeneous Poisson
process, let us call it Nr (·). Likewise, let Nb (·) denote the nonhomoge-
neous Poisson process corresponding to the second stream of customers that
potentially balk. For this modified system we can write down X(t) as
X(t) = X(0) + Na(∫₀ᵗ λu du) − Nd(∫₀ᵗ min{X(u), su}μ du)
− Nr(∫₀ᵗ β max{X(u) − su, 0} du) + Nb(∫₀ᵗ α qX(u) du).
F(t, x) = Σᵢ₌₁ᵏ li fi(t, x).  (8.16)
Xn(t) = [ nXn(0) + Σᵢ₌₁ᵏ li Yi(∫₀ᵗ n fi(s, Xn(s)) ds) ] / n  (8.17)
with nXn (0) = X(0) so that the initial state is also scaled. The scaled process
{Xn (t), t ≥ 0} is obtained essentially by taking n times faster rates of events.
Such a scaling is also called uniform acceleration in the literature (see Massey
and Whitt [79]). Like in all the fluid models we have seen in this chapter, here
too as n → ∞, the scaled process {Xn (t), t ≥ 0} converges to a deterministic
process almost surely.
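The flavor of this convergence is easy to see for a single unit-rate Poisson process Y(·), for which the functional strong law of large numbers gives Y(nt)/n → t as n → ∞. A minimal sketch (the values of n, t, and the seed are arbitrary choices):

```python
# FSLLN behind uniform acceleration: for a unit-rate Poisson process
# Y(.), the scaled process Y(n*t)/n converges to the deterministic
# function t as n grows. The values of n, t, and the seed are arbitrary.
import random

def scaled_poisson(n, t, seed):
    """Return Y(n*t)/n for a freshly sampled unit-rate Poisson path."""
    rng = random.Random(seed)
    clock, count = 0.0, 0
    while True:
        clock += rng.expovariate(1.0)   # unit-rate interevent times
        if clock > n * t:
            break
        count += 1
    return count / n

vals = [scaled_poisson(n, 2.0, seed=1) for n in (10, 100, 10_000)]
print(vals)                             # approaches t = 2.0 as n grows
```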
The result once again is an artifact of the functional strong law of large numbers (FSLLN), which leads to what is called the strong approximation. Taking expectations in Equation 8.17, we get
E[Xn(t)] = [ nXn(0) + Σᵢ₌₁ᵏ li E[Yi(∫₀ᵗ n fi(s, Xn(s)) ds)] ] / n
= [ nXn(0) + Σᵢ₌₁ᵏ li E[∫₀ᵗ n fi(s, Xn(s)) ds] ] / n  (8.18)
= Xn(0) + Σᵢ₌₁ᵏ li ∫₀ᵗ E[fi(s, Xn(s))] ds  (8.19)
where Equation 8.18 is due to the Poisson process property recalling that
the expected value of a nonhomogeneous Poisson process N(Λ(t)) is Λ(t).
If we know the distribution of Xn (s) then we can write down Equation 8.19,
but we consider a nonparametric approach. For that we use the Lipschitz
property of the function fi (·, ·) due to which
|E[fi (s, Xn (s))] − fi (s, E[Xn (s)])| ≤ ME[|Xn (s) − E[Xn (s)]|].
If we let n → ∞ in this expression, then the RHS goes to zero. Thus we have
lim_{n→∞} E[Xn(t)] = Xn(0) + Σᵢ₌₁ᵏ li ∫₀ᵗ lim_{n→∞} fi(s, E[Xn(s)]) ds.
Since we consider X̄(t) = limn → ∞ E[Xn (t)], using the previous equation we
can write down X̄(t) as the solution to the equation
X̄(t) = X̄(0) + Σᵢ₌₁ᵏ li ∫₀ᵗ fi(s, X̄(s)) ds.  (8.20)
Note that using the previous expression it is possible to solve numerically for
X̄(t) for any t. Thus for large n one can approximate Xn (t) as the deterministic
quantity X̄(t). But what is the connection to E[X(t)] that we alluded to earlier
in this section? As it turns out, that would have to be done on a case-by-case
basis. We illustrate that process using an example next.
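To make the numerical-solution remark concrete, here is a forward-Euler pass over Equation 8.20 for a simple two-event instance of our own choosing (not from the text): arrivals with f1(t, x) = λ and l1 = +1, and services with f2(t, x) = μ min{x, s} and l2 = −1, so that X̄′(t) = λ − μ min{X̄(t), s}:

```python
# Forward-Euler solution of the fluid equation (8.20) for a two-event
# instance: x'(t) = lam - mu*min(x, s). All values are illustrative.
lam, mu, s = 3.0, 1.0, 4.0

def F(t, x):
    """Net drift: arrivals at rate lam minus service at rate mu*min(x, s)."""
    return lam - mu * min(x, s)

def euler(F, x0, T, dt=1e-3):
    x, t = x0, 0.0
    while t < T:
        x += F(t, x) * dt
        t += dt
    return x

xbar = euler(F, x0=0.0, T=20.0)
print(round(xbar, 3))   # -> 3.0, the equilibrium lam/mu since lam < s*mu
```

Since λ < sμ here, the fluid trajectory settles at λ/μ, which the printed value confirms.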
Problem 83
Consider an Mt /M/st system that models an inbound call center. The con-
stant mean service rate for this call center is μ = 4 customers per hour.
Table 8.2 describes the expected arrival rate λt per hour and number of
servers (st ) by discretizing into eight hourly intervals. Develop a fluid scaling
or uniform acceleration for this system by numerically describing the deter-
ministic process {X̄(t), 0 ≤ t ≤ 8}. Compare against a simulated sample path
of the number in the system process {X(t), 0 ≤ t ≤ 8}. Assume that X(0) = 80,
that is, at time zero there are already 80 customers in the system. Also, obtain
an approximation for E[X(t)] and compare against simulations by creating
100 replications and averaging them.
Solution
The problem description is that of a call center where the arrival rate and
number of servers (that is, representatives or call handlers) are time varying. However, within an hour we assume that they are held constant (the
TABLE 8.2
Hourly Arrival Rate and
Staffing at a Call Center
t λt st
(0, 1] 400 110
(1, 2] 440 120
(2, 3] 500 130
(3, 4] 720 170
(4, 5] 800 220
(5, 6] 720 200
(6, 7] 600 140
(7, 8] 400 120
analytical model does not need this assumption, it is there just for the
insights). It may be worthwhile going through Table 8.2. In essence during
the first hour 400 customers are expected and the number of servers is 110.
Likewise, during the seventh hour 600 customers are expected and the num-
ber of servers is 140. Notice that in time intervals (3, 4] and (6, 7] there is an
overload situation, that is, the arrival rate is larger than the service capacity,
λt > st μ. However, since this is a transient analysis, that is not much of an
issue but worth watching out for.
That said we now consider the Mt /M/st system with μ = 4 and λt and st
from Table 8.2. Let X(t) be the number of customers in the system at time t
with X(0) = 80. We can rewrite Equation 8.15 as
X(t) = X(0) + Y1(∫₀ᵗ λu du) − Y2(∫₀ᵗ μ min{su, X(u)} du),  (8.21)
where Y1 (·) and Y2 (·) are the nonhomogeneous Poisson arrival and depar-
ture processes, respectively. For some large n and all t ∈ [0, 8], let
at = λt/n  and  rt = st/n.
We will pretend rt is an integer for the fluid scaling but that will not be neces-
sary for the limiting deterministic process that will be defined subsequently.
Define Xn (t) as
Xn(t) = [ nXn(0) + Y1(∫₀ᵗ n au du) − Y2(∫₀ᵗ μn min{ru, Xn(u)} du) ] / n,
where Xn (0) = X(0)/n. Notice that this equation is identical in form to that of
the scaled process in Equation 8.17. More crucially notice that this equation
is also identical to Equation 8.21 if we let X(t) = nXn(t) for all t.
As n → ∞, Xn (t) converges to X̄(t) which is given by the solution to
X̄(t) = X̄(0) + ∫₀ᵗ au du − ∫₀ᵗ μ min{ru, X̄(u)} du,  (8.22)
factor n and say that the scaled process {Xn (t), t ≥ 0} converges to its fluid
limit {X̄(t), t ≥ 0}. However, in this problem we begin with λt and st , select an
n and then figure rt and at . Thus the approximation would work well when
λt is significantly larger than μ and st significantly larger than 1. In fact, the
choice of n is actually irrelevant.
We arbitrarily select n = 50 (any other choice would not change the
results). Then we solve for X̄(t) by performing a numerical integration for
Equation 8.22 via first principles in calculus. Using this fluid scaling we plot
an approximation for X(t) using X(t) ≈ nX̄(t) in Figure 8.19. To actually com-
pare against a sample path of X(t), we also plot a simulated sample path of
X(t) in that same figure (see the jagged line). The smooth line in that figure
corresponds to nX̄(t). The crucial thing to realize is that figure would not
change if a different n was selected. Notice how remarkably closely the simulated graph follows the fluid limit, giving us confidence that the approximation is performing well.
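The numerical integration used above can be sketched in a few lines. Since the choice of n cancels, one can integrate directly for nX̄(t) using dX/dt = λt − μ min{st, X} with the Table 8.2 data (the step size is an arbitrary choice):

```python
# Forward-Euler integration of the fluid limit (Equation 8.22) for the
# call center of Table 8.2: dX/dt = lam_t - mu*min(s_t, X), mu = 4,
# X(0) = 80; the scaling factor n cancels. Step size is illustrative.
mu, dt = 4.0, 1e-4
lam = [400, 440, 500, 720, 800, 720, 600, 400]   # hourly arrival rates
srv = [110, 120, 130, 170, 220, 200, 140, 120]   # hourly staffing levels

x, t, path = 80.0, 0.0, [80.0]
while t < 8.0 - 1e-12:
    hour = min(int(t), 7)
    x += (lam[hour] - mu * min(srv[hour], x)) * dt
    t += dt
    path.append(x)
print(round(x, 1), round(max(path), 1))  # X(8) and the peak congestion
```

The trajectory climbs through the overloaded intervals (3, 4] and (6, 7] and peaks near the end of hour 7, consistent with Figure 8.19.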
However, the next thing to check is whether E[X(t)] is close to X̄(t).
For that consider Figure 8.20. The smooth line is nX̄(t) which is identical
to that in Figure 8.19. By performing 100 simulations E[X(t)] can be esti-
mated for every t. The estimated E[X(t)] is the jagged line in Figure 8.20.
The two graphs are incredibly close suggesting that the approximation is
fairly reasonable. There are three ways this fit could improve even further: (1) if
λt and st were much higher; (2) if we used a parametric approach instead
of Lipschitz to resolve Equation 8.19; and (3) if we used well over 100 replications.
FIGURE 8.19
Number in the system after fluid scaling (smooth line) vs. single simulation sample path (jagged
line).
FIGURE 8.20
Mean number in the system using 100 replications of simulation (jagged line) vs. fluid
approximation (smooth line).
X(t) (defined in Equation 8.15) like we did in the previous section. Now, for
the distribution of Xn (t), we consider the “usual” diffusion scaling. Define
the scaled process {X̂n (t), t ≥ 0} where X̂n (t) is given by
X̂n(t) = √n (Xn(t) − X̄(t)).
Besides the assumptions made in the previous section, we also require that
F satisfies
dF(t, x)/dx ≤ M,
for some finite M and 0 ≤ t ≤ T. Kurtz [71] shows that under those conditions
X̂(t) = Σᵢ₌₁ᵏ li ∫₀ᵗ √(fi(s, X̄(s))) dWi(s) + ∫₀ᵗ F′(s, X̄(s)) X̂(s) ds,  (8.23)

where the Wi(·)'s are independent standard Brownian motions, and F′(t, x) = dF(t, x)/dx.
It is crucial to note that this result requires that F(t, x) is differentiable every-
where with respect to x but that is often not satisfied in many queueing
models. There are a few ways to get around this, which is the key fine-tuning
by Mandelbaum et al. [77] that we alluded to earlier. However, here we
take the approach in Mandelbaum et al. [78] which states that as long as the
deterministic process X̄(t) does not linger around the nondifferentiable point
or points, Equation 8.23 would be good to use.
With that understanding we now describe the diffusion approximation
for Xn (t). For that we use the result in Ethier and Kurtz [33] which states
that if X̂(0) is a constant or a Gaussian random variable, then {X̂(t), t ≥ 0} is a
Gaussian process. Since a Gaussian process is characterized by its mean and
variance, we only truly require the mean and variance of X̂(t) which can be
obtained from Equation 8.23. We will subsequently use that but for now note
that we have a diffusion approximation. In particular, for large n,

Xn(t) ≈ X̄(t) + X̂(t)/√n.  (8.24)

Thus we only require E[Xn(t)] and Var[Xn(t)] to characterize Xn(t).
Assuming that X̄(0) = Xn(0) = X(0)/n where X(0) is a deterministic known
constant quantity, we can see that X̂(0) = 0. Further, by taking the expected
value and variance of approximate Equation 8.24, we get
E[Xn(t)] ≈ X̄(t) + E[X̂(t)]/√n,  (8.25)
Var[Xn(t)] ≈ Var[X̂(t)]/n.  (8.26)
However, we had shown earlier that E[Xn(t)] → X̄(t) as n → ∞. Using that or
by showing E[X̂(t)] = 0 for all t since X̂(0) = 0 using Equation 8.23, we can say
that E[Xn (t)] ≈ X̄(t). Now, for Var[X̂(t)] we use the result in Arnold [6] for
linear stochastic differential equations. In particular by taking the derivative
with respect to t of Equation 8.23 and using the result in Arnold [6], we get
Var[X̂(t)] as the solution to the differential equation:
dVar[X̂(t)]/dt = Σᵢ₌₁ᵏ li² fi(t, X̄(t)) + 2F′(t, X̄(t)) Var[X̂(t)],  (8.27)
with initial condition Var[X̂(0)] = 0. Once we solve for this ordinary differ-
ential equation, we can obtain Var[X̂(t)] which we can use in Equation 8.26
to get Var[Xn (t)] and subsequently Var[X(t)]. We illustrate this by means of
an example, next.
Problem 84
Consider the Mt /M/st system that models an inbound call center described
in Problem 83. This is a continuation of that problem and it is critical to
go over that before proceeding ahead. Using the results of Problem 83,
develop a diffusion model for that system. Then, obtain an approximation
for Var[X(t)] and compare against simulations by creating 100 replications
and obtaining sample variances.
Solution
Several of the details are based on the solution to Problem 83. Recall X(t), the
number in the system at time t is described in Equation 8.21 as
X(t) = X(0) + Y1(∫₀ᵗ λu du) − Y2(∫₀ᵗ μ min{su, X(u)} du),
where Y1 (·) and Y2 (·) are the nonhomogeneous Poisson arrival and depar-
ture processes, respectively. For some large n and all t ∈ [0, 8], let
at = λt/n  and  rt = st/n.
dVar[X̂(t)]/dt = at + μ min{rt, X̄(t)} − 2 I(rt ≥ X̄(t)) μ Var[X̂(t)],
where the indicator function I(A) is one if A is true and zero if A is false. This
ordinary differential equation can be solved by numerically integrating it to
get Var[X̂(t)] for all t in 0 ≤ t ≤ 8.
From Equation 8.26 we have Var[Xn (t)] ≈ Var[X̂(t)]/n and from an earlier
equation we know Var[X(t)] = n2 Var[Xn (t)]. Hence we get the approximation
Var[X(t)] ≈ nVar[X̂(t)].
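The fluid and variance equations can be integrated jointly. Substituting at = λt/n, rt = st/n, and W(t) = nVar[X̂(t)] ≈ Var[X(t)] shows that n again drops out, leaving dW/dt = λt + μ min{st, X} − 2I(st ≥ X)μW with W(0) = 0 (our rearrangement of the equation above). A forward-Euler sketch with the Table 8.2 data:

```python
# Joint forward-Euler pass for the fluid limit X(t) and the variance
# approximation W(t) = Var[X(t)], Problem 84. In unscaled terms:
#   dX/dt = lam_t - mu*min(s_t, X)
#   dW/dt = lam_t + mu*min(s_t, X) - 2*I(s_t >= X)*mu*W,  W(0) = 0,
# so the choice of n drops out. Step size is an illustrative choice.
mu, dt = 4.0, 1e-4
lam = [400, 440, 500, 720, 800, 720, 600, 400]
srv = [110, 120, 130, 170, 220, 200, 140, 120]

x, w, t, wpath = 80.0, 0.0, 0.0, [0.0]
while t < 8.0 - 1e-12:
    h = min(int(t), 7)
    ind = 1.0 if srv[h] >= x else 0.0    # indicator I(s_t >= X(t))
    dx = lam[h] - mu * min(srv[h], x)
    dw = lam[h] + mu * min(srv[h], x) - 2.0 * ind * mu * w
    x, w, t = x + dx * dt, w + dw * dt, t + dt
    wpath.append(w)
print(round(w), round(max(wpath)))       # Var[X(8)] and its running peak
```

The variance grows steeply during the overloaded hours, where the indicator is zero and the damping term vanishes, reaching levels of the order seen in Figure 8.21.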
FIGURE 8.21
Variance of number in system using 100 replications of simulation (jagged line) vs. diffusion
approximation (smooth line).
Reference Notes
In the last two decades, one of the most actively researched topics in the
analysis-of-queues area is arguably the concept of fluid scaling. Fluid limits
results are based on some phenomenally technical underpinnings that use
stochastic process limits. This chapter does not do any justice in that regard
and interested readers are encouraged to refer to Whitt [105]. The objective of
this chapter was to present some background material that would familiarize
a reader with this approach to analyzing queues using fluid models. The author is
extremely thankful to several colleagues who posted handouts on the world
wide web that were tremendously useful while preparing this material. In
particular, Balaji Prabhakar’s handouts on fluid models, Lyapunov func-
tions, and Foster–Lyapunov criterion; Varun Gupta’s gentle introduction to
fluid and diffusion approximations; Gideon Weiss’ treatment of stability of
fluid networks; and John Hasenbein’s collection of topics on fluid queues
were all immensely useful in developing this manuscript.
As the title suggests, this chapter is divided into three key pieces: stabil-
ity, fluid-diffusion approximations, and time-varying queues. Those topics
have somewhat independently evolved in the literature and usually do not
appear in the same chapter of any book. Thus it is worthwhile describing
them individually from a reference notes standpoint. The topic of stability of multiclass queueing networks is a fascinating one; years ago, most
researchers assumed that the first-order traffic intensity conditions were sufficient for stability. However, in the case of multiclass queueing networks with
reentrant lines using deterministic routing and/or local priorities among
classes, more conditions are necessary for stability due to the virtual station
condition. This is articulated nicely in several books and monographs with
numerous examples. In particular, this chapter benefited greatly from: Chen
and Yao [19] with the clear exposition of fluid limits and all the excellent
multiclass network examples; Dai [25] for describing the fluid networks and
conditions for stability; Meyn [82] for the explanation of how to go about
showing a fluid network is stable; and Bramson [13] for the technical details
and plenty of examples.
Moving on to the next section, it was a somewhat familiar topic for this
book, that is, fluid and diffusion approximations. In fact, we used those
approximations in Chapter 4 without proof. However, only in Chapter 7 did we show, via approximations based on reflected Brownian motion, how one
could get good approximations for general queues. However, those methods were similar to the traditional (Kobayashi [64]) diffusion approximation.
On the other hand, this section provides a diffusion approximation by scal-
ing, in fact first by performing a fluid scaling and subsequently a diffusion
one. There are numerous books and monographs that go into great details
regarding fluid and diffusion scaling. This chapter benefited greatly from
Whitt [105] as well as Chen and Yao [19], especially in terms of construct-
ing scaled processes. However, for an overview of the mathematical details
and example of diffusion processes, Glynn [46] is an excellent resource. This
chapter also benefited from several articles such as Halfin and Whitt [50] as
well as Whitt [106].
The last section of this chapter was on uniform acceleration and strong
approximations. This topic was the focus of the author’s student Young
Myoung Ko’s doctoral dissertation. Most of the materials in that section are
from Young’s thesis, in particular from preliminary results of his papers.
The pioneering work on strong approximations was done by Kurtz [71]. The
concept also appears in the book by Ethier and Kurtz [33]. Subsequently,
Mandelbaum et al. [77] described some of the difficulties in using the strong
approximation results available in the literature due to issues regarding
differentiability while obtaining the diffusion limits. The limits, both fluid
and diffusion, are based on the topic of uniform acceleration that can be
found in Massey and Whitt [79]. Numerical studies that circumvent the dif-
ferentiability requirement can be found in Mandelbaum et al. [78]. Young
Myoung Ko has found ways to significantly improve both fluid and diffusion
approximations so that they are extremely accurate (article forthcoming).
Exercises
8.1 Consider an extension to the Rybko–Stolyar–Kumar–Seidman net-
work in Figure 8.6 with a node C in between A and B. The first flow
gets served in nodes A, C, and then B whereas the second flow has a
reverse order. Priority is given to the second flow in node C, hence in
terms of priority it is identical to that of A. Obtain the condition for
stability and verify using simulations. This network is taken from
Bramson [13] which has the figure and the stability condition.
8.2 Solve Problem 76 assuming that: (i) interarrival times and ser-
vice times are deterministic constants; (ii) interarrival times are the
same as in Problem 76 but service times are according to a gamma
distribution with coefficient of variation 2.
8.3 Show that the virtual station condition described in Section 8.2.1
along with the necessary conditions are sufficient to ensure stabil-
ity for the network in Figure 8.3. Follow a similar argument as
the one for the Rybko–Stolyar–Kumar–Seidman network outlined
in Section 8.2.3.
8.4 Let Z1 , Z2 , Z3 , . . ., be a sequence of IID random variables with finite
mean m and finite variance σ2 . Define Sn as the partial sum
Sn = Z1 + Z2 + · · · + Zn
8.6 Consider an M/M/s queue with arrival rate λ = 10 per minute and
number of servers s = 5. For μ = 2.5, 2, and 1.8 (all per minute) plot
X̂n (t) versus t for t ∈ [0, 1] minutes using n = 2, 20, 200, and 2000,
where X̂n (t) is defined in Section 8.4.2 in terms of Xn (t) and X̄(t).
Use multiple sample paths to illustrate the diffusion approximation.
Also use X(0) = 0.
8.7 Consider a finite population queueing system with s = 50 servers
each capable of serving at rate μ = 5 customers per hour. Assume
service times are exponentially distributed. Also upon service com-
pletion each customer spends an exponential time with mean 1 h
before returning to the queueing system. Assume that there are
400 customers in total but at time 0, the queue is empty. Obtain
an approximation for the mean and variance of the number of cus-
tomers in the system during the first hour. Perform 100 simulations
and evaluate the approximations.
8.8 Consider the following extension to Problem 83. In addition to all the
details in the problem description, say that customers renege from
the queue (that is, abandon before service starts) after an exp(β) if
their service does not begin. Use 1/β = 5 min. Use fluid and diffu-
sion approximations and numerically obtain E[X(t)] and Var[X(t)].
Compare the results by performing 100 simulations.
8.9 Solve the previous problem under the following additional condi-
tion (note that reneging still occurs): some customers access the call
center from their web browser using Internet telephony. These cus-
tomers arrive according to a homogeneous Poisson process at rate
α = 10 per hour. Only these customers also have real time access to
their position in the wait line (e.g., position-1 implies next in line
for service). However, because of that if there are i total customers
waiting for service to begin, then with probability (0.9)i an arriving
customer joins the system (otherwise the customer would balk).
8.10 Consider an Mt /M/st /st queue, that is, there is no waiting room. If
an arriving customer finds all servers busy, then the customer retries
after exp(θ) time. At this time we say that the customer is in an orbit.
Assume that λt alternates between 100 and 120 each hour for a four
hour period. Also μ = 1 and st during the four hour-long slots are: 90,
125, 125, and 150, respectively. Compute approximately the mean
and variance of the number of customers in the queue as well as in
the orbit. Care is to be taken to derive expressions when X(t) is a 2D
vector.
9
Stochastic Fluid-Flow Queues: Characteristics
and Exact Analysis
In the previous chapter, we saw deterministic fluid queues where the flow
rates were mostly constant and toward the end we saw a case where the
rates varied deterministically over time. In this and the next chapter, we
focus on stochastic fluid queues where flow rates are piecewise constant
and vary stochastically over time. We consider only the flow rates from a
countable set. On a completely different note, in some sense the diffusion
limits we saw in previous chapters can be thought of as a case of flow rates
from an uncountable set that are continuously varying (as opposed to being
piecewise constant). Thus from a big-picture standpoint, metaphorically the
models in this chapter fall somewhere between the deterministically time-
varying fluid queues and diffusion queues. However, here we will not be
presenting any formal scaling of any discrete queueing system to result in
these fluid queues. We focus purely on performance analysis of these queues
to obtain workload distributions.
For the performance analysis we start by describing a queueing system
where the entities are fluids. For example, a sink in a kitchen or bathroom
can be used to explain the nuances. Say there is a fictitious tap or faucet that
has a countable number of settings (as opposed to a continuous set which is
usually found in practice). At each discrete setting, water flows into the sink
at a particular rate. Typically the sojourn time in each setting is random and
the setting changes stochastically over time. This results in a piecewise con-
stant flow rate that changes randomly over time. The sink itself is the queue
or buffer that holds fluid (in this case water), which flows into it. The drain
is analogous to a server that empties the fluid off the sink (however, unlike
a real bathtub or sink, here we assume the drainage rates are not affected by
the weight of the fluid). For our performance analysis, we assume that we
know the stochastic process that governs the input to the sink as well as the
drainage. Using that our aim is to obtain the probability distribution of the
amount of fluid in the sink.
Naturally, these models can be used in hydrology such as analyzing
dams, reservoirs, and water bodies, as well as in process industries such as
chemicals and petrochemicals. However, a majority of the results presented
here have been motivated by applications in computer, communication, and
information systems. Interestingly these are truly discrete systems, but there
are so many discrete entities that flow in an extremely small amount of
9.1 Introduction
The objective of this section is to provide some introductory remarks regard-
ing stochastic fluid-flow queues. We begin by contrasting fluid-flow queues
against discrete queues to understand their fundamental differences as well
as underlying assumptions. Once we put things in perspective, we describe
some more applications and elaborate on others described previously. Then
we go over some preliminary material such as inputs that go into the per-
formance analysis as well as a description of the condition for stability.
We conclude the section with a characterization of the stochastic flow-rate
processes that govern the flow of fluids.
FIGURE 9.1
Comparing workloads in (a) discrete and (b) fluid queues.
The key difference between discrete queues and fluid queues is that in
the discrete case the entire file arrives instantaneously, thus creating a jump
or discontinuity in the workload process. Whereas in the fluid case, the file
arrives gradually over time at a certain flow rate. Therefore, the workload
process would not have jumps in fluid queues, that is, they are continuous.
For both the discrete and the continuous cases, let X(t) be the amount of fluid
(i.e., workload) in the buffer at time t. Figure 9.1 illustrates this difference
between the X(t) process versus t in the discrete and fluid queues. In particular, it is crucial to point out that although in practice the entire file does
not actually arrive instantaneously, in the discrete model (Figure 9.1a) we
essentially take the arrival time of the file (A1, A2, . . .) to be when the entire
file has completely arrived at the gateway.
In many systems, this is a necessity as the entire entity is necessary for
processing to begin. Notice that with every arrival, the workload jumps up
by an amount equal to the file size. The workload is depleted at rate c. We
have seen such a workload process in the discrete queues; the only differences here are that (i) the notation for X(t) is not what we used for the discrete
case; and (ii) the workload depletion rate is c and not 1 as is usually done.
However, what enables fluid model analysis is the fact that the gateway does
not have to wait for the entire file to arrive. As soon as the first packet of the
file arrives, it sends it off without waiting for the whole file to arrive. Alter-
natively one could think of the discrete queue as a “bulk” arrival of a batch of
packets whereas in the fluid queue this batch arrives slowly over time. Since
the batch arrives back to back, modeling at the “packet” level is tricky. In a
similar fashion, although not represented in the workload process, we con-
sider a discrete entity’s departure from the system when all of its service is
completed which is not the case for fluids. In fact thus the concept of sojourn
times at the granularity of a whole file is not so easy in fluid queues.
Now we describe the workload process for the fluid queue. In particular,
refer to Figure 9.1b. We assume that information flows into the system as
an on–off fluid. We will explain on–off fluid subsequently, however, for the
purposes of this discussion it would suffice to think of an “on” time as when
there is one or more files back to back that gets processed by the server at
rate c. When there is information, it flows in at rate R. However, when there
is no information flow, we call that period as “off.” From the figure, notice
that the workload gradually increases at rate R − c when the source is on.
It is because fluid enters at rate R and is removed at rate c, resulting in an
effective growth rate of R − c. Also, when the fluid entry is off, the workload
reduces at rate c (provided there is workload to be processed, otherwise it
would be zero). Notice that in Figure 9.1 there is no relationship between the
discrete case’s arrival times and file sizes against the fluid case’s on and off
times.
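To preview the kind of quantity this chapter computes, the on–off dynamics just described can be simulated directly. For a single source with exp(α) on times, exp(β) off times, input rate R when on, and drain rate c, a classical exact result of the type derived later in this chapter states that in steady state P(X > x) = p_on(R/c)e^{zx}, where p_on = β/(α + β) and z = β/c − α/(R − c) (valid when p_on R < c). The event-driven sketch below (all parameter values are our illustrative choices) estimates the left-hand side empirically and compares:

```python
# Event-driven simulation of a single on-off fluid source feeding a
# buffer drained at rate c. On times ~ exp(alpha), off times ~ exp(beta),
# input rate R while on. All parameter values are illustrative.
import math
import random

alpha, beta = 1.0, 0.5         # 1/(mean on time), 1/(mean off time)
R, c, xlev = 2.0, 1.0, 1.0     # input rate, drain rate, threshold level
rng = random.Random(7)

x, T, above = 0.0, 0.0, 0.0    # buffer level, clock, time spent above xlev
for _ in range(200_000):
    d = rng.expovariate(alpha)              # on period: X rises at R - c
    if x >= xlev:
        above += d
    elif x + (R - c) * d > xlev:
        above += d - (xlev - x) / (R - c)
    x += (R - c) * d
    T += d
    d = rng.expovariate(beta)               # off period: X falls at c, floor 0
    above += min(d, max(0.0, (x - xlev) / c))
    x = max(0.0, x - c * d)
    T += d

emp = above / T                 # empirical P(X > xlev)
p_on = beta / (alpha + beta)
z = beta / c - alpha / (R - c)
exact = p_on * (R / c) * math.exp(z * xlev)
print(round(emp, 3), round(exact, 3))
```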
In summary, fluid-flow models are applicable in systems where the
workload arrives gradually over time (as opposed to instantaneously) and
we are interested in aggregate performance such as workload distribution.
In most cases the entities are themselves either fluids or can be approximated
as fluids. In these situations the analysis based on fluid models is extremely
accurate, whereas discrete models can be way off.
In particular, when the discrete entities arrive in a back-to-back fashion, it is
extremely conducive to model them as fluids as opposed to discrete point
masses with constant interarrival times. We will next see some applications
where fluid models would be appropriate for their analysis.
9.1.2 Applications
Here we present some scenarios where stochastic fluid-flow models can be
used for performance analysis. The idea is to give a flavor for the kind of
systems that can be modeled using fluid queues. We begin by presenting
some examples from computer and communication systems where informa-
tion flow is modeled as fluids. Then we present an example from supply
chain, followed by one in hydrology and finally a transportation setting. As
described earlier, most of the fluid model results have been motivated by
applications in computer-communication networks. Thus it is worthwhile to
describe a few examples at different granularities of space, size, and time.
However, what is common in all the cases is that entities arrive in a bursty
fashion, that is, things happen in bursts at different rates as opposed to one
by one in a fairly uniform fashion.
For example, consider a CPU on a computer that processes tasks from a
software agent that is part of a multi-agent system. An agent is a complex
piece of software that can autonomously send, receive, and process information on
behalf of a user. The agent sends tasks to its CPUs to aid in decision-making.
In Aggarwal et al. [3] we consider software agents that are hosted on comput-
ers that perform military logistic operations. We show (see the first figure in
Aggarwal et al. [3]) that the tasks are indeed generated in bursts by the agents
and submitted to the CPU for processing. The bursty nature is due to the fact
that the agent receives a trigger externally and submits a set of jobs to the
CPU to determine its course of action. The CPU does not wait for all the tasks
of the agent to arrive to begin processing but processes them as they arrive.
It is important to notice that these tasks are fairly atomic in nature and can
be processed independently and rather quickly, with roughly similar processing
times. This makes the system ideal to analyze as a fluid queue.
Fluid models have been successfully used in modeling packet-level traf-
fic in both wired (also called wire-line) and wireless networks. Irrespective
of whether it is end systems such as computers and servers, or inside the
network such as routers, switches, and relays, or both such as multi-hop
wireless nodes and sensors, information flow in the form of packets can
be nicely modeled as fluids. In all these systems information stochastically
flows into buffers in the form of tiny packets. These packets are processed
and the stored packets are forwarded to a downstream node in the network.
This process, called store-and-forward, results in an extremely efficient
network. Contrast this to the case where entire files are transferred hop
by hop as a whole (as opposed to packetizing them); a significant amount
of time would be wasted just waiting for entire files to arrive. In the store-
and-forward case, some packets of a file would already be at the destination
while other packets are still at the origin, even if the origin and destination are
in two extremes of the world.
At a much coarser granularity, consider users that access a server farm for
web or other application processing. The users enter a session and within a
session they send requests (usually through browsers for web applications)
and receive responses. The users alternate between periods of activity and
quiet times during a session. Also the servers can process requests independently
of other requests. Thus within a session requests arrive in a bursty
fashion much like a bunch of packets that are part of a file in the previous
example. These requests are stored in a buffer and processed one by one by
the server. One can model each user as a source that stochastically toggles
between bursting requests and idling; this can nicely be analyzed
using fluid queues. In summary, there are several computer and communication
systems that can be modeled using fluid queues. The key elements
are: bursty traffic, the ability to process smaller elements of the traffic, and
finally (although not emphasized earlier) the smaller elements must have
FIGURE 9.2
Buffer with environment process Z(t) and output capacity c. (From Gautam, N., Quality of
service metrics, in Frontiers in Distributed Sensor Networks, S.S. Iyengar and R.R. Brooks, Eds.,
Chapman & Hall/CRC Press, Boca Raton, FL, 2004, pp. 613–628. With permission.)
and Rolski [66]. In other words, the buffer content process {X(t), t ≥ 0}
(when B = ∞) is stable if the mean traffic arrival rate in steady state is less
than c. The on times (or “up” times) are according to a general distribution
with CDF U(·), and the mean on time τU can be calculated as

τU = ∫₀^∞ t dU(t).
Likewise, the off times (or “down” times) are according to a general distribution
with CDF D(·). The mean off time τD can be calculated in a similar
manner as

τD = ∫₀^∞ t dD(t).
For the rest of this book we assume that the CDFs U(·) and D(·) are such
that we can either compute their LSTs directly or they can be suitably
approximated as phase-type distributions whose LSTs can be computed.
When the buffer size B = ∞, the system would be stable if

rτU/(τU + τD) < c.
Let

Zn = Z(Sn+).

Note that {Zn, n ≥ 0} is a DTMC, which is embedded in the SMP. Assume that
this DTMC is irreducible and recurrent with transition probability matrix
P = G(∞). Let

Gi(x) = P{S1 ≤ x | Z0 = i} = Σ_{j=1}^ℓ Gij(x)

and let

πi = lim_{n→∞} P{Zn = i},

which can be obtained by solving

[π1 π2 . . . πℓ] = [π1 π2 . . . πℓ]P and Σ_{i=1}^ℓ πi = 1.

Then, with τi denoting the mean sojourn time of the SMP in state i,

pi = lim_{t→∞} P{Z(t) = i} = πi τi / (Σ_{m=1}^ℓ πm τm),

and the stability condition can be written as

Σ_{i=1}^ℓ pi r(i) < c.
With that description we are now ready for performance analysis of fluid
queues.
S = {1, 2, . . . , ℓ}. The number of states is finite, that is, ℓ < ∞. The infinitesimal
generator matrix for the CTMC {Z(t), t ≥ 0} is Q = [qij], which is an ℓ × ℓ
matrix. Let pi be the steady-state probability that the environment is in state
i, that is,

pi = lim_{t→∞} P{Z(t) = i}

for all i ∈ S. Since the CTMC is ergodic, we can compute the steady-state
probability row vector p = [p1 p2 . . . pℓ] by solving

pQ = [0 0 . . . 0] and Σ_{i=1}^ℓ pi = 1.
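Numerically, solving pQ = [0 0 . . . 0] together with Σ_{i=1}^ℓ pi = 1 amounts to replacing one (redundant) balance equation by the normalization. The following is a minimal sketch of ours (the function name and the two-state generator are illustrative, not from the text):

```python
import numpy as np

def ctmc_stationary(Q):
    """Solve pQ = 0 with sum(p) = 1 for an ergodic CTMC generator Q."""
    n = Q.shape[0]
    A = Q.T.copy()      # each column of Q gives one balance equation
    A[-1, :] = 1.0      # one equation is redundant (rows of Q sum to 0),
    b = np.zeros(n)     # so overwrite it with the normalization
    b[-1] = 1.0
    return np.linalg.solve(A, b)

# Illustrative two-state on-off chain: off -> on at rate beta, on -> off at alpha
alpha, beta = 1.0, 2.0
Q = np.array([[-beta, beta],
              [alpha, -alpha]])
p = ctmc_stationary(Q)  # equals [alpha, beta] / (alpha + beta)
```

The balance equations always sum to zero (since each row of Q sums to zero), which is why one of them can safely be replaced by the normalization.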
Having described the buffer and its input, next we consider the output.
The output capacity of the buffer is c. This means that whenever there is fluid
in the buffer it gets removed at rate c. However, if the buffer is empty and
the input rate is smaller than c, then the output rate would be the same as the
input rate. For that reason it is called output capacity, as the actual output
rate could be smaller than c. The units of c are the same as that of r(Z(t)).
Thus both would be in terms of liters per second or bytes per second, etc.
The term output capacity is also sometimes referred to as channel capacity
or processor capacity. Unless stated otherwise, we assume that c remains a
constant. Before proceeding, it may be worthwhile to familiarize yourself with
the input, buffer, and output using Figure 9.2.
Next we describe the buffer contents and its dynamics. Let X(t) be the
amount of fluid in the buffer at time t. We first assume that B = ∞ (and later
relax that assumption). Whenever X(t) > 0 and Z(t) = i, X(t) either increases
at rate r(i) − c or decreases at rate c − r(i) depending on whether r(i) is greater
or lesser than c, respectively. To capture that we define the drift d(i) when
the CTMC {Z(t), t ≥ 0} is in state i (i.e., Z(t) = i) as
d(i) = r(i) − c

so that

dX(t)/dt = d(i)
if X(t) > 0 and Z(t) = i for any t ≥ 0. Next, when X(t) = 0 it stays at 0 as long
as the drift is non-positive, that is,
dX(t)/dt = 0
if X(t) = 0, Z(t) = i, and r(i) ≤ c for any t ≥ 0. However, as soon as the drift
becomes positive, the buffer contents would start increasing from 0.
Given the dynamics of X(t), a natural question to ask is if X(t) would drift
off to infinity (considering B = ∞). As it turns out, based on Equation 9.1, we
can write down the stability condition as
Σ_{i=1}^ℓ r(i)pi < c.    (9.2)
In other words, the LHS of this expression is the steady-state average input
rate by conditioning on the state of the environment and unconditioning. We
need the average input rate to be smaller than the output capacity. Another
way of stating the stability condition is
Σ_{i=1}^ℓ d(i)pi < 0,

where the drift matrix is defined as

D = diag[d(i)].
TABLE 9.1
List of Notations

B      Buffer size (default is B = ∞)
Z(t)   State of environment (CTMC) modulating buffer input at time t
S      State space of CTMC {Z(t), t ≥ 0}
ℓ      Number of states in S, i.e., ℓ = |S| and is finite
qij    Transition rate from state i to j in CTMC {Z(t), t ≥ 0}
Q      Infinitesimal generator matrix, i.e., Q = [qij]
pi     Stationary probability for the ergodic CTMC {Z(t), t ≥ 0}
r(i)   Fluid arrival rate when Z(t) = i
R      Rate matrix, i.e., R = diag[r(i)]
c      Output capacity of the buffer
d(i)   Drift in state Z(t) = i, i.e., d(i) = r(i) − c
D      Drift matrix, i.e., D = diag[d(i)] = R − cI
X(t)   Amount of fluid in the buffer at time t
R = diag[r(i)]
It is not hard to see that the CTMC is ergodic with p = [0.0668 0.2647 0.4118
0.2567]. The fluid arrival rates in states 1, 2, 3, and 4 are 20, 15, 10, and 5 kbps,
respectively. In other words, r(1) = 20, r(2) = 15, r(3) = 10, and r(4) = 5 with
    ⎡ 20   0   0  0 ⎤
R = ⎢  0  15   0  0 ⎥ .
    ⎢  0   0  10  0 ⎥
    ⎣  0   0   0  5 ⎦
Notice that the system is stable since Σ_{i=1}^4 r(i)pi = 10.7086, which is less
than c = 12. For this numerical example, a sample path of Z(t) and X(t) is
depicted in Figure 9.3(a) with X(0) = 0 and Z(0) = 1. Notice that when the
drift is positive, fluid increases in the buffer and when the drift is negative,
fluid is nonincreasing. It is also important to pay attention to the slopes
(although not drawn to scale) as they correspond to the drifts.
Now we consider the case when the buffer size B is finite, that is, B < ∞.
Most of what we have described thus far holds. We just state the differences
here. In particular, when the buffer is full, that is, X(t) = B, if the drift is pos-
itive, then fluid enters the buffer at rate c and a fraction of fluid is dropped
at rate r(Z(t)) − c. This would result in the X(t) process staying at B until the
drift becomes negative. However, when X(t) < B, the dynamics are identical
FIGURE 9.3
(a) Sample path of environment process and buffer contents when B = ∞ and (b) Sample paths
when B < ∞.
to the infinite size case. The only other difference is that the system is always
stable when B < ∞; hence the stability condition described earlier is irrelevant.
To illustrate the B < ∞ case, we draw a sample path of Z(t) and X(t) for the
same example considered earlier, that is, c = 12 kbps and {Z(t), t ≥ 0} has
ℓ = 4 states, S = {1, 2, 3, 4}, and
    ⎡ −10   2   3   5 ⎤
Q = ⎢   0  −4   1   3 ⎥
    ⎢   1   1  −3   1 ⎥
    ⎣   1   2   3  −6 ⎦
with r(1) = 20, r(2) = 15, r(3) = 10, and r(4) = 5. The sample path is illustrated
in Figure 9.3(b). Notice that for the sake of comparison, the Z(t) sample
paths are identical in Figures 9.3(a) and (b). However, when the X(t) process
reaches B in Figure 9.3(b), it stays flat till the system switches to a negative
drift state.
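As a numerical sanity check of this example (a sketch of ours using numpy, not part of the text), one can recompute p from Q and verify the stability condition:

```python
import numpy as np

# Generator and fluid arrival rates of the numerical example (c = 12 kbps)
Q = np.array([[-10.0, 2.0, 3.0, 5.0],
              [0.0, -4.0, 1.0, 3.0],
              [1.0, 1.0, -3.0, 1.0],
              [1.0, 2.0, 3.0, -6.0]])
r = np.array([20.0, 15.0, 10.0, 5.0])
c = 12.0

A = Q.T.copy()
A[-1, :] = 1.0               # replace one balance equation by sum(p) = 1
b = np.zeros(4)
b[-1] = 1.0
p = np.linalg.solve(A, b)    # approximately [0.0668 0.2647 0.4118 0.2567]

mean_input = p @ r           # steady-state average input rate
stable = mean_input < c      # should hold: 10.7086 < 12
```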
The next step is to analyze the process {X(t), t ≥ 0}. In other words, we
would like to capture the dynamics of X(t) and characterize it by deriving a
probability distribution for X(t) at least as t → ∞. For that it is important to
notice that unlike most of the random variables we have seen thus far, X(t) is
a mixture of discrete and continuous parts. Notice from Figures 9.3(a) and (b)
that X(t) has a mass (i.e., takes on a discrete value) at 0. Also if B < ∞, then
X(t) has a mass at B as well. Everywhere else X(t) takes on continuous values.
Thus we will see that X(t) will have a mixture of discrete and continuous parts
with a mass at 0 and possibly at B (if B < ∞). With that in mind we proceed
with analyzing the dynamics of X(t) next.
We seek to obtain an analytical expression for Fj (t, x) for all j ∈ S. For that we
consider some j ∈ S and derive the following expressions:
Using the definition of Fj(t, x) in Equation 9.3 we can then write down this
equation as

(Fj(t + h, x) − Fj(t, x − (r(j) − c)h))/h = qjj Fj(t, x − (r(j) − c)h) + Σ_{i∈S,i≠j} qij Fi(t, x − (r(i) − c)h) + o(h)/h.
Now we take the limit as h → 0 and the above equation results in the
following partial differential equation:

∂Fj(t, x)/∂t + (r(j) − c) ∂Fj(t, x)/∂x = Σ_{i∈S} qij Fi(t, x).    (9.8)

Define the row vector F(t, x) = [F1(t, x) F2(t, x) . . . Fℓ(t, x)].    (9.9)
Then the vector F(t, x) satisfies the following partial differential equation
∂F(t, x)/∂t + (∂F(t, x)/∂x) D = F(t, x)Q,    (9.10)
where D is the drift matrix. Verify that the jth vector element of the equation
is identical to that of Equation 9.8. Having described the partial differential
equation, the next step is to write down the initial and boundary conditions.
We assume that X(0) = x0 and Z(0) = z0 for some given finite and allowable
x0 and z0. Hence the initial conditions for all j ∈ S are

Fj(0, x) = 1 if j = z0 and x ≥ x0, and Fj(0, x) = 0 otherwise.
The first boundary condition is that, for all j with d(j) > 0, P{X(t) = 0, Z(t) = j} = 0,
which is the same as Fj(t, 0) = 0 since X(t) is nonnegative. Notice that if the drift is
negative at time t (i.e., Z(t) = j and d(j) < 0), there is a nonzero probability of
having X(t) = 0. In other words, X(t) has a mass at zero and when X(t) = 0
the drift is negative.
The second boundary condition is a little more involved. However, it
applies only when the buffer size is finite, that is, B < ∞. Just like how X(t)
has a mass at zero, it would also have a mass at B, that is, P{X(t) = B} would
be non-zero if the drift at time t is positive, that is, d(Z(t)) > 0. Thus as x
approaches B from below, Fj(t, x) governed by the partial differential
equation would not include the mass at B and would truly be representing
P{X(t) < B, Z(t) = j}. However, if the drift is negative, there would be no mass
at B for the X(t) process. Thus if d(j) < 0, P{X(t) < B, Z(t) = j} would be equal
to P{X(t) ≤ B, Z(t) = j} which would just be P{Z(t) = j} since X(t) is bounded
by B. When d(j) > 0 we will have Fj (t, B) + P{X(t) = B, Z(t) = j} = P{Z(t) = j}.
For that reason we do not have the second boundary condition for all j but
only for d(j) < 0. We will revisit this case subsequently through an example.
But it is worthwhile pointing out that this indeed was not an issue when
X(t) = 0 but only when X(t) = B because of the way the CDF is defined as a
right-continuous function.
That said, now we have a partial differential equation for the unknown
vector F(t, x) with initial and boundary conditions. The next step is to solve
it and obtain F(t, x). There are two approaches. One is to use a numer-
ical approach which is effective when numerical values are available for
all the parameters. There are software packages that can solve such partial
differential equations. The second approach is to analytically solve for F(t, x)
that we explain briefly. Let F̃j (w, x) be the LST of Fj (t, x) defined as
F̃j(w, x) = ∫₀^∞ e^{−wt} ∂Fj(t, x)/∂t dt.
Also, the row vector of LSTs is just the LST of the individual elements; hence

wF̃(w, x) − wF(0, x) + (dF̃(w, x)/dx) D = F̃(w, x)Q,

which is an ordinary differential equation in x.
We would like to obtain a closed-form analytical expression for Fj(x) for all
j ∈ S. For that we require the system to be stable if B = ∞, and the condition
of stability is

Σ_{i∈S} pi d(i) < 0.

Since

lim_{t→∞} ∂Fj(t, x)/∂t = 0,

the partial differential equation reduces in steady state to the ordinary
differential equation

(dF(x)/dx) D = F(x)Q,    (9.12)
where F(x) = [F1(x) F2(x) . . . Fℓ(x)] with Fj(x) = lim_{t→∞} Fj(t, x).
Since this is steady-state analysis, initial conditions would not matter. The
boundary conditions reduce to Fj(0) = 0 if d(j) > 0 and, when B < ∞,
Fj(B) = pj if d(j) < 0. To solve the ordinary differential equation

(dF(x)/dx) D = F(x)Q,

we try as solution

F(x) = e^{λx} φ,
where φ is a 1 × ℓ row vector. The solution F(x) = e^{λx} φ works if and only if

φ(λD − Q) = [0 0 . . . 0],

since (dF(x)/dx) D would be φλD e^{λx} and F(x)Q would be φQ e^{λx}. Essentially,
φ is a left eigenvector corresponding to an eigenvalue λ satisfying the
characteristic equation

det(λD − Q) = 0,    (9.14)
where det(A) is the determinant of square matrix A. Upon solving the equa-
tion, we would get the eigenvalues λ. Then using φ(λD − Q) = [0 0 . . . 0]
for each solution λ, we can obtain the corresponding left eigenvectors φ.
Before forging ahead, we describe some properties and notation. We
partition the state space S into three sets, S+, S0, and S−, that denote the
states where the drift is positive, zero, and negative, respectively. Also, ℓ+,
ℓ0, and ℓ− are the number of states with positive, zero, and negative drift,
respectively, such that ℓ+ + ℓ0 + ℓ− = ℓ. Thus we have

S+ = {i ∈ S : d(i) > 0},
S0 = {i ∈ S : d(i) = 0},
S− = {i ∈ S : d(i) < 0},
ℓ+ = |S+|, ℓ0 = |S0|, ℓ− = |S−|.
Using that we can write down some properties. Firstly notice that Equation
9.14 would have ℓ+ + ℓ− solutions {λi, i = 1, 2, . . . , ℓ+ + ℓ−}, which
could include multiplicities. The crucial property is that when
Σ_{i∈S} pi d(i) < 0, exactly ℓ+ of the λi values have negative real parts, one is
zero, and ℓ− − 1 have positive real parts.
For sake of convenience we number the λi’s so that

Re(λ1) ≤ Re(λ2) ≤ · · · ≤ Re(λ_{ℓ+}) < λ_{ℓ++1} = 0 < Re(λ_{ℓ++2}) ≤ · · · ≤ Re(λ_{ℓ++ℓ−}),    (9.15)

where Re(ω) is the real part of a complex number ω. Using this specific
order we are now ready to state the general solution to the differential
equation (9.12) as

F(x) = Σ_{i=1}^{ℓ++ℓ−} ai e^{λi x} φi,    (9.16)
where ai values are some constants that need to be obtained (recall that we
know how to compute λi and φi for all i).
To compute ai values, we explicitly consider two cases depending on
whether the size of the buffer is infinite or finite. Hence we have the
following:
• If B = ∞ with Σ_{i∈S} pi d(i) < 0, then ai = 0 for all i > ℓ+ + 1 (so that
F(x) remains bounded), and the remaining ai values are given by the
solution to

a_{ℓ++1} = 1/(φ_{ℓ++1} 1),    (9.18)

Σ_{i=1}^{ℓ++1} ai φi(j) = 0 if j ∈ S+,    (9.19)

where 1 is an ℓ × 1 column vector of ones.
• If B < ∞, then the ai values are given by the solution to

Σ_{i=1}^{ℓ++ℓ−} ai φi(j) = 0 if j ∈ S+,    (9.20)

Σ_{i=1}^{ℓ++ℓ−} ai φi(j) e^{λi B} = pj if j ∈ S−,    (9.21)
where φi(j) is the jth element of vector φi. Equation 9.20 is due to the
boundary condition Fj(0) = 0 if d(j) > 0 for all j ∈ S, which is equivalent
to Σ_{i=1}^{ℓ++ℓ−} ai φi(j) = 0 if j ∈ S+ since the elements in S+ are all
those with positive drift, that is, d(j) > 0. Likewise, Equation 9.21 is
due to the boundary condition Fj(B) = pj if d(j) < 0 for all j ∈ S, which
is equivalent to Σ_{i=1}^{ℓ++ℓ−} ai φi(j)e^{λi B} = pj if j ∈ S− since the elements in
S− are all those with negative drift, that is, d(j) < 0.
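The eigenvalues λi and left eigenvectors φi can be computed numerically, since φ(λD − Q) = [0 0 . . . 0] is equivalent to the generalized eigenproblem φQ = λφD. A sketch of ours for the four-state example of Section 9.2.1, using scipy:

```python
import numpy as np
from scipy.linalg import eig

Q = np.array([[-10.0, 2.0, 3.0, 5.0],
              [0.0, -4.0, 1.0, 3.0],
              [1.0, 1.0, -3.0, 1.0],
              [1.0, 2.0, 3.0, -6.0]])
D = np.diag([8.0, 3.0, -2.0, -7.0])   # drift matrix for c = 12

# phi (lambda D - Q) = 0  <=>  Q^T w = lambda D^T w  with  phi = w^T
lam, W = eig(Q.T, D.T)
order = np.argsort(lam.real)          # order as in Equation 9.15
lam = lam[order].real                 # eigenvalues are real in this example
phi = W[:, order].T.real              # rows are the left eigenvectors phi_i
```

With ℓ+ = 2 and ℓ− = 2 here, two eigenvalues have negative real parts, one is zero, and ℓ− − 1 = 1 is positive, as the theory above predicts.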
9.2.4 Examples
In this section, we present a few examples to illustrate the approach to obtain
buffer content distribution and describe relevant insights. For that we require
characteristics of the environment process, namely, the generator matrix Q
and rate matrix R as well as buffer characteristics such as size B and output
capacity c. Then using Q, R, B, and c we can obtain the joint distribution
Fj (x) as well as the marginal limiting distribution of X(t). The notation, ter-
minology, and methodology used here are described in Sections 9.2.1, 9.2.2,
and 9.2.3. All but the last example are steady-state analyses, and they are all
presented in a problem-solution format.
Problem 85
Consider the example described in Section 9.2.1 where there is an infinite-sized
buffer with output capacity c = 12 kbps and the input is driven by an
environment CTMC {Z(t), t ≥ 0} with ℓ = 4 states, S = {1, 2, 3, 4}, and
    ⎡ −10   2   3   5 ⎤
Q = ⎢   0  −4   1   3 ⎥
    ⎢   1   1  −3   1 ⎥
    ⎣   1   2   3  −6 ⎦
and fluid arrival rates in states 1, 2, 3, and 4 are 20, 15, 10, and 5 kbps,
respectively. Obtain the joint distribution vector F(x) as well as a graph of
the CDF limt→∞ P{X(t) ≤ x} versus x.
Solution
For this problem we have Q described earlier,
    ⎡ 20   0   0  0 ⎤
R = ⎢  0  15   0  0 ⎥ ,
    ⎢  0   0  10  0 ⎥
    ⎣  0   0   0  5 ⎦

and the drift matrix D = R − cI is

    ⎡ 8  0   0   0 ⎤
D = ⎢ 0  3   0   0 ⎥ .
    ⎢ 0  0  −2   0 ⎥
    ⎣ 0  0   0  −7 ⎦

We solve for λ in the characteristic equation

det(λD − Q) = 0,
to obtain λ1 = −1.3715, λ2 = −0.5993, λ3 = 0, and λ4 = 1.7472. Notice that
the λi values are ordered according to Equation 9.15. Then
using φ(λD − Q) = [0 0 0 0] for each solution λ, we can obtain
the corresponding left eigenvectors φ1 = [−0.2297 0.9600 0.1087 0.1179],
φ2 = [0.1746 0.7328 0.5533 0.3555], φ3 = [0.1201 0.4754 0.7396 0.4610], and
φ4 = [−0.0317 − 0.0660 − 0.9741 0.2138].
Thereby using Equation 9.16 we can write down F(x) as

F(x) = a1 e^{λ1 x} φ1 + a2 e^{λ2 x} φ2 + a3 φ3,

with a4 = 0 since λ4 > 0 and B = ∞. The resulting CDF lim_{t→∞} P{X(t) ≤ x}
is plotted in Figure 9.4.
FIGURE 9.4
Graph of P{X ≤ x} vs. x for the infinite buffer case (Problem 85).
To present the similarities and differences between the cases when the
buffer size is infinite versus finite, in the next problem we consider the exact
same numerical values as the previous problem, except for the size of the
buffer. That is described next.
Problem 86
Consider Problem 85 with the only exception that the buffer size is finite
with B = 2. Obtain the joint distribution vector F(x) for 0 ≤ x < B as well as
the distribution for X(t) as t → ∞.
Solution
Recall that the analysis in Section 9.2.3 does not make any assumptions about
B until obtaining the constants ai. Thus from the solution to Problem 85 we
have, for 0 ≤ x < B, the same expression for F(x) with the same eigenvalues λi
and eigenvectors φi as before. All we need to compute are a1, a2, a3, and a4.
For that we use Equations 9.20 and 9.21.
From Equation 9.20 we get a1φ1(1) + a2φ2(1) + a3φ3(1) + a4φ4(1) =
−0.2297a1 + 0.1746a2 + 0.1201a3 − 0.0317a4 = 0 and a1φ1(2) + a2φ2(2) + a3φ3(2) + a4φ4(2) = 0.96a1 + 0.7328a2 + 0.4754a3 − 0.0660a4 = 0. Likewise, from
Equation 9.21 we get a1φ1(3)e^{λ1 B} + a2φ2(3)e^{λ2 B} + a3φ3(3)e^{λ3 B} + a4φ4(3)e^{λ4 B} =
0.0070a1 + 0.1669a2 + 0.7396a3 − 32.0307a4 = p3 and a1φ1(4)e^{λ1 B} + a2φ2(4)e^{λ2 B}
+ a3φ3(4)e^{λ3 B} + a4φ4(4)e^{λ4 B} = 0.0076a1 + 0.1072a2 + 0.4610a3 + 7.0293a4 = p4.
Using the fact that p3 = 0.4118 and p4 = 0.2567, these four equations can be
solved to obtain a1 = 0.0097, a2 = −0.4397, a3 = 0.6581, and a4 = 0.000050924.
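These four equations form a small linear system in (a1, a2, a3, a4); a quick numerical sketch of ours, using the rounded coefficients quoted above:

```python
import numpy as np

# Rows 1-2: Equation 9.20 for j = 1, 2; rows 3-4: Equation 9.21 for j = 3, 4
M = np.array([[-0.2297, 0.1746, 0.1201, -0.0317],
              [0.9600, 0.7328, 0.4754, -0.0660],
              [0.0070, 0.1669, 0.7396, -32.0307],
              [0.0076, 0.1072, 0.4610, 7.0293]])
rhs = np.array([0.0, 0.0, 0.4118, 0.2567])   # [0, 0, p3, p4]
a = np.linalg.solve(M, rhs)
# a is close to (0.0097, -0.4397, 0.6581, 0.000051)
```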
Next, using the notation X(t) → X as t → ∞, we have the distribution of X,
whose CDF P(X ≤ x) is plotted in Figure 9.5.
Problem 87
(CTMC on–off source) Consider a source that inputs fluid into an infinite-
size buffer. The source toggles between on and off states. The on times are
according to exp(α) and off times according to exp(β). Traffic is generated
at rate r when the source is in the on-state and no traffic is generated when
the source is in the off-state. Assume that r > c, where c is the usual output
FIGURE 9.5
Graph of P{X ≤ x} vs. x for the finite buffer case of Problem 86.
capacity. Find the condition of stability. Assuming that the stability condi-
tion is satisfied what is the steady-state distribution of the buffer contents in
terms of r, c, α, and β?
Solution
The environment process {Z(t), t ≥ 0} is a CTMC with ℓ = 2 states and
S = {1, 2}, with 1 representing off and 2 representing the on-state. Therefore,

Q = ⎡ −β   β ⎤   and   R = ⎡ 0  0 ⎤ .
    ⎣  α  −α ⎦             ⎣ 0  r ⎦
Using Equation 9.2, the stability condition is rβ/(α + β) < c.
State 1 has negative drift and state 2 has positive drift. Hence by solving
for Equation 9.14 we would get one λ value with negative real part and one
that is zero. To find them, we solve the characteristic equation

det(λD − Q) = 0,

which yields

(−λc + β)(λr − λc + α) − αβ = 0,

whose solutions are λ1 = β/c − α/(r − c) and λ2 = 0.
From the stability condition rβ/(α + β) < c we have rβ − c(α + β) < 0 and
dividing that by the positive quantity c(r − c), we get β/c − α/(r − c) < 0.
Hence λ1 < 0. Thus we have verified that one λ value has negative real part
and the other one is zero.
Next, using φ(λD − Q) = [0 0] for each λ, we can obtain the correspond-
ing left eigenvectors as φ1 = [(r − c)/c 1] and φ2 = [α/(α + β) β/(α + β)].
Thereby using Equation 9.16 we can write down F(x) as F(x) = a1 e^{λ1 x} φ1 + a2 e^{λ2 x} φ2 = a1 e^{λ1 x} φ1 + a2 φ2.
All we need to compute are a1 and a2. For that we use Equations 9.18 and
9.19. From Equation 9.18, a2 = 1/(φ2 1) = 1. Also, from Equation 9.19 we get
a1 φ1(2) + a2 φ2(2) = 0, since that corresponds to the state with a positive drift,
and that results in a1 = −β/(α + β). Hence we have

F(x) = [F1(x) F2(x)] = [ (αc − (r − c)β e^{λ1 x})/(c(α + β))   (β/(α + β))(1 − e^{λ1 x}) ].
lim_{t→∞} P{X(t) ≤ x} = F1(x) + F2(x) = 1 − (βr/(c(α + β))) e^{λ1 x},    (9.22)
where λ1 = β/c − α/(r − c). Hence

lim_{t→∞} P{X(t) > x} = (βr/(c(α + β))) e^{λ1 x}.    (9.24)
In particular,

lim_{t→∞} P{X(t) > 0} = βr/(c(α + β)),
which makes sense since in a cycle of one busy period and one idle period,
on average a quantity proportional to rβ/(α + β) fluid arrives, and that fluid
is depleted during a busy period at rate c. Thus the ratio of the mean busy
period to the mean cycle length must equal rβ/[c(α + β)], which is also the
expression for the fraction of time there is a non-zero amount of fluid in the buffer.
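Equation 9.24 translates directly into code. A sketch of ours (the parameter values below are an arbitrary illustration):

```python
import math

def onoff_tail(x, r, c, alpha, beta):
    """Steady-state P{X > x} for a CTMC on-off fluid source (Equation 9.24)."""
    assert r * beta / (alpha + beta) < c, "stability condition violated"
    lam1 = beta / c - alpha / (r - c)   # negative under stability
    return (beta * r) / (c * (alpha + beta)) * math.exp(lam1 * x)

# Illustration: r = 2, c = 1.5, alpha = beta = 1 (stable: 1 < 1.5)
p_busy = onoff_tail(0.0, 2.0, 1.5, 1.0, 1.0)  # = beta r / (c (alpha + beta)) = 2/3
```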
Problem 88
Consider an on–off source that generates fluid into a buffer of infinite size
with output capacity 8 units per second. The on times are IID random variables
with CDF U(t) = 1 − 0.6e^{−3t} − 0.4e^{−2t} and the off times are IID Erlang
random variables with mean 0.5 and variance 1/12 in appropriate time units
compatible with the on times. When the source is on, fluid is generated at
rate 16 units per second and no fluid is generated when the source is off.
Compute the probability that there would be more than 10 units of fluid in
the buffer in steady state.
Solution
Notice that the on times correspond to a two-phase hyperexponential distri-
bution. So the on time would be exp(3) with probability 0.6 and it would be
exp(2) with probability 0.4, which can be deduced from U(t). The off times
correspond to the sum of three IID exp(6) random variables. Thus we can
write down the environment process {Z(t), t ≥ 0} as an ℓ = 5 state CTMC with
states 1 and 2 corresponding to on and states 3, 4, and 5 corresponding to off.
Thus the Q matrix corresponding to S = {1, 2, 3, 4, 5} is
    ⎡ −3    0   3   0   0 ⎤
    ⎢  0   −2   2   0   0 ⎥
Q = ⎢  0    0  −6   6   0 ⎥ .
    ⎢  0    0   0  −6   6 ⎥
    ⎣ 3.6  2.4  0   0  −6 ⎦
Using that and the fact that c = 8, we have the drift matrix
    ⎡ 8  0   0   0   0 ⎤
    ⎢ 0  8   0   0   0 ⎥
D = ⎢ 0  0  −8   0   0 ⎥ .
    ⎢ 0  0   0  −8   0 ⎥
    ⎣ 0  0   0   0  −8 ⎦
Since B = ∞, we need to first check if the buffer is stable. For that we obtain
[p1 p2 p3 p4 p5] = [0.2222 0.2222 0.1852 0.1852 0.1852]. We have

Σ_{i=1}^5 Dii pi = −0.8889 < 0,

hence the system is stable.
The rest of the analysis proceeds very similar to Problem 85 with the only
exception being the final expression to compute, which here is the probabil-
ity that there would be more than 10 units of fluid in the buffer in steady
state. Nevertheless for the sake of completion we go through the entire pro-
cess. Notice that S+ = {1, 2} since states 1 and 2 have positive drift. Likewise
S− = {3, 4, 5} since states 3, 4, and 5 have negative drift. Also since there are
no zero-drift states, S0 is a null set. Also ℓ+ = 2 and ℓ− = 3. Thus by solving
Equation 9.14 we would get two λ values with negative real parts, one λ
value would be zero, and two with positive real parts. We solve for λ in the
characteristic equation
det(λD − Q) = 0,
to obtain the λi values, the corresponding eigenvectors φi, and the coefficients
ai, which yield the limiting CDF

lim_{t→∞} P{X(t) ≤ x} = 1 − 0.0235e^{−0.3227x} − 0.8654e^{−0.0836x}
for all x ≥ 0. Thus the probability that there would be more than 10 units of
fluid in the buffer in steady state is P(X > 10) = 0.0235e^{−3.227} + 0.8654e^{−0.836} = 0.3761.
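The numbers in this solution can be verified mechanically; the sketch below (ours) recomputes p and the drift sum from Q and evaluates the tail probability from the CDF just obtained:

```python
import numpy as np

Q = np.array([[-3.0, 0.0, 3.0, 0.0, 0.0],
              [0.0, -2.0, 2.0, 0.0, 0.0],
              [0.0, 0.0, -6.0, 6.0, 0.0],
              [0.0, 0.0, 0.0, -6.0, 6.0],
              [3.6, 2.4, 0.0, 0.0, -6.0]])
d = np.array([8.0, 8.0, -8.0, -8.0, -8.0])  # drifts r(i) - c with c = 8

A = Q.T.copy()
A[-1, :] = 1.0                # normalization row
b = np.zeros(5)
b[-1] = 1.0
p = np.linalg.solve(A, b)     # about [0.2222 0.2222 0.1852 0.1852 0.1852]

drift_sum = p @ d             # about -0.8889 < 0, hence stable
tail10 = 0.0235 * np.exp(-0.3227 * 10) + 0.8654 * np.exp(-0.0836 * 10)
```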
FIGURE 9.6
Graph of P{X > x} vs. x for the infinite buffer case of Problem 88.
Problem 89
The speed of a particular vehicle on a highway is modulated by a five-state
CTMC {Z(t), t ≥ 0} with S = {1, 2, 3, 4, 5}. When the CTMC is in state i, the
speed of the vehicle is Vi = 75/i miles per hour for i ∈ S. The infinitesimal
generator matrix of the CTMC is
    ⎡ −919.75   206.91    264.85    238.67   209.32 ⎤
    ⎢  223.01  −971.71    301.98    232.73   213.98 ⎥
Q = ⎢  343.04   277.78  −1283.57    392.72   270.03 ⎥
    ⎢  353.91   232.27    213.69  −1059.47   259.59 ⎥
    ⎣  370.92   200.89    216.80    225.60 −1014.21 ⎦
in units of h^{−1}. Assume that the CTMC is in state 1 at time 0. Obtain a method
to compute the CDF of the amount of time it would take for the vehicle to
travel 1 mile. Also numerically compute for sample values of t the proba-
bility that the vehicle would travel one mile before time t. Verify the results
using simulations.
Solution
Let T(x) be the random time required for the vehicle to travel a distance
x miles. We are interested in P{T(x) ≤ t} for x = 1; however, we provide
an approach for a generic x. Now let X(t) be the distance the vehicle trav-
eled in time t. A crucial observation needs to be made which is that the
events {T(x) ≥ t} and {X(t) ≤ x} are equivalent. In other words, the event
that a vehicle travels in time t a distance less than or equal to x is the
same as saying that the time taken to reach distance x is greater than or
equal to t. Therefore, P{T(x) ≥ t} = P{X(t) ≤ x} and hence the CDF of T(x) is
P{T(x) ≤ t} = 1 − P{X(t) ≤ x}.
Next, we will show a procedure to compute P{X(t) ≤ x}, and thereby
obtain P{T(x) ≥ t}. Notice that, X(t) can be thought of as the amount of fluid
in a buffer at time t with X(0) = 0 modulated by an environment process
{Z(t), t ≥ 0} with Z(0) = 1. The buffer size is infinite (B = ∞) and the output
capacity c = 0. Fluid flows in at rate r(Z(t)) = VZ(t) = 75/Z(t) at time t. Essen-
tially the amount of fluid at time t corresponds to the distance traveled by
the vehicle at time t.
Now define the joint probability distribution Fj(t, x) = P{X(t) ≤ x, Z(t) = j},
which is identical to that of Equation 9.3. If we know Fj(t, x), then we can
immediately obtain the required P{T(x) ≤ t} using

P{T(x) ≤ t} = 1 − P{X(t) ≤ x} = 1 − Σ_{i∈S} Fi(t, x).
Using the representation in Equation 9.9 we define the row vector F(t, x) = [F1(t, x) F2(t, x) . . . F5(t, x)].
Based on Equation 9.10 we know that the vector F(t, x) satisfies the partial
differential equation
∂F(t, x)/∂t + (∂F(t, x)/∂x) D = F(t, x)Q,
Note that D is the drift matrix (essentially diagonal matrix of r(i) − c values
with r(i) = 75/i and c = 0).
Let F*_i(s2, x) be the Laplace transform of Fi(t, x) with respect to t, that is,

F*_i(s2, x) = ∫₀^∞ e^{−s2 t} Fi(t, x) dt,

and let F̃*_i(s2, s1) be its LST with respect to x, that is,

F̃*_i(s2, s1) = ∫₀^∞ e^{−s1 x} dF*_i(s2, x).

Writing in matrix form, we have F̃*(s2, s1) = [F̃*_i(s2, s1)]_{i∈S} as the 1 × 5 row
vector of transforms. Likewise, F*(s2, x) = [F*_i(s2, x)]_{i∈S} is the row vector of
Laplace transforms of Fi(t, x) with respect to t.
To solve the partial differential equation, we take the LT and then the
LST of the partial differential equation to get an expression in the transform
space
where Ã(s1) is a 1 × 5 row vector of the LSTs of the initial condition. There
are several software packages that can be used to numerically invert this
transform. An additional complication is the 2D nature of the transform. Readers
are referred to Kharoufeh and Gautam [62] for a numerical inversion algorithm as well
as a list of references for different inversion techniques. Using x = 1 and the
initial condition X(0) = 0 and Z(0) = 1, giving rise to

Ã(s1) = [1 0 0 0 0],

we numerically invert the transform to obtain the CDF values in Table 9.2.
TABLE 9.2
Travel Time CDF to Traverse x = 1
Mile for Sample t Values
t P{T(x) ≤ t} P{T(x) ≤ t}
(min) Inversion Simulation
1.25 0.0786 0.0777
1.47 0.3335 0.3352
1.70 0.6859 0.6865
1.92 0.9141 0.9136
2.14 0.9873 0.9872
2.37 0.9991 0.9991
2.59 1.0000 0.9999
2.81 1.0000 1.0000
Source: Kharoufeh, J.P. and Gautam, N.,
Transp. Sci., 38(1), 97, 2004. With
permission.
Thus T denotes the time it takes for the buffer content to reach a or b for
the first time. Our objective is to obtain a distribution for T as a CDF (i.e.,
P{T ≤ t}) or its LST (i.e., E[e^{−wT}]). For that let

Hij(x, t) = P{T ≤ t, Z(T) = j | X(0) = x, Z(0) = i}.

Conditioning on the environment over a small interval of length h, we have

Hij(x, t + h) = (1 + qii h) Hij(x + h(r(i) − c), t) + Σ_{k∈S,k≠i} qik h Hkj(x + h(r(i) − c), t) + o(h),
where o(h) represents higher order terms of h. Subtracting Hij(x, t) on both
sides, dividing by h, and rearranging terms, then taking the limit as h → 0
(so that o(h)/h → 0), we have
∂H(x, t)/∂t − D ∂H(x, t)/∂x = QH(x, t).    (9.26)
The first boundary condition (i.e., Equation 9.27) is so because if the initial
buffer content is b and source is in state j (assuming r(j) > c), then essentially
the first passage time has occurred. Therefore, the probability that the first
passage time occurs before time t and the source is in state j when it occurred
is 1. The second boundary condition (Equation 9.28) is based on the fact that
if the initial buffer content is a and the source is in state j such that r(j) < c,
then the first passage time is zero. Hence, the probability that the first pas-
sage time happens before time t and the source is in state j when it occurred
is 1. The third boundary condition (i.e., Equation 9.29) is due to the fact that
although the first passage time is zero, the probability that the source is in state
j when the first passage time occurs is zero (since at time t = 0 the source is in
state i with r(i) > c and i ≠ j). For exactly the same reason, the last boundary
condition (Equation 9.30) is the way it is, that is, the first passage time is zero
but it cannot occur when the source is in state j, given that the source was in
state i at time t = 0 with r(i) < c and i ≠ j.
Next we solve the partial differential equation (PDE), that is, Equation 9.26. First we take the LST across the PDE with respect to t. That reduces to the following ordinary differential equation (ODE):

D dH̃(x, w)/dx = (wI − Q)H̃(x, w)   (9.31)
where H̃(x, w) is the LST of H(x, t) with respect to t and that in turn equals
the LST of each element of H(x, t). Not only is the ODE easier to solve, but we
can also immediately obtain the LST of the CDF of the first passage time T.
We first solve the ODE. For that let S1(w), . . . , Sℓ(w), where ℓ = |S| is the number of environment states, be the scalar solutions to the characteristic equation

det(DS(w) − wI + Q) = 0.
Stochastic Fluid-Flow Queues: Characteristics and Exact Analysis 553
For each Sj(w) we can find column vectors φj(w) that satisfy

(DSj(w) − wI + Q)φj(w) = 0.

Thus given w, the Sj(w) values are eigenvalues and φ1(w), . . . , φℓ(w) are
the corresponding right eigenvectors. Using those we can write down the
solution to Equation 9.31.
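In practice, since det(DS − wI + Q) = 0 means S is an eigenvalue of D⁻¹(wI − Q) whenever D is invertible, the Sj(w) and φj(w) can be obtained with a standard eigensolver. A minimal sketch (hypothetical two-state on–off parameters α, β, r, c; numpy assumed available):

```python
import numpy as np

# Hypothetical exponential on-off source: on->off rate alpha, off->on rate
# beta, fluid input rate r when on, output capacity c, transform variable w
alpha, beta, r, c, w = 1.0, 2.0, 3.0, 2.0, 0.5

Q = np.array([[-beta, beta],
              [alpha, -alpha]])   # generator of the environment CTMC
D = np.diag([-c, r - c])          # drift matrix: diag(r(i) - c)

# det(D S - w I + Q) = 0  <=>  S is an eigenvalue of D^{-1}(w I - Q)
S, Phi = np.linalg.eig(np.linalg.inv(D) @ (w * np.eye(2) - Q))
# S holds S_1(w), S_2(w); the columns of Phi are the right eigenvectors
# phi_1(w), phi_2(w) (up to scaling)
print(np.sort(S.real))
```

For the two-state source the same roots also come out of the quadratic characteristic equation in closed form (see Problem 94), which makes the eigensolver output easy to cross-check.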
The solution to this ODE is given by

H̃·,j(x, w) = a1,j(w)e^{S1(w)x}φ1(w) + a2,j(w)e^{S2(w)x}φ2(w) + · · · + aℓ,j(w)e^{Sℓ(w)x}φℓ(w),   (9.32)

where ai,j(w) values are constants to be determined and H̃·,j(x, w) is the column
vector

H̃·,j(x, w) = (H̃1j(x, w), H̃2j(x, w), . . . , H̃ℓj(x, w))ᵀ.
We can obtain the ℓ² ai,j(w) values using the ℓ² equations corresponding to
the LSTs of the boundary condition equations (9.27) through (9.30), which for
all i ∈ S and j ∈ S are

H̃ij(b, w) = 1 if i = j and r(i) > c,   H̃ij(a, w) = 1 if i = j and r(i) < c,
H̃ij(b, w) = 0 if i ≠ j and r(i) > c,   H̃ij(a, w) = 0 if i ≠ j and r(i) < c.
Thereby, using Equation 9.32 we can write down the LST of the distribution of T. In particular, given X(0) = x0 and Z(0) = z0, the LST of the first passage time distribution can be computed as

E[e^{−wT}] = Σ_{j=1}^{ℓ} H̃z0,j(x0, w).   (9.33)
Although in most instances this equation cannot be inverted to get the CDF
of T, one can quickly get moments of T. Specifically for r = 1, 2, 3, . . .,

E[T^r] = (−1)^r (d^r/dw^r) E[e^{−wT}]

at w = 0. We can also obtain the probability that the first passage time ends
in state j. In particular,

P{Z(T) = j | X(0) = x0, Z(0) = z0} = Hz0,j(x0, ∞) = H̃z0,j(x0, 0),

since the CDF in the limit t → ∞ is equivalent to its LST in the limit w → 0.
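Numerical differentiation of the LST at w = 0 is exactly how the problems below extract E[T]. A tiny sketch on a case where the answer is known in closed form (T ~ exp(λ), so E[e^{−wT}] = λ/(λ + w) and E[T] = 1/λ; λ here is an illustrative value):

```python
lam = 2.0

def lst(w):
    # E[exp(-w T)] for T ~ exp(lam)
    return lam / (lam + w)

# First moment via a forward difference of the LST at w = 0
h = 1e-7
mean_T = -(lst(h) - lst(0.0)) / h
print(mean_T)  # close to E[T] = 1/lam = 0.5
```

The same forward difference, applied to H̃ instead of a closed-form LST, is what Problems 90 through 92 use.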
In the next section we present some examples to illustrate the approach. We
now present two remarks for the cases when a = b ≤ x and a = b ≥ x.
Remark 22
since the first passage time would be zero if we started at a in a state with negative drift. However, we cannot solve for all ai,j(w) values in Equation 9.32 using the given boundary conditions alone, as there are fewer equations than unknowns. For that we use some additional conditions, including ai,j(w) = 0 if Si(w) > 0, since as x → ∞ we require Hij(x, t) to remain a (bounded) joint probability distribution. In addition, it is worthwhile noting that since a first passage time can only end in a state with negative drift,

H̃ij(x, w) = 0 whenever r(j) > c,

for any x.
Remark 23
For the case a = b ≥ x as well the analysis is exactly the same as the case
a ≤ x ≤ b, especially the definition of T in Equation 9.25 (again the “or” being
redundant) and the PDE in Equation 9.26. The only exception is that the
boundary conditions would now be
since the first passage time would be zero if we started at b in a state with
positive drift. However, we cannot solve for all ai,j (w) values in Equation 9.32
using the given boundary conditions as there are not enough equations as
unknowns. For that notice that if the fluid level reached zero in state i (for
that r(i) must be less than c) then it stays at zero till the environment process
changes state to some state k = i. Thus the first passage time is equal to the
stay time in state i plus the remaining time from k till the first passage time
starting in k. By conditioning on k and unconditioning, we can write down
in LST format for all i ∈ S such that r(i) < c
H̃ij(0, w) = Σ_{k∈S, k≠i} H̃kj(0, w) (qik/(−qii)) (qik/(qik + w)).
In addition, it is worthwhile noting that since a first passage time can only
end in a state with positive drift,

H̃ij(x, w) = 0 whenever r(j) < c,

for any x.
9.3.2 Examples
To explain some of the nuances described in the previous section on first
passage times, we consider a few examples here. They are presented in a
problem–solution format.
Problem 90
Consider a reservoir from which water is emptied out at a constant rate
of 10 units per day. Water flows into the reservoir according to a CTMC
H̃ij(x, w) = ∫₀^∞ e^{−wt} (∂Hij(x, t)/∂t) dt.
To compute the expected number of days from t = 0 for the water level to
become excessive or concerning, all we need is

E[T|X(0) = 30, Z(0) = 2] = (−1) (d/dw) Σ_{j=1}^{5} H̃2j(30, w)

at w = 0. For that we can compute (d/dw)H̃ij(x, w) at w = 0 by taking a very small
h > 0 and obtaining (H̃ij(x, h) − H̃ij(x, 0))/h. Before explaining how to obtain that, consider the other question, that is, the probability that at the end of a nominal
spell the water level would be excessive. In other words, we need

P{Z(T) ∈ {3, 4, 5}|X(0) = 30, Z(0) = 2} = Σ_{j=3}^{5} H2j(30, ∞) = Σ_{j=3}^{5} H̃2j(30, 0).
Thus if we know for all i and j the values of H̃ij (x, h) for some small h
and H̃ij (x, 0), we can immediately compute both E[T|X(0) = 30, Z(0) = 2] and
P{Z(T) ∈ {3, 4, 5}|X(0) = 30, Z(0) = 2}.
To compute H̃ij (x, h) and H̃ij (x, 0), we can write down from Equation 9.32,
for j = 1, 2, 3, 4, 5,
(H̃1j(x, w), H̃2j(x, w), H̃3j(x, w), H̃4j(x, w), H̃5j(x, w))ᵀ
   = a1,j(w)e^{S1(w)x}φ1(w) + a2,j(w)e^{S2(w)x}φ2(w) + · · · + a5,j(w)e^{S5(w)x}φ5(w),   (9.34)
where ai,j (w), Sj (w), and φj (w) values need to be determined especially for
w = 0 and w = h for some small h.
We can obtain S1 (w), . . . , S5 (w) as the five scalar solutions to the charac-
teristic equation
det(DS(w) − wI + Q) = 0.
Thus we get
φ1(0) = (0.3067, −0.9393, −0.1234, −0.0783, −0.0479)ᵀ,
φ2(0) = (−0.0675, −0.0590, −0.9832, 0.1425, 0.0705)ᵀ,
φ3(0) = (−0.4472, −0.4472, −0.4472, −0.4472, −0.4472)ᵀ,
φ4(0) = (0.4064, 0.4159, 0.4359, 0.4774, 0.4939)ᵀ, and
φ5(0) = (−0.0801, −0.0637, −0.1184, −0.8042, 0.5733)ᵀ.
Here, the values of φj(h) for j = 1, 2, 3, 4, 5 and small h are so close to the
respective φj(0) values that they are not reported.
To obtain the ai,j (w) values for i = 1, 2, 3, 4, 5, j = 1, 2, 3, 4, 5, and for two
sets namely w = 0 and w = h, we solve the following 25 boundary condition
equations for each w:
[ a1,1(0)  a2,1(0)  a3,1(0)  a4,1(0)  a5,1(0) ]     [ −2416.2    0.1129 × 10⁻⁹   −4.5941   −2.3770    0.0541 × 10⁻³ ]
[ a1,2(0)  a2,2(0)  a3,2(0)  a4,2(0)  a5,2(0) ]     [  2516.0    0.0369 × 10⁻⁹   −1.5004   −0.7763    0.0177 × 10⁻³ ]
[ a1,3(0)  a2,3(0)  a3,3(0)  a4,3(0)  a5,3(0) ]  =  [    −9.2   −0.5470 × 10⁻⁹    0.3563    0.2910   −0.0490 × 10⁻³ ]
[ a1,4(0)  a2,4(0)  a3,4(0)  a4,4(0)  a5,4(0) ]     [   −36.2    0.2120 × 10⁻⁹    1.4369    1.1714   −0.6466 × 10⁻³ ]
[ a1,5(0)  a2,5(0)  a3,5(0)  a4,5(0)  a5,5(0) ]     [   −54.4    0.1851 × 10⁻⁹    2.0652    1.6909    0.6238 × 10⁻³ ]
[ a1,1(h)  a2,1(h)  a3,1(h)  a4,1(h)  a5,1(h) ]     [ −2416.2    0.1129 × 10⁻⁹   −4.5929   −2.3757    0.0541 × 10⁻³ ]
[ a1,2(h)  a2,2(h)  a3,2(h)  a4,2(h)  a5,2(h) ]     [  2516.0    0.0369 × 10⁻⁹   −1.5000   −0.7759    0.0177 × 10⁻³ ]
[ a1,3(h)  a2,3(h)  a3,3(h)  a4,3(h)  a5,3(h) ]  =  [    −9.2   −0.5470 × 10⁻⁹    0.3562    0.2909   −0.0490 × 10⁻³ ]
[ a1,4(h)  a2,4(h)  a3,4(h)  a4,4(h)  a5,4(h) ]     [   −36.2    0.2120 × 10⁻⁹    1.4364    1.1709   −0.6466 × 10⁻³ ]
[ a1,5(h)  a2,5(h)  a3,5(h)  a4,5(h)  a5,5(h) ]     [   −54.4    0.1851 × 10⁻⁹    2.0645    1.6901    0.6238 × 10⁻³ ]
The values of H̃ij (30, h) for some small h do not differ in the first four sig-
nificant digits from the corresponding H̃ij (30, 0) values for i = 1, 2, 3, 4, 5 and
j = 1, 2, 3, 4, 5.
Thus we have the expected number of days from t = 0 (with initial water
level X(0) = 30) for the water level to become excessive or concerning as

E[T|X(0) = 30, Z(0) = 2] = (−1) (d/dw) Σ_{j=1}^{5} H̃2j(30, w)|w=0
                         = − lim_{h→0} Σ_{j=1}^{5} (H̃2j(30, h) − H̃2j(30, 0))/h ≈ 5.4165.

Likewise, the probability that at the end of the nominal spell the water level
would be excessive is

P{Z(T) ∈ {3, 4, 5}|X(0) = 30, Z(0) = 2} = Σ_{j=3}^{5} H2j(30, ∞) = Σ_{j=3}^{5} H̃2j(30, 0) = 0.306.
Next we consider a problem that builds on the previous problem but
uses the conditions in Remark 22. The objective is to provide a contrast
with the previous problem under a similar setting.
Problem 91
Consider the setting in Problem 90 where the first passage time ends with
the water level becoming excessive and the environment in one of the three
positive drift states 3, 4, or 5 with probabilities 0.0945, 0.3911, or 0.5144,
respectively (these are the probabilities that the first passage time would
end in states 3, 4, or 5 given that it ended with water level becoming excessive). Compute how long the water level will stay excessive before becoming nominal.
Solution
We let t = 0 be the time when the water level just crossed over from
nominal to excessive. Using the same notation as in Problem 90 for X(t)
and Z(t), we have X(0) = 40, P{Z(0) = 3} = 0.0945, P{Z(0) = 4} = 0.3911, and
P{Z(0) = 5} = 0.5144. Let T be the time when the water level crosses back to
becoming nominal. Then all we need is
E[T] = (−1) (d/dw) Σ_{i=3}^{5} Σ_{j=1}^{5} H̃ij(40, w) P{Z(0) = i}

at w = 0. To compute (d/dw)H̃ij(x, w) at w = 0, here too we consider a very small
h > 0 and obtain it approximately as (H̃ij(x, h) − H̃ij(x, 0))/h. Now, to evaluate H̃ij(x, h)
and H̃ij (x, 0), we can write down from Equation 9.32, for j = 1, 2, 3, 4, 5,
(H̃1j(x, w), H̃2j(x, w), H̃3j(x, w), H̃4j(x, w), H̃5j(x, w))ᵀ
   = a1,j(w)e^{S1(w)x}φ1(w) + a2,j(w)e^{S2(w)x}φ2(w) + · · · + a5,j(w)e^{S5(w)x}φ5(w),   (9.35)
where ai,j (w), Sj (w), and φj (w) values need to be determined for w = 0 and
w = h for some small h.
We can obtain Sj (w) for j = 1, 2, 3, 4, 5 as the scalar solutions to the
characteristic equation
det(DS(w) − wI + Q) = 0.
But this is identical to that in Problem 90. Likewise φj(w) can be computed
as the column vectors that satisfy

(DSj(w) − wI + Q)φj(w) = 0,

which is also identical to that in Problem 90. Further, note that

(H̃1j(x, w), H̃2j(x, w), H̃3j(x, w), H̃4j(x, w), H̃5j(x, w))ᵀ = (0, 0, 0, 0, 0)ᵀ
for j = 3, 4, 5 since the first passage time can never end in states 3, 4, or 5
as the drift is positive in those states (only when the drift is negative, it is
possible to cross over into a particular buffer content level from above). Thus
we need only ai,j (w) values for i = 1, 2, 3, 4, 5 and j = 1, 2. But ai,j (w) = 0 for
all i where Si (w) > 0, otherwise as x → ∞, the expression for H̃ij (x, w) in
Equation 9.35 would blow up. Hence, we have a2,j (w) = 0, a4,j (w) = 0, and
a5,j (w) = 0 for j = 1, 2. Thus all we are left with is to obtain a1,1 (w), a1,2 (w),
a3,1 (w), and a3,2 (w). For that we have four boundary conditions, namely,
H̃11(40, w) = 1,
H̃22(40, w) = 1,
H̃21(40, w) = 0, and
H̃12(40, w) = 0.
Thus we have
(H̃11(40, w), H̃21(40, w), H̃31(40, w), H̃41(40, w), H̃51(40, w))ᵀ
   = a1,1(w)e^{40S1(w)}φ1(w) + a3,1(w)e^{40S3(w)}φ3(w) = (1, 0, ·, ·, ·)ᵀ
and
(H̃12(40, w), H̃22(40, w), H̃32(40, w), H̃42(40, w), H̃52(40, w))ᵀ
   = a1,2(w)e^{40S1(w)}φ1(w) + a3,2(w)e^{40S3(w)}φ3(w) = (0, 1, ·, ·, ·)ᵀ
where the · entries in the column vectors denote unknown quantities. Once we
know a1,1 (w), a1,2 (w), a3,1 (w), and a3,2 (w), the unknown quantities can be
obtained. Solving the four equations at w = 0 and w = h = 0.000001 we
get a1,1 (0) = 7.7343 × 106 , a3,1 (0) = − 1.6856, a1,2 (0) = − 7.7343 × 106 , and
a3,2 (0) = − 0.5505; also a1,1 (h) = 7.7344 × 106 , a3,1 (h) = − 1.6858, a1,2 (h) =
−7.7344 × 106 , and a3,2 (h) = − 0.5505.
Now using ai,j (w), Si (w), and φi (w) values for i = 1, 3, j = 1, 2 at w = 0
and w = h in Equation 9.35 we can compute H̃ij (x, w). In particular for x = 40
(which is what we need here) we get
[ H̃11(40, 0)  H̃12(40, 0) ]     [ 1       0      ]
[ H̃21(40, 0)  H̃22(40, 0) ]     [ 0       1      ]
[ H̃31(40, 0)  H̃32(40, 0) ]  =  [ 0.6548  0.3452 ]
[ H̃41(40, 0)  H̃42(40, 0) ]     [ 0.6910  0.3090 ]
[ H̃51(40, 0)  H̃52(40, 0) ]     [ 0.7153  0.2847 ]
The values of H̃ij (40, h) for some small h do not differ in the first four sig-
nificant digits from the corresponding H̃ij (40, 0) values for i = 1, 2, 3, 4, 5 and
j = 1, 2, hence they are not reported.
Thus we have the expected number of days from t = 0 (with initial water level X(0) = 40 as well as initial environmental conditions
P{Z(0) = 3} = 0.0945, P{Z(0) = 4} = 0.3911, and P{Z(0) = 5} = 0.5144) for the
water level to become nominal as

E[T] = (−1) (d/dw) Σ_{i=3}^{5} Σ_{j=1}^{2} H̃ij(40, w)|w=0 P{Z(0) = i}
     = − lim_{h→0} Σ_{i=3}^{5} Σ_{j=1}^{2} ((H̃ij(40, h) − H̃ij(40, 0))/h) P{Z(0) = i}.
Having considered the case in Remark 22, next we solve a problem that
uses the conditions in Remark 23. It is worthwhile to contrast it with the
previous two problems since they are under a similar setting.
Problem 92
Consider the setting in Problem 90 with the exception that at time t = 0 we
just enter the concerning level and Z(0) = 1. What is the expected sojourn
time for the water level to stay at the concerning level before moving
to nominal?
Solution
At t = 0 water level just crosses over from nominal to concerning. Using the
same notation as in Problem 90 for X(t) and Z(t), we have X(0) = 20 and
Z(0) = 1. Let T be the time when the water level crosses back to becoming
nominal from concerning. Then all we need is
E[T] = (−1) (d/dw) Σ_{j=1}^{5} H̃1j(20, w)

at w = 0. To compute (d/dw)H̃ij(x, w) at w = 0, here too we consider a very small
h > 0 and obtain it approximately as (H̃ij(x, h) − H̃ij(x, 0))/h. Now, to evaluate H̃ij(x, h)
and H̃ij (x, 0), we can write down from Equation 9.32, for j = 1, 2, 3, 4, 5,
(H̃1j(x, w), H̃2j(x, w), H̃3j(x, w), H̃4j(x, w), H̃5j(x, w))ᵀ
   = a1,j(w)e^{S1(w)x}φ1(w) + a2,j(w)e^{S2(w)x}φ2(w) + · · · + a5,j(w)e^{S5(w)x}φ5(w),   (9.36)
where ai,j (w), Sj (w), and φj (w) values need to be determined for w = 0 and
w = h for some small h.
We can obtain Sj (w) for j = 1, 2, 3, 4, 5 as the scalar solutions to the
characteristic equation
det(DS(w) − wI + Q) = 0.
But this is identical to that in Problem 90. Likewise φj(w) can be computed
as the column vectors that satisfy

(DSj(w) − wI + Q)φj(w) = 0,

which is also identical to that in Problem 90. Thus refer to Problem 90 for
φj(w) and Sj(w) for j = 1, 2, 3, 4, 5 at w = 0 and w = h. What remains in Equation 9.36 are the ai,j(w) values for w = 0 and w = h. For that refer back to the
approach in Remark 23. First of all,
(H̃1j(x, w), H̃2j(x, w), H̃3j(x, w), H̃4j(x, w), H̃5j(x, w))ᵀ = (0, 0, 0, 0, 0)ᵀ
for j = 1, 2 since the first passage time can never end in states 1 or 2 as the
drift is negative in those states (only when the drift is positive is it possible
to cross over into a particular buffer content level from below). Thus we need
only ai,j (w) values for i = 1, 2, 3, 4, 5 and j = 3, 4, 5.
In addition,

H̃1j(0, w) = Σ_{k=2}^{5} H̃kj(0, w) (q1k/(−q11)) (q1k/(q1k + w)),

H̃2j(0, w) = H̃1j(0, w) (q21/(−q22)) (q21/(q21 + w)) + Σ_{k=3}^{5} H̃kj(0, w) (q2k/(−q22)) (q2k/(q2k + w)),
where qij corresponds to the element in the ith row and jth column of Q.
Solving the 15 equations we get for w = 0,
[ a1,3(0)  a2,3(0)  a3,3(0)  a4,3(0)  a5,3(0) ]     [  0.0093 × 10⁻³   −0.2211 × 10⁻⁴   −0.2109   −0.0033   −0.0014 ]
[ a1,4(0)  a2,4(0)  a3,4(0)  a4,4(0)  a5,4(0) ]  =  [  0.1326 × 10⁻³    0.1117 × 10⁻⁴   −0.8945   −0.0466   −0.0208 ]
[ a1,5(0)  a2,5(0)  a3,5(0)  a4,5(0)  a5,5(0) ]     [ −0.1419 × 10⁻³    0.1094 × 10⁻⁴   −1.1307    0.0500    0.0223 ]
Also, for w = h = 0.000001, the values of ai,j(h) are the same as those when w = 0
to the first few significant digits; hence we do not present them here.
Now using ai,j (w), Si (w), and φi (w) values for i = 1, 2, 3, 4, 5 and j = 3, 4, 5
at w = 0 and w = h in Equation 9.36 we can compute H̃ij (x, w). In particular
for x = 20 (which is what we need here) we get
[ H̃13(20, 0)  H̃14(20, 0)  H̃15(20, 0) ]     [ 0.1583  0.3995  0.4422 ]
[ H̃23(20, 0)  H̃24(20, 0)  H̃25(20, 0) ]     [ 0.1497  0.3913  0.4590 ]
[ H̃33(20, 0)  H̃34(20, 0)  H̃35(20, 0) ]  =  [ 1       0       0      ]
[ H̃43(20, 0)  H̃44(20, 0)  H̃45(20, 0) ]     [ 0       1       0      ]
[ H̃53(20, 0)  H̃54(20, 0)  H̃55(20, 0) ]     [ 0       0       1      ]
The values of H̃ij (20, h) for some small h do not differ in the first four sig-
nificant digits from the corresponding H̃ij (20, 0) values for i = 1, 2, 3, 4, 5 and
j = 3, 4, 5, hence they are not reported.
Thus we can compute the expected number of days from t = 0 (with initial water level X(0) = 20 as well as initial environmental condition Z(0) = 1) for the water level to become nominal. In fact, we can also immediately write down the time spent at the concerning level as 25.1175 days if we were to start in state 2 (instead of state 1 as in this problem). Having seen a set of numerical problems, we
next focus on some exponential on–off source cases where we can obtain
closed-form algebraic expressions.
Problem 93
Consider an exponential on–off source that inputs fluid into a buffer. The on
times are according to exp(α) and off times according to exp(β). When the
source is on, fluid enters the buffer at rate r and no fluid enters the buffer
when the source is off. The output capacity is c. Assume that initially there
is x amount of fluid in the buffer. Define the first passage time as the time it
would take for the buffer contents to reach level x∗ or 0, whichever happens
first with x∗ ≥ x ≥ 0. Let states 1 and 2 represent the source being off and on,
respectively. For i = 1, 2, find the LST of the first passage time given that the
environment is initially in state i. Also for i = 1, 2, find the probability that
the first passage time occurs with x∗ or 0 amount of fluid, given that initially
the environment is in state i.
Solution
The setting is identical to that in Problem 87. Recall that the environment
process {Z(t), t ≥ 0} is a CTMC with

Q = [ −β    β ]    and    D = [ −c    0    ]
    [  α   −α ]               [  0   r − c ].
Next we use the definition of Hij (x, t) and its LST H̃ij (x, w) in Section 9.3 for
i = 1, 2 and j = 1, 2. Then the solution to Equation 9.31 is given by
(H̃11(x, w), H̃21(x, w))ᵀ = a11(w)e^{S1(w)x}φ1(w) + a21(w)e^{S2(w)x}φ2(w),

(H̃12(x, w), H̃22(x, w))ᵀ = a12(w)e^{S1(w)x}φ1(w) + a22(w)e^{S2(w)x}φ2(w).
Here S1(w) and S2(w) are the scalar solutions to the characteristic equation det(DS(w) − wI + Q) = 0.
For i = 1, 2 we have

φi(w) = ( (w + α − Si(w)(r − c))/α , 1 )ᵀ = ( β/(w + β + Si(w)c) , 1 )ᵀ.

For notational convenience, let

ψi(w) = β/(w + β + Si(w)c)

and thus

φi(w) = ( ψi(w) , 1 )ᵀ.
Finally, we solve for a11(w), a21(w), a12(w), and a22(w) using the LSTs of
the boundary conditions H̃22(x∗, w) = 1, H̃11(0, w) = 1, H̃21(x∗, w) = 0, and
H̃12(0, w) = 0, resulting in

a11(w) = e^{S2(w)x∗}/δ(w),   a21(w) = −e^{S1(w)x∗}/δ(w),
a12(w) = −ψ2(w)/δ(w),   a22(w) = ψ1(w)/δ(w),

where δ(w) = e^{S2(w)x∗}ψ1(w) − e^{S1(w)x∗}ψ2(w).
Now that we have expressions for H̃ij (x, w) for i = 1, 2 and j = 1, 2, the LST
of the first passage time given that the environment is initially in state 1 (i.e.,
off) can be computed as
H̃11(x, w) + H̃12(x, w) = (a11(w) + a12(w))e^{S1(w)x}ψ1(w) + (a21(w) + a22(w))e^{S2(w)x}ψ2(w).
Likewise, the LST of the first passage time given that the environment is
initially in state 2 (i.e., on) can be computed as
H̃21(x, w) + H̃22(x, w) = (a11(w) + a12(w))e^{S1(w)x} + (a21(w) + a22(w))e^{S2(w)x}.
Also for i = 1, 2, the probability that the first passage time occurs with 0
amount of fluid, given that initially the environment is in state i is H̃i1 (x, 0).
Likewise for i = 1, 2, the probability that the first passage time occurs with x∗
amount of fluid, given that initially the environment is in state i is H̃i2 (x, 0).
To compute H̃ij(x, 0) for i = 1, 2 and j = 1, 2 when we let w = 0, we need to be
careful about the fact that √(b̂²) = |b̂|. Notice that if rβ < c(α + β), then b̂ < 0;
otherwise b̂ ≥ 0.
Assume that rβ < c(α + β), which would be necessary if we require the
queue to be stable (note that it is straightforward to write down the expres-
sions to follow even for the case rβ ≥ c(α + β) but is not presented here).
Continuing with the assumption that rβ < c(α + β), we can see by letting
w = 0 that

S1(0) = 0   and   S2(0) = (cα − β(r − c))/(c(r − c)).
a11(0) = e^{S2(0)x∗}/δ(0) = e^{S2(0)x∗} / (e^{S2(0)x∗} − β(r − c)/(cα)),

a12(0) = −ψ2(0)/δ(0) = −(β(r − c)/(cα)) / (e^{S2(0)x∗} − β(r − c)/(cα)),

a21(0) = −e^{S1(0)x∗}/δ(0) = −1 / (e^{S2(0)x∗} − β(r − c)/(cα)),

a22(0) = ψ1(0)/δ(0) = 1 / (e^{S2(0)x∗} − β(r − c)/(cα)).
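A quick numerical check of these w = 0 expressions (illustrative parameters chosen so that rβ < c(α + β)): starting from the off state, the probabilities of the first passage ending at level 0 and at level x∗ must sum to one.

```python
import math

alpha, beta, r, c = 1.0, 1.0, 3.0, 2.0   # r*beta = 3 < c*(alpha+beta) = 4
x_star, x = 5.0, 2.0                      # levels with 0 <= x <= x_star

S2_0 = (c * alpha - beta * (r - c)) / (c * (r - c))   # S2(0); S1(0) = 0
psi1 = 1.0                                # psi_1(0)
psi2 = beta * (r - c) / (c * alpha)       # psi_2(0)
delta = math.exp(S2_0 * x_star) - psi2    # delta(0)

a11 = math.exp(S2_0 * x_star) / delta
a12 = -psi2 / delta
a21 = -1.0 / delta
a22 = psi1 / delta

# H~11(x,0): P{first passage occurs at level 0 | start off with x fluid}
# H~12(x,0): P{first passage occurs at level x* | start off with x fluid}
H11 = a11 * psi1 + a21 * math.exp(S2_0 * x) * psi2
H12 = a12 * psi1 + a22 * math.exp(S2_0 * x) * psi2
print(H11, H12)  # two probabilities that sum to 1
```

With these stable parameters and x well below x∗, H11 dominates, as one would expect: the buffer is far more likely to empty first.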
In the next example, we continue with the setting of the previous exam-
ple, however, with the restriction of the first passage time occurring only
when the amount of fluid in the buffer becomes empty.
Problem 94
Consider an exponential on–off source that inputs fluid into an infinite-sized
buffer. The on times are according to exp(α) and off times according to
exp(β). When the source is on, fluid enters the buffer at rate r and no fluid
enters the buffer when the source is off. The output capacity is c. Assume that
the system is stable. Define the first passage time as the time it would take
for the buffer contents to empty for the first time given that initially there
is x amount of fluid in the buffer and the source is in state i, for i = 1 and 2
representing off and on, respectively. Then using that result derive the LST
of the busy period distribution, that is, the consecutive period of time there
is nonzero fluid in the buffer.
Solution
Notice that the setting is identical to that of Remark 22 where a = b ≤ x with
a = b = 0. Also, the stability condition is that rβ < c(α + β). Following the
procedure in Remark 22, define T as the first time the amount of fluid in
the buffer reaches 0 and thereby Hij (x, t) = P{T ≤ t, Z(T) = j|X(0) = x, Z(0) = i}.
Using the results in Problem 93 we can write down H̃ij (x, w), the LST of
Hij (x, t) as follows:
(H̃11(x, w), H̃21(x, w))ᵀ = a11(w)e^{S1(w)x}φ1(w) + a21(w)e^{S2(w)x}φ2(w),
where S1(w) and S2(w) are the solutions to the characteristic equation det(DS(w) − wI + Q) = 0, that is,

S1(w) = (−b̂ − √(b̂² + 4w(w + α + β)c(r − c))) / (2c(r − c)),

S2(w) = (−b̂ + √(b̂² + 4w(w + α + β)c(r − c))) / (2c(r − c)),
where b̂ = (r − 2c)w + (r − c)β − cα. Also, the column vectors φi(w) are given by

φi(w) = ( (w + α − Si(w)(r − c))/α , 1 )ᵀ = ( β/(w + β + Si(w)c) , 1 )ᵀ.
To solve for a11 (w) and a21 (w), we use the LST of the boundary conditions
H̃11(0, w) = 1 and H̃12(0, w) = 0. Of course, H̃12(0, w) = 0 anyway, hence
that boundary condition is not useful. We use the additional condition that
aij (w) = 0 if Si (w) > 0 for i = 1, 2 and j = 1, 2. Since S1 (w) ≤ 0 and S2 (w) > 0,
we have a21 (w) = 0. Thus the only term that is nonzero is a11 (w) which is
given by
a11(w) = (w + β + S1(w)c)/β.

Therefore,

(H̃11(x, w), H̃21(x, w))ᵀ = e^{S1(w)x} ( 1 , (w + β + S1(w)c)/β )ᵀ.
Now we compute the LST of the busy period distribution. Notice that a
busy period begins with the environment in state 2 with zero fluid in the
buffer (i.e., x = 0) and ends when the environment is in state 1. Hence the
LST of the busy period distribution is H̃21(0, w) = (w + β + S1(w)c)/β.
Next we consider the case where we start with x amount of fluid and find
the distribution for the time it would take for the buffer contents to reach x∗
with x∗ ≥ x.
Problem 95
Consider an on–off source with on times according to exp(α) and off times
according to exp(β). When the source is on, fluid enters the buffer at rate r
and no fluid enters the buffer when the source is off. The output capacity
is c. Define the first passage time as the time it would take for the buffer
contents to reach a level x∗ for the first time given that initially there is x
amount of fluid in the buffer (such that x ≤ x∗ ) and the source is in state i, for
i = 1 and 2 representing off and on, respectively. Derive the LST of the first
passage time.
Solution
This setting is identical to that of Remark 23 where a = b ≥ x with a = b = x∗ .
We define T as the first time the amount of fluid in the buffer reaches x∗
and thereby Hij (x, t) = P{T ≤ t, Z(T) = j|X(0) = x, Z(0) = i}. Using the results in
Problem 93 we can solve

det(DS(w) − wI + Q) = 0

to obtain S1(w) and S2(w) as in Problem 94, where b̂ = (r − 2c)w + (r − c)β − cα. Also, since the column vectors φj(w) must satisfy

(DSj(w) − wI + Q)φj(w) = 0,

they too are as given in Problem 93.
Thus we can write down H̃ij(x, w), the LST of Hij(x, t), as H̃11(x, w) =
H̃21(x, w) = 0, and

(H̃12(x, w), H̃22(x, w))ᵀ = a12(w)e^{S1(w)x}φ1(w) + a22(w)e^{S2(w)x}φ2(w).
To solve for ai2 (w) for i = 1, 2, we use the LST of the boundary conditions
H̃22(x∗, w) = 1 and H̃21(x∗, w) = 0. Of course, H̃21(x, w) = 0 anyway, hence
that boundary condition is not useful. We use the additional condition that

H̃12(0, w) = H̃22(0, w) β/(β + w).
Thus,

(a12(w) + a22(w)) β/(β + w) = a12(w) β/(w + β + S1(w)c) + a22(w) β/(w + β + S2(w)c)

since H̃12(0, w) = H̃22(0, w) β/(β + w).
Solving these, we get

a12(w) = [ e^{S1(w)x∗} − e^{S2(w)x∗} (S1(w)/S2(w)) ((w + β + S2(w)c)/(w + β + S1(w)c)) ]⁻¹,

a22(w) = [ e^{S2(w)x∗} − e^{S1(w)x∗} (S2(w)/S1(w)) ((w + β + S1(w)c)/(w + β + S2(w)c)) ]⁻¹.
Thus the LST of the first passage time H̃ij(x, w) is given by H̃11(x, w) =
H̃21(x, w) = 0, and

(H̃12(x, w), H̃22(x, w))ᵀ = a12(w)e^{S1(w)x}φ1(w) + a22(w)e^{S2(w)x}φ2(w),
where a12 (w), a22 (w), φ1 (w), φ2 (w), S1 (w), and S2 (w) are described earlier.
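These coefficient expressions can be verified numerically: by construction a12(w) and a22(w) must satisfy the boundary condition H̃22(x∗, w) = 1 and the level-zero relation H̃12(0, w) = H̃22(0, w)β/(β + w). A sketch with illustrative parameters:

```python
import math

alpha, beta, r, c = 1.0, 1.0, 3.0, 2.0
x_star, w = 4.0, 0.5

b_hat = (r - 2*c)*w + (r - c)*beta - c*alpha
disc = math.sqrt(b_hat**2 + 4*w*(w + alpha + beta)*c*(r - c))
S1 = (-b_hat - disc) / (2*c*(r - c))
S2 = (-b_hat + disc) / (2*c*(r - c))

K = (w + beta + S2*c) / (w + beta + S1*c)
a12 = 1.0 / (math.exp(S1*x_star) - math.exp(S2*x_star) * (S1/S2) * K)
a22 = 1.0 / (math.exp(S2*x_star) - math.exp(S1*x_star) * (S2/S1) / K)

# Boundary condition at x*: H~22(x*, w) = a12 e^{S1 x*} + a22 e^{S2 x*} = 1
bc = a12 * math.exp(S1*x_star) + a22 * math.exp(S2*x_star)

# Level-zero relation: H~12(0, w) = H~22(0, w) * beta/(beta + w)
psi1 = beta / (w + beta + S1*c)
psi2 = beta / (w + beta + S2*c)
lhs = a12 * psi1 + a22 * psi2          # H~12(0, w)
rhs = (a12 + a22) * beta / (beta + w)  # H~22(0, w) * beta/(beta + w)
print(bc, lhs - rhs)
```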
Problem 96
Consider an infinite-sized buffer with fluid input according to an on–off
source that has on and off times exponentially distributed with parameters
α and β, respectively. When the source is on fluid flows in at rate r and no
fluid flows in when off. As soon as the buffer content reaches level a, fluid is
removed from the buffer at rate c. When the buffer becomes empty, the out-
put valve is shut. It remains shut until the buffer content reaches a. In other
words the output also behaves like an alternating on–off sink. Assume that
rβ/(α + β) < c < r. Obtain LSTs of the consecutive time the buffer is drained at rate
c as well as the time the buffer takes to reach a. Also derive the expected on
and off times for the output valve or sink.
Solution
Let T1 be the time for the output channel to empty the contents in the buffer
starting with a. Also let T2 be the time for the buffer contents to rise from 0
to a.
First we obtain the distribution of T1 and then that of T2. Let O1(t) be the
CDF of the random variable T1, that is, O1(t) = P{T1 ≤ t}.
Due to the definition of T1, the source is “on” initially with a amount of fluid
in the buffer, so that T1 is the first passage time to reach zero amount of fluid
in the buffer. For that we can directly substitute for expressions in Problem
94 to obtain the results. The LST Õ1(w) is

Õ1(w) = e^{s0(w)a} (w + β + s0(w)c)/β,
where

w∗ = (2√(cαβ(r − c)) − (r − c)β − cα)/r,   (9.39)

s0(w) = (−b − √(b² + 4w(w + α + β)c(r − c))) / (2c(r − c))   (9.40)
and b = (r − 2c)w + (r − c)β − cα. Note that the LST is defined for all w ≥ w∗
where w∗ is essentially the point where s0 (w) becomes imaginary because the
term inside the square-root is negative. However, the fact that w∗ < 0 ensures
that this would not be a problem for w ≥ 0.
Now let O2(t) be the CDF of the random variable T2, that is, O2(t) = P{T2 ≤ t}.
During time T2, the output from the buffer is zero and therefore the buffer
content X(t) is nondecreasing.
For all t ∈ [0, ∞), let Z(t) = 1 denote that the source is off and Z(t) = 2 denote
that the source is on at time t. Define for i = 1, 2

Hi(x, t) = P{X(t) ≤ x, Z(t) = i}.

Also define the vector H(x, t) = [H1(x, t) H2(x, t)] and let R = diag(0, r) be the diagonal matrix of input rates (no fluid leaves the buffer during T2). Then H(x, t) satisfies the
following partial differential equation:

∂H(x, t)/∂t + (∂H(x, t)/∂x) R = H(x, t)Q.   (9.42)
Now, taking the Laplace transform of Equation 9.42 with respect to t, we get

sH∗(x, s) − H(x, 0) + (∂H∗(x, s)/∂x) R = H∗(x, s)Q.   (9.43)

Since X(0) = 0 and Z(0) = 1, we have H(x, 0) = [1 0] for x ≥ 0, and hence

sH∗(x, s) − [1 0] + (∂H∗(x, s)/∂x) R = H∗(x, s)Q.
Taking the LST of this equation with respect to x (with parameter w), we get

sH̃∗(w, s) − [1 0] + wH̃∗(w, s)R − wH∗(0, s)R = H̃∗(w, s)Q.
Since P{X(t) ≤ 0, Z(t) = 2} = 0, we have H∗(0, s) = [H1∗(0, s) 0] and, therefore,
wH∗(0, s)R = [0 0]. Hence this equation reduces to

sH̃∗(w, s) − [1 0] + wH̃∗(w, s)R = H̃∗(w, s)Q.
Plugging in for R and Q, and taking the inverse of the matrix, yields

H̃∗(w, s) = [ s + wr + α   β ] / (wr(s + β) + αs + βs + s²).   (9.44)
However,

Õ2(s) = (β/(β + s)) e^{−a(αs + βs + s²)/(rs + rβ)}.   (9.45)
Differentiating the LSTs at zero, we get

E[T1] = (r + a(α + β)) / (cα + cβ − rβ),   (9.46)

E[T2] = (r + a(α + β)) / (rβ).   (9.47)
Notice that the ratio E[T1]/E[T2] is independent of a. This indicates that
no matter what a is, the ratio of time spent by the sink in on and off times
remains the same. Also, E[T1]/E[T1 + T2] = rβ/(c(α + β)). This is not surprising because in
every on–off cycle of the sink an average of E[T1 + T2] rβ/(α + β) fluid enters the
buffer, all of which exits the buffer during a time whose mean is E[T1]; hence
the average amount of fluid exiting the buffer in one cycle is E[T1]c. Hence we
have E[T1]/E[T1 + T2] = rβ/(c(α + β)).
Problem 97
Consider an exponential on–off source with on times according to exp(α)
and off times according to exp(β). When the source is on, fluid enters at rate
r into an infinite-sized buffer and no fluid flows into the buffer when the
source is off. Fluid is drained from the buffer using two rates according to a
threshold policy. When the amount of fluid in the buffer is less than x∗ the
output capacity is c1 , whereas if there is more than x∗ fluid, it is removed at
rate c1 + c2. For such a buffer, derive an expression for

lim_{t→∞} P{X(t) > B̂}

where 0 < x∗ < B̂ < ∞. Assume that r > c1 + c2 > c1 > rβ/(α + β).
Solution
The aim is to compute the limiting distribution (as t → ∞) of X(t), the
amount of fluid in the buffer at time t. Note that the output capacity is c1
when X(t) < x∗ and it is c1 + c2 when X(t) ≥ x∗ . The output capacity of the
system can be modeled as an alternating renewal process that stays at c1 for
a random time T1 and switches to c1 + c2 for a random time T2 before mov-
ing back to c1 . Essentially T1 is the first passage time for the buffer content
to reach x∗ (from below) given that initially the source is off and there is
x∗ amount of fluid. During the entire time T1 , the output capacity is c1 and
hence we can directly use the results from Problem 95. Likewise T2 is the
first passage time for the amount of fluid to reach x∗ (from above) given that
initially there is x∗ amount of fluid and the source is on. Also, during the
time T2 , the output capacity is c1 + c2 , which enables us to use results from
Problem 94.
To compute the limiting probability (as t → ∞) that X(t) is greater than
B̂ we condition on the region above or below x∗ to obtain
lim_{t→∞} P{X(t) > x∗} = E[T2] / (E[T1] + E[T2])   (9.49)
since the X(t) process being above x∗ is stochastically equivalent to the X̂(t)
process being above 0. We can immediately write down
P{X̂(t) > B̂ − x∗ | X̂(t) > 0} = P{X̂(t) > B̂ − x∗, X̂(t) > 0} / P{X̂(t) > 0}.

Using the limiting distribution

lim_{t→∞} P{X̂(t) > x} = (βr/((c1 + c2)(α + β))) e^{λ1 x},

we get

lim_{t→∞} P{X̂(t) > B̂ − x∗} / P{X̂(t) > 0} = e^{λ1(B̂ − x∗)},
where

S1(w) = (−b̂ − √(b̂² + 4w(w + α + β)c1(r − c1))) / (2c1(r − c1)),

S2(w) = (−b̂ + √(b̂² + 4w(w + α + β)c1(r − c1))) / (2c1(r − c1)),

with b̂ = (r − 2c1)w + (r − c1)β − c1α. Then

E[T1] = −(d/dw) E[e^{−wT1}] |w=0 = 1/β + (c1(α + β)/(β(c1(α + β) − rβ))) (e^{S2(0)x∗} − 1),   (9.51)

where S2(0) = (c1α − β(r − c1))/(c1(r − c1)) > 0.
Now we derive E[T2 ]. Recall that T2 is the first passage time to reach
fluid level x∗ from above given that initially there is x∗ amount of fluid and
the source is on. During this time the output capacity is c1 + c2 . Notice that
this first passage time is the same as the busy period of a buffer with CTMC
on–off source input and output capacity c1 + c2 . Thus using the LST of the
busy period distribution described toward the end of the solution to Problem
94, we can see that
E[e^{−wT2}] = (w + β + S0(w)(c1 + c2))/β,

where

S0(w) = (−b − √(b² + 4w(w + α + β)(c1 + c2)(r − c1 − c2))) / (2(c1 + c2)(r − c1 − c2))

with b = (r − 2(c1 + c2))w + (r − c1 − c2)β − (c1 + c2)α. Thus

E[T2] = −(d/dw) E[e^{−wT2}] |w=0 = −1/β + (c1 + c2)(α + β)/(β((c1 + c2)(α + β) − rβ)).   (9.52)
c1 α−β(r−c1 )
where S2 (0) = c1 (r−c1 ) and λ1 = β/(c1 + c2 ) − α/(r − c1 − c2 ).
In the next example, we will consider a few aspects that will give a flavor
for the analysis in the next chapter. In particular, we will consider: (i) multi-
ple sources that superpose traffic into a buffer, (ii) a network situation where
the departure from one node acts as input to another node, and (iii) a case of
non-CTMC-based environment process.
Problem 98
Consider two infinite-sized buffers in tandem as shown in Figure 9.7. Input
to the first buffer is from N independent and identical exponential on–off
sources with on time parameter α, off time parameter β and rate r. The
output from the first buffer is directly fed into the second buffer. The out-
put capacities of the first and second buffers are c1 and c2 , respectively.
What is the stability condition? Assuming that is satisfied, characterize the
environment process governing input to the second buffer.
Solution
Let Z1 (t) be the number of sources that are in the “on” state at time t. Clearly
{Z1 (t), t ≥ 0} is a CTMC with N + 1 states. When Z1 (t) = i the input rate is ir.
For notational convenience we assume that c1 is not an integral multiple of
r. Thus every state in the CTMC {Z1 (t), t ≥ 0} has strictly positive or strictly
negative drifts. Let
ℓ = ⌈c1 / r⌉.
Thus whenever Z1(t) ∈ {0, . . . , ℓ − 1}, the drift is negative, that is, the
first buffer's contents would be nonincreasing. Likewise, whenever
Z1(t) ∈ {ℓ, . . . , N}, the drift is positive, that is, the first buffer's contents would
be increasing. The first buffer is stable if the average fluid arrival rate in
steady state is less than the service capacity, that is, Nrβ/(α + β) < c1. If the
first buffer is stable then the steady-state average departure rate from that
buffer is also Nrβ/(α + β). Thus the second buffer is also stable if Nrβ/(α + β) < c2.
Although not needed for this problem's analysis, note that unless c2 < c1, the
second buffer would never accumulate any fluid. Hence we assume

Nrβ/(α + β) < c2 < c1.

FIGURE 9.7
Tandem buffers with multiple identical sources. (From Gautam, N. et al., Prob. Eng. Inform. Sci.,
13, 429, 1999. With permission.)

Stochastic Fluid-Flow Queues: Characteristics and Exact Analysis 581
The departures from the first buffer are modulated by a process {Z2(t), t ≥ 0},
a semi-Markov process (SMP) on {0, 1, . . . , ℓ} with kernel G(t) = [Gij(t)],
which is to be derived. By definition, Gij(t) is the joint probability that, given
that the current state is i, the next state is j and the sojourn time in state i is
less than t. For i = 0, 1, . . . , ℓ − 1 and j = 0, 1, . . . , ℓ, it is relatively
straightforward to obtain Gij(t) as follows (since the sojourn times in state i
are exponentially distributed):
Gij(t) = (iα/(iα + (N − i)β)) [1 − exp{−(iα + (N − i)β)t}]        if j = i − 1,
Gij(t) = ((N − i)β/(iα + (N − i)β)) [1 − exp{−(iα + (N − i)β)t}]  if j = i + 1,
Gij(t) = 0                                                        otherwise.
The only tricky part in the kernel is to describe Gℓj(t). For that, we define
a first passage time in the {X1(t), t ≥ 0} process. We now derive an expression
for G̃ℓj(w), the LST of Gℓj(t). Note that Gℓℓ(t) = 0.
Let
Q =
[ −Nβ        Nβ            0        · · ·        0              0   ]
[   α    −α − (N − 1)β  (N − 1)β     0      · · ·               0   ]
[   ·         ·            ·         ·           ·              ·   ]
[   0         0         · · ·   (N − 1)α   −(N − 1)α − β        β   ]
[   0         0         · · ·       0           Nα            −Nα   ]
and
Let sk(w) and χk(w) be the kth eigenvalue and corresponding eigenvector,
respectively, of R−1(wI − Q). As we described earlier, there are ℓ states
with negative drift. Without loss of generality we let s0, s1, . . . , sℓ−1 be the
negative eigenvalues and χ0, χ1, . . . , χℓ−1 be the corresponding eigenvectors,
written in that form suppressing that they are functions of w for compact
notation. Define

H̃j(x, w) = Σ_{k=0}^{ℓ−1} akj e^{sk x} χk.
In other words,

G̃ℓj(w) = [A χ∗]j,

where, in matrix notation,

A χ = I,

with χ = [χ0 χ1 · · · χℓ−1]
whose LST we describe above. When the environment is in state Z2 (t) at time
t, the fluid enters the second buffer at rate min(Z2 (t)r, c1 ).
We can also compute the mean sojourn time τi in state i, for i = 0, 1, . . . , ℓ, as

τi = 1/(iα + (N − i)β)                           if i = 0, 1, . . . , ℓ − 1,
τℓ = −Σ_{j=0}^{ℓ−1} (d/dw) G̃ℓj(w) |_{w=0}        if i = ℓ,

so that

pi = lim_{t→∞} P{Z2(t) = i} = (ai τi) / (Σ_{k=0}^{ℓ} ak τk),     (9.53)

where a = [a0 a1 . . . aℓ] solves

a = a G(∞) = a G̃(0).
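The modulating CTMC {Z1(t), t ≥ 0} above is straightforward to build numerically. The following Python sketch (all parameter values are made up for illustration, not taken from the text) constructs its generator, computes the stationary distribution, and checks the stability ordering Nrβ/(α + β) < c2 < c1.

```python
import numpy as np

def onoff_superposition_generator(N, alpha, beta):
    """Generator of Z1(t): number of 'on' sources among N iid exponential
    on-off sources (on -> off at rate alpha, off -> on at rate beta)."""
    Q = np.zeros((N + 1, N + 1))
    for i in range(N + 1):
        if i < N:
            Q[i, i + 1] = (N - i) * beta  # one more source turns on
        if i > 0:
            Q[i, i - 1] = i * alpha       # one source turns off
        Q[i, i] = -Q[i].sum()
    return Q

# all numbers below are illustrative
N, alpha, beta, r, c1, c2 = 4, 1.0, 0.5, 3.0, 8.5, 7.0
Q = onoff_superposition_generator(N, alpha, beta)

# stationary distribution: solve pi Q = 0 with pi summing to one
A = np.vstack([Q.T, np.ones(N + 1)])
b = np.append(np.zeros(N + 1), 1.0)
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

mean_rate = float(sum(i * r * pi[i] for i in range(N + 1)))  # N r beta/(alpha+beta)
stable = mean_rate < c2 < c1  # stability ordering assumed in the text
```

The stationary distribution is binomial with on-probability β/(α + β), so the mean input rate reduces to Nrβ/(α + β), matching the stability condition derived above.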
Reference Notes
Stochastic fluid flow models or fluid queues have been around for almost
three decades but have not received the attention that the deterministic fluid
queues have received from researchers. In fact this may be the first textbook
that includes two chapters on fluid queues. Pioneering work on fluid queues
was done by Debasis Mitra and colleagues. In particular, the seminal article
by Anick, Mitra, and Sondhi [5] is a must read for anyone interested in the
area of fluid queues. At the end of the next chapter, there is a more exten-
sive reference to the fluid queue literature. This chapter has been mainly
an introductory one and we briefly describe the references relevant to its
development.
The single buffer fluid model setting, notation, and characterization
described in Sections 9.1.3, 9.1.4, and 9.2.1 are adapted from Kulkarni [69].
The main buffer content analysis for CTMCs in Sections 9.2.2 and 9.2.3 first
appeared in Anick, Mitra, and Sondhi [5] and the version in this chapter has
been mostly derived from Vidhyadhar Kulkarni’s course notes. Section 9.3
on first passage time analysis with examples is from a collection of papers
including Narayanan and Kulkarni [84], Aggarwal et al. [3], Gautam et al.
[39], Kulkarni and Gautam [70], and Mahabhashyam et al. [76].
There are several extensions to the setting considered in this section.
Some of these we will see in the next chapter. It is worthwhile mentioning
others that we will not see here. In particular, Kulkarni and Rolski
[66] extend the fluid model analysis to continuous state space processes
such as the Ornstein–Uhlenbeck process. Krishnan et al. [65] consider
fractional Brownian motion driving input to a buffer. Kella [58] uses Lévy
process inputs to analyze non-product-form stochastic fluid networks. From
a methodological standpoint other techniques are possible. For example,
Ahn and Ramaswami [4] consider matrix-analytic methods for transient
analysis, steady-state analysis, and first passage times of both finite- and
infinite-sized buffers.
Exercises
9.1 Consider an infinite-sized buffer into which fluid entry is modulated
by a six-state CTMC {Z(t), t ≥ 0} with S = {1, 2, 3, 4, 5, 6} and
Q =
[ −1   1   0   0   0   0 ]
[  1  −2   1   0   0   0 ]
[  0   1  −2   1   0   0 ]
[  0   0   1  −2   1   0 ]
[  0   0   0   1  −2   1 ]
[  0   0   0   0   1  −1 ] .
The constant output capacity is c = 18 kbps and the fluid arrival rates
in states 1, 2, 3, 4, 5, and 6 are 30, 25, 20, 15, 10, and 5 kbps, respec-
tively. Obtain the joint probability that in steady state there is more
than 20 kb of fluid in the buffer and the CTMC is in state 1. Also
write down an expression for limt→∞ P{X(t) ≤ x}.
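As a hedged first step toward this exercise (stability verification only, not the full buffer content analysis of Section 9.2.3), one can compute the stationary distribution of the modulating CTMC and the mean input rate in Python:

```python
import numpy as np

# generator and state-dependent input rates (kbps) from Exercise 9.1
Q = np.array([
    [-1.0,  1.0,  0.0,  0.0,  0.0,  0.0],
    [ 1.0, -2.0,  1.0,  0.0,  0.0,  0.0],
    [ 0.0,  1.0, -2.0,  1.0,  0.0,  0.0],
    [ 0.0,  0.0,  1.0, -2.0,  1.0,  0.0],
    [ 0.0,  0.0,  0.0,  1.0, -2.0,  1.0],
    [ 0.0,  0.0,  0.0,  0.0,  1.0, -1.0]])
rates = np.array([30.0, 25.0, 20.0, 15.0, 10.0, 5.0])
c = 18.0

# stationary distribution: p Q = 0 and p sums to one
A = np.vstack([Q.T, np.ones(6)])
b = np.append(np.zeros(6), 1.0)
p, *_ = np.linalg.lstsq(A, b, rcond=None)

mean_rate = float(rates @ p)  # must be below c for stability
drifts = rates - c            # sign pattern used by the exact analysis
```

Because the chain is a symmetric birth–death process, the stationary distribution is uniform and the mean input rate is 17.5 kbps, just below the 18 kbps capacity, so the spectral solution of Section 9.2.3 applies.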
9.2 Consider a finite-sized buffer whose input is modulated by a five-
state CTMC {Z(t), t ≥ 0} with S = {1, 2, 3, 4, 5} and
Q =
[ −3   2   1   0   0 ]
[  2  −5   2   1   0 ]
[  1   2  −6   2   1 ]
[  0   0   2  −4   2 ]
[  0   0   1   2  −3 ] .
The output capacity for the buffer is c = 15 kbps and the fluid arrival
rates in states 1, 2, 3, 4, and 5 are 20, 16, 12, 8, and 4 kbps, respec-
tively. Consider three values of buffer size B in kb, namely, B = 2,
B = 4, and B = 6. For the three cases obtain the probability there is
more than 1 kb of fluid in the buffer in steady state. Also obtain the
fraction of fluid that is lost because of a full buffer in all three cases.
The output capacity for the buffer is c = 4 kbps and the fluid arrival
rate is (7 − i) kbps in state i for all i ∈ S. Derive an expression for
limt→∞ P{X(t) ≤ x} for x ≥ 0.
9.4 “Leaky Bucket” is a control mechanism for admitting data into a
network. It consists of a data buffer and a token pool, as shown in
Figure 9.8. Tokens in the form of fluid are generated continuously
at a fixed rate γ into the token pool of size BT . The new tokens
are discarded if the token pool is full. External data traffic enters
the infinite-sized data buffer in fluid form from a source modulated
by an environment process {Z(t), t ≥ 0}, which is an ℓ-state CTMC.
Data traffic is generated at rate r(Z(t)) at time t. If there are
tokens in the token pool, the incoming fluid takes an equal amount
of tokens and enters the network. If the token pool is empty then
the fluid waits in the infinite-sized data buffer for tokens to arrive.
Let X(t) be the amount of fluid in the data buffer at time t and
Y(t) the amount of tokens in the token buffer at time t. Assume
that at time t = 0 the token buffer and data buffer are both empty,
that is, X(0) = Y(0) = 0. Draw a sample path of Z(t), X(t), Y(t),
and the output rate R(t). What is the stability condition? Assuming
stability, using the results in this chapter, derive an expression
FIGURE 9.8
Single leaky bucket. (From Gautam, N., Telecommun. Syst., 21(1), 35, 2002. With permission.)
The output capacity for the buffer is c = 4 kbps and the fluid arrival
rate is (7 − i) kbps in state i for all i ∈ S. Initially there is 0.1 kb of
fluid and the environment is in state 3. Define the first passage time
as the time to reach a buffer level of 0.2 or 0 kb, whichever happens
first. Obtain the LST of the first passage time and derive the mean
first passage time.
9.7 Consider a wireless sensor node that acts as a source that generates
fluid at a constant rate r. This fluid flows into a buffer of infinite size.
The buffer is emptied by a channel that toggles between capacity c
and 0, staying for exp(γ) and exp(δ) amounts of time, respectively. Assume that c > r
and that the system is stable. Derive an expression for the steady-
state distribution of buffer contents. Also, characterize the output
rate process from the queue (notice that the output rate is 0, r, or c).
FIGURE 10.1
Single buffer with a single environment process and output capacity c. (From Gautam, N., Quality of service metrics. In: Frontiers in Distributed Sensor Networks, S.S. Iyengar and R.R. Brooks
(eds.), Chapman & Hall/CRC Press, Boca Raton, FL, pp. 613–628, 2004.)
A(t) = ∫₀ᵗ r(Z(u)) du.
This uses the fact that A(t) is the cumulative workload generated from time
0 to t, and r(Z(u)) is the instantaneous rate at which workload is gener-
ated. Thus, from the first principles of integration we have the expression
Stochastic Fluid-Flow Queues: Bounds and Tail Asymptotics 591
h(v) = lim_{t→∞} (1/t) log E{exp(vA(t))},

where
rmean = E(r(Z(∞))) is the mean traffic flow rate,
rpeak = supz{r(z)} is the peak traffic flow rate, and
h′(v) denotes the derivative of h(v) with respect to v.

The effective bandwidth is

eb(v) = lim_{t→∞} (1/(vt)) log E{exp(vA(t))} = h(v)/v.
Thus, the ALMGF and the effective bandwidth are related much like the
LST and the Laplace transform. There are benefits of both

FIGURE 10.2
Graph of h(v) versus v.
and thus we will continue using both. Using the definition of eb(v), it can be
shown that eb(v) is an increasing function of v with

lim_{v→0} eb(v) = rmean.

Also,

lim_{v→∞} eb(v) = rpeak.
These properties are depicted in Figure 10.3. Although we will see their
implications subsequently, it is worthwhile explaining the properties related
to rmean and rpeak , which are the mean and peak input rates, respectively.
Essentially, eb(v) summarizes the workload generation rate process. Two
obvious summaries are rmean and rpeak that correspond to the average case
and worst-case scenarios. The eb(v) parameter captures those as well as
everything in between. This would become more apparent when we con-
sider the workload flowing into an infinite-sized buffer (with X denoting the
steady-state buffer content level assuming it exists). If the output capacity
of this buffer is eb(v), then as x → ∞, P{X > x} ≈ e^{−vx}. Naturally, the output
capacity of the buffer must be greater than rmean to ensure stability and it
must be less than rpeak to have any fluid buildup. We expect the probabil-
ity of the buffer content being greater than x to be higher when the output
capacity is closer to rmean than when it is closer to rpeak . That intuition can be
verified.
FIGURE 10.3
Graph of eb(v) versus v.
Having defined the ALMGF and the effective bandwidth (as well as briefly
alluded to how we will use them for describing the steady-state buffer content
distribution), the next question is whether, given the environment process
{Z(t), t ≥ 0} and the workload rates r(Z(t)) for all t, we can obtain expressions
for h(v) and eb(v). That will be the focus of the following section.
Next, through Problem 99, we show how to derive the previous
expression.
Problem 99
Derive the expression for h(v) in Equation 10.1 using the definition of h(v)
for a CTMC environment process {Z(t), t ≥ 0} with state space S, generator
matrix Q = [qij ], and fluid rate matrix R.
Solution
From the definition of h(v) given by

h(v) = lim_{t→∞} (1/t) log E[e^{vA(t)}],
we consider

gi(t) = E[e^{vA(t)} | Z(0) = i]

for some i ∈ S. We can immediately write down the following for some
infinitesimally small positive h:

gi(t + h) = E[e^{vA(t+h)} | Z(0) = i]
          = Σ_{j∈S} E[e^{vA(t+h)} | Z(h) = j, Z(0) = i] P{Z(h) = j | Z(0) = i}
          = Σ_{j∈S} e^{vr(i)h} E[e^{vA(t)} | Z(0) = j] qij h + e^{vr(i)h} E[e^{vA(t)} | Z(0) = i] + o(h),
where o(h) are terms of the order higher than h such that o(h)/h → 0
as h → 0. Before proceeding, we explain the last equation. First of all,
P{Z(h) = j|Z(0) = i} = qij h + o(h) if i ≠ j and P{Z(h) = i|Z(0) = i} = 1 + qii h + o(h)
using standard CTMC transient analysis results. Also, from time 0 to h when
the CTMC is in state i, r(i)h amount of fluid is generated. Thus, A(t + h) is
stochastically identical to A(t) + r(i)h assuming that at time h the environ-
ment process toggles from i to j. Using that we can rewrite gi (t + h) in the
previous equation as
gi(t + h) = Σ_{j∈S} e^{vr(i)h} gj(t) qij h + e^{vr(i)h} gi(t) + o(h).
Subtracting gi (t) from both sides of the equation, dividing by h, and letting
h → 0, we get
dgi(t)/dt = gi(t) v r(i) + Σ_{j∈S} gj(t) qij,
using the fact that evr(i)h = 1 + vr(i)h + o(h). We can write the preceding
differential equation in vector form as
dg(t)/dt = [Rv + Q] g(t),
where g(t) is a |S| × 1 column vector of gi (t) values. The equation is similar
to several differential equations derived in Chapter 9. From there we can see
g(t) = Σ_{j∈S} aj e^{λj t} ψj,

where
the aj values are scalar constants, and
λj and ψj are the jth eigenvalue and corresponding right eigenvector,
respectively, of (Rv + Q).

Letting θ = e(Rv + Q) denote the eigenvalue of (Rv + Q) with the largest real
part, we can rewrite this as

g(t) = e^{θt} Σ_{j∈S} aj e^{(λj − θ)t} ψj.
Also,

h(v) = lim_{t→∞} (1/t) log[π0 g(t)].

Using the expression for g(t), we can immediately write down the following:

h(v) = lim_{t→∞} (1/t) log[ e^{θt} π0 Σ_{j∈S} aj e^{(λj − θ)t} ψj ]
     = lim_{t→∞} (1/t) { log e^{θt} + log[ π0 Σ_{j∈S} aj e^{(λj − θ)t} ψj ] }
     = θ + lim_{t→∞} (1/t) log[ π0 Σ_{j∈S} aj e^{(λj − θ)t} ψj ]
     = θ + 0.
For the last expression, we do need some j for which λj = θ so that the
summation itself does not go to zero or infinity as t → ∞. Since θ = e(Rv + Q),
we have h(v) = e(Rv + Q).
Problem 100
Consider an on-off source with on-times according to exp(α) and off-times
according to exp(β). Traffic is generated at rate r when the source is in the
on-state and no traffic is generated when the source is in the off-state. Obtain
a closed-form algebraic expression for eb(v) for such a source.
Solution
The environment process {Z(t), t ≥ 0} is a two-state CTMC, where the first
state corresponds to the source being off and the second state corresponds to
the source being on. Therefore,
R =
[ 0   0 ]
[ 0   r ]      and      Q =
[ −β    β ]
[  α   −α ] .

To obtain h(v) = e(Rv + Q), consider

M = Rv + Q =
[ −β      β    ]
[  α   rv − α  ] .
The eigenvalues of M solve

|M − λI| = 0,

which yields

(rv − α − λ)(−β − λ) − βα = 0,

that is,

λ² + (β + α − rv)λ − βrv = 0.

Taking the larger root and dividing by v, we get

eb(v) = (rv − α − β + √((rv − α − β)² + 4βrv)) / (2v).     (10.2)

Note that here

rmean = rβ/(α + β)      and      rpeak = r.
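As a numerical cross-check, the closed form (10.2) should agree with the spectral characterization h(v) = e(Rv + Q) from Problem 99; the sketch below compares the two at a few arbitrary illustrative parameter values.

```python
import numpy as np

def eb_closed_form(v, alpha, beta, r):
    """Equation (10.2) for an exponential on-off source."""
    a = r * v - alpha - beta
    return (a + np.sqrt(a * a + 4.0 * beta * r * v)) / (2.0 * v)

def eb_spectral(v, alpha, beta, r):
    """eb(v) = e(Rv + Q)/v: largest real eigenvalue of Rv + Q, divided by v."""
    Q = np.array([[-beta, beta], [alpha, -alpha]])
    R = np.diag([0.0, r])
    return float(max(np.linalg.eigvals(R * v + Q).real)) / v

alpha, beta, r = 1.0, 2.0, 5.0  # illustrative values
vals = [(eb_closed_form(v, alpha, beta, r), eb_spectral(v, alpha, beta, r))
        for v in (0.1, 0.5, 1.0, 2.0)]
```

Both computations produce an increasing function of v that stays between rmean = rβ/(α + β) and rpeak = r, as the properties in Figure 10.3 require.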
Problem 101
Water flows into a reservoir according to a CTMC {Z(t), t ≥ 0} with
S = {1, 2, 3, 4, 5} and
Q =
[ −1     0.4    0.3    0.2    0.1 ]
[  0.4  −0.7    0.1    0.1    0.1 ]
[  0.5   0.4   −1.1    0.1    0.1 ]
[  0.2   0.3    0.3   −1.0    0.2 ]
[  0.3   0.3    0.3    0.3   −1.2 ] .
When Z(t) = i, the inflow rate is 4i. Graph eb(v) versus v for the water flow.
Solution
For the CTMC source we have
R =
[ 4   0    0    0    0 ]
[ 0   8    0    0    0 ]
[ 0   0   12    0    0 ]
[ 0   0    0   16    0 ]
[ 0   0    0    0   20 ] .
Also rpeak = 20 and rmean = Σ_{i=1}^{5} 4i pi = 9.6672 since the solution to
[p1 p2 p3 p4 p5] Q = [0 0 0 0 0] and p1 + p2 + p3 + p4 + p5 = 1 is
[p1 p2 p3 p4 p5] = [0.2725 0.3438 0.1652 0.1315 0.0870]. Using eb(v) = e(Q/v + R),
we plot eb(v) versus v in Figure 10.4.
FIGURE 10.4
Graph of eb(v) versus v for Problem 101.
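The computations in Problem 101 can be reproduced in a few lines of Python; the sketch below recomputes the stationary distribution, rmean, and sample points of the eb(v) curve (plotting is omitted).

```python
import numpy as np

Q = np.array([
    [-1.0,  0.4,  0.3,  0.2,  0.1],
    [ 0.4, -0.7,  0.1,  0.1,  0.1],
    [ 0.5,  0.4, -1.1,  0.1,  0.1],
    [ 0.2,  0.3,  0.3, -1.0,  0.2],
    [ 0.3,  0.3,  0.3,  0.3, -1.2]])
R = np.diag([4.0, 8.0, 12.0, 16.0, 20.0])

def eb(v):
    """eb(v) = e(Q/v + R): largest real eigenvalue of Q/v + R."""
    return float(max(np.linalg.eigvals(Q / v + R).real))

# stationary distribution, mean and peak rates quoted in the solution
A = np.vstack([Q.T, np.ones(5)])
p, *_ = np.linalg.lstsq(A, np.append(np.zeros(5), 1.0), rcond=None)
r_mean = float(np.diag(R) @ p)   # approximately 9.667
r_peak = 20.0
curve = [eb(v) for v in (0.25, 0.5, 1.0, 2.0)]
```

The sampled curve is increasing and stays between rmean and rpeak, consistent with the plot described in Figure 10.4.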
G̃ij(w) = ∫₀^∞ e^{−wx} dGij(x),
then we have
and
FIGURE 10.5
(a) e(Λ(u, v)) versus u and (b) e(Λ(u, v)) versus u.
Problem 102
Fluid flows into a buffer according to a three-state SMP {Z(t), t ≥ 0} with
state space {1, 2, 3}. The elements of the kernel of this SMP are given as
follows: G12 (t) = 1 − e−t − te−t , G21 (t) = 0.4(1 − e−0.5t ) + 0.3(1 − e−0.2t ),
G23 (t) = 0.2(1 − e−0.5t ) + 0.1(1 − e−0.2t ), G32 (t) = 1 − 2e−t + e−2t , and
G11 (t) = G13 (t) = G22 (t) = G31 (t) = G33 (t) = 0. Also, the flow rates in the three
states are r(i) = i for i = 1, 2, 3. Graph eb(v) versus v for v ∈ [0, 3].
Solution
For the SMP source we have the LST of the kernel as
G̃(w) =
[ 0                                1/(w + 1)²          0                              ]
[ 0.2/(0.5 + w) + 0.06/(0.2 + w)   0                   0.1/(0.5 + w) + 0.02/(0.2 + w) ]
[ 0                                2/((2 + w)(1 + w))  0                              ] .
Now, using the elements of the LST of the kernel, we can easily write down,
for i = 1, 2, 3 and j = 1, 2, 3,

Λij(u, v) = G̃ij(u − r(i)v).

Notice that the LSTs are such that G̃ij(w) would shoot off to infinity only
if their denominators (if any) become zero. However, the shooting off to
infinity would not be sudden but gradual. Hence, for all v, e∗(v) ≥ 1 (in fact
e∗(v) = ∞ and u∗(v) = max{r(1)v − 1, r(2)v − 0.2, r(3)v − 1, 0} because, for
example, at u = r(1)v − 1, the denominator of G̃12(w) goes to zero, and so on
for the other LSTs). Thus, to compute eb(v), all we need is the unique solution
to e(Λ(v eb(v), v)) = 1.
Next, we explain how to numerically obtain the unique solution. Essentially,
for a given v, e(Λ(u, v)) decreases with respect to u from u∗(v) to
infinity. Using that and the bounds on eb(v), that is, rmean ≤ eb(v) ≤ rpeak for all
v, we can perform a binary search for eb(v) between max{rmean, u∗(v)/v} and
rpeak to find the unique solution to e(Λ(v eb(v), v)) = 1. Notice that rpeak = r(3) = 3
and rmean = Σ_{i=1}^{3} i pi = 1.8119, since the stationary distribution of the SMP
can be computed as [p1 p2 p3] = (1/(π1τ1 + π2τ2 + π3τ3)) [π1τ1 π2τ2 π3τ3]
= (1/(0.35 × 2 + 0.5 × 3.2 + 0.15 × 1.5)) [0.35 × 2  0.5 × 3.2  0.15 × 1.5] = [0.2772
0.6337 0.0891].
FIGURE 10.6
Graph of eb(v) versus v for Problem 102.
Using eb(v) as the solution to e(Λ(v eb(v), v)) = 1, we plot eb(v) versus v for
v ∈ [0, 3] in Figure 10.6. The ALMGF h(v) can be obtained as v eb(v).
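The binary search described above can be sketched in Python as follows; the entry mapping Λij(u, v) = G̃ij(u − r(i)v) is the SMP formulation assumed in this section, and 200 bisection steps are an arbitrary choice.

```python
import numpy as np

# LSTs of the kernel entries in Problem 102
def G12(w): return 1.0 / (w + 1.0) ** 2
def G21(w): return 0.2 / (0.5 + w) + 0.06 / (0.2 + w)
def G23(w): return 0.1 / (0.5 + w) + 0.02 / (0.2 + w)
def G32(w): return 2.0 / ((2.0 + w) * (1.0 + w))

r_mean, r_peak = 1.8119, 3.0

def spectral_radius(u, v):
    """Perron root of Lambda(u, v) with entries G~_ij(u - r(i) v), r(i) = i."""
    L = np.zeros((3, 3))
    L[0, 1] = G12(u - 1.0 * v)
    L[1, 0] = G21(u - 2.0 * v)
    L[1, 2] = G23(u - 2.0 * v)
    L[2, 1] = G32(u - 3.0 * v)
    return float(max(abs(np.linalg.eigvals(L))))

def eb(v):
    """Binary search for eb(v): root of e(Lambda(v eb(v), v)) = 1."""
    u_star = max(1.0 * v - 1.0, 2.0 * v - 0.2, 3.0 * v - 1.0, 0.0)
    lo, hi = max(r_mean, u_star / v), r_peak
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if spectral_radius(mid * v, v) > 1.0:  # radius decreases in u
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

eb1 = eb(1.0)
```

The bisection never evaluates its left endpoint, so the pole at u = u∗(v) causes no trouble; every candidate u stays strictly beyond the poles, keeping all kernel entries positive.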
Problem 103
Consider an on-off source that generates fluid so that the on-times are IID
random variables with CDF U(t) = 1 − 0.6e−3t − 0.4e−2t and the off-times are
IID Erlang random variables with mean 0.5 and variance 1/12 in appropriate
time units compatible with the on-times. When the source is on, fluid is gen-
erated at the rate of 16 units per second; and no fluid is generated when the
source is off. This is identical to the source described in Problem 88. Graph
h(v) versus v for v ∈ [0, 1].
Solution
For the on-off source, we have the LST of the on-times as

E[e^{−wU}] = ∫₀^∞ e^{−wt} dU(t) = 1.8/(3 + w) + 0.8/(2 + w).

Similarly, the LST of the off-times is

E[e^{−wD}] = ∫₀^∞ e^{−wt} dD(t) = (6/(6 + w))³

since the CDF of the off-times is D(t) = 1 − e^{−6t} − 6te^{−6t} − 18t²e^{−6t} for all t ≥ 0.
Using these we can write down

Λ(u, v) = E[e^{(−u+rv)U}] E[e^{−uD}] = (1.8/(3 − rv + u) + 0.8/(2 − rv + u)) (6/(6 + u))³

with r = 16.
Notice that the LSTs are such that Λ(u, v) would shoot off to infinity
only if the denominator becomes zero. Also, the shooting off to infinity
would not be abrupt but gradual. Hence, for all v, e∗(v) ≥ 1 (in fact e∗(v) = ∞
and u∗(v) = max{rv − 2, 0} because for all u > rv − 2, the denominator of
Λ(u, v) is nonzero). Thus, to compute h(v) all we need is the unique solution
to Λ(h(v), v) = 1. To numerically obtain the unique solution, note that
for a given v, Λ(u, v) decreases with respect to u from u∗(v) to infinity.
Using that and the bounds on eb(v), that is, rmean ≤ eb(v) ≤ rpeak for all v, we
can perform a binary search for h(v) between max{v rmean, u∗(v)} and v rpeak
to find the unique solution to Λ(h(v), v) = 1. Here we have rpeak = r = 16
and rmean = rE[U]/(E[U] + E[D]) = 7.1111. Using h(v) as the solution to
Λ(h(v), v) = 1, we plot h(v) versus v for v ∈ [0, 1] in Figure 10.7.
FIGURE 10.7
Graph of h(v) versus v for Problem 103.
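The bisection for h(v) is equally short in code; the sketch below reproduces the two values h(0.5) = 6.1205 and h(1) = 14.0226 that are also quoted in Problem 104.

```python
def lst_on(w):
    """LST of the on-time CDF U(t) = 1 - 0.6 exp(-3t) - 0.4 exp(-2t)."""
    return 1.8 / (3.0 + w) + 0.8 / (2.0 + w)

def lst_off(w):
    """LST of the Erlang(3, 6) off-time distribution."""
    return (6.0 / (6.0 + w)) ** 3

r = 16.0

def Lam(u, v):
    """Lambda(u, v) for this on-off source."""
    return lst_on(u - r * v) * lst_off(u)

def h(v):
    """Bisection for h(v): the unique root of Lambda(h(v), v) = 1."""
    r_mean = r * 0.4 / 0.9              # r E[U]/(E[U] + E[D]) = 7.1111
    lo = max(v * r_mean, r * v - 2.0)   # at or above u*(v); never evaluated
    hi = v * r                          # v * r_peak
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if Lam(mid, v) > 1.0:           # Lambda decreases in u
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Because Λ(u, v) is strictly decreasing in u beyond u∗(v), the bisection converges to the unique root at machine precision within a few dozen iterations.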
Problem 104
Consider Problem 103 and obtain h(v) for v = 0.5 and v = 1 using the CTMC
source results.
Solution
We follow the analysis outlined in the solution to Problem 88. Notice that the
on-times correspond to a two-phase hyperexponential distribution. Hence
the on-time would be exp(3) with probability 0.6 and it would be exp(2) with
probability 0.4, which can be deduced from U(t). The off-times correspond
to the sum of three IID exp(6) random variables. Thus, we can write down
the environment process {Z(t), t ≥ 0} as an ℓ = 5 state CTMC with states 1 and
2 corresponding to on and states 3, 4, and 5 corresponding to off. Thus, the
Q matrix corresponding to S = {1, 2, 3, 4, 5} is
Q =
[ −3     0     3    0    0 ]
[  0    −2     2    0    0 ]
[  0     0    −6    6    0 ]
[  0     0     0   −6    6 ]
[ 3.6   2.4    0    0   −6 ] .
Since h(v) = e(Q + vR), we obtain the eigenvalues of (Q + vR). For v = 0.5,
we get the eigenvalues as −9.3707, −4.4931 + 3.4689i, −4.4931 − 3.4689i,
5.2364, and 6.1205. When some of the eigenvalues are complex, software
packages asked to compute the maximum might compute the maximum of
the absolute values. However, in this case, that would return the wrong
value. The correct approach is to determine the maximum of the real parts.
If one does that here, we get h(0.5) = 6.1205, which matches exactly
with the approach used in Problem 103. There is a perfect match for h(1)
as well yielding a value 14.0226, again with two eigenvalues with imaginary
parts. It is worthwhile to spend a few moments contrasting the two methods,
that is, the one used in this problem and that of Problem 103.
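The caveat about complex eigenvalues is easy to demonstrate numerically; the sketch below recomputes h(0.5) and h(1) from this problem's matrices and shows that a maximum over absolute values would instead return 9.3707.

```python
import numpy as np

Q = np.array([
    [-3.0,  0.0,  3.0,  0.0,  0.0],
    [ 0.0, -2.0,  2.0,  0.0,  0.0],
    [ 0.0,  0.0, -6.0,  6.0,  0.0],
    [ 0.0,  0.0,  0.0, -6.0,  6.0],
    [ 3.6,  2.4,  0.0,  0.0, -6.0]])
R = np.diag([16.0, 16.0, 0.0, 0.0, 0.0])  # states 1, 2 on; 3, 4, 5 off

def h(v):
    """h(v) = e(Q + vR): maximum of the REAL PARTS of the eigenvalues."""
    return float(max(np.linalg.eigvals(Q + v * R).real))

h_half, h_one = h(0.5), h(1.0)
# the naive maximum over absolute values picks the wrong eigenvalue:
max_abs_half = float(max(abs(np.linalg.eigvals(Q + 0.5 * R))))
```

The values agree with the bisection approach of Problem 103, while the absolute-value maximum corresponds to the eigenvalue −9.3707 and is clearly wrong.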
There are several other stochastic process sources for which one can
obtain the effective bandwidths. As described earlier for an MRGP and
regenerative process, we can use the results in Kulkarni [68]. See Krishnan
et al. [65] for the calculation of effective bandwidths for traffic modeled by
fractional Brownian motion. In fact, we can even obtain effective bandwidths
for discrete sources. We do not present any of those here but the reader is
encouraged to do a literature search to find out more about those cases. Next
we consider some quick extensions.
FIGURE 10.8
Single infinite-sized buffer with multiple input sources. (From Gautam, N. et al., Prob. Eng.
Inform. Sci., 13, 429, 1999. With permission.)
fluid at rate rk (Zk (t)) into the buffer. Let ebk (v) be the effective bandwidth of
source k such that
ebk(v) = lim_{t→∞} (1/(vt)) log E{exp(vAk(t))},

where

Ak(t) = ∫₀ᵗ rk(Zk(u)) du.
If the stochastic process {Zk (t), t ≥ 0} is an SMP (or one of the processes that
is tractable to get the effective bandwidths), then we can obtain ebk (v). Then
the net effective bandwidth of the fluid arrival into the buffer due to all the K
sources is, say, eb(v). Since the net fluid input rate is just the sum of the input
rates of the K superposed sources, we have A(t) = A1(t) + · · · + AK(t). By
definition,
eb(v) = lim_{t→∞} (1/(vt)) log E{exp(vA(t))}
      = lim_{t→∞} (1/(vt)) log E[exp(v{A1(t) + · · · + AK(t)})].

Since the K sources are independent, the expectation factorizes:

eb(v) = lim_{t→∞} (1/(vt)) log( E[exp(vA1(t))] E[exp(vA2(t))] · · · E[exp(vAK(t))] )
      = lim_{t→∞} (1/(vt)) Σ_{k=1}^{K} log E[exp(vAk(t))].
Thus, we have
eb(v) = Σ_{k=1}^{K} ebk(v).
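For instance, for K homogeneous exponential on-off sources (the parameter values below are illustrative), the aggregate effective bandwidth is just K times the single-source closed form (10.2):

```python
import math

def eb_onoff(v, alpha, beta, r):
    """Single-source effective bandwidth, Equation (10.2)."""
    a = r * v - alpha - beta
    return (a + math.sqrt(a * a + 4.0 * beta * r * v)) / (2.0 * v)

def eb_total(v, K, alpha, beta, r):
    """K iid superposed sources: effective bandwidths simply add."""
    return K * eb_onoff(v, alpha, beta, r)

K, alpha, beta, r = 10, 1.0, 0.5, 3.0  # illustrative values
agg = eb_total(0.4, K, alpha, beta, r)
```

The aggregate value sits between K·rmean and K·rpeak, mirroring the single-source bounds.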
A(t) = ∫₀ᵗ r(Z(u)) du,

and its ALMGF and effective bandwidth are

hA(v) = lim_{t→∞} (1/t) log E{exp(vA(t))}

and

ebA(v) = lim_{t→∞} (1/(vt)) log E{exp(vA(t))}.

Assume the buffer is stable, that is,
E[r(Z(∞))] < c.
We ask the question: What is the ALMGF as well as the effective band-
width of the output traffic from the buffer? For that let D(t) be the total
output from the buffer over (0, t]. By definition, the ALMGF of the output is
hD(v) = lim_{t→∞} (1/t) log E{exp(vD(t))}.
Recall that

lim_{v→0} h(v)/v = rmean      and      lim_{v→∞} h(v)/v = rpeak

for any ALMGF h(v). For an infinite-sized stable buffer, since rmean would be
the same for the input and output due to no loss, we have

lim_{v→0} hD(v)/v = lim_{v→0} hA(v)/v = rmean,

and, since the output rate never exceeds the capacity c,

lim_{v→∞} hD(v)/v = c.
Figure 10.9 pictorially depicts this where there is a v∗, which is the value of
v for which h′A(v) = c. In other words, for v > v∗, hD(v) essentially follows the
tangent at point v∗. We can write down the relationship between hA(v) and
hD(v) as

hD(v) = hA(v)                        if 0 ≤ v ≤ v∗,
hD(v) = hA(v∗) − cv∗ + cv            if v > v∗,

where v∗ is the unique solution to

(d/dv)[hA(v)] = c.
FIGURE 10.9
Relationship between ALMGF of the input and the output of a buffer. (From Kulkarni, V.G. and
Gautam, N., Queueing Syst. Theory Appl., 27, 79, 1997. With permission.)
Similarly, if we define

ebD(v) = lim_{t→∞} (1/(vt)) log E{exp(vD(t))},

then to obtain the relationship between ebD(v) and ebA(v), we can write down
the effective bandwidth ebD(v) of the output as

ebD(v) = ebA(v)                          if 0 ≤ v ≤ v∗,
ebD(v) = c − (v∗/v){c − ebA(v∗)}         if v > v∗.
Notice from the preceding expression that ebD (v) ≤ ebA (v). That is mainly
because D(t) ≤ A(t) for all t if the queue was empty initially. Further, the peak
rate for the input is rpeak whereas it is c for the output. Since rpeak > c, there
would be values of v where ebD(v) is strictly less than ebA(v) (as for very large
v, the effective bandwidths approach their respective peak values, i.e., rpeak
for input and c for output). The main implication of ebD (v) ≤ ebA (v) is that as
fluid is passed from queue to queue, it will eventually get more and more
smooth approaching closer to the mean rate.
For more details regarding effective bandwidths of output processes,
refer to Chang and Thomas [16], Chang and Zajic [17], and de Veciana
et al. [23]. In the following section, we will find out how to use effective
bandwidths to obtain bounds and approximations for the steady-state buffer
contents.
E[r(Z(∞))] < c.
A common feature for the bounds and approximations of P{X > x} is that
all of them use the effective bandwidth of the fluid input. In fact, the struc-
ture of all the approximations and bounds would also be somewhat similar.
The key differences among the different bounds and approximations are the
values of x for which the methods are valid and whether the result is con-
servative. By conservative, we mean that our expression for P{X > x} is higher
than the true P{X > x}. The reason we call that conservative is if one were to
design a system based on our expression for P{X > x}, then what is actually
observed in terms of performance would only be better. With that notion, we
first summarize the various methods for bounds and approximations. Later
we describe them in detail.
• Exact computation:
Expressions for P{X > x} of the type

P{X > x} = Σ_i bi e^{−ηi x}.
Based on: Anick et al. [5], Elwalid and Mitra [28, 29], and
Kulkarni [69].
Described in: Section 9.2.3.
Valid for: Any x and any CTMC environment process {Z(t), t ≥ 0} or
environment processes that can easily be modeled as CTMCs.
Drawback: Not easily extendable to other environment processes.
• Effective bandwidth approximation:
Estimates of the tail probabilities P{X > x} using just the effective
bandwidth calculations as P{X > x} ≈ e^{−ηx}, where η solves eb(η) = c.
Based on: Elwalid and Mitra [30], Kesidis et al. [61], Krishnan et al.
[65], and Kulkarni [68].
Described in: Section 10.2.1.
Valid for: Large x and a wide variety of stochastic processes.
Drawback: Could be off by an order of magnitude for not-so-large x.
• Chernoff dominant eigenvalue approximation:
An improvement to the effective bandwidth approximation for
P{X > x} of the form L e^{−ηx}, with the constant L obtained via Chernoff's bound.
Based on: Palmowski and Rolski [87, 88] and Gautam et al. [39].
Described in: Section 10.2.3.
Valid for: Any x and any SMP environment process {Z(t), t ≥ 0}.
Drawback: Computationally harder than other methods.
Notice that the η described in the exponent of the last three methods
are in fact equal. So essentially the methods eventually only differ in the
constant that multiplies e^{−ηx}. However, the approaches are somewhat
different and their scopes are different too, as we will see in the following
sections.
We already discussed the computation of P{X > x} for CTMC environment
processes in Chapter 9. The others are described in the following.
eb(η) = c. (10.5)
Problem 105
Consider an infinite-sized buffer with output capacity c and input regulated
by an on-off source with on-times according to exp(α) and off-times accord-
ing to exp(β). Traffic is generated at rate r, when the source is in the on-state,
and no traffic is generated when the source is in the off-state. Assume that
rβ/(α + β) < c < r. Using the effective bandwidth approximation, develop an
expression for the tail distribution. Compare that with the exact expression
for the buffer contents.
Solution
Recall from Equation 10.2 that the effective bandwidth of such a CTMC
on-off source is

eb(v) = (rv − α − β + √((rv − α − β)² + 4βrv)) / (2v).

The effective bandwidth approximation requires an η such that eb(η) = c, that is,

(rη − α − β + √((rη − α − β)² + 4rβη)) / (2η) = c,
which results in

η = −β/c + α/(r − c).

Thus, the effective bandwidth approximation to the tail distribution is

P{X > x} ≈ e^{−ηx},

where

η = −β/c + α/(r − c).
From the exact analysis in Equation 9.22 in Problem 87, we can see that

P{X > x} = (βr/(c(α + β))) e^{λx},

where

λ = β/c − α/(r − c).

Notice that λ = −η: the exact tail and the approximation decay at the same
rate and differ only in the multiplicative constant βr/(c(α + β)).
Problem 106
Consider the system described in Problem 85, where there is an infinite-sized
buffer with output capacity c = 12 kbps and the input is driven by a four-state
CTMC with
Q =
[ −10    2    3    5 ]
[   0   −4    1    3 ]
[   1    1   −3    1 ]
[   1    2    3   −6 ] .
The fluid rate matrix is

R =
[ 20    0    0   0 ]
[  0   15    0   0 ]
[  0    0   10   0 ]
[  0    0    0   5 ] .

Using

eb(v) = e(Q/v + R),

we can solve eb(η) = c = 12 for η and compare the resulting approximation
e^{−ηx} against the exact expression
for all x ≥ 0. The second term in this example is practically negligible for
almost any x value. Thus, the approximation is off by a constant factor of
about 0.6757, which would be reasonable when x is large, and we would get
the right order of magnitude.
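The η for this example can be found by bisection using the monotonicity of eb(v); the matrices below are the ones given in the problem.

```python
import numpy as np

Q = np.array([
    [-10.0,  2.0,  3.0,  5.0],
    [  0.0, -4.0,  1.0,  3.0],
    [  1.0,  1.0, -3.0,  1.0],
    [  1.0,  2.0,  3.0, -6.0]])
R = np.diag([20.0, 15.0, 10.0, 5.0])
c = 12.0

def eb(v):
    """eb(v) = e(Q/v + R): largest real eigenvalue of Q/v + R."""
    return float(max(np.linalg.eigvals(Q / v + R).real))

# eb(v) increases from r_mean (about 10.7) toward r_peak = 20, so a
# bisection brackets the unique eta with eb(eta) = c
lo, hi = 1e-4, 50.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if eb(mid) < c:
        lo = mid
    else:
        hi = mid
eta = 0.5 * (lo + hi)
```

The resulting η is the decay rate in the approximation e^{−ηx}; the exact analysis of Chapter 9 would additionally supply the multiplicative constant.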
(i.e., ℓ − m states with strictly positive drifts). Exact analysis (using the
notation in Section 9.2.2) yields

P{X > x} = 1 − Σ_{j=1}^{ℓ} F(x, j) = Σ_{i=1}^{ℓ−m} ki e^{λi x},

where each ki is in terms of all the φj and aj values. However, the effective
bandwidth approximation for large values of x yields P{X > x} ≈ e^{−ηx} with

η = − max_{i : Re(λi) < 0} λi.
Problem 107
Consider a stable M/M/1 queue with arrival rate λ and service rate μ. Derive
an expression for the steady-state workload distribution. Then use the effec-
tive bandwidth approximation to obtain the probability that the steady-state
workload is greater than x for some very large x.
Solution
Let W be a random variable denoting the steady-state workload in the sys-
tem for an M/M/1 queue. By conditioning on the steady-state number in the
system, the LST of the workload is

E[e^{−sW}] = Σ_{j=0}^{∞} (1 − λ/μ)(λ/μ)^j (μ/(μ + s))^j
           = (1 − λ/μ) (μ + s)/(μ + s − λ)
           = (1 − λ/μ) (1 + λ/(μ + s − λ)).

Inverting this LST, W = 0 with probability 1 − λ/μ and is an exp(μ − λ)
random variable otherwise, so that

P{W > x} = (λ/μ) e^{−(μ−λ)x}.
Next, the total work arriving in time (0, t] is

A(t) = Σ_{i=1}^{N(t)} Si,

where Si ∼ exp(μ) and N(t) is the number of events (i.e., arrivals) in time (0, t]
of a Poisson process with rate λ.
Then

E[e^{vA(t)}] = E[ E[e^{vA(t)} | N(t)] ] = E[ (μ/(μ − v))^{N(t)} ]

since E[e^{vSi}] = μ/(μ − v). Also, by computing the generating function for a
Poisson random variable N(t) at z = μ/(μ − v), we can write

E[e^{vA(t)}] = e^{−(1−z)λt},

so that

eb(v) = lim_{t→∞} (1/(vt)) log E[e^{vA(t)}] = (z − 1)λ/v = λ/(μ − v).
Since the server depletes the workload at unit rate, η solves eb(η) = 1, that is,

η = μ − λ.

Thus, we have the tail distribution of the workload for very large x as P{W > x} ≈ e^{−ηx}, where η = μ − λ. Notice from the exact analysis that P{W > x} = (λ/μ) e^{−ηx}. Thus, similar to the fluid queue, here too the exponent term agrees perfectly, which would make the approximation excellent as x grows.
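The comparison above can be sanity-checked numerically. The sketch below (not code from the text) estimates P{W > x} by simulating waiting times via Lindley's recursion, which by PASTA have the steady-state workload distribution; the rates λ = 1, μ = 2 and level x = 1.5 are illustrative choices.

```python
import random
import math

def mm1_workload_tail(lam, mu, x, n=200_000, seed=1):
    """Estimate P{W > x} for the steady-state M/M/1 workload by simulating
    Lindley's recursion W_{n+1} = max(W_n + S_n - A_n, 0); by PASTA the
    waiting time seen by an arrival is distributed like the workload."""
    rng = random.Random(seed)
    w, count = 0.0, 0
    for _ in range(n):
        if w > x:
            count += 1
        # add this customer's service time, subtract the next interarrival time
        w = max(w + rng.expovariate(mu) - rng.expovariate(lam), 0.0)
    return count / n

lam, mu, x = 1.0, 2.0, 1.5
exact = (lam / mu) * math.exp(-(mu - lam) * x)   # (lambda/mu) e^{-(mu-lambda)x}
ebw = math.exp(-(mu - lam) * x)                  # effective bandwidth approximation
sim = mm1_workload_tail(lam, mu, x)
print(exact, ebw, sim)
```

As expected, the simulated tail tracks the exact expression, while the effective bandwidth approximation overestimates by the constant factor μ/λ with the correct exponent.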
For stability, we need

Σ_{k=1}^{K} E{r_k(Z_k(∞))} < c.

The effective bandwidth of source k is

eb_k(v) = lim_{t→∞} (1/(vt)) log E{exp(v A_k(t))},

where

A_k(t) = ∫_0^t r_k(Z_k(u)) du.

Then η is the solution to

Σ_{k=1}^{K} eb_k(η) = c.
Problem 108
A wireless sensor system with seven nodes forms a feed-forward in-tree network as depicted in Figure 10.10. Every node of the network has an infinite-sized buffer (denoted by B1, . . . , B7) and information flows in and out of those buffers. Except for nodes 5 and 7, which only transmit sensed information, all other nodes sense as well as transmit information. We model sensed information arriving into buffers B1, B2, B3, B4, and B6 as independent and identically distributed exponential on-off fluids with parameters α per second, β per second, and r kBps. For i = 1, . . . , 7, the output capacity of buffer Bi is ci kBps. The sensor network operators would prefer not to have more than b kB of information stored in any buffer at any time in steady state. Using the effective bandwidth approximation, derive approximations for the probability of exceeding b kB of information in each of the seven buffers. Obtain
FIGURE 10.10
In-tree network.
The mean traffic generation rate of each source is

r_s^{mean} = rβ/(α + β).
The effective bandwidth of the output of buffer j is

eb_j^{out}(v) = eb_j^{in}(v)                                 if 0 ≤ v ≤ v*_j,
eb_j^{out}(v) = c_j − (v*_j/v)(c_j − eb_j^{in}(v*_j))        if v > v*_j,

where v*_j solves

(d/dv)[v eb_j^{in}(v)] = c_j.
The input effective bandwidths satisfy

eb_1^{in}(v) = eb_s(v),
eb_2^{in}(v) = eb_s(v),
eb_3^{in}(v) = eb_s(v),
eb_4^{in}(v) = eb_s(v) + eb_1^{out}(v),
eb_6^{in}(v) = eb_s(v) + eb_5^{out}(v),

and for each buffer j, η_j solves

eb_j^{in}(η_j) = c_j.
Next, to compute eb_j^{out}(v) for j = 1, 2, 3, we first need v*_j. For j = 1, 2, 3, we can solve for v in

(d/dv)[v eb_j^{in}(v)] = c_j

to get

v*_j = (β/r)(√(c_j α/(β(r − c_j))) − 1) + (α/r)(1 − √(β(r − c_j)/(c_j α))) = 0.4031.
At buffer B4 we have

eb_4^{in}(v) = eb_s(v) + eb_1^{out}(v) = eb_s(v) + eb_1^{in}(v)                                if 0 ≤ v ≤ v*_1,
eb_4^{in}(v) = eb_s(v) + eb_1^{out}(v) = eb_s(v) + c_1 − (v*_1/v)(c_1 − eb_1^{in}(v*_1))       if v > v*_1,

with v*_2 = v*_3 = 0.4031. Solving eb_5^{in}(η_5) = c_5, we get η_5 = 0.5738.
At buffer B6 we have

eb_6^{in}(v) = eb_s(v) + eb_5^{out}(v) = eb_s(v) + eb_5^{in}(v)                                if 0 ≤ v ≤ v*_5,
eb_6^{in}(v) = eb_s(v) + eb_5^{out}(v) = eb_s(v) + c_5 − (v*_5/v)(c_5 − eb_5^{in}(v*_5))       if v > v*_5.
The CDE (Chernoff dominant eigenvalue) approximation takes the form P{X > x} ≈ L e^{−ηx}, where L can be thought of as the fraction of the fluid that would be lost in steady state if there was no buffer, and η is the solution to

Σ_{k=1}^{K} eb_k(η) = c.
Let

s* = sup_{w≥0} { c w − Σ_{k=1}^{K} m_k(w) },

and let w* be the solution to

Σ_{k=1}^{K} m'_k(w*) = c,

where m'_k(w) denotes the derivative of m_k(w) with respect to w. Then the Chernoff estimate of L is

L ≈ exp(−s*) / (w* σ(w*) √(2π)),

where

σ²(w*) = Σ_{k=1}^{K} m''_k(w*),

with m''_k(w) denoting the second derivative of m_k(w) with respect to w. The main problem is in computing m_k(w). If {Z_k(t), t ≥ 0} can be modeled as a stationary and ergodic process with state space S_k and stationary probability vector p^k, then

m_k(w) = log { Σ_{j∈S_k} p_j^k e^{w r_k(j)} }.
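The recipe above can be sketched numerically. The snippet below (a sketch, not the book's code) uses the single CTMC-modulated source of Problem 106, whose stationary vector p and rates are stated later in this section, and recovers w*, s*, and σ²(w*) via bisection on m'(w) = c.

```python
import math

# stationary probabilities and rates from Problem 106 (a single source, K = 1)
p = [0.0668, 0.2647, 0.4118, 0.2567]
r = [20.0, 15.0, 10.0, 5.0]
c = 12.0

def m(w):        # m(w) = log sum_j p_j e^{w r_j}
    return math.log(sum(pj * math.exp(w * rj) for pj, rj in zip(p, r)))

def m1(w):       # m'(w): mean rate under the exponentially tilted distribution
    num = sum(pj * rj * math.exp(w * rj) for pj, rj in zip(p, r))
    den = sum(pj * math.exp(w * rj) for pj, rj in zip(p, r))
    return num / den

def m2(w):       # m''(w): variance of the rate under the tilted distribution
    den = sum(pj * math.exp(w * rj) for pj, rj in zip(p, r))
    num = sum(pj * rj * rj * math.exp(w * rj) for pj, rj in zip(p, r))
    return num / den - m1(w) ** 2

# w* solves m'(w) = c; m' is increasing, so bisection works
lo, hi = 1e-9, 1.0
for _ in range(200):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if m1(mid) < c else (lo, mid)
w_star = (lo + hi) / 2

s_star = c * w_star - m(w_star)
sigma2 = m2(w_star)
print(w_star, s_star, sigma2)
```

The printed values match the w*, s*, and σ²(w*) reported in the solution of Problem 109 below.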
Problem 109
Consider a source modulated by an ℓ-state irreducible CTMC {Z(t), t ≥ 0} with infinitesimal generator

Q = [q_ij]

and stationary probability vector p satisfying

pQ = 0 and Σ_{l=1}^{ℓ} p_l = 1.

When the CTMC is in state i, the source generates fluid at rate r_i. This source inputs traffic into an infinite capacity buffer with output channel capacity c. Assume that Σ_{l=1}^{ℓ} p_l r_l < c < max_l r_l. Using the CDE approximation, develop an expression for P(X > x). Then, illustrate the approach for the numerical example of Problem 106.
Solution
Using the CDE approximation, we have P{X > x} ≈ L e^{−ηx}. Also,

s* = cw* − m(w*),

where w* solves m'(w*) = c, and

σ²(w*) = [ (Σ_{j=1}^{ℓ} p_j e^{w* r_j})(Σ_{j=1}^{ℓ} p_j r_j² e^{w* r_j}) − (Σ_{j=1}^{ℓ} p_j r_j e^{w* r_j})² ] / (Σ_{j=1}^{ℓ} p_j e^{w* r_j})².

Then the Chernoff estimate of L is

L ≈ exp(−s*) / (w* σ(w*) √(2π)).

Recall from Problem 106 that the steady-state probability vector is p = [0.0668 0.2647 0.4118 0.2567] and η that solves e(R + Q/η) = c is η = 0.5994 per kB. To express L, we numerically obtain w* = 0.0649, σ²(w*) = 20.3605, and s* = 0.0423, thereby obtaining L = 0.5208. Thus, from the CDE approximation, we have

P{X > x} ≈ 0.5208 e^{−0.5994x}.
Problem 110
Consider two sources that input traffic into an infinite-sized buffer with output capacity c = 10. The first source is identical to that in Problem 102 and the second source is identical to that in Problem 103. In other words, from source-1, fluid flows into the buffer according to a three-state SMP {Z1(t), t ≥ 0} with state space {1, 2, 3}. The elements of the kernel of this SMP are G12(t) = 1 − e^{−t} − te^{−t}, G21(t) = 0.4(1 − e^{−0.5t}) + 0.3(1 − e^{−0.2t}), G23(t) = 0.2(1 − e^{−0.5t}) + 0.1(1 − e^{−0.2t}), G32(t) = 1 − 2e^{−t} + e^{−2t}, and G11(t) = G13(t) = G22(t) = G31(t) = G33(t) = 0. Also, the flow rates in the three states are r(i) = i for i = 1, 2, 3. Source-2 is an on-off source {Z2(t), t ≥ 0} with state space {u, d} that generates fluid so that the on-times are IID random variables with CDF U(t) = 1 − 0.6e^{−3t} − 0.4e^{−2t} and the off-times are IID Erlang random variables with mean 0.5 and variance 1/12 in appropriate time units compatible with the on-times. When the source is on, fluid is generated at the rate of 16 units per second and no fluid is generated when the source is off.
Solution
Solving eb1(η) + eb2(η) = c for η, we get

η = 0.0965,

with eb1(η) = 1.8535 and eb2(η) = 8.1465. To compute L, consider the functions m_k(w) (for k = 1, 2). We have

m_1(w) = log( p_1^1 e^{w r(1)} + p_2^1 e^{w r(2)} + p_3^1 e^{w r(3)} )

and

m_2(w) = log( p_u^2 e^{w r_u} + p_d^2 e^{w r_d} ),

where r(i) = i for i = 1, 2, 3, [p_1^1 p_2^1 p_3^1] = [0.2772 0.6337 0.0891] as described in Problem 102, r_u = 16, r_d = 0, and [p_u^2 p_d^2] = [4/9 5/9] based on Problem 103.
We can compute w* as the solution to

m'_1(w) + m'_2(w) = c

using a binary search. Thus, we have w* = 0.0168 for the numerical values stated in the problem. Also,

s* = cw* − m_1(w*) − m_2(w*),

and

σ²(w*) = [ (Σ_{i=1}^{3} p_i^1 e^{w* r(i)})(Σ_{i=1}^{3} p_i^1 r(i)² e^{w* r(i)}) − (Σ_{i=1}^{3} p_i^1 r(i) e^{w* r(i)})² ] / (Σ_{i=1}^{3} p_i^1 e^{w* r(i)})²
        + [ (p_u^2 e^{w* r_u} + p_d^2 e^{w* r_d})(p_u^2 r_u² e^{w* r_u} + p_d^2 r_d² e^{w* r_d}) − (p_u^2 r_u e^{w* r_u} + p_d^2 r_d e^{w* r_d})² ] / (p_u^2 e^{w* r_u} + p_d^2 e^{w* r_d})²
        = 64.2977.

Hence, the Chernoff estimate is

L ≈ exp(−s*) / (w* σ(w*) √(2π)) = 1.1708.
Notice that the procedure for obtaining L does not depend on the stochas-
tic process governing the sources once we know the steady-state distribution.
This makes it convenient since a single approach can be used for any discrete
stochastic process. However, as described earlier, the method is an approxi-
mation that is usually suitable for the tail distribution. Next we will consider
an approach for bounds on the entire distribution, not just the tail.
Consider an infinite capacity buffer into which fluid is generated from K sources according to environment processes {Z_k(t), t ≥ 0} for k = 1, 2, …, K that are independent SMPs. When the SMP {Z_k(t), t ≥ 0} for some k ∈ {1, 2, …, K} is in state i, fluid is generated into the buffer at rate r_k(i). The SMP {Z_k(t), t ≥ 0} for all k ∈ {1, 2, …, K} has a state space S_k = {1, 2, …, ℓ_k} and kernel G_k(x) = [G_ij^k(x)]. Using that we can calculate the expected time spent in state i, which we call τ_i^k. Also assume that we can compute p^k, the stationary vector of the kth SMP {Z_k(t), t ≥ 0}. Further, using the SMP source characteristics, say we can compute the effective bandwidth of the kth source, eb_k(v). Then, as always, we let η be the solution to

Σ_{k=1}^{K} eb_k(η) = c.
Now we describe how to compute bounds for the steady-state buffer content distribution. The bounds are of the form

C_* e^{−ηx} ≤ lim_{t→∞} P{X(t) > x} ≤ C^* e^{−ηx},

where X(t) is the amount of fluid in the buffer at time t, and C^* and C_* are constants that we describe how to compute next. Denote by Φ^k(η) the matrix with (i, j) entry φ_ij^k(η) = G̃_ij^k(−η(r_k(i) − eb_k(η))). Let h^k be the left eigenvector of Φ^k(η) corresponding to the eigenvalue 1, that is,

h^k = h^k Φ^k(η). (10.6)
Also define

H^k = Σ_{i∈S_k} ( h_i^k / (η(r_k(i) − eb_k(η))) ) ( Σ_{j∈S_k} φ_ij^k(η) − 1 ), (10.7)

ψ_min^k(i, j) = inf_x { h_i^k e^{−η(r_k(i)−eb_k(η))x} ∫_x^∞ e^{η(r_k(i)−eb_k(η))y} dG_ij^k(y) / [ (p_i^k/τ_i^k) ∫_x^∞ dG_ij^k(y) ] }, (10.8)

ψ_max^k(i, j) = sup_x { h_i^k e^{−η(r_k(i)−eb_k(η))x} ∫_x^∞ e^{η(r_k(i)−eb_k(η))y} dG_ij^k(y) / [ (p_i^k/τ_i^k) ∫_x^∞ dG_ij^k(y) ] }. (10.9)

Then

C^* = Π_{k=1}^{K} H^k / min_A Π_{k=1}^{K} ψ_min^k(i_k, j_k),    C_* = Π_{k=1}^{K} H^k / max_A Π_{k=1}^{K} ψ_max^k(i_k, j_k),

where

A = { ((i_1, j_1), (i_2, j_2), …, (i_K, j_K)) : i_k, j_k ∈ S_k, Σ_{k=1}^{K} r_k(i_k) > c and ∀k, P^k(i_k, j_k) > 0 }.

In these expressions, the only unknown terms are ψ_max^k and ψ_min^k. Next, we describe how to compute them for some special cases. For that, we drop k with the understanding that all the expressions pertain to k.
First consider a nonnegative random variable Y with CDF G_ij(x)/G_ij(∞) and density

g_ij(x) = (dG_ij(x)/dx) (1/G_ij(∞)).

Its hazard rate function is

λ_ij(x) = g_ij(x) / (1 − G_ij(x)/G_ij(∞)).

The random variable Y is said to have an increasing failure rate (IFR) if λ_ij(x) is increasing in x, and a decreasing failure rate (DFR) if λ_ij(x) is decreasing in x.
It is possible to obtain closed-form algebraic expressions for ψ_max(i, j) and ψ_min(i, j) if the random variable Y with distribution G_ij(x)/G_ij(∞) is an IFR or DFR random variable. The following result describes how to compute ψ_max(i, j) and ψ_min(i, j) in those cases. Let x^* and x_* be such that

x^* = arg sup_x { h_i ∫_x^∞ e^{η(r_i−c)y} dG_ij(y) / [ (p_i/τ_i) e^{η(r_i−c)x} ∫_x^∞ dG_ij(y) ] }

and

x_* = arg inf_x { h_i ∫_x^∞ e^{η(r_i−c)y} dG_ij(y) / [ (p_i/τ_i) e^{η(r_i−c)x} ∫_x^∞ dG_ij(y) ] }.
Then ψ_max(i, j) and ψ_min(i, j) occur at the x values given by Table 10.1, with the understanding that λ_ij(∞) = lim_{x→∞} λ_ij(x).

TABLE 10.1
Computing ψ_max(i, j) and ψ_min(i, j)

IFR, r_i > c:  x^* = 0,  ψ_max(i, j) = φ̃_ij(−η(r_i − c)) τ_i h_i / (p_ij p_i);  x_* = ∞,  ψ_min(i, j) = τ_i h_i λ_ij(∞) / [p_i (λ_ij(∞) − η(r_i − c))].
IFR, r_i ≤ c:  x^* = ∞,  ψ_max(i, j) = τ_i h_i λ_ij(∞) / [p_i (λ_ij(∞) − η(r_i − c))];  x_* = 0,  ψ_min(i, j) = φ̃_ij(−η(r_i − c)) τ_i h_i / (p_ij p_i).
DFR, r_i > c:  x^* = ∞,  ψ_max(i, j) = τ_i h_i λ_ij(∞) / [p_i (λ_ij(∞) − η(r_i − c))];  x_* = 0,  ψ_min(i, j) = φ̃_ij(−η(r_i − c)) τ_i h_i / (p_ij p_i).
DFR, r_i ≤ c:  x^* = 0,  ψ_max(i, j) = φ̃_ij(−η(r_i − c)) τ_i h_i / (p_ij p_i);  x_* = ∞,  ψ_min(i, j) = τ_i h_i λ_ij(∞) / [p_i (λ_ij(∞) − η(r_i − c))].
Problem 111
Consider a source modulated by an ℓ-state irreducible CTMC {Z(t), t ≥ 0} with infinitesimal generator

Q = [q_ij]

and stationary probability vector p satisfying

pQ = 0 and Σ_{l=1}^{ℓ} p_l = 1.

When the CTMC is in state i, the source generates fluid at rate r_i. Let

R = diag[r_i].

This source inputs traffic into an infinite capacity buffer with output channel capacity c. Develop upper and lower bounds for P(X > x). Then, for the numerical example of Problem 106, compare the bounds against exact results.
Solution
We obtain η by solving

eb(η) = c,

where eb(·) is the effective bandwidth of the CTMC source, given by

eb(v) = e(R + Q/v).

Viewed as an SMP, the CTMC has kernel G_ij(t) = (q_ij/q_i)(1 − e^{−q_i t}), where q_i = −q_ii = Σ_{j≠i} q_ij. The expected amount of time the CTMC spends in state i is

τ_i = 1/q_i.

For a given v, the LSTs G̃_ij(·) yield

φ_ij(η) = G̃_ij(−η(r_i − c)) = q_ij / (q_i − η(r_i − c)),

since eb(η) = c and the remaining terms can be obtained from G̃_ij(cη − r_i η). Equation 10.6 without the superscript k reduces to

h_j = Σ_{i≠j} h_i φ_ij(η) = Σ_{i≠j} h_i q_ij / (q_i − η(r_i − c)). (10.10)

Also,

H = Σ_{i=1}^{ℓ} ( h_i / (η(r_i − c)) ) ( q_i / (q_i − η(r_i − c)) − 1 ) = Σ_{i=1}^{ℓ} h_i / (q_i − η(r_i − c)). (10.11)
From Equation 10.9 without the superscript k, we have for i ≠ j and q_ij > 0

ψ_max(i, j) = sup_x { h_i e^{−η(r_i−c)x} ∫_x^∞ e^{η(r_i−c)y} dG_ij(y) / [ (p_i/τ_i) ∫_x^∞ dG_ij(y) ] }
            = sup_x { h_i e^{−η(r_i−c)x} ∫_x^∞ e^{η(r_i−c)y} e^{−q_i y} dy / [ p_i q_i ∫_x^∞ e^{−q_i y} dy ] }
            = sup_x { (1/p_i) h_i / (q_i − η(r_i − c)) }
            = (1/p_i) h_i / (q_i − η(r_i − c)),

and similarly

ψ_min(i, j) = inf_x { (1/p_i) h_i / (q_i − η(r_i − c)) } = (1/p_i) h_i / (q_i − η(r_i − c)),

since the expression inside the sup and inf does not depend on x. Now define the vector g by

g = [g_i] = [ h_i / (q_i − η(r_i − c)) ].

We claim that

g (R + Q/η) = c g.

To see this, note that

( g (R + Q/η) )_j = (r_j − q_j/η) g_j + (1/η) Σ_{i≠j} g_i q_ij
                 = (r_j − q_j/η) h_j / (q_j − η(r_j − c)) + (1/η) Σ_{i≠j} h_i q_ij / (q_i − η(r_i − c))
                 = (r_j − q_j/η) h_j / (q_j − η(r_j − c)) + h_j/η
                 = c h_j / (q_j − η(r_j − c))
                 = c g_j,

where the third equality uses Equation 10.10.
At any rate, we can write down bounds for the steady-state buffer content distribution as

C_* e^{−ηx} ≤ P{X > x} ≤ C^* e^{−ηx},

where

C^* = H / min_{i: r_i > c, j: p_ij > 0} {ψ_min(i, j)}
    = Σ_{i=1}^{ℓ} h_i/(q_i − η(r_i − c)) / min_{i: r_i > c} { (1/p_i) h_i/(q_i − η(r_i − c)) }
    = Σ_{i=1}^{ℓ} g_i / min_{i: r_i > c} (g_i/p_i)

and

C_* = H / max_{i: r_i > c, j: p_ij > 0} {ψ_max(i, j)}
    = Σ_{i=1}^{ℓ} h_i/(q_i − η(r_i − c)) / max_{i: r_i > c} { (1/p_i) h_i/(q_i − η(r_i − c)) }
    = Σ_{i=1}^{ℓ} g_i / max_{i: r_i > c} (g_i/p_i).
Recall from Problem 106 that the steady-state probability vector is p = [0.0668 0.2647 0.4118 0.2567] and η that solves e(R + Q/η) = c is η = 0.5994 per kB. To obtain C^* and C_*, we solve for g as the left eigenvector of (R + Q/η) that corresponds to the eigenvalue c, that is, g satisfies g(R + Q/η) = cg. Using that we get g = [0.1746 0.7328 0.5533 0.3555] (although g is not unique, it appears in both the numerator and denominator of C^* and C_*, and hence the scaling is a nonissue). Next, notice that only in states i = 1 and i = 2 do we have r_i > c. Thus, we have

C^* = (g_1 + g_2 + g_3 + g_4) / min(g_1/p_1, g_2/p_2) = 0.6953

and

C_* = (g_1 + g_2 + g_3 + g_4) / max(g_1/p_1, g_2/p_2) = 0.6561.
TABLE 10.2
Comparing the Exact Results against Approximations and Bounds

Method                          Result
Exact computation               lim_{t→∞} P{X(t) > x} = 0.6757e^{−0.5994x} − 0.0079e^{−1.3733x}
Effective bandwidth approx.     lim_{t→∞} P{X(t) > x} ≈ e^{−0.5994x}
CDE approx.                     lim_{t→∞} P{X(t) > x} ≈ 0.5208e^{−0.5994x}
Bounds                          0.6561e^{−0.5994x} ≤ lim_{t→∞} P{X(t) > x} ≤ 0.6953e^{−0.5994x}
Problem 112
Consider a source modulated by a two-state (on and off) process that alter-
nates between the on and off states. The random amount of time the process
spends in the on state (called on-times) has CDF U(·) with mean τU and the
corresponding off-time CDF is D(·) with mean τD . The successive on and off-
times are independent and on-times are independent of off-times. Fluid is
generated continuously at rate r during the on state and at rate 0 during
the off state. The source inputs traffic into an infinite-capacity buffer. The
output capacity of the buffer is a constant c. State the stability condition,
and assuming it is true obtain bounds for the steady-state buffer content
distribution.
Solution
The stability condition is

rτ_U/(τ_U + τ_D) < c.

We assume that the preceding condition is true and c < r (otherwise the buffer would be empty in steady state). Following the notation described for the SMP bounds, we obtain the following. Define

Φ(v) = [ 0                 D̃(vc)
         Ũ(−v(r − c))      0      ],

where Ũ(·) and D̃(·) are the LSTs of U(t) and D(t), respectively. We assume that e(Φ(η)) = 1 has a solution, which implies that

e(Φ(η)) = √( Ũ(−η(r − c)) D̃(ηc) ) = 1,

with corresponding left eigenvector

h = [1  D̃(ηc)].

Further,

H = (1 − D̃(ηc)) r / (c(r − c)η).
The matrices ψ_min and ψ_max have zero diagonal entries, with

ψ_min(1, 2) = inf_x { (τ_U + τ_D) ∫_x^∞ e^{−ηc(y−x)} dD(y) / (1 − D(x)) },
ψ_min(2, 1) = inf_x { D̃(ηc) (τ_U + τ_D) ∫_x^∞ e^{η(r−c)(y−x)} dU(y) / (1 − U(x)) },

and

ψ_max(1, 2) = sup_x { (τ_U + τ_D) ∫_x^∞ e^{−ηc(y−x)} dD(y) / (1 − D(x)) },
ψ_max(2, 1) = sup_x { D̃(ηc) (τ_U + τ_D) ∫_x^∞ e^{η(r−c)(y−x)} dU(y) / (1 − U(x)) }.
Thus, we can derive bounds for the steady-state buffer content distribution X as

C_* e^{−ηx} ≤ P{X > x} ≤ C^* e^{−ηx},

where

C^* = (Ũ(−η(r − c)) − 1) r / [ (τ_U + τ_D) c(r − c)η inf_x { ∫_x^∞ e^{η(r−c)(y−x)} dU(y) / (1 − U(x)) } ] (10.13)

and

C_* = (Ũ(−η(r − c)) − 1) r / [ (τ_U + τ_D) c(r − c)η sup_x { ∫_x^∞ e^{η(r−c)(y−x)} dU(y) / (1 − U(x)) } ]. (10.14)
Next we consider a special case of the earlier problem (namely, the Erlang
on-off source) to explore and explain the general on-off source.
Problem 113
The Erlang on-off source is one with Erlang(NU , α) on-time distribution,
Erlang(ND , β) off-time distribution, and fluid is generated at rate r when the
source is on. Assuming stability, obtain bounds for the steady-state buffer
content distribution. For a numerical example with r = 15, c = 10, τU = 1/70,
and τD = 1/30, illustrate the bounds.
Solution
Note that τ_U = N_U/α and τ_D = N_D/β. Assume that the condition of stability is satisfied and r > c. Thus, we have

Φ(v) = [ 0                  D̃(vc) ]  =  [ 0                            (β/(β + vc))^{N_D} ]
       [ Ũ(−v(r − c))       0     ]     [ (α/(α − v(r − c)))^{N_U}     0                  ]

and

h = [ 1   (β/(β + ηc))^{N_D} ].
Using the fact that the Erlang random variable has an increasing hazard rate function, we see that

ψ_min(1, 2) = (N_U/α + N_D/β)(β/(β + ηc))^{N_D},
ψ_min(2, 1) = (β/(β + ηc))^{N_D} (N_U/α + N_D/β) (α/(α − η(r − c))),

and

ψ_max(1, 2) = (N_U/α + N_D/β)(β/(β + ηc)),
ψ_max(2, 1) = (β/(β + ηc))^{N_D} (N_U/α + N_D/β) (α/(α − η(r − c)))^{N_U},

with all diagonal entries zero.
Thus, the bounds are

C_* e^{−ηx} ≤ P{X > x} ≤ C^* e^{−ηx},

where

C^* = ( (α/(α − η(r − c)))^{N_U} − 1 ) r / [ (τ_U + τ_D) c(r − c)η (α/(α − η(r − c))) ] (10.16)

and

C_* = ( (α/(α − η(r − c)))^{N_U} − 1 ) r / [ (τ_U + τ_D) c(r − c)η (α/(α − η(r − c)))^{N_U} ]. (10.17)
Next consider the numerical example of an Erlang on-off source with on-
time distribution Erlang(NU , α) and off-time distribution Erlang(ND , β) with
r = 15, c = 10, τU = 1/70, and τD = 1/30. We keep the means constant (i.e.,
τU and τD are held constant) but decrease the variances by increasing NU
and ND . In Figure 10.11, we illustrate for four pairs of (NU , ND ) (namely,
(1, 1), (4, 3), (9, 8), and (16, 14)) the logarithm of the upper and lower bounds
on the limiting distribution of the buffer-content process. From the figure
we notice that as the variance decreases, the bounds move further apart. Also note that C^* increases and C_* decreases with decreasing variance. Since η increases as the variance decreases, the tail of the limiting distribution approaches zero more rapidly.
FIGURE 10.11
Logarithm, log10(P(X > x)), of the upper and lower bounds as a function of x for (N_U, N_D) = (1, 1), (4, 3), (9, 8), and (16, 14). (From Gautam, N. et al., Prob. Eng. Inform. Sci., 13, 429, 1999. With permission.)
Remark 24
For the exponential on-off source, which is a special case of the Erlang on-off source, we get C^* = C_* = rβ/[c(α + β)]. Hence, the upper and lower bounds are equal, resulting in

P{X > x} = (rβ/(c(α + β))) e^{−ηx},

where

η = (cα + cβ − βr)/(c(r − c)).
Problem 114
There are K independent fluid sources that input traffic into an infinite
capacity buffer. Each source k is modulated by a CTMC {Zk (t), t ≥ 0} with
infinitesimal generator Qk on state space {1, 2, …, ℓ_k}. Also,

p^k Q_k = 0 and Σ_{i=1}^{ℓ_k} p_i^k = 1.
Fluid is generated at rate rk (Zk (t)) by source k at time t. Let Rk be the cor-
responding rate matrix. Fluid is removed from the buffer by a channel with
constant capacity c. Let X(t) be the amount of fluid in the buffer at time t.
Obtain bounds for P{X(t) > x} as t → ∞ assuming that the buffer is stable.
Solution
For k = 1, 2, …, K, let the effective bandwidth of source k be

eb_k(v) = e(R_k + Q_k/v),

and let η be the solution to

Σ_{k=1}^{K} eb_k(η) = c.

The stability condition is

Σ_{k=1}^{K} Σ_{l=1}^{ℓ_k} r_k(l) p_l^k < c.
Then the bounds are

C_* e^{−ηx} ≤ lim_{t→∞} P{X(t) > x} ≤ C^* e^{−ηx},

where

C^* = Π_{k=1}^{K} ( Σ_{l=1}^{ℓ_k} h_l^k ) / min_{(i_1,…,i_K): Σ_k r_k(i_k) > c} Π_{k=1}^{K} (h_{i_k}^k / p_{i_k}^k)

and

C_* = Π_{k=1}^{K} ( Σ_{l=1}^{ℓ_k} h_l^k ) / max_{(i_1,…,i_K): Σ_k r_k(i_k) > c} Π_{k=1}^{K} (h_{i_k}^k / p_{i_k}^k).
Problem 115
An exponential on-off source with on-time parameter α, off-time parameter
β, and rate r (fluid generation rate when on) generates traffic into an infinite-
capacity buffer with output capacity c1 . The output from the buffer acts as an
input to another infinite-capacity buffer whose output capacity is c2 . Assume
for stability and nontriviality that

rβ/(α + β) < c2 < c1 < r.
FIGURE 10.12
Exponential on-off input to buffers in tandem. (From Gautam, N. et al., Prob. Eng. Inform. Sci., 13, 429, 1999. With permission.)
The effective bandwidth of the source is

eb(v) = ( rv − α − β + √((rv − α − β)² + 4βrv) ) / (2v). (10.18)

For the first buffer, from Remark 24,

P{X1 > x} = (rβ/(c1(α + β))) e^{−η1 x}, (10.19)

where

η1 = (c1α + c1β − βr)/(c1(r − c1)).
The output of buffer-1 is an on-off process whose off-times are exponentially distributed with CDF D(t) = 1 − e^{−βt}.
Recall Problem 94 where we derived the LST of the busy period distribution U(·) as

Ũ(w) = (w + β + c1 s0(w))/β    if w ≥ w*,
Ũ(w) = ∞                        otherwise,

where w* = (2√(c1αβ(r − c1)) − rβ − c1α − c1β)/r, s0(w) = (−b − √(b² + 4w(w + α + β)c1(r − c1)))/(2c1(r − c1)), and b = (r − 2c1)w + (r − c1)β − c1α. The LST of the distribution D(·) is

D̃(w) = β/(β + w)    if w > −β,
D̃(w) = ∞             otherwise.
For this general on-off “pseudo” source that inputs traffic into the second buffer, we can compute its effective bandwidth, eb2(v), as

eb2(v) = eb1(v)                         if 0 ≤ v ≤ v*,
eb2(v) = (eb1(v*) − c1)(v*/v) + c1      if v > v*,

where

v* = (β/r)(√(c1α/(β(r − c1))) − 1) + (α/r)(1 − √(β(r − c1)/(c1α))) (10.20)

and eb1(v) is from Equation 10.18. Note that η2 is obtained by solving

eb2(η2) = c2.
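With illustrative (assumed) parameters satisfying rβ/(α + β) < c2 < c1 < r, the sketch below computes η2 by bisection on eb2(η2) = c2. For these values η2 falls in the region η2 ≤ v*, so it also solves eb1(η2) = c2, which gives a closed-form cross-check.

```python
import math

def eb1(v, alpha, beta, r):
    """Effective bandwidth of the exponential on-off input (Equation 10.18)."""
    a = r * v - alpha - beta
    return (a + math.sqrt(a * a + 4 * beta * r * v)) / (2 * v)

def v_star(alpha, beta, r, c1):
    return (beta / r) * (math.sqrt(c1 * alpha / (beta * (r - c1))) - 1) \
         + (alpha / r) * (1 - math.sqrt(beta * (r - c1) / (c1 * alpha)))

def eb2(v, alpha, beta, r, c1):
    """Effective bandwidth of buffer-1's output (piecewise, using v*)."""
    vs = v_star(alpha, beta, r, c1)
    if v <= vs:
        return eb1(v, alpha, beta, r)
    return (eb1(vs, alpha, beta, r) - c1) * vs / v + c1

# assumed numeric values with r*beta/(alpha + beta) < c2 < c1 < r
alpha, beta, r, c1, c2 = 1.0, 0.5, 15.0, 12.0, 8.0

# eb2 is increasing in v, so bisection on eb2(v) = c2 finds eta2
lo, hi = 1e-9, 5.0
for _ in range(200):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if eb2(mid, alpha, beta, r, c1) < c2 else (lo, mid)
eta2 = (lo + hi) / 2
print(eta2)   # for these values, solving eb1(eta2) = c2 in closed form gives 18/224
```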
If η2 ≤ v*, we have

Φ(η2) = [ 0                     D̃(η2c2)
          Ũ(−η2(c1 − c2))       0        ],

and we can solve

[1  h2] Φ(η2) = [1  h2]

to get h2 = D̃(η2c2). If η2 > v*, we use the same h2, since D̃(·) only gradually goes to infinity and the condition is mainly because of Ũ(·). The situation is similar to the one in Figure 10.5(b). Hence, it would not cause any concerns. With this we proceed to obtain the bounds for the distribution of X2.
In particular, C_{2*} e^{−η2 x} ≤ P{X2 > x} ≤ C2^* e^{−η2 x}, where C2^* and C_{2*} follow from Equations 10.13 and 10.14 with U(·) taken as the busy period distribution, D(·) as above, r replaced by c1, and c replaced by c2.
Problem 116
Consider the tandem buffers model in Figure 10.13. Input to the first buffer is
from N independent and identical exponential on-off sources with on-time
parameter α, off-time parameter β, and rate r. The output from buffer-1 is
directly fed into buffer-2. The output capacities of buffer-1 and buffer-2 are
c1 and c2 , respectively. Assuming stability, obtain bounds on the limiting
distributions of the contents of the two buffers.
Solution
We first obtain bounds on the contents of buffer-1 and then of buffer-2.
FIGURE 10.13
Tandem buffers model with multiple sources: N exponential on-off sources feed buffer-1 (content X1(t), capacity c1), whose output feeds buffer-2 (content X2(t), capacity c2). (From Gautam, N. et al., Prob. Eng. Inform. Sci., 13, 429, 1999. With permission.)
Buffer-1: Let Z1(t) be the number of sources that are in the on state at time t. Clearly, {Z1(t), t ≥ 0} is an SMP (more specifically, a CTMC). Assume

Nrβ/(α + β) < c1 < Nr

for stability (ensured by the first inequality) and nontriviality (the second inequality ensures that buffer-1 is not always empty). We can show that Φ(δ) is given by

φ_ij(δ) = iα/(iα + (N − i)β − (ir − c1)δ)          if j = i − 1,
φ_ij(δ) = (N − i)β/(iα + (N − i)β − (ir − c1)δ)    if j = i + 1,
φ_ij(δ) = 0                                         otherwise,

and e(Φ(δ)) = 1 always has solutions. Using the expression for eb(v) in Equation 10.18 and solving for η1 in N eb(η1) = c1, we get

η1 = N(c1α + c1β − Nβr)/(c1(Nr − c1)).
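The η1 formula admits a quick numerical sanity check (a sketch with assumed values of N, α, β, r, c1, not from the text): by construction, N eb(η1) should equal c1, with eb(·) from Equation 10.18.

```python
import math

def eb(v, alpha, beta, r):
    """Effective bandwidth of one exponential on-off source (Equation 10.18)."""
    a = r * v - alpha - beta
    return (a + math.sqrt(a * a + 4 * beta * r * v)) / (2 * v)

# assumed values satisfying N r beta/(alpha + beta) < c1 < N r
N, alpha, beta, r, c1 = 10, 1.0, 0.5, 2.0, 12.0

eta1 = N * (c1 * alpha + c1 * beta - N * beta * r) / (c1 * (N * r - c1))
print(eta1, N * eb(eta1, alpha, beta, r))   # second value should equal c1
```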
Let h = [h_i] satisfy

h = h Φ(η1).

Then the bounds on the contents of buffer-1 are

C_{1*} e^{−η1 x} ≤ lim_{t→∞} P{X1(t) > x} ≤ C1^* e^{−η1 x},

where

C1^* = Σ_{i=0}^{N} (h_i/(η1(ir − c1))) ( Σ_{j=0}^{N} φ_ij(η1) − 1 ) / min_{i: ir > c1} { (h_i/p_i) (1/(iα + (N − i)β − η1(ir − c1))) },

C_{1*} = Σ_{i=0}^{N} (h_i/(η1(ir − c1))) ( Σ_{j=0}^{N} φ_ij(η1) − 1 ) / max_{i: ir > c1} { (h_i/p_i) (1/(iα + (N − i)β − η1(ir − c1))) },

and

p_i = a_i τ_i / Σ_{m=0}^{N} a_m τ_m = ( N!/(i!(N − i)!) ) α^{N−i} β^i / (α + β)^N.
Buffer-2: The input to buffer-2 is modulated by a process {Z2(t), t ≥ 0} constructed from {Z1(t), t ≥ 0}, where Z1(t) is the number of sources on at time t. Let R1(t) be the output rate from the first buffer at time t. We assume that

Nrβ/(α + β) < c2 < c1.

We can see that the {Z2(t), t ≥ 0} process is an SMP on state space {0, 1, …, ℓ} with kernel

G(t) = [G_ij(t)],

and, analogous to Equation 10.20,

v* = (β/r)(√(c1α/(β(Nr − c1))) − 1) + (α/r)(1 − √(β(Nr − c1)/(c1α))).
Hence, solving

eb2(η2) = c2,

we get

η2 = min { N(c2α + c2β − Nβr)/(c2(Nr − c2)),  (h(v*) − c1v*)/(c2 − c1) },

where

h(v*) = N ( rv* − α − β + √((rv* − α − β)² + 4βrv*) ) / 2. (10.24)
If η2 ≤ v*, then

φ_ij(η2) = G̃_ij(−η2(ir − c2))      if 0 ≤ i ≤ ℓ − 1,
φ_ij(η2) = G̃_ij(−η2(c1 − c2))      if i = ℓ,

and h satisfies h = h Φ(η2). It can be shown that the random variables associated with the distributions G_{ℓj}(x)/G_{ℓj}(∞) have a decreasing failure rate. Hence, from Table 10.1, ψ_min(ℓ, j) and ψ_max(ℓ, j) occur at x = 0 and x = ∞, respectively. Thus, we can find bounds for the steady-state distribution of the buffer-content process {X2(t), t ≥ 0} as follows. On the basis of Equations 10.7 through 10.9, removing k since K = 1, we can write

H = Σ_{i=0}^{ℓ} ( h_i/(η2(min(ir, c1) − c2)) ) ( Σ_j φ_ij(η2) − 1 ),

and obtain ψ_min(i, j) as well as ψ_max(i, j) using Table 10.1.
Then the limiting distribution of the buffer content process satisfies

C_{2*} e^{−η2 x} ≤ lim_{t→∞} P{X2(t) > x} ≤ C2^* e^{−η2 x},

where

C2^* = H / min_{i,j: ir > c2, j = i±1} ψ_min(i, j),    C_{2*} = H / max_{i,j: ir > c2, j = i±1} ψ_max(i, j).
In Figure 10.14, we illustrate the upper and lower bounds on the limiting distribution of the buffer-content process {X2(t), t ≥ 0}.

FIGURE 10.14
The upper and lower bounds on log10(P(X2 > x)) as a function of x. (From Gautam, N. et al., Prob. Eng. Inform. Sci., 13, 429, 1999. With permission.)
FIGURE 10.15
Multiclass fluid system: for each class j = 1, …, N, K_j sources feed buffer B_j (content X_j(t)), and all N buffers are emptied by a single channel of capacity c. (From Kulkarni, V.G. and Gautam, N., Queueing Syst. Theory Appl., 27, 79, 1997. With permission.)
at rate rij (Zij (t)) into buffer j. All the classes of fluids are emptied by a sin-
gle channel of constant capacity c. At this time we do not specify the service
scheduling policy for emptying the N buffers.
For example, we will consider policies such as: the timed round-robin (polling) policy, where the scheduler serves the N buffers in a round-robin fashion; the static priority service policy, where there is a priority order for the classes and a class is served only when all higher-priority buffers are empty; the generalized processor sharing (GPS) policy, where a fraction of the channel capacity c is offered to all buffers simultaneously; and threshold policies, where both the buffer to serve and the fractions of capacity to be assigned depend on the amount of fluid in the buffers (using threshold values or switching curves). Notice that the buffers do not necessarily have a constant
output capacity. However, all the results we have seen thus far have had
constant output capacity. We will subsequently use a fictitious compensating
source to address this.
However, we first describe the main objective, which is to analyze the
buffer content levels in steady state. For that, let Xj (t) be the amount of fluid
in buffer j at time t. Assume that all N buffers are of infinite capacity. Assume
that we can use the source characteristics to obtain the effective bandwidth
of source i of class j as ebij (v) for i = 1, 2, . . . , Kj and j = 1, . . . , N. We use that
for the performance analysis. The quality-of-service (QoS) criterion is mainly
based on the tail distribution of the buffer contents. In particular, for a given set of buffer levels B1, …, BN, the probability of exceeding those levels must be less than ε1, …, εN, respectively. That is, for j = 1, …, N,

P{Xj > Bj} < εj.
Note that the preceding QoS can indirectly be used for bounds on delay
as well.
The analysis in this section can be used not only in obtaining the tail probabilities but also for admission control. For that we assume that all sources of a particular class are stochastically identical. We assume that sources arrive at buffers, spend a random sojourn time generating traffic according to the respective environment process, and then depart from the buffers. We assume that the number of sources varies slowly compared to the sources changing states as well as the buffer contents. In particular, we assume that steady state is attained well before the number of sources of each class changes. For such a system, our objective is to determine the feasible region K given by

K = { (K1, …, KN) : P{Xj > Bj} < εj for all j = 1, …, N }.

When a new (Kj + 1)st source arrives, we can admit the new source if admitting it would result in the QoS criterion being satisfied for all sources. Otherwise, the source is rejected. This can easily be accomplished by maintaining a precomputed look-up table of the feasible region K.
For the rest of this section, we will consider performance analysis for var-
ious service scheduling policies (such as timed round robin, static priority,
generalized processor sharing, and threshold based). We will obtain admis-
sible regions wherever appropriate and solve the admission control problem.
However, prior to this we first describe how to analyze a buffer whose out-
put capacity is varying as this is a recurring theme for all policies. In fact,
the goal of the next section is to describe a unified approach to address
time-varying output capacities so that they could be used subsequently in
performance analysis.
The buffer content process {X̂(t), t ≥ 0} satisfies

dX̂(t)/dt = r(Z(t)) − c(Y(t))        if X̂(t) > 0,
dX̂(t)/dt = {r(Z(t)) − c(Y(t))}^+    if X̂(t) = 0.

FIGURE 10.16
Original system (left) and equivalent fictitious system (right): a buffer with input rate r(Z(t)) and time-varying output capacity c(Y(t)) is equivalent to a buffer with inputs r(Z(t)) and c − c(Y(t)) and constant output capacity c.
Assume that tso does not change with time. The cycle time T is defined as the amount of time the scheduler takes to complete a cycle, and is given by

T = tso + Σ_{j=1}^{N} τj.

The stability condition for buffer j is

Σ_{i=1}^{Kj} E[rij(Zij(∞))] < c τj/T.
FIGURE 10.17
Transforming buffer j to a constant output capacity one using a compensating source. (From Gautam, N. and Kulkarni, V.G., Queueing Syst. Theory Appl., 36, 351, 2000. With permission.)
The effective bandwidth of the compensating source for buffer j is

ebsj(v) = c(T − τj)/T,

and ηj is the solution to

Σ_{i=1}^{Kj} ebij(ηj) + c(T − τj)/T = c.

Thus, based on the effective bandwidth approximation, the QoS criteria for all the classes of traffic are satisfied if for all j = 1, 2, …, N,

e^{−Bj ηj} < εj.

Similarly, it is also possible to check if the QoS criterion is satisfied using the upper SMP bound as Cj^* e^{−Bj ηj} < εj, since P(Xj > Bj) ≤ Cj^* e^{−Bj ηj}.
Although the preceding results assume we know ebij (v) and also how to com-
pute C∗j , next we present a specific example to clarify these aspects. Also, for
the sake of obtaining closed-form algebraic expressions, we consider a rather
simplistic set of sources.
Problem 117
Consider a multiclass fluid queueing system with N buffers, one for each class. For all j = 1, …, N, the input to buffer j is from Kj independent and identical alternating on-off sources that stay on for an exponential amount of time with parameter αj and off for an exponential amount of time with parameter βj. When a source is on, it generates traffic continuously at rate rj into buffer j, and when it is off, it does not generate any traffic. The scheduler serves buffer j for a deterministic time τj at a maximum rate c and stops serving the buffer for a deterministic time T − τj. Using the effective bandwidth approximation and bounds, obtain expressions for P(Xj > Bj). Then, for the following numerical values: αj = 3, βj = 0.2, rj = 3.4, Bj = 30, τj/T = 3/13, Kj = 10, and c = 15.3, where T varies from 0.01 to 0.40 while τj/T is fixed, graph the two approximate expressions for the fraction of fluid lost, assuming that the size of the buffer is Bj.
Solution
The effective bandwidth of all the Kj sources combined is

Kj ebj(v) = Kj ( rjv − αj − βj + √((rjv − αj − βj)² + 4βjrjv) ) / (2v).

Solving for ηj in

Kj ebj(ηj) = cτj/T,

we get

ηj = ( cτj(αj + βj) − rjKjβjT ) / [ (cτj/(KjT)) (rjTKj − cτj) ].
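The closed form for ηj can be checked with the values stated in the problem: per source, the capacity share is cτj/(KjT), and ηj must satisfy Kj ebj(ηj) = cτj/T. A sketch (not the book's code):

```python
import math

# numerical values from Problem 117
alpha, beta, r, B = 3.0, 0.2, 3.4, 30.0
K, c, tau_over_T = 10, 15.3, 3.0 / 13.0

cpp = c * tau_over_T / K       # per-source capacity share, c*tau_j/(K_j*T)
eta = (cpp * (alpha + beta) - beta * r) / (cpp * (r - cpp))

def eb(v):
    """Effective bandwidth of one on-off source (Equation 10.18)."""
    a = r * v - alpha - beta
    return (a + math.sqrt(a * a + 4 * beta * r * v)) / (2 * v)

loss_ebw = math.exp(-eta * B)
print(eta, K * eb(eta), loss_ebw)   # K*eb(eta) should equal c*tau_j/T
```

The resulting log10 of loss(ebw) is about −5.45, consistent with the level shown in Figure 10.18 below.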
Recall that the effective bandwidth approximation yields P(Xj > Bj) ≈ e^{−Bjηj} and the SMP bounds result in P(Xj > Bj) ≤ Cj^* e^{−ηjBj}. Thus, for the numerical values listed in the problem, if we let Bj be the size of the buffer, then the fraction of fluid lost can be approximated as P(Xj > Bj). With that understanding, the loss probability estimate using the effective-bandwidth technique is

loss(ebw) = e^{−ηjBj},

and that using the SMP bounds is

loss(smp) = Cj^* e^{−ηjBj}.

Figure 10.18 shows the results for loss(ebw) and loss(smp), given the numerical values in the problem, by varying T from 0.01 to 0.40 while keeping τj/T fixed.
Intuitively, we expect the loss probability to increase with T since an
increase in T would increase the time the server does not serve the buffer.
The SMP bounds estimate, loss(smp), increases with T and hence confirms
our intuition. The effective-bandwidth estimate, loss(ebw), does not change
with T. For small T, since loss(smp) < loss(ebw), we can conclude that the
effective-bandwidth technique produces a conservative result. For large T,
the estimate of the loss probability is smaller using the effective-bandwidth
technique than the SMP bounds technique. This indicates that there may be a
risk in using the effective-bandwidth technique as it could result in the QoS
criteria not being satisfied.
FIGURE 10.18
Estimates of the logarithms of loss probability, log10[loss(smp)] and log10[loss(ebw)]. (From Gautam, N. and Kulkarni, V.G., Queueing Syst. Theory Appl., 36, 351, 2000. With permission.)
On the other hand, using the upper bound for an SMP, we choose the largest integer Kj,max^{smp} that satisfies

Cj^* e^{−ηjBj} < εj.

Figure 10.19 shows the results for Kj,max^{ebw} and Kj,max^{smp} when εj = 10^{−5} and T varies from 0.01 to 10.00 while τj/T is fixed. As T increases, we expect fewer sources to be allowable into the buffer so that long bursts of traffic can be avoided when the server is not serving. From the figure, Kj,max^{smp} clearly conforms to our intuition. For large T, we may end up admitting more sources if we used the effective-bandwidth technique, and hence the QoS criterion may not be satisfied.
FIGURE 10.19
Estimate of the maximum number of sources, Kj,max^{ebw} and Kj,max^{smp}, as a function of T. (From Gautam, N. and Kulkarni, V.G., Queueing Syst. Theory Appl., 36, 351, 2000. With permission.)
The loss probability estimate using the SMP bounds decreases as c increases. Therefore, we perform a search using the bisection method to pick a c between the mean and peak input rates that satisfies

C∗j e^{−ηj Bj} = εj,

and we denote the c value obtained as c^smp_min since it is the smallest output capacity that would result in satisfying the QoS criterion.
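This bisection is easy to sketch in code. The sketch below is illustrative rather than the book's exact computation: it replaces the SMP-bound loss estimate with the simpler exponential estimate e^{−ηB} (with η in the form of Equation 10.32 for K IID exponential on-off sources), which is also decreasing in c; all parameter names and values are assumptions.

```python
import math

def eta(c, K, r, alpha, beta):
    # Decay rate for K IID exponential on-off sources (form of Equation 10.32)
    return K * (c * alpha + c * beta - K * beta * r) / (c * (K * r - c))

def loss_estimate(c, K, r, alpha, beta, B):
    # Exponential estimate of P(X > B); decreasing in c on (mean rate, peak rate)
    return math.exp(-eta(c, K, r, alpha, beta) * B)

def c_min(K, r, alpha, beta, B, eps, tol=1e-9):
    """Smallest capacity c with loss_estimate(c) <= eps, by bisection."""
    lo = K * r * beta / (alpha + beta)  # mean input rate (below this: unstable)
    hi = K * r                          # peak input rate (above this: no loss)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if loss_estimate(mid, K, r, alpha, beta, B) > eps:
            lo = mid  # loss still too large: need more capacity
        else:
            hi = mid
    return hi
```

Any other monotone loss estimate (such as the SMP bound itself) can be substituted for `loss_estimate` without changing the search structure.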
FIGURE 10.20
Estimates of the required bandwidth, c^smp_min and c^ebw_min. (From Gautam, N. and Kulkarni, V.G., Queueing Syst. Theory Appl., 36, 351, 2000. With permission.)
Problem 118
Consider a multiclass fluid queueing system with N = 2. For j = 1, 2, class j
fluid enters into buffer j from Kj exponential on-off sources with mean on-
time 1/αj and mean off-time 1/βj . Fluid is generated by each source at rate
rj when the source is in the on-state. Fluid is emptied by a channel with
capacity c that serves buffer j for τj time and has a total switch-over time of
tso per cycle. State an algorithm to determine the feasible region for this timed
round-robin policy, Ktrr, so that if (K1, K2) ∈ Ktrr, then P(Xj > Bj) < εj for j = 1
and j = 2. Graph the feasible region for the following numerical values:
To begin with, assume that the cycle time T and the switch-over time tso are
fixed known constants. However, the values τ1 and τ2 vary and are appro-
priately chosen such that τ1 + τ2 + tso = T. Subsequently, consider the case
where T is varied so that it is under different orders of magnitude compared
to tso .
Solution
An algorithm to compute the feasible region:
1. Set K = ∅.
2. Let τ1 = T and τ2 = 0. (The scheduler always serves only buffer-
1, and hence there are no switch-over times and no compensating
source.)
3. Obtain the maximum number of admissible class-1 sources K1max as
the maximum value of K1 such that
where
and
η1 = [c(α1 + β1) − r1 K1 β1] / [(c/K1)(r1 K1 − c)].
where
and
η2 = [c(α2 + β2) − r2 K2 β2] / [(c/K2)(r2 K2 − c)].
8. Set K1 = 1.
9. While K1 < K1max :
(i) Compute the minimum required τ1 (≤ T − tso ) such that the
loss probability is less than ε1.
(ii) Compute the available τ2 ( = T − tso − τ1 ).
(iii) Given τ2 , compute the maximum possible K2 value by
minimizing over the set A2 for K2 + 1 sources.
(iv) K = K ∪ {(K1 , 1), (K1 , 2), . . . , (K1 , K2 )}.
(v) K1 = K1 + 1.
10. Return Ktrr = K.
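The algorithm above can be sketched as follows. The SMP-bound loss computation is the involved part; as a stand-in, the hypothetical `loss_prob` below uses a crude effective-bandwidth surrogate in which buffer j is treated as if it were served at the averaged rate c·τj/T. The loop structure mirrors steps 8 through 10; all names and values are illustrative assumptions.

```python
import math

def loss_prob(K, tau, T, c, r, alpha, beta, B):
    """Crude surrogate for P(Xj > Bj): K IID exponential on-off sources
    served at the averaged rate c * tau / T (replace with the SMP bound)."""
    ceff = c * tau / T
    if K * r <= ceff:
        return 0.0            # peak rate below capacity: buffer stays empty
    if K * r * beta / (alpha + beta) >= ceff:
        return 1.0            # mean rate above capacity: unstable
    eta = K * (ceff * (alpha + beta) - K * beta * r) / (ceff * (K * r - ceff))
    return math.exp(-eta * B)

def feasible_region(T, tso, c, r, alpha, beta, B, eps, K1cap=50):
    """Steps 8-10 of the algorithm: sweep K1, split T - tso into tau1, tau2."""
    region = set()
    for K1 in range(1, K1cap + 1):
        # (i) smallest tau1 <= T - tso meeting the class-1 QoS (grid search)
        tau1 = next((f / 100 * (T - tso) for f in range(1, 101)
                     if loss_prob(K1, f / 100 * (T - tso), T, c,
                                  r[0], alpha[0], beta[0], B[0]) < eps[0]), None)
        if tau1 is None:
            break              # class 1 infeasible; larger K1 only gets worse
        tau2 = T - tso - tau1  # (ii) remaining service time for buffer 2
        K2 = 0                 # (iii) largest K2 meeting the class-2 QoS
        while loss_prob(K2 + 1, tau2, T, c, r[1], alpha[1], beta[1], B[1]) < eps[1]:
            K2 += 1
        region.update((K1, k) for k in range(1, K2 + 1))  # (iv)
    return region
```

Replacing `loss_prob` by the SMP-bound computation of Section 10.2.3 recovers the algorithm as stated.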
FIGURE 10.21
Admissible region Ktrr (K2 versus K1). (From Gautam, N. and Kulkarni, V.G., Queueing Syst. Theory Appl., 36, 351, 2000. With permission.)
FIGURE 10.22
Ktrr as a function of T (curves shown for T = 0.14, T = 1.22, and T = 12.02). (From Gautam, N. and Kulkarni, V.G., Queueing Syst. Theory Appl., 36, 351, 2000. With permission.)
That said, we move on to the next service scheduling policy, the static
priority policy.
Buffer-1: η1 is obtained as the solution of

Σ_{i=1}^{K1} eb_{i1}(η1) = c.   (10.28)
The constants L1 and C∗1 can be obtained using the CDE approximation
(Section 10.2.2) and SMP bounds (Section 10.2.3), respectively. Thereby,
we could use P(X1 > x) to determine if the QoS criterion P{X1 > B1} ≤ ε1 is
satisfied.
Buffer-j (1 < j ≤ N): The capacity available to buffer j is 0 when at least one of the buffers 1, . . . , j − 1 is nonempty, and it is c − Σ_{k=1}^{j−1} Σ_{i=1}^{Kk} r_{ik}(Z_{ik}(t)) if all the buffers 1, . . . , j − 1 are empty. Let R_{j−1}(t) be the sum of the output rates of the buffers 1, . . . , j − 1 at time t, with R_0(t) = 0. Therefore

R_{j−1}(t) = { c                                                      if Σ_{k=1}^{j−1} X_k(t) > 0,
             { min[ c, Σ_{k=1}^{j−1} Σ_{i=1}^{Kk} r_{ik}(Z_{ik}(t)) ]  if Σ_{k=1}^{j−1} X_k(t) = 0.    (10.29)
Thus, the (time varying) channel capacity available for buffer j is c − Rj−1 (t)
at time t. Any sample path of the buffer content process {Xj (t), t ≥ 0} remains
unchanged if we transform the model for buffer j into one that gets served at
a constant capacity c and an additional compensating source producing fluid
at rate Rj−1 (t) at time t. Note that the compensating source j is independent
of the Kj sources of priority j.
A critical observation to make is that the compensating source is indeed
the output from a buffer whose input is the aggregated input of all 1, . . . , j−1
priority traffic and constant output capacity c. This observation is made in
Elwalid and Mitra [32] and is immensely useful in the analysis. Consider the
transformed model for the case N = 2 (a 2-priority model for ease of expla-
nation) depicted in Figure 10.23. The sample paths of the buffer content
processes {X1 (t), t ≥ 0} and {X2 (t), t ≥ 0} in this model are identical to those
in the original system. Similarly, in the case of N priorities, such a tandem
model is used.
Thus, buffer j can be equivalently modeled as one that is served at a con-
stant rate c, but has an additional compensating source as described earlier.
Let the effective bandwidth of this compensating source (which is the out-
put traffic from a fictitious buffer with input corresponding to all j − 1 higher
FIGURE 10.23
Equivalent N = 2 priority system. (From Gautam, N. and Kulkarni, V.G., Queueing Syst. Theory Appl., 36, 351, 2000. With permission.)
priority sources) be eb^j_o(v). Then ηj is given by

Σ_{i=1}^{Kj} eb_{ij}(ηj) + eb^j_o(ηj) = c,   ∀ j = 1, 2, . . . , N,

where eb^1_o(·) = 0 and eb^j_o(·) is as in Equation 10.30. If it is possible to characterize the output process as a tractable stochastic process, we could use CDE
approximation (Section 10.2.2) or SMP bounds (Section 10.2.3) to derive an
expression for P(Xj > x). Otherwise, we could always use the effective band-
width approximation P(Xj > x) ≈ e−ηj x , for j = 2, . . . , N. Thereby, we could
use P(Xj > x) to determine if the QoS criterion P{Xj > Bj} ≤ εj is satisfied. We
will use the SMP bounds to obtain an approximation for P(Xj > x) in the
example we present next.
Problem 119
Consider two classes of traffic. The Kj class-j sources, for j = 1, 2, are indepen-
dent and identical on-off sources with exponential on and off times, on-time
K1 r1 β1/(α1 + β1) + K2 r2 β2/(α2 + β2) < c.
For the analysis, we first consider buffer-1. If K1 ≤ c/r1 , then P{X1 > x} = 0,
since buffer-1 will always be empty. Now for the case K1 > c/r1 , let η1 be
the solution to K1 eb1 (η1 ) = c. Then the steady-state distribution of the buffer-
content process is bounded as
where

η1 = K1(cα1 + cβ1 − K1 β1 r1) / (c(K1 r1 − c)),   (10.32)

C∗1 = [ (K1 r1/(K1 r1 − c)) (α1/(α1 + β1)) ]^{K1} / [ cα1/(β1(K1 r1 − c)) ]^{c/r1},

and

C∗1 = [ K1 r1 β1/(c(α1 + β1)) ]^{K1}.
or

(v∗/η2) K1 eb1(v∗) + K2 eb2(η2) = c v∗/η2   and   η2 > v∗,

where

v∗ = (β1/r1) [ −1 + (cα1/(β1(K1 r1 − c))) (1 − √( α1 β1 (K1 r1 − c)/(r1 c α1) )) ].
c1 = K2 eb2 (η2 )
and
c2 = c − K2 eb2 (η2 ).
τ1_i = 1/(iα2 + (K2 − i)β2),

and

p1_i = a1_i τ1_i / Σ_{m=0}^{K2} a1_m τ1_m = [K2!/(i!(K2 − i)!)] α2^{K2−i} β2^i / (α2 + β2)^{K2}.
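The closed form for p1_i is a binomial distribution (each source is on with probability β2/(α2 + β2) in steady state). The sketch below rebuilds p_i = a_i τ_i / Σ_m a_m τ_m from the embedded jump chain of the on-off environment (using reversibility of the birth-death chain) and checks it against the closed form; the parameter values are illustrative.

```python
import math

def smp_state_probs(K, alpha, beta):
    """p_i = a_i * tau_i / sum_m a_m * tau_m for the on-off environment SMP."""
    rate = [i * alpha + (K - i) * beta for i in range(K + 1)]  # exit rates
    tau = [1.0 / x for x in rate]                              # mean sojourn times
    # embedded birth-death jump chain is reversible: build a_i by detailed balance
    a = [1.0]
    for i in range(K):
        up = (K - i) * beta / rate[i]          # P(i -> i + 1)
        down = (i + 1) * alpha / rate[i + 1]   # P(i + 1 -> i)
        a.append(a[i] * up / down)
    z = sum(ai * ti for ai, ti in zip(a, tau))
    return [ai * ti / z for ai, ti in zip(a, tau)]

def binomial_probs(K, alpha, beta):
    """Closed form from the text: C(K, i) alpha^(K-i) beta^i / (alpha + beta)^K."""
    return [math.comb(K, i) * alpha ** (K - i) * beta ** i / (alpha + beta) ** K
            for i in range(K + 1)]
```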
h1 = h1 Φ1(η2).
Therefore,

H1 = Σ_{i=0}^{K2} [ h1_i/(η2(i r2 − c1)) ] ( Σ_{j=0}^{K2} φ1_ij(η2) − 1 )

and

ψ1_max(i, j) = ψ1_min(i, j)
  = [ h1_i e^{−η2(i r2 − c1)x} ∫_x^∞ e^{η2(i r2 − c1)y} dG1_ij(y) ] / [ (p1_i/τ1_i) ∫_x^∞ dG1_ij(y) ]
  = (h1_i/p1_i) · 1/( iα2 + (K2 − i)β2 − η2(i r2 − c1) ).
G2_{i,j}(t) = { [iα1/(iα1 + (K1 − i)β1)] (1 − exp{−(iα1 + (K1 − i)β1)t})        if j = i − 1,
             { [(K1 − i)β1/(iα1 + (K1 − i)β1)] (1 − exp{−(iα1 + (K1 − i)β1)t})  if j = i + 1,
             { 0                                                                 otherwise.
Let

G2_{i,j}(∞) = { iα1/(iα1 + (K1 − i)β1)         if j = i − 1,
              { (K1 − i)β1/(iα1 + (K1 − i)β1)   if j = i + 1,
              { 0                                otherwise,
G2M,j (∞) = G̃2M,j (0),
where G̃2M,j (s) is the LST of G2M,j (t) that we have shown in Problem 116.
We also need the expression for the sojourn time τ2i in state i, for
i = 0, 1, . . . , M. We have
τ2_i = { 1/(iα1 + (K1 − i)β1)                          if i = 0, 1, . . . , M − 1,
       { −Σ_{j=0}^{M−1} [dG̃2_{M,j}(s)/ds]_{s=0}        if i = M,
p2_i = a2_i τ2_i / Σ_{k=0}^{M} a2_k τ2_k,
where
a2 = a2 G2 (∞).
Define

φ2_ij(η2, m) = { G̃2_ij(−η2(i r1 − c2))    if 0 ≤ i ≤ M − 1,
              { m G̃2_ij(−η2(c − c2))      if i = M.

Solve for m such that the Perron–Frobenius eigenvalue of Φ2(η2, m) is 1. Hence, we obtain h2 from

h2 Φ2(η2, m) = h2.
It can be shown that random variables with distribution G2_{Mj}(x)/G2_{Mj}(∞) have a decreasing failure rate. Hence, ψ2_min(M, j) and ψ2_max(M, j) occur at x = ∞ and x = 0, respectively. Thus, we have for (i, j) ∈ {0, 1, . . . , M},
H2 = Σ_{i=0}^{M} [ h2_i/(η2(i r1 − c2)) ] ( Σ_{j=0}^{M} φ̄2_ij(η2, m) − 1 ),
ψ2_min(i, j) = inf_x { [ h2_i e^{−η2(i r1 − c2)x} ∫_x^∞ e^{η2(i r1 − c2)y} dG2_ij(y) ] / [ (p2_i/τ2_i) ∫_x^∞ dG2_ij(y) ] },

and

ψ2_max(i, j) = sup_x { [ h2_i e^{−η2(i r1 − c2)x} ∫_x^∞ e^{η2(i r1 − c2)y} dG2_ij(y) ] / [ (p2_i/τ2_i) ∫_x^∞ dG2_ij(y) ] }.
C∗2 = H1 H2 / [ min_{(i1,j1),(i2,j2): min{i1 r1, c} + i2 r2 > c, p_{i1 j1} > 0, p_{i2 j2} > 0} ψ1_min(i1, j1) ψ2_min(i2, j2) ]

and

C∗2 = H1 H2 / [ max_{(i1,j1),(i2,j2): min{i1 r1, c} + i2 r2 > c, p_{i1 j1} > 0, p_{i2 j2} > 0} ψ1_max(i1, j1) ψ2_max(i2, j2) ].
Problem 120
Consider the N = 2 class system described in Problem 119, where all sources
of a class are IID exponential on-off sources. Obtain admissible region K
using effective bandwidth approximation, CDE approximation, as well as
SMP bounds so that if (K1, K2) ∈ K, then P(X1 > B1) ≤ ε1 and P(X2 > B2) ≤ ε2.
Compare the approaches for the following numerical values:
Solution
Using the three different methodologies (effective bandwidth, CDE, and
SMP bounds), we can obtain different admissible regions based on the
expressions used for approximating P(Xj > x) for j = 1, 2. We first describe
them and provide notation:
FIGURE 10.24
Regions N, Kebw, Ksmp, K(1)cde, and K(2)cde. (From Gautam, N. and Kulkarni, V.G., Queueing Syst. Theory Appl., 36, 351, 2000. With permission.)
Next, we compare the region Ksmp with the regions obtained using the CDE approximation, K(1)cde and K(2)cde, as well as the regions obtained by
effective-bandwidth approximation Kebw and N . We represent the regions
under consideration in Figure 10.24 using the numerical values stated in the
problem.
The region obtained by the SMP bounds, Ksmp , is conservative. There-
fore, if an admissible region has points in Ksmp , then those points are
guaranteed to satisfy the QoS criteria. Thus, the effective-bandwidth approx-
imation produces overly conservative results for these parameter values.
It is crucial to point out that although the effective bandwidth produces
conservative results usually, it is not guaranteed to be conservative, unlike
the results from SMP bounds. But in general, on the one hand, the effective-bandwidth approximation is computationally easy; on the other hand, it could either be too conservative (and hence lead to underutilization of resources) or be nonconservative (and hence leave it unclear whether the QoS criteria are met). The CDE approximation, although computationally slower than
the effective-bandwidth approximation, is typically faster than the SMP
bounds technique. However, there are examples where we can show that
the CDE approximation produces regions K(1)cde and K(2)cde with points (K1, K2)
that would actually result in the QoS criteria not being satisfied. Using SMP
Problem 121
Consider a multiclass fluid queueing system with N = 2 and IID exponen-
tial on-off sources for each class. For j = 1, 2, class j fluid enters into buffer
j from Kj exponential on-off sources with mean on-time 1/αj and mean off-
time 1/βj . Fluid is generated by each source at rate rj when the source is in
the on-state. Fluid is emptied by a channel with capacity c. Use the following
numerical values:
Consider two policies: (1) timed round robin with tso = 0.02 and T = c(B1 +
B2 ) + tso ; and (2) static priority. Compare the two policies by viewing the
admissible regions.
Solution
In Figure 10.25, we compare the two policies, timed round-robin and static
priority, by viewing their respective admissible regions (using SMP bounds,
hence the region corresponds to Ksmp in the previous problem and Ktrr
in Section 10.3.2) for two-class exponential on-off sources with parameters
given in the problem.
From the figure, we see the timed round-robin policy results in a smaller
admissible region. This is because unlike the static priority service policy,
the timed round-robin policy is not a work-conserving service discipline. In
particular, there is time switching between buffers as well as time spent in
buffers (recall that τ1 and τ2 are always spent) even if there is no traffic.
However, static priority service policy does not achieve fairness among the
classes of traffic. Therefore, it may not be an appropriate policy to use at
all times.
There are many other policies one could consider besides timed round-
robin and static priority. We briefly describe three of them in the following
section.
FIGURE 10.25
Timed round-robin versus static priority (admissible regions, K2 versus K1). (From Gautam, N. and Kulkarni, V.G., Queueing Syst. Theory Appl., 36, 351, 2000. With permission.)
• φ1 + φ2 + · · · + φN = 1
• If all the input buffers have nonzero fluid, the scheduler allocates
output capacity c in the ratio φ1 : φ2 : · · · : φN to each of the N
buffers
• If some buffers are empty, the scheduler allocates just enough capac-
ity to those buffers (equal to the rates of traffic entering) so that
The GPS policy is in some sense the limiting timed round-robin policy,
where tso = 0 and for all j, τj → 0 such that τj /T → φj . The only exception to
timed round-robin is when a buffer is empty and here we assume that the
system is work conserving. So empty buffers are served only for a fraction
of their slot. The discrete version of the GPS is called the packetized general
processor sharing (PGPS) or weighted fair queueing, which is well-studied
in the literature. The quality-of-service aspects, effective bandwidths, and
admission control for the GPS and PGPS have been addressed in detail in
de Veciana et al. [24] and [22]. We recapitulate those results for GPS in the next problem for the case N > 2; for the N = 2 case, the reader is referred to those articles.
Problem 122
What is the condition of stability? Assuming a stable system, let Xj be the amount of fluid in buffer j in steady state for all j ∈ [1, . . . , N]. Using effective bandwidth analysis, obtain an approximation for P(Xj > Bj) for all
j ∈ [1, . . . , N].
Solution
Notice that the scheduler is work conserving, in other words it is impossible
that there is fluid in at least one buffer and the scheduler is draining at a rate
lower than c. Thus, the condition for stability is that the sum of the mean input rates of all the sources of all N classes be strictly less than c.
Using the effective bandwidth analysis is a little tricky since the compen-
sating source is not easy to characterize except for some very special cases.
For some j ∈ [1, . . . , N], take buffer j. It is guaranteed a minimum bandwidth
of φj c at all times. However, the remaining (1 − φj )c (or greater, if buffer
j is empty) is shared among all the sources in ratios according to the GPS
scheme.
This is a little tricky to capture using a compensating source. Hence, we
consider a fictitious compensating source that is essentially the output from
a fictitious buffer with capacity (1 − φj )c and input being all the sources from
all the classes except j. Thus, when the fictitious buffer is nonempty, buffer
j gets exactly φj c capacity; however, when the fictitious buffer is empty,
all unutilized capacity is used by buffer j. It is not difficult to check that
this compensating source is rather conservative, that is, in the real setting,
a lesser amount of fluid flows from the compensating source. In the spe-
cial case when N = 2 and K1 = K2 = 1 on-off source with on rates of class j
source being larger than φj c, this fictitious compensating source is identical
to the real compensating source. One could certainly develop other types
of compensating sources. The key idea here is a conservative one where
unless all the other buffers are empty, the remaining capacity is not allocated
to buffer j.
For such a compensating source, we solve for ηj as the unique solution to
Σ_{i=1}^{Kj} eb_{ij}(ηj) + min{ (1 − φj)c, Σ_{k≠j} Σ_{i=1}^{Kk} eb_{ik}(ηj) } = c.
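Since each eb_ik(·) is increasing, the left-hand side is increasing in ηj and the root can be found by bisection. A sketch for classes of IID exponential on-off sources follows (function names and parameter values are illustrative assumptions):

```python
import math

def eb_onoff(v, r, alpha, beta):
    """Effective bandwidth of an exponential on-off source (cf. Equation 10.18)."""
    t = r * v - alpha - beta
    return (t + math.sqrt(t * t + 4 * beta * r * v)) / (2 * v)

def gps_eta(j, K, r, alpha, beta, phi, c, v_hi=200.0):
    """Bisection for eta_j in:
       sum_i eb_ij(eta) + min((1 - phi_j) c, sum_{k != j} sum_i eb_ik(eta)) = c."""
    def g(v):
        own = K[j] * eb_onoff(v, r[j], alpha[j], beta[j])
        others = sum(K[k] * eb_onoff(v, r[k], alpha[k], beta[k])
                     for k in range(len(K)) if k != j)
        return own + min((1 - phi[j]) * c, others) - c
    lo, hi = 1e-9, v_hi
    if g(lo) >= 0 or g(hi) <= 0:
        raise ValueError("root not bracketed: check stability and phi_j")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```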
In fact, instead of the preceding expression one could have been more strict
and written down the output effective bandwidth from the fictitious buffer.
Thereby, we can obtain an approximation for the probability that there is
more than Bj amount of fluid in buffer j as P(Xj > Bj) ≈ e^{−Bj ηj}. Also, the QoS criteria are satisfied if, for all j = 1, 2, . . . , N,

e^{−Bj ηj} < εj.
That said, we move on to the next set of policies. Both are based on thresh-
olds of buffer contents. They leverage upon results from this chapter as well
as Chapter 9.
Problem 123
Consider a fluid queueing system with two infinite-sized buffers as shown
in Figure 10.26. For j = 1, 2, fluid enters buffer j according to an alternating
on-off process such that for an exponentially distributed time (with mean
1/αj ) fluid enters continuously at rate rj and then no fluid enters for another
FIGURE 10.26
Two-buffer system. (From Aggarwal, V. et al., Perform. Eval., 59(1), 19, 2004. With permission.)
exponentially distributed time (with mean 1/βj ). When the off-time ends,
another on-time starts, and so on. Let Xj (t) be the amount of fluid in buffer j
(for j = 1, 2) at time t. A scheduler alternates between buffer-1 and buffer-2 while draining out fluid continuously at rate c. Assume that r1 > c and
r2 > c. The policy adopted by the scheduler is as follows: as soon as buffer-
1 becomes empty (i.e., X1 (t) = 0), the scheduler switches from buffer-1 to
buffer-2. When the buffer contents in buffer-1 reaches a (i.e., X1 (t) = a), the
scheduler switches back from buffer-2 to buffer-1. We denote 0 and a as the
thresholds for buffer-1. What is the stability condition? Assuming stability,
derive an expression using SMP bounds for the steady-state distribution of
the contents of buffer-2.
Solution
Note that the scheduler’s policy is dependent only on buffer-1. That means
even if buffer-2 is empty (i.e., X2 (t) = 0), as long as buffer-1 has less than a
(i.e., X1 (t) < a), the scheduler does not switch back to buffer-1. Also it is rela-
tively straightforward to model the dynamics of buffer-1 and obtain the state
probability P(X1 > x) for x > a assuming the buffer is stable (and as t → ∞,
X1 (t) → X1 ). This analysis is described in Chapter 9 (Problem 96). Here we
only consider bounds for P(X2 > x) assuming the system is stable. In fact, the
stability condition (for limiting distributions of the buffer contents X1 (t) and
X2 (t) to exist) is
r1 β1/(α1 + β1) + r2 β2/(α2 + β2) < c.
Õ1(w) = { [(w + β1 + c s0(w))/β1] e^{a s0(w)}   if w ≥ w∗,
        { ∞                                     otherwise,
where

w∗ = [2√(cα1β1(r1 − c)) − r1β1 − cα1 − cβ1]/r1,

s0(w) = [−b − √(b² + 4w(w + α1 + β1)c(r1 − c))] / (2c(r1 − c)),
and b = (r1 −2c)w+(r1 −c)β1 −cα1 . The mean on-time E[T1 ] can be computed
as E[T1 ] = − dÕ1 (w)/dw at w = 0. Hence we have
E[T1] = [r1 + a(α1 + β1)] / (cα1 + cβ1 − r1β1).
Õ2(s) = [β1/(β1 + s)] e^{−a(α1 s + β1 s + s²)/(r1 s + r1 β1)}.
Hence, the mean off-time E[T2 ] can be derived as E[T2 ] = − dÕ2 (s)/ds at s = 0
and is given by
E[T2] = [r1 + a(α1 + β1)] / (r1β1).
A detailed derivation of Õ1 (·) and Õ2 (·) is described in Aggarwal et al. [3].
Using this compensating source model and the SMP bounds analysis, we
can derive the limiting distribution of the buffer contents of buffer-2. We first
obtain the effective bandwidth of source 2 (the original source into buffer-2)
using Equation 10.18 as
eb2(v) = [ r2 v − α2 − β2 + √((r2 v − α2 − β2)² + 4β2 r2 v) ] / (2v).
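This expression is increasing in v, tends to the mean rate r2β2/(α2 + β2) as v → 0, and tends to the peak rate r2 as v → ∞, which is easy to verify numerically (a sketch with illustrative values):

```python
import math

def eb_onoff(v, r, alpha, beta):
    """Effective bandwidth of an exponential on-off source with on-rate r,
    mean on-time 1/alpha, and mean off-time 1/beta (cf. Equation 10.18)."""
    t = r * v - alpha - beta
    return (t + math.sqrt(t * t + 4 * beta * r * v)) / (2 * v)
```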
FIGURE 10.27
Two-buffer system. (From Mahabhashyam, S. et al., Oper. Res., 56(3), 728, 2008. With permission.)
Problem 124
Consider a two-buffer fluid flow system illustrated in Figure 10.27. For
j = 1, 2, class j fluid enters buffer j according to an alternating on-off process
so that fluid enters continuously at rate rj for an exponentially distributed
time (on-times) with mean 1/αj and then no fluid enters (off-times) for
another exponential time with mean 1/βj . The on and off times continue
alternating one after the other. The buffers can hold an infinite amount of
fluid; however, the contents of only one buffer is observed, buffer-1. There
are two schedulers that drain fluid from the two buffers. Scheduler-1 has
a capacity of c1 and scheduler-2 has a capacity c2 , which are the maximum
rates the respective schedulers can drain fluid. Let Xj (t) be the amount of
fluid in buffer j (for j = 1, 2) at time t. Fluid is drained from the two buffers
in the following fashion. When X1 (t) is nonzero, scheduler-1 serves buffer-1
and when X1 (t) = 0, scheduler-1 serves buffer-2. Also, if X1 (t) is less than a
threshold x∗ , scheduler-2 removes fluid from buffer-2, otherwise it drains out
buffer-1. Assuming stability, derive bounds for the steady-state fluid level in
buffer-2.
Solution
Notice that when X1 (t) = 0, both schedulers serve buffer-2 and when
0 < X1 (t) < x∗ scheduler-1 serves buffer-1 and scheduler-2 serves buffer-2,
whereas, when X1 (t) ≥ x∗ , both schedulers serve buffer-1. Since only X1 (t)
is observed, the buffer-emptying scheme depends only on it. If Cj (t) is
the capacity available for buffer j at time t, then C1 (t) = 0 when X1 (t) = 0,
C1 (t) = c1 when 0 < X1 (t) < x∗ , and C1 (t) = c1 +c2 whenever X1 (t) ≥ x∗ . Capac-
ity available for buffer-2 at any time t is C2 (t) = c1 + c2 − C1 (t). The stability
condition for the two-buffer system in Figure 10.27 is given by:
r1 β1 r2 β2
+ < c1 + c2 .
α1 + β1 α2 + β2
It is possible to obtain the state probability P(X1 > x) for x > x∗ assum-
ing the buffer is stable (and as t → ∞, X1 (t) → X1 ). This analysis is described
in Problem 97 (see Chapter 9). Here we only consider bounds for P(X2 > x),
where X2 is the amount of fluid in buffer-2 in steady state. For buffer-2, the
output capacity is not only variable but also inherently dependent on con-
tents of buffer-1. The input for buffer-2 is from an exponential on-off source
but the output capacity varies from zero to (c1 + c2 ) depending on the buffer
content in buffer-1. The variation of output capacity over time (say Ô(t)) with
respect to content of buffer-1 is as follows:
Ô(t) = { 0         when X1(t) ≥ x∗,
       { c2        when 0 < X1(t) < x∗,
       { c1 + c2   when X1(t) = 0.
Consider the queueing system (as depicted in Figure 10.28), where there
are two input streams and a server with a constant output capacity c1 + c2 .
The first input stream is a compensating source, where fluid enters the queue
at rate c1 +c2 − Ô(t) at time t. The second input stream is identical to source-2,
where fluid enters according to an exponential on-off process with rates r2
when on and 0 when off. The environment process that drives traffic gener-
ation for the compensating source can be modeled as a four-state SMP. Let
Z1 (t) be the environment process denoting the on-off source for buffer-1. If
source-1 is on at time t, Z1 (t) = 1 and if source-1 is off at time t, Z1 (t) = 0. Con-
sider the Markov regenerative sequence {(Yn , Sn ), n ≥ 0} where Sn is the nth
regenerative epoch, corresponding to X1 (t) equaling either x∗ or 0, and Yn
is the state immediately following the nth Markov regenerative epoch such
that
Yn = { 1   if X1(Sn) = 0 and Z1(Sn) = 0,
     { 2   if X1(Sn) = 0 and Z1(Sn) = 1,
     { 3   if X1(Sn) = x∗ and Z1(Sn) = 0,
     { 4   if X1(Sn) = x∗ and Z1(Sn) = 1.
FIGURE 10.28
Buffer-2 with compensating source (served at constant rate c1 + c2). (From Mahabhashyam, S. et al., Oper. Res., 56(3), 728, 2008. With permission.)
The expressions G12 (t), G21 (t), G24 (t), G31 (t), G34 (t), and G43 (t) need to
be obtained. Two of them are relatively straightforward to obtain, namely,
G12 (t) and G43 (t). First consider G12 (t). This is the probability that Yn changes
from 1 to 2 before time t, which is the same as the probability of the source-1
going from off to on. Hence G12 (t) is given by
Next consider G43 (t). This is the probability that the buffer-2 content goes up
from x∗ and reaches x∗ in time t. This is identical to the probability that the
buffer content starts at zero, goes up, and comes back to zero within time t,
that is, equivalent to the busy period distribution. The LST of G43 (t) can be
obtained by substituting appropriate terms in the busy period distribution
of Problem 94. Hence,
G̃43(w) = { (w + β1 + c s0(w))/β1   if w > w∗,
          { ∞                       otherwise,
where

s0(w) = [−b − √(b² + 4w(w + α1 + β1)c(r1 − c))] / (2c(r1 − c)),

and b = (r1 − 2c)w + (r1 − c)β1 − cα1, w∗ = (2√(cα1β1(r1 − c)) − r1β1 − cα1 − cβ1)/r1, and c = c1 + c2.
To obtain expressions for the remaining terms in the kernel of the SMP,
namely, G21 (t), G24 (t), G31 (t), and G34 (t), turn to Problem 93. Using
a subscript of 1 for α, β, r, and c, it is relatively straightforward to
see that
G̃31(w) = a11(w) e^{S1(w)x∗} ψ1(w) + a21(w) e^{S2(w)x∗} ψ2(w),
where

S1(w) = [−b̂ − √(b̂² + 4w(w + α1 + β1)c1(r1 − c1))] / (2c1(r1 − c1)),

S2(w) = [−b̂ + √(b̂² + 4w(w + α1 + β1)c1(r1 − c1))] / (2c1(r1 − c1)),

ψi(w) = β1 / (w + β1 + Si(w)c1),
and finally,

a11(w) = e^{S2(w)x∗}/δ(w),
a12(w) = −ψ2(w)/δ(w),
a21(w) = −e^{S1(w)x∗}/δ(w),
a22(w) = ψ1(w)/δ(w),

with δ(w) = e^{S2(w)x∗} ψ1(w) − e^{S1(w)x∗} ψ2(w).
Now that we have characterized the compensating source as an SMP,
next we consider that and the original source-2 and obtain the effective
bandwidths. The effective bandwidth of the compensating source can be
computed as follows. For a given v such that v > 0, define the matrix χ(v, u)
such that
χ(v, u) =
⎡ 0                 G̃12(vu)          0                         0               ⎤
⎢ G̃21(vu − c1 v)   0                 0                         G̃24(vu − c1 v) ⎥
⎢ G̃31(vu − c1 v)   0                 0                         G̃34(vu − c1 v) ⎥
⎣ 0                 0                 G̃43(vu − c1 v − c2 v)    0               ⎦ .
Using the effective bandwidths of the original source 2 (eb2(v)) and the compensating source (ebc(v)), η can be obtained as the unique solution to eb2(η) + ebc(η) = c1 + c2. Define

γ2 = eb2(η),
γc = ebc(η).
Define Φ(η) = χ(η, ebc(η)) such that φij(η) is the ijth element of Φ(η). Let h be the left eigenvector of Φ(η) corresponding to the eigenvalue 1, that is,

h = h Φ(η).
where

K∗ = [ (r2 β2/(γ2(α2 + β2))) Σ_{i=1}^{4} [ h_i/(η(r(i) − γc)) ] ( Σ_{j=1}^{4} φ_ij(η) − 1 ) ] / [ min_{(i,j): p_ij > 0} ψ_min(i, j) ],

and

K∗ = [ (r2 β2/(γ2(α2 + β2))) Σ_{i=1}^{4} [ h_i/(η(r(i) − γc)) ] ( Σ_{j=1}^{4} φ_ij(η) − 1 ) ] / [ max_{(i,j): p_ij > 0} ψ_max(i, j) ],

with ψ_min(i, j) and ψ_max(i, j) derived using the values in Table 10.3.
TABLE 10.3
Values of ψ_max(i, j) and ψ_min(i, j)

IFR & r(i) > γc:   ψ_max(i, j) = φ̃ij(−η(r(i) − γc)) τi hi / (pij pi);   ψ_min(i, j) = τi hi λij(∞) / (pi(λij(∞) − η(r(i) − γc)))
IFR & r(i) ≤ γc:   ψ_max(i, j) = τi hi λij(∞) / (pi(λij(∞) − η(r(i) − γc)));   ψ_min(i, j) = φ̃ij(−η(r(i) − γc)) τi hi / (pij pi)
DFR & r(i) > γc:   ψ_max(i, j) = τi hi λij(∞) / (pi(λij(∞) − η(r(i) − γc)));   ψ_min(i, j) = φ̃ij(−η(r(i) − γc)) τi hi / (pij pi)
DFR & r(i) ≤ γc:   ψ_max(i, j) = φ̃ij(−η(r(i) − γc)) τi hi / (pij pi);   ψ_min(i, j) = τi hi λij(∞) / (pi(λij(∞) − η(r(i) − γc)))

Source: Mahabhashyam, S. et al., Oper. Res., 56(3), 728, 2008. With permission.
Reference Notes
The main focus of this chapter was to determine approximations and bounds
for steady-state fluid levels in infinite-sized buffers. However, we did not
present the underlying theory of large deviations that enabled this. Inter-
ested readers can refer to Shwartz and Weiss [98] as well as Ganesh et al.
[38] for an excellent treatment of large deviations. The crucial point is that
the tail events are extremely rare and, in fact, only analytical models can be
used to estimate their probabilities suitably. There are simulation techniques
too but they are typically based on a change of measure argument follow-
ing the Radon–Nikodym theorem. Details regarding change of measures can
be found in textbooks such as by Ethier and Kurtz [33]. In fact, the bounds
described in this chapter are based on some exponential change of measure
arguments in Ethier and Kurtz [33].
The common theme in this chapter is the concept of effective band-
widths (also called effective capacity). The theoretical underpinnings for
effective bandwidth is based on large deviations and we briefly touched
upon the Gärtner–Ellis condition. Further details on the Gärtner–Ellis con-
ditions can be found in Kesidis et al. [61] and the references therein. An
excellent tutorial on effective bandwidths is Kelly [60]; it takes a somewhat different approach, defining the effective bandwidth at time t, whereas what we present is based on letting t → ∞.
[30], Kesidis et al. [61], and Kulkarni [68] that show how to compute effec-
tive bandwidths of several types of traffic flows. Further, Chang and Thomas
[16], Chang and Zajic [17], and de Veciana et al. [23] explain effective
bandwidth computations for outputs from queues and extend the results
to networks.
Once we know how to compute effective bandwidths, they can be used
for approximating buffer content distributions. Recall from Section 9.2.3
that buffer content distributions can be obtained only when the sources are
CTMCs (based on Anick et al. [5], Elwalid and Mitra [28, 29], and Kulkarni
[69]). The effective bandwidth approximation lends itself well for computing
the tail distributions. The results presented in this chapter (Section 10.2.1)
are summarized from Elwalid and Mitra [30], Kesidis et al. [61], Krishnan
et al. [65], and Kulkarni [68]. These results were fine-tuned by Elwalid et
al. [31, 32] by considering Chernoff bounds (hence the CDE approximation
in Section 10.2.2). Although effective bandwidth and CDE approximations
are mainly for the tail probabilities, exponential bounds on the buffer con-
tent analysis (called SMP bounds because these require the sources to be
Stochastic Fluid-Flow Queues: Bounds and Tail Asymptotics 687
SMPs) described in Section 10.2.3 are based on Palmowski and Rolski [87, 88]
and Gautam et al. [39].
These results can be suitably extended to multiclass fluid queues, where
each class of fluid has a dedicated buffer. Perhaps the most well-studied pol-
icy is the priority rule that gained popularity to aid differentiated services
and is based on its discrete counterpart. For a comprehensive study on effec-
tive bandwidths with priorities, see Berger and Whitt [9, 10], and Gautam
and Kulkarni [70]. Policies such as generalized processor sharing are consid-
ered in de Veciana et al. [22, 24]. The results on timed round-robin policy and
its comparison with static priority are based on Gautam and Kulkarni [40].
The analysis on threshold-based policies is based on Mahabhashyam et al.
[76] and Aggarwal et al. [3].
Exercises
10.1 Consider a fluid source driven by a three-state CTMC environ-
ment process {Z(t), t ≥ 0} with generator matrix and rate matrix
given by
Q = ⎡ −β    β      0  ⎤               ⎡ 0   0   0  ⎤
    ⎢  γ   −γ−δ    δ  ⎥   and   R =   ⎢ 0   r1  0  ⎥ .
    ⎣  0    α     −α  ⎦               ⎣ 0   0   r2 ⎦
Q = ⎡ −β1−β2    β2        β1         0       ⎤
    ⎢  α2      −α2−β1     0          β1      ⎥
    ⎢  α1       0        −α1−β2      β2      ⎥
    ⎣  0        α1        α2        −α1−α2   ⎦

and

R = ⎡ 0   0    0    0       ⎤
    ⎢ 0   r2   0    0       ⎥
    ⎢ 0   0    r1   0       ⎥ .
    ⎣ 0   0    0    r1+r2   ⎦
Note that the four states correspond to: both sources on, source-
1 off and 2 on, source-2 off and source-1 on, and both sources
off. Compute the effective bandwidth of this source, call it eb(v).
(c) Show that the algebraic expression for eb(v) is identical to the
effective bandwidth of the net input to the buffer eb1 (v) + eb2 (v).
10.3 Consider a single buffer fluid model with input from an on-off
source with hyperexponential on-time CDF (for x ≥ 0)
When the source is on, fluid enters the buffer at rate r = 3 Mbps and
when the source is off, no fluid enters. The output channel capacity
c = 2 Mbps.
(a) Compute Ũ(s) and D̃(s), the LSTs of U(x) and D(x), respec-
tively.
(b) The tail probability of the limiting buffer contents P{X > x} for
very large x can be obtained using effective bandwidths as
eb(v) = lim_{n→∞} (1/(vn)) log E{exp(v An)}.
1 and standard deviation 0.5/√i. Graph the effective bandwidth of
this source eb(v) versus v for v ∈ [0, 2].
10.8 Recall the in-tree network in Figure 10.10 considered in Problem
108. It is desired that the probability of exceeding buffer level of
b = 14 kB must be less than 0.000001 in all seven buffers. Using effective bandwidth approximation, design the smallest output capacity
cj for j = 1, . . . , 7 to achieve such a quality of service. Use the same
numerical values of α = 5, β = 1, and r = 6.
10.9 Consider Problem 110 and obtain bounds for P(X > x) using SMP
bounds. Compare the results against those based on CDE approxi-
mation described in Problem 110.
10.10 Solve Problem 117 using CDE approximation. In particular, using
CDE approximation, obtain expressions for P(Xj > Bj ) for all
j = 1, . . . , N. Then graph the fraction of fluid lost assuming that the
size of the buffer is Bj . Compare against SMP bounds and effective
bandwidth approximation results presented in Problem 117.
10.11 Consider a static priority policy to empty fluids from three buffers
with buffer-1 given the highest priority. Into buffer i, fluid enters
from a general on-off source with on-time CDF pi (1 − e−3t ) + (1 −
pi )(1 − e−4t ) and off-time CDF 1 − e−t − te−t for i = 1, 2, 3. Also,
p1 = 0.5, p2 = 0.4, and p3 = 0.2. Traffic is generated at the rate of 2
per unit time when any source is on. The output capacity c = 1.8.
Using effective bandwidth approximations determine expressions
for the probability that each of the buffers would exceed a level x
in steady state.
10.12 For the setting in Problem 123, assume that there is a cost of
Cs to switch from one buffer to another. What is the optimal
value of a that would minimize the long-run average cost per
unit time subject to satisfying the constraints that P(Xj > Bj) < εj for j = 1, 2. Use UB to ensure the constraint is satisfied. Illustrate the optimal solution for the following numerical values: β1 = 2, α1 = 8, r1 = 2.645, β2 = 3, α2 = 9, r2 = 1.87, c = 1.06, and Cs = 100. Also, B1 = 2.5, B2 = 8, ε1 = 0.001, and ε2 = 0.01.
Appendix A: Random Variables
FX (x) = P{X ≤ x}
for all x ∈ (−∞, ∞). In this book, we drop the subscript X from FX(x) and simply write the CDF as F(x), especially when only one random variable is being considered. There are two basic types of random variables: discrete and continuous.
Discrete random variables are defined on a set of discrete points on the real line, whereas continuous random variables are defined on a set of open intervals of the real line and not on any discrete points. Of course, there is
a class of random variables called mixture or hybrid random variables that
are a combination of discrete and continuous random variables. The CDF of such a random variable has jumps (or discontinuities) at the discrete points. Let D be the set of discrete points for a mixture random variable X, and for all x ∈ D, let px = P{X = x} be the magnitude of the jump at x.
We proceed with a generic mixture random variable with the understanding that both continuous and discrete random variables are special cases corresponding to D = ∅ and Σ_{x∈D} px = 1, respectively. Thus, let X be a mixture random variable with CDF F(x) and discrete-point set D. The
expected value of X is defined as
E[X] = ∫_{−∞}^{∞} x dF(x) + Σ_{x∈D} x px
where the integral is Riemann type (however, it is indeed derived using the
Lebesgue integral). We present an example to illustrate.
Problem 125
Let X be a random variable that denotes Internet packet size for TCP trans-
missions (in bytes). On the basis of empirical evidence, say the CDF of X is
modeled as
F(x) = 0 if x < 40;  a√x + b if 40 ≤ x < 576;  0.0001x + 0.6424 if 576 ≤ x < 1500;  1 if x ≥ 1500,

where
a = 0.3/(24 − √40)
b = 0.25 − a√40
Notice that there are jumps or discontinuities in the CDF at 40, 576, and
1500 bytes. Hence, D = {40, 576, 1500} with px given by 0.25, 0.15, and 0.2076
for x = 40, x = 576, and x = 1500, respectively. Compute E[X].
Solution
Using the definition
E[X] = ∫_{−∞}^{∞} x dF(x) + Σ_{x∈D} x px

with D = {40, 576, 1500} and px given in the problem statement, we have

E[X] = ∫_{−∞}^{40} x dF(x) + ∫_{40}^{576} x dF(x) + ∫_{576}^{1500} x dF(x) + ∫_{1500}^{∞} x dF(x) + 40 p40 + 576 p576 + 1500 p1500

     = 0 + ∫_{40}^{576} 0.5 a √x dx + ∫_{576}^{1500} 0.0001 x dx + 0 + 40(0.25) + 576(0.15) + 1500(0.2076)

     = 580.4901 bytes.
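The arithmetic above is easy to verify with a short script; the antiderivatives of the two continuous pieces are written out in the comments.

```python
import math

# Numerical check of E[X] = 580.4901 bytes in Problem 125. The continuous
# part of the CDF has density 0.5*a/sqrt(x) on (40, 576) and 0.0001 on
# (576, 1500); the jumps at 40, 576, 1500 have sizes 0.25, 0.15, 0.2076.
a = 0.3 / (24 - math.sqrt(40))

# int_40^576 x * 0.5*a*x^(-1/2) dx = (a/3) * (576^1.5 - 40^1.5)
piece1 = (a / 3) * (576 ** 1.5 - 40 ** 1.5)
# int_576^1500 x * 0.0001 dx = 0.00005 * (1500^2 - 576^2)
piece2 = 0.00005 * (1500 ** 2 - 576 ** 2)
# Discrete part: sum over the jump points of x * p_x.
discrete = 40 * 0.25 + 576 * 0.15 + 1500 * 0.2076

EX = piece1 + piece2 + discrete
```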
Similarly, the rth moment of X is

E[X^r] = ∫_{−∞}^{∞} x^r dF(x) + Σ_{x∈D} x^r px
2. Bernoulli distribution
• Description: A Bernoulli trial can result in a success with
probability p and a failure with probability q with q = 1 − p.
Then the random variable X, which takes on 0 if the trial is
a failure and 1 if the trial is a success, is called the Bernoulli
random variable with parameter p.
• PMF:

p(x) = p^x q^(1−x),  x = 0, 1.

• Mean:

E[X] = p.

• Variance:

V[X] = pq.
3. Binomial distribution
• Description: A Bernoulli trial can result in a success with
probability p and a failure with probability q with q = 1 − p.
Then the random variable X, the number of successes in n
independent Bernoulli trials is called the binomial random
variable with parameters n and p.
• PMF:
p(x) = (n choose x) p^x q^(n−x),  x = 0, 1, 2, . . . , n.
• Mean:
E[X] = np.
• Variance:
V[X] = npq.
4. Geometric distribution
• Description: A Bernoulli trial can result in a success with
probability p and a failure with probability q with q = 1 − p.
Then the random variable X, denoting the number of
Bernoulli trials until a success is obtained is the geometric
random variable with parameter p.
• PMF:
p(x) = p q^(x−1),  x = 1, 2, . . . .
• Mean:

E[X] = 1/p.

• Variance:

V[X] = (1 − p)/p².
5. Negative binomial distribution
• Description: A Bernoulli trial can result in a success with probability p and a failure with probability q with q = 1 − p. Then the random variable X, denoting the number of Bernoulli trials until the kth success is obtained, is the negative binomial random variable with parameters k and p.
• PMF:

p(x) = (x−1 choose k−1) p^k q^(x−k),  x = k, k + 1, . . . .

• Mean:

E[X] = k/p.

• Variance:

V[X] = k(1 − p)/p².
6. Hypergeometric distribution
• Description: A random sample of size n is selected without
replacement from N items. Of the N items, k may be classi-
fied as successes and N − k are classified as failures. The
number of successes, X, in this random sample of size n is
a hypergeometric random variable with parameters N, n,
and k.
• PMF:

p(x) = (k choose x)(N−k choose n−x) / (N choose n),  x = 0, 1, 2, . . . , n.
• Mean:

E[X] = nk/N.

• Variance:

V[X] = n ((N − n)/(N − 1)) (k/N)(1 − k/N).
7. Poisson distribution
• Description: X is a Poisson random variable with parameter λ if its PMF is given by

p(x) = e^(−λ) λ^x / x!,  x = 0, 1, 2, . . . .
• Mean:
E[X] = λ.
• Variance:
V[X] = λ.
8. Zipf distribution
• Description: A random variable X with Zipf distribution taking on values 1, 2, . . ., n has a PMF of the form

p(x) = (1/x^s) / (Σ_{i=1}^{n} 1/i^s),  x = 1, 2, . . . , n.

• Mean:

E[X] = (Σ_{i=1}^{n} 1/i^(s−1)) / (Σ_{i=1}^{n} 1/i^s).

• Variance:

V[X] = (Σ_{i=1}^{n} 1/i^(s−2)) / (Σ_{i=1}^{n} 1/i^s) − {E[X]}².
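The Zipf expressions are easy to check numerically; the sketch below uses the arbitrary illustration values n = 10 and s = 2.

```python
# Verifying the Zipf PMF, mean, and variance formulas numerically, with
# the arbitrary illustration values n = 10 and s = 2.
n, s = 10, 2.0
norm = sum(1.0 / i ** s for i in range(1, n + 1))
pmf = [(1.0 / x ** s) / norm for x in range(1, n + 1)]

support = list(range(1, n + 1))
mean_direct = sum(x * p for x, p in zip(support, pmf))
var_direct = sum(x * x * p for x, p in zip(support, pmf)) - mean_direct ** 2

# Formulas in terms of power sums:
mean_formula = sum(1.0 / i ** (s - 1) for i in range(1, n + 1)) / norm
var_formula = (sum(1.0 / i ** (s - 2) for i in range(1, n + 1)) / norm
               - mean_formula ** 2)
```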
It is worthwhile for the reader to verify that the PMF properties are satisfied, as well as to verify the E[X] and V[X] expressions for the various discrete random variables. Note that, unlike the other distributions, the Poisson and Zipf distributions are not described via a random experiment. With these few examples, we move to continuous distributions.
P{x1 < X < x2} = ∫_{x1}^{x2} f(x) dx = F(x2) − F(x1).
E[X^r] = ∫_{−∞}^{∞} x^r f(x) dx.
• CDF:

F(x) = 1 − e^(−λx) for x > 0, and F(x) = 0 elsewhere.
• Mean:

E[X] = 1/λ.

• Variance:

V[X] = 1/λ².
f(x) = λ e^(−λx) (λx)^(k−1)/(k − 1)! for x > 0, and f(x) = 0 elsewhere.
• CDF:

F(x) = 1 − Σ_{r=0}^{k−1} e^(−λx) (λx)^r / r! for x > 0, and F(x) = 0 elsewhere.
• Mean:

E[X] = k/λ.

• Variance:

V[X] = k/λ².
f(x) = x^(α−1) e^(−x/β) / (β^α Γ(α)) for x > 0, and f(x) = 0 elsewhere,

where Γ(α) = ∫_0^∞ x^(α−1) e^(−x) dx. If α is an integer, then Γ(α) = (α − 1)!.
• CDF: There is no closed-form expression for the CDF in
the generic case (exception is when α is an integer). For
numerical values, use tables or software packages.
• Mean:

E[X] = αβ.

• Variance:

V[X] = αβ².
• CDF:

F(x) = 0 if x < a;  F(x) = (x − a)/(b − a) if a ≤ x ≤ b;  F(x) = 1 if x > b.
• Mean:

E[X] = (a + b)/2.

• Variance:

V[X] = (b − a)²/12.
• CDF:

F(x) = 1 − e^(−αx^β) for x > 0, and F(x) = 0 elsewhere.

• Mean:

E[X] = α^(−1/β) Γ(1 + 1/β).
f(x) = (1/(σ√(2π))) e^(−(1/2)((x−μ)/σ)²),  −∞ < x < ∞.

• Mean:

E[X] = μ.

• Variance:

V[X] = σ².
• Mean:

E[X] = v.

• Variance:

V[X] = 2v.
• Variance:

V[X] = e^(2μ+σ²) (e^(σ²) − 1).
• Variance:

V[X] = αβ / ((α + β)²(α + β + 1)).
• CDF:

F(x) = 1 − (K/x)^β for x ≥ K > 0, and F(x) = 0 elsewhere.
• Mean:

E[X] = Kβ/(β − 1) if β > 1, while E[X] = ∞ if β ≤ 1.

• Variance:

V[X] = K²β/((β − 1)²(β − 2)) if β > 2, while V[X] = ∞ if β ≤ 2.
It is worthwhile to verify for each of the distributions that the PDF properties are satisfied. Also, compute E[X] and V[X] using the definitions and verify the expressions given for the various distributions. Before wrapping up, we briefly mention another metric that is frequently used in queueing called the coefficient of variation (COV). For a random variable X with nonzero mean, the COV is the ratio of the standard deviation to the mean, that is, COV[X] = √V[X]/E[X].
For example, the notion of COV does not exist for a normal random
variable, say with mean 0 and variance 1.
• Exponential is not the only distribution with a COV of 1. For example, a Pareto random variable with β = 1 + √2 and any K > 0 has a COV of 1. It is incorrect to use M/G/1 results for a G/G/1 queue with
COV of arrivals equal to 1. The results will match only when the
interarrival times are exponential.
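As a quick check of the first claim, the sketch below evaluates the COV of a Pareto random variable from the mean and variance formulas given earlier (K = 1 is an arbitrary scale choice).

```python
import math

# COV of a Pareto random variable from the mean and variance formulas:
# E[X] = K*beta/(beta-1) and V[X] = K^2*beta/((beta-1)^2*(beta-2)),
# valid for beta > 2. K = 1 is an arbitrary scale choice.
def pareto_cov(beta, K=1.0):
    mean = K * beta / (beta - 1)
    var = K * K * beta / ((beta - 1) ** 2 * (beta - 2))
    return math.sqrt(var) / mean

cov_at_critical = pareto_cov(1 + math.sqrt(2))  # should equal 1
```

For β > 1 + √2 the COV drops below 1, consistent with the hazard rate discussion that follows.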
• The relationship between COV and the hazard rate function of a
positive-valued continuous random variable needs to be stated very
carefully. First, let us define the hazard (or failure) rate function h(x)
h(x) = f(x)/(1 − F(x))
for all x where f (x) > 0. Several references (e.g., Tijms [102] on page
438 and Wierman et al. [107] in Lemma 1) state that for a positive-
valued continuous random variable X with hazard rate function
h(x), if h(x) is increasing with x, then COV[X] ≤ 1, and if h(x) is decreasing with x, then COV[X] ≥ 1. The preceding result is extremely useful
and intuitive, but one has to be careful to interpret it and use it. For
example, if one considers a Pareto random variable, h(x) is decreasing for all x ≥ K, but if β > 1 + √2, then the COV is less than 1.
Although it appears to be contradicting the preceding result, if one
defines h(x) for all x ≥ 0, not just x ≥ K where f (x) > 0 for Pareto, then
h(x) is not decreasing for all x > 0 and the preceding result is valid.
Also, it is crucial to realize that the result goes only in one direction
(i.e., if h(x) is increasing or decreasing for all x ≥ 0, then COV would
be <1 or >1). Knowing the COV does not reveal the monotonicity of
the hazard rate function.
Ψ(z) = p0 + p1 z + p2 z² + · · · = Σ_{j=0}^{∞} pj z^j = E[z^X].
Ψ(1) = 1,  Ψ′(1) = E[X],  and  V[X] = Ψ′′(1) + Ψ′(1) − [Ψ′(1)]²,

where Ψ′(z) and Ψ′′(z) are the first and second derivatives of Ψ(z) with respect to z. Also, in many instances, when we obtain Ψ(z), there would be an unknown parameter that can be resolved by using the result Ψ(1) = 1.
A few other properties of a GF are described as follows:

P{X = k} = (1/k!) d^kΨ(z)/dz^k |_{z=0},

E[X(X − 1) · · · (X − r + 1)] = d^rΨ(z)/dz^r |_{z=1}  (the rth factorial moment),

lim_{k→∞} (1/k) Σ_{i=0}^{k} P{X = i} = lim_{z→1} (1 − z)Ψ(z).
See Chapter 2 for several examples of GFs used in contexts of queues. Next,
we move to continuous random variables, which can also be described using
GFs; however, for the purposes of this book, we mainly use transforms.
F̃X(s) = E[e^(−sX)] = ∫_0^∞ e^(−sx) dFX(x) = ∫_0^∞ e^(−sx) fX(x) dx
where fX (x) is the PDF of the random variable X. However, if X has a mixture
of discrete and continuous parts, then E[e−sX ] computation must be suitably
adjusted as described in Section A.1. For the remainder of this section, we
will assume X is continuous without any discrete parts.
Some examples of continuous random variables where LST of their
CDF can be computed are as follows: If X ∼ exp(λ), then F̃X (s) = λ/(λ + s);
if X ∼ Erlang(λ, k), then F̃X (s) = (λ/(λ + s))k ; if X ∼ Unif (0, 1), then F̃X (s) =
(1 − e^(−s))/s. Similar to the PDF and CDF, for the LST too we drop the X and simply write F̃(s).
A few properties of LSTs are described as follows:
E[X^r] = (−1)^r d^r F̃(s)/ds^r |_{s=0}.
3. The following properties of LSTs can be extended to any function
F(x) defined for x ≥ 0 (not just CDFs):
a. Let F, G, and H be functions with nonnegative domain and range.
Further, for scalars a and b, let H(x) = aF(x) + bG(x). Then
H̃(s) = aF̃(s) + bG̃(s).
b. Let F(x), G(x), and H(x) be functions of x with nonnegative
domain and range such that F(0) = G(0) = H(0) = 0. In addition,
assume that F(x), G(x), and H(x) either grow slower than esx or
are bounded. Letting

H(x) = ∫_0^x F(x − u) dG(u) = ∫_0^x G(x − u) dF(u),

we have H̃(s) = F̃(s)G̃(s).
c. Final value result:

lim_{t→∞} F(t)/t = lim_{s→0} s F̃(s).
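The convolution property can be illustrated numerically: the convolution of two exp(λ) CDFs is the Erlang(λ, 2) CDF, so its LST should equal (λ/(λ + s))². The sketch below checks this at one arbitrary point s by approximating the LST integral with a Riemann sum.

```python
import math

# The convolution of two exp(lam) CDFs is the Erlang(lam, 2) CDF, so its
# LST should equal the product of the individual LSTs, (lam/(lam+s))^2.
# We approximate int_0^inf e^{-s x} dH(x) with a Riemann sum over the
# Erlang(lam, 2) density h(x) = lam^2 * x * e^{-lam x}.
lam, s = 2.0, 1.5  # arbitrary illustration values

def erlang2_pdf(x):
    return lam * lam * x * math.exp(-lam * x)

dx = 1e-4
numeric_lst = sum(math.exp(-s * i * dx) * erlang2_pdf(i * dx) * dx
                  for i in range(1, 200000))
product_of_lsts = (lam / (lam + s)) ** 2
```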
LSTs are used throughout this book starting with Chapter 2. Examples of
their use can be found there. The main purpose is that in many instances the transform is much easier to work with than the distribution itself. Closely related to the LST is the Laplace transform (LT) of a function F(x) defined for x ≥ 0:

F∗(s) = ∫_0^∞ e^(−sx) F(x) dx,

provided F(0) = 0 and F(x) either grows slower than e^(sx) or is bounded.
A few properties of LTs are described as follows: if R(x) = ∫_0^x F(u) du, then

R∗(s) = F∗(s)/s,

and if T(x) = x^n F(x), then

T∗(s) = (−1)^n d^n F∗(s)/ds^n.
This result is essentially the law of total probability that is typically dealt with in an elementary probability book or course. We illustrate this expression using
some examples. In many texts one would find examples where X and Y are
either both discrete or both continuous and we encourage the readers to refer
to them. However, we present examples where one of X or Y is discrete and
the other is continuous.
Problem 126
A call center receives calls from three classes of customers. The service times
are class-dependent: for class-1 calls they are exponentially distributed with
mean 3 min; for class-2 calls they are according to an Erlang distribution
with mean 4 min and standard deviation 2 min; and for class-3 calls they
are according to a uniform distribution between 2 and 5 min. Compute the
distribution of the service time of an arbitrary caller if we know that the
probability the caller is of class i is i/6 for i = 1, 2, 3.
Solution
Let Y be a continuous time random variable denoting the service time in
minutes of the arbitrary caller and X be a discrete random variable denot-
ing the class of that caller. From the problem statement, we know that
P(X = 1) = 1/6, P(X = 2) = 1/3, and P(X = 3) = 1/2. We also can write down
(based on the problem description) that for any y ≥ 0,
FY(y) = (1/6)(1 − e^(−y/3)) + (1/3)[1 − (1 + y + y²/2 + y³/6) e^(−y)] + (1/2) min{(y − 2)/3, 1} I(y > 2)
for all y ≥ 0.
In general, if Y is discrete, then

pY(y) = P{Y = y} = Σ_x P{Y = y | X = x} pX(x)  if X is discrete, and
pY(y) = P{Y = y} = ∫_{−∞}^{∞} P{Y = y | X = x} fX(x) dx  if X is continuous.
Problem 127
The price of an airline ticket on a given day is modeled as a continuous ran-
dom variable X, which is according to a Pareto distribution with parameters
K and β (where β is an integer greater than 1 in this problem). The demand
for leisure tickets during a single day follows a Poisson distribution with
parameter C/X. What is the probability that the demand for leisure tickets
on a given day is r?
Solution
Let Y be a random variable that denotes the demand for leisure tickets on a
given day. We want P{Y = r}. To compute that, we use the fact that we know
P{Y = r | X = x} = e^(−C/x) (C/x)^r / r!.
Also, the PDF of X is fX(x) = βK^β / x^(β+1) for x ≥ K.
Then, unconditioning,

P{Y = r} = ∫_{−∞}^{∞} P{Y = r | X = x} fX(x) dx

         = ∫_K^∞ e^(−C/x) ((C/x)^r / r!) (βK^β / x^(β+1)) dx

         = ∫_0^{1/K} e^(−Ct) ((Ct)^r / r!) βK^β t^(β−1) dt   (substituting t = 1/x)

         = ((r + β − 1)!/r!) (βK^β/C^β) ∫_0^{1/K} e^(−Ct) C (Ct)^(r+β−1)/(r + β − 1)! dt

         = ((r + β − 1)!/r!) (βK^β/C^β) (1 − e^(−C/K) Σ_{j=0}^{r+β−1} (C/K)^j / j!).
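A numerical sanity check of this closed form, with the arbitrary illustration values K = 1, β = 3, C = 2, and r = 2, comparing against direct numerical integration:

```python
import math

# Check of the closed-form P{Y = r} against direct numerical integration
# of P{Y = r | X = x} * f_X(x) over [K, infinity). Illustration values
# K = 1, beta = 3, C = 2, r = 2 are arbitrary (beta must be an integer).
K, beta, C, r = 1.0, 3, 2.0, 2

def integrand(x):
    return (math.exp(-C / x) * (C / x) ** r / math.factorial(r)
            * beta * K ** beta / x ** (beta + 1))

# Midpoint rule on [K, K + 200]; the tail beyond is negligible here.
dx = 1e-3
numeric = sum(integrand(K + (i + 0.5) * dx) * dx for i in range(200000))

# Closed form derived above.
m = r + beta - 1
closed = (math.factorial(m) / math.factorial(r) * beta * K ** beta / C ** beta
          * (1 - math.exp(-C / K)
             * sum((C / K) ** j / math.factorial(j) for j in range(m + 1))))
```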
E[g(Y)] = E[E[g(Y)|X]].

Thus, we can easily obtain moments of Y, the LST of Y, etc. We illustrate that via a few examples.
Problem 128
The probability that a part produced by a machine is non-defective is p. By
conditioning on the outcome of the first part type (defective or not), compute
the expected number of parts produced till a non-defective one is obtained.
Solution
Let X be the outcome of the first part produced with X = 0 denoting a defec-
tive part and X = 1 denoting a non-defective part. Also, let Y be the number
of parts produced till a non-defective one is obtained. The question asks to
compute E[Y]. Although this problem can be solved by realizing that Y is a
geometrically distributed random variable with probability of success p, the
question specifically asks to condition on the outcome of the first part type.
Conditioning on X and then unconditioning,

E[Y] = E[Y | X = 1] P{X = 1} + E[Y | X = 0] P{X = 0} = (1)(p) + (1 + E[Y])(1 − p),

and by solving for E[Y] we get E[Y] = 1/p. This is consistent with the expected value of a geometric random variable with probability of success p.
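The first-step conditioning argument can be sketched as follows (p = 0.3 is an arbitrary illustration value):

```python
# First-step conditioning: E[Y] = p*1 + (1-p)*(1 + E[Y]), whose solution
# is E[Y] = 1/p. Checked against a truncated geometric series; p = 0.3
# is an arbitrary illustration value.
p = 0.3
EY = 1.0 / p

# E[Y] = sum_{k>=1} k * p * (1-p)^(k-1), truncated deep into the tail.
EY_sum = sum(k * p * (1 - p) ** (k - 1) for k in range(1, 2000))
```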
Problem 129
The average bus ride for Michelle from school to home takes b minutes; and it
takes her on average w minutes to walk from school to home. One day when
Michelle reached her bus stop to go home she found out that it would take a
random time for the next bus to arrive and that random time is according to
an exponential distribution with mean 1/λ minutes. Michelle decides to wait
for a maximum of t minutes at the bus stop and then walk home if the bus
does not arrive within t minutes. What is the expected time for Michelle to
reach home from the time she arrived at the bus stop? If Michelle would like
to minimize this, what should her optimal time t be?
Solution
Let X be the time the bus would arrive after Michelle gets to the bus stop
and Y be the time she would reach home from the time she arrived at the bus
stop. It is known that X is exponentially distributed with parameter λ. Also,
E[Y | X = x] = x + b if x ≤ t, and E[Y | X = x] = t + w if x > t,
since x ≤ t implies Michelle would ride the bus and vice versa. Thus, by
unconditioning, we have
E[Y] = ∫_0^∞ E[Y | X = x] λe^(−λx) dx

     = ∫_0^t (x + b) λe^(−λx) dx + ∫_t^∞ (t + w) λe^(−λx) dx

     = (1/λ)(1 − e^(−λt) − λt e^(−λt)) + b(1 − e^(−λt)) + (t + w) e^(−λt).
Problem 130
The orders received for grain by a farmer add up to X tons, where X is an
exponential random variable with mean 1/β tons. Every ton of grain sold
brings a profit of p, and every ton that is not sold is destroyed at a loss of
l. Let T be the tons of grains produced by the farmer, which is according to
an Erlang distribution with parameters α and k. Any portion of orders that
are not satisfied are lost without any penalty cost. What is the expected net
profit for the farmer?
Solution
Let R be the net profit for the farmer, which is a function of X. The expected
net profit conditioned on T is
E[R|T] = ∫_0^T [px − l(T − x)] βe^(−βx) dx + ∫_T^∞ pT βe^(−βx) dx

       = ((p + l)/β)(1 − e^(−βT) − βT e^(−βT)) − lT(1 − e^(−βT)) + pT e^(−βT)

       = ((p + l)/β)(1 − e^(−βT)) − lT.
Unconditioning,

E[R] = E[E[R|T]] = ((p + l)/β)(1 − E[e^(−βT)]) − l E[T].

Since T ∼ Erlang(α, k), we have E[e^(−βT)] = (α/(α + β))^k and E[T] = k/α.
Notice that from the last two problems, we obtain some intriguing results mainly due to properties of the exponential distribution. In that light, we next recapitulate the exponential distribution and its properties.
A.4.1 Characteristics
Before describing the properties of exponential random variables, we first
recapitulate their characteristics so that they are in one location for easy
reference. A nonnegative continuous random variable X is distributed exponentially with parameter λ if any of the following can be shown to hold: the CDF is F(x) = 1 − e^(−λx) for x ≥ 0; the PDF is f(x) = λe^(−λx) for x ≥ 0; or the LST of the CDF is F̃(s) = λ/(λ + s).
In other words, the three are equivalent. In fact in many instances, show-
ing the LST form is simpler than showing the CDF or PDF. Now, if X is
an exponentially distributed random variable with parameter λ, then we
symbolically state that as X ∼ exp(λ).
Another useful result to remember is that P{X > x} = e−λx for x ≥ 0. In
fact the hazard rate function (defined as fX (x)/(1 − FX (x)) for any nonnega-
tive random variable X) of the exponential random variable with parameter
λ is indeed λ. Further, in terms of moments, the expected value of X is E[X] = 1/λ and the variance of X is V[X] = 1/λ². Thus, the COV is 1 for
the exponential random variable. Next, we describe some useful properties.
A.4.2 Properties
The following is a list of useful properties of the exponential random vari-
able. They are presented without derivation. Interested readers are encour-
aged to refer to standard texts on probability and stochastic processes such
as Kulkarni [67].
The LST can be inverted and the CDF obtained in closed form in two cases: (1) when all the αi values are equal, which results in the Erlang distribution, and (2) when all the αi values are different, which results in the hypoexponential distribution.
X̄n = (X1 + X2 + · · · + Xn)/n
for any n. Then based on the results in the previous paragraph, we have
E[X̄n] = τ  and  V[X̄n] = σ²/n.
X̄n → τ as n → ∞
Sn = X1 + · · · + Xn
N(t) = max{n ≥ 0 : Sn ≤ t}
P{N(t) = k} = e^(−λt) (λt)^k / k!
P{N(t + s) − N(s) = k} = e^(−λt) (λt)^k / k!
for any t > 0 and s > 0. Notice that it is identical to P{N(t) = k}.
• Independent increments: If {N(t), t ≥ 0} is a PP(λ), 0 ≤ t1 ≤ t2 ≤ · · · ≤
tn are fixed real numbers, and 0 ≤ k1 ≤ k2 ≤ · · · ≤ kn are fixed
integers, then
Λ(t) = ∫_0^t λ(u) du.

P{N(t + s) − N(s) = k} = exp{−[Λ(t + s) − Λ(s)]} [Λ(t + s) − Λ(s)]^k / k!,

where, similar to the regular Poisson process, N(u) is the number of events that occur in time (0, u). Also, E[N(t)] = Λ(t).
The concept of batch or bulk events can be modeled using CPP, which
essentially is the same as a regular Poisson process with the exception
that with every event, the counting process need not increase by one. Let
{N(t), t ≥ 0} be a PP(λ). Let {Zn , n ≥ 1} be a sequence of IID random variables
that is also independent of {N(t), t ≥ 0}. Define
Z(t) = Σ_{n=1}^{N(t)} Zn.

Then {Z(t), t ≥ 0} is called a compound Poisson process, and

E[Z(t)] = λt E[Z1],
Var[Z(t)] = λt E[Z1²].
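A small simulation check of these two moment formulas, with the assumed values λ = 3, t = 10, and Zn ∼ Uniform(0, 1) (so E[Z1] = 1/2 and E[Z1²] = 1/3):

```python
import random

# Simulation check of E[Z(t)] = lam*t*E[Z1] and Var[Z(t)] = lam*t*E[Z1^2]
# for a compound Poisson process. Assumed illustration: lam = 3, t = 10,
# and Zn ~ Uniform(0, 1), so E[Z1] = 1/2 and E[Z1^2] = 1/3.
random.seed(7)
lam, t, runs = 3.0, 10.0, 20000

samples = []
for _ in range(runs):
    # Count Poisson events in (0, t] by accumulating exponential gaps.
    n, s = 0, random.expovariate(lam)
    while s <= t:
        n += 1
        s += random.expovariate(lam)
    samples.append(sum(random.random() for _ in range(n)))

mean_hat = sum(samples) / runs
var_hat = sum((z - mean_hat) ** 2 for z in samples) / runs
mean_th = lam * t * 0.5   # lam * t * E[Z1]
var_th = lam * t / 3.0    # lam * t * E[Z1^2]
```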
Let pk(t) = P{N(t) = k}. Then the LST of pk(t) satisfies

p̃k(s) = ∫_0^∞ e^(−st) dpk(t) = (G̃(s))^k (1 − G̃(s)),
where G̃(s) = E[e−sYn ], the LST of the CDF of Yn . For example, if Yn ∼ exp(λ);
G(y) = P{Yn ≤ y} = 1 − e−λy for y ≥ 0. Also, G̃(s) = λ/(λ + s). In this case,
{N(t), t ≥ 0} process is indeed a Poisson process with parameter λ. One can
get by inverting the LST the following:
pk(t) = P{N(t) = k} = e^(−λt) (λt)^k / k!.
As another example, let Yn ∼ Erlang(m, λ); then G(y) = P{Yn ≤ y} = 1 − Σ_{r=0}^{m−1} e^(−λy) (λy)^r / r! for y ≥ 0. Also, G̃(s) = λ^m/(λ + s)^m. Clearly,

p̃k(s) = (λ/(λ + s))^(mk) − (λ/(λ + s))^(m(k+1)).
In general, using the LST and inverting it to get the distribution of N(t) is
tricky. However, there are several results that can be derived without need-
ing to invert. We present them here. For the remainder of this section, we
assume that E[Yn] = τ and V[Yn] = σ², such that both τ and σ² are finite. The main results (many of them being asymptotic) are as follows:
• Renewal function: Letting M(t) = E[N(t)], the LST satisfies

M̃(s) = G̃(s)/(1 − G̃(s)),

and the elementary renewal theorem states that

lim_{t→∞} M(t)/t = 1/τ.
The next set of results have their roots in reliability theory, which would
explain the terminology. We define the following variables: A(t) = t − SN(t) ,
which is the time since the previous event, in reliability this would be the age;
B(t) = SN(t)+1 − t, which is the time the next event would occur, in reliability
this is the remaining life; and C(t) = A(t) + B(t), which is the time between
the previous and the next events, that is, in reliability this would be total life.
It is possible to derive:
lim_{t→∞} P{B(t) ≤ x} = (1/τ) ∫_0^x [1 − G(u)] du,

lim_{t→∞} E[B(t)] = (τ² + σ²)/(2τ),

lim_{t→∞} E[A(t)] = (τ² + σ²)/(2τ).
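These limits can be checked by simulation. The sketch below uses Erlang(2, 1) interevent times (so τ = 2, σ² = 2, and the limiting mean remaining life is (τ² + σ²)/(2τ) = 1.5); the observation time and replication count are arbitrary choices.

```python
import random

# Simulation check of lim E[B(t)] = (tau^2 + sigma^2) / (2 tau) for a
# renewal process with Erlang(2, 1) interevent times, for which tau = 2
# and sigma^2 = 2, so the limit is 1.5. The observation time t_obs and
# the number of replications are arbitrary choices.
random.seed(42)
t_obs, runs = 200.0, 5000

def erlang2():
    # Sum of two independent exp(1) random variables.
    return random.expovariate(1.0) + random.expovariate(1.0)

total = 0.0
for _ in range(runs):
    s = 0.0
    while s <= t_obs:
        s += erlang2()
    total += s - t_obs  # remaining life B(t_obs)

b_mean = total / runs
limit = (2.0 ** 2 + 2.0) / (2 * 2.0)  # = 1.5
```

Note that b_mean is noticeably smaller than the naive guess τ/2 would suggest for the age; the inspection paradox is exactly what the (τ² + σ²)/(2τ) formula captures.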
Reference Notes
The contents of this chapter are a result of teaching various courses on probability and stochastic processes both at the undergraduate level and at the
graduate level. There are several excellent books on the topics covered in
this chapter. For example, the elementary material on probability, random
variables, and expectations can be found in Ross [93]. However, the nota-
tions used in this chapter and a majority of results are directly from Kulkarni
[67]. That would be a wonderful resource to look into the proofs and deriva-
tions for some of the results in this chapter. The notable exceptions not found
in either of those texts are as follows: the discussion on mixture distribu-
tions, coefficient of variation (COV), some special distributions, as well as
numerical inversion of LSTs and LTs. A description of relevant references
is provided for those topics. Finally, for a rigorous treatment of probability, yet not abstract, an excellent source is Resnick [90], which also nicely
explains topics such as law of large numbers and central limit theorem.
Exercises
A.1 Let X be a continuous random variable with PDF
fX(x) = (1/π) x sin x if 0 < x < π, and fX(x) = 0 otherwise.
Prove that
Γ(x) = ∫_0^∞ t^(x−1) e^(−t) dt.
Using the PDF of the normal distribution, show that Γ(1/2) = √π.
A.3 The conditional variance of X, given Y, is defined by
A.4 Suppose given a, b > 0, and let X, Y be two random variables with
values in Z+ (set of nonnegative integers) and R+ (set of nonnega-
tive real numbers), respectively. The joint distribution of X and Y is
characterized by
P[X = n, Y ≤ y] = ∫_0^y b e^(−(a+b)t) (at)^n / n! dt.
where Γ(a) = ∫_0^∞ x^(a−1) e^(−x) dx. Also, E[X] = α/(α + β). Bus B will
arrive at the same station at a random time uniformly distributed
between the arrival time of bus A and 11:00 a.m. Find the expected
value of the arrival time of bus B.
A.7 Two parts (call them A and B) are manufactured in parallel on two
machines (call them machine 1 and machine 2). The processing time
for part A on machine 1 is distributed exponentially with parame-
ter α. Similarly, part B takes an exp(β) amount of time to process
on machine 2. If the processing starts at the same time on both
machines, what is the expected time to complete processing of both
parts (i.e., the expected time for both machines to become idle)?
Hint: Let XA and XB be random variables denoting the time to pro-
cess jobs A and B on machines 1 and 2, respectively. Then define
Z = max(XA , XB ). Compute E[Z].
g(t) = λ³ t² e^(−λt) / 2,  t ≥ 0.
for any i ∈ S, j ∈ S, and n ≥ 0. Clearly, pij is the probability of going from state
i to state j from one observation to the next. The matrix of pij values is called
the transition probability matrix
P = [pij ]
(note that for the transition probability matrix, each row sums to
one)
6. Draw a transition diagram, that is, draw a directed network by
drawing the node set (the state space S) and the arcs (i, j) if pij > 0,
with arc cost pij for all i ∈ S and j ∈ S
Problem 131
Packets arriving at a router are classified into two types: real-time (RT) and
non-real-time (NR) packets. An RT packet follows another RT packet with
probability 0.7 (therefore, the probability of an NR packet following an RT
packet is 0.3). Similarly, an NR packet follows another NR packet with prob-
ability 0.6 (therefore, the probability of an RT packet following an NR packet
is 0.4). Model the type of packets arriving at a router as a DTMC.
Solution
Let Xn denote the type of the nth packet (RT or NR) arriving at the router.
Clearly, Xn can take only one of two values, RT or NR. Thus the state space
is S = {RT, NR}. From the problem description to predict the type of the
next packet, we only need to know the type of the current packet but noth-
ing about the history. Also, the transition probabilities are time-invariant.
Therefore, Markov and time-homogeneity properties are satisfied.
Now we can write down the transition probability matrix as follows:
         RT    NR
P = RT [ 0.7   0.3 ]
    NR [ 0.4   0.6 ]
Thus the probability of the next packet being RT given the current one is RT is 0.7, which is the northwest corner of the P matrix. Notice the rows adding to one. We can also draw the transition diagram as described in Figure B.1.
Thus the system is modeled as a DTMC.
This is perhaps one of the simplest examples of a DTMC with two states.
Next we state a slightly bigger example.
Problem 132
Consider three cell-phone companies A, B, and C. Every time a sale
is announced, a thrifty graduate student switches from one company to
another. If the student is with company A before a sale, he switches to B
or C with probability 0.4 or 0.3, respectively. Likewise, if he is with B, he switches to A or C with probability 0.5 or 0.2, respectively; and if he is with C, he switches to A or B with probability 0.4 or 0.1, respectively. Model the cell-phone company the student is with as a DTMC.
FIGURE B.1
Transition diagram for the RT/NR problem.
FIGURE B.2
Transition diagram for the cell-phone switching problem.
Solution
The state space is S = {A, B, C} and the transition probability matrix is

        A     B     C
P = A [ 0.3   0.4   0.3 ]
    B [ 0.5   0.3   0.2 ]
    C [ 0.4   0.1   0.5 ]
Also, the transition diagram can easily be drawn as shown in Figure B.2.
Thus the system is modeled as a DTMC.
Next we present a case where the state space has infinitely many elements.
Problem 133
Consider a time division multiplexer from which packets are transmitted at
times 0, 1, 2, etc. Packets arriving between time n and n + 1 have to wait
until time n + 1 to be transmitted. However, at most one packet can be trans-
mitted at a time. Let Yn be the number of packets that arrive during time n
to n + 1. Assume that ai = P{Yn = i}. Model the number of packets awaiting
transmission as a DTMC.
Solution
Let Xn be the number of packets awaiting transmission just before time n
(i.e., just before an opportunity to transmit). Clearly, S = {0, 1, 2, . . .}. Based on the problem description, Xn+1 = max(Xn − 1, 0) + Yn. Hence, for i = 0, p0j = P{Yn = j} = aj for all j ≥ 0. Similarly, for i ≥ 1, pij = P{Yn = j − i + 1} = a_{j−i+1} for all j ≥ i − 1, and pij = 0 otherwise.
Notice in this example that we did not provide the transition diagram.
This is fairly typical since there is a one-to-one correspondence between the
transition probability matrix and the transition diagram. See the exercises for
more example problems as well as Chapter 4.
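Since Xn+1 = max(Xn − 1, 0) + Yn, the transition probabilities are p0j = aj and pij = a_{j−i+1} for i ≥ 1. The sketch below builds a truncated version of this (infinite) matrix; the arrival PMF aj is taken to be Poisson(0.8) purely for illustration.

```python
import math

# Truncated transition probability matrix for the time division
# multiplexer: X_{n+1} = max(X_n - 1, 0) + Y_n, so p_{0j} = a_j and
# p_{ij} = a_{j-i+1} for i >= 1. The arrival PMF a_j is assumed to be
# Poisson(0.8) purely for illustration; N truncates the infinite state
# space.
N, lam = 20, 0.8
a = [math.exp(-lam) * lam ** j / math.factorial(j) for j in range(N + 1)]

P = [[0.0] * (N + 1) for _ in range(N + 1)]
for j in range(N + 1):
    P[0][j] = a[j]              # empty buffer: next state is just Y_n
for i in range(1, N + 1):
    for j in range(i - 1, N + 1):
        P[i][j] = a[j - i + 1]  # one packet leaves, Y_n packets arrive
```

Because of the truncation, rows only sum to 1 up to the (here negligible) Poisson tail mass beyond state N.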
S = {A, B, C}
and
P = [ 0.3   0.4   0.3 ]
    [ 0.5   0.3   0.2 ]
    [ 0.4   0.1   0.5 ]
If at time 0 the student is with cell-phone company B, then the probability that the student is with company C after the third sale (i.e., n = 3) can be computed as follows. First,
P³ = [ 0.3860   0.2770   0.3370 ]
     [ 0.3930   0.2760   0.3310 ]
     [ 0.3870   0.2590   0.3540 ]
Thus we have P{X3 = C | X0 = B} = 0.3310, which is the element corresponding to row B and column C (second row and third column).
Continuing with this example, notice that as n → ∞,
Pⁿ → [ 0.3882   0.2706   0.3412 ]
     [ 0.3882   0.2706   0.3412 ]
     [ 0.3882   0.2706   0.3412 ]
It appears as though in the long run, the state of the DTMC is independent
of its initial state. In other words, irrespective of which phone company the
graduate student started with, he/she would eventually be with A, B, or C
with probability 0.3882, 0.2706, and 0.3412, respectively. Notice how the P∞
matrix has all identical rows. This is the focus of the following section.
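Both computations, P³ and the convergence of Pⁿ, are easy to reproduce with plain matrix multiplication:

```python
# Reproducing the P^3 computation and the convergence of P^n for the
# cell-phone DTMC with plain 3x3 matrix multiplication.
P = [[0.3, 0.4, 0.3],
     [0.5, 0.3, 0.2],
     [0.4, 0.1, 0.5]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

P3 = matmul(matmul(P, P), P)  # P3[1][2] = P{X_3 = C | X_0 = B}

Pn = P
for _ in range(49):
    Pn = matmul(Pn, P)        # P^50: all rows essentially identical
```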
π = πP  and  Σ_{j∈S} πj = 1
Thus we have (πRT, πNR) = (4/7, 3/7). Therefore, in the long run, four-sevenths of the packets will be real-time and three-sevenths non-real-time.
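The balance equation behind this result is 0.3 πRT = 0.4 πNR (probability flow out of RT equals flow in), which together with πRT + πNR = 1 gives the answer; as a sketch:

```python
# Steady-state probabilities for the two-state RT/NR chain: the balance
# equation 0.3*pi_RT = 0.4*pi_NR together with pi_RT + pi_NR = 1.
p_rt_to_nr, p_nr_to_rt = 0.3, 0.4

pi_rt = p_nr_to_rt / (p_rt_to_nr + p_nr_to_rt)  # = 4/7
pi_nr = p_rt_to_nr / (p_rt_to_nr + p_nr_to_rt)  # = 3/7
```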
Such an analysis is also very useful to describe the performance of sys-
tems in steady state. We use a terminology of cost; however, it is not
necessary that the cost has a financial connotation. With that understand-
ing, say when the system (DTMC) enters state i, it incurs a cost c(i) on an
average. Then the long-run average cost per unit time (or per observation or
per slot) is
Σ_{i∈S} c(i) πi,
which can be computed once the steady-state probabilities πi for all i ∈ S are
known.
As an example, consider Problem 133 describing a time-division mul-
tiplexer. Let πj be the limiting probability that there are j packets in the
multiplexer just before the nth attempted transmission (we assume that πj
for all j ∈ S can be calculated). Let us answer the question: what is the average
multiplexing delay for a packet that arrives immediately after an attempted
transmission in the long run? If there are i packets in the multiplexer when
this packet arrives, then this packet faces a latency (or delay) of (i + 1)τ units
of time. Note that τ is the time between successive multiplexing attempts.
Therefore, the average multiplexing delay is

Σ_{i=0}^{∞} (i + 1) τ πi = τ + τ Σ_{i=1}^{∞} i πi.
for any i ∈ S, j ∈ S, and s ≥ 0, where S is the state space. Thereby the stochastic
process {X(t), t ≥ 0} would be a CTMC.
A CTMC is typically characterized by its so-called infinitesimal generator
matrix Q with rows and columns corresponding the current state and the
next state in S. In other words, we keep track of epochs when the system state
changes, that is, X(t) changes with the understanding that between epochs
the state remains a constant. Thus in some sense if we considered the epochs
as observation times we indeed would have a DTMC. Next, to describe an
element qij of the Q matrix, it is the rate at which an event that would take
the system from state i to state j would occur. In other words, the triggering
event that drives the system from state i to state j happens after exp(qij ) time.
However, it is crucial to realize that the transition to j does not have to occur; another event may take place before it. With that description, next we describe how
to model a system as a CTMC and state a few examples to clarify the earlier
description.
1. Define X(t), the state of the system at time t (this must be selected
appropriately so that Markov and time-homogeneity properties are
satisfied)
2. Write down the state space S, which is a set of all possible values
X(t) can take
3. Verify if Markov and time-homogeneity properties are satisfied
(this is straightforward if the interevent times are exponentially
distributed with time-invariant parameters)
4. Construct the generator matrix Q = [qij ] as follows:
a. For i ≠ j, qij is the rate of transitioning from state i to state j (this
means that if no other event occurs then it would take an exponential
amount of time with mean 1/qij to go from state i to state j;
also, if there are multiple events that can take the CTMC from i to
j, then the rate qij is the sum of the rates of all the events)
b. For i = j, set qii = −Σ_{j≠i} qij so that every row of Q sums to zero
Problem 134
Consider a machine that toggles between two states, up and down. The
machine stays up for an exponential amount of time with mean 1/α hours
and then goes down. Then the machine stays down for an exponential
amount of time with mean 1/β hours before it gets back up. Model the
machine states using a CTMC.
Solution
Let X(t) be the state of the machine at time t. Therefore, if X(t) = 0, then
the machine is down at time t. Also, if X(t) = 1, the machine is up at time t.
Clearly, we have the state space as S = {0, 1}. With rows and columns
ordered as {0, 1}, the generator matrix is

      ⎡ −β    β ⎤
  Q = ⎣  α   −α ⎦ .
The rate diagram is provided in Figure B.3. Hence the system is modeled as
a CTMC.
Problem 135
Consider a telephone switch that can handle at most N calls simultaneously.
Assume that calls arrive according to PP(λ) to the switch. Any call arriving
FIGURE B.3
Rate diagram for up/down machine.

FIGURE B.4
Rate diagram for telephone switch (arrivals at rate λ; departures at rates μ, 2μ, . . . , Nμ).
when there are N other calls in progress receives a busy signal (and hence
rejected). Each accepted call lasts for an exponential amount of time with
mean 1/μ amount of time (this is the duration of a phone call, also called
hold times). Model the number of ongoing calls at any time as a CTMC.
Solution
Let X(t) be the number of ongoing calls in the switch at time t. Clearly, there
could be anywhere between 0 and N calls. Hence we have the state space
as S = {0, 1, . . . , N}. In many problem instances including this one, it is eas-
ier to draw the rate diagram and use it for analysis. In that light, the rate
diagram is illustrated in Figure B.4. To explain that, consider some i such
that 0 < i < N. When X(t) = i, one of two events can occur: either a new call
could arrive (this happens after exp(λ) time) or an existing call could com-
plete (this happens after exp(iμ) time). Of course if X(t) = 0, the only event
that could occur is a new call arrival. Likewise if X(t) = N, the only event of
significance is a call completing. Notice how the memoryless property and
the minimum-of-exponentials property of exponential random variables are
used in the description.
Then the generator matrix is

      ⎡ −λ        λ          0         0    . . .    0     0  ⎤
      ⎢  μ    −(λ + μ)       λ         0    . . .    0     0  ⎥
  Q = ⎢  0       2μ     −(λ + 2μ)      λ    . . .    0     0  ⎥ .
      ⎢  .        .          .         .    . . .    .     .  ⎥
      ⎣  0        0          0         0    . . .   Nμ   −Nμ  ⎦
Notice how easy the transition is between the rate diagram and the generator
matrix.
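The translation from the rate diagram to the generator matrix can also be done in code. Below is a minimal sketch in Python; the function name and the parameter values are illustrative, not from the text:

```python
import numpy as np

def switch_generator(N, lam, mu):
    """Generator matrix Q for the telephone switch CTMC of Problem 135.

    State i (number of ongoing calls) moves to i+1 at rate lam (a new
    call arrives) and to i-1 at rate i*mu (one of the i calls completes);
    the diagonal entry makes every row of Q sum to zero.
    """
    Q = np.zeros((N + 1, N + 1))
    for i in range(N + 1):
        if i < N:
            Q[i, i + 1] = lam       # new call accepted
        if i > 0:
            Q[i, i - 1] = i * mu    # one of i ongoing calls completes
        Q[i, i] = -Q[i].sum()       # row sums to zero
    return Q

Q = switch_generator(3, lam=2.0, mu=1.0)
print(Q)
```

Checking a row against the display above: for state 1 with λ = 2 and μ = 1, the code produces the row (1, −3, 2, 0), matching (μ, −(λ + μ), λ, 0).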
Problem 136
Consider a system where messages arrive according to PP(λ). As soon as a
message arrives, it attempts transmission. The message transmission times
are exponentially distributed with mean 1/μ units of time. If no other mes-
sage tries to transmit during the transmission time of this message, the
transmission is successful. If any other message tries to transmit during
this transmission, a collision results and all transmissions are terminated
instantly. All messages involved in a collision are called backlogged and
are forced to retransmit. All backlogged messages wait for an exponential
amount of time (with mean 1/θ) before starting retransmission. Model the
system called “unslotted Aloha” as a CTMC.
Solution
Let X(t) denote the number of backlogged messages at time t and Y(t) be
a binary variable that denotes whether or not a message is under transmis-
sion at time t. Then we model the stochastic process {(X(t), Y(t)), t ≥ 0} as a
CTMC. Notice that the state of the system is a two-tuple vector and the state
space is
S = {(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1), (3, 0), (3, 1), . . .}.
Say the state of the system at time t is (i, j) for some (i, j) ∈ S. If j = 1, then
one of three events can change the state of the system: a new arrival at rate λ
would take the system to (i+2, 0) due to a collision; a retransmission attempt
at rate iθ would take the system to (i + 1, 0) due to a collision; and a trans-
mission completion at rate μ would take the system to (i, 0). However, if
j = 0, then one of two events can change the state of the system, a new arrival
at rate λ would take the system to (i + 1, 1), and a retransmission at rate iθ
would take the system to (i − 1, 1). Based on that we can show that the rate
diagram would be as described in Figure B.5.
Notice in this example that we did not provide the Q matrix since it can
easily be inferred from the rate diagram. See the exercises for more example
problems as well as Chapters 2 and 3. Next, we move on to some analysis of
CTMCs. Much like DTMCs, here too, we first present transient analysis and
then move on to steady-state analysis.
FIGURE B.5
Rate diagram for unslotted Aloha.
pij (t) = P{X(s + t) = j|X(s) = i}

for any i ∈ S and j ∈ S. This is the same pij (t) described for the Markov and
time-homogeneity properties. The matrix P(t) = [pij (t)] satisfies the following
matrix differential equation:

dP(t)/dt = P(t)Q = QP(t)

with initial condition P(0) = I and boundary condition Σ_{j∈S} pij (t) = 1 for
every i ∈ S and any t ≥ 0. The solution to the differential equation can be
written as
P(t) = exp(Qt)

where, for a square matrix A,

exp(A) = I + A + A²/2! + A³/3! + · · ·
which in the scalar special case reduces to the usual exponential. It is
crucial to notice that this solution works only if the CTMC has a finite
number of states. There are efficient ways of computing it, especially when
the entries of Q are numerical (and not symbolic). However, other techniques
must be used when Q is symbolic or when S has infinitely many elements.
As an example, consider a four-state CTMC {X(t), t ≥ 0} with S =
{1, 2, 3, 4} and
      ⎡ −5    1    2    2 ⎤
  Q = ⎢  0   −2    1    1 ⎥ .
      ⎢  1    3   −5    1 ⎥
      ⎣  2    0    0   −2 ⎦

Then we can compute

                     ⎡ 0.1842  0.2895  0.1316  0.3947 ⎤
P(10) = exp(10Q) =   ⎢ 0.1842  0.2895  0.1316  0.3947 ⎥ .
                     ⎢ 0.1842  0.2895  0.1316  0.3947 ⎥
                     ⎣ 0.1842  0.2895  0.1316  0.3947 ⎦
Notice how the rows are identical, that is, each column has the same element
repeated. In other words, irrespective of the current state, eventually (here
already by t = 10) the system will be in state 1 with probability 0.1842. This
is similar to the steady-state behavior we saw with DTMCs. Next, we describe
steady-state analysis of CTMCs.
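The four-state example above is easy to reproduce numerically. A sketch using SciPy's matrix exponential (the exact values 0.1842, 0.2895, 0.1316, 0.3947 are 7/38, 11/38, 5/38, and 15/38):

```python
import numpy as np
from scipy.linalg import expm

# Generator of the four-state CTMC from the text
Q = np.array([[-5.,  1.,  2.,  2.],
              [ 0., -2.,  1.,  1.],
              [ 1.,  3., -5.,  1.],
              [ 2.,  0.,  0., -2.]])

P10 = expm(10 * Q)       # P(10) = exp(10 Q)
print(np.round(P10, 4))  # every row is (0.1842, 0.2895, 0.1316, 0.3947)
```

By t = 10 the rows have already converged to the stationary distribution, since the nonzero eigenvalues of Q have strongly negative real parts.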
The steady-state probabilities pj = lim_{t→∞} P{X(t) = j} (when they exist)
are obtained by solving

pQ = 0 and Σ_{j∈S} pj = 1,

where p is the vector (p0 , p1 , . . . , pj , . . .) and 0 is a row vector of zeros.
Also, when the system (CTMC) is in state i, it incurs a cost c(i) per unit
time on average. Again, the cost need not be a monetary ("dollar") cost; it
can represent other performance measures as well. Then the long-run average
cost incurred per unit time is

Σ_{i∈S} c(i)pi .
For the up/down machine of Problem 134, the steady-state equations are

            ⎡ −β    β ⎤
  (p0  p1 ) ⎣  α   −α ⎦ = (0  0)  and  p0 + p1 = 1.
Thus (p0 p1 ) = (α/(α + β) β/(α + β)). Further, when the machine is up,
it produces products at a rate of ρ per second and no product is produced
when the machine is down. Then the long-run average production rate is
0 × p0 + ρ × p1 = ρβ/(α + β).
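In practice these balance equations are solved numerically. A minimal sketch in Python; the parameter values α = 0.25, β = 1, and ρ = 12 are illustrative, not from the text:

```python
import numpy as np

def ctmc_steady_state(Q):
    """Solve pQ = 0 with sum(p) = 1 for a finite irreducible CTMC.

    One balance equation is redundant, so replace it with the
    normalization condition and solve the resulting linear system.
    """
    n = Q.shape[0]
    A = Q.T.copy()
    A[-1, :] = 1.0              # replace last equation by sum(p) = 1
    b = np.zeros(n)
    b[-1] = 1.0
    return np.linalg.solve(A, b)

# Up/down machine of Problem 134: alpha = 0.25 (mean up time 4 h),
# beta = 1.0 (mean down time 1 h); states ordered (0 = down, 1 = up)
alpha, beta, rho = 0.25, 1.0, 12.0
Q = np.array([[-beta, beta], [alpha, -alpha]])
p = ctmc_steady_state(Q)        # (alpha, beta)/(alpha + beta) = (0.2, 0.8)
production_rate = rho * p[1]    # long-run average production rate
```

The numerical answer matches the closed form: p = (α/(α + β), β/(α + β)) = (0.2, 0.8), and the production rate is ρβ/(α + β).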
Next, consider the telephone switch system in Problem 135. Let
(p0 , p1 , . . . , pN ) be the solution to pQ = 0 and Σ_{i=0}^N pi = 1. For
this system, the probability that an arriving call in steady state receives a
busy signal (or is rejected) is pN . Also, the long-run average rate of call
rejection is λpN and the long-run average switch utilization is Σ_{i=0}^N i pi .
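For a birth–death chain such as this one, pQ = 0 can also be solved in closed form: the balance equations λ p_{i−1} = iμ p_i give p_i ∝ (λ/μ)^i / i!, the truncated-Poisson (Erlang loss) form. This standard result is not derived in the text, so the sketch below cross-checks it against the classical Erlang-B recursion; the values N = 5, λ = 3, μ = 1 are illustrative:

```python
import math
import numpy as np

def erlang_loss_probs(N, lam, mu):
    """Steady-state distribution of the telephone switch CTMC.

    Balance for this birth-death chain, lam*p[i-1] = i*mu*p[i],
    yields p[i] proportional to a**i / i! with offered load a = lam/mu.
    """
    a = lam / mu
    w = np.array([a**i / math.factorial(i) for i in range(N + 1)])
    return w / w.sum()

p = erlang_loss_probs(N=5, lam=3.0, mu=1.0)
blocking = p[-1]                          # P(arrival is rejected) = p_N
utilization = (np.arange(6) * p).sum()    # long-run average busy lines
```

The blocking probability p_N here is exactly the Erlang-B formula evaluated at offered load a = λ/μ.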
Finally, for Problem 136, let (p00 , p01 , p10 , p11 , p20 , p21 , . . .) be the
solution to pQ = 0 and Σ_{(i,j)∈S} pij = 1. Then the average number of
backlogged messages in steady state is Σ_{i=0}^∞ i(pi0 + pi1 ) and the
long-run system throughput is Σ_{i=0}^∞ pi1 μ²/(μ + iθ + λ).
P{Yn+1 = j, Sn+1 − Sn ≤ x|Yn = i, Yn−1 , . . . , Y0 , Sn , . . . , S0 }
= P{Yn+1 = j, Sn+1 − Sn ≤ x|Yn = i}
= P{Y1 = j, S1 ≤ x|Y0 = i}

for any i ∈ J , j ∈ J , and x ≥ 0. The two equations here are similar to the
Markov and time-homogeneity properties, respectively. Now, for any i ∈ J ,
j ∈ J , and x ≥ 0, define

Gij (x) = P{Y1 = j, S1 ≤ x|Y0 = i},

and call G(x) = [Gij (x)] the kernel of the Markov renewal sequence (MRS)
{(Yn , Sn ), n ≥ 0}.
Problem 137
Consider a G/M/1 queue with independent and identically distributed (IID)
interarrival times continuously distributed with common CDF A(·) and exp(μ)
service times. Let Z(t) be the number of customers in the system at time t, Sn
be the time of the nth arrival into the system with S0 = 0, and Yn = Z(Sn −)
be the number of customers in the system just before the nth arrival. For any
i ≥ 0 and j ≥ 0 obtain Gij (x) for the MRS {(Yn , Sn ), n ≥ 0}.
Solution
Clearly, we have S0 = 0 and S0 ≤ S1 ≤ S2 ≤ S3 ≤ . . ., since arrivals
occur one by one and the (n + 1)st arrival occurs after the nth. Then define
J = {0, 1, 2, . . .}. Then for any i ∈ J and any j ∈ J , we have

Gij (x) = ∫_0^x e^{−μt} ((μt)^{i+1−j}/(i + 1 − j)!) dA(t)   if 0 < j ≤ i + 1,

Gi0 (x) = ∫_0^x (1 − Σ_{k=0}^{i} e^{−μt} (μt)^k /k!) dA(t),

and Gij (x) = 0 if j > i + 1.
This result is due to the fact that when i + 1 ≥ j > 0, Gij (x) is the probability
of having exactly i + 1 − j service completions in time S1 and S1 ≤ x (where
S1 is an interarrival time). Likewise if j = 0, then Gi0 (x) is the probability that
the i + 1st service completion occurs before S1 and S1 ≤ x. Finally, if there are
i customers just before time Sn , then just before time Sn+1 it is not possible to
have more than i + 1 customers in the system, so j must be less than or equal
to i + 1 and thus Gij (x) = 0 if j > i + 1.
There are some properties MRSs satisfy that are important to address. Say
we are given an MRS {(Yn , Sn ), n ≥ 0} with kernel G(x). Then the stochastic
process {Yn , n ≥ 0} is a DTMC with transition probability matrix P = G(∞).
In fact, in many analysis situations we may only have the LST of the kernel,
G̃(s); then one can easily obtain G(∞) as G̃(0) using one of the LST properties.
Then, it is also crucial to notice that if we know the initial distribution
a = [P{Y0 = i}], then the MRS is completely characterized by a and G(x). With
this, we move on to two stochastic processes that are driven by MRSs.
Yn = Z(Sn +).
Let
The kernel of the SMP (which is the same as that of the MRS) is
For the DTMC {Yn , n ≥ 0}, let the transition probability matrix be
P = G(∞).
and the expected time the SMP spends in state i continuously before
transitioning out be

τi = E(S1 |Y0 = i).

Also, let the stationary distribution of the embedded DTMC be

πi = lim_{n→∞} P{Yn = i}.

Then the steady-state probabilities of the SMP are

pi = lim_{t→∞} P{Z(t) = i} = (πi τi ) / (Σ_m πm τm ).
Problem 138
Consider a system with two components, A and B. The lifetime of the system
has a CDF F0 (x) and mean μ0 . When the system fails, with probability q it
is a component A failure and with probability (1 − q) it is a component B
failure. As soon as a component fails, it gets repaired and the repair time
for components A and B have CDF FA (x) and FB (x), respectively, as well
as means μA and μB , respectively. Model the system as an SMP and obtain
the steady-state probabilities that the system is up, that component A is
under repair, and that component B is under repair.
Solution
Let Z(t) be the state of the system with state space {0, A, B} such that Z(t) = 0
implies the system is up and running at time t, whereas if Z(t) is A or B,
then at time t the system is down with component A or B, respectively,
under repair. Let Sn denote the epoch when Z(t) changes for the nth time
Hence we get

pi = lim_{t→∞} P{Z(t) = i} = (πi μi ) / (π0 μ0 + πA μA + πB μB ).
Thus we have

[p0 pA pB ] = (1/(μ0 + qμA + (1 − q)μB )) [μ0  qμA  (1 − q)μB ].
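These formulas are easy to check numerically. A sketch with illustrative parameter values (q = 0.3, μ0 = 10, μA = 2, μB = 5, none of which are from the text):

```python
import numpy as np

# Illustrative values: mean lifetime mu0 = 10, mean repair times
# muA = 2 and muB = 5, and q = 0.3 (chance a failure is component A)
mu0, muA, muB, q = 10.0, 2.0, 5.0, 0.3

# Embedded DTMC on states (0, A, B): from 0 go to A w.p. q and to B
# w.p. 1 - q; from A or B the next state is always 0
P = np.array([[0.0,   q, 1 - q],
              [1.0, 0.0,   0.0],
              [1.0, 0.0,   0.0]])
pi = np.array([0.5, q / 2, (1 - q) / 2])   # solves pi = pi P

# SMP steady state: weight pi by the mean sojourn times (mu0, muA, muB)
tau = np.array([mu0, muA, muB])
p = pi * tau / (pi * tau).sum()
```

The factor 1/2 in π cancels in the ratio, which is why p is simply (μ0, qμA, (1 − q)μB) normalized.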
The kernel of the MRGP (which is the same as that of the MRS) is
For the DTMC {Yn , n ≥ 0}, let the transition probability matrix be
P = G(∞).
and

τi = E(S1 |Y0 = i),  πi = lim_{n→∞} P{Yn = i},

where π = πP and Σ_{i∈S} πi = 1.
Define αkj as the expected time spent in state j from time 0 to S1 , given
that Y0 = k, for all k ∈ S and j ∈ S. This could be tricky to compute in some
instances. The stationary distribution of the MRGP for any j ∈ S is given by
pj = lim_{t→∞} P{Z(t) = j} = (Σ_{k∈S} πk αkj ) / (Σ_{k∈S} πk τk ).
Of course all this assumes that the stationary distribution exists, which
only requires that the DTMC {Yn , n ≥ 0} is irreducible and positive recur-
rent (assuming that the epochs Sn occur continuously over time). With that
understanding, we move on to the final type of stochastic processes in
this chapter and the only one where the states are not countable (note that
other stochastic processes such as Ornstein–Uhlenbeck process and Gaussian
process are not considered although they have been used in chapters of
this book).
P{Xi = 1} = P{Xi = −1} = 1/2.
We let X(0) = 0, otherwise we will just look at X(t)−X(0). Note that E[Xi ] = 0
and Var[Xi ] = 1. From Equation B.1, we have
E[X(t)] = 0,

Var[X(t)] = (t/Δt)(Δx)².   (B.2)
Now let Δx and Δt go to zero. However, we must be careful to ensure that
the limit exists for Equation B.2. Therefore, we must have

(Δx)² = σ² Δt,
E[X(t)] = 0,
and
Var[X(t)] → σ2 t.
From the central limit theorem, X(t) is normally distributed with mean
0 and variance σ² t. With that we now formally define a Brownian motion.
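This scaling limit can be checked by simulation: with (Δx)² = σ²Δt, a scaled random walk observed at time t should look Normal(0, σ²t). A sketch (the step size, sample count, and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)
sigma, t = 2.0, 1.0
dt = 5e-4
dx = sigma * np.sqrt(dt)        # the required scaling (dx)^2 = sigma^2 dt
n_steps = int(t / dt)

# 5000 independent scaled random walks, each observed at time t
steps = dx * (2 * rng.integers(0, 2, size=(5_000, n_steps)) - 1)
X_t = steps.sum(axis=1)

# By the CLT, X_t is approximately Normal(0, sigma^2 t)
print(X_t.mean(), X_t.var())
```

The sample mean is near 0 and the sample variance near σ²t = 4, up to Monte Carlo error.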
• X(t) has independent increments; that is, for every pair of disjoint
time intervals (s, t) and (u, v), s < t ≤ u < v, the increments {X(t) −
X(s)} and {X(v) − X(u)} are independent random variables. There-
fore, the Brownian motion is a Markov process.
• Every increment {X(t) − X(s)} is normally distributed with mean 0
and variance σ2 (t − s).
where

α = (x − x0 )/(σ√(t − s))

and

Φ(y) = ∫_{−∞}^{y} (1/√(2π)) e^{−u²/2} du.
Now define a process {Y(t), t ≥ 0} such that

Y(t) = e^{X(t)} .
The process {Y(t), t ≥ 0} is called a geometric Brownian motion. Using the fact
that the moment generating function of a normal random variable X(t) with
mean 0 and variance t is
E[e^{sX(t)} ] = e^{ts²/2} ,

we have E[Y(t)] = e^{t/2} . Next, define

Z(t) = |X(t)|.
The process {Z(t), t ≥ 0} is called Brownian motion reflected at the origin. The
CDF of Z(t) can be obtained for z > 0 as

P{Z(t) ≤ z} = P{−z ≤ X(t) ≤ z} = 2P{X(t) ≤ z} − 1

            = (2/√(2πt)) ∫_{−∞}^{z} e^{−x²/(2t)} dx − 1.
Further, we have

E[Z(t)] = √(2t/π),

Var[Z(t)] = (1 − 2/π) t.
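A quick Monte Carlo check of these moments: since X(t) ~ Normal(0, t) for standard Brownian motion, Z(t) = |X(t)| can be sampled directly (the sample size, time point, and seed below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
t = 4.0
X = rng.normal(0.0, np.sqrt(t), size=200_000)  # X(t) ~ Normal(0, t)
Z = np.abs(X)                                  # reflected at the origin

print(Z.mean())  # approx sqrt(2t/pi) ~ 1.596
print(Z.var())   # approx (1 - 2/pi) t ~ 1.454
```

This is essentially Exercise B.8 done by simulation rather than by integrating the CDF.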
• X(0) = 0
• {X(t), t ≥ 0} has stationary and independent increments
• X(t) is normally distributed with mean μt and variance t
Note that the variance would be σ2 t if B(t) was not a “standard” Brownian
motion.
It can be shown that the CDF satisfies the following diffusion equation:
∂F(t, x; x0 )/∂t = −μ ∂F(t, x; x0 )/∂x + (σ²/2) ∂²F(t, x; x0 )/∂x².   (B.3)
Initial condition: X(0) = x0 implies
F(0, x; x0 ) = 0 if x < x0 , and F(0, x; x0 ) = 1 if x ≥ x0 .
Problem 139
Consider the Brownian motion {X(t), t ≥ 0} with drift coefficient μ and
variance parameter σ² (so that Var[X(t)] = σ² t). Assume there is a reflecting
barrier placed at the origin. Solve the PDE and obtain the steady-state
probabilities.
Solution
The solution to the PDE (Equation B.3) is

F(t, x; x0 ) = Φ((x − x0 − μt)/(σ√t)) − e^{2xμ/σ²} Φ((−x − x0 − μt)/(σ√t)).
Letting t → ∞ (with μ < 0), F(t, x; x0 ) converges to a limit F(x) satisfying

0 = −μ dF(x)/dx + (σ²/2) d²F(x)/dx².

The solution is

F(x) = 1 − e^{2xμ/σ²} for x ≥ 0,

which is an exponential CDF with parameter −2μ/σ².
f (x + dx) = f (x) + f ′(x)dx + (1/2) f ″(x)(dx)² + o((dx)²).   (B.4)

Hence

df (x) = f ′(x)dx + (1/2) f ″(x)(dx)² + o((dx)²).

For an ordinary differentiable function, since

lim_{dx→0} (f (x + dx) − f (x))/dx = f ′(x),

only the first-order term matters and

df (x) = f ′(x)dx.
When Brownian motion is involved, things turn out a little differently. Let
Bt be a standard Brownian motion (the same as B(t) but to avoid too many
parentheses in our formulae we use Bt ). Consider a function f (Xt ), where
Xt = μt + σBt .
df (Xt ) = f ′(Xt )dXt + (1/2) f ″(Xt )(dXt )² + o((dXt )²).
Using (dBt )² = dt, we get

(dXt )² = (μ dt + σ dBt )² = σ² dt,
where on the last line we have ignored terms of higher order than dt. Substi-
tuting in the Taylor’s series expansion, and omitting higher-order terms, we
have
df (Xt ) = f ′(Xt )dXt + (1/2) f ″(Xt )σ² dt.   (B.5)
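Equation B.5 can be sanity-checked by simulation: for f(x) = x², Itô's formula gives d(X_t²) = 2X_t dX_t + σ² dt, so integrating the right-hand side along simulated paths should reproduce X_t². A sketch using a simple Euler discretization (all parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma, t, dt = 1.0, 0.5, 2.0, 2e-3
n_paths, n = 5_000, int(t / dt)

dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n))
dX = mu * dt + sigma * dB            # increments of X_t = mu t + sigma B_t
X = dX.cumsum(axis=1)

# Integrate df = 2 X dX + sigma^2 dt (Ito's formula for f(x) = x^2)
X_prev = np.hstack([np.zeros((n_paths, 1)), X[:, :-1]])
f_T = (2 * X_prev * dX + sigma**2 * dt).sum(axis=1)

# Both averages should be near E[X_t^2] = (mu t)^2 + sigma^2 t
print(f_T.mean(), (X[:, -1] ** 2).mean())
```

The σ²dt correction term is exactly what an ordinary chain rule would miss; dropping it would bias the integral by σ²t.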
Now consider a process {St , t ≥ 0} satisfying

dSt /St = μ dt + σ dBt .   (B.6)
Now consider ln(St ). We would like to obtain d(ln(St )). Note that for f (x) =
ln(x), we have f ′(x) = 1/x and f ″(x) = −1/x². Therefore, we have (using
Equation B.6)
d(ln(St )) = dSt /St − (1/2)(dSt )²/St²
           = μ dt + σ dBt − (1/2)(μ dt + σ dBt )²
           = μ dt + σ dBt − (1/2)σ² dt.
Integrating from 0 to T and letting ν = μ − σ²/2, we get

ST = S0 e^{νT+σBT} .
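This solution can be simulated directly: since B_T ~ Normal(0, T), one can sample S_T in one shot and check, for instance, that E[S_T] = S_0 e^{μT} (the σ²/2 in ν exactly offsets E[e^{σB_T}] = e^{σ²T/2}). All parameter values below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
S0, mu, sigma, T = 100.0, 0.05, 0.2, 1.0
nu = mu - 0.5 * sigma**2                 # drift of ln(S_t)

B_T = rng.normal(0.0, np.sqrt(T), size=500_000)   # B_T ~ Normal(0, T)
S_T = S0 * np.exp(nu * T + sigma * B_T)

print(S_T.mean())           # approx S0 * exp(mu * T)
print(np.log(S_T).mean())   # approx ln(S0) + nu * T
```

Note that ln(S_T) has mean ln(S_0) + νT, not ln(S_0) + μT: the geometric Brownian motion grows slower in the log scale than its instantaneous drift suggests.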
Reference Notes
Like the previous chapter, this chapter is also mainly a result of teach-
ing various courses on stochastic processes especially at the graduate level.
The definitions, presentations, and notations for the first part of this chap-
ter (DTMC, CTMC, MRS, SMP, and MRGP) are heavily influenced by
Kulkarni [67]. Another excellent resource for those topics is Ross [92]. For
the Brownian motion part, the material presented is based on Ross [91] and
Medhi [80]. Several topics such as diffusion processes, Ornstein–Uhlenbeck
process, Gaussian process, and martingales have been left out. Some of these
such as martingales can be found in both Ross [91] and Resnick [90]. Also,
the topic of stochastic process limits (also not considered here) can be
found in Whitt [105].
Exercises
B.1 A discrete-time polling system consists of a single communication
channel serving N buffers in a cyclic order starting with buffer-1. At
time t = 0, the channel polls buffer-1. If it has any packets to trans-
mit, the channel transmits exactly one and then moves to buffer-2
at time t = 1. The same process repeats at each buffer until at time
t = N − 1 the channel polls buffer N. Then at time t = N, the chan-
nel polls buffer-1 and the cycle repeats. Now consider buffer-1. Let
Yt be the number of packets it receives during the interval (t, t + 1].
Assume that Yt = 1 with probability p and Yt = 0 with probability
1 − p. Let Xn be the number of packets available for transmission
at buffer-1 when it is polled for the nth time. Model {Xn , n ≥ 1} as a
DTMC.
B.2 Consider a DTMC with transition probability matrix
      ⎛ p0   p1   p2   p3   · · · ⎞
      ⎜  1    0    0    0   · · · ⎟
  P = ⎜  0    1    0    0   · · · ⎟
      ⎜  0    0    1    0   · · · ⎟
      ⎝  .    .    .    .    .    ⎠
where pj > 0 for all j and Σ_{j=0}^∞ pj = 1. Let M = Σ_{j=0}^∞ j pj and
assume M < ∞. Show that π0 = 1/(1 + M) and hence find the
stationary probability distribution π = (π0 π1 π2 . . .).
B.3 Conrad is a student who goes out to eat lunch every day. He eats
either at a Chinese food place (C), at an Italian food place (I), or
at a Burger place (B). The place Conrad chooses to go for lunch on
day n can be modeled as a DTMC with state space S = {C, I, B} and
transition matrix
      ⎡ 0.3  0.3  0.4 ⎤
  P = ⎢  1    0    0  ⎥ .
      ⎣ 0.5  0.5   0  ⎦
That means if Conrad went to the Chinese food place (C) yesterday,
he will choose to go to the Burger place (B) today with probabil-
ity 0.4, and also there is a 30% chance he will go to the Italian
⎡ ⎤
2 3
⎢ ⎥
Q=⎣ 1 −1 ⎦.
0 −2
two conditions: one repair person and two repair persons. Would
it better to employ one or two repair persons for this system? If a
machine is up, it generates a revenue of $r per unit time. However,
each repair person charges $c per unit time.
B.7 Consider a three-state SMP {Z(t), t ≥ 0} with state space {1, 2, 3}.
The elements of the kernel of this SMP are given as follows:
G12 (t) = 1 − e−t − te−t , G21 (t) = 0.4(1 − e−0.5t ) + 0.3(1 − e−0.2t ),
G23 (t) = 0.2(1 − e−0.5t ) + 0.1(1 − e−0.2t ), G32 (t) = 1 − 2e−t + e−2t , and
G11 (t) = G13 (t) = G22 (t) = G31 (t) = G33 (t) = 0. Obtain the probability
that the SMP is in state i in steady state for i = 1, 2, 3.
B.8 Let {X(t), t ≥ 0} be a standard Brownian motion. Define {Z(t), t ≥ 0}
such that Z(t) = |X(t)|, the Brownian motion reflected at the origin.
Using the CDF of Z(t), derive expressions for E[Z(t)] and Var[Z(t)].
B.9 For any constant k show that {Y(t), t ≥ 0} is a martingale if
Y(t) = exp{kB(t) − k2 t/2}, where B(t) is a standard Brownian motion.
For that, all you need to show is the following is satisfied:
E[Y(t)|Y(u), 0 ≤ u ≤ s] = Y(s).
1. S. Aalto, U. Ayesta, and R. Righter. On the Gittins index in the M/G/1 queue.
Queueing Systems, 63(1–4), 437–458, 2009.
2. J. Abate and W. Whitt. Numerical inversion of Laplace transforms of probability
distributions. ORSA Journal on Computing, 7, 36–43, 1995.
3. V. Aggarwal, N. Gautam, S.R.T. Kumara, and M. Greaves. Stochastic fluid-flow
models for determining optimal switching thresholds with an application to
agent task scheduling. Performance Evaluation, 59(1), 19–46, 2004.
4. S. Ahn and V. Ramaswami. Efficient algorithms for transient analysis of
stochastic fluid flow models. Journal of Applied Probability, 42(2), 531–549, 2005.
5. D. Anick, D. Mitra, and M.M. Sondhi. Stochastic theory of a data handling
system with multiple sources. Bell System Technical Journal, 61, 1871–1894, 1982.
6. L. Arnold. Stochastic Differential Equations: Theory and Applications, Krieger
Publishing Company, Melbourne, FL, 1992.
7. F. Baccelli and P. Bremaud. Elements of Queuing Theory: Palm Martingale Calculus
and Stochastic Recurrences, 2nd edn., Springer, Berlin, Germany, 2003.
8. F. Baskett, K.M. Chandy, R.R. Muntz, and F. Palacios. Open, closed and mixed
networks of queues with different classes of customers. Journal of the ACM, 22,
248–260, 1975.
9. A.W. Berger and W. Whitt. Effective bandwidths with priorities. IEEE/ACM
Transactions on Networking, 6(4), 447–460, August 1998.
10. A.W. Berger and W. Whitt. Extending the effective bandwidth concept to net-
work with priority classes. IEEE Communications Magazine, 36, 78–84, August
1998.
11. G.R. Bitran and S. Dasu. Analysis of the PHi /PH/1 queue. Operations Research,
42(1), 159–174, 1994.
12. G. Bolch, S. Greiner, H. de Meer, and K.S. Trivedi. Queueing Networks and Markov
Chains, 1st edn., John Wiley & Sons Inc., New York, 1998.
13. M. Bramson. Stability of queueing networks. Probability Surveys, 5, 169–345,
2008.
14. P.J. Burke. The output of a queuing system. Operations Research, 4(6), 699–704,
1956.
15. J.A. Buzacott and J.G. Shanthikumar. Stochastic Models of Manufacturing Systems,
Prentice-Hall, New York, 1992.
16. C.S. Chang and J.A. Thomas. Effective bandwidth in high-speed digital net-
works. IEEE Journal on Selected Areas in Communications, 13(6), 1091–1100,
1995.
17. C.S. Chang and T. Zajic. Effective bandwidths of departure processes from
queues with time varying capacities. In: Fourteenth Annual Joint Conference of the
IEEE Computer and Communication Societies, Boston, MA, pp. 1001–1009, 1995.
18. X. Chao, M. Miyazawa, and M. Pinedo. Queueing Networks: Customers, Signals,
and Product Form Solutions, John Wiley & Sons, New York, 1999.
78. A. Mandelbaum, W.A. Massey, M.I. Reiman, A. Stolyar, and B. Rider. Queue
lengths and waiting times for multiserver queues with abandonment and
retrials. Telecommunication Systems, 21(2–4), 149–171, 2002.
79. W.A. Massey and W. Whitt. Uniform acceleration expansions for Markov chains
with time-varying rates. The Annals of Applied Probability, 8(4), 1130–1155, 1998.
80. J. Medhi. Stochastic Models in Queueing Theory, Elsevier Science, Boston, MA,
2003.
81. D.A. Menasce and V.A.F. Almeida. Scaling for E-Business: Technologies, Mod-
els, Performance, and Capacity Planning, Prentice Hall, Upper Saddle River, NJ,
2000.
82. S.P. Meyn. Control Techniques for Complex Networks, Cambridge University Press,
New York, 2009.
83. M. Moses, S. Seshadri, and M. Yakirevich. HOM Software. http://www.stern.
nyu.edu/HOM
84. A. Narayanan and V.G. Kulkarni. First passage times in fluid models with
an application to two priority fluid systems. Proceedings of the IEEE Interna-
tional Computer Performance and Dependability Symposium, Urbana-Champaign,
IL, 1996.
85. M.F. Neuts. Matrix-Geometric Solutions in Stochastic Models—An Algorithmic
Approach, The Johns Hopkins University Press, Baltimore, MD, 1981.
86. T. Osogami and M. Harchol-Balter. Closed form solutions for mapping general
distributions to quasi-minimal PH distributions. Performance Evaluation, 63, 524–
552, 2006.
87. Z. Palmowski and T. Rolski. A note on martingale inequalities for fluid models.
Statistic and Probability Letter, 31(1), 13–21, 1996.
88. Z. Palmowski and T. Rolski. The superposition of alternating on-off flows and a
fluid model. Report no. 82, Mathematical Institute, Wroclaw University, 1996.
89. N.U. Prabhu. Foundations of Queueing Theory, Kluwer Academic Publishers,
Boston, MA, 1997.
90. S.I. Resnick. A Probability Path, Birkhauser, Boston, MA, 1998.
91. S.M. Ross. Stochastic Processes, John Wiley & Sons Inc., New York, 1996.
92. S.M. Ross. Introduction to Probability Models, 8th edn., Academic Press, New York
2003.
93. S.M. Ross. A First Course in Probability, 8th edn., Pearson Prentice Hall, Upper
Saddle River, NJ, 2010.
94. D. Sarkar and W.I. Zangwill. Expected waiting time for nonsymmetric cyclic
queueing systems–Exact results and applications. Management Science, 35(12),
1463–1474, 1989.
95. L.E. Schrage and L.W. Miller. The queue M/G/1 with the shortest remaining
processing time discipline. Operations Research, 14, 670–684, 1966.
96. R. Serfozo. Introduction to Stochastic Networks, Springer-Verlag, New York, 1999.
97. L.D. Servi. Fast algorithmic solutions to multi-dimensional birth-death pro-
cesses with applications to telecommunication systems. In: Performance Evalu-
ation and Planning Methods for the Next Generation Internet, A. Girard, B. Sanso,
and F. Vazquez-Abad (eds.), Springer, New York, pp. 269–295, 2005.
98. A. Shwartz and A. Weiss. Large Deviations for Performance Analysis, Chapman &
Hall, New York, 1995.
99. W.J. Stewart. Introduction to the Numerical Solution of Markov Chains, Princeton
University Press, Princeton, NJ, 1994.
100. S. Stidham Jr. Optimal Design of Queueing Systems, CRC Press, Boca Raton, FL,
2009.
101. H. Takagi. Queueing analysis of polling models. ACM Computing Surveys, 20(1),
5–28, 1988.
102. H.C. Tijms. A First Course in Stochastic Models, John Wiley & Sons Inc., Bognor
Regis, West Sussex, England, 2003.
103. W. Whitt. The queueing network analyzer. The Bell System Technical Journal,
62(9), 2779–2815, 1983.
104. W. Whitt. Departures from a queue with many busy servers. Operations Research,
9(4), 534–544, 1984.
105. W. Whitt. Stochastic-Process Limits, Springer, New York, 2002.
106. W. Whitt. Efficiency-driven heavy-traffic approximations for many-server
queues with abandonments. Management Science, 50(10), 1449–1461, 2004.
107. A. Wierman, N. Bansal, and M. Harchol-Balter. A note comparing response
times in the M/G/1/FB and M/G/1/PS queues. Operations Research Letters, 32,
73–76, 2003.
108. R.W. Wolff. Stochastic Modeling and the Theory of Queues, Prentice Hall, Engle-
wood Cliffs, NJ, 1989.
Manufacturing and Industrial Engineering
Analysis of Queues
Methods and Applications
“The breadth and scope of topics in this book surpass the books currently on the market. For most
graduate engineering or business courses on this topic the selection is perfect. … presented in
sufficient depth for any graduate class. I like in particular the “problems” presented at regular intervals,
along with detailed solutions. … excellent coverage of both classical and modern techniques in
queueing theory. Compelling applications and case studies are sprinkled throughout the text. For many
of us who teach graduate courses in queueing theory, this is the text we have been waiting for!”
—John J. Hasenbein, The University of Texas at Austin
“Dr. Gautam has an obvious passion for queueing theory. His delight in presenting queueing paradoxes
beams through the pages of the book. His relaxed conversational style makes reading the book
a pleasure. His introductory comments about having to account for a large variety of educational
backgrounds among students taking graduate courses indicate that he takes education very seriously. It
shows throughout the book. He has made an excellent choice of topics and presented them in his own
special style. I highly recommend this queueing text by an expert who clearly loves his field.”
—Dr. Myron Hlynka, University of Windsor, Ontario, Canada
Features
• Explains concepts through applications in a variety of domains such as production,
computer communication, and service systems
• Presents numerous solved examples and exercise problems that deepen students’
understanding of topics
• Includes discussion of fluid-flow queues, which is not part of any other textbook
• Contains prerequisite material to enhance methodological understanding
• Emphasizes methodology, rather than presenting a collection of formulae
• Provides 139 solved problems and 154 unsolved problems
• Promotes classroom discussions using case studies and paradoxes