Analysis of Queues - Methods and Applications (2012, CRC Press)
Natarajan Gautam
Analysis of Queues
Methods and Applications
The Operations Research Series
Series Editor: A. Ravi Ravindran
Professor, Department of Industrial and Manufacturing Engineering
The Pennsylvania State University – University Park, PA
Published Titles:
Analysis of Queues: Methods and Applications
Natarajan Gautam
Natarajan Gautam
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made
to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all
materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all
material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not
been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any
future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in
any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.
copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-
8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that
have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
Preface ... xv
Acknowledgments ... xvii
Author ... xix
List of Case Studies ... xxi
List of Paradoxes ... xxiii
1. Introduction ... 1
  1.1 Analysis of Queues: Where, What, and How? ... 2
    1.1.1 Where Is This Used? The Applications ... 2
    1.1.2 What Is Needed? The Art of Modeling ... 5
    1.1.3 How Do We Plan to Proceed? Scope and Methods ... 7
  1.2 Systems Analysis: Key Results ... 8
    1.2.1 Stability and Flow Conservation ... 10
    1.2.2 Definitions Based on Limiting Averages ... 10
    1.2.3 Asymptotically Stationary and Ergodic Flow Systems ... 11
    1.2.4 Little’s Law for Discrete Flow Systems ... 12
    1.2.5 Observing a Flow System According to a Poisson Process ... 14
  1.3 Queueing Fundamentals and Notations ... 17
    1.3.1 Fundamental Queueing Terminology ... 21
    1.3.2 Modeling a Queueing System as a Flow System ... 26
    1.3.3 Relationship between System Metrics for G/G/s Queues ... 29
      1.3.3.1 G/G/s/K Queue ... 34
    1.3.4 Special Case of M/G/s Queue ... 35
  1.4 Psychology in Queueing ... 36
  Reference Notes ... 41
  Exercises ... 42
References ... 757
Index ... 763
Preface
each chapter and leave out the remaining topics or assign them for
independent reading.
2. Furthermore, this book would be perfect when used in two courses.
In the first course, one could cover the appendices, followed by
Chapters 1 through 4. And in the second (advanced) course, one
could cover Chapters 5 through 10. In that case, it would be
sufficient to require an undergraduate course on probability as a
prerequisite to the first course in the sequence.
The analytical methods presented in this book are substanti-
ated using applications from a wide set of domains, including
production, computer, communication, information, transportation,
and service systems. This book could thus be used in courses in
programs such as industrial engineering, systems engineering, oper-
ations research, statistics, management science, operations man-
agement, applied mathematics, electrical engineering, computer
science, and transportation engineering. In addition, I sincerely
hope that this book appeals to an audience beyond students and
instructors. It would be appropriate for researchers, consultants, and
analysts that work on performance modeling or use queueing mod-
els as analysis tools. This book has evolved based on my numerous
offerings of entry-level to mid-level graduate courses on the theory
and application of queueing systems. Those courses have been my
favorite among graduate courses, and I am absolutely passionate
about the subject area. I have truly enjoyed writing this book, and I
sincerely hope you will enjoy reading it and getting value out of it.
Natarajan Gautam
College Station, Texas
Acknowledgments
I have come to realize after writing this book that just like it takes a village
to raise a child, it does so to write a book as well. I would like to take this
opportunity to express my gratitude to a small subset of that village.
I would like to begin by thanking my dissertation adviser Professor
Vidyadhar G. Kulkarni for all the knowledge, guidance, and professional
skills he has shared with me. His textbooks have been a source of inspira-
tion and a wealth of information that have been instrumental in shaping this
book. I would also like to acknowledge Professor Kulkarni’s fabulous teach-
ing style that I could only wish to emulate. Talking about excellent teachers,
I would like to thank all the fantastic teachers I have had growing up. I was
lucky to have fabulous mathematics teachers in high school—Mrs. Sarva-
mangala and Mr. Nainamalai. I am also grateful to my excellent instructors
during my undergraduate program, including Professor G. Srinivasan for
his course on operations research and Professor P.R. Parthasarathy for his
course on probability and random processes. I would also like to thank Pro-
fessor Shaler Stidham Jr., who taught me the only course on queueing that I
have ever taken as a student.
Next, I would like to express my sincerest gratitude to some of my col-
leagues. In particular, I would like to thank Professor A. Ravindran for
encouraging me to write this book and for all his tips for successfully com-
pleting it. I was also greatly motivated by the serendipitous conversation I
had with Professor Sheldon Ross when he happened to sit by me during a
bus ride at an INFORMS conference. In addition, I would also like to thank
several colleagues that have helped me with this manuscript through numer-
ous conversations, brainstorming sessions, and e-mail exchanges. They
include Professors Karl Sigman, Ward Whitt, and David Yao from Columbia
University; Dr. Mark Squillante from IBM; Professors Raj Acharya, Russell
Barton, Jeya Chandra, George Kesidis, Soundar Kumara, Anand Sivasubra-
maniam, Qian Wang, and Susan Xu from Penn State University; Professors
J.-F. Chamberland, Guy Curry, Rich Feldman, Georgia-Ann Klutke, P.R.
Kumar, Lewis Ntaimo, Henry Pfister, Don Phillips, Srinivas Shakkottai,
Alex Sprintson, and Marty Wortman from Texas A&M University; Profes-
sor Rhonda Righter from the University of California at Berkeley; Professor
Sunil Kumar from the University of Chicago; and Professors Anant Bal-
akrishnan, John Hasenbein, David Morton, and Sridhar Seshadri from the
University of Texas at Austin.
Some of the major contributions to the contents of this book are due to
my former and current students that took my courses and collaborated on
research with me. In particular, I would like to thank Vineet Aggarwal, Yiyu
List of Case Studies
List of Paradoxes
1. Introduction
For a moment imagine being on an island where you do not have to wait
for anything; you get everything the instant you want or need it! Sounds
like a dream, doesn’t it? Well, let us not have any illusions about it and state
upfront that this book is not about creating such an island, let alone creating
such a world. Wait happens! This book is about how to deal with it.
In other words, how do you analyze systems to manage the waiting
experienced by users of the system? Having said that waiting is inevitable, it is
only fair to point out that in several systems waiting has been significantly
reduced using modern technology. For example, at fast-food restaurants it
is now possible to order online and your food is ready when you show up.
At some amusement parks, especially for popular rides, you can pick up a
ticket that gives you a time to show up at the ride to avoid long lines. With
online services, waits at banks and post offices have reduced considerably.
Most toll booths these days have automated readers that let you zip through
without stopping. There are numerous such examples and it appears like
there are only a few places like airport security where the wait has gotten
longer over the years!
Before delving into managing waiting, here are some further comments
to consider:
FIGURE 1.1
Framework and scope of this book. [Figure: a real-life system is turned into a model (description, assumptions) through modeling; the performance analysis framework, the focus of this book, supports optimization, control, and what-if analysis, which in turn guide design, operation, decision making, and negotiation for the real-life system.]
FIGURE 1.2
A flow system with inputs and outputs. [Figure: a box labeled "System" with an input arrow entering and an output arrow leaving.]
buses into which people enter and exit whenever the bus stops; cash register
at a store into which money enters and exits; fuel reservoir in a gas
station where gasoline enters when a fuel tanker fills it up and it exits when
customers fill up their vehicle gas tanks; theme parks where riders arrive into
the park, spend a good portion of their day going on rides and leave. There
are many such examples of flow systems in everyday life. Not all such sys-
tems are necessarily best modeled as queueing systems. Nonetheless, there
are a large number of fundamental results that we would like to present here
with the understanding that although they are frequently used in the context
of queueing, they can also be applied in wider domains such as inventory
systems.
We describe some notations that would be used in this chapter alone. The
description is given as though the entities are discrete; however, by changing
the word “number” to “amount,” one can pretty much arrive at the same
results if the entities were continuous. Let α(t) be the number of entities that
flow into the system during the time interval [0, t]. Also, define γ(t) as the
number of entities in the system at time t with γ(0) = 0, that is, the system is
empty initially. Finally, δ(t) denotes the number of entities that flow out of
the system in time [0, t]. Due to flow conservation, we have

α(t) = γ(t) + δ(t),
which essentially states that all entities that arrived into the system during a
time period of length t either left the system or are still in the system. In other
words, entities are neither created nor destroyed. If one were careful, most
flow systems can be modeled this way by appropriately choosing the entities.
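As a tiny numerical illustration (the event sequence below is invented, not from the text), the conservation of flow, namely that every entity that has arrived by time t is either still in the system or has departed, can be checked by simple bookkeeping:

```python
# Hypothetical event log (times are invented): +1 = entity flows in,
# -1 = entity flows out.
events = [(0.5, +1), (1.2, +1), (1.9, -1), (2.4, +1), (3.1, -1)]

alpha = 0  # alpha(t): cumulative arrivals in [0, t]
delta = 0  # delta(t): cumulative departures in [0, t]
gamma = 0  # gamma(t): number in the system at time t, with gamma(0) = 0

for t, change in events:
    if change > 0:
        alpha += 1
    else:
        delta += 1
    gamma = alpha - delta
    # Conservation: every entity that arrived is either inside or has left.
    assert alpha == gamma + delta

print(alpha, gamma, delta)  # 3 1 2
```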
For example, in a system like a maternity ward in a hospital, the number
of people checking in appears to be fewer than the number checking out;
the balance can still be attained by appropriately accounting for unborn
children at the input itself. Although the previous example was said in jest,
one must be careful especially in systems with losses. In
our definition, entities that are lost are also included in the output (i.e., in the
δ(t) definition) but one has to be very careful during analysis. To illustrate
this point further, consider a system like a hotline where customers call for
help. In such systems, some customers may wait and leave without being
served, and some customers may leave without waiting (say due to a busy
signal). One has to be very careful in classifying the customers and deriving
performance measures accordingly for each class individually. The results
presented in this section are obtained by aggregating over all classes (unless
accounted for explicitly). To clarify further, consider a production system where the
raw material that flows in results in both defective and nondefective prod-
ucts. Clearly, when it comes to analysis, the emphasis we place on defective
items may be significantly different than that for nondefective items, so it
might be beneficial to derive individual performance measures. To model
the production system as a whole, it may be beneficial to consider them as
a single class. With this in mind, we next present a set of results that are
asymptotic in time, that is, as t → ∞.
A flow system is stable if

lim_{t→∞} α(t)/t = lim_{t→∞} δ(t)/t.

For a stable system, the average input rate is

λ = lim_{t→∞} α(t)/t.    (1.3)
The time-averaged number of entities in the system is defined as

H = lim_{T→∞} (1/T) ∫_0^T γ(t) dt,    (1.4)

and the long-run fraction of time during which there are exactly i entities in the system as

q_i = lim_{T→∞} ( ∫_0^T I{γ(t) = i} dt ) / T.

Notice that the numerator of the term inside the limit essentially is the
amount of time in the interval [0, T] during which there were exactly i in
the system. Verify that

H = Σ_{i=0}^∞ i q_i.
Likewise, letting τ_n be the time spent in the system by the nth entity, the average time spent in the system is

τ̄ = lim_{n→∞} (τ1 + τ2 + · · · + τn)/n.    (1.6)
Then Little’s law for the flow system states that

H = λ τ̄,    (1.7)

where λ, H, and τ̄ are defined in Equations 1.3, 1.4, and 1.6, respectively.
We do not provide a proof of the preceding result (for which it is more con-
venient if we have a stationary and ergodic system, although that is not
Problem 1
Couch-Potato is a high-end furniture store that carries a sofa set called Plush.
Customers arrive into Couch-Potato requesting a Plush sofa set according
to a Poisson process at an average rate of 1 per week. Couch-Potato’s policy
is to not accept any back orders. So if there are no Plush sofa sets available
in inventory, customers’ requests are not fulfilled. It is also Couch-Potato’s
policy to place an order from the manufacturer for “five” Plush sofa sets as
soon as the number of them in inventory goes down to “two”. The manu-
facturer of Plush has an exponentially distributed delivery time with a mean
of 1 week to deliver the set of “five” Plush sofa sets. Model the Plush sofa
set system in Couch-Potato as a flow system. Is the system stable? Compute
the average input rate λ, the time-averaged number of Plush sofa sets in
inventory (H), and the average number of weeks each Plush sofa set stays in
Couch-Potato (τ̄).
Solution
The Plush system in Couch-Potato is indeed a (discrete) flow system where
with every delivery from the manufacturer, five sofa sets flow into the sys-
tem. Also, with every fulfilled customer order, sofa sets exit the system. We
let γ(t) be the number of Plush sofa sets in the system at time t. Although
we do not need γ(0) to be zero for the analysis, assuming that would not
be unreasonable. Also, notice that for all t, γ(t) stays between “zero” and
“seven”. For example, if by the time the shipment arrived, two customers
have already ordered Plushes, then the number in inventory would become
zero. Likewise, a maximum of “seven” is because an order of “five” Plushes
are placed when the inventory reaches “two”, so if the shipment arrives
before the next customer demand, there would be “seven” Plush sofa sets
in the system. Notice that since γ(t) never exceeds “seven”, the system is
stable.
To obtain the other performance measures, we model the stochas-
tic process {γ(t), t ≥ 0} as a CTMC with state space {0, 1, 2, 3, 4, 5, 6, 7} and
corresponding infinitesimal generator matrix
Q =
⎡ −1   0   0   0   0   1   0   0 ⎤
⎢  1  −2   0   0   0   0   1   0 ⎥
⎢  0   1  −2   0   0   0   0   1 ⎥
⎢  0   0   1  −1   0   0   0   0 ⎥
⎢  0   0   0   1  −1   0   0   0 ⎥
⎢  0   0   0   0   1  −1   0   0 ⎥
⎢  0   0   0   0   0   1  −1   0 ⎥
⎣  0   0   0   0   0   0   1  −1 ⎦
[p0 p1 p2 p3 p4 p5 p6 p7 ]Q = [0 0 0 0 0 0 0 0]
and p0 + p1 + · · · + p7 = 1. We get

[p0 p1 p2 p3 p4 p5 p6 p7 ] = (1/21) [1 1 2 4 4 4 3 2].
Note that an order for “five” Plush sofa sets is placed every time the
inventory level reaches 2. So we pay attention to state 2 with correspond-
ing steady-state probability p2 = 2/21. In the long run, a fraction 2/21 of time
the system is in state 2 and on average state 2 lasts for half a week. Thus
the average rate at which orders are placed is 2 × (2/21) per week. Hence
the average input rate is λ = 2 × (2/21) × 5 = 20/21 Plush sofa sets per week.
Also, the time-averaged number of Plush sofa sets in inventory (H) can be
computed as
H = Σ_{i=0}^{7} i p_i = 85/21.
Therefore, using Little’s law, the average number of weeks each Plush sofa
set stays in Couch-Potato (τ̄) can be computed as

τ̄ = H/λ = (85/21)/(20/21) = 4.25 weeks.
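The numbers in this solution are easy to verify numerically. The following sketch (mine, not part of the book) solves the stationary equations pQ = 0 together with the normalization condition using numpy, and recovers the quantities computed above:

```python
import numpy as np

# Infinitesimal generator for the Plush inventory CTMC on states 0..7:
# a demand (rate 1) moves i -> i - 1, and the pending delivery (rate 1)
# moves i -> i + 5 from states 0, 1, 2.
Q = np.array([
    [-1,  0,  0,  0,  0,  1,  0,  0],
    [ 1, -2,  0,  0,  0,  0,  1,  0],
    [ 0,  1, -2,  0,  0,  0,  0,  1],
    [ 0,  0,  1, -1,  0,  0,  0,  0],
    [ 0,  0,  0,  1, -1,  0,  0,  0],
    [ 0,  0,  0,  0,  1, -1,  0,  0],
    [ 0,  0,  0,  0,  0,  1, -1,  0],
    [ 0,  0,  0,  0,  0,  0,  1, -1],
], dtype=float)

# Solve p Q = 0 with sum(p) = 1 by replacing one (redundant) balance
# equation with the normalization condition.
A = Q.T.copy()
A[-1, :] = 1.0
b = np.zeros(8)
b[-1] = 1.0
p = np.linalg.solve(A, b)

print(np.round(p * 21, 6))           # approx [1 1 2 4 4 4 3 2]
lam = 2 * p[2] * 5                   # state 2 is left at rate 2; each order brings 5
H = sum(i * p[i] for i in range(8))
print(lam, H, H / lam)               # approx 20/21, 85/21, and 4.25 weeks
```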
Remark 1
E[γ(tn )] → H as n → ∞,

and

P{γ(tn ) = i} → qi as n → ∞,
Problem 2
Consider a single-product inventory system with continuous review adopt-
ing what is known as the (K, R) policy, which we explain next. Demand
of one unit arrives according to a Poisson process with parameter λ per
week. Demand is satisfied using products stored in inventory, and no
backorders are allowed, that is, if a demand occurs when the inventory is empty, the
demand is not satisfied. The policy adopted is called (K, R) policy wherein
an order for K items is placed as soon as the inventory level reaches R. It
takes a random time exponentially distributed with mean 1/θ weeks for the
order to be fulfilled (this is called lead time). Assume that K > R, but both R
and K are fixed constants. Problem 1 is a special case of this single-product
inventory system adopting the (K, R) policy with K = 5, R = 2, and θ = λ = 1
per week. What would the distribution and expected value of the number
of items in inventory be the instant a demand arrives? Also, determine the
average product departure rates.
Solution
Let X(t) be the number of products in inventory at time t. Clearly, {X(t), t ≥ 0}
is a CTMC with state space S = {0, 1, . . . , R + K} and rate diagram shown in
Figure 1.3. Let pi be the steady-state probability of i items in inventory. To
obtain pi for all i ∈ [0, R + K], we use the balance equations
θ Σ_{j=0}^{i−1} p_j = λ p_i ,   i = 1, . . . , R,        θ Σ_{j=0}^{R} p_j = λ p_i ,   i = R + 1, . . . , K,

θ Σ_{j=i}^{R} p_j = λ p_{K+i} ,   i = 1, . . . , R.
FIGURE 1.3
Rate diagram for (K, R) inventory system. [Figure: states 0, 1, . . . , R + K in a row; demand transitions i → i − 1 at rate λ for i ≥ 1, and delivery transitions i → i + K at rate θ for i = 0, . . . , R.]
Then, p0 can be obtained using Σ_{i=0}^{K+R} p_i = 1, which implies (with φ = 1 + θ/λ)

p0 [ 1 + (θ/λ) Σ_{i=1}^{R} φ^{i−1} + (K − R) (θ/λ) φ^R + (θ/λ) Σ_{i=1}^{R} (φ^R − φ^{i−1}) ] = 1.
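As a sanity check (my own sketch, not from the text), one can construct the generator of this CTMC for arbitrary K, R, λ, θ, solve for the steady-state probabilities numerically, and confirm both the Problem 1 special case and the normalization identity for p0, taking φ = 1 + θ/λ:

```python
import numpy as np

def kr_inventory_probs(K, R, lam, theta):
    """Steady-state probabilities for the (K, R) inventory CTMC.

    States 0..R+K; a demand moves i -> i-1 at rate lam (if i > 0), and
    while an order is outstanding (i <= R) delivery moves i -> i+K at
    rate theta.
    """
    n = R + K + 1
    Q = np.zeros((n, n))
    for i in range(n):
        if i > 0:            # demand satisfied from inventory
            Q[i, i - 1] += lam
        if i <= R:           # outstanding order gets delivered
            Q[i, i + K] += theta
        Q[i, i] = -Q[i].sum()
    A = Q.T.copy()
    A[-1, :] = 1.0           # replace one balance equation by normalization
    b = np.zeros(n)
    b[-1] = 1.0
    return np.linalg.solve(A, b)

p = kr_inventory_probs(K=5, R=2, lam=1.0, theta=1.0)
print(np.round(p * 21, 6))   # matches Problem 1: approx [1 1 2 4 4 4 3 2]

# Check the normalization identity with phi = 1 + theta/lam:
K, R, lam, theta = 5, 2, 1.0, 1.0
phi = 1 + theta / lam
rhs = 1 + (theta / lam) * sum(phi**(i - 1) for i in range(1, R + 1)) \
        + (K - R) * (theta / lam) * phi**R \
        + (theta / lam) * sum(phi**R - phi**(i - 1) for i in range(1, R + 1))
print(abs(p[0] * rhs - 1.0) < 1e-9)  # True
```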
Having described some generic results for flow systems, we now delve
into a special type of flow system called queueing systems.
FIGURE 1.4
A single-station queueing system. [Figure: arrivals enter a waiting line feeding a set of servers; departures leave after service.] (From Gautam, N., Queueing Theory. Operations Research and Management Science Handbook, A. Ravindran (ed.), CRC Press, Taylor & Francis Group, Boca Raton, FL, pp. 9.1–9.37, 2007. With permission.)
of canonical queueing systems. For the rest of this chapter, we only consider
single-station queueing systems with one waiting line and one set of servers.
Of course one could model a multistation queueing network as a single flow
system, but in practice one typically models it in a decomposed manner where
each node in the network is a single flow system. This justifies considering a
single station. Also, at this stage, we do not make any distinctions between
classes of entities. With that in mind, we present some details of queueing
systems.
Consider a single-station queueing system as shown in Figure 1.4. This
is also called a single-stage queue. There is a single waiting line and one or
more servers (such as at a bank or post office). We will use the term “servers”
generally but sometimes for specific systems we would call them processors
or machines. We refer to the entities that arrive and flow through the queueing
system as customers, jobs, products, parts, or just entities. Arriving
customers enter the queueing system and wait in the waiting area if a server
is not free (otherwise they go straight to a server). When a server becomes
free, one customer is selected and service begins. Upon service completion,
the customer departs the system. Usually, time between arrivals and time
to serve customers are both random quantities. Therefore, to analyze queue-
ing systems one needs to know something about the arrival process and the
service times for customers. Other aspects that are relevant in terms of anal-
ysis include the number of servers, capacity of the system, and the policy
used by the servers to determine the service order. Next we describe a few key
remarks that are needed to describe some generic, albeit basic, results for
single-station queueing systems.
Remark 2
The entities that flow in the queueing system will be assumed to be discrete
or countable. In fact, the bulk of this book is based on discrete queues
with fluid queues considered in only two chapters toward the end. As
described earlier, these entities would be called customers, jobs, products,
parts, etc.
Remark 3
Unless explicitly stated otherwise, the customer inter-arrival times, that is,
the time between arrivals, are assumed to be IID. Thereby the arrival process
is generally assumed to be what is called a renewal process. Some exceptions
to that are when the arrival process is time varying or when it is correlated.
But those exceptions will only be made in subsequent chapters. Further, all
arriving customers enter the system if there is room to wait (that means
unless stated otherwise, there is no balking). Also, all customers wait till their
service is completed in order to depart (likewise, unless stated otherwise,
there is no reneging).
Remark 4
For the basic results some assumptions are made regarding the service
process. In particular, we assume that the service times are IID random
variables. Also, the servers are stochastically identical, that is, the service
times are sampled from the same distribution for all servers. In addition, the
servers adopt a work-conservation policy, that is, the server is never idle
when there are customers in the system. The last assumption means that
as soon as a service is completed for a customer, the server starts serving
the next customer instantaneously (if one is waiting for service). Thus, while
modeling, one has to appropriately define which activities are included
in a service time.
The assumptions made in the preceding remarks can and will certainly
be relaxed as we go through the book. There are many instances in the book
that do not require these assumptions. However, for the rest of this chap-
ter, unless explicitly stated otherwise, we will assume that assumptions in
Remarks 2, 3, and 4 hold. Next, using the assumptions, we will provide some
generic results that will be useful to analyze queues. However, before we
proceed to those results, recall that to analyze queueing systems one needs to
know something about the arrival process, the service times for customers,
the number of servers, capacity of the system, and the policy used by the
servers to determine the service order. We will next describe queues using a
compact nomenclature that takes all those into account.
In order to standardize description for queues we use a notation that is
accepted worldwide called Kendall notation, honoring the pioneering work by
D. G. Kendall. The notation takes the form

AP/ST/NS/Cap/SD.
TABLE 1.1
Fields in the Kendall Notation

AP    M, G, Ek , Hk , PH, D, GI, etc.
ST    M, G, Ek , Hk , PH, D, GI, etc.
NS    denoted by s, typically 1, 2, . . . , ∞
Cap   denoted by K, typically 1, 2, . . . , ∞ (default: ∞)
SD    FCFS, LCFS, ROS, SPTF, etc. (default: FCFS)

Source: Gautam, N., Queueing Theory. Operations Research and Management Science Handbook, A. Ravindran (ed.), CRC Press, Taylor & Francis Group, Boca Raton, FL, pp. 9.1–9.37, 2007. With permission.
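As a small illustration (the function and field names below are my own, not from the text), the five fields and the two defaults of Table 1.1 can be captured in a few lines:

```python
def parse_kendall(notation):
    """Expand a Kendall string AP/ST/NS[/Cap[/SD]] into all five fields,
    filling the default capacity (infinite) and discipline (FCFS)."""
    fields = notation.split("/")
    if len(fields) < 3:
        raise ValueError("need at least AP/ST/NS")
    return {
        "arrival_process": fields[0],
        "service_times": fields[1],
        "servers": fields[2],
        "capacity": fields[3] if len(fields) > 3 else "inf",    # default: infinite
        "discipline": fields[4] if len(fields) > 4 else "FCFS",  # default: FCFS
    }

print(parse_kendall("M/G/2"))
# {'arrival_process': 'M', 'service_times': 'G', 'servers': '2',
#  'capacity': 'inf', 'discipline': 'FCFS'}
```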
Define An as the time when the nth customer arrives, and thereby
An − An−1 is the nth inter-arrival time if the arrivals are not in batches. Let
Sn be the service time for the nth customer. Usually from the Kendall nota-
tions, especially when assumptions in Remarks 2, 3, and 4 hold, we typically
know both An − An−1 and Sn stochastically for all n. In other words, we know
the distributions of inter-arrival times and service times. In some sense they
and the other Kendall notation terms form the “input.” Next we describe
some terms and performance measures that can be derived once we know
the inputs.
Let Dn be the time when the nth customer departs. We denote X(t) as the
number of customers in the system at time t, Xn as the number of customers
in the system just after the nth customer departs, and Xn∗ as the number of
customers in the system just before the nth customer arrives. Although in
this chapter we would not go into details, it is worthwhile mentioning that
{X(t), t ≥ 0}, {Xn , n ≥ 0}, and {Xn∗ , n ≥ 0} are usually modeled as stochastic pro-
cesses. We also define two other variables, which are usually not explicitly
modeled. These are Wn , the waiting time of the nth customer, and W(t), the
total remaining workload at time t (this is the sum of the remaining service
time for all the customers in the system at time t). The preceding variables are
described in Table 1.2 for easy reference, where customer n denotes the nth
arriving customer. Note that if we are given A1 , A2 , . . ., as well as S1 , S2 , . . .,
we can obtain Dn , X(t), Xn , Xn∗ , Wn , and W(t). We describe that next for a
special case (note that typically we do not know the explicit realizations of
An and Sn for all n; we only know their distributions).
To illustrate the terms described in Table 1.2, consider a G/G/1 queue
where the inter-arrival times are general and service times are general with
a single server adopting FCFS and infinite space for customers to wait (refer
TABLE 1.2
Variables—Their Mathematical Notation as well as Meanings

Variable   Relation to Other Variables   Meaning
An                                       Arrival time of customer n
Sn                                       Service time of customer n
Dn                                       Departure time of customer n
X(t)                                     Number of customers in the system at time t
Xn         X(Dn +)                       Number in system just after customer n’s departure
Xn∗        X(An −)                       Number in system just before customer n’s arrival
Wn         Dn − An                       Waiting time of customer n
W(t)                                     Total remaining workload at time t

Source: Gautam, N., Queueing Theory. Operations Research and Management Science Handbook, A. Ravindran (ed.), CRC Press, Taylor & Francis Group, Boca Raton, FL, pp. 9.1–9.37, 2007. With permission.
FIGURE 1.5
Sample path of workload and number in the system for a G/G/1 queue. [Figure: the top panel plots W(t), which jumps up by Sn at each arrival time An and decreases at unit rate between events; the bottom panel plots X(t), which jumps up by 1 at each An and down by 1 at each departure time Dn , with the waiting times W1 , . . . , W7 marked.]
to Figure 1.5). Let A1 , A2 , . . . , A7 be the times that the first seven customers
arrive to the queue. The customers require a service time of S1 , S2 , . . . , S7 ,
respectively. Assume that the realizations of An and Sn are known (although
in practice we only know them stochastically). The queue is initially empty.
As soon as the first customer arrives (that happens at time A1 ) the number
in the queue jumps from 0 to 1 (note the jump in the X[t] graph). Also, the
workload in the system jumps up by S1 because when the arrival occurs
there is S1 amount of work left to be done (note the jump in the W[t] graph).
Until the next arrival or service completion, the number in the system is
going to remain a constant equal to 1. Hence the X[t] graph stays flat at
1 till the next event. However, the workload keeps reducing because the
server is working on the customer. Notice from the figure that before the
first customer’s service is completed, the second customer arrives. Hence
the number in the system (the X[t] graph) jumps up by 1 and the work-
load jumps up by S2 (the W[t] graph) at time A2 . Since there is only a
single server and we use FCFS, the second customer waits while the first
customer continues being served. Hence the number in the system (the X[t]
graph) stays flat at 2 and the workload (the W[t] graph) reduces
continuously.
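The sample-path construction just described can be sketched in code for the FCFS G/G/1 case: with the system initially empty, Dn = max(An , Dn−1 ) + Sn and Wn = Dn − An . The arrival and service times below are invented for illustration; they are not those of Figure 1.5.

```python
def gg1_fcfs(A, S):
    """Departure and sojourn times of a FCFS G/G/1 queue starting empty.

    A: increasing arrival times A1, A2, ...; S: service times S1, S2, ...
    Uses D[n] = max(A[n], D[n-1]) + S[n] and W[n] = D[n] - A[n].
    """
    D, W = [], []
    last_departure = 0.0
    for a, s in zip(A, S):
        d = max(a, last_departure) + s  # wait for the server, then get served
        D.append(d)
        W.append(d - a)
        last_departure = d
    return D, W

# Invented realizations (not those of Figure 1.5):
A = [1.0, 1.5, 4.0]
S = [2.0, 1.0, 0.5]
D, W = gg1_fcfs(A, S)
print(D)  # [3.0, 4.0, 4.5]
print(W)  # [2.0, 2.5, 0.5]
```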
As soon as the server completes service of customer-1, that customer
departs (this happens at time D1 ). Note that the time spent in the system by
customer-1 is W1 = D1 − A1 = S1 . Immediately after customer-1 departs, the
number in the system (the X[t] graph) jumps down from 2 to 1. However,
since the server adopts a work-conservation policy, it immediately starts
working on the second customer. Hence the W(t) graph has no jumps at
time D1 . From the figure notice that the next event is arrival of customer-3,
which happens while customer-2 is being served. Hence at time A3 the
Problem 3
Consider the exact same arrival times of the first seven customers
A1 , A2 , . . . , A7 as well as the exact same corresponding service time
requirements of S1 , S2 , . . . , S7 , respectively, as described earlier. However, the
system has two identical servers. Draw graphs of W(t) and X(t) across time
for the first seven customers assuming that the eighth customer arrives well
after all the seven previous customers are served. Compare and contrast the
graphs against those we saw earlier for the case of one server.
Solution
We assume that the system is empty at time t = 0. The graphs of W(t) and
X(t) versus t for this G/G/2 queue are depicted in Figure 1.6. The first
customer arrives at time A1 and one of the two servers processes this customer
and the workload process jumps up by S1 . While one server processes this
customer, another customer arrives and the workload jumps by S2 at time
FIGURE 1.6
Sample path of workload and number in the system for a G/G/2 queue. [Figure: as in Figure 1.5, the top panel plots W(t) and the bottom panel plots X(t); with two servers the workload decreases twice as fast whenever two or more customers are present, and customer 7 departs before customer 5 (D7 < D5 ).]
A2 . However, since there are two servers processing customers now, the
workload can be reduced twice as fast (hence a different downward slope
of W(t) immediately after A2 ). Then at time D1 , the first server completes
serving the first customer and becomes idle. The second server subsequently
completes serving customer-2 and the entire system becomes empty for a
short period of time between D2 and A3 when the third customer arrives.
Since the servers are identical, we do not specify which server processes the
third customer but we do know that one of them is processing the customer.
Then at time D3 , the system becomes empty. This process continues. When-
ever there are two or more customers in the system the workload reduces at
a faster rate than when there is one in the system. However, when there are
no customers in the system W(t) is 0.
Next we contrast Figures 1.5 and 1.6. The periods with the system empty
have indeed grown, which is expected when there are more servers.
It is crucial to point out that the notion of busy period is unclear
since a server could be idle but the system could have a customer served by
the other server. However, the notion of the time when the system is empty
is still consistent between the two figures (and that is when W(t) and X(t)
are zero). Another difference between the figures that we described earlier is
that in the G/G/2 case the downward slopes for the W(t) process take on two
different values depending on the number in the system. However, a crucial
difference is that the customers do not necessarily depart in the order they
arrived. For example, the seventh customer departs before the fifth customer
(i.e., D7 < D5 ) in the G/G/2 figure. For this reason, we do not call this service
discipline FIFO in this book (that means first-in-first-out) and instead stick to
FCFS. The term “FIFO” does apply to the waiting area alone (not
including the servers), and hence it is often found in the literature;
nevertheless, to avoid any confusion we say FCFS.
In a similar fashion, one could extend this to other queues and disciplines
by drawing the W(t) and X(t) processes (see the exercises at the end of the
chapter). However, typically we do not know realizations of An and Sn for
all n but we only know the distributions of the inter-arrival times and service
times. In that case can we say anything about X(t), Xn , Xn∗ , W(t), Dn , and Wn ?
We will see that next.
$$\pi_j^* = \lim_{N\to\infty} \frac{\sum_{n=1}^{N} I(X_n^* = j)}{N},$$

$$G(x) = \lim_{T\to\infty} \frac{\int_0^T I(W(t) \le x)\,dt}{T},$$

$$F(x) = \lim_{N\to\infty} \frac{\sum_{n=1}^{N} I(W_n \le x)}{N},$$

$$L = \lim_{T\to\infty} \frac{\int_0^T X(t)\,dt}{T}$$

and

$$W = \lim_{N\to\infty} \frac{W_1 + W_2 + \cdots + W_N}{N}.$$
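These long-run averages can be estimated directly from a simulated sample path. The following sketch (an illustration, not from the text) simulates a stable FCFS G/G/1 queue using the Lindley recursion $W^q_{n+1} = \max(0,\, W^q_n + S_n - (A_{n+1} - A_n))$; the exponential inter-arrival and service distributions and the rates are assumed purely for concreteness — any distributions with $\lambda < \mu$ would do.

```python
import random

def simulate_gg1(n=200_000, lam=0.8, mu=1.0, seed=7):
    """Estimate W (customer-averaged sojourn time) and L (time-averaged
    number in system) for a stable FCFS G/G/1 queue via the Lindley
    recursion. Exponential inter-arrival and service times are an
    illustrative assumption, not part of the general G/G/1 model."""
    rng = random.Random(seed)
    a = 0.0      # current arrival epoch A_n
    wq = 0.0     # queueing delay Wq_n of the current customer
    s = 0.0      # service time of the previous customer
    sum_w = 0.0  # running sum of sojourn times W_n = Wq_n + S_n
    for i in range(n):
        t = rng.expovariate(lam)       # inter-arrival time A_n - A_{n-1}
        a += t
        if i > 0:
            wq = max(0.0, wq + s - t)  # Lindley recursion
        s = rng.expovariate(mu)
        sum_w += wq + s
    W = sum_w / n   # customer average
    L = sum_w / a   # time average: integral of X(t) dt equals sum of W_n
    return W, L

W, L = simulate_gg1()
```

For an M/M/1 queue with ρ = 0.8, theory gives W = 1/(μ − λ) = 5 and L = ρ/(1 − ρ) = 4, so the returned estimates should be close to those values; note also that L ≈ λW by construction, foreshadowing Little's law.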
Problem 4
Consider the time between t = 0 and t = D7 for the G/G/1 queue in Figure 1.5.
Assume that we have numerical values for all Ai and Di for i = 1, . . . , 7. What
fraction of time between t = 0 and t = D7 were there two customers in
the system? What fraction of the seven arriving customers saw one customer
in the system? What is the time-averaged number of customers in the system
between t = 0 and t = D7 ?
Solution
From Figure 1.5, note that there are two customers in the system between
times A2 and D1 , A3 and D2 , A6 and A7 , as well as D5 and D6 . Thus the
fraction of time between t = 0 and t = D7 that there were two customers in
the system is ((D1 − A2 ) + (D2 − A3 ) + (A7 − A6 ) + (D6 − D5 ))/D7 . Notice that
the expression is identical to $\int_0^{D_7} I(X(t) = 2)\,dt \big/ D_7$.
From Figure 1.5, also note that customers 1, 4, and 5 saw zero customers
in the system when they arrived; customers 2, 3, and 6 saw one in the system
when they arrived; and customer 7 saw two customers in the system upon
arrival. Thus a fraction 3/7 of the arriving customers saw one customer in
the system. The fraction is indeed equal to $\frac{1}{7}\sum_{n=1}^{7} I(X_n^* = 1)$.
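Both kinds of empirical fractions are easy to compute from a sample path with an event sweep. The numerical arrival and departure times below are made up (the text does not give Figure 1.5's values), but they are chosen to be consistent with the structure described above: customers 1, 4, and 5 arrive to an empty system, customers 2, 3, and 6 see one, and customer 7 sees two.

```python
# Hypothetical event times with the same structure as Figure 1.5
# (illustrative values only; the book does not give them numerically).
A = [1.0, 2.0, 3.5, 6.0, 8.0, 9.0, 9.5]     # arrival times A_1..A_7
D = [3.0, 4.0, 5.0, 7.0, 10.0, 11.0, 12.0]  # departure times D_1..D_7

events = sorted([(t, +1) for t in A] + [(t, -1) for t in D])
x, last_t = 0, 0.0            # X(t) and the previous event epoch
time_at_two, area = 0.0, 0.0
for t, delta in events:
    if x == 2:
        time_at_two += t - last_t  # accumulates I(X(t) = 2) dt
    area += x * (t - last_t)       # accumulates X(t) dt
    x += delta
    last_t = t

T = D[-1]                       # = D_7
frac_two = time_at_two / T      # fraction of time with two in the system
L_avg = area / T                # time-averaged number in the system
```

For these particular numbers the sweep gives frac_two = 3.0/12 = 0.25 and L_avg = 13/12, where 13 is the sum of the sojourn times D_n − A_n.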
To obtain the time-averaged number of customers in the system between
time 0 and D7 , we use the expression $\int_0^{D_7} X(t)\,dt \big/ D_7$. Hence we have that
value as
$$L = \lim_{t\to\infty} E[X(t)]$$
and
$$W = \lim_{n\to\infty} E[W_n].$$
Since the system is asymptotically stationary and ergodic, the two definitions
of pj , πj , π∗j , G(x), F(x), L, and W would be equivalent. In fact, we would end
up using the latter definition predominantly, as we would be modeling the
queueing system as a stochastic process and performing steady-state analysis.
One of the primary objectives of analysis of queues is to obtain closed-
form expressions for the performance metrics pj , πj , π∗j , G(x), F(x), L, and
W given properties of the queueing system such as inter-arrival time dis-
tribution, service time distribution, number of servers, system capacity,
and service discipline. Although we will derive the expressions for various
settings only in future chapters, for the remainder of this section we
concentrate on describing the relationships between those measures.
We explain this relation using an example illustration, but the rigorous proof
uses what is known as a level-crossing argument. In a G/G/s queue, note
that the times customers arrive to an empty system are regenerative epochs.
In other words, starting at a regenerative epoch, the future events are inde-
pendent of the past. In Figure 1.5, times A1 , A4 , and A5 are regenerative
epochs. The process that counts the number of regenerative epochs is indeed a
renewal process, and the times between successive regenerative epochs are IID
(although for this system their distribution is not easy to compute in general).
We call the time between successive regenerative epochs a regenerative
cycle. For the regenerative process described previously in a G/G/s system,
we assume that the regenerative cycle times on average are finite and the
system is stable (stability conditions will be explained later). It is crucial to
note that within any regenerative cycle of such a G/G/s queue, the number
of arriving customers seeing j others in the system would be exactly equal to
the number of departing customers seeing j others in the system.
For example, consider the regenerative cycle [A1 , A4 ) in Figure 1.5. There
are three arrivals, two of which see one in the system (customers 2 and 3)
and one sees zero in the system (customer 1). Observe that there must be
exactly three departures (if there are three arrivals). Of the three departures,
two see one in the system (customers 1 and 2) and one sees zero (customer 3).
Similarly, in regenerative cycles [A4 , A5 ) and [A5 , A8 ) one can observe (pre-
tending A8 is somewhere beyond D7 ) that the number of arriving customers
that see j in the system (for any j) would be exactly equal to the number of
departing customers that see j in the system. Since the entire time is composed
of these regenerative cycles, by summing over an infinitely large number
of regenerative cycles we can see that the fraction of arriving customers that
see j others in the system would be exactly equal to the fraction of departing
customers that see j others in the system. Hence we get πj = π∗j . Before
proceeding, it is worthwhile to verify this for the G/G/2 case in Figure 1.6, where
the regenerative cycles are [A1 , A3 ), [A3 , A4 ), [A4 , A5 ), and [A5 , A8 ) assuming
A8 is somewhere beyond D5 . Also, this result can be generalized easily for
a finite capacity queue and any service discipline as long as arrivals occur
individually and service completions occur one by one.
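As a numerical sanity check of πj = π∗j (again an illustration, not from the text), one can simulate the embedded jump chain of an M/M/1 queue — a special G/G/1 case in which, whenever the system is nonempty, the next event is an arrival with probability λ/(λ + μ) — and compare the fraction of arrivals seeing j in the system with the fraction of departures leaving j behind. The rates and seed below are arbitrary choices.

```python
import random

def mm1_arrival_vs_departure(n_arrivals=200_000, lam=0.7, mu=1.0, seed=3):
    """Simulate the embedded jump chain of an M/M/1 queue and return
    (pi_star, pi): the empirical fractions of arrivals that see j in the
    system and of departures that leave j behind, respectively."""
    rng = random.Random(seed)
    x = 0               # number in the system
    see, leave = {}, {}
    na = nd = 0
    while na < n_arrivals:
        # If empty, the next event must be an arrival; otherwise an
        # arrival wins the exponential race w.p. lam / (lam + mu).
        if x == 0 or rng.random() < lam / (lam + mu):
            see[x] = see.get(x, 0) + 1      # arrival sees x in the system
            x += 1
            na += 1
        else:
            x -= 1
            leave[x] = leave.get(x, 0) + 1  # departure leaves x behind
            nd += 1
    pi_star = {j: c / na for j, c in see.items()}
    pi = {j: c / nd for j, c in leave.items()}
    return pi_star, pi

pi_star, pi = mm1_arrival_vs_departure()
```

For λ = 0.7 and μ = 1, both fractions at j = 0 should come out close to 1 − ρ = 0.3, and the two dictionaries should agree entry by entry up to simulation noise.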
Next we describe the relationship between L and W. For that we require
some additional terminology. Define the following for a single stage G/G/s
queue (with characteristics described in the previous paragraph):
the average inter-arrival time

$$\frac{1}{\lambda} = E[A_n - A_{n-1}],$$

the average service time

$$\frac{1}{\mu} = E[S_n],$$

and the traffic intensity

$$\rho = \frac{\lambda}{s\mu}.$$

For the queue to be stable we require that

$$\rho < 1.$$
In other words, we require that λ < sμ for the system to be stable. This is
intuitive because it says that there is enough capacity (service rate on average
offered by all servers together is sμ) to handle the arrivals. In the literature,
ρ > 1 is called an overloaded system and ρ = 1 is called a critically loaded
system. Next we present a remark for stable G/G/s queues.
Remark 5
The average departure rate from a G/G/s queue is defined as the long-run
average number of customers that depart from the queue per unit time. If
the G/G/s queue is stable, then the average departure rate is λ.
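The stability check ρ = λ/(sμ) < 1 is a one-line computation; the sketch below uses the mean inter-arrival time 0.1235 min, mean service time 1 min, and s = 9 servers from Problem 5 later in this section.

```python
def traffic_intensity(mean_interarrival, mean_service, s):
    """rho = lam / (s * mu), with lam = 1/E[A_n - A_{n-1}] and mu = 1/E[S_n]."""
    lam = 1.0 / mean_interarrival
    mu = 1.0 / mean_service
    return lam / (s * mu)

rho = traffic_intensity(0.1235, 1.0, 9)    # roughly 0.8997
assert rho < 1, "queue would be unstable"  # stable since lam < s * mu
```

Only the first two moments of nothing are needed here — stability depends on the means alone, not on the full inter-arrival or service distributions.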
This remark is an artifact of the argument made in Section 1.2.1 that since
this is a stable flow conserving system, the average input rate must be the
average output rate. With that said, we describe the relationship between
the preceding terms and then explain them subsequently.
A G/G/s queue with notation described earlier in this section satisfies the
following:
$$W = W_q + \frac{1}{\mu}, \tag{1.8}$$

$$L = \lambda W \tag{1.9}$$

and

$$L_q = \lambda W_q. \tag{1.10}$$
We will now explain these equations. Equation 1.8 is directly from the defini-
tion; the total time in the system for any customer must be equal to the time
spent waiting in the queue plus the service time; thus taking expectations we
get Equation 1.8. Equations 1.9 and 1.10 are both due to Little’s law described
in Section 1.2.4. Essentially, if one considers the entire queueing system as a
flow system and then suitably substitutes the terms in Equation 1.7, then
Equation 1.9 can be obtained. However, if one considered just the wait-
ing area as the flow system, then Equation 1.7 can be used once again to
derive Equation 1.10. The preceding equations can be applied in more gen-
eral settings than what we described. In particular, Equation 1.8 is applicable
beyond the G/G/s setting: it does not require renewal arrivals and, if
appropriately defined, it can be used for finite capacity queues as well as
some non-FCFS disciplines. Likewise, Equations 1.9 and 1.10 are applicable
in much wider contexts since Little's law can be applied to any flow system,
not just the G/G/s queue setting. In particular, it can be extended to
G/G/s/K queues (as we will see at the end of this section) by appropriately
picking λ values. Also, it is not required that the discipline be FCFS (even
work-conservation is not necessary).
The key benefit of the three equations is that if we can compute one of L,
Lq , W, or Wq , the other three can be obtained by solving the three equations
for the three remaining unknowns. It is worthwhile pointing out that λ and μ
are not unknowns. One can (usually) easily compute λ and μ from the G/G/s
description. We illustrate this using an example next.
Problem 5
A simulation of a G/G/9 queue yielded Wq = 1.92 min. The inputs to the sim-
ulation included a Pareto distribution (with mean 0.1235 min and coefficient
of variation of 2) for the inter-arrival times and a gamma distribution (with
mean 1 min and coefficient of variation of 1) for the service times. Compute
L, Lq , and W.
Solution
Based on the problem description we have a G/G/s queue with s = 9, λ = 8.1
per minute, and μ = 1 per minute. Also, Wq = 1.92 min, which is the aver-
age time a customer waits to begin service. Using Equation 1.10 we can
get Lq = λWq = 8.1 × 1.92 = 15.552 customers that can be seen waiting for
service (on average) in the system. Also, using Equation 1.8 we have
W = Wq + 1/μ = 2.92 min (which is the mean time spent by each customer in
the system). Finally, using Equation 1.9 we get L = λW = 8.1 × 2.92 = 23.652
customers in the system on average in steady state.
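The three-equation bookkeeping in this solution is mechanical and easily scripted; the sketch below simply reproduces the numbers above.

```python
# Problem 5 data: Wq comes from the simulation; lam and mu from the inputs.
lam = 8.1    # arrival rate per minute (approximately 1/0.1235)
mu = 1.0     # service rate per minute
Wq = 1.92    # simulated mean wait before service begins (min)

Lq = lam * Wq        # Equation (1.10): 15.552 customers waiting on average
W = Wq + 1.0 / mu    # Equation (1.8):  2.92 min in the system per customer
L = lam * W          # Equation (1.9):  23.652 customers in the system
```

Given any one of L, Lq, W, or Wq (plus λ and μ), the same three lines recover the other three quantities.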
$$L = L_q + \frac{\lambda}{\mu}.$$
Also, since L is the expected number of customers in the system and Lq is the
expected number of customers in the waiting area, we have λ/μ as the expected
number of customers at the servers. Therefore, the expected number of busy
servers in steady state is λ/μ = sρ. Define random variable Bi as 1 if server i
is busy in steady state and 0 otherwise (i.e., server i is idle). Since the servers
are identical we define pb as the probability that a particular server is busy,
that is, P(Bi = 1) = pb . Also, E[Bi ] = pb for all i ∈ {1, 2, . . . , s}. We saw earlier
that E[B1 + B2 + · · · + Bs ] = λ/μ since the expected number of busy servers is
λ/μ. But E[B1 + B2 + · · · + Bs ] is also spb . Hence we have the probability that
a particular server is busy pb given by
pb = ρ.
Also, for the special single server case of s = 1, that is, G/G/1 queues, the
probability that the system is empty, p0 , is

$$p_0 = 1 - \rho.$$
For a G/G/s/K queue, the effective arrival rate, that is, the average number
of customers that enter the queueing system per unit time, is $\bar\lambda = \lambda(1 - \pi_K^*)$,
since a fraction π∗K of arrivals would be turned away due to a full system.
Also, note that the average rate of departure from the system after being
served is also $\bar\lambda$. Using $\bar\lambda$, we can write down Little's law as

$$L = \bar\lambda W. \tag{1.11}$$

Likewise,

$$W = W_q + \frac{1}{\mu} \qquad \text{and} \qquad L_q = \bar\lambda W_q.$$
πj = π∗j for all j, with both being equal to zero if j > K. Having said that, it is
crucial to point out that for most of the book we will mainly concentrate on
infinite capacity queues (with some exceptions especially in the very next
chapter) due to issues of practicality and ease of analysis. From a practi-
cal standpoint, if a queue actually has finite capacity but the capacity is
seldom reached, approximating the queue as an infinite capacity queue is
reasonable.
pj = π∗j .
This result is called PASTA (Poisson arrivals see time averages) since the
arrivals are seeing time-averaged quantities as described in Section 1.2.5.
The PASTA property can be used further to obtain relations between time
averages and ensemble averages. Recall the definition of L, which can also
be written as
$$L = \sum_{j=0}^{\infty} j\, p_j.$$

Similarly, the kth factorial moment of the number in the system in steady state can
be written as

$$L^{(k)} = k! \sum_{j=k}^{\infty} \binom{j}{k} p_j.$$
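As a concrete check of these formulas (an assumed example, not from the text), consider an M/M/1 queue, for which pj = (1 − ρ)ρ^j in steady state; its kth factorial moment then works out to k! ρ^k/(1 − ρ)^k.

```python
from math import comb, factorial

rho = 0.6
# M/M/1 steady-state distribution p_j = (1 - rho) * rho**j, truncated at a
# point where the neglected tail is numerically negligible.
p = [(1 - rho) * rho**j for j in range(400)]

L = sum(j * pj for j, pj in enumerate(p))  # L = sum_j j p_j = rho/(1-rho) = 1.5

def L_k(k):
    """k-th factorial moment L^(k) = k! * sum_{j>=k} C(j,k) p_j.
    math.comb(j, k) is 0 for j < k, so the full sum is safe."""
    return factorial(k) * sum(comb(j, k) * pj for j, pj in enumerate(p))
```

Here L_k(1) recovers L = 1.5, and L_k(2) equals 2! (ρ/(1 − ρ))² = 4.5, matching the closed form above.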
Also, let W (k) be the kth moment of the waiting time in steady state, that is,
Note that W (1) = W and L(1) = L. Of course using Little’s law we have L = λW.
But Little’s law can be extended for the M/G/s queue in the following
manner. For an M/G/s queue
probably much faster, but since the customer is kept occupied it does not
appear that way).
There are many such situations where humans perceive something as tak-
ing longer when it actually might be shorter. One such example is at fast-food
restaurants. While designing the waiting area, one is faced with choosing
whether to have one long serpentine waiting line or have one line in front of
each server. We will see later in this book that one serpentine line is better
than having one line in front of each server when it comes to minimizing the
expected time spent in the system. But then why do we see fast-food restau-
rants having a line in front of each server? One reason is that most people
feel happier to be on a shorter line than a longer line! Also, for example, if
there are three servers in a fast-food restaurant taking orders, then most peo-
ple feel better being second in line behind a server than being the fourth in
a long serpentine line, although in terms of getting out of the restaurant it
is better to be fourth in a serpentine line with three servers than second in
a line with one server. Even though this appears irrational, the perception
certainly matters while making design decisions.
Another aspect that is crucial for customer satisfaction is to reduce the
anxiety involved in waiting. Providing information, estimates, and guaran-
tees, as well as reducing uncertainties, goes a long way in terms of customer
satisfaction. Getting an estimate of how long the wait is going to be can
reduce the anxiety level. In many phone-based help desks one typically gets
a message saying, “your call will be answered in approximately 5 minutes.”
By saying that one typically does not get impatient for that 5 min. In most
restaurants when one waits for a table, the greeter usually gives an estimate
of how long the wait is. In theme parks one is usually provided information
such as “wait time is 45 minutes from this point.” These days, to avoid road
rage, on many city roads one sees estimated travel times displayed for var-
ious points of interest. It is also crucial to point out that in many instances
providing this waiting information is not only useful in terms of reducing
anxiety but it also enables the customer or user to make choices (such as
considering alternative routes when a road is congested).
In many instances, providing information to customers so that they could
make informed decisions actually improves the system performance. For
example, a note on the highway (or on the radio) informing drivers of an
accident would make some of them take alternate routes. This certainly alleviates the
congestion on the highway where the accident has taken place. Another
example is at theme parks (such as at DisneyWorld) where as a guest you
have the option of standing in a long line or picking up a token that gives
you another window of time to show up for the ride (which also ensures
short wait times). This is a win–win situation because it not only reduces
the agony of waiting and improves the experience of the guests as now they
can enjoy more things, but it also reduces bursts seen during certain times of
the day by spreading them out and thereby running the systems more effi-
ciently. This also allows for classifying customers and catering to their needs,
as opposed to using FCFS for all customers. By not forcing all the customers
to wait or to show up at times specified by the system, the system is able to
satisfy customers that prefer one option versus the other.
The last comment made in the previous sentence essentially states that
putting the onus on the customers to choose the class they want to belong to
makes the system appear more fair and not skewed toward the preference
of one type. The whole notion of fairness, especially in queues with human
entities, is a critical aspect. Customer satisfaction and frustration with waiting
can get worse if there is a perception of the system being unfair. Consider a
grocery store checkout line. Sometimes when a new counter opens up while
everyone is waiting in long lines, the manager invites customers arbitrar-
ily. A lot of customers consider that as unfair. In restaurants one tends to
get annoyed when someone that arrived later gets seated first, although
that might be for practical reasons such as a table of the right size becom-
ing available. But usually when such an unfair thing occurs, the agony of
waiting worsens. Therefore, most systems adopt FCFS as it is a common
notion of being fair. But again that has been questioned by many researchers.
Unfortunately, what is considered fair is completely in the mind of the one
experiencing the situation.
Speaking of situations, the same duration of wait could be tolerable in one
situation and unbearable in another, even for the same person. There are
many reasons for that difference. One has to do with
the customer’s expectations. If one waits longer than expected in one instance
and shorter in another, then although the actual wait times are the same,
the latter leaves the customer more satisfied. In fact, that is why services most often
overestimate the wait time while informing their customers. Also, whether
a wait time is considered acceptable or not depends on what one is waiting
for. There are many things that are considered “worth the wait,” especially
something precious. Of course as we described earlier, it also depends on
what the customer is doing while waiting, an engaged customer would per-
ceive the same wait time as shorter than when the same customer is idle.
There are things that also appeal logically; for example, if it takes 5 min to
download a very large file, that is alright but the same 5 min for a small web
page is ridiculous. In summary, understanding human nature is important
while making design and control decisions.
Human nature not only plays out while considering customer satisfaction
but also in terms of behavior. Balking is the behavior of arriving customers
deciding not to join the queue. Usually, the longer the line, the greater
the chance of balking. But that may not be true all the time, as sometimes a
longer line might imply a better experience! In fact, people balk at lines saying,
“I wonder why no one is here, maybe it is awful.” So it becomes important
to understand the balking behavior and rationale. The same applies to reneging,
which is abandoning a queue after waiting for some time without service
beginning. Again, understanding the behavior associated with reneging
can help develop appropriate models. It was observed that the longer
one waits, the lesser the propensity to renege. This is not intuitive because
one expects customers to wait for some time and become impatient, so with
time the reneging rate should have increased. In fact, systems like emer-
gency management (such as 9-1-1 calls in the United States) use an LCFS
policy because customers that are under true emergency situations tend to
hang up and try again persistently with high rates of reneging. However,
other callers behave in the opposite fashion, that is, wait patiently or renege
and not retry. By understanding the behavior of true emergency callers, a
system that prioritizes such callers without actually knowing their condition
can be built.
Customer behavior and customer satisfaction go hand in hand. Satis-
fied customers behave in a certain way and unsatisfied customers behave
in other ways. In other words, customer behavior is a reaction to customer
satisfaction (or dissatisfaction). On the flip side, for organizations that pro-
vide service, understanding customer behavior and being able to model it
goes a long way in providing customer satisfaction. There are three compo-
nents of customer satisfaction: (a) quality of service (QoS), (b) availability,
and (c) cost. A service system (with its limited resources) is depicted in
Figure 1.7. Customers arrive into such a system; if resources are available
they enter the system, obtain service for which they incur a cost,
and then leave the system. We define availability as the fraction of time
arriving customers enter the system. Thereby QoS is provided only for cus-
tomers that “entered” the system. From an individual customer’s standpoint,
the customer is satisfied if the customer’s requirements over time on QoS,
availability, and cost are satisfied.
The issue of QoS (sometimes called conditional QoS as the QoS is condi-
tioned upon the ability to enter the system) versus availability needs further
discussion. Consider the analogy of visiting a medical doctor. The ability
to get an appointment translates to availability; however, once an appoint-
ment is obtained, QoS pertains to the service rendered at the clinic such as
[FIGURE 1.7: Customer satisfaction in a service system: arriving customers either enter the service system (if resources are available) or are rejected; entering customers obtain service, make a payment, and exit. (From Gautam, N., Quality of Service Metrics. Frontiers in Distributed Sensor Networks, S.S. Iyengar and R.R. Brooks (eds.), Chapman & Hall/CRC Press, Boca Raton, FL, pp. 613–628, 2004. With permission.)]
waiting time and healing quality. Another analogy is airline travel. Getting
a ticket on an airline at a desired time from a desired source to a desired des-
tination is availability. QoS measures include delay, smoothness of flight,
in-flight service, etc. One of the most critical business decisions is to find the
right balance between availability and QoS. A service provider can increase
availability by decreasing QoS and vice versa. A major factor that could
affect QoS and availability is cost. As a final set of comments regarding cus-
tomer satisfaction, consider the relationship between customer satisfaction
and demand. If a service offers excellent customer satisfaction, very soon
its demand would increase. However, as the demand increases, the service
provider would no longer be able to maintain high customer satisfaction;
satisfaction eventually deteriorates, thereby decreasing demand. This is a cycle
one has to plan for carefully.
As a final comment, although this book mainly considers the physics of
queues, that is by no means the only way to design and operate systems. As
we saw in the examples given earlier, by considering psychological issues it
is certainly possible to alleviate anxiety and stress associated with waiting. In
fact, a holistic approach would combine physics and psychology of queues
to address design, control, and operational issues.
Reference Notes
There are several excellent books and papers on various aspects of theory
and applications of queueing models. The list is continuously growing and
something static such as this book may not be the best place for that list.
However, there is an excellent site maintained by Hlynka [54], which is a
phenomenal repository of queueing materials. The website includes a won-
derful historical perspective of queues and cites several papers that touch
upon the evolution of queueing theory. It also provides a large list of queue-
ing researchers, books, course notes, and software among other things such
as queueing humor.
This chapter as well as most of this book has been influenced by a subset
of those fantastic books in Hlynka [54]. In particular, the general results
based on queueing theory are from texts such as Kleinrock [63], Wolff [108],
Gross and Harris [49], Prabhu [89], and Medhi [80]. Then, the applications of
queues to computer and communication systems have predominantly been
influenced by Bolch et al. [12], Menasce and Almeida [81], and Gelenbe and
Mitrani [45]. Likewise, applications to production systems are mostly due to
Buzacott and Shanthikumar [15]. The theoretical underpinnings of this book
in terms of stochastic processes are mainly from Kulkarni [67], which is also
the source for many of the notations used in this chapter as well as others in
this book.
Exercises
1.1 Consider a doctor’s office where reps stop by according to a renewal
process with an inter-renewal time according to a gamma distribu-
tion with mean 25 days and standard deviation 2 days. Whenever a
rep stops, he or she drops off 10 samples of a medication. If a patient
needs the medication, the doctor gives one of the free samples if
available. Patients arrive at the doctor’s office needing medication
according to a Poisson process at an average rate of 1 per day. To
make the model more tractable, assume that all samples given dur-
ing a rep visit must be used or discarded before the next visit by the
rep. Model the number of usable samples at the doctor’s office as a
flow system. Is it stable? Derive expressions for the average input
rate , the time-averaged number of free samples in the doctor’s
office H, and the average number of days each sample stays at the
doctor’s office ?
1.2 Consider a G/D/1/2 queue that is empty at time t = 0. The first eight
arrivals occur at times t = 1.5, 2.3, 2.7, 3.4, 5.1, 5.2, 6.5, and 9.3. The
service times are constant and equal to 1 time unit. Draw a graph
of W(t) and X(t) from t = 0 to t = 6.5. Make sure to indicate arrival
times and departure times. What is the time-averaged workload as
well as number in the system from t = 0 to t = 6.5?
[FIGURE 1.8: Plot of X(t), the number in the system, versus t for a G/G/1 queue.]
1.3 Figure 1.8 represents the number in the system during the first few
minutes of a G/G/1 queue that started empty. On the figure, mark
A3 and D2 . Also, what is X(D2 +)? Derive the time-averaged number
in the system between t = 0 and t = D4 in terms of Ai and Di for
i = 1, 2, 3, 4.
1.4 An assembly line consists of three stages with a single machine at
each stage. Jobs arrive to the first stage one by one and randomly
with an average of one every 30 s. After processing is completed at the first
stage, the jobs get processed at the second stage and then the third stage before
departing. The first-stage machine can process 3 jobs per minute, the
second-stage machine can process 2.5 jobs per minute, and the third-
stage machine can process 2.25 jobs per minute. The average sojourn
times (including waiting and service) at stages 1, 2, and 3 were 1, 2,
and 4 min, respectively. What is the average number of jobs in the
system for the entire assembly line and at each stage? What is the
average time spent by each job in the system? What is the x-factor
for the entire system that is defined as the ratio of the average time
in the system to the average time spent processing for any job?
1.5 Consider a manual car wash station with three bays and no room
for waiting (assume that cars that arrive when all the three bays are
full leave without getting washed there). Cars arrive according to a
Poisson process at an average rate of 1 per minute but not all cars
enter. It is known that the long-run fraction of time there were 0, 1,
2, or 3 bays full are 6/16, 6/16, 3/16, and 1/16, respectively. What is
L for this system? What about W and Wq ? What is the average time
to wash a car?
1.6 Consider a factory floor with two identical machines and jobs arrive
externally at an average rate of one every 20 min. Each job is
processed at the first available machine and it takes an average of
30 min to process a job. The jobs leave the factory as soon as process-
ing is completed. The average work-in-process in the entire system
is 24/7. Compute the steady-state average throughput (number of
processed jobs exiting the system), cycle time (i.e., mean time in the
system), and the long-run fraction of time each machine is utilized.
1.7 Consider a production system as a “black box.” The system pro-
duces only one type of a customized item. The following informa-
tion is known: the cycle time (i.e., average time between when an
order is placed and a product is produced) for any item is 2 h and the
throughput is one item every 15 min on average. It is also known that
the average time spent on processing is only 40/3 min (the rest of the
cycle time is spent waiting). In addition, the standard deviation of
the processing times is also 40/3 min. What is the steady-state aver-
age number of products in the system? When the standard deviation
of the processing time was reduced, it was observed that the aver-
age number in the system became 7, but the throughput remained
the same. What is the new value for cycle time?
1.8 Two barbers own and operate a barber shop. They provide two
chairs for customers who are waiting for a haircut, so the number
of customers in the shop varies between 0 and 4. For n = 0, 1, 2, 3, 4,
the probability pn that exactly n customers are in the barber shop
in steady-state is p0 = 1/16, p1 = 4/16, p2 = 6/16, p3 = 4/16, and
p4 = 1/16.
(a) Calculate L and Lq .
(b) Given that an average of four customers per hour arrive
according to a Poisson process to receive a haircut, determine
W and Wq .
(c) Given that the two barbers are equally fast in giving haircuts,
what is the average duration of a haircut?
1.9 Consider a discrete time queue where time is slotted in minutes.
At the beginning of each minute, with probability p, a new cus-
tomer arrives into the queue, and with probability 1 − p, there are
no arrivals at the beginning of that minute. At the end of a minute
if there are any customers in the queue, one customer departs with
probability q. Also, with probability 1−q, no one departs a nonempty
queue at the end of that minute. Let Zn be the number of customers
in the system at the beginning of the nth minute before any arrivals
occur in that minute. Model {Zn , n ≥ 0} as a discrete time Markov
chain by drawing the transition diagram. What is the condition of
stability? Assuming stability, obtain the steady-state probabilities
for the number of customers in the system as seen by an arriving
customer.
1.10 For an M/M/2/4 system with potential arrivals according to a Pois-
son process with mean rate 3 per minute and mean service time
0.25 min, it was found that p4 = 81/4457 and L = 0.8212. What are
the values of W and Wq for customers that enter the system?
2
Exponential Interarrival and Service Times:
Closed-Form Expressions
The most fundamental queueing models, and perhaps the most researched
as well, are those that can be modeled as continuous time Markov chains
(CTMC). In this chapter, we are specifically interested in such queueing mod-
els for which we can obtain closed-form algebraic expressions for various
performance measures. A common framework for all these models is that
the potential customer arrivals occur according to a Poisson process with
parameter λ, and the service time for each server is according to exp(μ). Note
that the interarrival times for potential customers have an exp(λ) distribution,
but not all arriving customers may enter the system.
The methods to analyze such queues to obtain closed-form expressions
for performance measures essentially amount to solving the CTMC balance
equations. The methods can be broadly classified into three categories. The first
category is a network graph technique that uses flow balance across arcs
called arc cuts on the CTMC rate diagram. Then, there are some instances
where it is difficult to solve the balance equations using arc cuts for which
generating functions would be more appropriate. Finally, in some instances
where neither arc cuts nor generating functions work, it may be possible
to take advantage of a property known as reversibility to obtain closed-form
expressions for the performance measures.
In the next three sections, we describe those three methods with some
examples. It is crucial to realize that the scenarios are indeed examples, but
the methodologies can be used in many more instances. In fact, the focus
of this chapter (and the entire book) is to explain the methodologies using
examples and not showcase the results for various examples of queues. In
other words, we would like to focus on the means and not the ends. There
are several wonderful texts that provide results for every possible queueing
system. Here we concentrate on the techniques used to get those results.
Since the methods in this chapter are based on CTMC analysis, we
explain the basics of that first. Consider an arbitrary irreducible CTMC
{Z(t), t ≥ 0} with state space S and infinitesimal generator matrix Q.
Sometimes Q is also called the rate matrix. In order to obtain the steady-state
probability row vector p = [p_i], we solve pQ = 0 and normalize using
Σ_{i∈S} p_i = 1. The set of equations pQ = 0 is called the balance equations and
can be written for every i ∈ S as

p_i q_{ii} + Σ_{j≠i} p_j q_{ji} = 0.

If we draw the rate diagram, then what is immediately obvious is that since
q_{ii} = −Σ_{j≠i} q_{ij}, we have

Σ_{j≠i} p_i q_{ij} = Σ_{j≠i} p_j q_{ji}.

This means that across each node i, the flow out equals the flow in.
Many times it is not straightforward to solve the balance equations
directly. The next three sections present various simplifying techniques to
solve them and also use the results obtained for various other performance
metrics besides the steady-state probability distribution.
FIGURE 2.1
Arc cut example (rate diagram on states 0, 1, . . . , 5 with transition rates α, β, γ, δ).
The balance equations pQ = 0, that is,

[p_0 p_1 p_2 p_3 p_4 p_5] ⎡ −α−γ    α      γ      0      0      0  ⎤
                          ⎢   β    −β      0      0      0      0  ⎥
                          ⎢   0     δ   −α−γ−δ    α      γ      0  ⎥
                          ⎢   0     0      β     −β      0      0  ⎥
                          ⎢   0     0      0      δ    −α−δ     α  ⎥
                          ⎣   0     0      0      0      β     −β  ⎦
= [0 0 0 0 0 0]
can be manipulated to get the flow balance across the arc cut resulting in
Equation 2.1. It is crucial to understand that the arc cut made earlier is just to
illustrate the theory. In practice, the cut chosen in the example would not be
a good one to use. The objective of these cuts is to write down relationships
between the unknowns so that the unknowns can be solved easily. Therefore,
cuts that result in two or three pi terms would be ideal to use. To illustrate
that, we present the following example problem.
Problem 6
For the CTMC with rate diagram in Figure 2.1, compute the steady-state
probabilities p0 , p1 , p2 , p3 , p4 , and p5 by making appropriate arc cuts and
solving the balance equations.
Solution
For this example, by making five suitable cuts (which ones?), we get the
following relationships
(α + γ)p_0 = βp_1
γp_0 = δp_2
(α + γ)p_2 = βp_3
γp_2 = δp_4
αp_4 = βp_5

Solving these in terms of p_0, we get

p_1 = ((α + γ)/β) p_0,
p_2 = (γ/δ) p_0,
p_3 = ((α + γ)γ/(δβ)) p_0,
p_4 = (γ²/δ²) p_0,
p_5 = (γ²α/(δ²β)) p_0.
Note how much simpler this is compared to solving the node balance
equations. Once we know p0 , we have an expression for all pi . Now, by solv-
ing for p0 using the normalizing relation p0 + p1 + p2 + p3 + p4 + p5 = 1,
we get
p_0 = 1 / [1 + (α + γ)/β + γ/δ + (α + γ)γ/(δβ) + γ²/δ² + γ²α/(δ²β)].
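As a quick numerical check (not part of the original development — the rate values and function name are illustrative), the arc cut solution can be verified against the full balance equations pQ = 0:

```python
from fractions import Fraction

def arc_cut_probs(a, b, g, d):
    """Steady-state probabilities for the CTMC of Figure 2.1, built from
    the arc cut relations (each p_i in terms of p0) and then normalized."""
    p = [Fraction(1),                       # p0 (unnormalized)
         Fraction(a + g, b),                # p1 = (alpha+gamma)/beta * p0
         Fraction(g, d),                    # p2 = gamma/delta * p0
         Fraction((a + g) * g, d * b),      # p3
         Fraction(g * g, d * d),            # p4
         Fraction(g * g * a, d * d * b)]    # p5
    total = sum(p)
    return [x / total for x in p]

# Arbitrary illustrative rates alpha, beta, gamma, delta
a, b, g, d = 2, 3, 1, 4
p = arc_cut_probs(a, b, g, d)

# Full generator matrix of the rate diagram; pQ = 0 must hold exactly.
Q = [[-a - g, a, g, 0, 0, 0],
     [b, -b, 0, 0, 0, 0],
     [0, d, -a - g - d, a, g, 0],
     [0, 0, b, -b, 0, 0],
     [0, 0, 0, d, -a - d, a],
     [0, 0, 0, 0, b, -b]]
balanced = all(sum(p[i] * Q[i][j] for i in range(6)) == 0 for j in range(6))
print(p[0], balanced)
```

Exact rational arithmetic makes the check pQ = 0 hold exactly rather than only to floating-point tolerance.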
A word of caution is that when the cut set A is separated from the state
space S, all the arcs going from A to S − A must be considered. A good
way to make sure of that is to clearly identify the cut set A as opposed to
just performing a cut arbitrarily. Next, we use the arc cut method to obtain
steady-state distributions of the number in the system for a specific class of
queueing systems.
q_{ij} = ⎧ λ     if j = i + 1 and i < K,
         ⎪ iμ    if j = i − 1 and 0 < i < s,
         ⎨ sμ    if j = i − 1 and s ≤ i ≤ K,
         ⎪ −min(1, K − i)λ − min(i, s)μ   if j = i,
         ⎩ 0     otherwise.
λp_0 = μp_1
λp_1 = 2μp_2
λp_2 = 3μp_3
⋮
λp_{s−1} = sμp_s
λp_s = sμp_{s+1}
⋮
λp_{K−1} = sμp_K
FIGURE 2.2
Rate diagram for the M/M/s/K queue (states 0, 1, . . . , K; arrivals at rate λ; service rates μ, 2μ, . . . , sμ, remaining at sμ through state K).
where p_j for any j ∈ S is the probability that there are j customers in the
system in steady state, that is, p_j = lim_{t→∞} P{X(t) = j}.
In addition, since the CTMC is ergodic, p_j is also the long-run fraction of time
the system has j customers, that is,

p_j = lim_{T→∞} (1/T) ∫_0^T I{X(t)=j} dt.
Using the normalizing equation Σ_{i=0}^{K} p_i = 1, we get
p_0 = [ Σ_{n=0}^{s} (1/n!)(λ/μ)^n + ((λ/μ)^s/s!) Σ_{n=s+1}^{K} ρ^{n−s} ]^{−1}

where

ρ = λ/(sμ).
The probability that an arriving customer in steady state is blocked and
turned away is

p_K = ((λ/μ)^K / (s! s^{K−s})) p_0,

since p_K is the probability that in steady state there are K customers in the
system and the arrivals are according to a Poisson process (due to the PASTA
property). Also note that customers are rejected at an average rate of λp_K,
and the mean rate at which customers enter the queue is λ(1 − p_K).
Another performance metric that we can quickly obtain is the long-run
average number of customers in the system that are waiting for their service
to begin (Lq ). Using the definition of Lq and the pj values, we can derive
L_q = Σ_{j=0}^{K} p_j max(j − s, 0) = [p_0 (λ/μ)^s ρ / (s!(1 − ρ)²)] [1 − ρ^{K−s} − (K − s)ρ^{K−s}(1 − ρ)].
W_q = L_q / (λ(1 − p_K)),

W = L_q / (λ(1 − p_K)) + 1/μ,

L = L_q + λ(1 − p_K)/μ,
using the standard system analysis results via Little’s law and the definitions
(see Equations 1.8 through 1.10). Note that for Little’s law, we use the enter-
ing rate λ(1 − pK ) and not the arrival rate because not all arriving customers
enter the system.
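To make these formulas concrete, here is a short computational sketch (the function name is mine); with λ = 3, μ = 4, s = 2, and K = 4 it reproduces the values quoted in Exercise 1.10, namely p_4 = 81/4457 and L ≈ 0.8212:

```python
import math

def mmsk_metrics(lam, mu, s, K):
    """Steady-state distribution and mean performance measures of an
    M/M/s/K queue, from the closed-form results derived above."""
    rho = lam / (s * mu)
    coef = [(lam / mu) ** n / math.factorial(n) for n in range(s + 1)]
    coef += [(lam / mu) ** s / math.factorial(s) * rho ** (n - s)
             for n in range(s + 1, K + 1)]
    p0 = 1.0 / sum(coef)
    p = [c * p0 for c in coef]
    Lq = sum(max(j - s, 0) * p[j] for j in range(K + 1))
    lam_eff = lam * (1 - p[K])      # entering rate: blocked arrivals are lost
    Wq = Lq / lam_eff               # Little's law on the waiting line
    W = Wq + 1 / mu
    L = Lq + lam_eff / mu
    return p, L, Lq, W, Wq

p, L, Lq, W, Wq = mmsk_metrics(lam=3.0, mu=4.0, s=2, K=4)
print(round(p[4], 6), round(L, 4))
```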
In a similar fashion, using pj values, it would be possible to derive higher
moments of the steady-state number in the system and number waiting to be
served. See Exercise 2.1 at the end of this chapter for one such computation.
However, computing higher moments of the time in the system and time in
the queue in steady state for entering customers is a little more tricky. For
that we first derive the steady-state distribution. Let Yq and Y be random
variables that respectively denote the time spent by an entering customer in
the queue and in the system (including service). To obtain the cumulative
distribution function (CDF) of Yq and Y, we require that the entities in the sys-
tem are served according to FCFS. The analysis until now did not require FCFS,
and for any work-conserving discipline, the results would continue to hold.
However, for the following analysis, we specifically take the default FCFS
condition into account. Having said that, it is worthwhile to mention that
other service disciplines can also be considered, but we will only consider
FCFS here.
To obtain the CDF of Yq that we denote as FYq (t), we first reiterate that Yq
is in fact a conditional random variable. In other words, it is the time spent
waiting to begin service for a customer that entered the system in steady
state, that is, given that there were less than K in the system upon arrival
in steady state. Since the arrivals are according to a Poisson process, due to
PASTA the probability that an arriving customer in steady state will see j in
the system is pj . Also, the probability that an entering customer in steady state
would see j others in the system is the probability that an arriving customer
would see j others in the system given that there are less than K in the system.
Using a relatively straightforward conditional probability argument, one can
show that the probability that an entering customer will see j in the system
in steady state is pj /(1 − pK ) for j = 0, 1, . . . , K − 1.
Also, if an entering customer sees j in the system, the time this customer
spends before service begins is 0 if 0 ≤ j < s (why?) and an Erlang(j − s + 1, sμ)
random variable if s ≤ j < K. The explanation for the Erlang part is that
all s servers remain busy throughout the time the entering customer waits
before service, and this waiting time corresponds to j − s + 1 service completions.
However, since each service completion corresponds to the minimum of s
random variables that are according to exp(μ), service completions occur
according to exp(sμ) (due to minimum of exponentials property). Further, since
the sum of j − s + 1 exp(sμ) random variables is an Erlang(j − s + 1, sμ)
random variable (due to the sum of independently and identically distributed
[IID] exponentials property), we get the desired result. Thus, using the defini-
tion and CDF of the Erlang random variable, we can derive the following
by conditioning on j customers in the system upon entering in steady
state:
F_{Yq}(t) = 1 − Σ_{j=s}^{K−1} [p_j/(1 − p_K)] Σ_{r=0}^{j−s} e^{−sμt} (sμt)^r / r!.
Once the CDF of Y_q is known, F_Y(t), the CDF of Y, can be obtained using
the fact that Y − Y_q is an exp(μ) random variable, corresponding to the service
time of this entering customer. Therefore, we have for K > s > 1 the CDF as

F_Y(t) = P{Y ≤ t} = ∫_0^t F_{Yq}(t − u) μe^{−μu} du = ∫_0^t (1 − e^{−μ(t−u)}) dF_{Yq}(u) + (1 − e^{−μt}) F_{Yq}(0).
Substituting the expression for F_{Yq}(·) and carrying out the integration
(each term reduces to an incomplete gamma integral of the form
∫_0^t v^r e^{−(s−1)μv} dv), the CDF simplifies to

F_Y(t) = 1 − e^{−μt} − (e^{−μt}/(s − 1)) Σ_{j=s}^{K−1} [p_j/(1 − p_K)] Σ_{r=0}^{j−s} (s/(s − 1))^r [1 − e^{−(s−1)μt} Σ_{i=0}^{r} ((s − 1)μt)^i / i!].
Note here that since Y_q is a random variable with mass at 0, one has to be
additionally careful with the convolution, realizing that F_{Yq}(0) is nonzero.
Also, when K = s, F_Y(t) = 1 − e^{−μt} since the sojourn time equals the service
time for entering customers. The case s = 1 can be addressed by rederiving the
integral using s = 1.
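The waiting-time and sojourn-time CDFs can be sanity-checked numerically. In the sketch below (parameters arbitrary with K > s > 1; all names are mine), F_Y is computed both from a closed-form expression of the type derived above and, independently, by numerically convolving F_{Yq} with the exp(μ) service time; the two evaluations should agree.

```python
import math

LAM, MU, S, K = 3.0, 4.0, 2, 4      # arbitrary example with K > s > 1

def mmsk_probs(lam, mu, s, K):
    """Steady-state probabilities p_0, ..., p_K of the M/M/s/K queue."""
    r = lam / mu
    coef = [r ** n / math.factorial(n) for n in range(s + 1)]
    coef += [r ** s / math.factorial(s) * (lam / (s * mu)) ** (n - s)
             for n in range(s + 1, K + 1)]
    p0 = 1.0 / sum(coef)
    return [c * p0 for c in coef]

P = mmsk_probs(LAM, MU, S, K)

def F_Yq(t):
    """CDF of the waiting time before service of an entering customer (FCFS)."""
    smu = S * MU
    tail = sum(P[j] / (1 - P[K]) *
               sum(math.exp(-smu * t) * (smu * t) ** r / math.factorial(r)
                   for r in range(j - S + 1))
               for j in range(S, K))
    return 1.0 - tail

def F_Y(t):
    """Closed-form sojourn time CDF (requires K > s > 1)."""
    a = (S - 1) * MU
    acc = sum(P[j] / (1 - P[K]) * (S / (S - 1)) ** r *
              (1 - math.exp(-a * t) *
               sum((a * t) ** i / math.factorial(i) for i in range(r + 1)))
              for j in range(S, K) for r in range(j - S + 1))
    return 1 - math.exp(-MU * t) - math.exp(-MU * t) / (S - 1) * acc

def F_Y_conv(t, n=2000):
    """Same CDF from the convolution of F_Yq with the exp(MU) service time."""
    h = t / n
    g = [F_Yq(t - i * h) * MU * math.exp(-MU * i * h) for i in range(n + 1)]
    return h * (0.5 * g[0] + sum(g[1:n]) + 0.5 * g[n])

print(round(F_Y(1.0), 5), round(F_Y_conv(1.0), 5))
```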
Alternatively, one can work with the Laplace–Stieltjes transform (LST) of Y,
defined as

F̃_Y(w) = E[e^{−wY}] = ∫_0^∞ e^{−wu} dF_Y(u).
Once we know the LST, there are several techniques to invert it to obtain the
CDF FY (·), for example, direct computation by looking up tables, converting
to Laplace transform (LT) and inverting it, or by numerical transform inver-
sion. However, moments of Y can easily be obtained by taking derivatives.
With that understanding, the LST can be computed using the definition of Y
(as opposed to taking the LST of FY (·)) as
F̃_Y(w) = Σ_{j=0}^{s−1} [p_j/(1 − p_K)] μ/(μ + w) + Σ_{j=s}^{K−1} [p_j/(1 − p_K)] [μ/(μ + w)] [sμ/(sμ + w)]^{j−s+1}

= Σ_{j=0}^{s−1} [p_j/(1 − p_K)] μ/(μ + w) + [p_0/(1 − p_K)] [μ/(μ + w)] (1/s!) [sμ/(sμ + w)] (λ/μ)^s × [1 − (λ/(sμ + w))^{K−s}] / [1 − λ/(sμ + w)].
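As a numerical illustration of extracting moments from the LST (a sketch with illustrative parameters; names are mine), E[Y] = −dF̃_Y/dw at w = 0 can be estimated by a central difference and compared with W obtained from L_q and Little's law:

```python
import math

LAM, MU, S, K = 3.0, 4.0, 2, 4      # arbitrary illustrative parameters

# Steady-state probabilities of the M/M/s/K queue (recomputed here so the
# sketch is self-contained).
r = LAM / MU
coef = [r ** n / math.factorial(n) for n in range(S + 1)]
coef += [r ** S / math.factorial(S) * (LAM / (S * MU)) ** (n - S)
         for n in range(S + 1, K + 1)]
P = [c / sum(coef) for c in coef]

def lst_Y(w):
    """LST of the sojourn time Y of an entering customer."""
    q = [pj / (1 - P[K]) for pj in P]
    val = sum(q[j] * MU / (MU + w) for j in range(S))
    val += sum(q[j] * (MU / (MU + w)) * (S * MU / (S * MU + w)) ** (j - S + 1)
               for j in range(S, K))
    return val

h = 1e-4
EY = (lst_Y(-h) - lst_Y(h)) / (2 * h)   # E[Y] = -dF~_Y/dw at w = 0

# W from L_q and Little's law, for comparison
Lq = sum(max(j - S, 0) * P[j] for j in range(K + 1))
W = Lq / (LAM * (1 - P[K])) + 1 / MU
print(round(EY, 6), round(W, 6))
```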
Problem 7
Using the same notations as the M/M/s/K queue earlier, derive the distribu-
tion for the number of entities in the system and also the sojourn time in the
system for the classic M/M/1 queue. Do any conditions need to be satisfied?
Obtain expressions for L, W, Lq , and Wq .
Solution
The M/M/1 queue is a special case of the M/M/s/K queue with s = 1 and
K = ∞. Most of the performance measures can be obtained by letting s = 1
and K = ∞ in the M/M/s/K analysis. Hence, unless necessary, the results
will not be derived, but the reader is encouraged to verify them. However,
there is one issue. While letting K = ∞, we need to ensure that the CTMC
that models the number of customers in the system for the M/M/1 queue is
positive recurrent. The condition for positive recurrence (and hence stability
of the system) is
ρ = λ/μ < 1.
In other words, the average arrival rate (λ) must be smaller than the aver-
age service rate (μ) to ensure stability. This is intuitive because in order for
the system to be stable, the server should be able to remove customers on
average faster than the speed at which they enter. Note that when K is finite,
instability is not an issue.
The long-run probability that the number of customers in the system is j
(for all j ≥ 0) is given by p_j = ρ^j p_0, which can be obtained by writing the
balance equations for p_j in terms of p_0. Now the normalizing equation
Σ_j p_j = 1
can be solved only when ρ < 1, and hence this is called the condition for
positive recurrence. Therefore, when ρ < 1, we have
p_0 = 1 − ρ

and

L = λ/(μ − λ),

L_q = λ²/(μ(μ − λ)),

W = 1/(μ − λ),

W_q = λ/(μ(μ − λ)).
It is crucial to realize that all these results require that ρ < 1. Also note that
while using Little’s law, one can use λ as the entering rate as no customers
are going to be turned away. Besides these performance metrics, one can also
derive higher moments of the steady-state number in the system. However,
to obtain the higher moments of the time spent in the system by a customer
arriving at steady state, one technique is to use the distribution of the time
in the system Y. By letting s = 1, K = ∞, and using pj = (1 − λ/μ)(λ/μ)j in the
M/M/s/K results, we get the LST after some algebraic manipulation as
F̃_Y(w) = (μ − λ)/(μ − λ + w),

which is the LST of an exponential random variable with parameter μ − λ; hence

Y ∼ exp(μ − λ).
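A minimal numerical sketch of the M/M/1 results (illustrative λ and μ; the function name is mine): the closed-form L should match the series Σ_j j p_j with p_j = (1 − ρ)ρ^j, and the measures should satisfy Little's law.

```python
def mm1_metrics(lam, mu):
    """Closed-form M/M/1 measures; requires rho = lam/mu < 1."""
    assert lam < mu, "unstable: requires lam < mu"
    L = lam / (mu - lam)
    Lq = lam ** 2 / (mu * (mu - lam))
    W = 1 / (mu - lam)
    Wq = lam / (mu * (mu - lam))
    return L, Lq, W, Wq

lam, mu = 3.0, 4.0
L, Lq, W, Wq = mm1_metrics(lam, mu)

# Cross-check L against the distribution p_j = (1 - rho) * rho**j
rho = lam / mu
L_series = sum(j * (1 - rho) * rho ** j for j in range(2000))
print(L, L_series)
```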
Problem 8
Using the same notations as the M/M/s/K described earlier in this section,
derive the distribution for the number of entities in the system and also the
sojourn time in the system for the multiserver M/M/s queue. What is the
stability condition that needs to be satisfied? Obtain expressions for L, W, Lq ,
and Wq .
Solution
The M/M/s queue is a special case of the M/M/s/K queue with K = ∞.
Most of the performance measures can be obtained by letting K = ∞ in the
M/M/s/K analysis; hence unless necessary the results will not be derived
but the reader is encouraged to verify them. Similar to the M/M/1 queue,
here too we need to be concerned about stability while letting K = ∞. The
condition for stability is
ρ = λ/(sμ) < 1.
In other words, the average arrival rate (λ) must be smaller than the average
service capacity (sμ) across all servers to ensure stability.
By writing down the balance equations for the CTMC or by letting K = ∞
in the M/M/s/K results, we can derive the long-run probability that the
number of customers in the system is j (when ρ < 1) as
p_j = ⎧ (1/j!)(λ/μ)^j p_0           if 0 ≤ j ≤ s − 1,
      ⎩ (1/(s! s^{j−s}))(λ/μ)^j p_0   if j ≥ s,

where

p_0 = [ Σ_{n=0}^{s−1} (1/n!)(λ/μ)^n + ((λ/μ)^s/s!) · 1/(1 − λ/(sμ)) ]^{−1}.
Either using pj from the previous equation (and using Little’s law wher-
ever needed) or by letting K = ∞ in the M/M/s/K results, we have
L_q = p_0 (λ/μ)^s λ / (s! sμ [1 − λ/(sμ)]²),

L = λ/μ + p_0 (λ/μ)^s λ / (s! sμ [1 − λ/(sμ)]²),

W = 1/μ + p_0 (λ/μ)^s / (s! sμ [1 − λ/(sμ)]²),

W_q = p_0 (λ/μ)^s / (s! sμ [1 − λ/(sμ)]²).
Likewise, letting K = ∞ in the M/M/s/K sojourn-time results, the LST of Y is

F̃_Y(w) = Σ_{j=0}^{s−1} p_j μ/(μ + w) + p_0 [μ/(μ + w)] (1/s!) [sμ/(sμ + w − λ)] (λ/μ)^s.

To invert this LST, use the partial-fraction expansion

1/[(μ + w)(sμ + w − λ)] = 1/[(μ + w)(sμ − μ − λ)] − 1/[(sμ − λ − μ)(sμ + w − λ)].

Inverting term by term (assuming (s − 1)μ ≠ λ) yields, for y ≥ 0,

F_Y(y) = Σ_{j=0}^{s−1} p_j (1 − e^{−μy}) + p_0 [(λ/μ)^s/s!] [ (sμ/((s − 1)μ − λ)) (1 − e^{−μy}) − (sμ²/((sμ − λ)[(s − 1)μ − λ])) (1 − e^{−(sμ−λ)y}) ].

The reader is encouraged to verify that E[Y] results in the expression
for W, and that by letting s = 1 we get Y ∼ exp(μ − λ), the M/M/1
sojourn time expression.
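As a numerical check of the M/M/s sojourn-time distribution (a sketch with arbitrary parameters satisfying (s − 1)μ ≠ λ; names are mine), one can integrate the tail of the CDF: since E[Y] = ∫_0^∞ (1 − F_Y(y)) dy, the result should equal W.

```python
import math

def mms_FY(y, lam, mu, s):
    """Sojourn time CDF of a stable M/M/s queue ((s-1)*mu != lam)."""
    r = lam / mu
    p0 = 1.0 / (sum(r ** n / math.factorial(n) for n in range(s)) +
                r ** s / math.factorial(s) / (1 - lam / (s * mu)))
    head = sum(r ** j / math.factorial(j) * p0 for j in range(s))  # p_0+..+p_{s-1}
    B = p0 * r ** s / math.factorial(s)
    return (head * (1 - math.exp(-mu * y))
            + B * s * mu / ((s - 1) * mu - lam) * (1 - math.exp(-mu * y))
            - B * s * mu ** 2 / ((s * mu - lam) * ((s - 1) * mu - lam))
              * (1 - math.exp(-(s * mu - lam) * y)))

lam, mu, s = 3.0, 2.0, 2          # rho = 3/4; note (s-1)*mu = 2 != lam = 3
# E[Y] = integral of the tail 1 - F_Y, by the midpoint rule
n, T = 20000, 40.0
h = T / n
EY = h * sum(1 - mms_FY((i + 0.5) * h, lam, mu, s) for i in range(n))

r = lam / mu
p0 = 1.0 / (sum(r ** k / math.factorial(k) for k in range(s)) +
            r ** s / math.factorial(s) / (1 - lam / (s * mu)))
W = 1 / mu + p0 * r ** s / (math.factorial(s) * s * mu * (1 - lam / (s * mu)) ** 2)
print(round(EY, 4), round(W, 4))
```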
Problem 9
Using the same notations as the M/M/s/K described earlier in this section,
derive the distribution for the number of entities in the system and also the
sojourn time in the system for the single-server finite capacity M/M/1/K
queue. What is the rate at which customers enter the system on an average?
Obtain expressions for L, W, Lq , and Wq .
Solution
The M/M/1/K queue is a special case of the M/M/s/K system with s = 1.
All the results presented here are obtained by letting s = 1 for the corre-
sponding M/M/s/K results. We define ρ = λ/μ; however, since K is finite ρ
can be greater than one and the system would still be stable. The steady-
state probability that there are j customers in the system (for 0 ≤ j ≤ K) is
given by
p_j = ρ^j (1 − ρ) / (1 − ρ^{K+1}).
In particular, the fraction of arrivals that are turned away due to a full
system is
p_K = ρ^K (1 − ρ) / (1 − ρ^{K+1}).
The average number of customers in the system is

L = [Kρ^{K+2} − (K + 1)ρ^{K+1} + ρ] / [(1 − ρ)(1 − ρ^{K+1})].
The LST of the sojourn time of an entering customer reduces to

F̃_Y(w) = Σ_{j=0}^{K−1} [p_j/(1 − p_K)] (μ/(μ + w))^{j+1}.
This LST can be inverted using the fact that (μ/(μ + w))^{j+1} is the LST of an
Erlang(j + 1, μ) distribution to get

F_Y(y) = P{Y ≤ y} = Σ_{j=0}^{K−1} [p_j/(1 − p_K)] (1 − e^{−μy} Σ_{r=0}^{j} (μy)^r / r!)
for y ≥ 0. The reader is encouraged to cross-check all the results with the
M/M/1 queue by letting K → ∞ and assuming ρ < 1.
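A short sketch of the M/M/1/K results (illustrative parameters and function name; note that ρ > 1 is allowed because the buffer is finite). Letting K grow with ρ < 1 should recover the M/M/1 value L = λ/(μ − λ):

```python
def mm1k_metrics(lam, mu, K):
    """M/M/1/K steady-state distribution and mean measures (rho != 1)."""
    rho = lam / mu
    norm = (1 - rho) / (1 - rho ** (K + 1))
    p = [rho ** j * norm for j in range(K + 1)]
    L = (K * rho ** (K + 2) - (K + 1) * rho ** (K + 1) + rho) \
        / ((1 - rho) * (1 - rho ** (K + 1)))
    lam_eff = lam * (1 - p[K])      # entering rate
    W = L / lam_eff                 # Little's law
    Wq = W - 1 / mu
    Lq = lam_eff * Wq
    return p, L, Lq, W, Wq

p, L, Lq, W, Wq = mm1k_metrics(lam=5.0, mu=4.0, K=10)   # rho = 1.25 > 1 is fine
print(round(p[10], 4), round(L, 4))
```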
Problem 10
Using the same notations as the M/M/s/K described earlier in this section,
derive the distribution for the number of entities in the queue-less M/M/s/s
system. What if s = ∞? Is it trivial to obtain the sojourn time distribu-
tions for customers that enter the system? Obtain expressions for L, W, Lq ,
and Wq .
Solution
The last of the special cases of the M/M/s/K system is the case when s = K,
which gives rise to the M/M/s/s system. Note that no customers wait for
service. If there is a server available, an arriving customer enters the system
otherwise the customer is turned away. This is also known as the Erlang loss
system. Similar to the previous special cases, here too one can either work
with the M/M/s/K system letting s = K or start with the CTMC.
The probability that there are j (for j = 0, . . . , s) customers in the system in
steady state is
p_j = [(λ/μ)^j / j!] / [Σ_{i=0}^{s} (λ/μ)^i / i!].
Therefore, the Erlang loss formula is the probability that a customer arriving
to the system in steady state is rejected (or the fraction of arriving customers
that are lost in the long run) and is given by
p_s = [(λ/μ)^s / s!] / [Σ_{i=0}^{s} (λ/μ)^i / i!].
Although we do not derive the result here (see Section 4.5.3), it is worthwhile
to point out the remarkable fact that the previous formulae hold even for the
M/G/s/s system with mean service time 1/μ. In other words, the steady-
state distribution of the number in the system for an M/G/s/s queue does
not depend on the distribution of the service time.
Using the steady-state number in the system, we can derive
L = (λ/μ)(1 − p_s).
Since the effective entering rate into the system is λ(1 − ps ), we get W = 1/μ.
This is intuitive because for customers that enter the system, since there is no
waiting for service, the average sojourn time is indeed the average service
time. For the same reason, the sojourn time distribution for customers that
enter the system is the same as that of the service time, that is, Y ∼ exp(μ). In
addition, since there is no waiting for service, Lq = 0 and Wq = 0.
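In practice the Erlang loss probability is usually computed with the stable recursion B(0) = 1, B(n) = aB(n − 1)/(n + aB(n − 1)) with offered load a = λ/μ, rather than with factorials, which overflow for large s. This recursion is standard but is not derived in the text; the sketch below checks it against the direct formula for a small example:

```python
import math

def erlang_b(a, s):
    """Erlang loss (blocking) probability p_s for offered load a = lam/mu,
    via the stable recursion B(0) = 1, B(n) = a*B(n-1)/(n + a*B(n-1))."""
    B = 1.0
    for n in range(1, s + 1):
        B = a * B / (n + a * B)
    return B

a, s = 8.0, 10
ps = erlang_b(a, s)
# Direct formula from the text (fine for small s; factorials overflow for large s)
direct = (a ** s / math.factorial(s)) / sum(a ** i / math.factorial(i)
                                            for i in range(s + 1))
L = a * (1 - ps)      # mean number of busy servers
print(round(ps, 6), round(L, 4))
```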
It is customary to consider a special case of the M/M/s/s system, which is
when s = ∞. We call that the M/M/∞ queue since K takes the default value of
infinity. For the M/M/∞ system, the probability that there are j customers in
the system in the long run is
p_j = (1/j!)(λ/μ)^j e^{−λ/μ}.
Having described the M/M/s/K queue and its special cases in detail, next
we move to other CTMC-based queueing systems for which arc cuts are
inadequate and we demonstrate other methodological tools.
be easily solved using arc cuts) for illustration purposes, followed by some
detailed ones.
Problem 11
Consider a CTMC with S = {0, 1, 2, 3, . . .} for which we are interested in
obtaining the steady-state probabilities p0 , p1 , . . . represented using a gen-
erating function. For all i ∈ S and j ∈ S, let the elements of the infinitesimal
generator matrix Q be
q_{ij} = ⎧ α         if j = i + 1,
         ⎪ −α − iβ   if j = i,
         ⎨ iβ        if j = i − 1 and i > 0,
         ⎩ 0         otherwise.
Obtain an expression for the generating function

Ψ(z) = Σ_{i=0}^{∞} p_i z^i.
Solution
The balance equations can be written for i > 0 as

p_i(α + iβ) = p_{i−1}α + p_{i+1}(i + 1)β,

and for i = 0 as

−p_0 α + p_1 β = 0. (2.3)
Multiplying the equation for state i by z^i and summing over all i, we get

Σ_{i=0}^{∞} p_i z^i (α + iβ) = Σ_{i=0}^{∞} (p_i z^i αz + i p_i z^{i−1} β).

If this derivation is not clear, it may be better to write down the balance equations
for i = 0, 1, 2, 3, . . ., multiply them by 1, z, z², z³, . . ., and then see how
that results in the previous equation. We can rewrite the previous equation
in terms of Ψ(z) as

αΨ(z) + βzΨ′(z) = αzΨ(z) + βΨ′(z),
where Ψ′(z) = dΨ(z)/dz. Upon rearranging terms, we get the differential
equation

Ψ′(z) = (α/β) Ψ(z),

which can be solved by dividing the equation by Ψ(z) and integrating both
sides with respect to z to get

log(Ψ(z)) = (α/β) z + c.

Using the condition Ψ(1) = 1 (why?), the constant c can be obtained as equal
to −α/β. Thus, we obtain the generating function Ψ(z) as

Ψ(z) = e^{(α/β)(z−1)},

which is the generating function of a Poisson distribution with mean α/β;
hence p_i = e^{−α/β}(α/β)^i / i! for i ≥ 0.
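The derivation yields Ψ(z) = e^{(α/β)(z−1)}, the generating function of a Poisson distribution with mean α/β. As a quick check (illustrative rates; not part of the text), the Poisson probabilities indeed satisfy the balance equations of this CTMC:

```python
import math

alpha, beta = 3.0, 2.0     # illustrative rates
m = alpha / beta           # mean of the resulting Poisson distribution

# Coefficients of Psi(z) = exp((alpha/beta)(z - 1)): Poisson(m) probabilities
p = [math.exp(-m) * m ** i / math.factorial(i) for i in range(60)]

# Balance: p_i (alpha + i*beta) = p_{i-1} alpha + p_{i+1} (i+1) beta
balanced = all(
    abs(p[i] * (alpha + i * beta)
        - (p[i - 1] * alpha + p[i + 1] * (i + 1) * beta)) < 1e-12
    for i in range(1, 40))
print(balanced)
```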
Problem 12
Consider an M/M/1 queue with arrival rate λ per hour and service rate μ
per hour. For j = 0, 1, 2, . . ., let pj be the steady-state probability that there are
j in the system. Using the balance equations, derive an expression for the
generating function
Ψ(z) = Σ_{i=0}^{∞} p_i z^i.
Solution
The node balance equations are

p_0 λ = p_1 μ
p_1(λ + μ) = p_0 λ + p_2 μ
p_2(λ + μ) = p_1 λ + p_3 μ
p_3(λ + μ) = p_2 λ + p_4 μ
⋮
and multiply the first equation by z⁰, the second by z¹, the third by z², the
fourth by z³, and so on. Then, upon adding we get

(p_0 z⁰ + p_1 z¹ + p_2 z² + p_3 z³ + ⋯)(λ + μ) − p_0 μ
= (p_0 z⁰ + p_1 z¹ + p_2 z² + p_3 z³ + ⋯)λz + (p_0 z⁰ + p_1 z¹ + p_2 z² + p_3 z³ + ⋯)(μ/z) − p_0 μ/z.
Recognizing Ψ(z) = Σ_{i=0}^{∞} p_i z^i, we get

Ψ(z) = p_0 μ(1 − z) / (μ − (λ + μ)z + λz²) = p_0 μ / (μ − λz).

Using the normalization Ψ(1) = 1 gives p_0 = 1 − λ/μ, so that, with ρ = λ/μ,

Ψ(z) = (μ − λ)/(μ − λz) = (1 − ρ)/(1 − ρz).
The mean number in the system can then be computed as L = Ψ′(1), where
Ψ′(z) = dΨ(z)/dz. Using the expression for Ψ(z), we have

Ψ′(z) = (1 − ρ)ρ / (1 − ρz)².

Therefore, we have

L = Ψ′(1) = ρ/(1 − ρ) = λ/(μ − λ).
Then, using Little's law and the definitions (Equations 1.8 through 1.10), we get

L_q = λ²/(μ(μ − λ)),

W = 1/(μ − λ),

W_q = λ/(μ(μ − λ)).
Although we have seen these results before in Problem 7, the main reason
they are presented here is to get an appreciation of the generating function as
an alternate technique. In the next few examples, the flow balance equations
may not be easily solved (also the arc cuts would not be useful) and the
power of generating functions will become clearly evident.
retries after another exp(θ) time. This process continues until the customer
is served. This system is called a retrial queue. A popular application of this
is the telephone switch. If there are s lines in a telephone switch and all of
them are busy, the customer making a call gets a message “all lines are busy
please try your call later” and the customer retries after a random time. In
the following example, we consider another application (albeit a much sim-
plified model than what is observed in practice) where s = 1, which is used
in modeling Ethernets with exponential back-off.
Problem 13
Consider a simplified model of the Ethernet (an example of a local area net-
work). Requests arrive according to a Poisson process with rate λ per unit
time on average to be transmitted on the Ethernet cable. If the Ethernet
cable is free, the request is immediately transmitted and the transmission
time is exponentially distributed with mean 1/μ. However, if the cable is
busy transmitting another request, this request waits for an exp(θ) time and
retries (this is called exponential back-off in the networking literature). Note
that every time a retrial occurs, if the Ethernet cable is busy the request
gets backed off for a further exp(θ) time. Model the system using a CTMC
and write down the balance equations. Then obtain the following steady-
state performance measures: probability that the system is empty (i.e., no
transmissions and no backlogs), fraction of time the Ethernet cable is busy
(i.e., utilization), mean number of requests in the system, and cycle time
(i.e., average time between when a request is made and its transmission is
completed).
Solution
The state of the system at time t can be modeled using two variables: X(t)
denoting the number of backlogged requests and Y(t) the number of requests
being transmitted on the Ethernet cable. The resulting bivariate stochastic
process {(X(t), Y(t)), t ≥ 0} is a CTMC with rate diagram given in Figure 2.3.
To explain this rate diagram, consider node (3,0). This state represents three
messages that have been backed off and each of them would retry after
FIGURE 2.3
Rate diagram for the retrial queue (states (i, 0) and (i, 1); new arrivals occur at rate λ, retrials from state (i, 0) at rate iθ, and transmission completions at rate μ).
exp(θ) time. Hence, the first of them would retry after exp(3θ) time result-
ing in state (2,1). Also, a new request could arrive when the system is in
state (3,0) at rate λ which would result in a new state (3,1). Note that in state
(3,0), there are no requests being transmitted. Now consider state (3,1). A
new request arrival at rate λ would be backed off resulting in the new state
(4,1); however, a transmission completion at rate μ would result in the new
state (3,0). Note that a retrial in state (3,1) would not change the state of the
system and is not included. However, even if one were to consider the event
of retrial, it would get canceled in the balance equations and hence we need
not include it.
Although one could write down the node balance equations, it is much
simpler when we consider arc cuts. Specifically, cuts around nodes (i, 0) for
all i would result in the following balance equations:
p_{0,0} λ = p_{0,1} μ
p_{1,0}(λ + θ) = p_{1,1} μ
p_{2,0}(λ + 2θ) = p_{2,1} μ
⋮
and vertical cuts on the rate diagram would result in the following balance
equations:
p_{0,1} λ = θp_{1,0}
p_{1,1} λ = 2θp_{2,0}
p_{2,1} λ = 3θp_{3,0}
p_{3,1} λ = 4θp_{4,0}
⋮
and we will leave it as an exercise for the reader to solve for p_{0,0} using the
previous set of equations. We consider using generating functions. Let Ψ_0(z)
and Ψ_1(z) be defined as follows:

Ψ_0(z) = Σ_{i=0}^{∞} p_{i,0} z^i

and

Ψ_1(z) = Σ_{i=0}^{∞} p_{i,1} z^i.
For the first set of balance equations, if we multiply the first equation by z⁰,
the second by z¹, the third by z², the fourth by z³, and so on, then upon
adding we get

λΨ_0(z) + θzΨ_0′(z) = μΨ_1(z).
Likewise, if we multiply the first equation in the second set of balance equations
by z⁰, the second equation by z¹, the third by z², the fourth by z³, and
so on, then upon adding we get

λΨ_1(z) = θΨ_0′(z).

Combining these relations, we get

Ψ_1(z) = [λ/(μ − λz)] Ψ_0(z), (2.6)

(λ/μ)Ψ_0(z) + (θz/μ)Ψ_0′(z) = (θ/λ)Ψ_0′(z). (2.7)
Using Equation 2.6 and the fact that Ψ_0(1) + Ψ_1(1) = 1, we get

Ψ_1(1) = λ/μ = 1 − Ψ_0(1)

provided λ < μ, which is the condition for stability. Therefore, if λ < μ, the
utilization or fraction of time the Ethernet cable is busy is Ψ_1(1) = λ/μ. Hence, the
fraction of time the Ethernet cable is idle in steady state is 1 − λ/μ. Also, if we
can solve for Ψ_0(z) in the (differential) Equation 2.7, then we can immediately
compute Ψ_1(z) using Equation 2.6. This is precisely what we do next.
Letting y = Ψ_0(z), we can rewrite Equation 2.7 as

(1/y) dy = [λ/μ] / [(θ/λ) − (θ/μ)z] dz.
Integrating both sides, we get

log(y) + k = −(λ/θ) log(θ/λ − θz/μ),

where k is a constant of integration. Determining k using the boundary
condition Ψ_0(1) = 1 − λ/μ, we obtain

Ψ_0(z) = (1 − λ/μ) (1/λ − z/μ)^{−λ/θ} (1/λ − 1/μ)^{λ/θ} = (1 − λ/μ)^{(λ/θ)+1} (μ/(μ − λz))^{λ/θ}.

Then, using Equation 2.6,

Ψ_1(z) = [λ/(μ − λz)] (1 − λ/μ)^{(λ/θ)+1} (μ/(μ − λz))^{λ/θ}.
Next, we obtain the performance metrics using Ψ_0(z) and Ψ_1(z). We have
already computed the utilization of the Ethernet cable. The probability that
the system is empty with no transmissions or backlogs is p_{0,0}, which is equal
to Ψ_0(0), and hence

p_{0,0} = (1 − λ/μ)^{(λ/θ)+1}.
! "
λ2 μ + θ
0
(1) + 1
(1) =
θμ μ − λ
and the mean number of requests being transmitted in the system is λ/μ
(i.e., the cable utilization). Therefore, the mean number of requests in the
system (L) is
! "
λ λ2 μ + θ λ(λ + θ)
L= + = .
μ θμ μ − λ θ(μ − λ)
Hence, using Little’s law, the cycle time (W) which is the average time
between when a request is made and its transmission is completed, is
W = (λ + θ) / (θ(μ − λ)).
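The retrial-queue formulas can be cross-checked by solving the two sets of cut equations directly (a numerical sketch with arbitrary rates and function names of mine; the backlog is truncated at a large N, which is harmless because the probabilities decay geometrically):

```python
def retrial_closed_form(lam, mu, theta):
    """Closed-form measures for the retrial queue (requires lam < mu)."""
    p00 = (1 - lam / mu) ** (lam / theta + 1)
    util = lam / mu
    L = lam * (lam + theta) / (theta * (mu - lam))
    W = (lam + theta) / (theta * (mu - lam))
    return p00, util, L, W

def retrial_numeric(lam, mu, theta, N=400):
    """Solve the cut equations p_{i,0}(lam + i*theta) = mu*p_{i,1} and
    lam*p_{i,1} = (i+1)*theta*p_{i+1,0} recursively, truncating at N."""
    p0 = [1.0] + [0.0] * N        # unnormalized p_{i,0}, starting from p_{0,0}=1
    p1 = [0.0] * (N + 1)
    for i in range(N):
        p1[i] = p0[i] * (lam + i * theta) / mu
        p0[i + 1] = lam * p1[i] / ((i + 1) * theta)
    p1[N] = p0[N] * (lam + N * theta) / mu
    total = sum(p0) + sum(p1)
    # number in system = backlogged (i) plus one if transmitting
    L = (sum(i * (p0[i] + p1[i]) for i in range(N + 1)) + sum(p1)) / total
    return p0[0] / total, sum(p1) / total, L

lam, mu, theta = 2.0, 5.0, 1.0
p00, util, L, W = retrial_closed_form(lam, mu, theta)
n00, nutil, nL = retrial_numeric(lam, mu, theta)
print(round(L, 6), round(nL, 6))
```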
2.2.3 Bulk Arrival Queues (M[X] /M/1) with a Service System Example
So far we have only considered the case of individual arrivals. However, in
practice it is not uncommon to see bulk arrivals into a system. For example,
arrivals into theme parks are usually in groups, and arrivals as well as service
in restaurants are in groups. We do not consider bulk service in this text; the
reader is referred to other books in the queueing literature on that subject.
We only discuss the single server bulk arrival queue here.
Consider an infinite-sized queue with a single server (with service times
exp(μ)). Arrivals occur according to a Poisson process with average rate λ
per unit time. Each arrival brings a random number X of customers into the
queue. The server processes the customers one by one, taking an independent
exp(μ) time for each. This system is called an M^[X]/M/1 queue. Let a_i be
the probability that an arriving batch is of size i, that is, a_i = P{X = i} for i > 0.
The generating function of the probability mass function (PMF) of X is φ(z),
which is given by
φ(z) = E[z^X] = Σ_{i=1}^{∞} P{X = i} z^i = Σ_{i=1}^{∞} a_i z^i.
Note that φ(z) is either given or can be computed since a_i is known. In addition,
we can compute E[X] = φ′(1) (i.e., the derivative of φ(z) with respect to
z at z = 1) and E[X(X − 1)] = φ′′(1).
Problem 14
Consider a single server fast-food restaurant where customers arrive in
groups according to a Poisson process with rate λ per unit time on aver-
age. The size of each group is independent and identically distributed with
a probability ai of having a batch of size i (with generating function φ(z)
described earlier). Customers are served one by one, even though they may
have arrived in batches, and it takes the server an exp(μ) time to serve each
customer. Model the system using a CTMC and write down the balance
equations. Define Ψ(z) as the generating function

Ψ(z) = Σ_{j=0}^{∞} p_j z^j,
where p_j is the steady-state probability of having j customers in the system.

Solution
The node balance equations are

p_0 λ = μp_1
p_1(λ + μ) = μp_2 + λa_1 p_0
p_2(λ + μ) = μp_3 + λa_1 p_1 + λa_2 p_0
⋮
and multiply the first equation by z⁰, the second by z¹, the third by z², the
fourth by z³, and so on. Then, upon adding we get

λΨ(z) + μΨ(z) − μp_0 = (μ/z)Ψ(z) − (μ/z)p_0 + λa_1 zΨ(z) + λa_2 z²Ψ(z) + λa_3 z³Ψ(z) + ⋯.
FIGURE 2.4
Rate diagram for the M^[X]/M/1 queue (from state n, a batch arrival of size i moves the system to state n + i at rate λa_i; services occur one at a time at rate μ).
Noting that λa_1 z + λa_2 z² + λa_3 z³ + ⋯ = λφ(z), and solving for the generating
function, we get

Ψ(z) = μp_0 (1 − z) / [μ(1 − z) + λz(φ(z) − 1)].

Defining

A(z) = (λzφ(z) − λz) / (1 − z),

this can be written as

Ψ(z) = μp_0 / (μ + A(z)).
To obtain p_0, we use the normalization condition Ψ(1) = 1 and evaluate

A(1) = lim_{z→1} (λzφ(z) − λz) / (1 − z) = −λφ′(1),

where the last equality uses L'Hôpital's rule since A(z) also results in a 0/0
form by substituting z = 1. However, we showed earlier that
φ′(1) = E[X] and hence A(1) = −λE[X]. Thus, Ψ(1) = 1 implies that

p_0 = 1 − λE[X]/μ
provided λE[X] < μ. The condition λE[X] < μ is necessary for stability, and
it is intuitive since λE[X] is the effective average arrival rate of customers.
Thus, we have
Ψ(z) = μ(1 − λE[X]/μ)(1 − z) / [μ(1 − z) + λz(φ(z) − 1)].
The mean number of customers in the system can then be obtained as

L = Ψ′(1) = lim_{z→1} dΨ(z)/dz.

Writing D(z) = μ(1 − z) + λz(φ(z) − 1), so that Ψ(z) = μp_0 (1 − z)/D(z), the
quotient rule gives

dΨ(z)/dz = μp_0 [−D(z) − (1 − z)D′(z)] / D(z)²,   with   D′(z) = −μ + λ(φ(z) − 1) + λzφ′(z).

Expanding the numerator, −D(z) − (1 − z)D′(z) = λ − λφ(z) − λz(1 − z)φ′(z), so that

L = μp_0 lim_{z→1} [λ − λφ(z) − λz(1 − z)φ′(z)] / {μ(1 − z) + λz(φ(z) − 1)}².

However, taking the limit results in a 0/0 form, so we use L'Hôpital's rule
and continue as follows:

L = μp_0 lim_{z→1} [2λ(z − 1)φ′(z) + λz(z − 1)φ′′(z)] / (2{μ(1 − z) + λz(φ(z) − 1)}{−μ + λφ(z) − λ + λzφ′(z)})
  = μp_0 [2λφ′(1) + λφ′′(1)] / (2{−μ + λE[X]}{−μ + λE[X]}).

The last equality uses the result derived earlier, namely
lim_{z→1} (φ(z) − 1)/(z − 1) = φ′(1) = E[X], so that the numerator and the factor
μ(1 − z) + λz(φ(z) − 1) in the denominator both vanish linearly in (z − 1), with
μ(1 − z) + λz(φ(z) − 1) behaving like (z − 1)(λE[X] − μ) near z = 1. Also, since
E[X] = φ′(1), E[X(X − 1)] = E[X²] − E[X] = φ′′(1), and μp_0 = μ − λE[X], we can
rewrite L as

L = (μ − λE[X]) [2λE[X] + λE[X²] − λE[X]] / (2{μ − λE[X]}{μ − λE[X]})
  = (λE[X] + λE[X²]) / (2{μ − λE[X]}).
Using Little's law with the effective entering rate λE[X], we get

W = (λE[X] + λE[X²]) / (2λE[X]{μ − λE[X]}).
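A numerical sketch of the M^[X]/M/1 results, assuming an illustrative batch-size distribution (a_1 = 0.5, a_2 = 0.3, a_3 = 0.2; all names are mine): the closed-form L is compared with the value obtained by solving the balance equations of the rate diagram directly.

```python
lam, mu = 1.0, 3.0
a = {1: 0.5, 2: 0.3, 3: 0.2}     # illustrative batch-size PMF a_i = P{X = i}
EX = sum(i * ai for i, ai in a.items())          # phi'(1)
EX2 = sum(i * i * ai for i, ai in a.items())
assert lam * EX < mu              # stability condition

# Closed-form mean number in system and mean sojourn time
L = (lam * EX + lam * EX2) / (2 * (mu - lam * EX))
W = L / (lam * EX)                # Little's law with entering rate lam*E[X]

# Direct check: p_{n+1} = ((lam+mu) p_n - lam * sum_i a_i p_{n-i}) / mu
N = 300
p = [1.0] + [0.0] * N             # unnormalized, p_0 = 1
p[1] = lam * p[0] / mu
for n in range(1, N):
    inflow = sum(lam * a.get(i, 0.0) * p[n - i] for i in range(1, n + 1))
    p[n + 1] = ((lam + mu) * p[n] - inflow) / mu
total = sum(p)
L_num = sum(n * pn for n, pn in enumerate(p)) / total
print(round(L, 6), round(L_num, 6))
```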
Problem 15
As an abstract model, we have a single server queue where customers arrive
according to a Poisson process with mean arrival rate λ. Service times are
exponentially distributed with mean 1/μ. There is infinite room for requests
to wait. The server stays “on” for a random time distributed exponentially
with mean 1/α after which a catastrophic breakdown occurs. When the
server turns off (i.e., breaks down), all customers in the system are ejected.
Note that the server can break down when there are no customers in the sys-
tem. The server stays off for a random time distributed exponentially with
mean 1/β. No requests can enter the system when the server is off (typically, after a time-out, the client browser would display something to the effect of "unable to reach server"). Model the system as a CTMC, obtain steady-state probabilities, and compute performance measures such as the average number in the system, the fraction of requests lost, and the sojourn time.

Exponential Interarrival and Service Times: Closed-Form Expressions 75
Solution
Note that the system behaves as an M/M/1 queue when the server is on, and
the system is empty when the server is off. The server toggles between on
and off states irrespective of the queueing process. We model the system as a
CTMC. Let X(t) = i (for i = 0, 1, 2, 3, . . .) imply that there are i requests in the
system and the server is on at time t. In addition, let X(t) = D denote that the
server is down (and there are no customers) at time t. Clearly, {X(t), t ≥ 0} is
a CTMC with rate diagram shown in Figure 2.5. The CTMC is ergodic, and
for j = D, 0, 1, 2, . . ., let pj denote the steady-state probability of being in state j. Balancing the probability flow between state D and the remaining states gives

α(p0 + p1 + · · · ) = βpD.
[Figure omitted: rate diagram with states D, 0, 1, 2, . . .; arrivals at rate λ, services at rate μ, breakdown transitions at rate α from every state to D, and a repair transition at rate β from D to 0.]
FIGURE 2.5
Rate diagram of the CTMC. (From Gautam, N., J. Revenue Pricing Manag., 4(1), 7, 2005.)
Multiplying the balance equations by z^j and summing over j ≥ 0 yields

βpD + μ(ψ(z) − p0)/z + λzψ(z) = (λ + α + μ)ψ(z) − μp0,

which, upon solving for ψ(z), gives

ψ(z) = (μp0 − zβpD − μzp0) / (μ + λz² − λz − αz − μz).   (2.8)
Since we already know pD = α/(α + β), the only unknown in Equation 2.8
is p0 . However, standard techniques such as ψ(0) = p0 and ψ(1) = β/(α + β)
do not yield a solution for p0 . Hence, we need to do something different to
determine p0 and thereby ψ(z).
Note that since ψ(z) is p0 + p1 z + p2 z2 + p3 z3 + p4 z4 + · · · , it is a continuous,
differentiable, bounded, and increasing function over z ∈ [0, 1]. However,
from Equation 2.8, ψ(z) is of the form A(z)/B(z), where A(z) and B(z)
are polynomials corresponding to the numerator and denominator of the
equation. If there exists a z∗ ∈ [0, 1] such that B(z∗ ) = 0, then A(z∗ ) = 0 (other-
wise it violates the condition that ψ(z) is a bounded and increasing function
over z ∈ [0, 1]). We now use the previous realization to derive a closed-form
algebraic expression for ψ(z).
By setting the denominator of ψ(z) in Equation 2.8 to zero, we get

z* = [(λ + μ + α) − √((λ + μ + α)² − 4λμ)] / (2λ)

as the unique solution such that z* ∈ [0, 1]. Setting the numerator of ψ(z) in Equation 2.8 to zero at z = z*, we get

p0 = αβz* / [(α + β)μ(1 − z*)].

Substituting for z*, this yields

p0 = [αβ / (μ(α + β))] · [λ + μ + α − √((λ + μ + α)² − 4λμ)] / [λ − μ − α + √((λ + μ + α)² − 4λμ)].   (2.9)
Thus, substituting pD = α/(α + β),

ψ(z) = [μp0(1 − z) − zαβ/(α + β)] / [λz² − (λ + μ + α)z + μ].   (2.10)
Also, the average number in the system is L = ψ′(1), which simplifies to

L = (1/α) · [λβ − μβ + p0μ(α + β)] / (α + β).
The probability that a request is eventually served, given that it arrived when the server was up, is given by (conditioning on the number of requests seen upon arrival, each of the j + 1 required service completions must occur before a breakdown, which happens with probability μ/(μ + α) each time)

Σ_{j=0}^{∞} [pj/(1 − pD)] (μ/(μ + α))^{j+1} = [μ/(μ + α)] · [1/(1 − pD)] · ψ(μ/(μ + α))

= [μ/(1 − pD)] · [β − p0(α + β)] / [λ(α + β)].
Therefore, the rate at which requests exit the queue after completing service is μβ/(α + β) − μp0, which makes sense: whenever there are one or more requests in the system, the exit rate is μ. In addition, since the drop rate (derived earlier) is αL, we can write μβ/(α + β) − μp0 = λ(1 − pD) − αL, which again makes sense since the total arrival rate when the web server is on is λ(1 − pD).
We also have a fraction pD of requests that are rejected because they arrive when the server is down. Therefore, the loss probability is (λpD + αL)/λ, and by substituting for pD we obtain P in terms of L as

P = [αL(α + β) + λα] / [λ(α + β)].
Using Little's law, we can derive W in the following manner. The expected number of requests in the system when the server is on is L/(1 − pD). In steady state, only a fraction (λ(1 − pD) − αL)/(λ(1 − pD)) of these requests will receive service. Therefore, the average sojourn time (or response time) at the server as experienced by users that receive a response is given by L/(λ(1 − pD)²), which yields W in terms of L as

W = L(α + β)² / (λβ²).
pi qij = pj qji .

Note that this requires the necessary condition (which can also be shown mathematically) that qij is nonzero if and only if qji is nonzero, since pi and pj are nonzero. It is not necessary for i to be a scalar; it just represents a possible value that X(t) can take.
It is worthwhile to note that the CTMC corresponding to the M/M/s/K queue for any s and K (as long as the queue is stable) is reversible. In essence, the condition pi qij = pj qji is identical either to the balance equation corresponding to arc cuts between successive nodes or to the case qij = qji = 0. For the same reason, any one-dimensional birth and death process that is ergodic would be reversible. However, it is a little tricky to check whether other CTMCs (that satisfy the necessary condition) are reversible. To address this shortcoming, we next explain
Remark 6
in the CTMC {Y(t), −∞ < t < ∞}. We next describe a result characterizing the
truncated CTMC {Y(t), −∞ < t < ∞}.
Remark 7
This remark essentially says that {Y(t), −∞ < t < ∞} described ear-
lier is reversible. Next is to obtain the steady-state distribution of
{Y(t), −∞ < t < ∞}. For the reversible process {X(t), −∞ < t < ∞}, we have
pi qij = pj qji for all i ∈ A and j ∈ A. Also, since qij for {X(t), −∞ < t < ∞}
and {Y(t), −∞ < t < ∞} are identical, the steady-state probability that the
CTMC {Y(t), −∞ < t < ∞} is in state j is proportional to pj . However, using
the normalizing condition that all the steady-state probabilities must add to
1, we have the steady-state probability that the CTMC {Y(t), −∞ < t < ∞} is
in state j as
pj / Σ_{k∈A} pk
for all j ∈ A.
As an illustration of this, consider the M/M/s queue. Let λ < sμ, where λ and μ are, respectively, the arrival and service rates. If X(t) is the number of customers in the system at time t, then {X(t), −∞ < t < ∞} is a reversible CTMC on state space S = {0, 1, 2, 3, . . .} with steady-state probabilities pj^{M/M/s} given in Section 2.1. Now the truncated process {Y(t), −∞ < t < ∞}, where Y(t) is the number of customers in the system in an M/M/s/K queue, is also reversible and defined on state space A = {0, 1, . . . , K}. Verify from the results in Section 2.1 that the steady-state probabilities pj^{M/M/s/K} satisfy

pj^{M/M/s/K} = pj^{M/M/s} / Σ_{i=0}^{K} pi^{M/M/s}

for j = 0, 1, . . . , K.
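This truncation identity is easy to verify numerically. The sketch below (function names are ours) computes the M/M/s/K distribution two ways: by truncating and renormalizing the unnormalized M/M/s terms, and by solving the finite birth–death CTMC directly.

```python
import numpy as np
from math import factorial

def mmsk_by_truncation(lam, mu, s, K):
    """M/M/s/K probabilities obtained by truncating the (unnormalized)
    M/M/s product-form terms at K and renormalizing."""
    a = [(lam / mu) ** j / factorial(j) for j in range(min(s, K) + 1)]
    while len(a) <= K:
        a.append(a[-1] * lam / (s * mu))   # geometric tail beyond j = s
    a = np.array(a)
    return a / a.sum()

def mmsk_by_ctmc(lam, mu, s, K):
    """The same distribution from a direct linear solve of the
    birth-death CTMC on {0, ..., K}."""
    n = K + 1
    Q = np.zeros((n, n))
    for j in range(K):
        Q[j, j + 1] = lam                  # birth (arrival)
        Q[j + 1, j] = min(j + 1, s) * mu   # death (service completion)
    np.fill_diagonal(Q, -Q.sum(axis=1))
    A = np.vstack([Q.T, np.ones(n)])
    b = np.zeros(n + 1); b[-1] = 1.0
    return np.linalg.lstsq(A, b, rcond=None)[0]

p_trunc = mmsk_by_truncation(2.0, 1.0, s=3, K=6)
p_solve = mmsk_by_ctmc(2.0, 1.0, s=3, K=6)
```

Both vectors agree to numerical precision, illustrating the truncation result.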
according to exp(μi ). During the entire duration of the connection, each class
i request uses bi kbps of bandwidth. Note that these applications are usu-
ally real-time multimedia traffic and we assume no buffering takes place. In
the traditional telephone network, we have N = 1 class, and each call uses 60
kbps with C/b1 being the number of lines a telephone switch could handle.
This problem is just a multiclass version of that.
Let Xi (t) be the number of ongoing class i connections at time t across
the bottleneck link under consideration. Clearly, there is a constraint at all
times t:

b1X1(t) + b2X2(t) + · · · + bNXN(t) ≤ C.
To apply reversibility, first consider the unconstrained processes: each {Xi^∞(t), −∞ < t < ∞} is a CTMC that is independent of the other CTMCs {Xj^∞(t), −∞ < t < ∞} for j ≠ i. In addition, {Xi^∞(t), −∞ < t < ∞} is the queue length process of an M/M/∞ queue with arrival rate λi and service rate μi for each server. The steady-state probabilities for this queue are, for i = 1, 2, . . . , N and j = 0, 1, . . .,

pij(∞) = (λi/μi)^j (1/j!) e^{−λi/μi}.
By independence, the joint steady-state distribution is

P{X1^∞(t) = x1, X2^∞(t) = x2, . . . , XN^∞(t) = xN}
= p1x1(∞) p2x2(∞) · · · pNxN(∞)
= e^{−Σ_{i=1}^{N} λi/μi} Π_{i=1}^{N} (λi/μi)^{xi} (1/xi!).
Since the joint process {(X1^∞(t), X2^∞(t), . . . , XN^∞(t)), −∞ < t < ∞} is a reversible process, its truncated process {(X1(t), X2(t), . . . , XN(t)), −∞ < t < ∞} is also reversible, with the steady-state probability of having x1 class-1 connections, x2 class-2 connections, . . ., xN class-N connections satisfying

p_{x1,x2,...,xN} ∝ e^{−Σ_{i=1}^{N} λi/μi} Π_{i=1}^{N} (λi/μi)^{xi} (1/xi!).

Absorbing the constant factor into a normalizing constant R,

p_{x1,x2,...,xN} = R Π_{i=1}^{N} (λi/μi)^{xi} (1/xi!)   (2.11)

where R is such that

Σ_{x1,x2,...,xN : b1x1 + b2x2 + ··· + bNxN ≤ C} p_{x1,x2,...,xN} = 1.
In other words,

R = [ Σ_{x1,x2,...,xN : b1x1 + b2x2 + ··· + bNxN ≤ C} Π_{i=1}^{N} (λi/μi)^{xi} (1/xi!) ]^{−1}.   (2.12)
Problem 16
Consider a channel with capacity 700 kbps on which two classes of
bandwidth-sensitive traffic can be transmitted. Class-1 uses 200 kbps band-
width and class-2 uses 300 kbps bandwidth. Let λ1 and λ2 be the parameters
of the Poisson processes corresponding to the arrivals of the two classes. Also,
let each admitted request spend exp(μi ) time holding onto the bandwidth
they require for i = 1, 2. Let X1 (t) and X2 (t) be the number of ongoing class-1
and class-2 requests at time t. Model the CTMC {(X1 (t), X2 (t)), t ≥ 0} and
obtain its steady-state probabilities.
Solution
Note that this is a special case when N = 2, C = 700, b1 = 200, and b2 = 300.
The CTMC {(X1 (t), X2 (t)), t ≥ 0} can be modeled as the rate diagram in
Figure 2.6. Since we have the constraint 200X1 (t) + 300X2 (t) ≤ 700, the
[Figure omitted: rate diagram on the states (x1, x2) with 200x1 + 300x2 ≤ 700; class-1 transitions at rates λ1 (arrivals) and x1μ1 (departures), class-2 transitions at rates λ2 and x2μ2.]
FIGURE 2.6
Arc cut example.
state space is {(0,0), (0,1), (0,2), (1,0), (1,1), (2,0), (2,1), (3,0)}. In order to fully
appreciate the power of using reversibility results, the reader is encouraged
to solve for p0,0 , p0,1 , p0,2 , p1,0 , p1,1 , p2,0 , p2,1 , and p3,0 using the node balance
equations.
Now, using Equation 2.12, we can obtain the normalizing constant as

R = 1 / [ 1 + λ1/μ1 + λ1²/(2μ1²) + λ2/μ2 + λ1³/(6μ1³) + λ2²/(2μ2²) + λ1λ2/(μ1μ2) + λ1²λ2/(2μ1²μ2) ].

Then, from Equation 2.11,

p0,0 = R,   p0,1 = (λ2/μ2)R,   p0,2 = (λ2²/(2μ2²))R,   p1,0 = (λ1/μ1)R,
p1,1 = (λ1λ2/(μ1μ2))R,   p2,0 = (λ1²/(2μ1²))R,   p2,1 = (λ1²λ2/(2μ1²μ2))R,   p3,0 = (λ1³/(6μ1³))R.
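Equations 2.11 and 2.12 are straightforward to evaluate in code. The following sketch (the function name and the sample rates are ours, not from the text) enumerates the feasible states of Problem 16 and returns the normalized probabilities.

```python
from math import factorial

def admission_probs(lam1, lam2, mu1, mu2, b1=200, b2=300, C=700):
    """Steady-state probabilities from Equations 2.11 and 2.12 for the
    two-class bandwidth-sharing link of Problem 16."""
    rho1, rho2 = lam1 / mu1, lam2 / mu2
    states = [(x1, x2)
              for x1 in range(C // b1 + 1)
              for x2 in range(C // b2 + 1)
              if b1 * x1 + b2 * x2 <= C]
    weight = {(x1, x2): rho1 ** x1 / factorial(x1) * rho2 ** x2 / factorial(x2)
              for (x1, x2) in states}
    R = 1.0 / sum(weight.values())  # Equation 2.12
    return {s: R * w for s, w in weight.items()}

# Illustrative rates (our choice): lam1 = lam2 = mu1 = mu2 = 1, so that the
# unnormalized weights sum to 17/3 and R = 3/17.
p = admission_probs(lam1=1.0, lam2=1.0, mu1=1.0, mu2=1.0)
```

The eight states produced match the state space listed in the solution, and the probabilities sum to one by construction.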
Reference Notes
The underlying theme of this chapter, as the title suggests, is queues that
can be modeled and analyzed using CTMCs. The sources for the three main
thrusts, namely, arc cuts for birth and death processes, generating functions,
and reversibility, have been different. In particular, the M/M/s/K queue
and special cases are mainly from Gross and Harris [49]. The section on
using generating functions is largely influenced by Kulkarni [67], Medhi [80],
and Prabhu [89]. The reversibility material is presented predominantly from
Kelly [59]. The exercise problems are essentially a compilation of homework
and exam questions in courses taught by the author over the last several
years. However, a large number of the exercise problems were indeed
adapted from some of the books described earlier.
Exercises
2.1 Compute the variance of the number of customers in the system in
steady state for an M/M/s/K queue.
2.2 Consider an M/M/s/K queue with λ = 10 and μ = 1.25. Write a com-
puter program to plot pK and W for various values of K from s to
s + 19. Consider two cases (i) s = 10 and (ii) s = 5.
L(2) = λ2 W2 .
2.4 Let U be a random variable that denotes the time between successive departures (in the long run) from the system in an M/M/1 queueing system. Assume that λ < μ. Show, by conditioning on whether or not a departure leaves the system empty, that U is an exponentially distributed random variable with mean 1/λ.
2.5 Feedback queue. In the M/M/1 system suppose that with probability q,
a customer who completes their service rejoins the queue for further
service. What is the stability condition for this queue? Assuming
conditions for stability hold, derive expressions for L and W.
2.6 Static control. Consider an M/M/1 queue where the objective is to
pick a service rate μ in an optimal fashion. There are two types of
costs associated: (i) a service-cost rate, c (cost per unit time per unit
of service rate) and (ii) a holding-cost rate h (cost per unit time per
customer in the system). In other words, (i) if we choose a service
rate μ, then we pay a service cost c μ per unit time; (ii) the system
incurs h i monetary units of holding cost per unit time while i cus-
tomers are present. Let C(μ) be the expected steady-state cost per
unit time, when service rate μ is chosen, that is,
the average waiting time W, decide which system you would go with: the single-queue or the two-queue system?
2.8 Consider a single-server queue with two classes. Class i customers
arrive according to PP(λi ) for i = 1,2. For both classes, the service
times are according to exp(μ). If the total number of customers (of
both classes) in the system is greater than or equal to K, class-1 cus-
tomers do not join the system, whereas class-2 customers always
join the system. Model this system as a CTMC. When is this sys-
tem stable? Under stability, what is the steady-state distribution
of the number of customers in the system? Compute the expected
sojourn time for each type of customer in steady state, assuming
they exist. Note that for type 1 customers, you are only required
to obtain the mean waiting time for those customers that join the
system.
2.9 There is a single line to order drinks at a local coffee shop. When
the number of customers in the line is three or less, only one person does the checkout as well as making beverages. This takes exp(μ1)
time. When there are more than three persons in the line, the store
manager comes in to help. In this case, the service rate increases
to μ2 > μ1 (i.e., the reduced service times now become exp(μ2 )).
Assume that the arrival process is PP(λ). Model this queue as a
CTMC.
2.10 Consider a standard M/M/1 queue with arrival rate λ and service
rate μ. The server toggles between being busy and idle. Let B and
I denote random times the server is busy and idle, respectively,
in a cycle. Obtain an expression for the ratio E(B)/E(I). Using that
relation, show that
E(B) = 1 / (μ − λ).
P′(z) = [λ / (μ(1 − qz))] P(z),

where P(z) = Σ_{i=0}^{∞} pi z^i. Then, show that the following is a solution to this differential equation:

P(z) = [(1 − q)/(1 − qz)]^{λ/(μq)}.
ψ1(z) = p00 αλ(λ(1 − z) + μ2) / [μ1μ2/z − λμ1(1 − α/z) − λμ2(1 − β/z) − λ2(1 − z)].
λp0 = μp1
λp1 = 2μp2
λp2 = 2μp3
λp3 = 2μp4
λp4 = 2μp5
⋮

Define the generating function

Φ(z) = p0 + p1z + p2z² + · · · .

Show that

Φ(z) = (2μp0 + μzp1) / (2μ − λz).
Solve for the unknowns p0 and p1 (for this do not use the results from
M/M/s queue but feel free to verify).
2.18 Solve the retrial queue steady-state equations in Section 2.2.2 and
compute p00 using the arc cut method.
2.19 Consider a post office with two queues: queue 1 for customers with-
out any financial transactions (such as waiting to pick up mail)
and queue 2 for customers requiring financial transactions (such as
mailing a parcel). For i = 1,2, queue i gets arrivals according to a
Poisson process with parameter λi , service time for each customer is
according to exp(μi ), and has i servers. Due to safety reasons, a max-
imum of four customers are allowed inside the post office at a time.
Model the system as a reversible CTMC and derive the steady-state
probabilities.
2.20 Consider a queueing system with two parallel queues and two
servers, one for each queue. Customers arrive to the system accord-
ing to PP(λ) and each arriving customer joins the queue with the
fewer number of customers. If both queues have the same number of
customers, then the arriving customer picks either with equal proba-
bility. The service times are exponentially distributed with mean 1/μ
at either server. When a service is completed at one queue and the
other queue has two more customers than this queue, then the cus-
tomer at the end of the line instantaneously switches to the shorter
queue to balance the queues. Let X1 (t) and X2 (t) be the number of
customers in queues 1 and 2, respectively, at time t. Assuming that
X1 (0) = X2 (0) = 0, we have |X1 (t) − X2 (t)| ≤ 1 for all t. Model the
bivariate stochastic process {(X1 (t), X2 (t)), t ≥ 0} as a CTMC by writ-
ing down the state space and drawing the rate diagram. Assuming
stability, let
(i.e., when there are two or more customers when a shuttle arrives).
Model the number of customers in the queueing system at time t as a
CTMC and write down the balance equations. Obtain the generating
function of the number of customers in the system in steady state for
the special case λ = μ. Compute L and W for this queueing system.
2.22 Consider an M/M/1 queue where customers in queue (but not the
one in service) may get discouraged and leave without receiving ser-
vice. Each customer who joins the queue will leave after an exp(γ)
time, if the customer does not enter service by that time. Assume
FCFS.
(a) What fraction of arrivals is served? Hence, what are the average departure rates both after service and without service?
(b) Suppose an arrival finds one customer in the system. What is the
probability that this customer is served?
(c) On an average, how long do customers that get served wait in
the queue before beginning service?
2.23 For an M[X] /M/2 queue with batch arrival rate λ, constant batch size 4, exp(μ) service time, and traffic intensity ρ = 2λ/μ < 1, show that the generating function P(z) = Σ_{n=0}^{∞} pn z^n for the distribution of the number of customers in the system is
2.24 Justify using a brief reasoning whether each of the following is TRUE
or FALSE.
(a) Consider two M/M/1 queues: one has arrival rate λ and service
rate μ, while the other has arrival and service rates as 2λ and 2μ,
respectively. Is the following statement TRUE or FALSE? On an
average, both queues have the same number of customers in the
system in steady state.
(b) Consider two stable queues: one is an M/M/1 queue with arrival
rate λ and service rate μ, while the other is an M/M/2 queue
with arrival rate λ and service rate μ for EACH server. Is the
following statement TRUE or FALSE? On an average, twice as
many entities depart from the M/M/2 queue as compared to the
M/M/1 queue in steady state.
(c) Consider an M/M/1 queue with reneging. The arrival rate is λ,
the service rate is μ, but the reneging rate is also equal to μ (i.e.,
θ = μ). Note that the birth and death process is identical to that of
an M/M/∞ queue. Is the following statement TRUE or FALSE?
For this M/M/1 queue with reneging, we have Lq = 0.
(d) Consider two stable queues: one is an M[X] /M/1 queue with
batch arrival rate λ, constant batch size N (i.e., P(X = N) = 1), and
service rate μ, while the other is an M/M/1 queue with arrival
rate Nλ and service rate μ. Note that both queues have the same
effective entity-arrival rate. Is the following statement TRUE or
FALSE? On an average, entities spend more time in the system in
the M[X] /M/1 queue as compared to the M/M/1 queue in steady
state.
(e) Consider a stable M/M/1 queue that uses processor sharing dis-
cipline (see Section 4.5.2). Arrivals are according to PP(λ), and it
would take exp(μ) time to process an entity if it were the only
one in the system. Is the following statement TRUE or FALSE?
The average workload in the system at an arbitrary point in
steady state is λ/(μ(μ − λ)).
3
Exponential Interarrival and Service Times:
Numerical Techniques and Approximations
for all i ∈ S and j ∈ S. Essentially with every transition, the stochastic process
jumps to a state one value higher (birth) or one value lower (death). These
are also called skip-free CTMCs because it is not possible to go from state i
to state j without going through every state in between (i.e., no skipping of
states is allowed). The rates λ0 , λ1 , . . . are known as birth rates and the rates
μ1 , μ2 , . . . are known as death rates. For the M/M/1 queue, all birth rates are
equal to λ and all death rates are equal to μ. The steady-state distribution of
the one-dimensional birth and death process (or chain) is easy to compute
using arc cuts.
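The arc-cut computation amounts to a running product of birth-to-death rate ratios. A minimal sketch (function name is ours; an upper limit N is used for a finite numerical illustration):

```python
def birth_death_stationary(birth, death, N):
    """Stationary distribution of a birth-death chain on {0, ..., N} using
    the arc-cut balance p_j * birth(j) = p_{j+1} * death(j+1)."""
    p = [1.0]
    for j in range(N):
        p.append(p[-1] * birth(j) / death(j + 1))
    total = sum(p)
    return [x / total for x in p]

# M/M/1-type rates: birth rate lam in every state, death rate mu
lam, mu, N = 1.0, 2.0, 20
p = birth_death_stationary(lambda j: lam, lambda j: mu, N)
```

State-dependent rates (e.g., min(j, s)·μ for M/M/s) are handled by simply passing a different `death` function.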
However, in the multidimensional case, it is not as easy unless one
can use the reversibility argument discussed toward the end of the previous chapter. In this chapter (particularly in this section), we will show
numerical techniques to obtain the steady-state distribution of the multi-
dimensional birth and death chains that are not reversible. For that, we
first define a multidimensional birth and death chain. An n-dimensional
CTMC {(X1(t), X2(t), . . . , Xn(t)), t ≥ 0} is a multidimensional birth and death chain if, with every transition, the CTMC jumps to a state one value higher
or lower in exactly one of the dimensions. In other words, if X(t) is an n-
dimensional row vector [X1 (t) X2 (t) . . . Xn (t)] and ei is an n-dimensional unit
(row) vector (i.e., one in the ith dimension and zero everywhere else), then
the next state the CTMC {X(t), t ≥ 0} goes to from X(t) is either X(t) + ei
or X(t) − ei for some i ∈ [1, 2, . . . , n]. It is worthwhile to point out that
the discrete time version of this is called a multidimensional random walk,
although sometimes the terms “random walk” and “birth–death” are used
interchangeably.
In the remainder of this section, we first motivate the need to study mul-
tidimensional birth–death chains using an example in optimal control. Then,
we provide an efficient algorithm to obtain the steady-state probabilities.
Finally, we provide an example based on energy conservation in data centers
where this approach comes in handy.
Problem 17
Consider two single server queues that work in parallel. Both queues have
finite waiting rooms of size Bi , and the service times are exponentially dis-
tributed with mean 1/μi in queue i (for i = 1, 2). Arrivals occur into this
two-queue system according to a Poisson process with mean rate λ. At every
arrival, a scheduler observes the number in the system in each queue and
decides to take one of three control actions: reject the arrival, send the arrival
to queue 1, or send the arrival to queue 2. Assume that the control actions
happen instantaneously and customers cannot jump from one queue to the
other or leave the system before their service is completed. The system earns
a reward r dollars for every accepted customer and incurs a holding cost hi
dollars per unit time per customer held in queue i (for i = 1, 2). Assume that
the reward and holding cost values are such that the scheduler rejects an
arrival only if both queues are full. Describe the structure of the scheduler’s
optimal policy.
Solution
The system is depicted in Figure 3.1. For i = 1, 2, let Xi (t) be the number of
customers in the system in queue i at time t (including any customers at the
servers). If an arrival occurs at time t, the scheduler looks at X1 (t) and X2 (t)
to decide whether the arrival should be rejected or sent to queue 1 or queue 2.
Note that because of the assumption that the scheduler rejects an arrival only
if both queues are full, the scheduler’s action in terms of whether to accept or
reject a customer is already made. Also, if only one of the queues is full, then
the assumption requires sending the arrival to the nonfull queue. Therefore,
the problem is simplified so that the control action is essentially which queue
to send an arrival to when there is space in both (we also call this routing
policy, i.e., decision to send to queue 1 or 2 depending on the number in
each queue).
Intuitively, the optimal policy when there is space in both queues is to
send an arriving request to queue i if it is “shorter” than queue 3 − i for
i = 1, 2. If μ1 = μ2, B1 = B2, and h1 = h2, then it can be shown that routing to the shorter queue is optimal.

[Figure omitted: arrivals PP(λ) reach a scheduler that either rejects them or routes them to one of two single-server queues with exp(μ1) and exp(μ2) service.]
FIGURE 3.1
Schematic for scheduler's options at arrivals.
[Figure omitted: the (X1(t), X2(t)) state space, with axes up to B1 and B2, split by a switching curve into Region 1 (send arrival to queue 1) and Region 2 (send arrival to queue 2).]
FIGURE 3.2
Structure of the optimal policy given arrival at time t.
To show the previous set of results, we need to first formulate the prob-
lem as a semi-Markov decision process (SMDP) and then investigate the
optimal policy in various states. The reader is encouraged to read any stan-
dard text on stochastic dynamic programming or Markov decision processes
for a thorough understanding of this material. We first define the value func-
tion V(x), which is the maximal expected total discounted net benefit over
an infinite horizon, starting from state x, that is, (x1 , x2 ). Note that although
x is a vector, V(x) is a scalar. We also use the term “discounted” because we
consider a discount factor α and V(x) denotes the expected present value.
It is customary in the SMDP literature to pick appropriate time units so that
α + λ + μ1 + μ2 = 1.

The holding-cost rate when the system is in state x is

h(x) = h1x1 + h2x2.
Let a+ denote max{a, 0} if a is a scalar, and let x+ = (x1+, x2+). Now, we are in a position to write down the optimality or Bellman equation.
The value function V(x) satisfies the following optimality equation: for x1 ∈ [0, B1) and x2 ∈ [0, B2),

V(x) = −h(x) + λ max{r + V(x + e1), r + V(x + e2)} + μ1V((x − e1)+) + μ2V((x − e2)+).   (3.1)
We will not derive this optimality equation (the reader is encouraged to refer
to any standard text on stochastic dynamic programming or Markov deci-
sion processes). However, there is merit in going over the equation itself.
When the system is in state x, a holding cost of h(x) is incurred per unit time
(the negative sign in front of h(x) is because it is a cost as opposed to a bene-
fit). If an arrival occurs (at rate λ), a revenue of r is obtained and depending
on whether the arrival is sent to queue 1 or 2, the new state becomes x + e1
or x + e2 , respectively. From the new state x + ei for i = 1, 2, the net benefit is
V(x + ei ). It is quite natural to select queue 1 or 2 depending on which has
a higher net benefit, hence the maximization. Instead of the arrival, if the
next event is a service completion at queue i (for i = 1, 2), then the new state
becomes (x−ei )+ at rate μi and the value function is V((x−ei )+ ). In summary,
the left-hand side (LHS) V(x) equals the (negative of) holding cost incurred
in state x, plus the net benefit that depends on the next event (arrival routed
to queue 1 or 2, service completion at queue 1, and service completion at
queue 2), which would lead to a new state. The reason it appears as if the
units do not match in Equation 3.1 is that in reality, the entire right-hand
side (RHS) should be multiplied by 1/(α + λ + μ1 + μ2), and in our case, that is equal
to 1. As a matter of fact, if xi = 0 for i = 1 or 2, then the actual equation is
(α+λ+μ3−i )V(x) = −h(x)+λ max{r+V(x+e1 ), r+V(x+e2 )}+μ3−i V((x−e3−i )+ )
since server i cannot complete service as there are no customers. How-
ever, we add μi V(x) (since it is equal to μi V((x − ei )+ )) to both sides to
get V(x)(α + λ + μ1 + μ2 ) = − h(x) + λ max{r + V(x + e1 ), r + V(x + e2 )} +
μ1 V((x − e1 )+ ) + μ2 V((x − e2 )+ ), which is identical to Equation 3.1 since
α+λ+μ1 +μ2 = 1. A similar argument can be made for V(x) when x1 = x2 = 0,
and hence we do not have to be concerned about that either.
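Equation 3.1 can be solved by value iteration. The sketch below is our rendering of the uniformized recursion, not code from the book; the parameter values are illustrative and chosen so that α + λ + μ1 + μ2 = 1. With symmetric parameters, the computed value function is symmetric, consistent with the shorter-queue routing structure.

```python
import numpy as np

def value_iteration(lam, mu1, mu2, r, h1, h2, B1, B2, alpha, iters=2000):
    """Value iteration for the uniformized optimality equation (3.1)."""
    assert abs(alpha + lam + mu1 + mu2 - 1.0) < 1e-9
    V = np.zeros((B1 + 1, B2 + 1))
    for _ in range(iters):
        Vn = np.empty_like(V)
        for x1 in range(B1 + 1):
            for x2 in range(B2 + 1):
                go1 = r + V[x1 + 1, x2] if x1 < B1 else -np.inf
                go2 = r + V[x1, x2 + 1] if x2 < B2 else -np.inf
                if x1 == B1 and x2 == B2:
                    arr = V[x1, x2]      # reject only when both queues full
                else:
                    arr = max(go1, go2)  # route arrival to the better queue
                Vn[x1, x2] = (-(h1 * x1 + h2 * x2) + lam * arr
                              + mu1 * V[max(x1 - 1, 0), x2]
                              + mu2 * V[x1, max(x2 - 1, 0)])
        V = Vn
    return V

V = value_iteration(lam=0.3, mu1=0.25, mu2=0.25, r=5.0,
                    h1=1.0, h2=1.0, B1=4, B2=4, alpha=0.2)
```

Since λ + μ1 + μ2 = 1 − α < 1, the iteration is a contraction and converges to the unique fixed point of Equation 3.1.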
Also, by looking at the states for which Equation 3.1 holds, we still need
the value function at the boundaries, that is, x1 = B1 or x2 = B2 , which we
present next. For x1 = B1 and x2 < B2 where arrivals are routed to queue 2
+ μ2 Vn ((x − e2 )+ ). (3.5)
≥ r + Vn(x + e1 + e2) − r − Vn(x + 2e1 + e2) − r − Vn(x + e1) + r + Vn(x + 2e1)
= Vn(x + e1 + e2) − Vn(x + 2e1 + e2) − Vn(x + e1) + Vn(x + 2e1) ≥ 0.

≥ r + Vn(x + e1 + e2) − r − Vn(x + e1 + 2e2) − r − Vn(x + e1) + r + Vn(x + e1 + e2)
= Vn(x + e1 + e2) − Vn(x + e1 + 2e2) − Vn(x + e1) + Vn(x + e1 + e2) ≥ 0.
≥ r + Vn(x + 2e1 + e2) − r − Vn(x + 3e1) − r − Vn(x + e1 + e2) + r + Vn(x + 2e1)
= g(x + e1 + e2) − r − Vn(x + 2e1 + e2) − r − Vn(x + e1 + e2) + g(x + e1)
≥ r + Vn(x + 2e1 + e2) − r − Vn(x + 2e1 + e2) − r − Vn(x + e1 + e2) + r + Vn(x + e1 + e2)
≥ 0.
explains the structure of the optimal policy but not the optimal policy itself.
For example, if numerical values for the preceding problem are given, where
does the optimal line that separates region 1 from 2 lie? In other words,
can we draw Figure 3.2 precisely for a given set of numerical values? The
answer is yes. For every candidate switching curve, we can model the result-
ing system as a CTMC and evaluate its performance using the steady-state
probabilities. For example, one algorithm would be to start with the switch-
ing curve being the straight line from (0, 0) to (B1 , B2 ) and evaluate the
expected discounted net benefit (via the steady-state probabilities). Then try
all possible neighbors to determine the optimal switching curve that would
maximize the expected discounted net benefit. We explain the algorithm in
detail and provide a numerical example at the end of the next subsection.
However, we first need a method to quickly compute steady-state probabil-
ities of such CTMCs so that we can efficiently search through the space of
switching curves swiftly, which is the objective of the next subsection.
Problem 18
Two classes of requests arrive to a computer system according to a Poisson
process with rate λi per second for class i (for i = 1, 2). The number of bytes
of processing required for class i requests are exponentially distributed with
mean 1/μi MB. Assume that there is a 1 MB/s single processor that simulta-
neously processes all the requests using a full processor-sharing regime. In
other words, if there are two class-1 requests and three class-2 requests cur-
rently running, then each of the five requests get 200 kB/s (assuming there
are 1000 kB in 1 MB, which is not technically correct as there ought to be
1024 kB in 1 MB). However, in practice, the processor cycles through the five
requests processing each for a tiny amount of time called time-quantum, and
this is approximated as a full processor-sharing discipline. Further, there is
a restriction that a maximum of four requests of class 1 and three of class 2
can be simultaneously served at any given time. Model the system as a two-
dimensional birth and death process and obtain the steady-state distribution
of the number of customers of each class in the system.
Solution
Let Xi (t) be the number of class i customers in the system at time t (for
i = 1, 2). Then the stochastic process {(X1 (t), X2 (t)), t ≥ 0} is a CTMC on
state space S = {(0, 0), (0, 1), (0, 2), (0, 3), (1, 0), . . . , (4, 2), (4, 3)} and infinitesi-
mal generator matrix Q such that
Q − Diag(Q) is the 20 × 20 matrix of transition rates: from state (i, j) (with i class-1 and j class-2 requests), the chain moves to (i + 1, j) at rate λ1 (if i < 4), to (i, j + 1) at rate λ2 (if j < 3), to (i − 1, j) at rate iμ1/(i + j) (if i > 0), and to (i, j − 1) at rate jμ2/(i + j) (if j > 0), the departure rates reflecting processor sharing among the i + j requests. Listing the states in lexicographic order, Q has the block-tridiagonal form

Q =
⎡ Q00  Q01  0    0    0   ⎤
⎢ Q10  Q11  Q12  0    0   ⎥
⎢ 0    Q21  Q22  Q23  0   ⎥
⎢ 0    0    Q32  Q33  Q34 ⎥
⎣ 0    0    0    Q43  Q44 ⎦
where
0 is the 4 × 4 matrix of zeros,

Q00 = [ −λ1−λ2   λ2          0           0
        μ2       −λ1−λ2−μ2   λ2          0
        0        μ2          −λ1−λ2−μ2   λ2
        0        0           μ2          −λ1−μ2 ],

Q01 = Q12 = λ1 I (with I the 4 × 4 identity matrix),

Q10 = diag(μ1, μ1/2, μ1/3, μ1/4),

Q11 = [ −λ1−λ2−μ1   λ2                    0                     0
        μ2/2        −λ1−λ2−μ1/2−μ2/2     λ2                    0
        0           2μ2/3                −λ1−λ2−μ1/3−2μ2/3     λ2
        0           0                    3μ2/4                 −λ1−μ1/4−3μ2/4 ],
106 Analysis of Queues
Q21 = diag(μ1, 2μ1/3, 2μ1/4, 2μ1/5),

Q22 = [ −λ1−λ2−μ1   λ2                     0                       0
        μ2/3        −λ1−λ2−2μ1/3−μ2/3     λ2                      0
        0           2μ2/4                 −λ1−λ2−2μ1/4−2μ2/4      λ2
        0           0                     3μ2/5                   −λ1−2μ1/5−3μ2/5 ],

Q23 = Q34 = λ1 I,

Q32 = diag(μ1, 3μ1/4, 3μ1/5, 3μ1/6),

Q33 = [ −λ1−λ2−μ1   λ2                     0                       0
        μ2/4        −λ1−λ2−3μ1/4−μ2/4     λ2                      0
        0           2μ2/5                 −λ1−λ2−3μ1/5−2μ2/5      λ2
        0           0                     3μ2/6                   −λ1−3μ1/6−3μ2/6 ],
Exponential Interarrival and Service Times: Numerical Methods 107
Q43 = diag(μ1, 4μ1/5, 4μ1/6, 4μ1/7),

and

Q44 = [ −λ2−μ1   λ2                  0                     0
        μ2/5     −λ2−4μ1/5−μ2/5     λ2                    0
        0        2μ2/6              −λ2−4μ1/6−2μ2/6       λ2
        0        0                  3μ2/7                 −4μ1/7−3μ2/7 ].
One way to obtain the steady-state probabilities p = [p0,0 p0,1 . . . p4,3] is to solve pQ = [0 0 . . . 0] together with the normalizing condition

∑_{i=0}^{4} ∑_{j=0}^{3} pi,j = 1.
However, that process gets computationally intensive for large state spaces.
Therefore, we describe an alternate procedure to obtain p, which is essen-
tially the Servi algorithm that we would subsequently describe for a general
two-dimensional birth and death process.
Instead of obtaining the 1 × 20 row vector p directly, we write
p = a[R0 R1 R2 R3 R4 ] where a is a 1 × 4 row vector and Ri is a 4 × 4 matrix
(for i = 0, 1, 2, 3, 4). The vector a and matrix Ri (for i = 0, . . . , 4) are unknown
and need to be obtained recursively. Since pQ = [0 . . . 0], using the block-tridiagonal form of Q and equating each block column of pQ to the zero vector, we have

a(R0 Q00 + R1 Q10) = 0,
a(R0 Q01 + R1 Q11 + R2 Q21) = 0,
a(R1 Q12 + R2 Q22 + R3 Q32) = 0,
a(R2 Q23 + R3 Q33 + R4 Q43) = 0,
a(R3 Q34 + R4 Q44) = 0.

The following set of Ri (for i = 0, . . . , 4) and a values would ensure that the previous set of equations is satisfied:

R0 = I,
R1 = −R0 Q00 Q10^{−1},
R_{i+1} = −(R_{i−1} Q_{i−1,i} + R_i Q_{i,i}) Q_{i+1,i}^{−1} for i = 1, 2, 3,

with a chosen as a nonzero solution of a(R3 Q34 + R4 Q44) = 0, normalized so that

∑_{i=0}^{4} ∑_{j=0}^{3} pi,j = 1.
In general, for a two-dimensional birth and death process, Q has a similar block-tridiagonal form in which 0 is a (b2 + 1) × (b2 + 1) matrix of zeros and, for all i, j, the Qi,j are (b2 + 1) × (b2 + 1) matrices. Assuming that the CTMC is irreducible (i.e., it is possible to go from every state to every other state in one or more transitions), our objective is to determine the steady-state probabilities pi,j for 0 ≤ i ≤ b1 and 0 ≤ j ≤ b2, where ∑_{i=0}^{b1} ∑_{j=0}^{b2} pi,j = 1.
Problem 19
Consider a bilingual customer service center where there are two finite-capacity queues: one for English-speaking customers and the other for Spanish-speaking customers. A maximum of three Spanish-speaking customers can
be in the system at any time. Likewise, a maximum of four English-
speaking customers can be in the system simultaneously. Spanish-speaking
and English-speaking customers arrive into their respective queues according to Poisson processes with respective rates 4 and 6 per hour. There is a Spanish-speaking server who takes on average 12 min to serve each of his customers, and there is an English-speaking server who takes on average 7.5 min to serve each of her customers. Assume that none of the customers are bilingual; however, the manager who oversees the two servers can speak English and Spanish. Whenever one of the queues has two or more customers more than the other queue, the manager helps out the server with the longer queue, thereby increasing the
service rate by 2 per hour. Assume that all service times are exponentially
distributed.
Model the bilingual customer service center system using a two-
dimensional birth and death process. Then use the Servi algorithm to obtain
the steady-state probabilities of the number of customers in the system
speaking Spanish and English. What fraction of customers of each type is
rejected without service? Determine the average time spent by each type of
accepted customer in the system.
Solution
Let X1 (t) be the number of Spanish-speaking customers in the system at
time t and X2 (t) be the number of English-speaking customers in the sys-
tem at time t. From the problem description, b1 = 3 and b2 = 4. Clearly,
{(X1 (t), X2 (t)), t ≥ 0} is a finite-state CTMC that can be modeled as a two-
dimensional birth and death process with 0 ≤ X1 (t) ≤ 3 and 0 ≤ X2 (t) ≤ 4
for all t. The state space is S = { (0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (1, 0),
(1, 1), (1, 2), (1, 3), (1, 4), (2, 0), (2, 1), (2, 2), (2, 3), (2, 4), (3, 0), (3, 1), (3, 2),
(3, 3), (3, 4) }. Note that when the Spanish-speaking server is by himself,
the service rate is 5 per hour, and when the English-speaking server is
by herself, the service rate is 8 per hour. However, when the manager
comes to assist, the service rate of the Spanish-speaking server becomes
7 per hour and that of the English-speaking server becomes 10 per hour.
Using that and the arrival rates, the Q matrix (in the order of states in S)
is given as
Q =

[ −10    6    0    0    0    4    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    8  −18    6    0    0    0    4    0    0    0    0    0    0    0    0    0    0    0    0    0
    0   10  −20    6    0    0    0    4    0    0    0    0    0    0    0    0    0    0    0    0
    0    0   10  −20    6    0    0    0    4    0    0    0    0    0    0    0    0    0    0    0
    0    0    0   10  −14    0    0    0    0    4    0    0    0    0    0    0    0    0    0    0
    5    0    0    0    0  −15    6    0    0    0    4    0    0    0    0    0    0    0    0    0
    0    5    0    0    0    8  −23    6    0    0    0    4    0    0    0    0    0    0    0    0
    0    0    5    0    0    0    8  −23    6    0    0    0    4    0    0    0    0    0    0    0
    0    0    0    5    0    0    0   10  −25    6    0    0    0    4    0    0    0    0    0    0
    0    0    0    0    5    0    0    0   10  −19    0    0    0    0    4    0    0    0    0    0
    0    0    0    0    0    7    0    0    0    0  −17    6    0    0    0    4    0    0    0    0
    0    0    0    0    0    0    5    0    0    0    8  −23    6    0    0    0    4    0    0    0
    0    0    0    0    0    0    0    5    0    0    0    8  −23    6    0    0    0    4    0    0
    0    0    0    0    0    0    0    0    5    0    0    0    8  −23    6    0    0    0    4    0
    0    0    0    0    0    0    0    0    0    5    0    0    0   10  −19    0    0    0    0    4
    0    0    0    0    0    0    0    0    0    0    7    0    0    0    0  −13    6    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    7    0    0    0    8  −21    6    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    5    0    0    0    8  −19    6    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    5    0    0    0    8  −19    6
    0    0    0    0    0    0    0    0    0    0    0    0    0    0    5    0    0    0    8  −13 ].
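As a check, this generator can be built programmatically from the transition rules (arrival rates 4 and 6; base service rates 5 and 8, raised to 7 and 10 when the manager assists the queue that is longer by two or more). A minimal numpy sketch, with helper names that are my own:

```python
import numpy as np

l1, l2 = 4.0, 6.0        # Spanish and English arrival rates (per hour)
b1, b2 = 3, 4            # capacities
idx = lambda i, j: (b2 + 1) * i + j   # state (i, j) -> row index, order of S

n = (b1 + 1) * (b2 + 1)
Q = np.zeros((n, n))
for i in range(b1 + 1):
    for j in range(b2 + 1):
        s = idx(i, j)
        mu1 = 5.0 + (2.0 if i - j >= 2 else 0.0)   # Spanish service rate
        mu2 = 8.0 + (2.0 if j - i >= 2 else 0.0)   # English service rate
        if i < b1:
            Q[s, idx(i + 1, j)] = l1
        if j < b2:
            Q[s, idx(i, j + 1)] = l2
        if i > 0:
            Q[s, idx(i - 1, j)] = mu1
        if j > 0:
            Q[s, idx(i, j - 1)] = mu2
        Q[s, s] = -Q[s].sum()               # diagonal makes each row sum to 0
```

The entries reproduce the matrix displayed above (e.g., the diagonal entry for state (2, 0) is −17, since the manager raises the Spanish service rate to 7 there).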
In block form,

Q = [ Q0,0  Q0,1  0     0
      Q1,0  Q1,1  Q1,2  0
      0     Q2,1  Q2,2  Q2,3
      0     0     Q3,2  Q3,3 ].
Using the Servi algorithm described earlier, we recursively obtain

R0 = I = [ 1 0 0 0 0
           0 1 0 0 0
           0 0 1 0 0
           0 0 0 1 0
           0 0 0 0 1 ],
R1 = [  2     −1.2    0      0      0
       −1.6    3.6   −1.2    0      0
        0     −2      4     −1.2    0
        0      0     −2      4     −1.2
        0      0      0     −2      2.8 ],
R2 = [  5.0857   −7.9200    1.4400     0          0
       −7.5429   19.6000   −9.8400     1.4400     0
        2.2857  −15.6000   22.4000   −10.8000     1.4400
        0         3.2000  −17.2000    24.0000    −9.3600
        0         0         4.0000   −15.6000    12.2400 ],
R3 = [  20.2596   −31.3420    16.1280    −1.7280    0
       −39.8041    80.0539   −70.1280    18.4320   −1.7280
        23.3796   −77.6735   135.8400   −78.4800   18.4320
        −3.6571    30.1714  −119.7600   146.5600  −63.4080
         0         −4.5714    43.3600   −99.4400   62.9920 ].
Solving a(R2 Q2,3 + R3 Q3,3) = 0 with the normalizing condition yields the steady-state probabilities pij = P(X1 = i, X2 = j). From these, the fraction of Spanish-speaking customers rejected is P(X1 = 3) = 0.1357 and the fraction of English-speaking customers rejected is P(X2 = 4) = 0.0748, so the accepted arrival rates are 4(1 − 0.1357) = 3.4572 and 6(1 − 0.0748) = 5.5512 per hour, respectively. Also, the mean numbers in the system are

∑_{i=0}^{3} ∑_{j=0}^{4} i pij = 1.1122

and

∑_{i=0}^{3} ∑_{j=0}^{4} j pij = 1.2926.
Hence, from Little’s law, the average time spent by accepted Spanish- and
English-speaking customers is 1.1122/3.4572 = 0.3217 h and 1.2926/5.5512 =
0.2329 h, respectively.
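The recursive computation of R1, R2, R3 and the vector a can be reproduced numerically. In this sketch (helper names and the SVD-based null-vector step are my own choices), the 5 × 5 blocks are taken from the Q matrix displayed earlier:

```python
import numpy as np

I5 = np.eye(5)
Q00 = np.array([[-10, 6, 0, 0, 0], [8, -18, 6, 0, 0], [0, 10, -20, 6, 0],
                [0, 0, 10, -20, 6], [0, 0, 0, 10, -14]], float)
Q11 = np.array([[-15, 6, 0, 0, 0], [8, -23, 6, 0, 0], [0, 8, -23, 6, 0],
                [0, 0, 10, -25, 6], [0, 0, 0, 10, -19]], float)
Q22 = np.array([[-17, 6, 0, 0, 0], [8, -23, 6, 0, 0], [0, 8, -23, 6, 0],
                [0, 0, 8, -23, 6], [0, 0, 0, 10, -19]], float)
Q33 = np.array([[-13, 6, 0, 0, 0], [8, -21, 6, 0, 0], [0, 8, -19, 6, 0],
                [0, 0, 8, -19, 6], [0, 0, 0, 8, -13]], float)
Q01 = Q12 = Q23 = 4 * I5
Q10 = np.diag([5., 5., 5., 5., 5.])
Q21 = np.diag([7., 5., 5., 5., 5.])
Q32 = np.diag([7., 7., 5., 5., 5.])

# Servi recursion: R0 = I, R_{i+1} = -(R_{i-1} Q_{i-1,i} + R_i Q_{i,i}) Q_{i+1,i}^{-1}
R0 = I5
R1 = -R0 @ Q00 @ np.linalg.inv(Q10)
R2 = -(R0 @ Q01 + R1 @ Q11) @ np.linalg.inv(Q21)
R3 = -(R1 @ Q12 + R2 @ Q22) @ np.linalg.inv(Q32)

# a solves a (R2 Q23 + R3 Q33) = 0; take a left null vector via SVD
M = R2 @ Q23 + R3 @ Q33
a = np.linalg.svd(M.T)[2][-1]
a /= a @ (R0 + R1 + R2 + R3) @ np.ones(5)   # normalize so p sums to 1

p = np.vstack([a @ R0, a @ R1, a @ R2, a @ R3])   # p[i, j] = P(X1=i, X2=j)
mean_spanish = sum(i * p[i].sum() for i in range(4))
mean_english = sum(j * p[:, j].sum() for j in range(5))
```

The computed R1, R2, R3 match the matrices displayed above, and the two means recover the 1.1122 and 1.2926 used in the Little's law step.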
Problem 20
Laurie’s Truck Repair offers emergency services for which they have two
facilities: one in the north end of town and the other in the south end. All
trucks that require an emergency repair call a single number to schedule
their repair. When a call is received, the operator must determine whether
to accept the repair request. If a repair is accepted, the operator must also
determine whether to send it to the north or south facility. The company has
installed a sophisticated system that can track the status of all the repairs (this
means the operator can know how many outstanding repairs are in progress
at each facility). Due to space restrictions to park the trucks, the north-side
facility can handle at most three requests at a time, whereas the south-side
facility can only handle two simultaneous requests. Use the following infor-
mation to determine the routing strategy: calls for repair arrive according
to a Poisson process with mean rate of four per day; the service times are
exponentially distributed at both facilities; however, the north-side facility can
repair three trucks per day on average, whereas the south-side facility can
repair two per day on average. The average revenue earned per truck repair
is $100. The holding cost per truck per day is $20 in the north side and $10 in
the south side (the difference is partly due to the cost of insurance in the two
neighborhoods). Assume the following: the time to take a truck to a repair
facility is negligible compared to the service time; decisions to accept/reject
and route are made instantaneously and are based only on the number of
committed outstanding repairs at each facility; once accepted at a facility,
the truck does not leave it until the repair is complete; the operator would
never reject a call if there is space in at least one of the facilities to repair; at
either location, trucks are repaired one at a time; the revenue earned for a
truck repair is independent of the time to repair, location, and type of repair.
Solution
For notational convenience, we use subscript “1” to denote the north side
and subscript "2" for the south side. Note that the problem description is almost
identical to that of Problem 17 with B1 = 3, B2 = 2, r = $100, μ1 = 3 per day,
μ2 = 2 per day, λ = 4 per day, h1 = $20 per truck per day, and h2 = $10 per
truck per day. However, one key difference between Problem 17 and this
one is that here the solution needs to be the actual policy (not the structure as
required in Problem 17). In other words, if a request for service is made when
there are i trucks in the north side and j in the south side, should we accept
the request, and if we do accept, should it be sent to the north or south side?
Let X1 (t) and X2 (t) denote the number of trucks under repair in the north-
and south-side facilities, respectively. Ignoring the time to schedule as well
as to drive to the appropriate facility, the system can be modeled as a two-
dimensional birth and death process {(X1 (t), X2 (t)), t ≥ 0} with state space
S = {(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2), (3, 0), (3, 1), (3, 2)}.
The action in state (3, 2) is to reject requests for repairs. When the system
is in states (0, 2), (1, 2), and (2, 2), the optimal action is to route to facility 1
(i.e., north) since there is no space in facility 2. Likewise, when the system
is in states (3, 0) and (3, 1), the optimal action is to route to facility 2 (i.e.,
south) since there is no space in facility 1. Thus, we only need to determine
the actions in the six states (0, 0), (0, 1), (1, 0), (1, 1), (2, 0), and (2, 1).
Although there are 2^6 = 64 possible sets of actions in the six states together
where we need to determine the optimal action (route to 1 or 2), from the
solution to Problem 17, we know the optimal solution is a monotonic switch-
ing curve. Therefore, we are reduced to only 10 different sets of actions that
we need to consider, which are summarized in Table 3.1. Let Aij be the action
in state (i, j) such that Aij = 1 implies routing to 1 and Aij = 2 implies routing
to 2. Therefore, there is considerable computation and time savings from 64
possible alternatives to 10. The only possible concern is that in Problem 17,
the objective is in terms of long-run average discounted cost, whereas here
it is the long-run average cost per unit time. As it turns out, the average cost
per unit time case also yields the same structure of the optimal policy.
Each one of the 10 alternative actions in Table 3.1 yields a two-
dimensional birth and death process. As described earlier, for all the 10
alternatives, X1 (t) and X2 (t) denote the number of trucks under repair in
the north- and south-side facilities, respectively, and {(X1 (t), X2 (t)), t ≥ 0}
would be a two-dimensional birth and death process with state space
S = {(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2), (3, 0), (3, 1), (3, 2)}.
The key difference among the 10 alternatives would be the Q matrix. There-
fore, although notationally we have the same set of pi,j values, they would
depend on the Q matrix. The objective is to determine the optimal one among
the 10 alternatives that would maximize the expected net revenue per unit
TABLE 3.1
Alternatives for Actions in the 6 States
Where They Are to Be Determined
A00 A01 A10 A11 A20 A21
1 1 1 1 1 1
1 1 1 1 2 1
1 1 1 1 2 2
1 1 2 1 2 1
1 1 2 1 2 2
2 1 2 1 2 1
1 1 2 2 2 2
2 1 2 1 2 2
2 1 2 2 2 2
2 2 2 2 2 2
time. For a given alternative, if the steady-state probabilities pi,j can be com-
puted for all i and j such that (i, j) ∈ S, the steady-state expected net revenue
per unit time is
rλ(1 − p3,2) − ∑_{i=0}^{3} ∑_{j=0}^{2} (ih1 + jh2) pi,j
dollars per day. This is due to the fact that a fraction (1 − p3,2 ) of requests
are accepted (which arrive at rate λ on average) and every request on aver-
age brings a revenue of r dollars; hence, the average revenue is rλ(1 − p3,2 ).
The remaining term is the average holding cost expenses that need to be sub-
tracted from the revenue. Since at any given time there are i trucks in location
1 and j trucks in location 2 with probability pi,j, by conditioning on the number of trucks in each location we obtain the expected holding cost per unit time: the cost rate is ih1 + jh2 when there are i trucks in location 1 and j in location 2.
For each of the candidate alternate actions, to evaluate the objective func-
tion, that is, the steady-state expected net revenue per unit time, we solve for
pi,j and compute the objective function. To speed up the process to obtain pi,j ,
we use the Servi algorithm in Section 3.1.2 but do not present the details here.
Using the numerical values for r, h1 , h2 , λ, μ1 , and μ2 , we obtain the optimal
set of actions as A00 = 1, A01 = 1, A10 = 2, A11 = 1, A20 = 2, and A21 = 2 with an
objective function value of $320.8905 per day. This optimal action set yields
a two-dimensional birth and death process with rate diagram depicted in
Figure 3.3. For this system, obtaining the steady-state probabilities using the Servi algorithm is left as an exercise.
There are many such queueing control problems where the objective
is to determine the optimal control actions in each state. In a majority of
FIGURE 3.3
Two-dimensional birth and death process corresponding to optimal action.
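The policy-evaluation step described above can be sketched numerically. The code below (variable names are my own) builds the generator induced by the optimal action set, solves for the steady-state probabilities, and evaluates the net-revenue objective; under the stated modeling assumptions it should agree with the $320.8905 per day reported in the text:

```python
import numpy as np

lam, mu1, mu2 = 4.0, 3.0, 2.0          # arrival and service rates (per day)
r, h1, h2 = 100.0, 20.0, 10.0          # revenue and holding-cost parameters
B1, B2 = 3, 2                          # capacities (north, south)
# optimal actions from the text: 1 = route north, 2 = route south
A = {(0, 0): 1, (0, 1): 1, (1, 0): 2, (1, 1): 1, (2, 0): 2, (2, 1): 2}

states = [(i, j) for i in range(B1 + 1) for j in range(B2 + 1)]
ix = {s: k for k, s in enumerate(states)}
Q = np.zeros((len(states), len(states)))
for (i, j) in states:
    s = ix[(i, j)]
    if (i, j) != (B1, B2):             # in (3, 2) the request is rejected
        if i == B1:
            dest = (i, j + 1)          # north full: forced to the south
        elif j == B2:
            dest = (i + 1, j)          # south full: forced to the north
        else:
            dest = (i + 1, j) if A[(i, j)] == 1 else (i, j + 1)
        Q[s, ix[dest]] += lam
    if i > 0:
        Q[s, ix[(i - 1, j)]] += mu1    # repairs complete one at a time
    if j > 0:
        Q[s, ix[(i, j - 1)]] += mu2
    Q[s, s] = -Q[s].sum()

M = Q.T.copy()
M[-1, :] = 1.0                         # normalization replaces one balance eq.
p = np.linalg.solve(M, np.eye(len(states))[-1])
obj = r * lam * (1 - p[ix[(B1, B2)]]) - sum(
    (h1 * i + h2 * j) * p[ix[(i, j)]] for (i, j) in states)
```

Looping this evaluation over the 10 alternatives of Table 3.1 and keeping the largest objective value is then a few more lines.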
Problem 21
Consider a queueing system with two parallel queues and two servers, one
for each queue. Customers arrive to the system according to PP(λ), and each
arriving customer joins the queue with the smaller number of customers. If both
queues have the same number of customers, then the arriving customer picks
either with equal probability. The service times are exponentially distributed
with parameter μi at server i for i = 1, 2. When a service is completed at one queue
and the other queue has two more customers than this queue, then the cus-
tomer at the end of the line instantaneously switches to the shorter queue
(this is called jockeying) to balance the queues. Let X1 (t) and X2 (t) be the
number of customers in queues 1 and 2, respectively, at time t. Assuming
that X1 (0) = X2 (0) = 0, we have |X1 (t) − X2 (t)| ≤ 1 for all t. Model the bivari-
ate stochastic process {(X1 (t), X2 (t)), t ≥ 0} as a QBD by obtaining A0 , A1 , A2 ,
B0 , B1 , and B2 .
Solution
The bivariate CTMC {(X1 (t), X2 (t)), t ≥ 0} has a state space
S = {(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2), (2, 2), (3, 2), (2, 3), . . .}.
By writing down the Q matrix of the CTMC and considering sets of three
states as a “level” with three “phases,” it is easy to verify that Q has a QBD
structure with
⎛ ⎞
0 0 0
A0 = B0 = ⎝ λ 0 0 ⎠
λ 0 0
⎛ ⎞
0 μ2 μ1
A2 = B2 = ⎝ 0 0 0 ⎠
0 0 0
⎛ ⎞
−λ − μ1 − μ2 λ/2 λ/2
A1 = ⎝ μ1 + μ2 −λ − μ1 − μ2 0 ⎠ and
μ1 + μ2 0 −λ − μ1 − μ2
⎛ ⎞
−λ λ/2 λ/2
B1 = ⎝ μ2 −λ − μ2 0 ⎠.
μ1 0 −λ − μ1
Care should be taken to write the states in the order (i, i), then (i + 1, i), fol-
lowed by (i, i + 1) for all i = 0, 1, 2, . . . and use those three states as part of
a level. Note that the previous CTMC does not resemble a birth and death
process at all.
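A quick numerical sanity check on these blocks (using illustrative values λ = 2, μ1 = 1, μ2 = 1.5, which are my own choices and not from the text) is that each block row of the resulting QBD generator sums to zero:

```python
import numpy as np

lam, mu1, mu2 = 2.0, 1.0, 1.5   # illustrative rates (not from the text)
A0 = B0 = np.array([[0, 0, 0], [lam, 0, 0], [lam, 0, 0]])
A2 = B2 = np.array([[0, mu2, mu1], [0, 0, 0], [0, 0, 0]])
A1 = np.array([[-lam - mu1 - mu2, lam / 2, lam / 2],
               [mu1 + mu2, -lam - mu1 - mu2, 0],
               [mu1 + mu2, 0, -lam - mu1 - mu2]])
B1 = np.array([[-lam, lam / 2, lam / 2],
               [mu2, -lam - mu2, 0],
               [mu1, 0, -lam - mu1]])

boundary = np.hstack([B1, B0])          # level-0 block row of Q
interior = np.hstack([A2, A1, A0])      # block row for levels 1, 2, ...
```

If any row failed to sum to zero, a rate would have been misplaced between the level-up, level-down, and within-level blocks.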
Problem 22
Consider a single server queue with infinite waiting room where the service
times are exp(μ). Into this queue, arrivals occur according to a Poisson pro-
cess with one of three possible rates λ0 , λ1 , or λ2 . The rates are governed
by a CTMC {Z(t), t ≥ 0}, called the environment process, on states {0, 1, 2} with infinitesimal generator matrix

[ −α1−α2   α1        α2
  β0       −β0−β2    β2
  γ0       γ1        −γ0−γ1 ].

That is, arrivals occur at rate λi whenever Z(t) = i. Let X(t) be the number of customers in the system at time t. Model the bivariate stochastic process {(X(t), Z(t)), t ≥ 0} as a QBD by obtaining A0, A1, A2, B0, B1, and B2.

Solution

The bivariate CTMC {(X(t), Z(t)), t ≥ 0} has a state space

S = {(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2), . . .}.
By writing down the Q matrix of the CTMC and considering sets of three
states as a “level” with three “phases,” it is easy to verify that Q has a QBD
structure with
⎛ ⎞
λ0 0 0
A0 = B0 = ⎝ 0 λ1 0 ⎠
0 0 λ2
⎛ ⎞
μ 0 0
A2 = B2 = ⎝ 0 μ 0 ⎠
0 0 μ
⎛ ⎞
−λ0 − μ − α1 − α2 α1 α2
A1 = ⎝ β0 −λ1 − μ − β0 − β2 β2 ⎠ and
γ0 γ1 −λ2 − μ − γ0 − γ1
⎛ ⎞
−λ0 − α1 − α2 α1 α2
B1 = ⎝ β0 −λ1 − β0 − β2 β2 ⎠.
γ0 γ1 −λ2 − γ0 − γ1
Note that the levels correspond to the number in the system and phase
corresponds to the state of the environment CTMC {Z(t), t ≥ 0}. In this
example, the phases and levels have a true meaning unlike the previous
example. In fact, historically these types of queues were first studied as QBDs
and hence some of that terminology remain. Further, such types of time-
varying arrival processes are called Markov-modulated Poisson processes.
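To make the Markov-modulated structure concrete, here is a small sketch with illustrative parameter values (all numbers are my own, not from the text): it verifies that A0 + A1 + A2 recovers the environment generator, and computes the time-averaged arrival rate ∑ πi λi of the MMPP.

```python
import numpy as np

lams = np.array([1.0, 4.0, 9.0])   # lambda_0, lambda_1, lambda_2 (illustrative)
mu = 5.0
a1, a2, b0, b2, g0, g1 = 0.3, 0.2, 0.4, 0.1, 0.5, 0.6   # illustrative switch rates
Qenv = np.array([[-a1 - a2, a1, a2],
                 [b0, -b0 - b2, b2],
                 [g0, g1, -g0 - g1]])

A0 = np.diag(lams)                  # level-up: arrival at the current rate
A2 = mu * np.eye(3)                 # level-down: service completion
A1 = Qenv - np.diag(lams) - mu * np.eye(3)
assert np.allclose(A0 + A1 + A2, Qenv)   # the blocks sum to the env. generator

# stationary distribution of the environment, and the mean modulated rate
M = Qenv.T.copy()
M[-1, :] = 1.0
pi = np.linalg.solve(M, np.array([0.0, 0.0, 1.0]))
mean_rate = pi @ lams
```

The mean rate lies between the smallest and largest λi, weighted by how long the environment spends in each state.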
Problem 23
Customers arrive at a single server facility according to PP(λ). An arriving
customer, independent of everything else, belongs to class-1 with probabil-
ity α and class-2 with probability β = 1 − α. The service time of class i (for
i = 1, 2) customers are IID exp(μi ) such that μ1 = μ2 . The customers form a
single line and are served according to first come first served (FCFS). Let
X(t) be the total number of customers in the system at time t, and Y(t) be the
class of the customer in service (with Y(t) = 0 if there are no customers in the
system). Model the bivariate stochastic process {(X(t), Y(t)), t ≥ 0} as a QBD
by obtaining A0 , A1 , A2 , B0 , B1 , and B2 .
Solution
The bivariate CTMC {(X(t), Y(t)), t ≥ 0} has a state space
S = {(0, 0), (1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2), . . .}.
By writing down the Q matrix of the CTMC and considering sets of two
states as a “level” with two “phases,” note that Q has a QBD structure with
ℓ = 3 and m = 2, where

A0 = [ λ  0
       0  λ ],

B0 = [ 0  0
       λ  0
       0  λ ],

A2 = [ αμ1  βμ1
       αμ2  βμ2 ],

B2 = [ 0  αμ1  βμ1
       0  αμ2  βμ2 ],

A1 = [ −λ−μ1    0
        0      −λ−μ2 ],  and

B1 = [ −λ    αλ       βλ
        μ1   −λ−μ1    0
        μ2    0       −λ−μ2 ].
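Since ℓ ≠ m here, the boundary blocks are rectangular. A short check (with illustrative numbers of my own choosing) confirms that the block rows of Q are still conservative:

```python
import numpy as np

lam, alpha, mu1, mu2 = 3.0, 0.4, 2.0, 5.0   # illustrative, with mu1 != mu2
beta = 1 - alpha
A0 = lam * np.eye(2)
B0 = np.array([[0, 0], [lam, 0], [0, lam]])
A2 = np.array([[alpha * mu1, beta * mu1], [alpha * mu2, beta * mu2]])
B2 = np.array([[0, alpha * mu1, beta * mu1], [0, alpha * mu2, beta * mu2]])
A1 = np.diag([-lam - mu1, -lam - mu2])
B1 = np.array([[-lam, alpha * lam, beta * lam],
               [mu1, -lam - mu1, 0],
               [mu2, 0, -lam - mu2]])

row0 = np.hstack([B1, B0])        # boundary block row: 3 x 5
row1 = np.hstack([B2, A1, A0])    # level-1 block row: 2 x 7
row2 = np.hstack([A2, A1, A0])    # block row for levels 2, 3, ...
```

Note how the departing customer's class resets the phase: a completion hands the server to a class-1 customer with probability α and a class-2 customer with probability β, which is exactly what the columns of A2 and B2 encode.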
p0 B1 + p1 B2 = 0
p0 B0 + p1 A1 + p2 A2 = 0
p1 A0 + p2 A1 + p3 A2 = 0
p2 A0 + p3 A1 + p4 A2 = 0
p3 A0 + p4 A1 + p5 A2 = 0
.. .. ..
. . .
These equations are satisfied by a solution of the form pi = p1 R^{i−1} for all i ≥ 1.
This is known as the matrix geometric solution due to the matrix geometric relation between the stationary probabilities. Clearly, for that to be a solution, we need A0 + RA1 + R^2 A2 = 0, where R is an unknown matrix. In fact, the crux of the MGM is in computing the R that satisfies A0 + RA1 + R^2 A2 = 0. The matrix R is known as the auxiliary matrix. Then, once R is known, we can obtain p0 in terms of p1 so that it satisfies both p0 B1 + p1 B2 = 0 and p0 B0 + p1 A1 + p1 RA2 = 0. Note that this results in ℓ + m equations with ℓ unknowns for p0 and m unknowns for p1. However, as with any CTMC, the ℓ + m equations are not linearly independent. Thus, we would have to drop one of them and use the normalizing condition that the elements of p sum to one. For this, it is convenient to write down all the pi terms in terms of p1. The normalizing condition can be written as
p0 1 + (p1 + p2 + p3 + · · · )1 = 1,
p0 1 + (p1 + p1 R + p1 R^2 + · · · )1 = 1,
p0 1 + p1 (I + R + R^2 + · · · )1 = 1,
p0 1 + p1 (I − R)^{−1} 1 = 1
provided that all eigenvalues of R are in the open interval between −1 and 1.
This is also sometimes written as the spectral radius of R should be less than
1. An intuitive way to think about that result is that if x is an eigenvector of R and k the corresponding eigenvalue, then Rx = kx and R^i x = k^i x. Thus, the sum (I + R + R^2 + · · · )x can be written as (1 + k + k^2 + · · · )x, which converges provided |k| < 1. Thus, the sum (I + R + R^2 + · · · ) converges to (I − R)^{−1} if |k| < 1 for every eigenvalue k. Since the spectral radius of R is the largest |k|, by ensuring it is less than 1, all eigenvalues are between −1 and 1.
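This convergence fact can be illustrated numerically with a small matrix whose spectral radius is below 1 (the matrix below is my own illustration, not from the text):

```python
import numpy as np

R = np.array([[0.4, 0.1],
              [0.2, 0.3]])
assert max(abs(np.linalg.eigvals(R))) < 1   # spectral radius below 1

# partial sum I + R + R^2 + ... + R^199
S = np.zeros_like(R)
term = np.eye(2)
for _ in range(200):
    S += term
    term = term @ R

inverse = np.linalg.inv(np.eye(2) - R)      # closed form (I - R)^{-1}
```

With spectral radius about 0.5 here, the partial sum agrees with (I − R)^{−1} to machine precision after a couple hundred terms.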
Summarizing, we obtain the steady-state probabilities p = [p0 p1 p2 . . .]
by solving for p0 and p1 in the following set of equations (after dropping one of the linearly dependent equations):
p0 B1 + p1 B2 = 0
p0 B0 + p1 A1 + p1 RA2 = 0
p0 1 + p1 (I − R)−1 1 = 1
where
1 is a column vector of ones of the appropriate size (ℓ × 1 or m × 1)
R is the minimal nonnegative solution to the equation
R^2 A2 + RA1 + A0 = 0.
FIGURE 3.4
Schematic representation. (From Mahabhashyam, S., and Gautam, N., Queueing Syst. Theory
Appl., 51(1–2), 89, 2005. With permission.)
⎛ ⎞
B1 B0 0 0 0 0 ...
⎜ B2 A1 A0 0 0 0 ... ⎟
⎜ ⎟
⎜ ... ⎟
⎜ 0 A2 A1 A0 0 0 ⎟
Q=⎜ ⎟
⎜ 0 0 A2 A1 A0 0 ... ⎟
⎜ ⎟
⎜ 0 0 0 A2 A1 A0 ... ⎟
⎝ ⎠
.. .. .. .. .. .. ..
. . . . . . .
where
⎛ ⎞
λ 0 0 ... 0
⎜ 0 λ 0 ... 0 ⎟
⎜ ⎟
⎜ ⎟
A0 = B0 = ⎜ 0 0 λ ... 0 ⎟
⎜ ⎟
⎜ .. .. .. .. .. ⎟
⎝ . . . . . ⎠
0 0 ... 0 λ
⎛ ⎞
θ1 0 0 ... 0
⎜ 0 θ2 0 ... 0 ⎟
⎜ ⎟
⎜ ⎟
A2 = B2 = ⎜ 0 0 θ3 ... 0 ⎟
⎜ ⎟
⎜ .. .. .. .. .. ⎟
⎝ . . . . . ⎠
0 0 ... 0 θm
⎛ ⎞
s(1) q1,2 q1,3 ... q1,m
⎜ q2,1 s(2) q2,3 ... q2,m ⎟
⎜ ⎟
⎜ ⎟
A1 = ⎜ q3,1 q3,2 s(3) q3,4 ... ⎟
⎜ ⎟
⎜ .. .. .. .. .. ⎟
⎝ . . . . . ⎠
qm,1 qm,2 . . . qm,m−1 s(m)
such that for i = 1, . . . , m, s(i) = qi,i − λ − θi. Recall that the qi,j values in these matrices are from Qz. Note that A0, A1, and A2 are square matrices of size m × m. In addition, B0, B1, and B2 are also of size m × m; in essence, the value of ℓ corresponding to the Bi matrices for i = 0, 1, 2 equals m. The zeros in the Q matrix are also of size m × m.
In summary, {(Z(t), X(t)), t ≥ 0} is a level-independent infinite-level QBD
process. Next, we use MGM to obtain the steady-state probabilities. Since
Qz = A0 + A1 + A2 and {Z(t), t ≥ 0} is an irreducible CTMC, we satisfy the
requirement that A0 + A1 + A2 is irreducible. Let π be the stationary prob-
ability for the CTMC {Z(t), t ≥ 0}. The 1 × m row vector π = [π1 . . . πm ] can
be obtained by solving π(A0 + A1 + A2 ) = [0 0 . . . 0] and π1 = 1, where 1 is
an m × 1 column vector. Since the condition for the CTMC to be stable is
μ ∑_{i=1}^{m} πi bi > λ.
Note that this condition implies that the average arrival rate must be smaller
than the average service rate for stability. However, an interesting observa-
tion is that the average service time experienced by a customer is not the
reciprocal of the time-averaged service rate (described earlier).
Having described the stability condition, assuming it is met, the next
step is to obtain the steady-state probabilities p of the QBD process with rate
matrix Q. As described in the MGM analysis, we write p as [p0 p1 p2 . . . ],
where p0 , p1 , p2 , . . ., are 1 × m row vectors. We obtain the steady-state prob-
abilities p = [p0 p1 p2 . . . ] by solving for p0 and p1 in the following set of
equations:
p0 B1 + p1 B2 = 0
p0 B0 + p1 A1 + p1 RA2 = 0
p0 1 + p1 (I − R)^{−1} 1 = 1

where R is the minimal nonnegative solution to R^2 A2 + RA1 + A0 = 0.
Therefore, letting L denote the time-averaged number of jobs in the system computed from p, the expected waiting (including service) time of a job in the system can be calculated using Little's law as W = L/λ, where W is the average sojourn time in the system for an arbitrary arrival in steady state.
Problem 24
Consider a web server that streams video traffic at different bandwidths.
This is very common in websites that broadcast sports over the Internet.
The users are given an option to select one of the two bandwidths offered
depending on their connection speed. Let us denote the two bandwidths by
r1 = 0.265 Mbps (low bandwidth) and r2 = 0.350 Mbps (high bandwidth). Let
the processing capacity of the web server be C = 0.650 Mbps. Requests for the two bandwidths (low and high, respectively) arrive according to Poisson processes with rates λ1 = 1 per second and λ2 = 2 per second. The holding times are exponentially distributed with parameters μ1 = 2 per second and μ2 = 3 per second for the two bandwidths, respectively. Note
that service rate corresponds to the holding time that the streaming request
stays connected streaming traffic at its bandwidth. Besides the streaming
traffic, there is also elastic traffic, which is usually data. Elastic-traffic requests arrive according to a Poisson process with rate λ = 3 per second, and their file sizes are exponentially distributed with parameter μ = 8 per MB (note the unit of the file size parameter). The
elastic traffic uses whatever remaining capacity (out of C) the processor has.
Compute mean number of elastic traffic requests in the system in steady state
as well as the steady-state response time they experience.
Solution
Let the state of the environment be a two-dimensional vector denoting the
number of low- and high-bandwidth streaming requests in the system at
time t. Note that the low- and high-bandwidth requirements are r1 = 0.265
Mbps and r2 = 0.350 Mbps, respectively, with total capacity C = 0.650 Mbps.
Hence, the possible states for the environment are (0, 0), (1, 0), (2, 0),
(0, 1), (1, 1), where the first tuple represents the number of ongoing low-
bandwidth requests and the second one represents the number of ongoing
high-bandwidth requests. Without loss of generality, we map the states
(0, 0), (1, 0), (2, 0), (0, 1), and (1, 1) to 1, 2, 3, 4, and 5, respectively. Therefore,
the environment process {Z(t), t ≥ 0} is a CTMC on state space {1, 2, 3, 4, 5}.
The corresponding available bandwidths (in Mbps) for the elastic traffic
in those five states are b1 = 0.650, b2 = 0.385, b3 = 0.120, b4 = 0.300, and
b5 = 0.035. Further, the infinitesimal generator matrix Qz for the irreducible
CTMC {Z(t), t ≥ 0} is given by
Qz = [ −λ1−λ2   λ1            0      λ2        0
       μ1       −μ1−λ1−λ2    λ1     0         λ2
       0        2μ1          −2μ1   0         0
       μ2       0            0      −μ2−λ1    λ1
       0        μ2           0      μ1        −μ1−μ2 ]

   = [ −3    1    0    2    0
        2   −5    1    0    2
        0    4   −4    0    0
        3    0    0   −4    1
        0    3    0    2   −5 ].
Now, consider the elastic traffic. Elastic traffic requests arrive into a single
server queue according to a Poisson process with mean rate λ = 3 per second.
Each arriving request brings a certain amount of work distributed exponen-
tially with mean 1/μ = 0.125 Mb that is processed at varying rates, depending
on the available capacity left over by the streaming traffic. In particular, if
the environment process is in state i, the request (if any) in process is served
at rate θi = μbi for i = 1, 2, 3, 4, 5. Let X(t) be the number of elastic requests
in queue at time t. Let Z(t) be the state of the environment process, which
governs the server speed at time t and {Z(t), t ≥ 0} is an irreducible finite-
state CTMC with m = 5 states and infinitesimal generator matrix Qz = [qi,j ]
for i ∈ {1, 2, . . . , m} and j ∈ {1, 2, . . . , m} described earlier. When the state of
the environment process Z(t) = i, the service speed available is bi bytes per
unit time, which is also given earlier for i = 1, . . . , 5. Clearly, the bivariate
stochastic process {(Z(t), X(t)), t ≥ 0} is a two-dimensional CTMC on state
space {(1, 0), (2, 0), . . . , (5, 0), (1, 1), (2, 1), . . . , (5, 1), (1, 2), (2, 2), . . . , (5, 2), . . .}
where

A0 = B0 = diag(λ, λ, λ, λ, λ) = λI,   A2 = B2 = diag(θ1, θ2, θ3, θ4, θ5),
A1 = [ −λ1−λ2−λ−θ1   λ1                0            λ2              0
       μ1            −μ1−λ1−λ2−λ−θ2   λ1           0               λ2
       0             2μ1              −2μ1−λ−θ3    0               0
       μ2            0                0            −μ2−λ1−λ−θ4     λ1
       0             μ2               0            μ1              −μ1−μ2−λ−θ5 ].
We first verify that the stability condition

μ ∑_{i=1}^{5} πi bi > λ

is satisfied. Since

μ ∑_{i=1}^{5} πi bi = 3.2585 > 3 = λ,

the system is stable. Next, we compute the minimal nonnegative solution R to

R^2 A2 + RA1 + A0 = 0,

which yields
R = [ 0.4551   0.0845   0.0131   0.1456   0.0400
      0.2526   0.4187   0.0600   0.1273   0.1210
      0.2585   0.2917   0.4400   0.1291   0.0907
      0.2959   0.0905   0.0145   0.4774   0.0830
      0.2958   0.2373   0.0363   0.2367   0.4574 ].
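The stability check and the computation of R above can be reproduced by successive substitution, a standard iteration for the minimal nonnegative solution. The sketch below (helper names are my own) builds the blocks from the numeric Qz and the θi = μbi values:

```python
import numpy as np

lam, mu = 3.0, 8.0
b = np.array([0.650, 0.385, 0.120, 0.300, 0.035])
theta = mu * b
Qz = np.array([[-3, 1, 0, 2, 0], [2, -5, 1, 0, 2], [0, 4, -4, 0, 0],
               [3, 0, 0, -4, 1], [0, 3, 0, 2, -5]], float)

# stationary distribution of the environment CTMC
M = Qz.T.copy()
M[-1, :] = 1.0
pi = np.linalg.solve(M, np.eye(5)[-1])
drift = mu * pi @ b                     # should be 3.2585 > lam, so stable

A0 = lam * np.eye(5)
A2 = np.diag(theta)
A1 = Qz - lam * np.eye(5) - np.diag(theta)

# successive substitution: R <- -(A0 + R^2 A2) A1^{-1}, starting from R = 0
R = np.zeros((5, 5))
A1inv = np.linalg.inv(A1)
for _ in range(100000):
    Rnew = -(A0 + R @ R @ A2) @ A1inv
    if np.abs(Rnew - R).max() < 1e-12:
        R = Rnew
        break
    R = Rnew
```

Because the system is close to its stability boundary (3/3.2585 ≈ 0.92), the iteration converges only linearly and needs a few hundred passes.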
p0 B1 + p1 B2 = 0
p0 B0 + p1 A1 + p1 RA2 = 0
p0 1 + p1 (I − R)−1 1 = 1
Q = [ −4    1    2    1
       1   −3    2    0
       2    2   −6    2
       3    0    0   −3 ].    (3.8)
Problem 25
Obtain the left eigenvector of the Q matrix in Equation 3.8 corresponding to
eigenvalue 0 and normalize it so that it adds to 1 to obtain p.
Solution
The left eigenvectors of Q are [−0.6528 −0.4663 −0.3730 −0.4663],
[−0.5 0.5 −0.5 0.5], [−0.0000 0.4082 −0.8165 0.4082], and [0.4126 −0.7220
−0.2063 0.5157]. They correspond to eigenvalues 0, −6, −7, and −3,
respectively. The left eigenvector corresponding to eigenvalue of 0 is
[−0.6528 −0.4663 −0.3730 −0.4663]. Normalizing by dividing each element by the sum of the elements of this eigenvector, we get p = [0.3333 0.2381 0.1905 0.2381].
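This computation takes only a few lines, since the left eigenvectors of Q are the eigenvectors of its transpose:

```python
import numpy as np

Q = np.array([[-4, 1, 2, 1], [1, -3, 2, 0], [2, 2, -6, 2], [3, 0, 0, -3]], float)

w, v = np.linalg.eig(Q.T)      # columns of v are left eigenvectors of Q
k = np.argmin(abs(w))          # pick the eigenvalue closest to 0
p = np.real(v[:, k])
p = p / p.sum()                # normalize so the elements add to 1
```

The normalization also fixes the arbitrary sign and scale that the eigenvector routine returns.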
Problem 26
For the Q matrix in Equation 3.8, write down p in matrix form using Q so
that it can be used to solve for p.
Solution
Since the steady-state probabilities p = [p1 p2 p3 p4 ] can be obtained by
solving for pQ = [0 0 0 0] and p1 + p2 + p3 + p4 = 1, we have
[p1 p2 p3 p4] [ −4    1    2    1
                 1   −3    2    0
                 2    2   −6    2
                 3    0    0   −3 ] = [0 0 0 0]
and
[p1 p2 p3 p4] [ 1
                1
                1
                1 ] = 1.
The transient state probabilities satisfy p(t) = p(0) exp(Qt), where

exp(Qt) = I + ∑_{j=1}^{∞} (Qt)^j / j!.
Therefore, by using a large enough value of t, one could obtain p using the
previous equations (the reason for it being a one-line code is that in most
mathematical software packages, exponential of a matrix is a built-in func-
tion, namely, in MATLAB, the command is expm(Q*t)). We show this via
an example.
Problem 27
For the Q matrix in Equation 3.8, obtain p as the limit of a transient analysis.
Solution
Since the steady-state probabilities p = [p1 p2 p3 p4] can be obtained as the limit of p(0) exp(Qt) as t grows, picking a large t (say t = 50) and any initial distribution p(0) again yields p = [0.3333 0.2381 0.1905 0.2381].
year 2006). In a study conducted in January 2006, almost half the Fortune 500 IT executives identified power and cooling as problems in their data centers. A study identified that a 100,000 ft² data center would cost about $44 million
per year just to power the servers and $18 million per year to power the
cooling infrastructure.
One of the biggest concerns for data centers is to find a way to signif-
icantly reduce the energy consumed. Although there are several strategic
initiatives to design green data centers, energy consumption in data centers can also be significantly reduced by controlling their operations. For example,
instead of running one application per server, collect a set of applications
and run them on multiple servers. If the load for a collection of applications
is low, then one could turn off a few servers. For example, if we have 8 appli-
cations a1 , a2 , . . ., a8 and 10 servers s1 , s2 , . . ., s10 , then a possible assignment
is as follows: the set of applications {a1 , a3 , a6 } are assigned to each of servers
s3 , s5 , s7 , and s10 ; applications {a2 , a5 , a7 , a8 } are assigned to each of servers
s1 , s2 , s8 , and s9 ; and application {a4 } is assigned to each of servers s4 and
s6 . Then, if the load for the set of applications {a1 , a3 , a6 } is low, medium, or
high, then one could perhaps turn off two, one, or zero servers, respectively,
from the set of servers assigned to this application {s3 , s5 , s7 , s10 }.
Two of the difficulties for turning off servers are that (a) turning servers
on and off frequently causes wear and tear and reduces their lifetime in addi-
tion to spending energy for powering on and off; and (b) it takes a few minutes
for a server to be powered on, and therefore any unforeseen spikes in load
would cause degradation in service to the clients. To address concern (a), in
the model that we develop there would be a cost to boot a server (also popu-
larly known as switching cost in the queueing control literature). Further,
a strategy for concern (b) is to perform what is known as dynamic volt-
age/frequency scaling. In essence, the speed at which the server processes
information (which is related to the CPU frequency) can be reduced by scaling
down the voltage dynamically. By doing so, the CPU not only consumes
less energy (which is proportional to the cube of the frequency), but also
can switch instantaneously to a higher voltage when a spike occurs in the
load. However, even at the lowest frequency (similar to hibernation on a
laptop), the server consumes energy and therefore the best option is still to
turn servers off.
Next, we describe a simple model to develop a strategy for controlling the
processing speed of the servers as well as powering servers on and off. We
assume that applications are already assigned to servers and consider a sin-
gle collection of applications, all of which are loaded on K identical servers.
At any time, a subset of the K servers is on (the rest are off) and all servers
that are on run at the same frequency. There is a single queue for the set of K
servers and requests arrive according to a Poisson process with mean rate λ.
The number of bytes to process for each request is assumed to be IID with an
exponential distribution. There are a finite number of possible frequencies at
which to run the servers. Therefore, we assume that at any time, the service
times are exponentially distributed, with the rate determined by the frequency
currently in use.
FIGURE 3.5
Schematic for hysteretic policy. (The horizontal axis is the number in the system X(t), the
vertical axis is the number of servers on Z(t), and each cell indicates the service rate μ1, μ2,
or μ3 used in that state.)
For the number of servers, the optimal policy is of hysteretic type: there
are two thresholds (let us call them θ1 and θ2 such that θ1 ≤ θ2 ), and the hys-
teretic policy suggests that if the queue length grows larger than θ2 , then switch
on a server, but do not switch it off until the queue length goes below θ1 .
However, for the frequencies, the optimal policy is of threshold type.
Next, we illustrate the hysteretic policy for the number of servers and
threshold policy for the service rate using Figure 3.5. Let X(t) denote the
number of customers in the system at time t and let Z(t) be the number of
servers on at time t. For the purpose of illustration, let there be three servers
(i.e., K = 3) and the number of possible service rates is also three (hence
the three rates μ1 , μ2 , and μ3 ). We represent the state of the system at time t
using the two-tuple (X(t), Z(t)). Since there are two actions in each state, the
actions depicted in Figure 3.5 for each state need some explanation. When
the system is empty and one server is running (this corresponds to state (0,1)
in the figure), the server runs at rate μ1 . Now, if a new customer arrives in
this state (which is the only possible event), the action based on the policy
is to continue with one server and run the server in this new state (1,1) at
rate μ1 . In state (1,1) if an arrival occurs, the policy is to continue with one
server; however, the server will run at rate μ2 in the new state (2,1). Whereas,
if a service is completed in state (1,1), the policy is to stick with one server
and run the server at rate μ1 in the new state (0,1). As long as there are less
than or equal to four requests in the system, only one server will run at rate
μi , which will depend on the number in the system (i.e., μ1 for 0 or 1 in the
system, μ2 for 2 or 3 in the system, and μ3 for 4 in the system).
Now, in state (4,1), if an arrival occurs, from the policy in Figure 3.5, a
new server is added to the system as the first action. For the second action,
we observe the new state after arrival as (5,2), where the action is to run both
servers at rate μ2 (note that prior to this in state (4,1), the single server was
running at μ3 and that is slowed down). Note that in state (5,2), if a service is
completed, we do not immediately go back to 1 server but wait till the num-
ber of customers goes below 2. In other words, once there are two servers
running and the number of customers are between 2 and 7, the action with
respect to number of servers is to do nothing (i.e., no addition or subtraction).
The action in terms of service rates of both servers is to use μ1 for 2 or 3 in
the system, μ2 for 4 or 5 in the system, and μ3 for 6 or 7 in the system. In state
(2,2), if a service is completed, the action is to turn off a server (of course the
natural choice is to select the server, which completed the service). Also, in
the new state (1,1), the single server would run at rate μ1 . Likewise, if an
arrival occurs in state (7,2), then the action is to power on a new server and
in the new state (8,3), all three servers would run at μ3 . As long as there are
three or more customers in the system, all the three servers would continue
running at rates μ1 for 3 in the system, μ2 for 4 in the system, and μ3 for 5
or more in the system. When the number in the system reaches 3 with three
servers running and one service completes, one of the servers is turned off
and the remaining two servers in the new state (2,2) run at rate μ1 . All this is
depicted in the policy schematic in Figure 3.5. The policy is also described in
Table 3.2.
Although the optimal policy has a structure as described earlier (hys-
teretic for number of servers and threshold for service rates), it is not clear
what the exact optimal policy is. For that, we need to search across all such
policies and determine the one that results in the minimal long-run average
cost per unit time. To do so, we first explain how to compute the long-run average
cost per unit time for a given policy. For any given policy, we can describe
a CTMC {(X(t), Z(t)), t ≥ 0} with state space S. As an example, consider the
policy in Table 3.2 for a system with K = 3 and three possible service rates. The rate diagram of the
corresponding CTMC is depicted in Figure 3.6. Assume that for this CTMC,
we can obtain the steady-state probabilities pi,j for all (i, j) ∈ S. Then, the
long-run average cost per unit time is

Bλ(p4,1 + p7,2 ) + Σ_{(i,j)∈S} j (C0 + C μ(i,j)³) pi,j + Σ_{(i,j)∈S} h i pi,j ,

where μ(i,j) denotes the service rate used in state (i,j), B is the cost of booting
a server, C0 + Cμ(i,j)³ is the power cost per unit time of a running server
(recall that energy consumption is proportional to the cube of the frequency),
and h is the holding cost per customer per unit time.
TABLE 3.2
Hysteretic Policy in Tabular Form
Current State Server New Rate
(X(t), Z(t)) Event Action State Action
(0,1) Arrival Do nothing (1,1) μ1
(1,1) Arrival Do nothing (2,1) μ2
Departure Do nothing (0,1) μ1
(2,1) Arrival Do nothing (3,1) μ2
Departure Do nothing (1,1) μ1
(3,1) Arrival Do nothing (4,1) μ3
Departure Do nothing (2,1) μ2
(4,1) Arrival Add 1 server (5,2) μ2
Departure Do nothing (3,1) μ2
(2,2) Arrival Do nothing (3,2) μ1
Departure Remove 1 (1,1) μ1
(3,2) Arrival Do nothing (4,2) μ2
Departure Do nothing (2,2) μ1
(4,2) Arrival Do nothing (5,2) μ2
Departure Do nothing (3,2) μ1
(5,2) Arrival Do nothing (6,2) μ3
Departure Do nothing (4,2) μ2
(6,2) Arrival Do nothing (7,2) μ3
Departure Do nothing (5,2) μ2
(7,2) Arrival Add 1 server (8,3) μ3
Departure Do nothing (6,2) μ3
(3,3) Arrival Do nothing (4,3) μ2
Departure Remove 1 (2,2) μ1
(4,3) Arrival Do nothing (5,3) μ3
Departure Do nothing (3,3) μ1
(5,3) Arrival Do nothing (6,3) μ3
Departure Do nothing (4,3) μ2
(6,3) Arrival Do nothing (7,3) μ3
Departure Do nothing (5,3) μ3
(7,3) Arrival Do nothing (8,3) μ3
Departure Do nothing (6,3) μ3
(8,3) Arrival Do nothing (9,3) μ3
Departure Do nothing (7,3) μ3
(9,3) Arrival Do nothing (10,3) μ3
Departure Do nothing (8,3) μ3
(i,3), i ≥ 10 Arrival Do nothing (i+1,3) μ3
Departure Do nothing (i−1,3) μ3
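The policy of Table 3.2 can also be encoded compactly as a transition function. The sketch below hardcodes the thresholds implicit in the table (a server is added on an arrival in states (4,1) and (7,2), removed on a departure in (2,2) and (3,3), with per-server-count rate steps); it is an illustration of the table, not general-purpose code.

```python
# Thresholds read off Table 3.2.
ADD_AT = {1: 4, 2: 7}      # add a server on an arrival at this x with z servers
REMOVE_AT = {2: 2, 3: 3}   # remove a server on a departure at this x
# With z servers on, the rate is mu1 below the first entry, mu2 below the
# second entry, and mu3 at or above the second entry.
RATE_STEPS = {1: (2, 4), 2: (4, 6), 3: (4, 5)}

def next_state(x, z, event):
    """Given state (x, z) and an event, return the new state and the
    service rate label used in the new state, per Table 3.2."""
    if event == "arrival":
        z2 = z + 1 if z < 3 and x == ADD_AT[z] else z
        x2 = x + 1
    else:  # departure
        z2 = z - 1 if z > 1 and x == REMOVE_AT[z] else z
        x2 = x - 1
    lo, hi = RATE_STEPS[z2]
    rate = "mu1" if x2 < lo else ("mu2" if x2 < hi else "mu3")
    return (x2, z2), rate
```

For example, an arrival in state (4,1) yields (5,2) with rate μ2, matching the table.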
FIGURE 3.6
Rate diagram corresponding to hysteretic policy.
Problem 28
Using the notation described so far in this section, consider a small exam-
ple of K = 3 servers and two different service rates (such that μ1 = 10 and
μ2 = 15). For this system, obtain optimal values of l2 , l3 , u1 , u2 , m12 , m22 , and
m32 so that the long-run average cost per unit time is minimized subject to
the following constraints:
2 ≤ m12 ≤ u1 ,
2 ≤ l2 ≤ m22 ≤ u2 ,
l2 ≤ u1 + 1 ≤ u2 ,
l2 + 1 ≤ l3 ≤ m32 ,
l3 ≤ u2 + 1.
1. For each l2 , l3 , u1 , u2 , m12 , m22 , and m32 that satisfies the following
constraints
2 ≤ m12 ≤ u1
2 ≤ l2 ≤ m22 ≤ u2
l2 ≤ u1 + 1 ≤ u2
l2 + 1 ≤ l3 ≤ m32
l3 ≤ u2 + 1
S = {(0, 1), (1, 1), . . . , (u1 − 1, 1), (u1 , 1), (l2 , 2), (l2 + 1, 2), . . . , (u2 − 1, 2),
(u2 , 2), (l3 , 3), (l3 + 1, 3), . . .}.
Although there are many ways to numerically obtain pi,j for all
(i, j) ∈ S (especially by writing all pi,j values in terms of pu2 +1,3 ),
we use a finite state approximation by truncating the state space.
In particular, we pick a large M such that M is much larger than
m32 . Since λ = 30 and 3μ2 = 45, the system is stable and the steady-state
probabilities of states beyond M are negligibly small.
FIGURE 3.7
Rate diagram corresponding to optimal solution.
pi,j = pi,j if (i, j) ∈ S, and pi,j = 0 otherwise.
4. Now that we have an expression for pi,j for all (i, j) ∈ S given l2 , l3 , u1 ,
u2 , m12 , m22 , and m32 , the next step is to obtain the objective function.
Let us denote the objective function as f (l2 , l3 , u1 , u2 , m12 , m22 , m32 ).
Using the expression described prior to the problem state-
ment for the long-run average cost per unit time, the objective
function is
Bλ(pu1,1 + pu2,2 )
+ Σ_{j=1}^{3} Σ_{i=lj}^{mj2 −1} j (C0 + C μ1³) pi,j
+ Σ_{j=1}^{3} Σ_{i=mj2}^{uj} j (C0 + C μ2³) pi,j
+ Σ_{j=1}^{3} Σ_{i=lj}^{uj} h i pi,j ,

with the conventions l1 = 0 and u3 = M (the truncation level).
It is worthwhile to point out that the various pi,j values are them-
selves functions of l2 , l3 , u1 , u2 , m12 , m22 , and m32 . In particular, we
write pi,j = gij (l2 , l3 , u1 , u2 , m12 , m22 , m32 ) although we do not know
gij (·) explicitly, and this is done purely to write down a mathematical
program. Therefore, the relationship between the decision variables
and the objective function does not exist in closed form. The math-
ematical programming formulation to optimally select integers l2 , l3 ,
u1 , u2 , m12 , m22 , and m32 is
Minimize

Bλ(pu1,1 + pu2,2 )
+ Σ_{j=1}^{3} Σ_{i=lj}^{mj2 −1} j (C0 + C μ1³) pi,j
+ Σ_{j=1}^{3} Σ_{i=mj2}^{uj} j (C0 + C μ2³) pi,j
+ Σ_{j=1}^{3} Σ_{i=lj}^{uj} h i pi,j

Subject to
2 ≤ m12 ≤ u1
2 ≤ l2 ≤ m22 ≤ u2
l2 ≤ u1 + 1 ≤ u2
l2 + 1 ≤ l3 ≤ m32
l3 ≤ u2 + 1
pi,j = gij (l2 , l3 , u1 , u2 , m12 , m22 , m32 )   ∀ (i, j) : j = 1, 2, 3; lj ≤ i ≤ uj .
Reference Notes
The material presented in this chapter is rather unusual in a queueing
theory text, and in fact in most universities this material is not part of a grad-
uate level course on queueing. However, there are two topics in this chapter
that have received a tremendous amount of attention in the literature: matrix
analytical methods and control of queues. There are several excellent books
on matrix analytical methods, and one of the pioneering works is by Neuts
[85]. Other books include Stewart [99], and Latouche and Ramaswami [73].
The essence of matrix analytical methods is to use numerical and iterative
methods for Markov chains, and it is general enough to be used in a vari-
ety of settings beyond what is considered in this chapter. We will visit this
technique in the next chapter as well to get a full appreciation.
The topic of control of queues is also widespread. However, the litera-
ture on performance analysis and control is quite distinct. There is a clear
optimization flavor in control of queues and the use of stochastic dynamic
programming. A recent book by Stidham [100] meticulously covers the topic
of optimization in queues by design and control. The crucial point made
in this chapter is that stochastic dynamic programming only provides the
structure of the optimal policy (such as threshold, switching curve, and hys-
teretic), whereas to obtain the exact optimal policy, one needs to explore the
space of solutions that satisfy the given policy structure. This chapter, especially the
first and third parts, explicitly explores that.
Exercises
3.1. Refer to the notation in Problem 17 in Section 3.1.1. First show that
−h(x) satisfies conditions (3.2), (3.3), and (3.4). Then, assuming that
Vn (x) is nonincreasing and satisfies the conditions (3.2), (3.3), and
(3.4), show that Vn ((x − ei )+ ) for i = 1, 2 also satisfy the conditions
(3.2), (3.3), and (3.4).
3.2. A finite two-dimensional birth and death process on a rectangular
lattice for some given a and b has a state space given by S = {(i, j) :
0 ≤ i ≤ a, 0 ≤ j ≤ b}, that is, all integer points on the XY plane such
that the x points are between 0 and a and the y points are between
0 and b. The rate of going from state (i, j) to (i + 1, j) is αi for i < a
and the rate of going from state (i, j) to (i, j + 1) is γj for j < b such
that (i, j) ∈ S. The rates αi and γj are known as birth rates. Likewise,
define death rates βi and δj such that they are the rates of going
from (i, j) to (i − 1, j) and (i, j − 1), respectively, when i > 0 and j > 0.
Show that this generic two-dimensional birth and death process has
Q matrix of the block tridiagonal form (as described before the Servi
algorithm in Section 3.1.2) by writing it down in that form.
3.3. Consider Problem 20 in Section 3.1.3. The two-dimensional birth and
death process corresponding to the optimal action is described in
Figure 3.3. Using the numerical values stated in that problem, com-
pute the steady-state probabilities for that two-dimensional birth
and death process using the Servi algorithm. Also, compute the
long-run average cost per unit time.
3.4. Consider an M[X] /M/1 batch arrival queue with individual service where
batches arrive according to PP(λ) and each batch is of size 1, 2, 3, or 4
with probability 0.4, 0.3, 0.2, and 0.1, respectively. If the service rate
is μ, model the number of customers in the system as a QBD process.
Obtain the condition for stability and the steady-state probabilities
using MGM. For the special case of λ = 1 and μ = 2.5, obtain numeri-
cal values for the steady-state probabilities and the average number
in the system in steady state.
3.5. Consider a CPU of a computer that processes tasks from a software
agent as well as other tasks on the computer in parallel by sharing
the computer’s processor. The software agent submits tasks accord-
ing to a Poisson process with parameter λ and each task has exp(μ)
work (in terms of kilobytes) in it that the CPU has to perform. If the
only process running on the CPU is that of the agent, it receives all
the CPU speed. However, if there are a few other processes running
on the CPU, only a fraction of the CPU speed is available, depending
on how many processes are running at the same time. Model the system
as a queue with time-varying service rates according to an external
environment process (the other processes that run on the CPU). For
this, let the available capacity vary according to a CTMC {Z(t), t ≥ 0}
with generator matrix Qz such that at time t the available processing
speed for the agent tasks is bZ(t) (kilobytes per second). There are
five possible server speeds, that is, Z(t) takes values 1–5. They are
b1 = 1, b2 = 2, b3 = 3, b4 = 4, and b5 = 5. The infinitesimal generator
matrix Qz is a 5×5 matrix given by
⎡ −6   2   1   2   1  ⎤
⎢  1  −7   3   2   1  ⎥
Qz = ⎢  3   2  −8   2   1  ⎥.
⎢  2   1   1  −5   1  ⎥
⎣  3   4   1   2  −10 ⎦
The mean arrival rate λ = 2.5 and the mean task size 1/μ = 1. Com-
pute the mean response time for jobs posted by the software agent
to the CPU. Use a similar framework as Problem 24 in Section 3.2.3.
and
QY = ⎡ −θ   θ ⎤
     ⎣  ν  −ν ⎦.
This means that across each node i, the flow out equals the flow in just like
in the CTMCs. Notice that the preceding expression does not include pii ;
however, if that is preferred, we could add πi pii to both sides of the equation
to get
Σ_{j∈S} πi pij = Σ_{j∈S} πj pji .
Since the balance equations along with the normalizing conditions are
exactly analogous to those of the CTMC (essentially replacing qij by pij
G(t) = P{Si ≤ t}
for all i. Service times are nonnegative and hence G(t) = 0 for all t < 0. How-
ever, there are no other restrictions on the random variables Si ; in fact, they
could be discrete, continuous, or a mixture of discrete and continuous. For
the sake of notational convenience, we let the mean and variance of service
times be 1/μ and σ², respectively, such that for all i,

E[Si ] = 1/μ   and   Var[Si ] = σ².
Since the mean and variance of the service times can easily be derived from
the CDF G(t), sometimes μ and σ are not specified while describing the
service times.
We have almost everything in place to call the preceding system an
M/G/1 queue. The only aspect that remains is the service discipline.
Although Kendall notation specifies that the default discipline is FCFS and
we will derive all our results assuming FCFS, it is worthwhile to discuss
the generality of the analysis. For most of the analysis, we only require that
if there is an entity in the system, useful work is always performed and at
most one entity in the queue can have incomplete service. This deserves
some explanation. Firstly, the server cannot be idle when there are customers
in the system. That also means that if there are customers waiting, as soon
as service completes for a customer, the service for the next customer starts
instantaneously. This is an aspect that is frequently overlooked while collect-
ing service time data. The simplest way to fix the problem is to add any time
spent between service of customers to the customers’ service time. Secondly,
since at most one customer can have incomplete service, this precludes
disciplines involving preemption or processor sharing. However, schemes
such as LCFS without preemption and random order of service can be
analyzed.
Having described the setting for the M/G/1 queue, we next model and
analyze the system. Notice that unless the service times are according to
some mixture of exponential distributions, we cannot model the system
using a CTMC. In fact, we will see that even for the mixture of exponential
case, modeling as a DTMC provides some excellent closed-form algebraic
expressions that the CTMC model may fail to provide. With that said, we
begin modeling the system. The first question that comes to mind is when to
observe the system so that the resulting process is a DTMC. We can immedi-
ately rule out observing the system any time in the middle of a service since
the remaining service time would now depend on history and Markov prop-
erty would not be satisfied. Therefore, the only options are to observe at the
beginning and/or end of service times. Although it may indeed be possible
to model the system as a DTMC by observing both at the beginning and at
the end of a service, we will see that it is sufficient if we observed the system
at the end of a service. In other words, we will observe the system whenever
a customer departure occurs. The next question is that during these depar-
ture epochs, the number in the system goes down by one—so should we
observe before or after a departure? Although either case would work, we
will observe immediately after the departure so that the departing customer
is not included in the state.
With that in place, we let Xn be the number of customers in the system
immediately after the nth departure. The state space, that is, set of all pos-
sible values of Xn , for all n is {0, 1, 2, 3, . . .}. For some arbitrary n, let Xn = i
such that i > 0. We now derive a distribution for Xn+1 , given Xn = i. If Xn = i
immediately after the nth departure, we will have one customer at the server
and i − 1 waiting. When this customer at the service completes service, we
observe the system next, that is, the n+1st departure. So Xn+1 would be equal
to i − 1 plus all the customers that arrived during the service time that just
completed. In other words, Xn+1 would be i − 1 + j with probability aj , where
aj is the probability that j customers arrive during a service (for j = 0, 1, 2, . . .).
Hence, we write mathematically
P{Xn+1 = i − 1 + j|Xn = i} = aj

for all j ≥ 0. Next, consider Xn = 0, that is, the system is empty immediately
after the nth departure. The next event must be an arrival; when that arrival
occurs, during this customer's service if j arrivals occur, then when this n+1st
customer departs, there would be j in the system. In other words, if Xn = 0,
Xn+1 would be j with probability aj , where aj once again is the probabil-
ity that j customers arrive during a service (for j = 0, 1, 2, . . .). This we write
mathematically as
P{Xn+1 = j|Xn = 0} = aj
for all j ≥ 0. This deserves some explanation, as it differs slightly from the
case Xn > 0, where the time between observations was equal to a single service
time. Notice that when Xn = 0, the next observation is after two events, one
arrival and one service, in that order. Clearly, Xn+1 would be equal to the
number of customers that arrive during the second phase, that is, a service.
Thus, we are able to use the same notation aj . Of course, we do not have an
expression for aj and would need to derive it. We will do that after modeling
the system as a DTMC.
From the earlier description, to determine Xn+1 we only need to be given
Xn and not the history. Also the probability of transitioning from Xn to Xn+1
does not depend on n. Therefore, we can model {Xn , n ≥ 0} as a DTMC with
state space {0, 1, 2, . . .} and transition probability matrix
⎡ a0  a1  a2  a3  . . . ⎤
⎢ a0  a1  a2  a3  . . . ⎥
⎢ 0   a0  a1  a2  . . . ⎥
P = ⎢ 0   0   a0  a1  . . . ⎥
⎢ 0   0   0   a0  . . . ⎥
⎣ .   .   .   .   .     ⎦
where

aj = ∫_0^∞ e^{−λt} ((λt)^j / j!) dG(t).
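For a concrete service-time distribution the aj can be computed numerically. The sketch below does this for exponential service, where the closed form aj = (μ/(λ+μ))(λ/(λ+μ))^j is available as a check, and assembles a truncated version of the transition matrix P.

```python
import math
import numpy as np

lam, mu = 1.0, 2.0     # arrival rate; exponential service with rate mu

def a(j, t_max=40.0, n_grid=200_001):
    """a_j = int_0^inf e^{-lam t} (lam t)^j / j! dG(t), by trapezoidal
    quadrature, with dG(t) = mu e^{-mu t} dt for exponential service."""
    t = np.linspace(0.0, t_max, n_grid)
    f = (np.exp(-lam * t) * (lam * t) ** j / math.factorial(j)
         * mu * np.exp(-mu * t))
    dt = t[1] - t[0]
    return float(np.sum(0.5 * (f[1:] + f[:-1])) * dt)

def a_exact(j):
    """Closed form for exponential service: a geometric distribution."""
    return (mu / (lam + mu)) * (lam / (lam + mu)) ** j

def build_P(n):
    """Truncated n x n transition matrix of the embedded M/G/1 chain."""
    av = [a(j) for j in range(n)]
    P = np.zeros((n, n))
    P[0, :] = av
    for i in range(1, n):
        P[i, i - 1:] = av[: n - i + 1]
    return P
```

The early rows of the truncated P should sum to nearly 1, with the small deficit being truncation error.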
The limiting distribution π = (π0 π1 . . .) can be obtained by solving π = πP
and Σj πj = 1. To solve the balance equations (i.e., the equations that correspond
to π = πP for each node), we use the generating function approach
seen in Chapter 2. The balance equations are
π0 = a0 π0 + a0 π1
π1 = a1 π0 + a1 π1 + a0 π2
π2 = a2 π0 + a2 π1 + a1 π2 + a0 π3
π3 = a3 π0 + a3 π1 + a2 π2 + a1 π3 + a0 π4
.. .. ..
...
We multiply the first equation by z0 , the next by z1 , the next by z2 , and so on.
Upon summing we get
π0 z0 + π1 z1 + π2 z2 + π3 z3 + · · · = π0 (a0 z0 + a1 z1 + a2 z2 + · · · )
+ π1 (a0 z0 + a1 z1 + a2 z2 + · · · )
+ π2 (a0 z1 + a1 z2 + a2 z3 + · · · )
+ π3 (a0 z2 + a1 z3 + a2 z4 + · · · ) + · · ·
π0 z0 + π1 z1 + π2 z2 + π3 z3 + · · · = π0 (a0 z0 + a1 z1 + a2 z2 + · · · )
+ π1 (a0 z0 + a1 z1 + a2 z2 + · · · )
+ π2 z(a0 z0 + a1 z1 + a2 z2 + · · · )
+ π3 z2 (a0 z0 + a1 z1 + a2 z2 + · · · ) + · · · .
π0 z0 + π1 z1 + π2 z2 + π3 z3 + · · ·
= (a0 z0 + a1 z1 + a2 z2 + · · · )(π0 + π1 + π2 z + π3 z2 + · · · ).
Since we are going to use generating functions, we can rewrite the preceding
equation as
π0 z0 + π1 z1 + π2 z2 + π3 z3 + · · · = (a0 z0 + a1 z1 + a2 z2 + · · · )
    × [π0 + (1/z)(−π0 + π0 z0 + π1 z1 + π2 z2 + π3 z3 + · · · )].
Define the generating functions

φ(z) = Σ_{i=0}^{∞} πi z^i   and   A(z) = Σ_{j=0}^{∞} aj z^j.
Note that A(z) can be computed based on the inputs λ and G(t). Hence, by
rewriting the preceding equation, we get φ(z) as
φ(z) = π0 A(z)(1 − z) / (A(z) − z).    (4.2)
The only unknown on the RHS of Equation 4.2 is π0 . To obtain that, we first
need to write down some properties for A(z).
From the definition of A(z), we have A(1) = 1 since a0 + a1 + · · · = 1.
However, to get other properties, we first write A(z) in the simplest pos-
sible form in terms of the parameters in the problem definition, namely, λ
and G(t). By using the definition of aj , we get
A(z) = Σ_{j=0}^{∞} aj z^j = Σ_{j=0}^{∞} ∫_0^∞ e^{−λt} ((λt)^j / j!) z^j dG(t)
     = ∫_0^∞ e^{−λt} ( Σ_{j=0}^{∞} (zλt)^j / j! ) dG(t)
     = ∫_0^∞ e^{−λt} e^{zλt} dG(t)
     = ∫_0^∞ e^{−(1−z)λt} dG(t) = G̃((1 − z)λ)
where the last expression G̃((1 − z)λ) is the LST of G(t) at (1 − z)λ. By defi-
nition, if S is a random variable corresponding to the service times, then the
LST of G(t) at u is
E[e^{−uS}] = ∫_0^∞ e^{−ut} dG(t) = G̃(u).
Also, using the properties of LSTs, G̃(0) = 1, G̃′(0) = −E[S] = −1/μ, and
G̃″(0) = E[S²] = 1/μ² + σ². Therefore, from the earlier results and the relation
A(z) = G̃((1 − z)λ), we get

A′(1) = −λG̃′(0) = λ/μ,    (4.3)

A″(1) = λ²G̃″(0) = λ²/μ² + λ²σ².    (4.4)
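These LST properties are easy to check numerically with finite differences. The sketch below does so for exponential service, where G̃(u) = μ/(μ + u), so that G̃′(0) = −1/μ and G̃″(0) = 2/μ², which indeed equals 1/μ² + σ² since σ² = 1/μ² here.

```python
lam, mu = 1.0, 2.0

def G_lst(u):
    """LST of an exp(mu) service time: E[e^{-uS}] = mu / (mu + u)."""
    return mu / (mu + u)

h = 1e-4
d1 = (G_lst(h) - G_lst(-h)) / (2 * h)                  # ~ G'(0) = -1/mu
d2 = (G_lst(h) - 2 * G_lst(0.0) + G_lst(-h)) / h**2    # ~ G''(0) = 2/mu^2
A1 = -lam * d1                                         # ~ A'(1) = lam/mu
```

The central differences agree with the stated derivatives to high accuracy.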
Now we get back to Equation 4.2. To obtain π0 we first try φ(0) and that
gives φ(0) = π0 , which is true but does not help us to get π0 . Next we try
φ(1) = lim_{z→1} π0 A(z)(1 − z) / (A(z) − z) = π0 A(1) lim_{z→1} (1 − z) / (A(z) − z).
Using A(1) = 1 and realizing that the limit is of the type 0/0, we use
L’Hospital’s rule to get
φ(1) = π0 lim_{z→1} (−1) / (A′(z) − 1) = π0 / (1 − A′(1)).
Since φ(1) = 1 and A′(1) = λ/μ, this yields

π0 = 1 − λ/μ.

For π0 to be positive, that is, for the system to be stable, we require

λ < μ.

As usual, we define the utilization

ρ = λ/μ.
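The result π0 = 1 − ρ can be verified from the embedded DTMC itself. The sketch below does this for deterministic service (M/D/1), where aj = e^{−ρ}ρ^j/j!, by solving a truncated version of π = πP; the truncation level N is an assumption.

```python
import math
import numpy as np

lam, d = 0.5, 1.0    # arrival rate; deterministic service time (M/D/1)
rho = lam * d        # utilization; stability requires rho < 1
N = 200              # truncation level for the embedded DTMC

# a_j = e^{-rho} rho^j / j!, computed recursively to avoid overflow
a = np.empty(N)
a[0] = math.exp(-rho)
for j in range(1, N):
    a[j] = a[j - 1] * rho / j

P = np.zeros((N, N))
P[0, :] = a
for i in range(1, N):
    P[i, i - 1:] = a[: N - i + 1]
P /= P.sum(axis=1, keepdims=True)   # renormalize mass lost to truncation

# Solve pi = pi P together with sum(pi) = 1 as a linear system
A = P.T - np.eye(N)
A[-1, :] = 1.0
rhs = np.zeros(N); rhs[-1] = 1.0
pi = np.linalg.solve(A, rhs)
```

Here pi[0] should be very close to 1 − ρ = 0.5, and the mean of pi should match the M/D/1 value of L derived next.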
Problem 29
Consider a stable M/G/1 queue with PP(λ) arrivals, mean service time 1/μ,
and variance of service time σ2 . Compute L, the long-run time-averaged
number of entities in the system. Using that obtain W the average sojourn
time spent by an entity in the system in steady state.
Solution
Recall that πj is the long-run fraction of time a departing customer sees j
others in the system. It is also known as departure-point steady-state proba-
bility. However, to compute L, we need the time-averaged (and not as seen
by departing customers) fraction of time spent in state j, which we call pj . For
that, we know from PASTA (Poisson arrivals see time averages) described in
Section 1.3.4 that pj must be equal to the long-run fraction of time an arriving
customer sees j others in the system, that is, π∗j . In other words, pj = π∗j . But
we also know from Section 1.3.3 that π∗j = πj for any G/G/s queue and hence
pj = πj for all j.
Using that logic, the average number of customers in the system is
L = Σ_{j=0}^{∞} j pj = Σ_{j=0}^{∞} j πj = φ′(1).
Differentiating the expression for φ(z) (Equation 4.2 with π0 = 1 − ρ and
A(z) = G̃(λ − λz)), we get

φ′(z) = [(1 − ρ)G̃(λ − λz) + (1 − ρ)(1 − z)λG̃′(λ − λz) − (1 + λG̃′(λ − λz))φ(z)]
        / [z − G̃(λ − λz)].
Earlier in this section, we saw that G̃(0) = 1 and G̃′(0) = −1/μ. Using
those (and also realizing φ(1) = 1) to compute φ′(1) by taking the limit as z
approaches one, we get a 0/0 expression. Therefore, we apply L'Hospital's rule;
notice that both sides of the resulting equation have φ′(z) terms, so we solve
for φ′(1). Now, by taking the limits as z → 1 using G̃(0) = 1, G̃′(0) = −1/μ,
G̃″(0) = 1/μ² + σ², and φ(1) = 1, we get

φ′(1) = ρ + λ²(σ² + 1/μ²) / (2(1 − ρ)).
Since L = φ′(1), we have

L = ρ + λ²(σ² + 1/μ²) / (2(1 − ρ)).    (4.6)
Then, using Little's law, W = L/λ, we get

W = 1/μ + λ(σ² + 1/μ²) / (2(1 − ρ)).
Notice that the preceding equation for W as well as Equation 4.6 for L
are in terms of λ, μ, and σ only. The entire service time distribution G(t) is
not necessary for those expressions. From a practical standpoint, this is very
useful because μ and σ can be estimated more robustly compared to G(t).
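The two expressions translate directly into a small calculator; setting σ = 1/μ recovers the M/M/1 values, a convenient sanity check.

```python
def mg1_L_W(lam, mu, sigma):
    """L from Equation 4.6 and W = L/lam for a stable M/G/1 queue with
    arrival rate lam, mean service time 1/mu, service std dev sigma."""
    rho = lam / mu
    assert rho < 1, "stability requires lam < mu"
    L = rho + lam**2 * (sigma**2 + 1.0 / mu**2) / (2.0 * (1.0 - rho))
    return L, L / lam
```

For example, with λ = 1, μ = 2, and σ = 1/μ = 0.5 we get L = ρ/(1 − ρ) = 1 and W = 1/(μ − λ) = 1, the M/M/1 answers; with σ = 0 the M/D/1 values emerge.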
Next, having discussed the distribution of the queue length, it is quite
natural to consider the distribution of the sojourn time in the system that
we call waiting time. For this, we require FCFS and it is the first time we
truly require FCFS discipline. Up until now, all the results can be derived
for any work conserving discipline with a maximum of one customer hav-
ing incomplete service at any given time. We describe the next result on the
sojourn time distribution as a problem, by recognizing that we already know
the mean sojourn time from the previous problem.
Problem 30
Let Y be the sojourn time in the system for a customer arriving into a stable
M/G/1 queue in steady state. If the service is FCFS, then show that the LST
of the CDF of Y is
E[e^{−sY}] = (1 − ρ) s G̃(s) / (s − λ(1 − G̃(s))).
Solution
As before, Xn denotes the number of customers in the M/G/1 queueing
system as seen by the nth departing customer. Let Bn be the number of cus-
tomers that arrive during the nth customer’s sojourn, which includes time
spent waiting (if any) and time for service. Since the service discipline is
FCFS, we have
Xn = Bn .

Hence,

E[z^{Xn}] = E[z^{Bn}].

Also, by definition of the generating function of the limiting distribution,

φ(z) = lim_{n→∞} Σ_{i=0}^{∞} P{Xn = i} z^i = lim_{n→∞} E[z^{Xn}].
Therefore, from the equality E[zXn ] = E[zBn ] and the earlier expression, we
have
lim_{n→∞} E[z^{Bn}] = φ(z).    (4.7)
Let Wn denote the sojourn time of the nth customer, with CDF Hn (·). We
can compute E[z^{Bn}] by conditioning on Wn as follows:

E[z^{Bn}] = ∫_0^∞ E[z^{Bn} | Wn = w] dHn(w)
          = ∫_0^∞ Σ_{i=0}^{∞} e^{−λw} ((λw)^i / i!) z^i dHn(w)
where the last equation uses the fact that the arrivals are PP(λ) and the
probability of getting i arrivals in time w is e−λw (λw)i /i!. Therefore, we have
Therefore, we have

E[z^{Bn}] = ∫_0^∞ e^{−λw} e^{λwz} dHn(w) = ∫_0^∞ e^{−(1−z)λw} dHn(w) = H̃n((1 − z)λ),

where H̃n(·) denotes the LST of Hn(·), that is,

E[e^{−sWn}] = H̃n(s).
Taking the limits as n → ∞ of the preceding expression, and using the fact
that Wn → Y as n → ∞, we get

E[e^{−sY}] = H̃(s).
Substituting s = (1 − z)λ, that is, z = 1 − s/λ, the preceding gives

E[e^{−sY}] = H̃(s) = lim_{n→∞} E[(1 − s/λ)^{Bn}].

However, from Equation 4.7,

lim_{n→∞} E[(1 − s/λ)^{Bn}] = φ(1 − s/λ).

Hence,

E[e^{−sY}] = H̃(s) = φ(1 − s/λ).    (4.9)
Using the expression for φ(z) in Equation 4.5 in terms of z, λ, G̃(·), and μ, we
have by letting z = 1 − s/λ
E[e^{−sY}] = (1 − ρ) s G̃(s) / (s − λ(1 − G̃(s)))
where ρ = λ/μ.
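For exponential service, the LST above should collapse to (μ − λ)/(μ − λ + s), the LST of an exp(μ − λ) random variable, which is the familiar M/M/1 sojourn time; the sketch below checks this numerically.

```python
def sojourn_lst(s, lam, mu):
    """E[e^{-sY}] for a stable M/G/1 FCFS queue, specialized here to
    exponential service, whose service-time LST is mu / (mu + s)."""
    rho = lam / mu
    G = mu / (mu + s)
    return (1.0 - rho) * s * G / (s - lam * (1.0 - G))

def exp_lst(s, theta):
    """LST of an exp(theta) random variable."""
    return theta / (theta + s)
```

Evaluating both at a few values of s shows they agree to machine precision.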
Problem 31
Derive the relationship between higher moments of the steady-state sojourn
time against those of the number in the system for an M/G/1 queue with
FCFS service.
Solution
Let E[Yr ] be the rth moment of the steady-state sojourn time, for r = 1, 2, 3, . . .,
which can be computed as
E[Y^r] = lim_{s→0} (−1)^r d^r H̃(s) / ds^r .
Likewise, let L(r) be the rth factorial moment of the steady-state number in
the system. Note that for a discrete random variable X on 0, 1, 2, . . ., the rth
factorial moment is defined as E[X(X − 1)(X − 2) . . . (X − r + 1)]. Therefore,
L(1) is L itself. Then, L(2) can be used to compute the variance of the number
in the system as L(2) + L − L2 . Likewise, higher moments of the number in
the system can be obtained. However, notice that
L(r) = lim_{z→1} d^r φ(z) / dz^r .
There are many results like the one in the preceding text that can be eas-
ily derived for the M/G/1 queue but are sometimes not even mentioned
while discussing the special case M/M/1, although methodologically they
would require quite different approaches. Having said that, the next result
is one that would typically be analyzed using identical methods for M/G/1
and M/M/1, followed by a curious paradox. That result is presented as a
problem.
Problem 32
In a single-server queue, a busy period is defined as a consecutive stretch of
time when the server is busy serving customers. A busy period starts when a
customer arrives into an empty single-server queue and ends when the sys-
tem becomes empty for the first time after that. With that definition, obtain
the LST of the busy period distribution of an M/G/1 queue.
Solution
Let Z be a random variable denoting the busy period initiated by an arrival
into an empty queue in steady state. Also, let S be the service time of this
customer that just arrived. Remember that we are only going to consider
nonpreemptive schemes, although this result would not alter if we consid-
ered preemption, as long as it is work conserving. But the proof would have
to be altered, hence the assumption. Let N be the number of customers that
arrive during the service of this “first” customer, that is, in time S. Of course,
if N is zero, then the busy period is S itself. Let us remember this case but
for now assume N > 0. We keep these N customers aside in what we call ini-
tial pool. Take one from the initial pool and serve that customer and in the
mean time if any new customers arrive serve them one by one till there are
no customers in the system except the N − 1 in the initial pool. It is critical to
realize that the time to serve the first customer in the initial pool, and all the
customers that arrived subsequently, until the queue has no customers outside
the initial pool, is stochastically equal to a busy period.
We call this time Z1 . Next, pick the second customer (if any) from the initial
pool and spend a busy period (of length Z2 ) serving that customer and all
that arrive until the queue only has customers from the initial pool. Repeat
the process until there are no customers left in the initial pool. We use this to
write down the conditional relation for some u ≥ 0:
E[e^{−uZ} | S = t, N = n] = E[e^{−u(t+Z1 +Z2 +···+Zn )}].
But this conditional relation also works when N = 0. So from now on, we
remove restriction on N and say that the preceding is true for all N ≥ 0.
Unconditioning the earlier equation using P{N = n|S = t} = e−λt (λt)n /n!,
we get
E[e^{−uZ}] = ∫_0^∞ e^{−ut} Σ_{n=0}^{∞} (E[e^{−uZ}])^n e^{−λt} ((λt)^n / n!) dG(t),    (4.11)
since Z, Z1 , Z2 , . . ., are IID random variables. We use the notation F̃Z (u) as
the LST of the CDF of Z that is defined mathematically as
F̃Z(u) = E[e^{−uZ}] = ∫_0^∞ e^{−uz} dFZ(z)
where FZ (z) = P{Z ≤ z}. Notice that we do not know FZ (z) and are trying to
obtain it via the LST F̃Z (u). Rewriting Equation 4.11 in terms of the LST of
the busy period distribution, we get
F̃Z(u) = ∫_0^∞ e^{−ut} ( Σ_{n=0}^{∞} [F̃Z(u)λt]^n / n! ) e^{−λt} dG(t).

Summing the series, this yields

F̃Z(u) = ∫_0^∞ e^{−ut} e^{F̃Z(u)λt} e^{−λt} dG(t) = G̃(u + λ − λF̃Z(u)).
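Although FZ(z) is not available in closed form, the functional equation can be solved numerically by fixed-point iteration F ← G̃(u + λ − λF). The sketch below does this for exponential service and recovers the mean busy period from the derivative of the LST at 0, which should equal 1/(μ − λ).

```python
lam, mu = 1.0, 2.0   # must satisfy lam < mu for stability

def G_lst(u):
    return mu / (mu + u)          # LST of the exp(mu) service time

def busy_lst(u, iters=500):
    """Solve F = G(u + lam - lam*F) by fixed-point iteration."""
    F = 0.5                        # any starting point in (0, 1]
    for _ in range(iters):
        F = G_lst(u + lam - lam * F)
    return F

h = 1e-5
mean_busy = -(busy_lst(h) - busy_lst(-h)) / (2.0 * h)
```

At u = 0 the iteration converges to 1, as any LST must, and the numerical derivative matches 1/(μ − λ) = 1 here.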
Remark 8
For an M/G/1 queue with σ > 1/μ, since the mean busy period is E[Z] = 1/
(μ − λ) and the mean sojourn time is W = 1/μ + λ(σ² + 1/μ²)/(2(1 − ρ)), we
get E[Z] < W. In other words, the mean busy period is smaller than the mean
waiting time when σ > 1/μ. However, this appears like a paradox because if
you take any busy period, the waiting time of a customer that entered and
left during this busy period is always smaller than the busy period itself. But
is the expected waiting time greater than the expected busy period? How
could that be?
The reason is that a long busy period has many customers stuck in it and,
averaging over customers, the sojourn times would end up being long on
average. A simulation might help with the
intuition and the reader is encouraged to try one out. Having described this,
we wrap up the topic M/G/1 queue and move on to its counterpart, the
G/M/1 queue.
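Following up on the simulation suggestion above, here is a minimal sketch. The parameter values are hypothetical: an M/G/1 queue with Poisson(λ = 1) arrivals and hyperexponential service chosen so that σ > 1/μ (here E[S] = 0.75, ρ = 0.75, so theory gives E[Z] = 3 and W = 4).

```python
import random

def simulate_mg1(n=200_000, seed=1):
    """Estimate mean sojourn time and mean busy period for an M/G/1 queue
    with hyperexponential service (so that sigma > 1/mu)."""
    rng = random.Random(seed)
    lam = 1.0                     # arrival rate (hypothetical)
    mu1, mu2 = 4.0, 0.8           # E[S] = 0.75, rho = 0.75, SCOV > 1

    def service():
        return rng.expovariate(mu1 if rng.random() < 0.5 else mu2)

    A = 0.0        # arrival time of current customer
    Wq = 0.0       # its waiting time before service (Lindley's recursion)
    last_dep = 0.0
    bp_start = None
    sojourns, busy_periods = [], []
    for _ in range(n):
        S = service()
        if Wq == 0.0:                    # arrival finds the server idle:
            if bp_start is not None:     # previous busy period just ended
                busy_periods.append(last_dep - bp_start)
            bp_start = A                 # a new busy period begins
        sojourns.append(Wq + S)
        last_dep = A + Wq + S
        T = rng.expovariate(lam)         # next interarrival time
        Wq = max(0.0, Wq + S - T)        # Lindley's recursion
        A += T
    return sum(sojourns) / len(sojourns), sum(busy_periods) / len(busy_periods)

mean_W, mean_Z = simulate_mg1()
print(mean_W, mean_Z)   # theory: W = 4 and E[Z] = 3 for these parameters
```

The run confirms the intuition in the remark: the customer-averaged sojourn time exceeds the mean busy period, even though every individual sojourn fits inside its busy period.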
{X_n∗, n ≥ 0} is a DTMC with state space S = {0, 1, 2, . . .}, and

        | b0  a0  0   0   · · · |
        | b1  a1  a0  0   · · · |
    P = | b2  a2  a1  a0  · · · |
        | b3  a3  a2  a1  · · · |
        | b4  a4  a3  a2  · · · |
        |  .   .   .   .    .   |

where

    b_j = Σ_{i=j+1}^∞ a_i  for all j ≥ 0,
    a_j = ∫_0^∞ e^{−μt} (μt)^j/j! dG(t).
Notice that the preceding is derived using exactly the same argument as that
in the M/G/1 queue (the reader is encouraged to verify that). Also, the preceding
equation assumes the interarrival times are purely continuous.
We will continue to treat the interarrival times that way with the under-
standing that if there were discrete-valued point masses, then the Riemann
integral would be replaced by the Lebesgue integral.
The case not considered in the preceding text is when there are actually
k = i + 1 departures, where i is the number of customers in the system when
the previous arrival occurred. Then, when the next customer arrives, there
would be no other customer in the system. However, the probability of going
from i to 0 in the DTMC is not ai+1 . This is because ai+1 denotes the probabil-
ity there are exactly i+1 departures during one interarrival time interval. But,
the i + 1 departures would have occurred before the interarrival time period
ended and if there were more in the system, perhaps there could have been
more departures. Hence, if there were an abundant number of customers in
the system, then during the interarrival time interval, there would be i + 1
or more departures. Hence, the probability of transitioning from state i to 0
in the DTMC is ai+1 + ai+2 + ai+3 + · · · , which we call bi as defined earlier.
Notice that the rows add to 1 in the P matrix and this is a lower Hessenberg
matrix.
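To make the structure concrete, the following sketch builds a truncated version of P for a hypothetical choice of interarrival distribution, Erlang-2 with rate θ = 2λ in each phase, for which a_j has the closed form a_j = (j + 1)θ²μ^j/(μ + θ)^{j+2} (the Erlang choice and this closed form are our own, not from the text), and checks that each row sums to 1:

```python
from math import isclose

def gm1_transition_matrix(lam, mu, N=15):
    """Truncated transition matrix of the G/M/1 arrival-point DTMC with
    Erlang-2(theta) interarrival times, theta = 2*lam (so arrival rate lam).
    For this choice a_j = (j+1) theta^2 mu^j / (mu+theta)^(j+2)."""
    theta = 2.0 * lam
    a = [(j + 1) * theta**2 * mu**j / (mu + theta)**(j + 2) for j in range(N)]
    b = [1.0 - sum(a[:j + 1]) for j in range(N)]   # b_j = sum_{i > j} a_i
    P = [[0.0] * N for _ in range(N)]
    for i in range(N - 1):                          # lower Hessenberg rows
        P[i][0] = b[i]
        for j in range(1, i + 2):
            P[i][j] = a[i + 1 - j]
    return P

P = gm1_transition_matrix(lam=1.0, mu=1.5, N=15)
print([isclose(sum(row), 1.0, abs_tol=1e-12) for row in P[:5]])
```

Each row i contains b_i followed by a_i, a_{i−1}, . . . , a_0, so the row sum b_i + Σ_{k=0}^i a_k equals 1 exactly, as the check confirms.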
Having modeled the G/M/1 queue as a DTMC, next we analyze the
steady-state behavior and derive performance measures. Let π∗j be the
limiting probability that in the long run an arriving customer sees j in
the system, that is,
    π∗_j = lim_{n→∞} P{X_n∗ = j}.

The limiting distribution π∗ = [π∗_0 π∗_1 . . .], if it exists, can be obtained by
solving π∗ = π∗P and Σ_j π∗_j = 1. The balance equations that arise out of
solving π∗ = π∗P are

    π∗_0 = b_0π∗_0 + b_1π∗_1 + b_2π∗_2 + b_3π∗_3 + · · ·
    π∗_1 = a_0π∗_0 + a_1π∗_1 + a_2π∗_2 + a_3π∗_3 + · · ·
    π∗_2 = a_0π∗_1 + a_1π∗_2 + a_2π∗_3 + · · ·
    π∗_3 = a_0π∗_2 + a_1π∗_3 + · · ·
      .
      .
and we solve them using a technique we have not used before in this text. Since
there is a unique solution to the balance equations (if it exists), we try some
common forms for the steady-state probabilities. In particular, we try the
form π∗_i = (1 − α)α^i for i = 0, 1, 2, . . ., where α is to be determined. The justification
for that choice is that for the M/M/1 system, π∗_i is of that form. The
very first equation from the earlier set, π∗_0 = b_0π∗_0 + b_1π∗_1 + b_2π∗_2 + b_3π∗_3 + · · · ,
is a little tricky, but all the others are straightforward. Plugging in π∗_i = (1 − α)α^i
and b_i = a_{i+1} + a_{i+2} + a_{i+3} + · · · for i = 0, 1, 2, . . . (and canceling a factor
1 − α, using Σ_{i=0}^{k−1} α^i = (1 − α^k)/(1 − α)), we get

    1 − α = a_0(1 − α^0) + a_1(1 − α^1) + a_2(1 − α^2) + a_3(1 − α^3) + · · ·
          = (a_0 + a_1 + a_2 + a_3 + · · ·) − Σ_{i=0}^∞ a_i α^i,

and since a_0 + a_1 + a_2 + · · · = 1, this reduces to

    α = Σ_{i=0}^∞ a_i α^i.
The remaining balance equations, π∗_1 = a_0π∗_0 + a_1π∗_1 + · · · ,
π∗_2 = a_0π∗_1 + a_1π∗_2 + · · · , and so on, reduce, respectively, to

    α = Σ_{i=0}^∞ a_i α^i,  α² = α Σ_{i=0}^∞ a_i α^i,  α³ = α² Σ_{i=0}^∞ a_i α^i,  . . .

which are all satisfied if α is the solution to α = Σ_{i=0}^∞ a_i α^i. Let us first write
down the condition α = Σ_{i=0}^∞ a_i α^i in terms of the variables in the G/M/1
queue:
    α = Σ_{i=0}^∞ a_i α^i = Σ_{i=0}^∞ α^i ∫_0^∞ e^{−μt} (μt)^i/i! dG(t) = ∫_0^∞ e^{−μt} Σ_{i=0}^∞ (αμt)^i/i! dG(t)
      = ∫_0^∞ e^{−(1−α)μt} dG(t) = G̃((1 − α)μ) = E[e^{−(1−α)μT_j}]
where G̃(s) is the LST of G(t) at some s ≥ 0. In summary, the limiting proba-
bility π∗i exists for i = 0, 1, 2, . . ., and is equal to π∗i = (1 − α)αi if there exists a
solution to α = G̃((1 − α)μ) such that α ∈ (0, 1). Next, we check when
α = G̃((1 − α)μ) has such a solution. As it turns out, that
would be the stability condition for the DTMC {X_n∗, n ≥ 0}.
We use a graphical method to describe the condition for stability for the
G/M/1 queue, which is the same as the condition for positive recurrence
for the DTMC {X_n∗, n ≥ 0}. We write G̃((1 − α)μ) as G̃(μ − αμ). Note
from the definition that G̃(μ − αμ) = ∫_0^∞ e^{−(1−α)μt} dG(t), where α appears
only in the exponent, and G̃(μ − αμ) is a nondecreasing convex function of α.
Also, G̃(0) = 1 and hence one solution to α = G̃((1 − α)μ) is indeed α = 1.
With these properties of G̃(μ − αμ) in mind, refer to Figure 4.1. We plot
G̃(μ − αμ) versus α as well as the 45° line, that is, the function f(α) = α. Since
G̃(μ − αμ) is nondecreasing and convex, it would intersect the 45° line at two
points. If the slope of G̃(μ − αμ) at α = 1 is greater than 1, then G̃(μ − αμ)
would intersect the 45° line once at some α ∈ (0, 1). This is depicted on the
LHS of Figure 4.1. However, if the slope is less than 1, then the intersection
occurs at some α ≥ 1, depicted on the RHS of Figure 4.1. In fact, if the slope is
exactly 1, then the two points of intersection merge into one point and the
45° line just becomes a tangent (we do not show this in Figure 4.1). Therefore,
FIGURE 4.1
Two possibilities for G̃(μ − μα) vs. α (both panels plot G̃(μ − μα) against α from 0 to 1, together with the 45° line).
the condition for stability is that the slope of G̃(μ − αμ) at α = 1 must be
greater than 1. For this, we compute dG̃(μ − αμ)/dα, let α = 1, and require that
−μG̃′(0) > 1. However, G̃′(0) = −1/λ since the first moment, the mean interarrival
time, is 1/λ. Therefore, the condition for stability of the G/M/1 queue, or
equivalently the condition for positive recurrence of the irreducible DTMC
{X_n∗, n ≥ 0}, is

    ρ = λ/μ < 1.
When the queue is stable, the limiting distribution exists and is given by

    π∗_j = (1 − α)α^j,

which is the unique solution to the DTMC balance equations, where α is the
solution in (0, 1) to

    α = G̃(μ − μα).
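In practice α can be computed by successive substitution on α = G̃(μ − μα), starting from 0; the map is increasing, so the iterates climb monotonically to the root in (0, 1) when the queue is stable. This sketch (with an exponential-interarrival check against the M/M/1 result, our own) illustrates:

```python
def solve_alpha(G_lst, mu, tol=1e-12, max_iter=10_000):
    """Fixed-point iteration for alpha = G~(mu*(1 - alpha)), started at 0
    so that it converges to the root in (0, 1) for a stable queue."""
    alpha = 0.0
    for _ in range(max_iter):
        new = G_lst(mu * (1.0 - alpha))
        if abs(new - alpha) < tol:
            return new
        alpha = new
    return alpha

lam, mu = 1.0, 1.5
exp_lst = lambda s: lam / (lam + s)     # LST of exponential(lam) interarrivals
alpha = solve_alpha(exp_lst, mu)
print(alpha)   # should be close to lam/mu = 2/3 (M/M/1 sanity check)
```

Once α is known, quantities such as W = 1/(μ(1 − α)) below follow immediately.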
Problem 33
For a stable G/M/1 queue with FCFS service policy, using the preceding
results derive the distribution for the sojourn time spent by a customer in
the system.
Solution
Let Y be the sojourn time experienced by an arbitrary arrival into the system
in steady state. This is also referred to as the waiting time or time in the sys-
tem. Since we know that in steady state the probability an arriving customer
sees j in the system is π∗j , we can obtain the LST of the distribution of Y by
conditioning on the number in the system as seen by an arrival. Therefore,
we have the LST as

    E[e^{−sY}] = Σ_{j=0}^∞ E[e^{−sY} | X∗_∞ = j] π∗_j.
Given that a customer sees j others in the system upon arrival, the sojourn
time for this customer is the sum of the service times of the j customers ahead
as well as the customer's own service time. Hence, the conditional sojourn
time is the sum of j + 1 exponentials with parameter μ (i.e., distributed
according to Erlang(j + 1, μ)). Thus, we have
    E[e^{−sY}] = Σ_{j=0}^∞ E[e^{−sY} | X∗_∞ = j] π∗_j = Σ_{j=0}^∞ (μ/(μ + s))^{j+1} π∗_j
              = Σ_{j=0}^∞ (μ/(μ + s))^{j+1} (1 − α)α^j
              = (1 − α)(μ/(μ + s)) Σ_{j=0}^∞ (μα/(μ + s))^j = μ(1 − α)/(s + μ(1 − α))
where the last equation uses the fact that since 0 < α < 1 and 0 < μ/(μ + s) < 1,
the infinite geometric sum converges. Therefore, Y ∼ exp(μ(1 − α)), that is,
the sojourn time is exponentially distributed with parameter μ(1 − α).
Using the preceding results, we can immediately see that the average
time in the system (sojourn time or waiting time) is
    W = E[Y] = 1/(μ(1 − α)).
Then, from Little’s law we have the average number of customers in the
system as
    L = λ/(μ(1 − α)).
Problem 34
Consider a stable G/M/1 queue and let X(t) be the number of customers in
the system at time t. Define pj as the probability that there are j in the system
in steady state, that is, p_j = lim_{t→∞} P{X(t) = j}. Show that

    p_0 = 1 − ρ,
    p_j = ρπ∗_{j−1} for j > 0,

where ρ = λ/μ.
Solution
Let A_n be the time of the nth arrival into the system, with A_0 = 0. The bivariate
stochastic process {(X_n∗, A_n), n ≥ 0} is a Markov renewal sequence with
kernel K(x) = [K_ij(x)] such that K_ij(x) = P{X∗_{n+1} = j, A_{n+1} − A_n ≤ x | X_n∗ = i}.
Actually, the expression for the kernel is not necessary; all we need is to
realize that π∗ = [π∗_j] satisfies π∗ = π∗K(∞), and we already have π∗_j. Also,
since the arrivals are renewal, we have the conditional expected sojourn
times β_j = E[A_{n+1} − A_n | X_n∗ = j] = 1/λ for all j. Now, the stochastic process
{X(t), t ≥ 0} is a Markov regenerative process (MRGP) with embedded
Markov renewal sequence {(X_n∗, A_n), n ≥ 0}. Therefore, from MRGP theory
(see Section B.3.3), we have
    p_j = (Σ_{i=0}^∞ π∗_i γ_ij) / (Σ_{i=0}^∞ π∗_i β_i)
where γij is the expected time spent in state j between time An and An+1
given that Xn∗ = i. We first compute γij and then derive pj . For that, we use the
indicator function IA , which is one if A is true and zero if A is false. Using the
definition of γij , the indicator function, and properties of Poisson processes,
we can derive the following:
    γ_ij = E[∫_{A_n}^{A_{n+1}} I_{X(t)=j} dt | X_n∗ = i]
         = ∫_0^∞ E[∫_0^u I_{X(t)=j} dt | X_0∗ = i, A_1 = u] dG(u)
         = ∫_0^∞ ∫_0^u P{X(t) = j | X(0) = i + 1} dt dG(u)
         = ∫_0^∞ ∫_0^u e^{−μt} (μt)^{i+1−j}/(i + 1 − j)! dt dG(u)

(just after the nth arrival there are i + 1 customers in the system)
for i + 1 ≥ j and j ≥ 1. For a given j such that j ≥ 1, from MRGP theory (and
using the fact that βi = 1/λ and γij = 0 if i + 1 < j), we have
    p_j = (Σ_{i=0}^∞ π∗_i γ_ij) / (Σ_{i=0}^∞ π∗_i β_i)
        = [Σ_{i=j−1}^∞ π∗_i ∫_0^∞ ∫_0^u e^{−μt} (μt)^{i+1−j}/(i + 1 − j)! dt dG(u)] / [Σ_{i=0}^∞ π∗_i (1/λ)]
        = λ ∫_0^∞ ∫_0^u e^{−μt} Σ_{i=j−1}^∞ (1 − α)α^i (μt)^{i+1−j}/(i + 1 − j)! dt dG(u)
        = λ ∫_{u=0}^∞ ∫_{t=0}^u e^{−μt} (1 − α)α^{j−1} e^{αμt} dt dG(u)
        = (λ/μ) ∫_{u=0}^∞ ∫_{t=0}^u e^{−(1−α)μt} (1 − α)μ α^{j−1} dt dG(u)
        = (λ/μ) ∫_{u=0}^∞ [1 − e^{−(1−α)μu}] α^{j−1} dG(u)
        = (λ/μ) α^{j−1} (1 − G̃(μ(1 − α)))
        = (λ/μ) α^{j−1} (1 − α) = (λ/μ) π∗_{j−1}

where the last line uses the fact that α = G̃(μ(1 − α)). Therefore, for j ≥ 1,
p_j = ρπ∗_{j−1} where ρ = λ/μ. From this, p_0 can be easily computed as
p_0 = 1 − Σ_{j≥1} p_j = 1 − ρ.
Problem 35
Using the results derived for the G/M/1 queue, obtain α, π∗j , pj , P{Y ≤ y}, L,
and W when G(t) = 1 − e−λt for t ≥ 0 when the queue is stable.
Solution
Note that the interarrival times are exponentially distributed; therefore,
some of our results can be verified using those of the M/M/1 queue and
the reader is encouraged to do that. The LST of the interarrival time is
G̃(s) = λ/(λ + s). Therefore, we solve for α in α = G̃(μ − αμ), that is,
α = λ/(λ + (1 − α)μ). We get two solutions to that quadratic equation. Since
we require α ∈ (0, 1), we do not consider the solution α = 1. However, we
know the queue is stable (hence λ/μ < 1) and thus we have

    α = λ/μ.
Hence,

    π∗_j = (1 − α)α^j = (1 − λ/μ)(λ/μ)^j,

and

    p_0 = 1 − ρ = π∗_0,
    p_j = ρπ∗_{j−1} = (1 − ρ)ρ^j = π∗_j for j > 0.
Therefore, pj = π∗j for all j, which is not surprising due to PASTA. Also notice
that pj is identical to what was derived in the M/M/1 queue analysis. Further,
    W = 1/(μ − λ),
    L = λ/(μ − λ).
In a similar manner, one can obtain the preceding expressions for other
interarrival time distributions as well, some of which are given in the
exercises at the end of the chapter. Before wrapping up this section on
DTMC-based analysis, it is worthwhile to describe one more example. This
is the G/M/2 queue. It is crucial to point out that the generic G/M/s queue
can be analyzed in a similar fashion. However, analyzing the M/G/s
queue using a DTMC is quite intractable for s ≥ 2. The reason is that the
G/M/s queue, if observed at arrivals, is Markovian, whereas the M/G/s
queue observed at departures is not Markovian. Now to the G/M/2 queue.
Problem 36
Consider a G/M/2 queue. Let Xn∗ be the number of customers just before
the nth arrival. Show that {X_n∗, n ≥ 0} is a DTMC by computing the transition
probability matrix. Derive the condition for stability and the limiting
distribution for X_n∗.
Solution
We begin with some notation. Let aj be the probability that j departures occur
between two arrivals when both servers are working throughout the time
between the two arrivals. Then
    a_j = ∫_0^∞ e^{−2μt} (2μt)^j/j! dG(t).
Let cj be the probability that j departures occur between two arrivals where
both servers are working until the jth departure, after which only one server
is working but does not complete service. Then
    c_0 = ∫_0^∞ e^{−μt} dG(t) = G̃(μ)
for j > 0,
    c_j = ∫_0^∞ ∫_0^t e^{−μ(t−s)} e^{−2μs} (2μs)^{j−1}/(j − 1)! 2μ ds dG(t)
        = 2^j ∫_0^∞ e^{−μt} {1 − e^{−μt} Σ_{i=0}^{j−1} (μt)^i/i!} dG(t),

and b_j is given by

    b_j = 1 − c_j − Σ_{i=1}^{j} a_{i−1}.
Then

        | b0  c0  0   0   · · · |
        | b1  c1  a0  0   · · · |
    P = | b2  c2  a1  a0  · · · |
        | b3  c3  a2  a1  · · · |
        | b4  c4  a3  a2  · · · |
        |  .   .   .   .    .   |
Let π∗j be the limiting probability that in steady state an arriving customer
sees j in the system, that is,
    π∗_j = lim_{n→∞} P{X_n∗ = j}.

Proceeding as in the G/M/1 analysis, one can show the following:
1. The solution π∗_j = βα^j works for j > 0 if there is a unique solution
   to α = G̃(2μ(1 − α)) such that α ∈ (0, 1); the existence of such a
   solution is the condition for stability and can be written as λ/(2μ) < 1.
2. Also, π∗_0 = βα[1 − 2G̃(μ)]/[(1 − 2α)G̃(μ)].
3. Therefore, using Σ_{j=0}^∞ π∗_j = 1, we can derive that
   β = (1 − α)(1 − 2α)G̃(μ)/[α(1 − α) − αG̃(μ)]. This can be used to compute
   π∗ = [π∗_0 π∗_1 . . .].
It is worthwhile to point out that one can obtain the distribution of the
sojourn time in the system using an analysis similar to that in Problem 33.
That is left as an exercise for the reader.
    L = lim_{n→∞} E[X_n],
    U = lim_{n→∞} E[U_n].
Having described the notation, now we are ready for MVA. We first write
down a relation between Xn and Xn+1 . If Xn > 0, then Xn+1 = Xn − 1 + Un+1
because the number in the system as seen by the n + 1st departure is equal
to what the nth departure sees (which is Xn and also includes the n + 1st
customer since Xn > 0) plus all the customers that arrived during the service
of the n + 1st customer minus one (since only the number remaining in the
system is described in Xn+1 ). However, if Xn = 0, then Xn+1 = Un+1 since
when the nth customer departs, the system becomes empty, and then the
n + 1st customer arrives and starts getting served immediately, and all the
customers that showed up during that service would remain when the n+1st
customer departs. Thus, we can write down the following relation:

    X_{n+1} = (X_n − 1)I_{X_n > 0} + U_{n+1}.   (4.12)

Taking expected values on both sides, and writing

    E[(X_n − 1)I_{X_n > 0}] = E[(X_n − 1)^+]
        = E[(X_n − 1)^+ | X_n > 0]P(X_n > 0) + E[(X_n − 1)^+ | X_n = 0]P(X_n = 0)
        = E[X_n] − P(X_n > 0),

we get E[X_{n+1}] = E[X_n] − P(X_n > 0) + E[U_{n+1}]. Letting n → ∞, the E[X_n]
terms cancel, leaving

    U = 1 − π_0,

where π_0 = lim_{n→∞} P(X_n = 0).
Also,

    U = lim_{n→∞} E[U_n] = lim_{n→∞} E[E[U_n | S_n]] = lim_{n→∞} E[λS_n] = λ/μ,

and hence

    π_0 = 1 − ρ.
Notice that the preceding equation was derived in the M/G/1 analysis using
DTMCs. Also, the condition 0 < π0 < 1 implies ρ < 1, which is the stability
condition, and thus L < ∞.
Continuing with the MVA by squaring Equation 4.12, we get

    X²_{n+1} = (X_n − 1)² I_{X_n > 0} + 2U_{n+1}(X_n − 1)I_{X_n > 0} + U²_{n+1},

where I_{X_n > 0} is an indicator function that is one if X_n > 0 and zero otherwise.
Taking the expected value of the preceding equation, we get

    E[X²_{n+1}] = E[(X²_n − 2X_n + 1)I_{X_n > 0}] + 2E[U_{n+1}]E[(X_n − 1)I_{X_n > 0}] + E[U²_{n+1}]   (4.13)
since U_{n+1} is independent of (X_n − 1)I_{X_n > 0}. We derive each term of the RHS
of Equation 4.13 separately, starting from the right extreme. Conditioning on
the service time of the n + 1st customer, we get

    E[U²_{n+1}] = E[E[U²_{n+1} | S_{n+1}]] = E[Var[U_{n+1} | S_{n+1}] + {E[U_{n+1} | S_{n+1}]}²]
               = E[λS_{n+1} + λ²S²_{n+1}] = λE[S_{n+1}] + λ²E[S²_{n+1}] = ρ + λ²σ² + ρ².

Using an argument identical to the one used earlier to compute E[(X_n − 1)^+]
(see the expressions following Equation 4.12), we have the middle term
E[(X_n − 1)I_{X_n > 0}] = E[X_n] − P(X_n > 0). Of course, we also saw
earlier that E[Un+1 ] = ρ, which leaves us with the first expression that can be
derived as follows:
    E[(X²_n − 2X_n + 1)I_{X_n > 0}]
        = E[(X²_n − 2X_n + 1)I_{X_n > 0} | X_n > 0]P(X_n > 0) + E[(X²_n − 2X_n + 1)I_{X_n > 0} | X_n = 0]P(X_n = 0)
        = E[X²_n − 2X_n + 1 | X_n > 0]P(X_n > 0)
        = E[X²_n − 2X_n | X_n > 0]P(X_n > 0) + P(X_n > 0)
        = E[X²_n − 2X_n | X_n > 0]P(X_n > 0) + E[X²_n − 2X_n | X_n = 0]P(X_n = 0) + P(X_n > 0)
        = E[X²_n − 2X_n] + P(X_n > 0),

where the second-to-last step uses E[X²_n − 2X_n | X_n = 0] = 0.
Putting the three terms together,

    E[X²_{n+1}] = E[X²_n] − 2E[X_n] + P(X_n > 0) + 2ρE[X_n] − 2ρP(X_n > 0) + ρ + λ²σ² + ρ².

Taking the limit as n → ∞, canceling the LHS with the first term on the RHS,
and rearranging the terms, we get

    L = ρ + (λ²σ² + ρ²)/(2(1 − ρ)).
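The Pollaczek–Khinchine mean-value formula just derived is straightforward to evaluate; a small sketch with an M/M/1 sanity check (parameter values are ours):

```python
def mg1_L(lam, ES, VarS):
    """Mean number in system for a stable M/G/1 queue:
    L = rho + (lam^2 * sigma^2 + rho^2) / (2 (1 - rho))."""
    rho = lam * ES
    assert rho < 1.0, "queue must be stable (rho < 1)"
    return rho + (lam**2 * VarS + rho**2) / (2.0 * (1.0 - rho))

# M/M/1 check: with VarS = 1/mu^2 the formula collapses to rho/(1-rho).
lam, mu = 1.0, 1.5
print(mg1_L(lam, 1.0 / mu, 1.0 / mu**2))   # rho = 2/3, so L should be 2
```

For deterministic service (VarS = 0) the same function gives the smaller M/D/1 value, illustrating how service-time variability inflates congestion.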
    λ = 1/E[T_n],
    μ = 1/E[S_n],
TABLE 4.1
Notation for the G/G/1 MVA
An Time of nth arrival
Tn+1 = An+1 − An The n + 1st interarrival time
Sn Service time of the nth customer
Wn Time spent (sojourn) in the system by the nth customer
Dn = An + Wn Time of nth departure
In+1 = (An+1 − An − Wn )+ Idle time between nth and n + 1st service
    C_a² = Var[T_n]/(E[T_n])² = λ² Var[T_n],
    C_s² = Var[S_n]/(E[S_n])² = μ² Var[S_n],

    W = lim_{n→∞} E[W_n],
    I_d = lim_{n→∞} E[I_n],
    I(2) = lim_{n→∞} E[I_n²].
Using all the preceding definitions, we carry out MVA by first writing down
the sojourn time of the n + 1st customer: W_{n+1} equals that customer's
service time plus any time the customer spent waiting for service to begin
(this happens if the customer arrived before the previous one departed). In
other words,

    W_{n+1} = S_{n+1} + (W_n − T_{n+1})^+.

Using the definitions of D_n, T_n, and I_n in Table 4.1, and the identity
x^+ = x + (−x)^+, we can write down the following set of equations:

    I_{n+1} = (T_{n+1} − W_n)^+,
    W_{n+1} = S_{n+1} + W_n − T_{n+1} + I_{n+1}.

Now, by letting n → ∞ and using the notation defined earlier, we see that

    W = 1/μ + W − 1/λ + I_d.
Hence

    I_d = 1/λ − 1/μ = (1 − ρ)/λ.

To proceed, note the following facts:
1. Recall the definitions of Wn+1 , Sn+1 , and In+1 . Notice that Wn+1 −
Sn+1 is the time the n + 1st customer waits for service to begin and
In+1 is the idle time between serving the nth and n + 1st customers.
Based on that, we have (Wn+1 − Sn+1 )In+1 = 0, since when there is a
nonzero idle time, the n + 1st customer does not wait for service and
vice versa.
2. The time a customer waits for service to begin is independent of the
service time of that customer, hence (Wn+1 − Sn+1 ) is independent
of Sn+1 .
3. Also, the sojourn time of the nth customer is independent of the
time between the nth and n + 1st arrivals. Hence, we have Wn
independent of Tn+1 .
Now, squaring the relation W_{n+1} − S_{n+1} − I_{n+1} = W_n − T_{n+1}, using the
facts that (W_{n+1} − S_{n+1})I_{n+1} = 0, that (W_{n+1} − S_{n+1}) and S_{n+1} are
independent, and that W_n is independent of T_{n+1}, and taking expected
values, we get

    E[W²_{n+1}] − 2E[W_{n+1} − S_{n+1}]E[S_{n+1}] − E[S²_{n+1}] + E[I²_{n+1}]
        = E[W²_n] − 2E[W_n]E[T_{n+1}] + E[T²_{n+1}].
Notice that E[T_{n+1}] = 1/λ, E[S_{n+1}] = 1/μ, E[T²_{n+1}] = (C_a² + 1)/λ², and
E[S²_{n+1}] = (C_s² + 1)/μ². Making those substitutions and taking the limit as
n → ∞ (so that the E[W²] terms cancel), we get

    W = [1 + C_a² + ρ²(C_s² − 1) − λ² I(2)] / (2λ(1 − ρ)),   (4.14)

where ρ = λ/μ.
The only unknown quantity in the preceding expression for W is I(2) .
Therefore, suitable bounds and approximations for W can be obtained by
cleverly bounding and approximating I(2) . Section 4.3 is devoted to bounds
and approximations for queues, and to obtain some of those bounds, we will
use Equation 4.14. However, for the sake of completing this analysis, we
present a simple upper bound for W. Since the variance of the idle time for a
server between customers must be nonnegative, we have I(2) ≥ (I_d)² = (1 − ρ)²/λ².
Thus, we have −λ²I(2) ≤ −(1 − ρ)², and plugging into Equation 4.14, we get

    W ≤ 1/μ + (ρ²C_s² + C_a²)/(2λ(1 − ρ)).
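This bound can be coded directly from λ, μ, C_a², and C_s²; a sketch (the M/M/1 comparison is our own):

```python
def gg1_W_upper_bound(lam, mu, Ca2, Cs2):
    """Upper bound on mean sojourn time W for a stable G/G/1 queue,
    using only the first two moments of interarrival and service times."""
    rho = lam / mu
    assert rho < 1.0, "queue must be stable (rho < 1)"
    return 1.0 / mu + (rho**2 * Cs2 + Ca2) / (2.0 * lam * (1.0 - rho))

# For M/M/1 (Ca2 = Cs2 = 1) the exact value is W = 1/(mu - lam) = 2,
# while the bound evaluates to 1/mu + (rho^2 + 1)/(2 lam (1 - rho)) = 17/6.
print(gg1_W_upper_bound(1.0, 1.5, 1.0, 1.0))
```

The gap between 17/6 ≈ 2.83 and the exact 2 illustrates the remark below that this bound is quite weak.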
A key point to notice is that the preceding bound only depends on the mean
and variance of the interarrival times and service times. Therefore, we really
do not need the entire distribution information. Of course, the preceding
bound for W is quite weak, and one can obtain much better bounds and
approximations, which we describe in Section 4.3; these too use only λ, μ,
C_a², and C_s². However, before that we present another result for the
G/G/1 queue using the MVA results.
    lim_{n→∞} E(V_{n+1}) = I_d + 1/μ = 1/λ − 1/μ + 1/μ = 1/λ.
This is not a surprising result, as when the queue is stable the average depar-
ture rate is the same as the average arrival rate as no customers are created
or destroyed in the queue (see conservation law in Section 1.2.1).
The SCOV of the interdeparture times C2d is a little more involved. For
that, we go back to Equation 4.15. Since In+1 is independent of Sn+1 , tak-
ing variance on both sides of Equation 4.15 we get Var(Vn+1 ) = Var(In+1 ) +
Var(Sn+1 ). By letting n → ∞ we obtain
    lim_{n→∞} Var(V_{n+1}) = I(2) − I_d² + C_s²/μ².
However, using the definition of C_d² and substituting for I(2) from
Equation 4.14, we get

    C_d² = lim_{n→∞} Var(V_{n+1})/(E[V_{n+1}])² = λ²(I(2) − I_d²) + ρ²C_s²
         = C_a² + 2ρ²C_s² + 2ρ(1 − ρ) − 2λ(1 − ρ)W.
The reason this is written in terms of W is that now we only need a good
approximation or bound for W. Once we have that, we get a good bound
or approximation for C2d automatically. Hence, in the next section we mainly
focus on obtaining bounds and approximations for only W.
We begin with the single server G/G/1 queue and continue from where we
left off in the previous section. Then we show bounds and approximations
for multiserver queues for the remainder of this section.
    L = λW,
    L_q = λW − ρ,
    W_q = W − 1/μ.
    W ≤ 1/μ + (ρ²C_s² + C_a²)/(2λ(1 − ρ)).
    h(x) = F′(x)/(1 − F(x))

where F(x) = P{X ≤ x} is the CDF and F′(x) its derivative. Some random
dom variables are such that h(x) increases with x and they are called
IFR (increasing failure rate) random variables. There are also some
random variables such that h(x) decreases with x and they are called
DFR (decreasing failure rate) random variables. Of course, there are
many random variables that are neither IFR nor DFR, and the following
result cannot be used for those. For two positive-valued random
variables Y and Z with Y independent of Z, if Y is IFR, we have

    E[{(Y − Z)^+}²] ≤ (E[Y²]/E[Y]) E[(Y − Z)^+],

with the inequality reversed if Y is DFR.
Actually, the preceding results do not require IFR or DFR but a much
weaker condition (that they be decreasing or increasing mean residual
life, respectively). However, we use the stronger requirement of IFR or
DFR. Applying the result with Y = T_{n+1} and Z = W_n (so that
I_{n+1} = (Y − Z)^+), and letting n → ∞, we get I(2) ≤ (C_a² + 1)I_d/λ if
interarrival times are IFR, and I(2) ≥ (C_a² + 1)I_d/λ if interarrival times
are DFR. Plugging into Equation 4.14, we get

    W ≥ [ρ(C_a² − 1 + ρ) + ρ²C_s²]/(2λ(1 − ρ)) + 1/μ   if interarrival times are IFR,
    W ≤ [ρ(C_a² − 1 + ρ) + ρ²C_s²]/(2λ(1 − ρ)) + 1/μ   if interarrival times are DFR.
Approximation       W

    1.  W ≈ [ρ²(1 + C_s²)/(2λ(1 − ρ))] · [(C_a² + ρ²C_s²)/(1 + ρ²C_s²)] + 1/μ
    2.  W ≈ [ρ(1 + C_s²)/(2λ(1 − ρ))] · [(ρ(2 − ρ)C_a² + ρ²C_s²)/(2 − ρ + ρC_s²)] + 1/μ
    3.  W ≈ ρ²(C_a² + C_s²)/(2λ(1 − ρ)) + (1 − C_a²)C_a²ρ/(2λ) + 1/μ
There are other approximations for heavy-traffic queues that we will see
in the G/G/s setting where one can use s = 1 and get G/G/1 approximations.
The reader is encouraged to review those approximations as well. The lit-
erature also has several empirical approximations. Care must be taken to
ensure that the test cases that were used to obtain the empirical approxi-
mations and conclusions are identical to those considered by the reader. It is
worthwhile to point out that the steady-state mean waiting time and number
in the system can also be obtained using simulations that we use for testing
our approximations. In fact, one does not even need sophisticated simulation
software for that; we explain this next using an example.
Problem 37
For a G/G/1 queue, develop an algorithm to simulate and obtain the mean
number in the system in steady-state, given the CDF of interarrival times F(t)
and service times G(t).
Solution
Clearly from the problem description, for all n ≥ 0, F(t) = P{T_n ≤ t} and
G(t) = P{S_n ≤ t}. Using U_n and V_n as uniform(0, 1) random variables,
which come standard in any computational package, we can obtain samples
of T_n and S_n as F^{−1}(U_n) and G^{−1}(V_n), respectively. Notice that F^{−1}(·)
is the inverse of the function F(·); for example, if F(t) = 1 − e^{−λt}, then
T_n = F^{−1}(U_n) = (−1/λ)log_e(1 − U_n). Now we describe the following
algorithm using T_n and S_n for all n:
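A minimal version of such an algorithm might look as follows. This particular implementation is a sketch of our own: it uses the FCFS recursion D_n = max(A_n, D_{n−1}) + S_n for departure times, and the fact that the area under X(t) equals the sum of the customers' sojourn times, so the time-average number in system is that area divided by the elapsed time.

```python
import random, math

def simulate_L(Finv, Ginv, n=200_000, seed=7):
    """Estimate the steady-state mean number in system L of a G/G/1 queue.
    Finv and Ginv are the inverse CDFs of interarrival and service times
    (inverse-transform sampling: T = Finv(U), S = Ginv(V))."""
    rng = random.Random(seed)
    A = 0.0      # arrival time of current customer
    D = 0.0      # departure time of previous customer
    area = 0.0   # accumulated area under X(t)
    for _ in range(n):
        A += Finv(rng.random())        # T_n = F^{-1}(U_n)
        S = Ginv(rng.random())         # S_n = G^{-1}(V_n)
        D = max(A, D) + S              # FCFS, single server
        area += D - A                  # this customer's sojourn time
    return area / A                    # A is (approximately) elapsed time

# Sanity check with M/M/1 (lam = 1, mu = 2): exact L = rho/(1-rho) = 1.
lam, mu = 1.0, 2.0
Finv = lambda u: -math.log(1.0 - u) / lam
Ginv = lambda u: -math.log(1.0 - u) / mu
Lhat = simulate_L(Finv, Ginv)
print(Lhat)
```

Any interarrival or service distribution with a computable inverse CDF can be plugged in for `Finv` or `Ginv`, which is exactly what Problem 38 below requires for the Pareto case.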
Problem 38
Consider a G/G/1 queue where the service times are IID uniform random
variables between 0 and 2/μ. Obtain the mean waiting time using simu-
lations when interarrival times (Tn ) are according to a Pareto distribution
whose CDF is
    P(T_n ≤ x) = 1 − (K/x)^β if x ≥ K, and 0 otherwise,

with parameters β = 1 + √2 and K = (β − 1)/(βλ). Show that C_a² = 1 and then
compare the mean waiting time with that of the M/G/1 queue. Use λ = 10
and μ = 15.
Solution
First let us analyze the arrival process. Using the CDF we can compute

    E[T_n] = Kβ/(β − 1) = 1/λ

when β > 1 (which is needed for the first equality and is the case here), and

    Var[T_n] = K²β/((β − 1)²(β − 2)) = 1/λ²

when β > 2 (also the case here, since β = 1 + √2 ≈ 2.414). Hence
C_a² = Var[T_n]/(E[T_n])² = 1/(β(β − 2)) = 1/((1 + √2)(√2 − 1)) = 1.
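The moment computations can be checked numerically by plain evaluation of the formulas (a quick sketch):

```python
from math import sqrt

beta = 1.0 + sqrt(2.0)
lam = 10.0
K = (beta - 1.0) / (beta * lam)

ET = K * beta / (beta - 1.0)                            # should be 1/lam
VarT = K**2 * beta / ((beta - 1.0)**2 * (beta - 2.0))   # needs beta > 2
Ca2 = VarT / ET**2                                      # = 1/(beta*(beta-2))

print(ET, VarT, Ca2)   # expect 0.1, 0.01, and 1 respectively
```

The SCOV equals 1 because (1 + √2)(√2 − 1) = 1 exactly, which is precisely why this value of β was chosen for the comparison with the M/G/1 queue.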
Notice that I(2) is the second moment of the server idle time between successive
arrivals. However, under heavy traffic, the server experiences nonzero
idle time only for a small fraction (1 − ρ) of the time. Therefore, a reasonable
heavy-traffic approximation as ρ approaches one for the G/G/1 queue is

    W ≈ 1/μ + (ρ²C_s² + C_a²)/(2λ(1 − ρ)).
    W_q ≈ W_{q,G/G/1} (W_{q,M/M/s} / W_{q,M/M/1})
done. So the owner of TravHelp decided to call one of his former classmates
from Wharton who runs a professional consulting firm ProCon.
FIGURE 4.2
Histogram of interarrival times of customer calls (frequency vs. interarrival time in hours, ×10⁻³).
immediately realized that this was because of the 32 separate queueing sys-
tems. He soon recalled a homework problem from his queueing theory class,
where it was shown that a single queue would be more efficient than having
multiple parallel lines.
for G/G/s queues in there that would be appropriate to use. The first was a
heavy-traffic approximation

    W_q ≈ (ρ²C_s² + C_a²)/(2λ(1 − ρ)),

and the second was

    W_q ≈ (α_s/μ) (1/(1 − ρ)) (C_a² + C_s²)/(2s),

where α_s = (ρ^s + ρ)/2 when ρ > 0.7 (in this case, ρ is well above 0.7). Plugging
in the numbers, he got the average waiting time for each customer to
speak to a rep as 0.29 min and 0.13 min using the first and second formulas,
respectively.
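The two approximations Jacob used can be sketched as follows. The packaging into functions, the assumption ρ = λ/(sμ), and the M/M/1 check are our own; the call-center parameter values are not reproduced in the text, so the example below uses illustrative numbers.

```python
def wq_heavy_traffic(lam, mu, s, Ca2, Cs2):
    """Heavy-traffic approximation for Wq in a G/G/s queue (rho = lam/(s mu))."""
    rho = lam / (s * mu)
    return (rho**2 * Cs2 + Ca2) / (2.0 * lam * (1.0 - rho))

def wq_alpha_s(lam, mu, s, Ca2, Cs2):
    """Second approximation: Wq ~ (alpha_s/mu)(1/(1-rho))(Ca2+Cs2)/(2s),
    with alpha_s = (rho^s + rho)/2, suggested for rho > 0.7."""
    rho = lam / (s * mu)
    alpha_s = (rho**s + rho) / 2.0
    return (alpha_s / mu) * (1.0 / (1.0 - rho)) * (Ca2 + Cs2) / (2.0 * s)

# For s = 1 and Ca2 = Cs2 = 1 the second formula reduces to the exact
# M/M/1 value Wq = lam/(mu(mu - lam)):
print(wq_alpha_s(1.0, 2.0, 1, 1.0, 1.0))   # 0.5, the exact M/M/1 Wq
```

Note that the two approximations generally disagree, which is consistent with the 0.29 min versus 0.13 min figures Jacob obtained.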
Although Jacob realized these were approximations, he felt that they
were clearly lower than the 2 min wait that the current system customers
experience on average. Jacob wondered what would happen with fewer
reps. He checked for s = 116 reps (a reduction of 6 reps) and the average
wait times for customers to speak to a rep became 0.82 and 0.41 min using the
first and second formulas, respectively. Jacob was thrilled; he looked at his
watch and there was enough time to grab a quick latte from a nearby coffee
shop before his meeting. At the brainstorming meeting, Jacob presented his
recommendation, which is to consolidate the 32 clusters into a single large
cluster. When a customer service call arrives, it would go to a free rep if one
is available, otherwise the call would be put on hold until a rep became free.
Jacob also suggested reducing the number of reps to 116. His colleagues liked
the idea. One of them also added another recommendation: to use a monitor
to display the number of customers on hold that all the reps can see. That
way, if the number of customers on hold is zero, then the reps who are busy
can spend more time talking to their customers, projecting greater concern,
whereas if there is a large number of calls on hold, reps can quickly wrap
up their calls. Jacob liked the idea, and when he saw his latte he realized
his coffee shop also adopts a similar notion where if there is a long line, the
orders are taken quickly and if it is empty, the workers spend time chatting
with customers.
Jacob rechecked all his calculations to make sure everything was alright.
Then he proceeded to TravHelp. He made the recommendations that were
discussed. TravHelp decided to adopt them but continue with the 122 reps.
It was an easy redesign for them and they assigned calls to available reps
in some fair round-robin fashion. TravHelp also decided to use monitors
that were spread throughout the call center and reps knew how long the
lines were at all times. TravHelp monitored their system as usual and also
collected data electronically as they had done before. Jacob told TravHelp he
would return in a week to see how things were going and analyze the data
to see the actual improvements.
the managers in TravHelp that the geographic clusters were mainly for per-
sonnel reasons (reps started at different times and to accommodate that they
used different time zones). So this time Jacob carefully redesigned the system
with two layers of reps. In the first layer, he recommended a set of 20 reps
that made the initial contact to determine the appropriate cluster to forward
the call. These calls lasted less than 30 s each. At the second layer, there were
23 specialized clusters each with 3–5 reps, as well as a large pool of 40 reps.
The specialized small clusters were for the large volume of quick calls that
were either client specific or for a single service type. The remaining calls
were being handled by the large pool.
To develop this design, Jacob had to perform several what-if analyses
and used queueing approximations to obtain quick results. He also worked
closely with the managers and reps at TravHelp to understand the implica-
tions and estimate quantities for service times. Finally, before implementing
the solution, Jacob developed a simulation of the system. It revealed that the
average wait time of customers (not including the time they spend speaking
to reps) was less than a minute. However, if one were to classify customers
into groups, those that require longer service times and those that require
short ones, then the wait times were larger for the former set of customers.
But this was in line with customers’ expectations. Thus, this differentiated
service was palatable to customers as well as clients and the reps’ morale was
restored. TravHelp implemented the new system and the results matched
those predicted by Jacob’s simulations. Jacob was delighted about that. He
was also appreciative of the use of queueing approximations for doing quick
analysis. And last but not least, he realized the importance of considering
behavioral aspects while making decisions, the criticality of talking to
the individuals involved to understand the situation better, and finally that
customers' perceptions are a crucial thing to consider.
way. However, for a simple system like a G/G/s queue, it is fairly compu-
tationally intensive. But this is a powerful technique that can be effectively
used in a wide variety of applications beyond queueing.
where T is an m × m matrix, T∗ an m × 1 vector, and 0 a 1 × m vector of zeros.
The density is

    f(y) = dF(y)/dy = p exp(Ty) T∗.
Notice the use of the exponential of a matrix, which is defined for a square
matrix M as

    exp(M) = I + M + M²/2! + M³/3! + · · ·
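A sketch of this definition in code, applied to the phase-type density f(y) = p exp(Ty)T∗ for a hypoexponential (Erlang-2) example of our choosing. The truncated power series is adequate for the small matrices here; a production code would use a scaling-and-squaring routine such as SciPy's `expm`.

```python
import numpy as np

def expm_series(M, terms=60):
    """exp(M) via its defining power series I + M + M^2/2! + ...
    (fine for small matrices with modest norm)."""
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k     # M^k / k!
        out = out + term
    return out

# Hypoexponential (Erlang-2) phase-type representation, rate 3 per phase:
p = np.array([1.0, 0.0])
T = np.array([[-3.0, 3.0],
              [0.0, -3.0]])
Tstar = -T @ np.ones((2, 1))    # T* = -T 1

def f(y):
    """Phase-type density f(y) = p exp(Ty) T*."""
    return (p @ expm_series(T * y) @ Tstar).item()

print(f(1.0))   # should match the Erlang(2,3) density 9 y e^{-3y} at y = 1
```

With m = 2 phases this reproduces the Erlang(2, 3) density exactly, a quick consistency check on both the series and the phase-type representation.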
The key idea is that if a positive-valued random variable X with CDF G(·)
needs to be approximated as a phase-type distribution Y with CDF F(·),
then there exists at least one m, p, and Q that would ensure that F(y) is
arbitrarily close to G(y) for all y ≥ 0. However, in practice, choosing or
finding the appropriate m, p, and Q is nontrivial. To alleviate that concern
of over-parameterization, one typically considers the following special types
of phase-type distributions (with much fewer parameters to estimate):
Before concluding this section and analyzing phase-type queues, here are
a few words in terms of fitting a positive-valued random variable X with CDF
G(·) as a phase-type distribution Y with CDF F(·). There are several papers
that discuss the selection of m, p, and Q so that the resulting phase-type dis-
tribution fits well. A recent paper (Fackrell [34]) nicely summarizes various
    L ⊗ M = [l_ij M].

For example, if L is a 3 × 3 matrix and M is a 2 × 2 matrix, then we have

            | l11m11  l11m12  l12m11  l12m12  l13m11  l13m12 |
            | l11m21  l11m22  l12m21  l12m22  l13m21  l13m22 |
    L ⊗ M = | l21m11  l21m12  l22m11  l22m12  l23m11  l23m12 |
            | l21m21  l21m22  l22m21  l22m22  l23m21  l23m22 |
            | l31m11  l31m12  l32m11  l32m12  l33m11  l33m12 |
            | l31m21  l31m22  l32m21  l32m22  l33m21  l33m22 |

The Kronecker sum is defined as

    L ⊕ M = L ⊗ I_M + I_L ⊗ M,

where I_M and I_L are identity matrices of the same orders as M and L, respectively.
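Both operations are one-liners with numpy. The example matrices and the eigenvalue check (every eigenvalue of L ⊕ M is a sum of an eigenvalue of L and one of M) are our own illustration:

```python
import numpy as np

L = np.array([[1.0, 2.0],
              [3.0, 4.0]])
M = np.array([[0.0, 5.0],
              [6.0, 7.0]])

kron_prod = np.kron(L, M)                        # L (x) M = [l_ij M]
I_L = np.eye(L.shape[0])
I_M = np.eye(M.shape[0])
kron_sum = np.kron(L, I_M) + np.kron(I_L, M)     # L (+) M

# Check: eigenvalues of L (+) M are all pairwise sums of eigenvalues.
eigs = sorted(np.linalg.eigvals(kron_sum).real)
expected = sorted((a + b).real for a in np.linalg.eigvals(L)
                              for b in np.linalg.eigvals(M))
print(np.allclose(eigs, expected))   # True
```

This eigenvalue property is what makes Kronecker sums the natural tool for generators of independent CTMCs evolving in parallel, which is exactly how they are used for the PH/PH/s queue below.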
(although if X(t) = 0, then the state is just (X(t), Z_1(t), . . . , Z_K(t))) is a CTMC.
The CTMC has lexicographically ordered states with an infinitesimal generator
(i.e., Q) matrix of the QBD block-tridiagonal form:
The matrices, Bi,j , correspond to transition rates from states where the num-
ber in the system is i to states where the number in the system is j for i, j ≤ s.
Also, A0 , A1 , and A2 are identical to those in the QBD description and valid
when there are more than s customers in the system. We determine the
matrices A0 , A1 , A2 , and Bi,j using Kronecker sums and products as follows
(for i = 1, . . . , s):
A2 = I_{mA,1 mA,2 ··· mA,K} ⊗ ⊕^s(T∗S ⊗ pS),

A1 = TA,1 ⊕ TA,2 ⊕ · · · ⊕ TA,K ⊕ ⊕^s(TS),

A0 = (T∗A,1 ⊗ pA,1 ⊕ T∗A,2 ⊗ pA,2 ⊕ · · · ⊕ T∗A,K ⊗ pA,K) ⊗ I_{(mS)^s},

where ⊕^j(M) denotes the Kronecker sum of matrix M with itself j times, that is,

⊕^2(M) = M ⊕ M and ⊕^3(M) = M ⊕ M ⊕ M.
Having modeled the system as a QBD, the next step is to calculate
the steady-state probabilities using MGM. Notice that this is just a minor
extension of the MGM described in Chapter 3 and we will just use the
same analysis here. The reader is encouraged to read Section 3.2.2 before
proceeding. First, we assume that A0 + A1 + A2 is an irreducible infinitesimal generator with stationary probability vector π (a 1 × m row vector) such that π(A0 + A1 + A2) = 0 and π1 = 1, where, for i = 0, 1, i denotes a column vector of i's.
The irreducibility assumption is automatically satisfied for phase-type dis-
tributions such as hypoexponential, hyperexponential, and Coxian. Once π
is obtained, the condition for the PH/PH/s queue to be stable is πA0 1 < πA2 1, and this usually corresponds to the total mean service rate due to all s servers being larger than the arrival rate on average.
If the queue is stable, then the next step is to find R that is the minimal
nonnegative solution to the equation
A0 + RA1 + R2 A2 = 0.
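A common way to compute R is successive substitution: rewrite the equation as R = −(A0 + R²A2)A1⁻¹ and iterate starting from R = 0. A sketch follows; the M/M/1 special case at the end (with illustrative rates) is only a sanity check, since there R reduces to the scalar ρ = λ/μ:

```python
import numpy as np

def solve_R(A0, A1, A2, tol=1e-12, max_iter=100000):
    """Minimal nonnegative solution of A0 + R A1 + R^2 A2 = 0 by successive
    substitution R <- -(A0 + R^2 A2) A1^{-1}, starting from R = 0."""
    A1_inv = np.linalg.inv(A1)
    R = np.zeros_like(A0)
    for _ in range(max_iter):
        R_new = -(A0 + R @ R @ A2) @ A1_inv
        if np.max(np.abs(R_new - R)) < tol:
            return R_new
        R = R_new
    raise RuntimeError("R iteration did not converge")

# Sanity check on the M/M/1 queue viewed as a QBD with 1 x 1 blocks:
# A0 = [lambda], A1 = [-(lambda + mu)], A2 = [mu]; minimal solution R = lambda/mu.
lam, mu = 1.0, 2.0
R = solve_R(np.array([[lam]]), np.array([[-(lam + mu)]]), np.array([[mu]]))
print(R)  # approximately [[0.5]]
```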
p0 B0,0 + p1 B1,0 = 0
p0 B0,1 + p1 B1,1 + p2 B2,1 = 0
p1 B1,2 + p2 B2,2 + p3 B3,2 = 0
p2 B2,3 + p3 B3,3 + p4 B4,3 = 0
⋮
ps−2 Bs−2,s−1 + ps−1 Bs−1,s−1 + ps Bs,s−1 = 0
ps−1 Bs−1,s + ps A1 + ps RA2 = 0
p0 1 + p1 1 + · · · + ps−1 1 + ps (I − R)−1 1 = 1
lim_{t→∞} P{X(t) = j} = pj 1.

Lq = Σ_{i=1}^{∞} i p_{s+i} 1 = Σ_{i=1}^{∞} i ps R^i 1 = ps R(I − R)^{−2} 1.
Using λ as the mean arrival rate and μ as the mean service rate for each
server (both of which can be computed from the phase-type distributions),
we can write down Wq = Lq /λ, W = Wq + 1/μ, and L = λW. However, what
is not particularly straightforward is the sojourn time distribution. For that
we need to know the arrival point probabilities in steady state, that is, the
distribution of the state of the system when an entity arrives into the system.
Once that is known, by conditioning on the arrival point probabilities, computing the LST of the conditional sojourn time for the arriving customer, and then unconditioning, we can obtain the sojourn time distribution.
Even for a simple example, this computation is fairly tedious and hence not
presented here. In the next section, we consider an example application to
illustrate the results seen in this section.
Likewise, the second arrival stream as well as the service times are two-phase Coxian distributions. In particular, the second arrival stream has parameters mA,2 = 2, pA,2 = [1 0],

TA,2 =
[ −γ1   βγ1 ]
[ 0     −γ2 ]

and

T∗A,2 =
[ (1 − β)γ1 ]
[ γ2 ].

Each entity requires a service at one of the two identical servers, so that the service time is according to a two-phase Coxian distribution with parameters mS = 2, pS = [1 0],

TS =
[ −μ1   δμ1 ]
[ 0     −μ2 ]

and

T∗S =
[ (1 − δ)μ1 ]
[ μ2 ].
Problem 39
A semiconductor wafer fab has a bottleneck workstation with two iden-
tical machines. Products arrive into the workstation from two sources
and they wait in a line to be processed by one of the two identical
machines. Data suggests that the arrival streams as well as service times
can be modeled as two-phase Coxian distributions described earlier. In
particular, for the first arrival stream (λ1 , α, λ2 ) = (20, 0.25, 5), for the sec-
ond arrival stream (γ1 , β, γ2 ) = (9.091, 0.9, 10), and for the service times
(μ1 , δ, μ2 ) = (10, 0.3333, 20). Model the system as a CTMC by writing down
the infinitesimal generator in QBD form using Kronecker sums and products.
Then solve for the steady-state probabilities to obtain the average number of
products in the system in the long run.
Solution
As defined earlier, X(t) is the number of products in the system, Zi(t) is the phase of the ith arrival process, and Ui(t) is the phase of the ith service process if there is a product in service, all at time t and for i = 1, 2. Then the multidimensional stochastic process
where

A2 = I4 ⊗ [(T∗S ⊗ pS) ⊕ (T∗S ⊗ pS)]

A1 = TA,1 ⊕ TA,2 ⊕ TS ⊕ TS

A0 = (T∗A,1 ⊗ pA,1 ⊕ T∗A,2 ⊗ pA,2) ⊗ I4

B1,0 = I4 ⊗ T∗S
Notice that we have numerical values for TA,i and pA,i for i = 1, 2 as well as TS and pS. Therefore, the preceding matrices can be computed. However, since some of the matrices are too large to display here (e.g., A0, A1, and A2 are 16 × 16 matrices), we only show a few computations to illustrate the Kronecker product and sum calculations. In particular, verify that
B0,0 = TA,1 ⊕ TA,2 =
[ −29.091    8.182     5          0      ]
[  0        −30        0          5      ]
[  0         0        −14.091     8.182  ]
[  0         0         0         −15     ],
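This block is easy to verify numerically. The following sketch recomputes B0,0 = TA,1 ⊕ TA,2 from the Problem 39 data (illustrative, using numpy); the printed matrix matches the one displayed above up to rounding:

```python
import numpy as np

def kron_sum(L, M):
    """Kronecker sum L (+) M = L (x) I_M + I_L (x) M."""
    return np.kron(L, np.eye(M.shape[0])) + np.kron(np.eye(L.shape[0]), M)

# Two-phase Coxian generator blocks from Problem 39:
# first arrival stream (lambda1, alpha, lambda2) = (20, 0.25, 5),
# second arrival stream (gamma1, beta, gamma2) = (9.091, 0.9, 10)
TA1 = np.array([[-20.0, 0.25 * 20.0],
                [0.0, -5.0]])
TA2 = np.array([[-9.091, 0.9 * 9.091],
                [0.0, -10.0]])

B00 = kron_sum(TA1, TA2)
print(np.round(B00, 3))
```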
A0 + RA1 + R2 A2 = 0.
p0 B0,0 + p1 B1,0 = 0
p0 B0,1 + p1 B1,1 + p2 B2,1 = 0
p1 B1,2 + p2 A1 + p2 RA2 = 0
p0 1 + p1 1 + p2 (I − R)−1 1 = 1
lim_{t→∞} P{X(t) = j} = pj 1.

Lq = Σ_{i=1}^{∞} i p_{2+i} 1 = Σ_{i=1}^{∞} i p2 R^i 1 = p2 R(I − R)^{−2} 1.
τ = 1/μ). We will typically consider the mean service time to be finite, that
is, τ < ∞. Also, for nontriviality we assume τ > 0. It is crucial to note that
there is no restriction in terms of the service time random variable; it could
be discrete, continuous, or a mixture of discrete and continuous (however,
our analysis is in terms of continuous, for the others the Riemann integral
must be replaced by the Lebesgue integral).
Some of the performance measures are relatively straightforward; for example, the sojourn time distribution is identical to the service time distribution. Therefore, the key analysis is to obtain the distribution of the number in the system. Define X(t) as the number (of entities) in the system at time t. The objective of transient and steady-state analysis is to obtain a probability distribution of X(t) for finite t and as t → ∞, respectively. We first present the transient analysis and then take the limit for steady-state analysis.
Transient analysis typically depends on the initial state of the queue. To
obtain simple closed-form expressions, we need to make one of the following
three assumptions: (i) the queue is empty at t = 0, that is, X(0) = 0; (ii) the
queue started empty in the distant past, that is, X(−∞) = 0, and hence the
stochastic process {X(t), t ≥ 0} is stationary; and (iii) if X(0) > 0, then the service for all the X(0) entities begins at t = 0. If none of the preceding three assumptions is satisfied, we will have to know the times when service began for each of the X(0) customers in order to obtain a distribution for X(t). For the transient analysis here, we make the first assumption, that is, X(0) = 0. Hence, we are interested in computing pj(t) defined as

pj(t) = P{X(t) = j | X(0) = 0}.
Note that if we made the second assumption, then X(t) would be according
to the steady state distribution for all t ≥ 0. However, if we made the third
assumption, then
P{X(t) = j | X(0) = i, Bi} = Σ_{k=0}^{min(i,j)} C(i, k) [1 − G(t)]^k [G(t)]^{i−k} p_{j−k}(t),

where C(i, k) = i!/(k!(i − k)!) is the binomial coefficient,
with the event Bi denoting that service for all i initial customers begins at
t = 0. Since this is straightforward once pj (t) is known, we continue with
obtaining pj (t) by making the first assumption.
Notice that {X(t), t ≥ 0} is a regenerative process with regeneration epochs
corresponding to when the queueing system becomes empty. To obtain pj (t),
consider an arbitrary arrival at time x such that x ≤ t. Let qx be the probability
that this arriving entity is in the system at time t. Clearly,
qx = 1 − G(t − x)
since it is the probability that this entity’s service time is larger than t − x. Let
{N(t), t ≥ 0} be a Poisson process with parameter λ that counts the number of
arrivals in time (0, t] for all t ≥ 0. Now consider a nonhomogeneous Bernoulli
splitting of the Poisson (arrival) process {N(t), t ≥ 0} such that with probabil-
ity qx an entity arriving at time x will be included in the split process. Since
the split process counts the number in the M/G/∞ system at time t, we have
pj(t) = exp{−λ ∫_0^t qx dx} (λ ∫_0^t qx dx)^j / j!.
The proof is described in Kulkarni [67], Gross and Harris [49], and Wolff [108]. It is based on conditioning on the number of arrivals in time t and using the fact that each of the given arrivals occurs uniformly in [0, t]. The argument is similar to the derivation of the number of departures from the M/G/∞ queue in time t in Problem 41.
Next we write down pj(t) in terms of the entities given in the model, namely, λ and G(t). Using qx = 1 − G(t − x) and a change of variables, we get

pj(t) = exp{−λ ∫_0^t qx dx} (λ ∫_0^t qx dx)^j / j!

= exp{−λ ∫_0^t [1 − G(t − x)] dx} (λ ∫_0^t [1 − G(t − x)] dx)^j / j!

= exp{−λ ∫_0^t [1 − G(u)] du} (λ ∫_0^t [1 − G(u)] du)^j / j!.
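This formula is easy to evaluate numerically for any service CDF G. A sketch in pure Python follows (trapezoidal integration; the exponential G below is only an illustrative choice, which doubles as an M/M/∞ sanity check since then λ ∫_0^t [1 − G(u)] du = (λ/μ)(1 − e^{−μt})):

```python
import math

def p_j_t(lam, G, t, j, n=2000):
    """Transient P{X(t) = j} for an M/G/infinity queue starting empty:
    Poisson pmf with mean m(t) = lam * integral_0^t [1 - G(u)] du."""
    h = t / n
    # Trapezoidal rule for integral_0^t [1 - G(u)] du
    integral = h * (((1 - G(0.0)) + (1 - G(t))) / 2
                    + sum(1 - G(k * h) for k in range(1, n)))
    m = lam * integral
    return math.exp(-m) * m ** j / math.factorial(j)

# Illustrative values: lam = 3, exponential service at rate mu = 2, t = 1.5
lam, mu, t = 3.0, 2.0, 1.5
G = lambda u: 1 - math.exp(-mu * u)
print(p_j_t(lam, G, t, 2))
```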
E[X(t)] = λ ∫_0^t [1 − G(u)] du

and

Var[X(t)] = λ ∫_0^t [1 − G(u)] du.
pj = e^{−λτ} (λτ)^j / j!,

where

τ = ∫_0^∞ [1 − G(u)] du.
In addition, the mean and variance of the number in the system in steady state
are both λτ since X is a Poisson random variable with parameter λτ. Note
that the mean and variance of the sojourn times correspond, respectively,
to the mean and variance of the service times since there is no waiting for
service to begin. Before concluding this section, it is worthwhile observing that the steady-state probabilities pj are identical to those of the M/M/∞ system, and thus pj does not depend on the CDF G(·) but only on the mean service time.
Problem 40
Obtain the distribution of the busy period, that is, the continuous stretch of
time when there are one or more entities in the system beginning with the
arrival of an entity into an empty system.
Solution
As we described earlier, {X(t), t ≥ 0} is a regenerative process with regen-
eration epochs corresponding to times when the number in the system
goes from 1 to 0. Each regeneration time corresponds to one idle period
followed by one busy period (time when there are one or more entities
in the M/G/∞ system). Let U be the regeneration time and U = I + B,
where the idle time I ∼ exp(λ) (i.e., time for next arrival in a Poisson pro-
cess), and the busy period B has a CDF H(·). We need to determine H(·).
For that, we develop the following explanation based on Example 8.17 in
Kulkarni [67].
Let F(·) be the CDF of U. Using a renewal argument by conditioning
on U = u, we can write down a renewal-type equation for p0 (t) = P{X(t) =
0|X(0) = 0} as
p0(t) = ∫_0^t p0(t − u) dF(u) + ∫_t^∞ P{I > t | U = u} dF(u).

Since I ≤ U, we have P{I > t | U = u} = 0 whenever u ≤ t, and hence the second integral can be extended to all of [0, ∞). Therefore,

p0(t) = ∫_0^t p0(t − u) dF(u) + ∫_0^∞ P{I > t | U = u} dF(u)

= ∫_0^t p0(t − u) dF(u) + P{I > t}

= p0 ∗ F(t) + e^{−λt},
where G(·) is the CDF of the service times. One way to obtain the unknown F(t) is to numerically solve for F(t) in p0(t) = p0 ∗ F(t) + e^{−λt} and similarly solve another convolution equation to get H(t). An alternate approach, standard when there are convolutions, is to use transforms. Therefore, taking the LST on both sides of the equation p0(t) = p0 ∗ F(t) + e^{−λt}, we get
p̃0(s) = p̃0(s) F̃(s) + s/(s + λ).
to obtain

H̃(s) = 1 + s/λ − s/(λ p̃0(s)).
In the most general case, the preceding LST is not easy to invert to obtain H(t). Another challenge is to obtain p̃0(s) from p0(t), which is not trivial. However, there are several software packages available (such as MATLAB)
E[B] = −H̃′(0) = (e^{λτ} − 1)/λ.
Another way to obtain it is to use regenerative process results and solve for
E[B] in 1 − p0 = E[B]/(E[B] + 1/λ).
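A quick numerical check of this consistency, with illustrative values of λ and τ:

```python
import math

# Verify that E[B] = (e^{lam*tau} - 1)/lam satisfies the regenerative-process
# identity 1 - p0 = E[B]/(E[B] + 1/lam), where p0 = e^{-lam*tau} in steady state.
lam, tau = 2.0, 0.7
EB = (math.exp(lam * tau) - 1) / lam
p0 = math.exp(-lam * tau)
print(1 - p0)                 # long-run fraction of time the system is busy
print(EB / (EB + 1 / lam))    # the same quantity from the regeneration cycle
```

The two printed quantities agree, since EB/(EB + 1/λ) simplifies algebraically to 1 − e^{−λτ}.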
Problem 41
Compute the distribution of the interdeparture times both in the transient
case and in the steady-state case.
Solution
Since the output from an M/G/∞ may flow into some other queue, it is crit-
ical to analyze the departure process. Let D(t) be the number of departures
from the M/G/∞ system in time [0, t] given that X(0) = 0. For any arbitrary t,
we seek to obtain the distribution of the random variable D(t) and thereby
characterize the stochastic process {D(t), t ≥ 0}. Similar to the analysis for the
number in the system, here too we first consider transient and then describe
steady-state results. The results follow the analysis in Gross and Harris [49].
However, it is crucial to point out that there are many other elegant ways
of analyzing departures from M/G/∞ queues and extending them, some of
which we will see toward the end of this section.
If we are given that n arrivals occurred in time [0, t], then using standard Poisson process results we know that the arrival time of any of the n arrivals is uniformly distributed over [0, t], independent of the other arrival times. Therefore, consider one of the n arrivals that occurred in time [0, t]. The probability θ(t) that this entity would have departed before time t can be obtained by conditioning on the time of arrival and unconditioning as
θ(t) = (1/t) ∫_0^t G(t − x) dx = (1/t) ∫_0^t G(u) du.
In addition, the probability that out of the n arrivals in time [0, t], exactly i of those departed before time t is C(n, i)[θ(t)]^i[1 − θ(t)]^{n−i}, where C(n, i) is the binomial coefficient.
Now, to compute the distribution of D(t), we condition on N(t), which is
the number of arrivals in time [0, t]. In order to remind us that we do make
the assumption that X(0) = 0, we include this condition in the expressions.
Therefore, we have
P{D(t) = i | X(0) = 0} = Σ_{n=i}^{∞} P{D(t) = i | N(t) = n, X(0) = 0} e^{−λt} (λt)^n / n!

= Σ_{n=i}^{∞} C(n, i) [θ(t)]^i [1 − θ(t)]^{n−i} e^{−λt} (λt)^n / n!

= ([θ(t)]^i / i!) e^{−λt} (λt)^i Σ_{n=i}^{∞} (λt)^{n−i} [1 − θ(t)]^{n−i} / (n − i)!

= [λtθ(t)]^i e^{−λtθ(t)} / i!.
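In other words, D(t) is Poisson distributed with mean λtθ(t). A numerical sketch (pure Python; the exponential service distribution is only an illustrative choice):

```python
import math

def departure_pmf(lam, G, t, i, n=2000):
    """P{D(t) = i} for an M/G/infinity queue starting empty: Poisson pmf with
    mean lam * t * theta(t), where theta(t) = (1/t) integral_0^t G(u) du."""
    h = t / n
    # Trapezoidal rule for theta(t)
    theta = (h * ((G(0.0) + G(t)) / 2 + sum(G(k * h) for k in range(1, n)))) / t
    m = lam * t * theta
    return math.exp(-m) * m ** i / math.factorial(i)

lam, mu, t = 3.0, 2.0, 1.5
G = lambda u: 1 - math.exp(-mu * u)
print(departure_pmf(lam, G, t, 1))
```

For the exponential G above, θ(t) has the closed form 1 − (1 − e^{−μt})/(μt), which can be used as a sanity check.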
Problem 42
Consider an extension to the M/G/∞ queue. The arrival process is Poisson,
however, the parameter of the Poisson process is time varying. The average
arrival rate at time t (for all t in (−∞, ∞)) is a deterministic function of t rep-
resented as λ(t). Hence, the arrival process is defined as a nonhomogeneous
Poisson process. Everything else is the same as the regular M/G/∞ queue.
We call such a system an Mt /G/∞ queue. Perform transient analysis for this
system.
Solution
This summary of results for the Mt /G/∞ queue is based on Eick, Massey,
and Whitt [27]. Recall that we need to make one of the three assumptions for
initial condition, otherwise we would need to know when service started for
Ge(x) = P{Se ≤ x} = (1/τ) ∫_0^x [1 − G(u)] du.
with E[X(t)] = Var[X(t)] = μ(t), where μ(t) = E[λ(t − Se)]τ. It is important to note that λ(·) is the function defined in this section, and not to make the mistake of thinking that at t = 0 the average number in the system is negative! In fact, observe that the number in the system at any time depends on the arrival rate Se time units ago. Further, the departure process also has a similar time-lag effect, where the average departure rate at time t is E[λ(t − S)], and the resulting process is a nonhomogeneous Poisson process.
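Based on Eick, Massey, and Whitt [27], the mean number in the system is E[X(t)] = τ E[λ(t − Se)]. A numerical sketch of this lag effect follows, with an illustrative sinusoidal λ(·) and exponential service (whose stationary-excess distribution is again exponential); a constant λ recovers E[X(t)] = λτ:

```python
import math

def mean_number(lam_fn, ge_density, tau, t, upper=50.0, n=4000):
    """E[X(t)] = tau * E[lam(t - Se)] for an Mt/G/infinity queue in equilibrium,
    where Se has the stationary-excess density ge(x) = (1 - G(x))/tau.
    Midpoint-rule integration over [0, upper]."""
    h = upper / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        total += lam_fn(t - x) * ge_density(x) * h
    return tau * total

# Exponential service at rate mu: its stationary excess is again exp(mu)
mu = 2.0
tau = 1.0 / mu
ge = lambda x: mu * math.exp(-mu * x)
lam_fn = lambda u: 5.0 + 2.0 * math.sin(u)
print(mean_number(lam_fn, ge, tau, t=3.0))
```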
For u > 0 and any t,

Cov[X(t), X(t + u)] = E[λ(t − ((S − u)⁺)e)] E[(S − u)⁺],

where the notation (y)⁺ denotes max(y, 0) and ((S − u)⁺)e is distributed according to the stationary excess of (S − u)⁺. In fact, this result can also be derived for the homogeneous case (which we have not done earlier but is extremely useful, especially in computer-communication traffic with long-range dependence). For an M/G/∞ queue where λ(t) = λ, it reduces to Cov[X(t), X(t + u)] = λ E[(S − u)⁺].
There are several results for networks of Mt /G/∞ queues. Since the entities
do not interact and departure processes are Poisson, the analysis is fairly
convenient. The reader is referred to Eick et al. [27] as well as the references
therein for further results.
Notice that the methods used to analyze the M/G/∞ system and its extensions are significantly different from the others in this book. In fact, even the related M/G/s/s queue will be analyzed differently in Section 4.5.3, using a multidimensional continuous-state Markov process. To describe that method, we first explain the M/G/1 queue with a special discipline called processor sharing, and then use the same technique for the M/G/s/s queue.
the same as that of the M/G/1 queue with FCFS service discipline since both
queues are work conserving.
Now we model the M/G/1 processor sharing queue to obtain perfor-
mance measures such as distribution of the number in the system and mean
sojourn time. Let X(t) be the number of customers in the system at time t and
Ri (t) be the remaining service time for the ith customer in the system. The
multidimensional stochastic process {(X(t), R1 (t), R2 (t), . . . , RX(t) (t)), t ≥ 0}
satisfies the Markov property (since to predict the future states we only
need the present state and nothing from the past) and hence it is a
Markov process. However, notice that most of the elements in the state
space are continuous, unlike the discrete ones we have seen before. Typ-
ically such Markov processes are difficult to analyze unless they have a
special structure like this one (and the M/G/s/s queue we will see in
Section 4.5.3).
Define Fn(t, y1, y2, . . . , yn) as the following joint probability:

Fn(t, y1, y2, . . . , yn) = P{X(t) = n, R1(t) ≤ y1, R2(t) ≤ y2, . . . , Rn(t) ≤ yn},

with the corresponding density

fn(t, y1, y2, . . . , yn) = ∂^n Fn(t, y1, y2, . . . , yn) / (∂y1 ∂y2 · · · ∂yn).
lim_{h→0} o(h)/h = 0.
fn(t + h, y1, y2, . . . , yn)
= (1 − λh) fn(t, y1 + h/n, . . . , yn + h/n)
+ (1 − λh) Σ_{i=0}^{n} ∫_0^{h/(n+1)} fn+1(t, y1 + h/(n+1), . . . , yi−1 + h/(n+1), y, yi + h/(n+1), . . . , yn + h/(n+1)) dy
+ λh Σ_{i=1}^{n} (G′(yi)/n) fn−1(t, y1 + h/(n−1), . . . , yi−1 + h/(n−1), yi+1 + h/(n−1), . . . , yn + h/(n−1)) + o(h). (4.17)
The preceding equation perhaps deserves some explanation. Since the ser-
vice discipline is processor sharing, if there is yi amount of service remaining
at time t + h, then at time t there would have been yi + h/n service remaining
when there are n customers in the system during time t to t+h. The probabil-
ity that there are no arrivals in a time-interval h units long is (1−λh)+o(h) and
the probability of exactly one arrival is λh + o(h). First consider the case that
there are no new arrivals in time t to t + h, then one of two things could have
happened: no service completions during that time interval (first expression
in the preceding equation) or one service completion such that at time t there
are n + 1 customers and the one with less than h/(n + 1) service remaining
would complete (second expression in the preceding equation). Therefore,
the first term is straightforward, and the second term incorporates, via the integral, the probability of having less than h/(n + 1) service remaining in any of the (n + 1) spots around the n customers at time t + h. The third term considers the case of exactly one arrival. This arrival could have been customer i with workload yi. Notice that G′(yi) is the PDF of the service times at yi; however, this could be any of the n customers with probability 1/n, and hence the summation.
To simplify Equation 4.17, we use the following Taylor-series expansion:

fn(t, y1 + h/n, y2 + h/n, . . . , yn + h/n) = fn(t, y1, y2, . . . , yn) + (h/n) Σ_{i=1}^{n} ∂fn(t, y1, y2, . . . , yn)/∂yi + o(h),
∫_0^{h/(n+1)} fn+1(t, y1 + h/(n+1), . . . , yi−1 + h/(n+1), y, yi + h/(n+1), . . . , yn + h/(n+1)) dy
= (h/(n+1)) fn+1(t, y1, . . . , yi−1, 0, yi, . . . , yn) + o(h).
fn(t + h, y1, y2, . . . , yn)
= (1 − λh) fn(t, y1, y2, . . . , yn) + (1 − λh)(h/n) Σ_{i=1}^{n} ∂fn(t, y1, y2, . . . , yn)/∂yi
+ (1 − λh)(h/(n+1)) Σ_{i=0}^{n} fn+1(t, y1, . . . , yi−1, 0, yi, . . . , yn)
+ λh Σ_{i=1}^{n} (G′(yi)/n) fn−1(t, y1 + h/(n−1), . . . , yi−1 + h/(n−1), yi+1 + h/(n−1), . . . , yn + h/(n−1)) + o(h).
[fn(t + h, y1, y2, . . . , yn) − fn(t, y1, y2, . . . , yn)]/h
= −λ fn(t, y1, y2, . . . , yn)
+ (1 − λh)(1/n) Σ_{i=1}^{n} ∂fn(t, y1, y2, . . . , yn)/∂yi
+ (1 − λh)(1/(n+1)) Σ_{i=0}^{n} fn+1(t, y1, . . . , yi−1, 0, yi, . . . , yn)
+ λ Σ_{i=1}^{n} (G′(yi)/n) fn−1(t, y1 + h/(n−1), . . . , yi−1 + h/(n−1), yi+1 + h/(n−1), . . . , yn + h/(n−1)) + o(h)/h.
Taking the limit as h → 0, we obtain

∂fn(t, y1, y2, . . . , yn)/∂t
= −λ fn(t, y1, y2, . . . , yn) + (1/n) Σ_{i=1}^{n} ∂fn(t, y1, y2, . . . , yn)/∂yi
+ (1/(n+1)) Σ_{i=0}^{n} fn+1(t, y1, . . . , yi−1, 0, yi, . . . , yn)
+ λ Σ_{i=1}^{n} (G′(yi)/n) fn−1(t, y1, . . . , yi−1, yi+1, . . . , yn).
0 = −λ fn(y1, y2, . . . , yn) + (1/n) Σ_{i=1}^{n} ∂fn(y1, y2, . . . , yn)/∂yi
+ (1/(n+1)) Σ_{i=0}^{n} fn+1(y1, . . . , yi−1, 0, yi, . . . , yn)
+ λ Σ_{i=1}^{n} (G′(yi)/n) fn−1(y1, . . . , yi−1, yi+1, . . . , yn).
One way is to solve the balance equations by trying various n values start-
ing from 0. Another way is to find a candidate solution and check if it satisfies
the balance equation. We try the second approach realizing that if we have
a solution it is the unique solution. In particular, we consider the M/M/1
queue with processor sharing. There we can show (left as an exercise for the
reader) that
fn(y1, y2, . . . , yn) = (1 − ρ) λ^n ∏_{i=1}^{n} [1 − G(yi)].
As a first step, we check whether this solution satisfies the balance equations for the M/G/1 processor sharing case. In fact, when fn(y1, y2, . . . , yn) = (1 − ρ)λ^n ∏_{i=1}^{n} [1 − G(yi)], it would imply that

(1/(n+1)) Σ_{i=0}^{n} fn+1(y1, . . . , yi−1, 0, yi, . . . , yn) = λ fn(y1, y2, . . . , yn)

since fn+1(y1, . . . , yi−1, 0, yi, . . . , yn) = (1 − ρ)λ^{n+1}[1 − G(0)] ∏_{i=1}^{n} [1 − G(yi)] = λ fn(y1, y2, . . . , yn), because G(0) = 0. In addition, if fn(y1, y2, . . . , yn) = (1 − ρ)λ^n ∏_{i=1}^{n} [1 − G(yi)], then
∂fn(y1, y2, . . . , yn)/∂yi = −λ G′(yi) fn−1(y1, . . . , yi−1, yi+1, . . . , yn)

since

∂fn(y1, y2, . . . , yn)/∂yi = (1 − ρ)λ^n (−G′(yi)) ∏_{j=1}^{i−1} [1 − G(yj)] ∏_{j=i+1}^{n} [1 − G(yj)]
= −λ (1 − ρ) G′(yi) λ^{n−1} ∏_{j=1}^{i−1} [1 − G(yj)] ∏_{j=i+1}^{n} [1 − G(yj)]
= −λ G′(yi) fn−1(y1, . . . , yi−1, yi+1, . . . , yn).
Thus, fn(y1, y2, . . . , yn) = (1 − ρ)λ^n ∏_{i=1}^{n} [1 − G(yi)] satisfies the balance equation

0 = −λ fn(y1, y2, . . . , yn) + (1/n) Σ_{i=1}^{n} ∂fn(y1, y2, . . . , yn)/∂yi
+ (1/(n+1)) Σ_{i=0}^{n} fn+1(y1, . . . , yi−1, 0, yi, . . . , yn)
+ λ Σ_{i=1}^{n} (G′(yi)/n) fn−1(y1, . . . , yi−1, yi+1, . . . , yn).
Finally, one can verify the normalization

Σ_{n=0}^{∞} ∫_0^∞ ∫_0^∞ · · · ∫_0^∞ fn(y1, y2, . . . , yn) dyn · · · dy2 dy1 = 1

for fn(y1, y2, . . . , yn) = (1 − ρ)λ^n ∏_{i=1}^{n} [1 − G(yi)], and hence it is the steady-state solution.
Now, to obtain the performance measures, let pi denote the steady-state probability that there are i in the system. Since

∫_0^∞ [1 − G(yj)] dyj = ∫_0^∞ yj G′(yj) dyj = 1/μ,

we have

pi = ∫_0^∞ ∫_0^∞ · · · ∫_0^∞ fi(y1, y2, . . . , yi) dyi · · · dy2 dy1 = (1 − ρ)ρ^i.
Notice that this is identical to the number in the system for an M/M/1 queue with FCFS discipline. Thus, L = ρ/(1 − ρ) and W = 1/(μ − λ). It is also possible to obtain the expected conditional sojourn time for a customer arriving in steady state with a workload S as S/(1 − ρ). It uses the fact that the expected number of customers in the system throughout the sojourn time (due to stationarity of the stochastic process) is one plus the average number, that is, 1 + ρ/(1 − ρ) = 1/(1 − ρ). Hence, a workload of S would take S/(1 − ρ) time to complete processing at a processing rate of 1.
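A quick check, with an illustrative ρ, that the geometric distribution pi = (1 − ρ)ρ^i indeed gives L = ρ/(1 − ρ):

```python
# Steady-state number in an M/G/1 processor sharing queue is geometric:
# p_i = (1 - rho) * rho^i, so L = sum_i i * p_i should equal rho/(1 - rho).
rho = 0.6
L = sum(i * (1 - rho) * rho ** i for i in range(1, 500))  # truncated sum
print(L, rho / (1 - rho))
```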
call, this amounts to an arrival to the switch. If the caller hears a dial tone, it
means a line is available and the caller punches the number he or she wishes
to call. If a line is not available, the caller would get a tone stating all lines
are busy (these are also quite common in cellular phones where messages
such as “the network is busy” are received). The telephone switch has s lines
and each line is held for a random time S by a caller and this time is also
frequently known as holding times. The pioneering work by A. K. Erlang
resulted in the computation of the blocking probability (or the probability a
potential caller is rejected).
For this, let X(t) be the number of customers in the system at time
t and Ri (t) be the remaining service time at the ith busy server. The
multidimensional stochastic process {(X(t), R1 (t), R2 (t), . . . , RX(t) (t)), t ≥ 0}
satisfies the Markov property (since to predict the future states we only need
the present state and nothing from the past) and hence it is a Markov pro-
cess. It is worthwhile to make two observations here. First of all, this analysis
is almost identical to that of the M/G/1 processor sharing queue seen in the
previous section. Some of the terms used here such as o(h) have been defined
in that section and the reader is encouraged to go over that. Second, it is
possible to model the system via the remaining service time in each of the s servers. However, additional constraints on whether or not each server is busy impose more bookkeeping. Hence, we just stick to the X(t) busy servers at time t, with the understanding that the alternative formulation could also be used.
Define Fn(t, y1, y2, . . . , yn) as the following joint probability:

Fn(t, y1, y2, . . . , yn) = P{X(t) = n, R1(t) ≤ y1, R2(t) ≤ y2, . . . , Rn(t) ≤ yn},

with density

fn(t, y1, y2, . . . , yn) = ∂^n Fn(t, y1, y2, . . . , yn) / (∂y1 ∂y2 · · · ∂yn).
fn(t + h, y1, y2, . . . , yn)
= (1 − λh) fn(t, y1 + h, y2 + h, . . . , yn + h)
+ (1 − λh) Σ_{i=0}^{n} ∫_0^h fn+1(t, y1 + h, . . . , yi−1 + h, y, yi + h, . . . , yn + h) dy
+ λh Σ_{i=1}^{n} (G′(yi)/n) fn−1(t, y1 + h, . . . , yi−1 + h, yi+1 + h, . . . , yn + h) + o(h). (4.18)
fn(t, y1 + h, y2 + h, . . . , yn + h) = fn(t, y1, y2, . . . , yn) + h Σ_{i=1}^{n} ∂fn(t, y1, y2, . . . , yn)/∂yi + o(h),

and

∫_0^h fn+1(t, y1 + h, . . . , yi−1 + h, y, yi + h, . . . , yn + h) dy = h fn+1(t, y1, . . . , yi−1, 0, yi, . . . , yn) + o(h).
fn(t + h, y1, y2, . . . , yn)
= (1 − λh) fn(t, y1, y2, . . . , yn) + (1 − λh) h Σ_{i=1}^{n} ∂fn(t, y1, y2, . . . , yn)/∂yi
+ (1 − λh) h Σ_{i=0}^{n} fn+1(t, y1, . . . , yi−1, 0, yi, . . . , yn)
+ λh Σ_{i=1}^{n} (G′(yi)/n) fn−1(t, y1 + h, . . . , yi−1 + h, yi+1 + h, . . . , yn + h) + o(h).
[fn(t + h, y1, y2, . . . , yn) − fn(t, y1, y2, . . . , yn)]/h
= −λ fn(t, y1, y2, . . . , yn) + (1 − λh) Σ_{i=1}^{n} ∂fn(t, y1, y2, . . . , yn)/∂yi
+ (1 − λh) Σ_{i=0}^{n} fn+1(t, y1, . . . , yi−1, 0, yi, . . . , yn)
+ λ Σ_{i=1}^{n} (G′(yi)/n) fn−1(t, y1 + h, . . . , yi−1 + h, yi+1 + h, . . . , yn + h) + o(h)/h.
Taking the limit as h → 0, we obtain

∂fn(t, y1, y2, . . . , yn)/∂t
= −λ fn(t, y1, y2, . . . , yn) + Σ_{i=1}^{n} ∂fn(t, y1, y2, . . . , yn)/∂yi
+ Σ_{i=0}^{n} fn+1(t, y1, . . . , yi−1, 0, yi, . . . , yn)
+ λ Σ_{i=1}^{n} (G′(yi)/n) fn−1(t, y1, . . . , yi−1, yi+1, . . . , yn).
that. Since the stochastic process {X(t), R1 (t), R2 (t), . . . , RX(t) (t)} is a stable
Markov process, in steady-state the stochastic process converges to a station-
ary process. In other words as t → ∞, we have ∂fn (t, y1 , y2 , . . . , yn )/∂t = 0 and
fn (t, y1 , y2 , . . . , yn ) converges to the stationary distribution fn (y1 , y2 , . . . , yn ),
that is, fn (t, y1 , y2 , . . . , yn ) → fn (y1 , y2 , . . . , yn ). Therefore, from the preceding
equation as we let t → ∞, we get the following balance equation:
0 = −λ fn(y1, y2, . . . , yn) + Σ_{i=1}^{n} ∂fn(y1, y2, . . . , yn)/∂yi
+ Σ_{i=0}^{n} fn+1(y1, . . . , yi−1, 0, yi, . . . , yn)
+ λ Σ_{i=1}^{n} (G′(yi)/n) fn−1(y1, . . . , yi−1, yi+1, . . . , yn).
fn(y1, y2, . . . , yn) = K (λ^n/n!) ∏_{i=1}^{n} [1 − G(yi)],
Σ_{i=0}^{n} fn+1(y1, . . . , yi−1, 0, yi, . . . , yn) = λ fn(y1, y2, . . . , yn)

since fn+1(y1, . . . , yi−1, 0, yi, . . . , yn) = K [λ^{n+1}/(n + 1)!][1 − G(0)] ∏_{i=1}^{n} [1 − G(yi)] = [λ/(n + 1)] fn(y1, y2, . . . , yn) with G(0) = 0. In addition, if fn(y1, y2, . . . , yn) = K (λ^n/n!) ∏_{i=1}^{n} [1 − G(yi)], then

∂fn(y1, y2, . . . , yn)/∂yi = −(λ/n) G′(yi) fn−1(y1, . . . , yi−1, yi+1, . . . , yn)
since

∂fn(y1, y2, . . . , yn)/∂yi = K (λ^n/n!)(−G′(yi)) ∏_{j=1}^{i−1} [1 − G(yj)] ∏_{j=i+1}^{n} [1 − G(yj)]
= −(λ/n) K [λ^{n−1}/(n − 1)!] G′(yi) ∏_{j=1}^{i−1} [1 − G(yj)] ∏_{j=i+1}^{n} [1 − G(yj)]
= −(λ/n) G′(yi) fn−1(y1, . . . , yi−1, yi+1, . . . , yn).
Thus, fn(y1, y2, . . . , yn) = K (λ^n/n!) ∏_{i=1}^{n} [1 − G(yi)] satisfies the balance equation

0 = −λ fn(y1, y2, . . . , yn) + Σ_{i=1}^{n} ∂fn(y1, y2, . . . , yn)/∂yi
+ Σ_{i=0}^{n} fn+1(y1, . . . , yi−1, 0, yi, . . . , yn)
+ λ Σ_{i=1}^{n} (G′(yi)/n) fn−1(y1, . . . , yi−1, yi+1, . . . , yn).
Σ_{n=0}^{s} ∫_0^∞ ∫_0^∞ · · · ∫_0^∞ fn(y1, y2, . . . , yn) dyn · · · dy2 dy1 = 1

for fn(y1, y2, . . . , yn) = K (λ^n/n!) ∏_{i=1}^{n} [1 − G(yi)] to get

K = 1 / [Σ_{j=0}^{s} (1/j!)(λ/μ)^j].
Since

∫_0^∞ [1 − G(yj)] dyj = ∫_0^∞ yj G′(yj) dyj = 1/μ,

we have

pi = ∫_0^∞ ∫_0^∞ · · · ∫_0^∞ fi(y1, y2, . . . , yi) dyi · · · dy2 dy1 = K λ^i/(i! μ^i).
Therefore, for 0 ≤ i ≤ s,

pi = [(λ/μ)^i / i!] / [Σ_{k=0}^{s} (λ/μ)^k / k!].
The Erlang loss formula due to A. K. Erlang is the probability that an arriving customer is rejected (or the fraction of arriving customers that are lost in steady state) and is given by

ps = [(λ/μ)^s / s!] / [Σ_{i=0}^{s} (λ/μ)^i / i!].
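In practice this probability is usually computed with a numerically stable recursion rather than the factorial form directly. A sketch (with an illustrative offered load a = λ/μ):

```python
import math

def erlang_b(s, a):
    """Erlang loss formula p_s = (a^s/s!)/sum_{i=0}^s a^i/i!, computed via the
    standard recursion B(0) = 1, B(k) = a B(k-1) / (k + a B(k-1))."""
    B = 1.0
    for k in range(1, s + 1):
        B = a * B / (k + a * B)
    return B

a, s = 8.0, 10   # offered load a = lambda/mu, and s lines
ps = erlang_b(s, a)
print(ps)            # blocking probability
print(a * (1 - ps))  # L, the mean number of busy lines
```

The recursion avoids overflow in s! and a^s for large systems, which is why it is preferred over evaluating the formula term by term.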
Notice that the distribution of the number in the system in steady state for
an M/G/s/s queue does not depend on the distribution of the service time.
Using the steady-state number in the system, we can derive
L = (λ/μ)(1 − ps).
Since the effective entering rate into the system is λ(1 − ps), we get W = 1/μ. This is intuitive: since there is no waiting for service for customers that enter the system, the average sojourn time is indeed the average service time.
For the same reason, the sojourn time distribution for customers that enter
the system is same as that of the service time. In addition, since there is no
waiting for service, Lq = 0 and Wq = 0. We conclude by making a remark
without proof.
Remark 9
in steady state is in fact a reversible process. In simple terms, that means if the
process is recorded and viewed backward, it would be stochastically identi-
cal to running it forward. One of the artifacts of reversibility is the existence
of product-form solutions such as the expression for fn (y1 , y2 , . . . , yn ). Further,
because of reversibility, the departures from the original system correspond
to arrivals in the reversed system. Therefore, the departure process from the
M/G/s/s queue is a Poisson process with rate (1 − ps )λ departures per unit
time on average.
Reference Notes
Unlike most of the other chapters in this book, this chapter is a hodgepodge
of techniques applied to a somewhat common theme of nonexponential
interarrival and/or service times. We start with DTMC methods, then gravi-
tate toward MVA, develop bounds and approximations, then present CTMC
models, and finally, some special-purpose models. For that reason, it has
been difficult to present the complete details of all the methods. In that light, we have provided references along with the descriptions so that the readers can immediately get to the source to find out the missing steps. These
include topics such as G/G/s and PH/PH/s queues, phase type distribu-
tions and fitting, M/G/∞ queue, M/G/s/s queue, and M/G/1 with processor
sharing. Leaving out some of the details was a difficult decision to make con-
sidering that most textbooks on queues also typically leave those out. But
perhaps there is a good reason for doing so. Nevertheless, thanks are due to Prof. Don Towsley's class notes for all the details on M/G/1 processor sharing queues, which were immensely useful here.
The approximations and bounds presented in this chapter using MVA are
largely due to Buzacott and Shanthikumar [15]. All the empirical approxima-
tions are from Bolch et al. [12]. However, topics such as M/G/1 and G/M/1
queues have been treated in a similar vein as Gross and Harris [49]. Many
of the results presented on those topics have also been heavily influenced by Kulkarni [67]. Many of these results are explained far more crisply and succinctly in Wolff [108]. Further, there is a rich literature on using fluid and diffusion approximations as well as methodologies to obtain tail distributions. The main reason for
leaving them out in this chapter is that those techniques lend themselves
Exercises
4.1 For a stable M/G/1 with FCFS service, derive the average sojourn
time in the system
W = 1/μ + λ(σ² + 1/μ²) / (2(1 − ρ))
the server begins service. Successive server vacations are IID ran-
dom variables. Let ψ(z) be the generating function of the number
of arrivals during a vacation (the vacation length may depend upon
the arrival process during the vacation). Let Xn be the number of
customers in the system after the nth service completion. Show
that {Xn , n ≥ 0} is a DTMC by describing the transition probability
matrix. Then:
(a) Show that the system is stable if ρ = λ/μ < 1.
(b) Assuming that ρ < 1, show that the generating function φ(z) of
the steady-state distribution of Xn is given by

φ(z) = [(1 − ρ) G̃(λ − λz) (ψ(z) − 1)] / [m (z − G̃(λ − λz))],

and that the mean number in the system is

L = ρ + ρ²(1 + σ²μ²) / [2(1 − ρ)] + m^(2)/(2m),

where m = ψ′(1) and m^(2) = ψ″(1) are the mean and the second factorial
moment of the number of arrivals during a vacation.
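To see the decomposition at work numerically, here is a small sketch (the helper name and example values are ours): it adds the vacation term m^(2)/(2m) to the Pollaczek–Khintchine mean.

```python
def mg1_vacation_L(lam, mu, sigma2, m, m2):
    """Mean number in system for an M/G/1 queue with server vacations,
    using the decomposition L = L_PK + m2/(2*m).

    sigma2 -- variance of the service time
    m, m2  -- mean (psi'(1)) and second factorial moment (psi''(1)) of
              the number of arrivals during a vacation
    """
    rho = lam / mu
    if rho >= 1:
        raise ValueError("unstable")
    L_pk = rho + rho**2 * (1 + sigma2 * mu**2) / (2 * (1 - rho))
    return L_pk + m2 / (2 * m)

# Hypothetical example: exponential service, vacations generating on
# average 1 arrival with second factorial moment 1
print(mg1_vacation_L(0.5, 1.0, 1.0, 1.0, 1.0))  # -> 1.5
```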
4.7 Consider a G/M/1 queue with interarrival distribution
where 0 < r < 1 and λi > 0 for i = 1, 2. Find the stability condition and
derive an expression for pj , the steady-state probability that there are
j customers in the system.
4.8 A service station is staffed with two identical servers. Customers
arrive according to a PP(λ). The service times are IID exp(μ). Con-
sider the following two operational policies used to maintain two
separate queues:
(a) Every customer is randomly assigned to one of the two servers
with equal probability.
(b) Customers are alternately assigned to the two servers.
Compute the expected number of customers in the system in steady
state for both cases. Which operating policy is better?
4.9 Requests arrive to a web server according to a renewal process. The
interarrival times are according to an Erlang distribution with mean
10 s and standard deviation √50 s. Assume that there is infinite
waiting room available for the requests to wait before being pro-
cessed by the server. The processing time (i.e., service time) for the
server is according to a Pareto distribution with CDF
G(x) = 1 − (K/x)^β , if x ≥ K.
The mean service time is Kβ/(β − 1), if β > 1, and the variance of the
service time is K2 β/[(β − 1)2 (β − 2)] if β > 2. Use K = 5 and β = 2.25
so that the mean and standard deviation of the service times are
9 and 12 s, respectively. Using the results for the G/G/1 queue,
obtain bounds as well as approximations for the average response
time (i.e., waiting time in the system including service) for an arbi-
trary request in the long run. Pick any bound or approximation from
the ones given in this chapter.
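For concreteness, the exercise's numbers can be plugged into one standard choice: Kingman's upper bound on the G/G/1 mean queueing delay and a common two-moment approximation (variable names are ours; any of the bounds or approximations from the chapter could be substituted).

```python
# Exercise 4.9 data: Erlang interarrivals (mean 10 s, variance 50 s^2),
# Pareto service (mean 9 s, variance 144 s^2)
lam = 1 / 10.0               # arrival rate (per second)
mean_A, var_A = 10.0, 50.0   # interarrival mean and variance
ES, var_S = 9.0, 144.0       # service time mean and variance
rho = lam * ES               # traffic intensity = 0.9

# Kingman's upper bound on the mean wait in queue for a G/G/1 queue
Wq_bound = lam * (var_A + var_S) / (2 * (1 - rho))

# A common two-moment approximation: scale the M/M/1 delay by (Ca^2+Cs^2)/2
Ca2, Cs2 = var_A / mean_A**2, var_S / ES**2
Wq_approx = (Ca2 + Cs2) / 2 * rho * ES / (1 - rho)

print(ES + Wq_bound)   # bound on mean response time W   -> 106.0 s
print(ES + Wq_approx)  # approximate mean response time  -> 101.25 s
```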
4.10 Consider a stable M/G/1 queue with PP(λ) arrivals and G̃(w) as the
LST of the CDF of the service times such that G̃′(0) = −1/μ. Write
down the LST of the interdeparture times (between two successive
departures picked arbitrarily in steady state) in terms of λ, μ, and
G̃(w). (Note that w is used instead of s in the LST to avoid confusion
with S, the service time.)
4.11 Consider a stable G/M/1 queue with traffic intensity ρ and parame-
ter α, which is a solution to α = G̃(μ − μα), where G(t) is the CDF of
the interarrival time and μ is the mean service rate. Derive an expres-
sion for the generating function Φ(z) of the number of entities in the
system in steady state as a closed-form expression in terms of ρ, α,
and z.
4.12 Answer the following multiple choice questions:
(i) For a stable G/M/1 queue with G̃(s) being the LST of the inter-
arrival time CDF and service rate μ, which of the following
statements are not true?
(a) There is a unique solution for α in (0, 1) to the equation
α = G̃(μ − μα).
(b) In steady state, the time spent by an arbitrary arrival in the
system before service begins is exponentially distributed.
(c) If G̃(s) = λ/(λ + s), then the average total time in the system
(i.e., W) is 1/(μ − λ).
(d) The fraction of time the server is busy in the long run is
−1/[G̃′(0)μ].
(ii) Consider a stable M/G/1 queue and the notation given in this
chapter. Which of the following statements are not true?
W = τ + σ²_A / [2(a − τ)].
(b) Consider a stable M/M/1 queue that uses processor sharing dis-
cipline. Arrivals are according to PP(λ) and it would take exp(μ)
time to process an entity if it were the only one in the system. Is
the following statement TRUE or FALSE? The average workload
in the system at an arbitrary point in steady state is λ/[μ(μ − λ)].
(c) The Pollaczek–Khintchine formula to compute L in M/G/1
queues requires the service discipline to be FCFS. TRUE or
FALSE?
4.14 Compare an M/E2 /1 and E2 /M/1 queue’s L values for the case when
both queues have the same ρ. The term E2 denotes an Erlang-2
distribution.
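A sketch of the comparison (both helpers are ours): the M/E2/1 mean comes from the Pollaczek–Khintchine formula, while the E2/M/1 mean uses L = ρ/(1 − α) after solving the fixed-point equation α = G̃(μ − μα) by bisection.

```python
def me21_L(lam, mu):
    """Mean number in system for M/E2/1 via Pollaczek-Khintchine
    (Erlang-2 service has squared coefficient of variation 1/2)."""
    rho = lam / mu
    return rho + rho**2 * (1 + 0.5) / (2 * (1 - rho))

def e2m1_L(lam, mu, tol=1e-12):
    """Mean number in system for E2/M/1 (Erlang-2 interarrivals).

    Solves alpha = Gtilde(mu - mu*alpha) with Gtilde(s) =
    (2*lam/(2*lam + s))**2 by bisection, then uses L = rho/(1 - alpha).
    """
    rho = lam / mu
    assert rho < 1
    g = lambda a: (2 * lam / (2 * lam + mu - mu * a)) ** 2 - a
    lo, hi = 0.0, 1.0 - 1e-9   # g > 0 at lo, g < 0 just below 1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if g(mid) > 0 else (lo, mid)
    return rho / (1 - (lo + hi) / 2)

lam = 0.8
print(me21_L(lam, 1.0), e2m1_L(lam, 1.0))  # E2/M/1 has the smaller L
```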
4.15 Which is better: an M/G/1 queue with PP(λ) and processing speed of
1 or one with PP(2λ) and processing speed 2? By processing speed,
we mean the amount of work the server can process per unit time,
so if there is x amount of work brought by an arrival and the pro-
cessing speed is c, then the service time would be x/c. Assume that
the amount of work has a finite mean and finite variance. Also, use
mean sojourn time to compare the two systems.
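The two systems can be compared directly with the Pollaczek–Khintchine formula; the sketch below (our own helper, with arbitrary hypothetical work moments) shows the outcome that the pooled faster server halves the mean sojourn time.

```python
def mg1_W_with_speed(lam, EX, EX2, c):
    """Mean sojourn time in an M/G/1 queue where each arrival brings work X
    (mean EX, second moment EX2) and the server works at speed c, so the
    service time is X/c."""
    ES, ES2 = EX / c, EX2 / c**2
    rho = lam * ES
    assert rho < 1
    return ES + lam * ES2 / (2 * (1 - rho))

lam, EX, EX2 = 0.4, 1.0, 3.0                 # hypothetical work moments
W1 = mg1_W_with_speed(lam, EX, EX2, 1)       # PP(lam), speed 1
W2 = mg1_W_with_speed(2 * lam, EX, EX2, 2)   # PP(2*lam), speed 2
print(W1, W2)  # W2 = W1/2: the faster shared server is better
```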
4.16 Consider an M/G/1 queue with mean service time 1/μ and variance
1/(3μ2 ). The interarrival times are exponentially distributed with
mean 1/λ and the service times are according to an Erlang distri-
bution. Let Wn and Sn , respectively, be the time in the system and
service time for the nth customer. Define a random variable called
slowdown for customer n as Wn /Sn . Compute the mean and variance
of slowdown for a customer arriving in steady state, that is, compute
E[Wn /Sn ] and Var[Wn /Sn ] as n → ∞. Assume stability. Note that the
term x-factor defined in the exercises of Chapter 1 is E[Wn ]/E[Sn ],
whereas the mean slowdown is E[Wn /Sn ].
4.17 Consider a G/M/2 queue which is stable. Obtain the cumulative
distribution function (CDF) of the time in the system for an arbi-
trary customer in steady state by first deriving its Laplace Stieltjes
transform (LST) and then inverting it. Also, based on it, derive
expressions for W and thereby L.
4.18 Consider an M/G/1 queue with mean service time 1/μ and second
moment of service time E[S2 ]. In addition, after each service comple-
tion, the server takes a vacation of random length V with probability
q or continues to serve other units in the queue with probability
p (clearly p = 1 − q). However, the server always takes a vacation
of length V as soon as the system is empty; at the end of it, the
server starts service, if there is any unit waiting, and otherwise it
waits for units to arrive. Let Ṽ(s) = E[e−sV ] be the LST of the vaca-
tion time distribution. Use MVA to derive the following results for
p0 , the long-run probability the system is empty, and L, the long-run
average number of units in the system:
p0 = [1 − (λ/μ) − qλE(V)] / [Ṽ(λ) + pλE(V)],

L = λ/μ + (λ²/2) [E[S²] + qE(V²) + (2q/μ)E(V)] / [1 − (λ/μ) − λqE(V)]
    + (λ²/2) pE(V²) / [Ṽ(λ) + pλE(V)].
Assume stability.
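The stated results are straightforward to evaluate numerically; the sketch below (the helper name and the exponential-vacation example are ours) simply plugs into the formulas above.

```python
def mg1_bernoulli_vacation(lam, mu, ES2, q, EV, EV2, V_lst_at_lam):
    """p0 and L for the M/G/1 queue of Exercise 4.18, plugging into the
    stated MVA results.

    ES2          -- second moment of the service time
    q            -- probability of a vacation after a service (p = 1 - q)
    EV, EV2      -- first two moments of the vacation length V
    V_lst_at_lam -- Vtilde(lam) = E[exp(-lam*V)]
    """
    p = 1 - q
    p0 = (1 - lam / mu - q * EV * lam) / (V_lst_at_lam + p * EV * lam)
    L = (lam / mu
         + (lam**2 / 2) * (ES2 + q * EV2 + (2 * q / mu) * EV)
           / (1 - lam / mu - lam * q * EV)
         + (lam**2 / 2) * p * EV2 / (V_lst_at_lam + p * EV * lam))
    return p0, L

# Hypothetical example: exp(1) service (ES2 = 2) and exp(5) vacations
theta = 5.0
p0, L = mg1_bernoulli_vacation(0.5, 1.0, 2.0, 0.5,
                               1 / theta, 2 / theta**2,
                               theta / (theta + 0.5))
print(p0, L)
```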
4.19 Let X(t) be the number of entities in the system in an M/M/1 queue
with processor sharing. The arrival rate is λ and the amount of ser-
vice requested is according to exp(μ) so that the traffic intensity is
ρ = λ/μ. Model {X(t), t ≥ 0} as a birth and death process and obtain
the steady-state distribution. Using that and the properties of the
lim_{t→∞} Fn (t, y1 , y2 , . . . , yn ),

where

and Ri (t) is the remaining service for the ith customer in the system.
From that result, show that

fn (y1 , y2 , . . . , yn ) = (1 − ρ) λ^n ∏_{i=1}^{n} [1 − G(yi )],
5 Multiclass Queues under Various Service Disciplines

In the models considered in the previous chapters, note that there was only
a single class of customers in the system. However, there are several
applications where customers can be differentiated into classes and each class has
its own characteristics. For example, consider a hospital emergency ward.
The patients can be classified into emergency, urgent, and regular cases with
varying arrival rates and service requirements. The question to ask is: In
what order should the emergency ward serve its patients? It is natural to
give highest priority to critical cases and serve them before others. But how
does that impact the quality of service for each class? We seek to address
such questions in this chapter. We begin by describing some introductory
remarks, then evaluate performance measures of various service disciplines,
and finally touch upon the notion of optimal policies to decide the order
of service.
5.1 Introduction
The scenario considered in this chapter is an abstract system into which
multiple classes of customers enter, get “served,” and depart. Why are
there multiple classes? First, the system might be naturally classified into
various classes because there are inherently different items requiring their
own performance measures (e.g., in a flexible manufacturing system, if a
machine produces three different types of parts, and it is important to mea-
sure the in-process inventory of each of them individually, then it makes
sense to model the system using three classes). Second, when the service
times are significantly different for different customers, then it might be ben-
eficial to classify the customers based on service times (e.g., in most grocery
stores there are special checkout lines for customers that have fewer items).
Third, due to physical reasons of where the customers arrive and wait, it
might be practical to classify customers (e.g., at fast-food restaurants cus-
tomers can be classified as drive-through and in-store depending on where
they arrive).
Next, given that there are multiple classes, how should the customers
be served? In particular, we need to determine a service discipline to serve
the customers. For that one typically considers measures such as cost, fair-
ness, performance, physical constraints, goodwill, customer satisfaction,
etc. Although we will touch upon the notion of optimal service disciplines
toward the end of this chapter, we will assume until that point that we have
a system with a given service discipline and evaluate the performance expe-
rienced by each class in that system. This is with the understanding that in
many systems one may be restricted to using a particular type of service pol-
icy. To get a better feel for that, in the next section we present examples of
several systems with multiple classes. Before proceeding with that, as a final
comment, it is worth mentioning that this work typically falls in the literature
under the umbrella of stochastic scheduling.
These examples are meant to motivate the reader but are by no means an
indication of the full variety of circumstances in which multiclass queueing systems
occur. While presenting various service disciplines, we will describe some
more examples to put things in better perspective. In fact, many of our fol-
lowing examples would fall under service systems (such as hospitals), which
is another application domain that has received a lot of attention recently.
Next, we present some results that are applicable to any multiclass queueing
system with “almost” any reasonable service discipline.
Li = λi Wi .
Of course we can aggregate across all K classes of customers and state the
usual Little’s law as
L = λW
where
L is the total number of customers in the system on average in steady state
across all classes
W is the sojourn time averaged over all customers (of all classes)
λ is the aggregate arrival rate
Multiclass Queues under Various Service Disciplines 245
Since the system is flow conserving and class switching is not permitted,
we have
λ = λ1 + λ2 + · · · + λK ,
L = L1 + L2 + · · · + LK .
Li = λi Wi .
Also, similar results can be derived for Liq and Wiq which, respectively,
denote the average number waiting in queue (not including customers at
servers) and average time spent waiting before service. In particular, for all
i ∈ [1, K]
Liq = λi Wiq ,
Wi = Wiq + 1/μi ,
Li = Liq + ρi
where ρi = λi /μi .
Note that ρi here is not the traffic intensity offered by class i when s > 1.
But we will mainly consider the case s = 1 for most of this chapter which
would result in ρi being the traffic intensity and will use the previous results.
In addition, L and W are the overall mean number of customers and mean
sojourn time averaged over all classes. Recall that L = L1 + L2 + · · · + LK
and if λ = λ1 + λ2 + · · · + λK , the net arrival rate, then W = L/λ. For the
G/G/1 case with multiple classes, more results can be derived for a spe-
cial class of scheduling policies called work-conserving disciplines which we
describe next.
FIGURE 5.1
W(t) vs. t for FCFS and LCFS-PR. (The figure shows sample paths of the workload W(t) with arrival times A1 , A2 , A3 , service requirements S1 , S2 , S3 , and the resulting departure times D1 , D2 , D3 under each discipline.)
decreases at unit rate. A sample path of the workload in the system at time t,
W(t), is described in Figure 5.1 with An denoting the time of the nth arrival
and Sn its service time requirement for n = 1, 2, 3. The figure gives depar-
ture times Dn for the nth arriving customer using two service disciplines,
FCFS and LCFS-PR (with preemptive resume) across all classes. Note that
although the departure times are different under the two service disciplines,
the workload W(t) is identical for all t. In other words, the workload W(t)
at time t is conserved. But that does not always happen. If the server were
to idle when W(t) > 0 or if we considered LCFS with preemptive repeat
(and Sn is not exponentially distributed), the workload would not have been
conserved.
In this and the next section, we only consider the class of service dis-
ciplines that result in the workload being conserved. The essence of work-
conserving disciplines is that the system workload at every instant of time
remains unchanged over all work-conserving service-scheduling disciplines.
Intuitively this means that the server never idles whenever there is work
to do, and the server does not do any wasteful work. The server continu-
ously serves customers if there are any in the system. For example, FCFS,
LCFS, and ROS are work conserving. Certain priority policies that we will
see later such as nonpreemptive and preemptive resume policies are also
work conserving. Further, disciplines such as processor sharing, shortest
expected processing time first, and round-robin policies are also work con-
serving when the switch-over times are zero. There are policies that are
nonwork conserving such as preemptive repeat (unless the service times
248 Analysis of Queues
are exponential) and preemptive identical (i.e., the exact service time is
repeated as opposed to preemptive repeat where the service time is resam-
pled). Usually when the server takes a vacation from service or if there
is a switch-over time (or setup time) during moving from classes, unless
those can be explicitly accounted for in the service times, those servers are
nonwork conserving.
Having described the concept of work conservation, next we present
some results for queues with such service disciplines. Note that across all
work-conserving service-scheduling disciplines, not only is W(t) identical
for all t, but also the busy period and idle time sample paths are identical.
Therefore, all the results that depend only on the busy period distribution
can be derived for all work-conserving disciplines. We present some of those
next. Consider the notation used earlier in this section as well as those in
Section 5.1.2. Define ρ, the overall traffic intensity, as

ρ = ∑_{i=1}^{K} ρi .

The queue is stable if

ρ < 1.
This result can be derived using the fact that ρ is just the overall arrival
rate times the average service time across all classes. It is also equal to the
ratio of the mean busy period to the mean busy period plus idle period. The
busy period and idle period are identical across all work-conserving disci-
plines, and we know the previous result works for the single class G/G/1
queue with FCFS. Therefore, the FCFS across all classes is essentially a single
class FCFS with traffic intensity ρ. This result works for all work-conserving
service-scheduling disciplines. In a similar manner, we can show that when
a G/G/1 system is work conserving, the probability that the system is
empty is 1 − ρ.
In the next section, we describe a few more results for multi-class G/G/1
queues with work-conserving scheduling disciplines. However, we require
an additional condition that rules out some of the work-conserving schemes.
For that we also need additional notation. Let Si be the random variable
denoting the service time of a class-i customer (this is different from Sn
we defined earlier which is the service time realization of the nth arriving
customer). It is also crucial to point out that since the service times for all
customers of particular class are IID, we use a generic random variable, such
as Si for class i. Using that notation, the second moment of the overall service
time is
E[S²] = (1/λ) ∑_{i=1}^{K} λi E[Si²].
Remark 10
We now explain the method used here. For that the first step is to ensure
that the stochastic system is stable. That can be checked fairly quickly; in
fact for the multiclass G/G/1 queue, all we need to check is if ρ < 1. If
the system is stable, the key idea is to observe the system at an arbitrary
time in steady state. Then the observation probabilities correspond to the
steady-state probabilities. For example, in a stable G/G/1 queue with mul-
tiple classes, the probability that an arbitrary observation in steady state
would result in an empty system is 1 − ρ, the steady-state probability of
having no customers in the system. Further, if the system is stationary and
ergodic, then the observation just needs to be made at an arbitrary time and
the results would also indicate time-averaged behavior. Thus, the expected
number in the system during this arbitrary observation in steady state is L
(the steady-state mean number in the system). This should not be confused
with customer-stationary process results such as Poisson arrivals see time
averages (PASTA). Note that here we are considering just one observation
and the observation does not correspond to a customer arrival or departure.
For the next step of the analysis, recall that our system is a G/G/1 queue
with K classes and a service discipline that is work conserving with at most
one partially completed service allowed. We can divide this system into two
subsystems, one the waiting area and the other the service area. The work-
load in the system at any time is the sum of the workload in the service area
and that in the waiting area. Note that all arriving customers go to the wait-
ing area (although they may spend zero time there), then go to the service
area and exit the system (without going back to the waiting area). With that
in mind, we present two results that are central to such special cases of work-
conserving disciplines. These results were not presented for the single class
case (but they are easily doable by letting the number of classes K = 1). We
present these results as two problems.
Problem 43
Consider a G/G/1 queue with K classes and a service discipline that is
work conserving with at most one partially completed service allowed.
Assume that the queue is stable, that is, ρ < 1. If the system is observed
at an arbitrary time in steady state, then show that for that observation,
the expected workload at the server (i.e., expected remaining service time)
is λE[S2 ]/2.
Solution
Let C be the class of the customer in service when the observation is made at
an arbitrary time in steady state with C = 0 implying there is no customer
in service. Also, let R be the remaining service time when this observa-
tion is made (again R is indeed the workload at the server). We need to
compute E[R].
Note that P{C = 0} = 1 − ρ and for i = 1, . . . , K, P{C = i} = ρi .
Although P{C = 0} has been mentioned earlier in this section, P{C = i}
deserves some explanation. First consider P{C = i|C > 0}, which would
be the probability that, if all the service times were arranged back to back
at the server and an arbitrary time point was picked, the point would fall
within a class-i service time. For this, consider a long stick made up of ni
small sticks of random lengths sampled from Si (the service times of class i)
for i = 1, . . . , K. If a point is selected uniformly on this stick, then the point
would be on a class-i small stick with probability ni E[Si ]/(∑j nj E[Sj ]).
The number of sticks ni corresponds to the number of class-i customers
sampled. As the sample size grows, ni /∑j nj → λi /∑j λj
E[R|C = i] = E[Si²] / (2E[Si ]).
Unconditioning, we get

E[R] = ∑_{i=0}^{K} E[R|C = i] P{C = i}

     = 0 + ∑_{i=1}^{K} [E[Si²]/(2E[Si ])] ρi

     = ∑_{i=1}^{K} [E[Si²]/(2E[Si ])] λi E[Si ]

     = ∑_{i=1}^{K} λi E[Si²]/2

     = λE[S²]/2,

since E[S²] = (1/λ) ∑_{i=1}^{K} λi E[Si²].
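The chain of equalities above can be verified numerically for any class data; the two-class moments below are hypothetical.

```python
# Numerical check of E[R] = lam * E[S^2] / 2 for a multiclass queue
# (hypothetical two-class data: rates and service-time moments)
lams = [0.2, 0.3]
ES = [1.0, 0.5]     # E[S_i]
ES2 = [2.0, 0.5]    # E[S_i^2]

rho = [l * s for l, s in zip(lams, ES)]
# condition on the class in service, then uncondition over C
ER = sum(r * m2 / (2 * m1) for r, m1, m2 in zip(rho, ES, ES2))

lam = sum(lams)
ES2_agg = sum(l * m2 for l, m2 in zip(lams, ES2)) / lam
print(ER, lam * ES2_agg / 2)  # the two quantities agree
```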
In the previous problem, we did not use all the power of work conserva-
tion. In particular, the workload at some time t, W(t) is conserved over all
work-conserving disciplines. The next problem discusses the second subsys-
tem, namely the waiting area, and uses the fact that W(t) does not depend
on the service discipline.
Problem 44
Consider a stable G/G/1 queue with K classes and a service discipline that is
work conserving with at most one partially completed service allowed. Let
Wiq be the average waiting time in the queue (not including service) for a
class-i customer. Let the system be observed at an arbitrary time in steady
state. Show that the expected workload in the waiting area (not including
any at the server) at that observation is
∑_{i=1}^{K} ρi Wiq .
This result only tells us the average workload in the waiting area subsystem
for any given work-conserving discipline where at most one customer can be
at the server. However, we are yet to show that this quantity when computed
for any such service discipline would be a constant. For that consider the
basic work conservation result that states that the amount of work in the
system at a given time point (here we consider an arbitrary time point in
steady state) is the same irrespective of the service discipline, as long as it
is work conserving. Naturally the expected value of the amount of work is
λE[S²]/2 + ∑_{i=1}^{K} ρi Wiq    (5.1)
and is conserved across disciplines. However, the term λE[S2 ]/2 remains a
constant across disciplines. Therefore
∑_{i=1}^{K} ρi Wiq

is also a constant across all such disciplines.
(Section 5.3) and knowledge of service times (Section 5.4). In this section
where we consider classification based on customer type, we also assume
that there is no switch-over time from customer to customer or class to
class. In addition, we assume that the arrival and service times are
nonanticipative; specifically, the realized service time is known only upon service
completion.
Before describing the model and analysis technique for such service dis-
ciplines, we describe some physical examples of multiclass queues with clas-
sification based on types. The emergency ward situation that we described
earlier and a case study we will address later is a canonical example of such
a system. In particular, depending on the urgency of the patient, they can
be classified as critical, urgent, or regular. Their needs in terms of queue
performance are significantly different. Although in this example the service
performed could be different across classes, the next example is one where the services are similar. Many
fast-food restaurants nowadays accept online orders. Typically those cus-
tomers are classified differently compared to the ones that stand in line and
order physically. The needs of the online orders in terms of sojourn times
are certainly different but the service times are no different from the other
class. In addition there are many examples in computer systems, network-
ing, transportation, and manufacturing where the analysis to follow can be
applied.
Although there are several applications, we present a fairly generic
model for the system. Consider a special case of the G/G/1 queue with
K classes where the arrival process is PP(λi ) for class i (i = 1, 2, . . . , K).
The service times are IID with mean E[Si ] = 1/μi , second moment E[S2i ],
CDF Gi (·), and ρi = λi /μi for class i (for i = 1, 2, . . . , K). There is a
single server with infinite waiting room. From an analysis standpoint, it
does not matter whether there is a queue for each class or whether all
customers are clubbed into one class as long as the class of each cus-
tomer and the order of arrival are known to the server. We present
results for three work-conserving disciplines: FCFS, nonpreemptive pri-
ority, and preemptive resume priority. Note that the case of preemptive
repeat (identical or random) is not considered but is available in the liter-
ature. Other schemes such as round-robin will be dealt with in subsequent
sections.
G(t) = P(S ≤ t) = (1/λ) ∑_{i=1}^{K} λi Gi (t),

E[S] = 1/μ = (1/λ) ∑_{i=1}^{K} λi E[Si ],

E[S²] = σ² + 1/μ² = (1/λ) ∑_{i=1}^{K} λi E[Si²],

ρ = λE[S].
Note that the average aggregate service rate is μ and the variance of the
aggregate service time is σ2 . Also, ρ = ρ1 + · · · + ρK .
Assume that the system is stable, that is, ρ < 1. Note that the system is
identical to that of a single class M/G/1 queue with PP(λ) arrivals and service
times with CDF G(t), mean 1/μ, and variance σ2 . Then using Pollaczek–
Khintchine formula (see Equation 4.6) for a single class M/G/1 queue, we
get the following results:
L = ρ + (1/2) λ²E[S²]/(1 − ρ),

W = L/λ,

Wq = W − 1/μ,

Lq = (1/2) λ²E[S²]/(1 − ρ).
Now we need to derive the performance measures for each class i for i =
1, . . . , K. The key result that enables us to derive performance measures for
each class is

Wiq = Wq = (1/2) λE[S²]/(1 − ρ),

since under FCFS the waiting time before service does not depend on the
class of the customer. Using that, for each class i,

Liq = λi Wiq ,
Li = ρi + Liq ,
Wi = Wiq + 1/μi .

Equivalently,

Li = (λi /λ)L − λi /μ + ρi .
between classes and priorities simple, we let class-1 to be the highest pri-
ority and class K the lowest (we will see later how to do this optimally).
Also, service discipline within a class is FCFS. Therefore, the server upon
a service completion always starts serving a customer of the highest class
among those waiting for service, and the first customer that arrived within
that class. Of course, if there are no customers waiting, the server selects the
first customer that arrives subsequently. However, it is important to clar-
ify that the server completes serving a customer before considering whom
to serve next. The meaning of nonpreemptive priority is that a customer in
service does not get preempted (or interrupted) by another customer of
higher priority (essentially, priority applies only in the waiting room).
Recall that we are considering an M/G/1 queue with K classes where the
arrival process is PP(λi ) for class i (i = 1, 2, . . . , K); the service times are IID
with mean E[Si ] = 1/μi , second moment E[S2i ], CDF Gi (·), and ρi = λi /μi
for class i (for i = 1, 2, . . . , K). There is a single server with infinite waiting
room. From an analysis standpoint, it does not matter whether there is a
queue for each class or whether all customers are clubbed into one class as
long as the class of each customer and the order of arrival are known to the
server. However, from a practical standpoint it is easiest to create at least a
“virtual” queue for each class and pick from the head of the nonempty line
with highest priority. Further, we assume that the system is stable, that is,
ρ < 1. For the rest of this section, we will use the terms class and priority
interchangeably. With that said, we are ready to analyze the system and
obtain steady-state performance measures.
To analyze the system, we consider a class-i customer that arrives into
the system in steady state for some i ∈ {1, . . . , K}. We reset the clock and call
that time as 0. Another way to do this is to assume the system is stationary at
time 0 and a customer of class i arrives. Consider the random variables defined
in Table 5.1 (albeit with some abuse of notation). It is crucial to note that
the terms in Table 5.1 would perhaps have other meanings in the rest of
this book.
When a customer of class i arrives in the stationary queue at time 0, this
customer first waits for anyone at the server to be served (i.e., for a random
time U). Then the customer also waits for all customers that are in the system
TABLE 5.1
Random Variables Used for Nonpreemptive and Preemptive Resume Cases
Wi^q   Waiting time in the queue (not including service) for customer of class i
       (note that this is a random variable and not the expected value)
U      Remaining service time of the customer in service (this is zero if the server is idle)
Rj     Time to serve all customers of type j who are waiting in the queue at time 0 (for 1 ≤ j ≤ i)
Tj     Time to serve all customers of type j who arrive during the interval [0, Wi^q ] (for 1 ≤ j < i)
of equal or higher priority at time 0 (i.e., for a random time ∑_{j=1}^{i} Rj ). Note
that during the time this customer waits in the system to begin service (i.e.,
Wi^q ), there could be other customers of higher priority that may have arrived
and been served before this customer. Thus, this customer waits a further
∑_{j=1}^{i−1} Tj (with the understanding that the term is zero if i = 1) before
service begins. Therefore, we have

Wi^q = U + ∑_{j=1}^{i} Rj + ∑_{j=1}^{i−1} Tj .
E[Wi^q ] = E[U] + ∑_{j=1}^{i} E[Rj ] + ∑_{j=1}^{i−1} E[Tj ].    (5.2)
We need to derive expressions for E[U], E[Rj ], and E[Tj ] which we do next.
Problem 45
Derive the following results (for the notations described in Table 5.1):
E[U] = (λ/2) E[S²],

E[Rj ] = ρj E[Wj^q ],

E[Tj ] = ρj E[Wi^q ],

where ρi = λi E[Si ].
Solution
Recall from Problem 43 that if a stable G/G/1 queue with K classes is
observed at an arbitrary time in steady state, then the expected remaining
service time is λE[S2 ]/2. Of course this requires a service discipline that
is work conserving with at most one partially completed service allowed,
which is true here. However, since the arrivals are Poisson, due to PASTA
and M/G/1 system being ergodic, an arriving class-i customer in steady
state would observe an expected remaining service time of λE[S2 ]/2. In
other words, the arrival-point probability is the same as the steady-state
probability. Thus, from the definition of U we have
E[U] = (λ/2) E[S²].
Once again because of PASTA this arriving customer at time 0 will see Ljq
customers of class j waiting for service to begin. Therefore, by the definition
of Rj (time to serve all customers of type j waiting in the queue at time 0), we
have E[Rj ] = E[E[Rj |Nj ]] = E[Nj /μj ] = Ljq /μj , where Nj is a random variable
denoting the number of class-j customers in steady state waiting for service
to begin. Next, using Little’s law Ljq = λj Wjq , we have

E[Rj ] = (1/μj ) λj Wjq = ρj E[Wj^q ].
Now, to compute E[Tj ], note from the definition that it is the time to serve
all customers of type j who arrive during the interval [0, Wi^q ] for any j < i.
Clearly, the expected number of type-j arrivals in time t is λj t because the
arrivals are according to a Poisson process, and each of those arrivals requires
1/μj service time on average. Hence, we have E[Tj |Wi^q ] = λj Wi^q /μj . Taking
expectations we get

E[Tj ] = ρj E[Wi^q ].
Substituting these expressions into Equation 5.2, we get

E[Wi^q ] = (λ/2) E[S²] + ∑_{j=1}^{i} ρj E[Wj^q ] + E[Wi^q ] ∑_{j=1}^{i−1} ρj .
We rewrite this equation using the notation that we have used earlier for the
average waiting time before service, that is, Wiq = E[Wi^q ]. Thus, we have for
all i ∈ {1, . . . , K}

Wiq = (λ/2) E[S²] + ∑_{j=1}^{i} ρj Wjq + Wiq ∑_{j=1}^{i−1} ρj .
Solving these equations one class at a time (starting with i = 1) yields, for
all i ∈ {1, . . . , K},

Wiq = [(1/2) ∑_{j=1}^{K} λj E[Sj²]] / [(1 − αi )(1 − αi−1 )],

where αi = ρ1 + ρ2 + · · · + ρi with α0 = 0.
Now, using Wiq we can derive the other performance measures as follows
for all i ∈ {1, . . . , K}:
Liq = λi Wiq ,
Wi = Wiq + E[Si ],
Li = Liq + ρi .
L = L1 + L2 + · · · + LK ,
W = L/λ,
Wq = W − 1/μ,
Lq = λWq .
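These per-class formulas can be sketched as follows (the helper name is ours; index 0 of the input lists is the highest-priority class). As a check, the weighted sum ∑ρi Wiq should match its FCFS value, as work conservation requires.

```python
def mg1_nonpreemptive_priority(lams, ES, ES2):
    """Mean queueing delays Wiq for a multiclass M/G/1 queue under
    nonpreemptive priority, using
    Wiq = (lam*E[S^2]/2) / ((1 - alpha_i)(1 - alpha_{i-1}))."""
    rho = [l * s for l, s in zip(lams, ES)]
    if sum(rho) >= 1:
        raise ValueError("unstable")
    EU = sum(l * m2 for l, m2 in zip(lams, ES2)) / 2  # lam * E[S^2] / 2
    Wq, alpha = [], 0.0
    for r in rho:
        a_prev, alpha = alpha, alpha + r
        Wq.append(EU / ((1 - alpha) * (1 - a_prev)))
    return Wq

# Two classes with identical exp(1) service, lambda = 0.3 and 0.2
Wq = mg1_nonpreemptive_priority([0.3, 0.2], [1.0, 1.0], [2.0, 2.0])
print(Wq)  # the higher-priority class waits less
```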
So far we have assumed that we are given which class should get the
highest priority, second highest, etc. This may be obvious in some settings
such as a hospital emergency ward. However, in other settings such as a
manufacturing system we may need to determine an optimal way of assign-
ing priorities. To do that, consider there are K classes of customers and it
costs the server Cj per unit time a customer of class j spends in the system
(this can be thought of as the holding cost for class j customer). It turns out
(we will show that in a problem next) if the objective is to minimize the total
expected cost per unit time in the long run, then the optimal priority assign-
ment is to give class i higher priority than class j if Ci μi > Cj μj (for all i, j
such that i ≠ j). In other words, sort the classes in the decreasing order of the
product Ci μi and assign first priority to the largest Ci μi and the last priority
to the smallest Ci μi over all K classes. This is known as the Cμ rule. Also
note that if all the Ci values were equal, then this policy reduces to “serve the
customer with the shortest expected processing time first.” We derive the
optimality of the Cμ rule next.
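As a quick illustration (a sketch with made-up numbers), the Cμ ordering is simply a sort on the products C_i μ_i:

```python
def cmu_order(C, mu):
    """Indices sorted by decreasing C_i * mu_i; the first index
    gets the highest priority (the C-mu rule)."""
    return sorted(range(len(C)), key=lambda i: -C[i] * mu[i])

# The products C*mu are 2.0, 5.0, and 3.0, so class index 1 comes first:
order = cmu_order([2.0, 1.0, 3.0], [1.0, 5.0, 1.0])  # [1, 2, 0]
```

Note that with equal C_i values this reduces to sorting by μ_i, that is, shortest expected processing time first, as remarked above.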
Problem 46
Consider an M/G/1 queue with K classes with notations described earlier in
this section and service discipline being nonpreemptive priority. Further, it
costs the server Cj per unit time a customer of class j spends in the system
and the objective is to minimize the total expected cost per unit time in the
long run. Show that the optimal priority assignment is to give class i higher
priority than class j if Ci μi > Cj μj (provided i ≠ j).
Solution
Let TC be the average cost incurred per unit time if the priorities are
1, 2, . . . , K from highest to lowest for the system under consideration. Since
a cost Cn is incurred during the sojourn of a class n customer, the total cost
incurred per class n customer on average in steady state is Cn Wn (the reason
for not using i or j but n is that i and j are reserved for something else). Also,
class n customers arrive at rate λn resulting in an average cost per unit time
due to class n customers being λn Cn Wn . Thus, we have
$$TC = \sum_{n=1}^{K} \lambda_n C_n W_n.$$

Since W_n = W_{nq} + 1/μ_n and, from the expression for W_{nq},

$$\frac{\lambda_n}{(1-\alpha_n)(1-\alpha_{n-1})} = \frac{\mu_n}{1-\alpha_n} - \frac{\mu_n}{1-\alpha_{n-1}},$$

we can write

$$\lambda_n C_n W_n = \frac{1}{2} C_n \left(\sum_{k=1}^{K} \lambda_k E[S_k^2]\right)\left(\frac{\mu_n}{1-\alpha_n} - \frac{\mu_n}{1-\alpha_{n-1}}\right) + \frac{\lambda_n C_n}{\mu_n}.$$
Next, note that while computing TC − TC^e, where TC^e is the corresponding total cost when the priorities of the two adjacent classes i and j = i + 1 are exchanged, all the terms except the ith and jth terms in Σ_n λ_n C_n W_n would be identical and cancel out. Using these results,
$$\frac{TC - TC^e}{\frac{1}{2}\sum_{n=1}^{K}\lambda_n E[S_n^2]} = \sum_{r=1}^{K}\left(\frac{C_r\mu_r}{1-\alpha_r} - \frac{C_r\mu_r}{1-\alpha_{r-1}}\right) - \sum_{r=1}^{i-1}\left(\frac{C_r\mu_r}{1-\alpha_r} - \frac{C_r\mu_r}{1-\alpha_{r-1}}\right) - \frac{C_j\mu_j}{1-\alpha_{i-1}-\rho_j} - \frac{C_i\mu_i}{1-\alpha_{i-1}-\rho_j-\rho_i} - \sum_{r=i+2}^{K}\left(\frac{C_r\mu_r}{1-\alpha_r} - \frac{C_r\mu_r}{1-\alpha_{r-1}}\right) + \frac{C_j\mu_j}{1-\alpha_{i-1}} + \frac{C_i\mu_i}{1-\alpha_{i-1}-\rho_j}$$

$$= \frac{C_i\mu_i}{1-\alpha_i} + \frac{C_j\mu_j}{1-\alpha_i-\rho_j} - \frac{C_i\mu_i}{1-\alpha_{i-1}} - \frac{C_j\mu_j}{1-\alpha_i} - \frac{C_j\mu_j}{1-\alpha_{i-1}-\rho_j} - \frac{C_i\mu_i}{1-\alpha_{i-1}-\rho_j-\rho_i} + \frac{C_j\mu_j}{1-\alpha_{i-1}} + \frac{C_i\mu_i}{1-\alpha_{i-1}-\rho_j}$$

$$= (C_i\mu_i - C_j\mu_j)\left(\frac{1}{1-\alpha_i} - \frac{1}{1-\alpha_{i-1}} + \frac{1}{1-\alpha_{i-1}-\rho_j} - \frac{1}{1-\alpha_i-\rho_j}\right)$$

$$= \frac{\rho_i\,\rho_j\,(\alpha_{i-1}+\alpha_j-2)(C_i\mu_i - C_j\mu_j)}{(1-\alpha_i)(1-\alpha_{i-1})(1-\alpha_{i-1}-\rho_j)(1-\alpha_i-\rho_j)}.$$
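Since every factor in the denominator is positive and (α_{i−1} + α_j − 2) < 0, the sign of TC − TC^e is opposite to that of C_iμ_i − C_jμ_j, so the interchange argument can also be checked numerically. Below is a small sketch (parameter values are arbitrary) that evaluates TC for two priority orderings of a two-class example:

```python
import numpy as np

def total_cost(lam, ES, ES2, C, order):
    """TC = sum_n lambda_n C_n W_n for a nonpreemptive priority M/G/1
    queue when classes are served in the given priority order."""
    lam, ES, ES2, C = (np.asarray(a, float)[list(order)]
                       for a in (lam, ES, ES2, C))
    rho = lam * ES
    alpha = np.concatenate(([0.0], np.cumsum(rho)))
    Wq = 0.5 * np.sum(lam * ES2) / ((1 - alpha[1:]) * (1 - alpha[:-1]))
    return float(np.sum(lam * C * (Wq + ES)))

# Class 0 has the larger C*mu (4.0 vs 0.8), so serving it first should cost less:
lam, ES, ES2, C = [0.3, 0.3], [0.5, 1.0], [0.5, 2.0], [2.0, 0.8]
tc_good = total_cost(lam, ES, ES2, C, [0, 1])
tc_bad = total_cost(lam, ES, ES2, C, [1, 0])
assert tc_good < tc_bad
```

Swapping the priorities of the two classes raises the long-run cost rate, exactly as the sign analysis predicts.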
and service begins for this new higher priority customer. When the pre-
empted customer returns to service, service resumes from where it was
preempted. This is a work-conserving discipline (however, if the service
has to start from the beginning which is called preemptive repeat, then it
is not work conserving because the server wasted some time serving). As
we described earlier, if the service times are exponential, due to memo-
ryless property, preemptive resume and preemptive repeat are the same.
However, there is another case called preemptive identical which requires
that the service that was interrupted is repeated with an identical service
time (in the preemptive repeat mechanism, the service time is sampled again
from a distribution). We do not consider those here and only concentrate on
preemptive resume priority.
All the other preliminary materials for the nonpreemptive case also hold
here for the preemptive resume policy (namely, multiclass M/G/1 with class-
1 being highest priority and class K lowest). Also, customers within a class
will be served according to FCFS policy. But a server will serve a customer
of a particular class only when there is no customer of higher priority in the
system. Upon arrival, a customer of class i can preempt a customer of class j in service if j > i. Also, the total service time is unaffected by the interruptions, if any. Assume that the system is stable, that is, ρ < 1. Note that
there could be more than one customer with unfinished (but started) service.
Therefore, the results of Section 5.1.4 cannot be applied here. However, the
service discipline is still work conserving and we will take advantage of that
in our analysis. Further, note that the sojourn time of customers of class i is
unaffected by customers of class j if j > i.
With that thought we proceed with our analysis. We begin by consid-
ering class-1 customers. Clearly, as far as class-1 customers are concerned,
they can be oblivious of the lower class customers. Therefore, class-1 cus-
tomers effectively face a standard single class M/G/1 system with arrival rate
λ1 and service time distribution G1 (·). Class-1 customers get served upon
arrival if there are no other class-1 customers in the system, and they will
wait only for other class-1 customers for their service to begin. Thus, from
the Pollaczek–Khintchine formula in Equation 4.6, we get the sojourn time of
class-1 customers as
$$W_1 = \frac{1}{\mu_1} + \frac{\lambda_1 E[S_1^2]}{2(1-\rho_1)}.$$
workload in the system with the first i classes alone. In addition, due to
PASTA, W(i) will also be the average workload in the preemptive resume
M/G/1 queue as seen by an arriving class-i customer. This in turn is also
equal to the average workload due to the first i classes in the K-class system.
Now consider an M/G/1 queue with all K classes where a customer of class i
is about to enter in steady state. Then the sojourn time in the system for this
customer depends only on the customers of classes 1 to i in the system upon
arrival. Therefore, Wi can be computed by solving
$$W_i = W(i) + \frac{1}{\mu_i} + \alpha_{i-1} W_i$$
as the mean sojourn time is equal to the expected workload upon arrival
from all customers of classes 1 to i plus the mean service time of this class
i customer plus the average service time of all the customers of classes 1 to
i − 1 that arrived during the sojourn time. Substituting the expression for
W(i) and rearranging terms, we have
$$W_i = \frac{1}{\mu_i(1-\alpha_{i-1})} + \frac{\sum_{j=1}^{i} \lambda_j E[S_j^2]}{2(1-\alpha_i)(1-\alpha_{i-1})}.$$
From W_i the remaining class-i measures follow:

$$W_{iq} = W_i - E[S_i], \quad L_i = \lambda_i W_i, \quad L_{iq} = L_i - \rho_i.$$
The results for the individual classes can be used to obtain aggregate
performance measures as follows:
$$L = L_1 + L_2 + \cdots + L_K, \quad W = \frac{L}{\lambda}, \quad W_q = W - \frac{1}{\mu}, \quad L_q = \lambda W_q.$$
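The preemptive resume sojourn-time formula can likewise be evaluated directly; a minimal sketch (the function name is ours):

```python
import numpy as np

def preemptive_resume_sojourn(lam, ES, ES2):
    """Sojourn times W_i for a multiclass M/G/1 queue under preemptive
    resume priority; index 0 is the highest priority class."""
    lam, ES, ES2 = (np.asarray(a, float) for a in (lam, ES, ES2))
    rho = lam * ES
    alpha = np.concatenate(([0.0], np.cumsum(rho)))
    assert alpha[-1] < 1, "system must be stable (rho < 1)"
    cum2 = np.cumsum(lam * ES2)   # sum_{j<=i} lambda_j E[S_j^2]
    return ES / (1 - alpha[:-1]) + cum2 / (2 * (1 - alpha[1:]) * (1 - alpha[:-1]))

# For the highest class this reduces to the single-class M/G/1 sojourn time:
Wi = preemptive_resume_sojourn([0.5, 0.2], [1.0, 1.0], [2.0, 2.0])
```

Here W_1 = 1 + (0.5)(2)/(2)(0.5) = 2, matching the Pollaczek–Khintchine value since class 1 is oblivious to the lower class.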
variations to the models we have seen in this section. After that, we will
move on to other policies in subsequent sections.
adieu, Jenna went about gathering all the information she could as well as
collected the necessary data for her analysis. The first thing she found out
was that the billboard wait times were updated every 15 min (through an
automatic RSS feed) and the displayed value was the average wait time over
the past 2 h.
Jenna wanted to know how they computed wait times and she was told
that the wait time for a patient is the time from when the patient checks in
until when the patient is called by a clinical professional. Jenna immediately
realized that it did not include the time the clinical professional spends see-
ing the patient. So it represented the waiting time in the queue and not the
sojourn time. Jenna found out that the entire time spent by patients in the ER
could even be several hours if a complicated surgery needs to be performed.
However, she was glad that she did not have to focus on those issues. But
what was concerning for her was whether someone with a heart failure had
to wait on average for 14 min to see a clinical professional. She was reassured
that when patients arrive at the emergency ward, they are immediately seen
by a triage nurse. The nurse would determine the severity of the patient’s
illness or injury to determine if they would have to be seen immediately
by a clinical professional. Priority was given to patients with true emergen-
cies (this does not include life-threatening cases, pregnancies, etc., where
patients are not seen by a clinical professional but are directly admitted to
the hospital).
Upon speaking with the triage nurse, Jenna found out that there are
essentially two classes of patients. One class is the set of patients with true
emergencies and the second class is the remaining set of patients. Within a
class, patients were served according to FCFS; however, the patients with
true emergencies were given preemptive priority over those that did not
have a true emergency. The triage nurse also whispered to Jenna that she
would much rather have three classes instead of two. It was hard to talk a
lot to the triage nurse because she was always busy. But Jenna managed to
also find out that there are always two doctors (i.e., clinical professionals) at
the emergency ward, and like Jenna saw during her Master’s thesis days, the
most crowded times were early evenings. Jenna next stopped at the informa-
tion technology office to get historical data of the patients. A quick analysis
revealed that patients arrived according to a Poisson process and the time
a doctor took to see a patient was exponentially distributed. Interestingly, the distribution of the time a doctor spent with a patient did not differ between the two classes of patients.
Jenna looked at her textbook for her course on waiting line models. She
distinctly remembers studying preemptive queues. However, when she saw
the book, she did not see anything about two-server systems (note that since
the ward has two doctors, that would be the case here). Further, the book
only had results for mean wait times and not distributions, which is some-
thing she thought was needed for her analysis. Nonetheless, she decided to
go ahead and read that chapter carefully so that she gets ideas to model the
system and analyze it. Jenna also checked the simulation software packages
she was familiar with and none had an in-built preemptive priority option
(all of them only had nonpreemptive). At this time Jenna realized that her
only option was to model the system from scratch. She wondered if the two-
server system was even work conserving. But she did feel there was hope
since the interarrival times and service times were exponentially distributed.
Also, the service times were class independent. “How hard can that be to
analyze,” she thought to herself.
Jenna started to model the system. Based on her data she wrote down
that class-1 patients (with true emergencies) arrived to the emergency ward
according to PP(λ1 ) and class-2 patients arrived according to PP(λ2 ). The
service time for either class is exp(μ). There are two servers that use FCFS
within a class and class-1 has preemptive resume priority over class-2. In
the event that there are two class-2 patients being served when a class-1
arrives, Jenna assumed that with equal probability one of the class-2 patients
was selected to be preempted. Jenna first started to model the system as
a CTMC {(X1 (t), X2 (t)), t ≥ 0} where for i = 1, 2, Xi (t) is the number of
class-i patients in the system. Then Jenna realized that there must be an
easier way to model the system. She recalled how the M/G/1 queue with
preemptive priority was modeled in her textbook. An idea immediately
dawned on her.
$$W_{1q} = \frac{\rho_1^2}{1-\rho_1^2}\,\frac{1}{\mu}$$
$$E\left[e^{-sY_q}\right] = p_0 + p_1 + \sum_{i=2}^{\infty} p_i \left(\frac{2\mu}{2\mu+s}\right)^{i-1}$$
i=2
where pi is the steady-state probability that the M/M/2 queue has i in the
system. From the M/M/2 analysis, p0 = (1 − ρ1 )/(1 + ρ1 ) and for i ≥ 1,
p_i = 2p_0 ρ_1^i. Then she obtained
$$E\left[e^{-sY_q}\right] = p_0\left(1 + 2\rho_1 + \frac{2\rho_1\lambda_1}{2\mu-\lambda_1+s}\right)$$
which says that more than 1% of the patients with a true emergency would
wait over 30 min to see a clinical professional. Jenna thought to herself that
this could mean quite a few patients a month that would wait over 30 min,
and it was not surprising to her that many would be blogging about it.
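The LST above corresponds to an atom at zero plus an exponential tail, so P(Y_q > t) can be written in closed form. A sketch of that computation follows; the numerical rates are placeholders of our own, since the chapter's full data set is not reproduced in this excerpt:

```python
import math

def class1_wait_tail(lam1, mu, t):
    """P(Y_q > t) obtained by inverting the LST
    p0*(1 + 2*rho1 + 2*rho1*lam1/(2*mu - lam1 + s)):
    an atom at 0 plus an exp(2*mu - lam1) distributed tail."""
    rho1 = lam1 / (2 * mu)
    p0 = (1 - rho1) / (1 + rho1)
    mass = 2 * p0 * rho1 * lam1 / (2 * mu - lam1)   # P(Y_q > 0)
    return mass * math.exp(-(2 * mu - lam1) * t)

# Illustrative rates only: lam1 in patients/min, mean service 1/mu = 17.5 min.
p = class1_wait_tail(lam1=0.02, mu=1 / 17.5, t=30.0)
```

Evaluating the tail at t = 30 min is exactly the kind of calculation behind the "more than 1% wait over 30 min" observation, though the percentage depends on the actual arrival rate.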
urgent care facilities. In other words, Jenna felt that it is important to have a
low overall average wait time so that based on the billboard display, some
nonemergency patients would be lured to the emergency ward as opposed
to visiting an urgent care facility. So she proceeded to compute the average
time spent by the stable patients (i.e., ones without a true emergency) waiting
before they see a clinical professional.
To model that, Jenna let X(t) be the total number of patients in the system
at time t; these include those that do and do not have a true emergency.
The “system” in the previous sentence includes patients that are waiting to
see a clinical professional as well as those that are being seen by a clinical
professional. Since the service times for both types of patients are identically
distributed exp(μ), X(t) would be stochastically identical to the number of
customers in an M/M/2 queue with FCFS service, PP(λ1 + λ2 ) arrivals, and
exp(μ) service. For such an M/M/2 queue with FCFS service, the steady-state average number in the system is (λ1 + λ2)/μ + (ρ²/(1 − ρ²))((λ1 + λ2)/μ), where ρ = (λ1 + λ2)/(2μ). Thus, it is also equal to the expected value of the
total number of patients in the system in steady state, L1 + L2 , where Li for
i = 1, 2 is the mean number of class-i patients in the system in steady state.
Therefore,
$$L_1 + L_2 = \frac{\lambda_1+\lambda_2}{\mu} + \frac{\rho^2}{1-\rho^2}\,\frac{\lambda_1+\lambda_2}{\mu}.$$
where ρ1 = λ1 /2μ using the W1q result discussed earlier and the fact that
L1 = λ1 /μ + λ1 W1q . Thus, she calculated L2 as
$$L_2 = \frac{2\rho}{1-\rho^2} - \frac{2\rho_1}{1-\rho_1^2}.$$
Using that she wrote down, W2q , the average time spent by class-2 patients
waiting as
$$W_{2q} = \frac{L_2 - \rho_2}{\lambda_2}$$
where ρ2 = λ2 /μ. Based on Jenna’s data, the arrival rate was about 2.5 class-
2 patients per hour (λ2 = 0.0417 per minute) and the average service time
of 17.5 min (1/μ = 17.5 min) resulting in ρ2 = 0.3646. Plugging into this
formula for W2q , Jenna computed W2q = 31.2759 min. Although this would
mean that UTH cannot guarantee an “average” wait time of less than 30 min
for stable patients, across all patients the average wait time was a little over
18 min which sounded reasonable. Jenna felt it was important to clarify that
W2q included time spent while being preempted by a class-1 patient, and not
just the time to see a clinical professional for the first time.
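Jenna's calculation chains together the M/M/2 aggregate result and the class-1 measures; a sketch of that chain is below (λ1 is a placeholder value of our own, since the class-1 arrival rate is not stated in this excerpt):

```python
def class2_wait(lam1, lam2, mu):
    """W_2q from L2 = 2*rho/(1 - rho^2) - 2*rho1/(1 - rho1^2) and
    W_2q = (L2 - lam2/mu)/lam2, following the chapter's M/M/2 argument."""
    rho = (lam1 + lam2) / (2 * mu)     # total load on the two servers
    rho1 = lam1 / (2 * mu)             # class-1 load
    L2 = 2 * rho / (1 - rho**2) - 2 * rho1 / (1 - rho1**2)
    return (L2 - lam2 / mu) / lam2     # subtract class-2 patients in service

# lam2 = 2.5/hour = 0.0417/min and 1/mu = 17.5 min as in the text;
# lam1 = 0.02/min is a placeholder.
W2q = class2_wait(lam1=0.02, lam2=0.0417, mu=1 / 17.5)
```

With the true λ1 this reproduces the 31.28 min figure reported in the text.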
FIGURE 5.2
Schematic of a polling system with a single server: K queues fed by Poisson arrival streams PP(λ1), PP(λ2), PP(λ3), . . . , PP(λK), all attended by one server.
The commonly studied service disciplines are: (1) exhaustive discipline where the server stays at a queue and leaves it only when it becomes empty; (2) gated discipline where the
server serves all the customers that arrived to that queue prior to the server
arrival in that cycle; (3) limited discipline where a maximum fixed number
can be served during each poll. Naturally, if the switch-over times are large
(such as a setup time to manufacture a class of jobs), then one may favor
the exhaustive discipline. Whereas if they are small, then it makes sense to
consider a limited discipline (such as even a maximum of one customer per
visit to the queue). The gated discipline falls in between. We assume that the
server can see the contents of a queue only upon arrival. Hence, even if a
queue is empty, the server would still spend the time to switch in and out of
that queue. We also would like to point out that customers in a single queue
(i.e., of a single class) are served according to FCFS.
In addition to this notation, we also require that not all mean switch-over
times can be zero. That provides us with a result quite unique to polling
systems which we describe next. Let ρ = ρ1 + ρ2 + · · · + ρK with K ≥ 2. If
the system is stable (we will describe the conditions later), then the long-run
fraction of time the server is attending to customers is ρ. Likewise, (1 − ρ)
is the fraction of time spent switching in the long run. This is a relatively
straightforward observation since the server is never at a queue idling when
that queue is empty. However, to prove it rigorously one needs to consider
times when the system regenerates and use results from regenerative pro-
cesses to show that. Using that we can state the following result: if E[C] is the
expected time to complete a cycle (including service as well as switch-over
times) in steady state, then
$$\frac{\sum_{i=1}^{K} E[D_i]}{E[C]} = 1 - \rho. \qquad (5.3)$$
Although one is tempted to let all the switch-over times go to zero by taking limits, one has to be careful because of the close tie with ρ. It turns out the zero switch-over time case is more complicated
to handle than the one with at least one nonzero switch-over time. Next we
analyze each of the types of queue emptying policies, that is, exhaustive,
gated, and limited.
$$C_i^n = \sum_{j=1}^{K} D_j + \sum_{j=1}^{i} B_j^n + \sum_{j=i+1}^{K} B_j^{n-1}$$
since it is essentially the sum of the time the server spends in each queue
plus the time switching. However, note that we have been careful to use
n − 1 for queues greater than i and n for others so that the index n − 1 or n
appropriately denotes the cycle number with respect to the server.
Using these definitions, we can immediately derive the following results.
Given Cni , the expected number of arrivals of class-i customers during that
cycle time is λi Cni . All these arrivals would be served before this cycle time is
completed, and each one of them on average requires 1/μi amount of service
time. Therefore, the average amount of time spent by the server in queue i in
the nth cycle, given Cni , is
$$E\left[B_i^n \mid C_i^n\right] = \frac{\lambda_i C_i^n}{\mu_i} = \rho_i C_i^n.$$
From this, the stability condition works out to be ρ < 1.
$$L_i = \rho_i + \frac{1}{2}\,\frac{\rho_i^2}{1-\rho_i}\left(1 + \sigma_i^2\mu_i^2\right) + \frac{m_i^{(2)}}{2m_i},$$

where m_i and m_i^{(2)} are, respectively, the mean and second factorial moment of the number of arrivals during a vacation. Since the arrivals are according to a Poisson process with rate λ_i into queue i, we can write down m_i = λ_i E[V_i] and m_i^{(2)} = λ_i^2 E[V_i^2]. Plugging that into L_i and writing in terms of W_{iq} we get

$$W_{iq} = \frac{1}{2}\,\frac{\rho_i/\mu_i}{1-\rho_i}\left(1 + \sigma_i^2\mu_i^2\right) + \frac{E[V_i^2]}{2E[V_i]}.$$
As n → ∞, we get
$$E[V_i] = (1-\rho_i)E[C] = \frac{1-\rho_i}{1-\rho}\left(\sum_{j=1}^{K} E[D_j]\right)$$
where the last equality is from Equation 5.3. However, obtaining E[Vi2 ] is
fairly involved. In the interest of space, we merely state the results from
Takagi [101] without describing the details.
Let Tin be the station time for the server in queue i defined as
$$T_i^n = B_i^n + D_i.$$
In other words, this is the time between when the server leaves queue i − 1
and queue i. Define bij as the steady-state covariance of the station times of queues
i and j during consecutive visits, with the understanding that if i = j, it would
be the variance. In other words, for all i and j
$$b_{ij} = \lim_{n\to\infty} \begin{cases} \mathrm{Cov}\left(T_i^n, T_j^n\right) & \text{if } j > i,\\ \mathrm{Cov}\left(T_i^{n-1}, T_j^n\right) & \text{if } j \le i. \end{cases}$$
Using the results in Takagi [101], we can obtain bij by solving the following
sets of equations:
$$b_{ij} = \frac{\rho_i}{1-\rho_i}\left(\sum_{k=i+1}^{K} b_{jk} + \sum_{k=1}^{j-1} b_{jk} + \sum_{k=j}^{i-1} b_{kj}\right), \quad \text{for } j < i,$$

$$b_{ij} = \frac{\rho_i}{1-\rho_i}\left(\sum_{k=i+1}^{j-1} b_{jk} + \sum_{k=j}^{K} b_{kj} + \sum_{k=1}^{i-1} b_{kj}\right), \quad \text{for } j > i,$$

$$b_{ii} = \frac{\mathrm{Var}(D_i)}{(1-\rho_i)^2} + \frac{\rho_i}{1-\rho_i}\left(\sum_{j=1}^{i-1} b_{ij} + \sum_{j=i+1}^{K} b_{ij}\right) + \frac{\lambda_i E[S_i^2]\, E[V_i]}{(1-\rho_i)^3}.$$
These sets of equations can be solved using a standard matrix solver by writ-
ing down these equations in matrix form [bij ]. Assuming that can be done,
we can write down Var(Vi ) as
$$\mathrm{Var}[V_i] = \mathrm{Var}[D_i] + \frac{1-\rho_i}{\rho_i}\left(\sum_{j=1}^{i-1} b_{ij} + \sum_{j=i+1}^{K} b_{ij}\right).$$
Using that we can obtain Wiq . Thereby we can also immediately write down
Liq = λi Wiq , Wi = Wiq + 1/μi , and Li = λi Wi . Of course we can also obtain
the total number in the entire system in steady-state L as L1 + L2 + · · · + LK .
Using that we could get metrics such as W, Wq , and Lq .
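Treating queue i as an M/G/1 queue with vacations, the chain of computations can be sketched as follows; the second vacation moments E[V_i^2], which require solving the b_ij equations from Takagi, are taken as given inputs here (the function name is ours):

```python
def exhaustive_polling_wiq(lam, ES, ES2, ED, EVi2):
    """W_iq for each queue of an exhaustive polling system, treating the
    server's absence as a vacation with E[V_i] = (1 - rho_i) * E[C].
    EVi2 holds the second vacation moments E[V_i^2], assumed already
    obtained from the b_ij equations."""
    rho = [l * s for l, s in zip(lam, ES)]
    EC = sum(ED) / (1 - sum(rho))            # mean cycle time, Equation 5.3
    Wiq = []
    for i in range(len(lam)):
        EVi = (1 - rho[i]) * EC              # mean vacation seen by queue i
        Wiq.append(lam[i] * ES2[i] / (2 * (1 - rho[i])) + EVi2[i] / (2 * EVi))
    return Wiq
```

Note that (ρ_i/μ_i)(1 + σ_i²μ_i²) = λ_i E[S_i²], which is why the first term above matches the W_{iq} expression derived earlier.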
$$C_i^n = \sum_{j=1}^{K} D_j + \sum_{j=1}^{i-1} B_j^n + \sum_{j=i}^{K} B_j^{n-1}$$
since it is essentially the sum of the time the server spends in each queue
plus the time switching. However, note that we have been careful to use
n − 1 for queues greater than or equal to i and n for others so that the
index n − 1 or n appropriately denotes the cycle number with respect to the
server.
Using these definitions, we can immediately derive the following results.
Given Cni , the expected number of arrivals of class i customers during that
cycle time is λi Cni . All these arrivals would be served during the server’s
sojourn in queue i, and each one of them on average requires 1/μi amount
of service time. Therefore, the average amount of time spent by the server in
queue i in the nth cycle, given Cni , is
$$E\left[B_i^n \mid C_i^n\right] = \frac{\lambda_i C_i^n}{\mu_i} = \rho_i C_i^n.$$
Here too, the stability condition works out to be ρ < 1.
All the results derived so far are identical to those of exhaustive service
policies. However, this is enabled only by a careful selection of how Cni is
defined. It may hence be worthwhile to note the subtle differences in both
policies. Now we are in a position to derive expressions for the performance
measures of the system assuming it is stable.
Problem 47
Consider an arbitrary queue, say i (such that 1 ≤ i ≤ K). Let E[Ci ] and
E[C2i ], respectively, denote the steady-state mean and second moment of
the cycle time Cni . Write down an expression for Wiq in terms of E[Ci ]
and E[C2i ].
Solution
Let a customer arrive into queue i in steady state. From results of renewal
theory, the remaining time for completion as well as the elapsed time
since the start of the cycle in progress are both according to the equilib-
rium distribution of Ci . Therefore, the expected value of both the elapsed
time since the start of the cycle in progress as well as the remaining time
for the cycle in progress to end are equal to E[C2i ]/(2E[Ci ]). The customer
in question would have to wait for the cycle in progress to end plus the
service times of all the customers that arrived since the cycle in progress
began. Therefore, the average waiting time in the queue for this customer
is E[C2i ]/(2E[Ci ]) + λi E[C2i ]/(2E[Ci ]μi ). The second term uses the fact that
λi E[C2i ]/(2E[Ci ]) customers would have arrived on average since the cycle
in progress began, and each of them requires on average 1/μi service time.
Thus, we have
$$W_{iq} = \frac{(1+\rho_i)\,E[C_i^2]}{2E[C_i]}.$$
There are certainly other ways to derive Wiq , one of which is given as an
exercise problem.
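A one-line check of this expression (the values are illustrative):

```python
def gated_wiq(rho_i, ECi, ECi2):
    """W_iq = (1 + rho_i) * E[C_i^2] / (2 * E[C_i]) for a gated polling
    queue, given the first two moments of the cycle time."""
    return (1 + rho_i) * ECi2 / (2 * ECi)

# Deterministic cycle of length 10 (so E[C_i^2] = 100) and rho_i = 0.2:
w = gated_wiq(0.2, 10.0, 100.0)
```

For a deterministic cycle the customer waits half a cycle plus the service of the λ_i·5 customers that arrived before it, giving 5(1 + ρ_i) = 6 here.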
$$E[C_i] = E[C] = \frac{1}{1-\rho}\left(\sum_{j=1}^{K} E[D_j]\right)$$
where the last equality is from Equation 5.3. However, obtaining E[C2i ] is
fairly involved. In the interest of space, we merely state the results from
Takagi [101] without describing the details.
Let T_i^n be the station time for the server in queue i defined (slightly different from the exhaustive polling case) as

$$T_i^n = B_i^n + D_{i+1}.$$

In other words, this is the time between when the server enters queue i and
queue i + 1. Define bij as the steady-state covariance of the station times of queues i
and j during consecutive visits, with the understanding that if i = j, it would
be the variance. In other words, for all i and j
$$b_{ij} = \lim_{n\to\infty} \begin{cases} \mathrm{Cov}\left(T_i^n, T_j^n\right) & \text{if } j > i,\\ \mathrm{Cov}\left(T_i^{n-1}, T_j^n\right) & \text{if } j \le i. \end{cases}$$
Using the results in Takagi [101], we can obtain bij by solving the following
sets of equations:
$$b_{ij} = \rho_i\left(\sum_{k=i}^{K} b_{jk} + \sum_{k=1}^{j-1} b_{jk} + \sum_{k=j}^{i-1} b_{kj}\right), \quad \text{for } j < i,$$

$$b_{ij} = \rho_i\left(\sum_{k=i}^{j-1} b_{jk} + \sum_{k=j}^{K} b_{kj} + \sum_{k=1}^{i-1} b_{kj}\right), \quad \text{for } j > i,$$

$$b_{ii} = \mathrm{Var}(D_{i+1}) + \rho_i\left(\sum_{j=1}^{i-1} b_{ij} + \sum_{j=i+1}^{K} b_{ij}\right) + \rho_i^2 \sum_{j=1}^{K} b_{ji} + \lambda_i E[S_i^2]\, E[C].$$
These sets of equations can be solved using a standard matrix solver by writ-
ing down these equations in matrix form [bij ]. Assuming that can be done,
we can write down E(C2i ) as
$$E\left[C_i^2\right] = \{E[C]\}^2 + \frac{1}{\rho_i}\left(\sum_{j=1}^{i-1} b_{ij} + \sum_{j=i+1}^{K} b_{ij}\right) + \sum_{j=1}^{K} b_{ji}.$$
Using that we can obtain Wiq . Thereby we can also immediately write down
Liq = λi Wiq , Wi = Wiq + 1/μi , and Li = λi Wi . Of course we can also obtain
the total number in the entire system in steady-state L as L1 + L2 + · · · + LK .
Using that we could get metrics such as W, Wq , and Lq .
whole distribution even for the small case of K = 2. In that light, we will
make several simplifying assumptions. First we take the service limit to be one customer per poll, that is, at each poll
if a queue is empty, the server immediately begins its journey to the next
queue; otherwise the server serves one customer and then begins its journey
to the next queue.
Let Bni be the random time the server spends at queue i in the nth cycle
serving customers that were in the queue when it arrived. Of course Bni will
either be equal to zero or equivalent of one class-i customer’s service time.
We drop the superscript n by either considering a stationary system or letting
n → ∞. Thus, Bi is the random variable corresponding to the time spent
serving customers in queue i during a server visit in stationary or steady
state. Recall that Di is the random variable associated with the time to switch
from queue i−1 to i. Let E[D] = E[D1 ]+E[D2 ]+· · ·+E[DK ] so that E[D] is the
average time spent switching in each cycle. Since ρi is the long-run fraction
of time the server spends in queue i, we have
$$\rho_i = \frac{E[B_i]}{E[D] + \sum_{j=1}^{K} E[B_j]}.$$
Summing over all i and rearranging, we get

$$\sum_{j=1}^{K} E[B_j] = E[D]\,\frac{\rho}{1-\rho}.$$
The stability condition works out to be λ_i E[D] < 1 − ρ for all i ∈ {1, 2, . . . , K}. Further, the mean cycle time E[C], as stated in
Equation 5.3, can be verified from previous equation as
$$E[C] = E[D] + \sum_{j=1}^{K} E[B_j] = \frac{E[D]}{1-\rho}.$$
The next step is to obtain performance metrics such as Wiq . It turns out
that unlike the exhaustive and gated cases where we could write down Wiq
in terms of just the first two moments of the service times and switch-over
times, here in the limited case we do not have the luxury. Except for K = 2,
the exact analysis is quite intractable. However, we can still develop some
relations between the various Wiq values known as pseudo-conservation law.
Note that this is equivalent to the work conservation result for multiclass
queues with at most one partially complete service. Note that although
here too we have at most one partially complete service, because of the
switch-overs the system is not work conserving. However, by suitably
adjusting for the time spent switching, we can show that the amount of
workload in the queue for the limited polling policy is the same as that of
an equivalent M/G/1 queue with FCFS. The resulting pseudo-conservation
law yields
$$\sum_{i=1}^{K} \rho_i\left(1 - \frac{\lambda_i E[D]}{1-\rho}\right) W_{iq} = \frac{\rho}{2(1-\rho)} \sum_{i=1}^{K} \lambda_i E[S_i^2] + \frac{\rho}{2E[D]} \sum_{i=1}^{K} \mathrm{Var}[D_i] + \frac{E[D]}{2(1-\rho)}\left(\rho + \sum_{i=1}^{K} \rho_i^2\right).$$
Using this expression, the only case we can obtain Wiq is when the system
is symmetric, that is, the parameters associated with each queue and switch-
over time are identical to those of the others. Instead of using the subscript i, we
use sym to indicate the symmetric case. Thus, the average time spent waiting
in the queue before service in the symmetric case is
$$W_{sym,q} = \frac{K\lambda_{sym} E[S_{sym}^2] + E[D_{sym}](K + \rho) + \mathrm{Var}[D_{sym}]\,K\lambda_{sym}}{2\left(1 - \rho - \lambda_{sym} K E[D_{sym}]\right)} + \frac{\mathrm{Var}[D_{sym}]}{2E[D_{sym}]}.$$
Thereby we can also immediately write down Lsym,q = λsym Wsym,q , Wsym =
Wsym,q + 1/μsym , and Lsym = λsym Wsym . Of course we can also obtain the total
number in the entire system in steady-state L as KLsym . Using that we could
get metrics such as W, Wq , and Lq .
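The symmetric formula is easy to evaluate; a sketch with arbitrary illustrative parameters:

```python
def limited_symmetric_wq(K, lam_s, ES2_s, ED_s, VarD_s, ES_s):
    """W_sym,q for a symmetric 1-limited polling system: all K queues
    share the same arrival rate, service moments, and switch-over
    moments (subscript 'sym' in the text, '_s' here)."""
    rho = K * lam_s * ES_s
    # stability for the limited policy: lam_i * E[D] < 1 - rho
    assert lam_s * K * ED_s < 1 - rho, "unstable parameter choice"
    num = K * lam_s * ES2_s + ED_s * (K + rho) + VarD_s * K * lam_s
    return num / (2 * (1 - rho - lam_s * K * ED_s)) + VarD_s / (2 * ED_s)

Wq_sym = limited_symmetric_wq(K=2, lam_s=0.1, ES2_s=2.0, ED_s=0.5,
                              VarD_s=0.25, ES_s=1.0)
```

The assertion mirrors the stability condition λ_i E[D] < 1 − ρ stated earlier, since E[D] = K E[D_sym] in the symmetric case.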
that nonanticipative). However, here the service times are declared upon
arrival (which we call anticipative). In applications such as web servers, this
is reasonable since we would know the file size of an arriving request, and
hence its service time. Also in many flexible manufacturing systems, the pro-
cessing times can be calculated as soon as the specifications are known from
the request. Therefore, based on the knowledge of service times we consider
each customer to belong to a different class indexed by the service time.
For analytical tractability, we assume that the service time is a continuous
random variable without any point masses. Thus, it results in a multiclass
system with an uncountably infinite number of classes, where each class
corresponds to service time.
Although the overall system is indeed a single class system, we treat it
as a multiclass system by differentiating the classes based on their service
time requirement. An arrival would be classified as class x if its service time
requirement is x amount of time. Analogous to the mean waiting time before
service for a discrete class-i customer defined as Wiq in the previous sections,
here we define Wxq . The quantity Wxq is the time a customer of class x would
wait in the queue on average, not including the service time (which is x).
Likewise, Wx would indicate the corresponding sojourn time for this cus-
tomer with x as the amount of service. Of course the quantities Wxq and Wx
would depend on the scheduling policy. The overall sojourn time (W) as well
as the overall time waiting in the queue (Wq ) for the various policies can be
computed as
$$W = \int_0^{\infty} W_x \, dG(x), \qquad W_q = \int_0^{\infty} W_{xq} \, dG(x).$$
service time information and have at most one job with partially completed service, we have (from Equation 4.6)

$$W_q = W_{xq} = W_x - x = \frac{\lambda E[S^2]}{2(1 - \lambda E[S])}, \qquad W = E[S] + W_q.$$
$$W_x = \frac{x}{1 - \lambda E[S]}.$$
α_{x−dx}. For an infinitesimal dx, note that λ_x = λ (dG(x)/dx) dx. This is because λ_x corresponds to the arrival rate of customers with service time in the interval (x, x + dx), which equals λ times the probability that an arrival has service time in the interval (x, x + dx), and that is exactly the PDF of service times at x multiplied by dx. Also, as dx → 0, we have E[S_x^2] → x^2 (since the service time in the interval (x, x + dx) converges to a deterministic quantity x as dx → 0). Finally, as dx → 0, we need to compute α_x. By definition, if x were countable, then α_x = Σ_{z=0}^{x} λ_z E[S_z], which by letting dx → 0 and using the result for λ_z gives, in the uncountable case, α_x = ∫_0^x λz dG(z), realizing that E[S_z] → z. Therefore, we have as dx → 0, α_x → α_{x−dx} → ∫_0^x λt dG(t).
Using the results for W_{iq} in Section 5.2.2 for class i corresponding to service time in (x, x + dx), we get by letting dx → 0

$$W_{xq} = \frac{\frac{1}{2}\int_{y=0}^{\infty} \lambda y^2\, dG(y)}{\left(1 - \int_0^x \lambda t\, dG(t)\right)^2} = \frac{\frac{1}{2}\lambda E[S^2]}{(1-\rho(x))^2},$$

where

$$\rho(x) = \int_0^x \lambda t\, dG(t).$$
$$W = \int_0^{\infty} W_x \, dG(x), \qquad W_q = \int_0^{\infty} W_{xq} \, dG(x).$$
the one in service. It is crucial to realize that the server only uses the initially
declared service times for determining priorities but resumes from where the
service was completed. To implement this, the server stores jobs in the queue
by sorting according to the total service time and always serving customers
on the top of the list. Therefore, this policy is the continuous analog of the
preemptive resume priority considered in Section 5.2.3. In fact, we merely
use those results to derive Wx here which we do next as a problem.
Problem 48
Derive an expression for the sojourn time for a request with service time x
under PSJF.
Solution
Consider the preemptive resume priority discipline analyzed in Section 5.2.3.
First, let the number of classes K in that setting go to infinity. To map from
class i in that setting to class x here, if the service time is in the interval (x, x +
dx), that customer belongs to class x. Note that if x + dx < y, then class
x is given higher preemptive priority than y which is consistent with PSJF
requirements. We need to derive the expected time a customer with service
time x spends in the system, Wx . Recall from the corresponding expression
in Section 5.2.3, we need expressions for λ_x, μ_x, E[S_x^2], α_x, and α_{x−dx}. For an infinitesimal dx, note that λ_x = λ (dG(x)/dx) dx. This is because λ_x corresponds to the arrival rate of customers with service time in the interval (x, x + dx), which equals λ times the probability that an arrival has service time in the interval (x, x + dx), and that is exactly the PDF of service times at x multiplied by dx. As dx → 0, μ_x → 1/x since the service time would just be x. Also, as dx → 0, we have E[S_x^2] → x^2 (since the service time in the interval (x, x + dx) converges to a deterministic quantity x as dx → 0). Finally, as dx → 0, we need to compute α_x. By definition, α_x = Σ_{z=0}^{x} λ_z E[S_z], which by letting dx → 0 and using the result for λ_z gives α_x = ∫_0^x λz dG(z), realizing that E[S_z] → z. Therefore, we have as dx → 0, α_x → α_{x−dx} → ∫_0^x λt dG(t).
Using the results for Wx in Section 5.2.3 for class i corresponding to
service time being in (x, x + dx), we get by letting dx → 0
x
λy2 dG(y) 1
x y=0 x 2 λ(x)
Wx = x +
x 2 = 1 − ρ(x) + ,
1− z=0 λzdG(z) 2 1− λzdG(z) (1 − ρ(x))2
z=0
where
x
(x) = y2 dG(y)
0
Multiclass Queues under Various Service Disciplines 287
and

ρ(x) = ∫_0^x λt dG(t).
Unconditioning over the service-time distribution G(·), the overall mean
sojourn time and mean waiting time are

W = ∫_0^∞ Wx dG(x),

Wq = ∫_0^∞ Wxq dG(x).
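To make these formulas concrete, here is a small numerical sketch (not from the text: the M/M/1 instance with λ = 0.5, μ = 1 and the trapezoidal-rule integrator are assumed purely for illustration) that evaluates Wx under PSJF and then unconditions to get W:

```python
from math import exp

# Assumed illustration (not from the text): M/M/1 with lam = 0.5, mu = 1,
# so G(x) = 1 - exp(-mu*x) and the service-time PDF is g(x) = mu*exp(-mu*x).
lam, mu = 0.5, 1.0
g = lambda t: mu * exp(-mu * t)

def integrate(f, a, b, n=600):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))

def rho(x):
    # rho(x) = lam * int_0^x t dG(t)
    return lam * integrate(lambda t: t * g(t), 0.0, x)

def Lam(x):
    # Lambda(x) = int_0^x y^2 dG(y)
    return integrate(lambda y: y * y * g(y), 0.0, x)

def W_psjf(x):
    """Conditional mean sojourn time of a size-x job under PSJF."""
    return x / (1.0 - rho(x)) + 0.5 * lam * Lam(x) / (1.0 - rho(x)) ** 2

# Uncondition over G(.) to get the overall mean sojourn time W.
W = integrate(lambda x: W_psjf(x) * g(x), 0.0, 40.0)
print(W)  # comes out below the FCFS value 1/(mu - lam) = 2
```

For this instance the unconditioned mean sojourn time falls below the FCFS value of 2, illustrating the benefit of favoring short jobs.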
Wx = Vx + Rx ,
288 Analysis of Queues
where
Vx is the expected time for an arriving customer with service time x to
begin processing by the server (note that until that time, the remaining
processing time is equal to service time x)
Rx is the expected time from when this customer enters the server for the
first time until service is completed (during this time, the server could
get preempted by arriving customers with service time smaller than the
remaining processing time for the customer in question)
As before,

ρ(x) = ∫_0^x λu dG(u).
Further, the probability that an arriving customer will see the server with a
customer whose remaining service time is less than x is β(x), given by

β(x) = ρ(x) + λx(1 − G(x)),

where λx(1 − G(x)) is the long-run fraction of time the server serves customers
with remaining processing time less than x although initially they
had more than x service time to begin with. A busy period of type x is defined
as the continuous stretch of time during which the server only processes
customers with remaining processing time less than x. It is crucial to point
out that if a class x customer arrives during a busy period of type x (note
that this happens with probability β(x)), that customer waits till the busy
period ends to begin service. Of course if this class x customer arrives at
a time other than during a busy period of type x, the customer immedi-
ately gets served by the server. Therefore, by conditioning on whether or
not an arriving class x customer encounters a busy period of type x and then
unconditioning, we get
Vx = β(x)E[Be (x)]
where Be (x) is the remaining time left in the busy period of type x.
Since Be (x) is the equilibrium random variable corresponding to B(x), the
length of the busy period of type x, from renewal theory we can write down
E[Be (x)] as
E[Be(x)] = E[B(x)²] / (2E[B(x)]).
Thus, all we need to compute are E[B(x)2 ] and E[B(x)]. For this we need
another notation τ(x), the remaining service time of the job that initiated a
type x busy period. Note that when a busy period of type x is initiated, there
would be exactly one customer in the system with remaining processing time
not greater than x, and this customer initiates the busy period. This can hap-
pen in two ways: (1) if a customer with service time t such that t < x arrives
when a busy period of type x is not in progress, then this will start a busy
period of type x with τ(x) = t so that the probability that a given customer
initiates a busy period of type x and the initiating customer has remaining
processing time between t and t + dt is (1 − β(x))dG(t); (2) a customer with
an original processing time greater than x initiates a type x busy period as
soon as that customer's remaining time reaches x, so that the probability that
a given customer initiates a busy period of type x and the initiating customer
has remaining processing time exactly x is (1 − G(x)). Therefore, the probability that
a customer initiates a busy period of type x is
(1 − β(x)) ∫_0^x dG(t) + (1 − G(x)) = 1 − G(x)β(x).
We also have

E[τ(x)] = [(1 − β(x)) ∫_0^x t dG(t) + x(1 − G(x))] / (1 − G(x)β(x)),

E[τ(x)²] = [(1 − β(x)) ∫_0^x t² dG(t) + x²(1 − G(x))] / (1 − G(x)β(x)).
The remaining customers that are served in a busy period of type x arrive
after the busy period is initialized and have service times less than x. Let Sx
be the service time of one such customer; then clearly we have

E[Sx] = ∫_0^x t dG(t) / G(x),

E[Sx²] = ∫_0^x t² dG(t) / G(x).
Problem 49
Using the notation and description from the previous text, show that

E[B(x)] = E[τ(x)] / (1 − ρ(x)) = β(x) / (λ(1 − G(x)β(x))),

E[B(x)²] = E[τ(x)²] / (1 − ρ(x))² + λG(x)E[τ(x)] E[Sx²] / (1 − ρ(x))³.
Solution
The busy period B(x) can be computed by selecting an appropriate schedul-
ing discipline. First serve the customer that initializes the busy period and
this takes τ(x) time. During this time τ(x), say N(τ(x)) new customers arrive
with service times smaller than x according to a Poisson process with param-
eter λG(x). After τ(x) time, we serve the first of the N(τ(x)) customers (if
there is one) and all the customers that arrive during this service time that
have service times smaller than x. Note that this time is identical to that of
the busy period of an M/G/1 queue with PP(λG(x)) arrivals and service times
according to Sx . Once this “mini” busy period is complete we serve the sec-
ond (if any) of the N(τ(x)) for another mini busy period and then continue
until all the N(τ(x)) customers’ mini busy periods are complete. Thus, we can
write down
B(x) = τ(x) + Σ_{i=1}^{N(τ(x))} bi(x),    (5.4)
where bi (x) is the ith mini busy period of an M/G/1 queue with PP(λG(x))
arrivals and service times according to Sx . Note that bi (x) over all i are IID
random variables. Based on one of the exercise problems of Chapter 4 on
computing the moments of the busy period of an M/G/1 queue, we have
E[bi(x)] = E[Sx] / (1 − λG(x)E[Sx]),

E[bi(x)²] = E[Sx²] / (1 − λG(x)E[Sx])³.
Therefore, conditioning on τ(x),

E[B(x)|τ(x)] = τ(x) + E[N(τ(x))]E[bi(x)] = τ(x) / (1 − ρ(x)),
E[B(x)²|τ(x)] = τ(x)² + E[N(τ(x))]E[bi(x)²] + 2τ(x)E[N(τ(x))]E[bi(x)]
  + E[N(τ(x)){N(τ(x)) − 1}]{E[bi(x)]}²

= τ(x)² + λG(x)τ(x) E[Sx²] / (1 − λG(x)E[Sx])³
  + 2τ(x)² λG(x) E[Sx] / (1 − λG(x)E[Sx])
  + {λG(x)τ(x)}² {E[Sx] / (1 − λG(x)E[Sx])}²

= τ(x)² / (1 − ρ(x))² + λG(x)τ(x) E[Sx²] / (1 − ρ(x))³.
Unconditioning with respect to τ(x), we obtain

E[B(x)] = E[τ(x)] / (1 − ρ(x)),

E[B(x)²] = E[τ(x)²] / (1 − ρ(x))² + λG(x)E[τ(x)] E[Sx²] / (1 − ρ(x))³.
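As a numerical sanity check (an assumed example, not from the text: λ = 0.5 with exp(1) service times), one can verify that the two expressions for E[B(x)] claimed in Problem 49, namely E[τ(x)]/(1 − ρ(x)) and β(x)/(λ(1 − G(x)β(x))), agree:

```python
from math import exp

# Assumed example (not from the text): lam = 0.5, exp(1) service times,
# so G(t) = 1 - exp(-t) and g(t) = exp(-t).
lam = 0.5
G = lambda t: 1.0 - exp(-t)
g = lambda t: exp(-t)

def integrate(f, a, b, n=4000):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))

def rho(x):      # rho(x) = lam * int_0^x t dG(t)
    return lam * integrate(lambda t: t * g(t), 0.0, x)

def beta(x):     # fraction of time serving jobs with remaining time < x
    return rho(x) + lam * x * (1.0 - G(x))

def E_tau(x):    # mean remaining service time of the busy-period initiator
    num = (1.0 - beta(x)) * integrate(lambda t: t * g(t), 0.0, x) + x * (1.0 - G(x))
    return num / (1.0 - G(x) * beta(x))

x = 1.5
lhs = E_tau(x) / (1.0 - rho(x))                  # E[B(x)] via E[tau(x)]
rhs = beta(x) / (lam * (1.0 - G(x) * beta(x)))   # E[B(x)] via beta(x)
print(lhs, rhs)  # the two expressions agree
```

The agreement is exact (not just up to quadrature error), because the equality reduces algebraically to the definition β(x) = ρ(x) + λx(1 − G(x)).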
Wx = Vx + Rx
   = λ[∫_0^x t² dG(t) + x²(1 − G(x))] / (2(1 − ρ(x))²) + ∫_0^x dt / (1 − ρ(t)).
W = ∫_0^∞ Wx dG(x).
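The expressions above can also be evaluated numerically; this sketch (an assumed M/M/1 instance with λ = 0.5, μ = 1 and grid-based trapezoidal integration, not from the text) computes the conditional sojourn time for a size-x job and unconditions to get W:

```python
from math import exp

# Assumed illustration (not from the text): M/M/1 with lam = 0.5, mu = 1.
lam = 0.5
g = lambda t: exp(-t)           # service-time PDF
G = lambda t: 1.0 - exp(-t)     # service-time CDF

N, X = 4000, 40.0               # integration grid
h = X / N
ts = [i * h for i in range(N + 1)]

rho = [0.0] * (N + 1)           # rho(t) = lam * int_0^t u dG(u)
m2  = [0.0] * (N + 1)           # m2(t)  = int_0^t u^2 dG(u)
res = [0.0] * (N + 1)           # res(t) = int_0^t du / (1 - rho(u))
for i in range(1, N + 1):
    a, b = ts[i - 1], ts[i]
    rho[i] = rho[i - 1] + lam * h * (a * g(a) + b * g(b)) / 2.0
    m2[i]  = m2[i - 1] + h * (a * a * g(a) + b * b * g(b)) / 2.0
    res[i] = res[i - 1] + h * (1.0 / (1.0 - rho[i - 1]) + 1.0 / (1.0 - rho[i])) / 2.0

def W_srpt(i):
    """Mean sojourn time of a job of size ts[i], per the formula above."""
    x = ts[i]
    wait = lam * (m2[i] + x * x * (1.0 - G(x))) / (2.0 * (1.0 - rho[i]) ** 2)
    return wait + res[i]

# Uncondition over G(.) to get the overall mean sojourn time W.
W = sum(h * (W_srpt(i - 1) * g(ts[i - 1]) + W_srpt(i) * g(ts[i])) / 2.0
        for i in range(1, N + 1))
print(W)  # below the FCFS value 1/(mu - lam) = 2
```

As expected for a policy that favors jobs with short remaining work, the resulting W lies below the FCFS value for this instance.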
for good reasons. But it is not just the strictly preemptive policies that are
included; policies such as processor sharing (which are only partially pre-
emptive) should also be considered in that group. Further, when the service
times are revealed upon arrival, it is sometimes called anticipative (although
anticipative could include a much broader class of policies, not just reveal-
ing service times upon arrival). Therefore, as examples of the four classes of
policies we have:
cashier; presumably that is why there is a separate line for customers with
fewer items in a grocery store). Thus, when the service times are known
upon arrival, it appears that a fair thing to do is to make the sojourn times
proportional to the service times. For example, when there is a single class (i.e.,
K = 1), the mean sojourn time for a customer with service time x under
processor sharing scheme is x/(1 − ρ). Thus, the mean sojourn time is pro-
portional to the service time, and it would not be terribly unreasonable to
consider processor sharing as a “fair” policy. That is why many computer
system CPUs adopt roughly a processor sharing regime where the CPU
spends a small time (called time quantum) for each job and switches con-
text to the next job. Interestingly, the preemptive LCFS policy also has mean
sojourn time for a customer with service time x as x/(1 − ρ). However, it
is unclear if preemptive LCFS would be considered “fair” by the customers
although it certainly is for the service provider. Therefore, while determining
an optimal policy for a queueing system (which we will see next), it becomes
crucial to consider the perspectives of both the customers and the service
providers.
The optimal service-scheduling policy in a multiclass queueing system
depends greatly on the choice of objective function. The issue of fair-
ness becomes extraordinarily important if the customers can observe the
queues, in which case FCFS or anything considered fair by the users must
be adopted. However, for the rest of this chapter we assume that the
customers cannot observe the queue; however, the service provider has real-
time access to the queue (usually number in the system, sometimes service
time requirements and amount of service complete). The service provider
thus uses a scheduling policy that would optimize a performance measure
that would strike a balance between the needs of the customers and the
service provider. We consider one such objective function, which is to minimize
the mean number of customers in the system. In particular, let L_i^π
represent the average number of class-i customers in the system in steady
state under policy π (for i = 1, . . . , K). Then, the objective is to
determine the optimal service schedule among all work-conserving policies Π
that minimizes the total number in the system. Hence, our objective
function is

min_{π∈Π} Σ_{i=1}^K L_i^π.
This is indeed the hazard rate function when x = 0. We will use this
relation subsequently. Define the Gittins index κ(a) = max_{x≥0} k(a, x),
and let x∗ = arg max_{x≥0} k(a, x), that is, the value of x that maximizes
k(a, x). Then, the Gittins index policy works as follows: at every
instant of time, serve the customer with the largest κ(a). To imple-
ment this, select the customer with the largest κ(a) and serve this
customer until whichever happens first among the following: (1) the
customer is served for time x∗ , (2) the customer’s service is com-
plete, or (3) a new customer arrives with a higher Gittins index. The
proof that the Gittins index policy is fairly detailed is not presented
here. However, the key idea is to target the application with the
highest probability of completing service within a time x and at the
same time have a low expected remaining service time. Although
not exactly, the k(a, x) measure roughly captures that. We saw ear-
lier that if the service times are known upon arrival, then SRPT is the
best, and when they are unknown, we do what best we can based on
the information we have (which is what Gittins index policy does).
Two special cases of Gittins index policy are discussed subsequently
when the hazard rate functions are monotonic.
3. Service times unknown upon arrival; only one job with partially complete
service allowed: Here we consider the case where it is not possible to
know the service time of each customer upon arrival (i.e., nonan-
ticipative), and it is not possible to have more than one partially
complete service. Interestingly, in this restricted framework every
policy would yield the same W. Therefore, all work-conserving
policies in this restricted framework (such as FCFS, nonpreemptive
LCFS, random order of service, etc.) are optimal. Thus, a policy such
as FCFS which is generally fair would be ideal.
Problem 50
When the service times are unknown upon arrival and many jobs with par-
tially complete service are allowed, then Gittins index policy is optimal.
Show that if the service times are IFR, then the Gittins index policy reduces
to FCFS-like policies that do not allow more than one job to be partially
complete (although that is not a requirement). Also describe the special case
optimal policy when the service times are DFR.
Solution
Refer to Aalto et al. [1] for a rigorous proof; we just provide an outline
based on that paper here. From the definition of k(a, x) given in Equation 5.5,
we have
k(a, x) ∫_0^x (1 − G(a + y)) dy = ∫_0^x g(a + y) dy,
where g(y) is the PDF of the service times, that is, dG(y)/dy. By taking
derivative with respect to x of this equation, we get
[∂k(a, x)/∂x] ∫_0^x (1 − G(a + y)) dy + k(a, x)[1 − G(a + x)] = g(a + x).
We can rewrite this expression in terms of the hazard (or failure) rate
function h(y) defined as
h(y) = [dG(y)/dy] / (1 − G(y)).
clearly the RHS is ≤ (or ≥, respectively) h(a+x) if the service times are IFR (or
DFR, respectively). Therefore, if the service times are IFR, k(a, x) ≤ h(a + x),
and if they are DFR, k(a, x) ≥ h(a + x). However, we showed earlier that
k(a, x) would increase (or decrease, respectively) with respect to x if h(a + x)
is greater than (or less than, respectively) k(a, x). Hence, we can conclude
that if the service times are IFR (or DFR, respectively), k(a, x) would increase
(or decrease, respectively) with respect to x. Thus, we can compute κ(a) =
maxx {k(a, x)} when the service times are IFR and DFR. In particular, we can
show the following:
κ(a) = k(a, ∞) = (1 − G(a)) / ∫_0^∞ [1 − G(a + t)] dt.
This can be used to show that κ(a) increases with a (using a very
similar derivative argument, left as an exercise for the reader). Since
Gittins index policy always picks the job with the largest κ(a), the
optimal policy is to select the job with the largest attained service a.
Consider a queue that is empty; since the discipline is work-conserving,
an arriving customer is immediately served. This customer
is never interrupted because we always serve the customer that has
the largest attained service. When this customer completes service,
if there are many jobs to choose from, any of them can be picked
(since they all have a = 0) and served. Again this customer is never
interrupted until service is complete. Thus, any FCFS-like policy that
does not allow more than one job to be partially complete is optimal.
Note that all of these FCFS-like policies yield the same L or W.
• If service times are DFR
however, the former would result in going back and forth between
the two customers that have equally attained service. Hence the
name FB.
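The two monotone special cases can be sketched numerically (the Erlang(2) and hyperexponential distributions below are assumed examples, not from the text): for IFR service times κ(a) = k(a, ∞) increases with the attained service a, while for DFR service times the index reduces to the hazard rate h(a), which decreases:

```python
from math import exp

# Assumed illustrative distributions: Erlang(2, 1) is IFR; a balanced
# hyperexponential (rates 0.5 and 2, equal mix) is DFR.
Sbar_erl = lambda y: exp(-y) * (1.0 + y)                        # 1 - G(y), Erlang-2
Sbar_hyp = lambda y: 0.5 * exp(-0.5 * y) + 0.5 * exp(-2.0 * y)  # 1 - G(y), hyperexp
g_hyp = lambda y: 0.25 * exp(-0.5 * y) + exp(-2.0 * y)          # hyperexp PDF

def kappa_ifr(Sbar, a, T=100.0, n=20000):
    """IFR case: kappa(a) = k(a, inf) = (1 - G(a)) / int_0^inf (1 - G(a+t)) dt,
    with the integral truncated at T (trapezoidal rule)."""
    h = T / n
    denom = h * (Sbar(a) / 2 + sum(Sbar(a + i * h) for i in range(1, n)) + Sbar(a + T) / 2)
    return Sbar(a) / denom

def kappa_dfr(g, Sbar, a):
    """DFR case: kappa(a) = k(a, 0+) = h(a), the hazard rate."""
    return g(a) / Sbar(a)

print(kappa_ifr(Sbar_erl, 0.0), kappa_ifr(Sbar_erl, 2.0))  # increasing in a
print(kappa_dfr(g_hyp, Sbar_hyp, 0.0), kappa_dfr(g_hyp, Sbar_hyp, 2.0))  # decreasing
```

For Erlang(2, 1) the index works out to (1 + a)/(2 + a), increasing from 1/2 toward 1, which is why serving the job with the largest attained service (an FCFS-like rule) is optimal in the IFR case.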
we saw that the Gittins index policy minimizes L in the case consid-
ered. Would that work here too for a general K? As it turns out, the
functions used in the Gittins index (such as k(a, x) and κ(a)) are class-
dependent. Therefore, one has to be careful in writing down the
Gittins index parameter for each customer in the system. However,
if we are able to do that, then indeed the Gittins index policy would
be optimal (see Theorems 1 and 2 in Aalto et al. [1] where what we
refer to as Gittins index policy is what they call Gittins index quan-
tum policy). The idea of the proof is similar to that when there is a
single class and the reader is encouraged to refer to that. Now we
explain the policy in the general K class case. Let a be the amount of
attained service (i.e., amount of completed service) for an arbitrary
customer in a queue. Then define ki (a, x) for a class-i customer in the
system as
ki(a, x) = ∫_a^{a+x} dGi(y) / ∫_a^{a+x} (1 − Gi(y)) dy,

and xi∗ = arg max_{x≥0} ki(a, x), that is, the value of x that maximizes
ki(a, x), with κi(a) = ki(a, xi∗). Then, the Gittins index policy works as follows: at every
instant of time, serve the customer with the largest κi (a) over all cus-
tomers of all classes i. To implement this, from all the customers
in the system (belonging to various classes) select the one with the
largest κi (a). Say this customer is of class j. Serve this class-j customer
until whichever happens first among the following: (1) the cus-
tomer is served for time x∗j , (2) the customer’s service is complete, or
(3) a new customer arrives with a higher Gittins index.
Now we briefly discuss two special cases of Gittins index policy,
that is, when the hazard rate functions are monotonic: (1) If the ser-
vice time distributions of all K classes are DFR, then the Gittins index
policy reduces to serving the job with the highest failure rate hi (a).
Since the service times are DFR, within a class we always use LAS.
However, across classes we need to compare the hazard rate (or fail-
ure rate) functions of the least attained service customer in each class
and serve the one with the highest failure rate. An interesting case
is when the failure rate functions hi (x) do not overlap, then we can
order the classes according to hazard rate and use a preemptive pri-
ority policy (and LAS within a class) that assigns highest priority
to the class with the highest hazard rate function. (2) If the service
time distributions of all K classes are IFR, then the Gittins index
Σ_{i=1}^K ρi Wiq

polyhedron formed by Σ_{i=1}^K ρi Wiq = Kc, where Kc is a constant that
can be computed for, say, FCFS. This polyhedral feasible region is
the achievable region described earlier. The proof mainly shows
that the nonpreemptive policy is indeed one of the corner points, and
the one that minimizes the objective function.
Reference Notes
Analysis of queues with multiple classes of customers can be approached
from many angles as evident from the literature, namely, based on applica-
tions, based on objectives such as performance analysis versus optimization,
and also theory versus practice. However, a common thread across the
angles is the objective where the server needs to decide which class of cus-
tomer to serve. For that reason, this is also frequently referred to as stochastic
scheduling. The topic of stochastic scheduling has recently received a lot
of attention after a surge of possibilities in computer network applications.
Although the intention of this chapter was to provide a quick review of
results from the last 50 years of work in single-station, single-server, mul-
ticlass queues, a large number of excellent pieces of work had to be left out.
The main focus of this chapter has been to present analytical expressions
for various performance measures under different service-scheduling poli-
cies. These are categorized based on how the customers are classified, that
is, depending on type, location, or service times.
This chapter brings together some unique aspects of queues and the the-
oretical underpinnings for those can be found in several texts. For example,
the fundamental notion of work conservation has been greatly influenced
Exercises
5.1 Consider a repair shop that undertakes repairs for K different
types of parts. Parts of type i arrive to the repair shop accord-
ing to a Poisson process with mean arrival rate λi parts per hour
(i = 1, . . . , K). At a time only one part can be repaired in the
shop with a given mean repair time τi hours and a given stan-
dard deviation of σi hours (i = 1, . . . , K). There is a single waiting
room for all parts. This system can be modeled as a standard
M/G/1 multiclass queue. Use K = 5 and the following numerical
values:
i λi τi σi
1 0.2 1.2 1.0
2 0.3 0.3 0.6
3 0.1 1.5 0.9
4 0.4 1.0 1.0
5 0.2 0.3 0.8
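As a preliminary sanity check on these numbers (a sketch, not part of the exercise statement), one can verify that the shop is stable and compute the mean residual work seen by an arrival, a quantity that appears in the multiclass M/G/1 waiting-time formulas:

```python
# A quick stability check (a sketch; values taken from the table above).
lam = [0.2, 0.3, 0.1, 0.4, 0.2]   # arrival rates lambda_i (parts/hour)
tau = [1.2, 0.3, 1.5, 1.0, 0.3]   # mean repair times tau_i (hours)
sig = [1.0, 0.6, 0.9, 1.0, 0.8]   # repair-time standard deviations sigma_i (hours)

rho = sum(l * t for l, t in zip(lam, tau))       # overall utilization
ES2 = [t * t + s * s for t, s in zip(tau, sig)]  # second moments E[S_i^2]
R = sum(l * e2 for l, e2 in zip(lam, ES2)) / 2   # mean residual work at an arrival
print(rho, R)  # rho = 0.94 < 1, so the system is stable
```

Note that the utilization of 0.94 is high, so waiting times will be quite sensitive to the choice of scheduling policy.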
Φ(z) = Σ_{i=0}^∞ πi z^i.

Please note that you are only asked for Φ(z) and NOT the
individual πj values.
5.4 Consider an M/M/1 queue with nonpreemptive LCFS service dis-
cipline. Nonpreemptive means that a customer in service does not
get replaced by a newly arriving customer. Show that the LST of
the sojourn time in the system in terms of the LST of the busy
period distribution B̃(s) is
E[e^{−sY}] = [(1 − ρ) + ρ B̃(s)] μ/(s + μ)
at time t = T, the system becomes empty for the first time. Then,
T is a random variable known as the busy period. Then, B(·) is
the CDF of the busy period. It is crucial to realize that the busy
period does not depend on the service discipline as long as it is
work conserving. So if we know B̃(s) for FCFS discipline, then we
can compute E[e−sY ] for the nonpreemptive LCFS.]
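One way to sanity-check the displayed LST (a sketch under assumed parameters λ = 0.5, μ = 1, using the standard closed-form busy-period LST of the M/M/1 queue) is to differentiate it numerically at s = 0 and recover the M/M/1 mean sojourn time 1/(μ − λ):

```python
from math import sqrt

# Assumed check values: lam = 0.5, mu = 1.  B_tilde is the standard
# closed-form busy-period LST for the M/M/1 queue.
lam, mu = 0.5, 1.0
rho = lam / mu

def B_tilde(s):
    return (lam + mu + s - sqrt((lam + mu + s) ** 2 - 4.0 * lam * mu)) / (2.0 * lam)

def Y_tilde(s):
    """The displayed LST of the nonpreemptive-LCFS sojourn time."""
    return ((1.0 - rho) + rho * B_tilde(s)) * mu / (s + mu)

h = 1e-6
EY = (Y_tilde(0.0) - Y_tilde(h)) / h  # E[Y] = -d/ds of the LST at s = 0
print(EY)  # approximately 1/(mu - lam) = 2
```

The mean works out to E[S] + ρE[B] = 1 + 0.5 × 2 = 2, matching the fact that all work-conserving nonpreemptive disciplines share the same mean sojourn time.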
5.5 Consider an M/G/1 queue with K classes. Using the expressions
for Wiq for both FCFS and nonpreemptive priority service disci-
plines, show that Σ_{i=1}^K ρi Wiq results in the same expression. In
other words, this verifies that the amount of work in the waiting
area is conserved and is equal to the previous expression.
Further, for the special case of exponential service times for all
K classes, show that the preemptive resume policy also yields
the same expression for Σ_{i=1}^K ρi Wiq by rewriting the expression
for FCFS and nonpreemptive priority using exponential service
times. Although the preemptive resume policy does not satisfy
the condition that there can be at most only one customer with
partially complete service, why does the result hold?
5.6 Answer the following multiple choice questions:
(i) Consider a stable G/G/1 queue with four classes (average
arrival and service rates are λi and μi , respectively, for class
i) using preemptive resume priority. What fraction of time in
the long run is the server busy?
(a) (λ1 + λ2 + λ3 + λ4)/(μ1 + μ2 + μ3 + μ4)
(b) λ1/μ1 + λ2/μ2 + λ3/μ3 + λ4/μ4
(c) 1 − (λ1 + λ2 + λ3 + λ4)/(μ1 + μ2 + μ3 + μ4)
(d) 1/(λ1/μ1 + λ2/μ2 + λ3/μ3 + λ4/μ4)
(ii) Consider an M/G/1 queue with four classes (call them classes
A, B, C, D) such that the average service times for them are 2, 3,
1, and 5 min, respectively. Also, the holding costs for retaining
a customer of classes A, B, C, and D are, respectively, 3, 5, 2,
and 5 dollars per item per minute. If we use a nonpreemptive
priority, what should be the order of priority from highest to
lowest?
(a) D, A, B, C
(b) C, A, B, D
(c) D, B, A, C
(d) C, B, A, D
where C̃i (·) is the LST of the cycle time associated with queue i
in steady state and all other variables are described in Section 5.3.
Using the previous expression, show that
Wiq = (1 + ρi) E[Ci²] / (2E[C]).
5.11 For an M/G/1 queue with a single class, show that if service times
are IFR, then the Gittins index policy parameter κ(a) increases
with a using the expression
κ(a) = (1 − G(a)) / ∫_0^∞ [1 − G(a + t)] dt.
FIGURE 6.1
Example of an acyclic network (nodes 0 through 5).

FIGURE 6.2
Example of a cyclic network (nodes 0 through 5).
Exact Results in Network of Queues: Product Form 313
TABLE 6.1
Conditions to be Satisfied at Node i

No. of Servers    Capacity    Service Time Distribution    Stability
si                ∞           Exponential                  Required
si                si          General                      Not applicable
∞                 ∞           General                      Not applicable
node i and gets served, then upon service completion, the customer joins
node j with probability pij and leaves the network with probability ri . The
queue service in node i must satisfy one of the categories given in Table 6.1
(although we do not consider them in this section, the results also hold if node
i is a single-server queue with processor sharing discipline or LCFS with
preemptive resume policy). If N = 1, the single node case, these corre-
spond to M/M/s, M/G/s/s, and M/G/∞ cases in the order provided in the
table. With that in mind, our first step to analyze the acyclic queueing net-
work is to characterize the output (or departure) process from the M/M/s,
M/G/s/s, and M/G/∞ queues, which would potentially act as input for a
downstream node.
Problem 51
Consider an M/M/1 queueing system with PP(λ) arrivals and exp(μ) ser-
vice time distribution. Assume that λ < μ. Let U be a random variable that
denotes the time between two arbitrarily selected successive departures from
the system in steady state. Show, by conditioning on whether or not the first
of those departures has left the system empty, that U is an exponentially
distributed random variable with mean 1/λ.
Solution
Say a departure just occurred from the M/M/1 queue in steady state. Let
X denote the time of the next arrival and Y denote the service time of the
next customer. We would like to obtain the CDF of U, the time of the next
departure. Define F(x) = P(U ≤ x). Also, let Z be a random variable such that
Z = 0 if there are no customers in the system currently (notice that a depar-
ture just occurred), and Z = 1 otherwise. If Z = 0 then U = X + Y, otherwise
U = Y. Recall that πj = π∗j = pj for all j, that is, the probability there are j in
the system as observed by a departing customer in steady state would be the
same as that of an arriving customer as well as the steady-state probability
that there are j in the system. We also know that p0 = 1 − ρ where ρ = λ/μ.
Therefore, we have the LST of F(x) by conditioning on Z as
pi qij = pj qji
which is a direct artifact of the balance equations resulting from the arc cuts
for consecutive nodes i and j (otherwise qij = 0).
One of the implications of reversible processes is that if the system is
observed backward in time, then one cannot tell the difference in the queue
length process. Thus the departure epochs would correspond to the arrival
epochs in the reversed process and vice versa. Therefore, the departure
process would be stochastically identical to the arrival process, which is
a Poisson process. In fact, for the M/G/∞ and M/G/s/s queues as well,
the departures are according to a Poisson process. We had indicated in
Chapter 4 that if we define an appropriate Markov process for the M/G/s/s
queue (note that when s = ∞ we get the M/G/∞ queue, so the same result
holds), then that process is reversible. Due to reversibility, the departures
from the original system correspond to arrivals in the reversed system.
Therefore, the departure process from the M/G/s/s queue is a Poisson pro-
cess with rate (1 − ps )λ departures per unit time on average, where ps is
the probability that there are s in the system in steady state (i.e., zero when
s = ∞), that is, the probability of a potential arrival is rejected.
It is indeed strange that for the stable M/M/s queue and the M/G/∞
queue, the output process is not affected by the service process. Of course,
this is incredibly convenient in terms of analysis. Poisson processes have
other extremely useful properties, such as superpositioning and splitting,
that are conducive for analysis, which we will see next. Before that it is
worthwhile to point out that in the M/G/s/s case, through ps the departure
process does depend on the mean service rate. However, the distribution
of service times has no effect on the departure process in steady state. Hav-
ing said that, except for a small example we will consider in the next section,
until we reach Section 6.4.4 on loss networks, we will only consider infinite
capacity stable queues and not consider any rejections.
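A quick Monte Carlo check of Problem 51 (the parameters λ = 0.5, μ = 1 and the event loop below are assumptions for illustration): in steady state, the interdeparture times of an M/M/1 queue should look like exp(λ) draws, with mean 1/λ and variance 1/λ²:

```python
import random

# Simulate an M/M/1 queue (assumed lam = 0.5, mu = 1) and examine the
# steady-state interdeparture times; Problem 51 says they are exp(lam).
random.seed(1)
lam, mu = 0.5, 1.0
t, n = 0.0, 0                      # clock and number in system
next_arr = random.expovariate(lam)
next_dep = float('inf')
deps = []                          # departure epochs
for _ in range(200_000):
    if next_arr < next_dep:        # next event: arrival
        t = next_arr
        n += 1
        if n == 1:                 # server was idle; start a service
            next_dep = t + random.expovariate(mu)
        next_arr = t + random.expovariate(lam)
    else:                          # next event: departure
        t = next_dep
        deps.append(t)
        n -= 1
        next_dep = t + random.expovariate(mu) if n > 0 else float('inf')

gaps = [b - a for a, b in zip(deps, deps[1:])][5_000:]   # drop warm-up
mean = sum(gaps) / len(gaps)
var = sum((x - mean) ** 2 for x in gaps) / len(gaps)
print(mean, var)  # should be near 1/lam = 2 and 1/lam^2 = 4
```

Note that gaps spanning idle periods are deliberately included; it is precisely the mixture of "busy" and "idle" gaps that produces the memoryless exp(λ) interdeparture distribution.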
FIGURE 6.3
Merge, flow through a stable queue with s exponential servers, and split: streams PP(λ1), . . . , PP(λn) merge into PP(λ); the departure process is again PP(λ) and splits with probabilities q1, . . . , qk into streams PP(λq1), . . . , PP(λqk).
Problem 52
Consider the acyclic network in Figure 6.1. Say customers arrive externally at
nodes 0, 2, and 4 according to Poisson processes with mean rates 13, 12, and
15 customers per hour, respectively. After service at nodes 3, 4, and 5, a cer-
tain fraction of customers exit the network. After service at all nodes i (such
that 0 ≤ i ≤ 5), customers choose with equal probabilities among the options
available. For example, after service at node 4, with an equal probability of
1/3, customers choose nodes 3 or 5 or exit the network (while customers that
complete service at nodes 0, 1, and 2 do not immediately exit the network).
Further, node 0 has two servers but a capacity of 2 and generally distributed
service times. Nodes 1, 3, and 5 are single-server nodes with exponentially
distributed service times and infinite capacity. Node 2 is a two-server node
with exponentially distributed service times and infinite capacity. Node 4
is an infinite-server node with generally distributed service times. Assume
that the mean service times (in hours) at nodes 0, 1, 2, 3, 4, and 5 are 1/26,
1/15, 1/10, 1/30, 1/7, and 1/20, respectively. Compute the average number
of customers in each node in steady state as well as the overall number in the
network.
Solution
We consider node by node and derive Lj , the average number of customers
in node j (for 0 ≤ j ≤ 5).
Node 0: Arrivals to node 0 are according to PP(13). Since there are two
servers and capacity of 2, this node can be modeled as an M/G/2/2 queue.
The probability that this node is full is p2 and is given by
p2 = [(1/2)(13/26)²] / [1 + (13/26) + (1/2)(13/26)²] = 1/13.
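The computation of p2 is an instance of the Erlang loss (Erlang-B) formula; here is a minimal sketch (the function name and the mean-number formula L0 = a(1 − p2), i.e., accepted load, are additions for illustration):

```python
from math import factorial

def erlang_b(s, a):
    """Blocking probability in an M/G/s/s queue with offered load a = lam/mu."""
    terms = [a ** k / factorial(k) for k in range(s + 1)]
    return terms[-1] / sum(terms)

a = 13.0 / 26.0           # offered load at node 0
p2 = erlang_b(2, a)       # = 1/13, as computed above
L0 = a * (1.0 - p2)       # mean number in node 0: accepted throughput x E[S]
print(p2, L0)
```

The same function covers any of the loss nodes that satisfy the M/G/s/s row of Table 6.1, since the blocking probability depends on the service times only through their mean.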
Interestingly, the above example does not fully illustrate the properties
of Poisson-based acyclic queueing networks in all its glory. In fact, one of
the most unique properties that these acyclic networks satisfy and that is not
satisfied by the cyclic networks (that we will discuss in the next section) is
given in the next remark.
Remark 11
Consider an acyclic network that has N nodes with external arrivals accord-
ing to a Poisson process and conditions in Table 6.1 satisfied. Assume that
the network is in steady state at time 0. Let Xi (t) be the number of customers
in node i at time t for 1 ≤ i ≤ N and t ≥ 0. Then for any j and u such that
1 ≤ j ≤ N, j ≠ i, and u ≥ 0, the two random quantities Xi(t) and Xj(u) are
independent.
This remark enables us to decompose the queueing network and study one
node at a time (of course, the right order should be picked) without wor-
rying about dependence between them. Further, if one were to derive the
sojourn time distribution for a particular customer, then it would just be
the sum of sojourn times across each node in its path, which are all inde-
pendent. For example, a customer that enters node 2 in Problem 52, goes
to node 3, then to 5, and exits the network would have a total sojourn time
T equal to the sum of the times in nodes 2, 3, and 5, say T2 , T3 , and T5 ,
respectively. Then since T2 , T3 , and T5 are independent, we can compute the
LST of T as
E[e^{−sT}] = E[e^{−s(T2+T3+T5)}] = E[e^{−sT2}] E[e^{−sT3}] E[e^{−sT5}].
Remark 12
Consider a stable G/G/m queue where many servers are busy. The arrivals
are according to a renewal process and the servers are identical with ser-
vice times according to a general distribution. Whitt [104] shows that the
departures from such a queue are approximately a Poisson process.
he worried about was to make sure that the expectations of the parent com-
pany from which he franchised the motels were being met. However, in this
day and age where customers can easily check reviews on their smart phones
while they are traveling, it has become extremely important to provide excel-
lent service individually (not just overall). Cleve thought to himself that a
good number of his prospective clients are going to search on their smart-
phones and it would be of paramount importance to have good reviews.
Subsequently, Cleve brainstormed with Vineet opportunities for enhanc-
ing customer satisfaction. Based on that, three main ideas emerged: (1) per-
form a complementary multipoint inspection at the end of a service/repair
for all customers (a vehicle inspector would need to be hired for that);
(2) offer guarantees such as if you do not get your vehicle back within some τ
minutes, the service is free; and (3) install an additional bay and hire an addi-
tional automotive technician. It was not clear to Cleve, what the benefits of
these improvements would be. He knew for sure that idea (1) needs to be
done because all the dealership service stations are providing that inspec-
tion, and to stay competitive, Carfix must also offer it. Cleve told Vineet that
once he figures out the best option, he would be ready for another round
of golf with Vineet to discuss what kind of discounts he could provide his
customers to stay at one of Vineet’s motels.
One of Cleve’s nieces, Lauren, was a senior in Industrial Engineering who
Cleve thought could help him analyze the various options. When Lauren
heard about the problem, she was excited. She talked to her professor to find
out if he would allow her and a couple of her friends to work on Cleve’s
problem for their capstone design course. The professor agreed, in fact
he was elated because it would solve the mismatch he was encountering
between the number of students and projects. When Lauren and her friends
arrived at Carfix they found out that there was no historical data, so they
spent a few hours collecting data. Interestingly, when the students were at
Carfix, Cleve also had two inspector-candidates who were going to inter-
view for the multipoint inspection position. For the interview, Cleve asked
the two candidates to perform inspections on a few vehicles.
to be greeted. That is because if the greeter was busy when a new vehicle
arrived, the cashier or Cleve himself would go and greet.
At the second stage, technicians would pick up the parked vehicle in the
order they arrived and take them to their bay. There were four technicians,
each with his or her own bay. The bays were equipped to handle all repairs
and service operations. After completing the repair, the technicians would
return the vehicles back to the parking area. The time between when a tech-
nician would pick up a vehicle from where it is parked till he or she would
drop that vehicle off is exponentially distributed with mean about 36 min.
Given the different types of services and repairs as well as the types of vehicles,
it did not surprise Lauren and her friends that a high-variability distribution
such as the exponential fit well. Then in the third stage, the inspector would
perform a multipoint inspection at the parking lot and send a report.
Among the two inspectors interviewed, inspector A had a mean of 6 min
and a standard deviation of 4.24 min, whereas inspector B had a mean of
7 min and a standard deviation of 3.5 min. Although the number of sample
points was small, Lauren and her friends felt comfortable using a gamma
distribution for the inspection times. They also realized that it was not
necessary to include the time at the cashier because what the customers really
cared about was the time between when they dropped off the car and when
they heard from Cleve or another staff member (on days Cleve was out golfing)
that the repair or service was complete. So Lauren and friends decided on using
a three-stage tandem system to represent Carfix’s shop.
Since there was never a queue buildup, Lauren and friends modeled
the first stage as an M/G/∞ queue with PP(λ) arrivals and Unif (a, b) service
times, where λ = 4.8 per hour, a = 1 min, and b = 2 min. Clearly, the
departures from this queue would be PP(λ) and would act as arrivals to the
second stage. They modeled the second stage as an M/M/4 queue with PP(λ)
arrivals and exp(μ) service times, where μ = 5/3 per hour. Since the departure
process from a stable M/M/s queue is a Poisson process, they modeled the third
stage as an M/G/1 queue with PP(λ) arrivals, and mean and variance of
service times depending on whether inspector A or B is used.
The sojourn time Y1 at stage 1 is simply the Unif (1, 2) service time (in minutes), whose LST is

E[e^{−s Y1}] = (e^{−s} − e^{−2s})/s = e^{−s} (1 − e^{−s})/s.
322 Analysis of Queues
At this point, Lauren and friends were not sure if the LST was necessary
but felt that it was good to keep it in case they were to compute the LST
of the total sojourn time in the system. Then, using the fact that λ = 4.8 per
hour, the average number of vehicles in stage 1 is λ ∗ 1.5/60 = 0.12. Thus the
steady-state probability that there are no more than two vehicles in stage 1
is (1 + 0.12 + (0.12)2 /2!)e−0.12 = 0.9997, which clearly justifies the use of the
M/G/∞ model. In fact, it shows how rarely Cleve would have to greet a
customer (although it appears that the cashier would have to greet only about
one in 100 vehicles on average, since the probability of zero or one vehicle at
stage 1 is 0.9934, in reality it was a lot more often because the greeter was
called upon to run odd jobs from time to time).
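As a quick check of these numbers, recall that the steady-state number in an M/G/∞ queue is Poisson distributed with mean λE[S] regardless of the service distribution; a minimal sketch using only the values above:

```python
import math

# M/G/infinity: steady-state number in system is Poisson with mean lam*E[S].
lam = 4.8 / 60.0            # arrival rate per minute
mean_service = 1.5          # E[S] for Unif(1, 2) min service
load = lam * mean_service   # = 0.12 vehicles on average in stage 1

def poisson_cdf(k, mean):
    """P(N <= k) for N ~ Poisson(mean)."""
    return math.exp(-mean) * sum(mean**n / math.factorial(n) for n in range(k + 1))

print(round(poisson_cdf(2, load), 4))  # P(no more than 2 vehicles) -> 0.9997
print(round(poisson_cdf(1, load), 4))  # P(zero or one vehicle)     -> 0.9934
```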
Lauren and friends next considered stage 2, which they modeled as an
M/M/4 queue with PP(λ) arrivals and exp(μ) service. Plugging in λ = 4.8
per hour and μ = 5/3 per hour, they got a traffic intensity ρ = λ/(4μ) = 0.72 at
stage 2. Clearly, this is stable, albeit not a low traffic intensity. It does appear
that adding a new bay (thus an M/M/5 queue) would significantly reduce the
traffic intensity to 0.576, resulting in better customer service. Starting with
the present M/M/4 system, Lauren and friends calculated the mean sojourn
time in stage 2 as W2 = 50.79 min (then for the M/M/5 system they calculated
that it would be 39.52 min). Also, with s servers, the sojourn time Y2 had a
CDF F2 (y) = P{Y2 ≤ y} given by

F2 (y) = Σ_{j=0}^{s−1} p_j (1 − e^{−μy}) + p_0 ((λ/μ)^s / s!) [ (sμ / ((s − 1)μ − λ)) (1 − e^{−μy}) − (sμ^2 / ((sμ − λ)[(s − 1)μ − λ])) (1 − e^{−(sμ−λ)y}) ]

where p_j denotes the steady-state probability of j customers in the M/M/s queue and

p_0 = [ Σ_{n=0}^{s−1} (1/n!) (λ/μ)^n + ((λ/μ)^s / s!) (1 / (1 − λ/(sμ))) ]^{−1}.
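The W2 values above can be reproduced from the standard M/M/s (Erlang-C) formulas; a sketch using only λ, μ, and s from the text:

```python
import math

def mms_sojourn_mean(lam, mu, s):
    """Mean sojourn time W = Wq + 1/mu for a stable M/M/s queue."""
    r = lam / mu
    rho = r / s
    p0 = 1.0 / (sum(r**n / math.factorial(n) for n in range(s))
                + r**s / (math.factorial(s) * (1 - rho)))
    lq = p0 * r**s * rho / (math.factorial(s) * (1 - rho)**2)  # mean queue length
    return lq / lam + 1 / mu                                   # Wq + service time

lam, mu = 4.8, 5.0 / 3.0                            # per hour
print(round(60 * mms_sojourn_mean(lam, mu, 4), 2))  # M/M/4 -> 50.79 min
print(round(60 * mms_sojourn_mean(lam, mu, 5), 2))  # M/M/5 -> 39.52 min
```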
Next, for stage 3, the M/G/1 sojourn time Y3 has LST (with G̃(s) denoting the LST of the inspection time distribution and time measured in minutes)

E[e^{−s Y3}] = (1 − ρ3) s G̃(s) / (s − (λ/60)(1 − G̃(s)))

where the traffic intensity ρ3 = 0.1λ since the mean inspection time is 6 min,
that is, 0.1 h. Inverting the LST, Lauren and friends computed the stage-3
sojourn time (in min) CDF as

P{Y3 ≤ y} = 1.3724 (1 − e^{−0.1252y}) − 0.3724 (1 − e^{−0.4615y}).
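Although the text inverts the LST to obtain the full distribution, the mean stage-3 sojourn time for each inspector follows directly from the Pollaczek-Khinchine formula; a sketch using the inspector data from the text:

```python
# Pollaczek-Khinchine formula for the mean sojourn time in an M/G/1 queue:
# W = E[S] + lam * E[S^2] / (2 * (1 - rho)), with rho = lam * E[S].
def mg1_sojourn_mean(lam, mean_s, sd_s):
    es2 = sd_s**2 + mean_s**2        # second moment of the service time
    rho = lam * mean_s
    return mean_s + lam * es2 / (2 * (1 - rho))

lam = 4.8 / 60.0                       # arrivals per minute
WA = mg1_sojourn_mean(lam, 6.0, 4.24)  # inspector A
WB = mg1_sojourn_mean(lam, 7.0, 3.5)   # inspector B
print(round(WA, 2), round(WB, 2))      # -> 10.15 12.57
```

Note that inspector B, despite the lower standard deviation, yields a larger mean sojourn time because of the longer mean inspection time.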
Using these results, Lauren and friends carried out the required what-if analysis
so that Cleve could use a cost-benefit analysis to determine the best alternatives.
Eventually, Cleve ended up implementing all the recommendations Lauren and
friends made. As expected, this improved customer satisfaction as well as
increased demand. However, Cleve was well positioned for that higher demand.
To analyze the open Jackson network, one of the preliminary results is flow
conservation and stability, which we describe next.
Exact Results in Network of Queues: Product Form 325
Let aj denote the total (effective) arrival rate into node j. Then flow conservation gives

a_j = λ_j + Σ_{i=1}^{N} a_i p_ij,  ∀ j = 1, 2, . . . , N.   (6.1)
This result is due to the fact that the total arrival rate into node j equals the
external arrival rate λj plus the sum of the departure rates from each node i
(for i = 1, . . . , N) times the fraction that are routed to j (i.e., pij ). Therefore,
let a = (a1 , a2 , . . . , aN ) be the resulting row vector that we need to obtain.
We can rewrite Equation 6.1 for the aj in matrix form as a = λ + aP, where
λ = (λ1 , λ2 , . . . , λN ) is a row vector. Then a can be solved using

a = λ(I − P)^{−1}.   (6.2)

Note that for this result we require (I − P) to be invertible. Unlike the acyclic
networks where aj values are easy to compute, when the networks are cyclic,
one may have to rely on Equation 6.2.
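For a cyclic network where inverting (I − P) by hand is inconvenient, the traffic equations a = λ + aP can also be solved by successive substitution, which converges whenever every customer eventually leaves (P substochastic with spectral radius below one). A minimal sketch; the one-node feedback example is hypothetical, chosen so that Equation 6.2 gives a = λ/p:

```python
# Solve a = lam + a P by fixed-point iteration.
def traffic_rates(lam, P, iters=2000):
    N = len(lam)
    a = list(lam)
    for _ in range(iters):
        a = [lam[j] + sum(a[i] * P[i][j] for i in range(N)) for j in range(N)]
    return a

# Single node whose customers reenter with probability 1 - p = 0.4,
# so the effective arrival rate should be lam / p = 2 / 0.6.
a = traffic_rates([2.0], [[0.4]])
print(round(a[0], 6))  # -> 3.333333
```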
Now that aj values can be obtained for all j, we are in a position to state
the stability condition. For any j (such that 1 ≤ j ≤ N), node j is stable if
aj < sj μj .
Then we check if each queue is stable. If all queues are stable then we can
conclude that the flow rates through node j are indeed aj . For the open Jack-
son network described earlier, the objective is to derive the joint distribution
as well as marginal distribution (and moments) of the number of customers
in each queue. We do that next and describe a product form for the joint
distribution and thereby derive the marginals.
To obtain p(x) we consider the balance equations (flow out equals flow
in) for x = (x1 , x2 , . . . , xN ) just like we would do for any CTMC. To make
our notation crisp, we write the balance equations in terms of ei , which is
a unit vector with one as the ith element and zeros everywhere else. For
example, if N = 4, then e2 = (0, 1, 0, 0). Also for notational convenience we
denote p(x) as zero if any xj < 0. Thus, the generic balance equation takes
the form
p(x) [ Σ_{i=1}^{N} λ_i + Σ_{i=1}^{N} min(x_i , s_i ) μ_i ]

 = Σ_{i=1}^{N} p(x − e_i ) λ_i + Σ_{i=1}^{N} p(x + e_i ) r_i min(x_i + 1, s_i ) μ_i + Σ_{j=1}^{N} Σ_{i=1}^{N} p(x + e_i − e_j ) p_ij min(x_i + 1, s_i ) μ_i .
To explain this briefly, note that the LHS includes all the transitions out of
state x, which include any external arrivals or service completions. Likewise,
the RHS includes all the transitions into state x, that is, external arrivals
as well as service completions that lead to exiting the network or joining
other nodes. If this is not clear, it may be worthwhile for the reader to try an
example with a small number of nodes before proceeding further.
It is mathematically intractable to directly solve the balance equations
to get p(x) for all x except for special cases. However, since we know that
there is a unique solution to the balance equations, if we find a solution then
that is the solution. In that spirit, consider an acyclic open Jackson network
for which from an earlier section we can compute p(x). For the acyclic open
Jackson network, node j (such that 1 ≤ j ≤ N) would be an M/M/sj queue
with PP(aj ) arrivals, exp(μj ) service and sj servers (if the stability condition
at each node j is satisfied, that is, aj < sj μj ). Hence, it is possible to obtain the
steady-state probability of having n customers in node j, which we denote as
φj (n). Using the M/M/s queue results in Chapter 2, we have
φ_j (n) = (1/n!) (a_j /μ_j )^n φ_j (0)   if 0 ≤ n ≤ s_j − 1,
φ_j (n) = (1/(s_j ! s_j^{n−s_j})) (a_j /μ_j )^n φ_j (0)   if n ≥ s_j ,   (6.3)

where

φ_j (0) = [ Σ_{n=0}^{s_j −1} (1/n!) (a_j /μ_j )^n + ((a_j /μ_j )^{s_j} / s_j !) (1/(1 − a_j /(s_j μ_j ))) ]^{−1} .
Our guess for the generic open Jackson network is the product form p(x) = φ1 (x1 )φ2 (x2 ) . . . φN (xN ),
where φj (n) is given by Equation 6.3. We need to verify that this p(x) satisfies
the balance equation
p(x) [ Σ_{i=1}^{N} λ_i + Σ_{i=1}^{N} min(x_i , s_i ) μ_i ]

 = Σ_{i=1}^{N} p(x − e_i ) λ_i + Σ_{i=1}^{N} p(x + e_i ) r_i min(x_i + 1, s_i ) μ_i + Σ_{j=1}^{N} Σ_{i=1}^{N} p(x + e_i − e_j ) p_ij min(x_i + 1, s_i ) μ_i .
For that, notice first of all that if p(x) = φ1 (x1 )φ2 (x2 ) . . . φN (xN ), then

p(x)/p(x ± e_i ) = φ_i (x_i )/φ_i (x_i ± 1)   and   p(x + e_i − e_j )/p(x) = (φ_i (x_i + 1) φ_j (x_j − 1)) / (φ_i (x_i ) φ_j (x_j ))

for all i and j. In addition, from the definition of φi (n) in Equation 6.3, we can
obtain the following:
Σ_{i=1}^{N} λ_i + Σ_{i=1}^{N} min(x_i , s_i ) μ_i

 = Σ_{i=1}^{N} (min(x_i , s_i ) μ_i / a_i ) λ_i + Σ_{i=1}^{N} a_i r_i + Σ_{j=1}^{N} Σ_{i=1}^{N} min(x_j , s_j ) μ_j (a_i / a_j ) p_ij

 = Σ_{i=1}^{N} (min(x_i , s_i ) μ_i / a_i ) λ_i + Σ_{i=1}^{N} a_i r_i + Σ_{j=1}^{N} (min(x_j , s_j ) μ_j / a_j ) Σ_{i=1}^{N} a_i p_ij

 = Σ_{i=1}^{N} (min(x_i , s_i ) μ_i / a_i ) λ_i + Σ_{i=1}^{N} a_i r_i + Σ_{j=1}^{N} (min(x_j , s_j ) μ_j / a_j ) (a_j − λ_j )

 = Σ_{i=1}^{N} a_i r_i + Σ_{j=1}^{N} min(x_j , s_j ) μ_j

where the third equality can be derived using Σ_{i=1}^{N} a_i p_ij = a_j − λ_j ,
which is directly from Equation 6.1. Since the other two terms cancel,
p(x) = φ1 (x1 )φ2 (x2 ) . . . φN (xN ) is the solution to the balance equation if

Σ_{i=1}^{N} λ_i = Σ_{i=1}^{N} a_i r_i .
This equation is true because (using the notation e as a column vector of ones)
from Equation 6.2 we have

λe = a(I − P)e,

⇒ Σ_{i=1}^{N} λ_i = Σ_{i=1}^{N} a_i (1 − Σ_{j=1}^{N} p_ij ),

⇒ Σ_{i=1}^{N} λ_i = Σ_{i=1}^{N} a_i r_i .
Thus, p(x) = φ1 (x1 )φ2 (x2 ) . . . φN (xN ) satisfies the balance equations for a
generic open Jackson network. In other words, the steady-state joint proba-
bility distribution of having x1 in node 1, x2 in node 2, . . ., xN in node N is
equal to the p(x), which is the product of the φj (xj ) values for all j. Hence,
this result is known as product form. Now to get the marginal distribution
that queue j has xj customers in steady state for some j such that 1 ≤ j ≤ N,
all we have to do is sum over all x keeping xj a constant. Thus the marginal
probability that node j has xj in steady state is given by φj (xj ). Notice that the
joint probability is the product of the marginals. From a practical standpoint,
this is extremely convenient because we can model each node j as though it
is an M/M/sj queue (although in reality it may not be) with PP(aj ) arrivals
and exp(μj ) service times. Then we can obtain φj (xj ) and then get p(x). Also,
the steady-state expected number in node j, Lj , can be computed using the
M/M/sj results as well. Then L can be computed by adding over all j from
1 to N. Similarly, it is possible to obtain performance measures at node j
such as the average waiting time (Wj ), time in queue not including service (Wjq ),
and number in queue not including service (Ljq ), using the single-station
M/M/s queue analysis in Chapter 2. However, while computing something
like sojourn time distribution (across a single node or the network), one has
to be more careful. In the next remark we illustrate some of the issues that
have been nicely described in Disney and Kiessler [26].
Remark 13
Consider an open Jackson network (with cycles) that is stationary (i.e., in
steady state) at time 0. For this network the following results hold:
6.2.3 Examples
In this section, we present four examples to illustrate the approach to obtain
performance measures in open Jackson networks, discuss design issues, and
present two well-known paradoxes.

Problem 53
Bay-Gull Bagels is a bagel store in downtown College Taste-on. See
Figure 6.4 for a schematic representation of the store as well as numerical
values used for arrivals, service, as well as routing probabilities. Assume
all arrival processes are Poisson and service times are exponential. Aver-
age arrival rates and average service times for each server are given in the
figure. Assume there are infinite servers whenever a station says self-service.
Also assume all queues have infinite capacity. Model the system as an open
Jackson network and obtain the average number of customers in each of
the five stations. Then state how many customers are in the system on an
average.
Solution
The system can be modeled as an open Jackson network with N = 5 nodes or
stations. Let the set of nodes be {B, S, D, C, E} (as opposed to numbering them
from 1 to 5) denoting the five stations: bagels, smoothies, drinks, cashier,
and eat-in. Note that external arrival processes are Poisson and they occur
at stations B, S, and D. The service times are exponentially distributed. Note
the cycle D–C–E–D. Thus we have an open Jackson network with a cycle.
There are three servers at node B, two servers at node S, ∞ servers at node
FIGURE 6.4
Schematic of Bay-Gull Bagels: the five stations (bagels: three servers; special smoothies: two servers, 4.5 min; drinks, namely, coffee: self-service, 2 min; cashier; eat-in: self-service, 20 min) with their mean service times, external arrival rates, and routing percentages.
D, two servers at node C, and ∞ servers at node E. The external arrival rate
vector λ = [λB λS λD λC λE ] is given by the rates in Figure 6.4 (with λC = λE = 0,
since external arrivals occur only at stations B, S, and D).
We assume that after getting served at the bagel queue, each customer
chooses node S, C, and D with probabilities pBS = 0.2, pBC = 0.5, and
pBD = 0.3, respectively. Likewise, after getting served at C, customers go to
node E or exit the system with equal probability, that is, pCE = rC = 0.5. Similarly,
after service at node E, customers enter the drinks node with probability
pED = 0.05 or exit the network with probability rE = 0.95. Also, pSC = 1 and pDC = 1.
Thus the routing probabilities in the order {B, S, D, C, E} are
P =
[ 0    0.2  0.3   0.5  0
  0    0    0     1    0
  0    0    0     1    0
  0    0    0     0    0.5
  0    0    0.05  0    0   ].
Say, in the previous problem, we were to find the probability that there
are 3 customers in node B, 2 in S, 1 in D, 4 in C, and 20 in E. Then that
is equal to the product φB (3)φS (2)φD (1)φC (4)φE (20) and can be computed
using Equation 6.3. Note that the φj (n) computation is indeed equal to the
probability that there are n in a steady-state M/M/sj queue with PP(aj ) arrivals
and exp(μj ) service. Having described that, we next present an example that
discusses some design issues by comparing various ways of setting up systems
with multiple stages and servers.
Problem 54
Consider a system into which customers arrive according to a Poisson pro-
cess with parameter λ. Each customer needs N stages of service and each
stage takes exp(μ) amount of time. There are N servers in the system and N
buffers for customers to wait. Assume that the buffers have infinite capacity
and λ < μ. There are two design alternatives to consider:
1. Serial system: Each set of buffer and server is placed serially in a tan-
dem fashion as described in Figure 6.5. Each node corresponds to a
different stage of service. Customers arrive at the first node accord-
ing to PP(λ). There is a single server that takes exp(μ) time to serve
after which the customer goes to the second node. At the second
node there is a single server that takes exp(μ) time. This continues
until the Nth node and then the customer exits.
FIGURE 6.5
System of N buffers and servers in series (arrivals PP(λ) into the first node).
2. Parallel system: The N buffers and servers are placed in parallel as
described in Figure 6.6. Each arriving customer is routed to one of
the N nodes with probability 1/N, so that each node sees PP(λ/N)
arrivals. The single server at a node performs all N stages of service
for that customer, one stage after another, each taking exp(μ) time,
after which the customer exits.

FIGURE 6.6
System of N buffers and servers in parallel (the PP(λ) arrival stream is split with probability 1/N to each node, yielding PP(λ/N) arrivals at each node).
By comparing the mean sojourn time for an arbitrary customer in steady state
in both systems, determine whether the serial or parallel system is better.
Solution
Note that although the figures appear to be somewhat different, the
resources of the system and service needs of customers are identical. In other
words, in both systems we have N single-server queues each with an infinite
buffer. Also, in both systems the customers experience N stages of service,
each taking an exp(μ) time. We now analyze the system in the same order
they were presented.
1. The serial system is an open Jackson network (in fact an acyclic net-
work) with N single-server nodes back to back. Service time at each
node is exp(μ). Such a serial system is called a pipeline system in the
computer science literature and tandem network in the manufactur-
ing literature. Clearly, each queue is an M/M/1 queue with PP(λ)
arrivals and exp(μ) service. The average time in each node is thus
1/(μ − λ). Thus, the mean sojourn time in the system is
W_series = N/(μ − λ).

2. For the parallel system, each node is an M/G/1 queue with PP(λ/N)
arrivals and service consisting of N back-to-back exp(μ) stages (an
Erlang service time with mean N/μ), so using the Pollaczek-Khinchine
formula,

W_parallel = N/μ + λ(N + 1)/(2μ(μ − λ)) = (2Nμ − Nλ + λ)/(2μ(μ − λ)).
Comparing Wseries and Wparallel , if N > 1 then Wseries > Wparallel ; however, if
N = 1 then Wseries = Wparallel . Thus the parallel system is better.
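The comparison can be sketched numerically (parameter values assumed for illustration):

```python
def w_series(N, lam, mu):
    # N M/M/1 queues in tandem, each seeing PP(lam) arrivals and exp(mu) service
    return N / (mu - lam)

def w_parallel(N, lam, mu):
    # N M/G/1 queues, each with PP(lam/N) arrivals and N back-to-back exp(mu) stages
    return N / mu + lam * (N + 1) / (2 * mu * (mu - lam))

N, lam, mu = 4, 0.8, 1.0   # assumed values with lam < mu
print(round(w_series(N, lam, mu), 2), round(w_parallel(N, lam, mu), 2))  # -> 20.0 14.0
assert w_parallel(N, lam, mu) < w_series(N, lam, mu)
```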
In practice, though, a fully parallel arrangement (in which every server performs
all N tasks) may be undesirable from a training and material handling standpoint
as well (or too expensive if the time consumption is to be reduced). Therefore, in
practice, sometimes a combination of serial and parallel tasking is used. Usually
servers are trained in two to four tasks
that they perform together. This not only improves the system performance
but also reduces monotonous conditions. Having described that, the next
two examples are paradoxes that further help understand issues in queueing
networks.
Problem 55
Braess’ paradox: In a network, does adding extra capacity always improve
the system in terms of performance? Although intuition suggests it should,
adding extra capacity to a network in which the moving entities selfishly choose
their routes can in some cases worsen the overall performance! Illustrate this
using an example.
Solution
Consider a network with nodes A, B, C, and D. There are directed arcs from
A to B, B to D, A to C, and C to D. Customers arrive into node A according
to a Poisson process with mean rate 2λ. The customers need to reach node D
and they have two paths, one through B and the other through C, as shown
in Figure 6.7. Along the arc from A to B there is a single-server queue with
exponentially distributed service times (and mean 1/μ). Likewise, there is an
identical queue along the arc from C to D. In addition, it takes a deterministic
time of 2 units to traverse arcs AC and BD. Assume that
μ > λ + 1.
FIGURE 6.7
Travel times along arcs in equilibrium (A→B and C→D: single-server queues with mean travel time 1/(μ − λ); B→D and A→C: deterministic travel time 2; arrivals PP(2λ) into A).

In equilibrium, the arriving customers split equally between the two paths, so
each queue sees PP(λ) arrivals and the mean travel time along either path is

2 + 1/(μ − λ).
Note that the constant time of 2 units can be modeled as an M/G/∞ queue
with deterministic service time of 2 units; the output from that queue is then
still PP(λ). Another way of seeing that would be that the departure process
after spending 2 time units would be identical to the entering process, just
shifted by 2 time units. Hence, it would have to be Poisson. Thus the time
across either path is 2 + 1/(μ − λ), where the second term is the sojourn time
of an M/M/1 queue with PP(λ) arrivals and exp(μ) service.
Now a new path from B to C is constructed along which it would take a
deterministic time of 1 unit to traverse. For the first customer that arrives into
this new system described in Figure 6.8, this would be a shortcut because the
new expected travel time would be 1 + 2/(μ − λ), which is smaller than the
old expected travel time given earlier under the assumption μ > λ + 1. Soon,
the customers would selfishly choose their routes so that in equilibrium, all
three paths A − B − D, A − C − D, and A − B − C − D have identical mean
travel times. Actually, the equilibrium splits need not be calculated; instead,
notice that each of the three routes would take 3 time units to traverse on
average (this is the only way the three paths would have identical travel times).
But the old travel time before the new capacity was added,
2 + 1/(μ − λ), is actually less than 3 units under the assumption μ > λ + 1.
Thus adding extra capacity has actually worsened the average travel
times!
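A one-line numerical check of the paradox (λ and μ assumed, satisfying μ > λ + 1):

```python
# Braess' paradox: mean travel time per path before vs after the shortcut.
lam, mu = 1.0, 2.5               # assumed values with mu > lam + 1
before = 2 + 1 / (mu - lam)      # either original path in equilibrium
after = 3.0                      # all three paths equalize at 3 time units
print(round(before, 3), after)   # -> 2.667 3.0: travel times got worse
assert before < after
```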
FIGURE 6.8
New travel times along arcs in equilibrium (the added B→C arc has deterministic travel time 1; in equilibrium each of the three routes takes 3 time units).
Problem 56
Can the computation of waiting times in a queueing system depend on
the method? Consider a stable queue that gets customer arrivals externally
according to a Poisson process with mean rate λ. There is a single server
and infinite waiting room. The service times are exponentially distributed
with mean 1/μ. At the end of service each customer exits the system with
probability p and reenters the queue with probability (1 − p). The system
is depicted in Figure 6.9, for now ignore A, B, C, and D. We consider two
models:
1. If the system is modeled as a birth and death process with birth rates
λ and death rates pμ, then L = λ/(pμ − λ) and W = L/λ = 1/(pμ − λ).
2. If the system is modeled as a Jackson network with 1 node and effec-
tive arrival rate λ/p and service rate μ, then L = (λ/p)/(μ − λ/p) and
W = L/(λ/p) = p/(pμ − λ).
Clearly, the two methods give the same L but the W values are different!
Explain.
Solution
Although this appears to be a paradox, that is really not the case. Let us
revisit Figure 6.9 but now let us consider A, B, C, and D. The W from the
first method (birth and death model) is measured between A and D, which
is the total time spent by a customer in the system (going through one or
more rounds of service). The W from the second method (Jackson network
model) is measured between B and C, which is the time spent by a customer
from the time he or she entered the queue until one round of service is com-
pleted. Note that the customer does a geometric number of such services
(with mean 1/p). Therefore, the total time spent on average would indeed be
the same in either method if we used the same points of reference, that is,
A and D.
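The resolution can be checked numerically (sample parameters assumed):

```python
# Problem 56's two W's measure different intervals: A-to-D covers a geometric
# number of service rounds (mean 1/p), while B-to-C covers exactly one round.
lam, mu, p = 1.0, 3.0, 0.5       # hypothetical stable parameters (lam < p*mu)
W_bd = 1 / (p * mu - lam)        # birth and death model: time from A to D
W_jackson = p / (p * mu - lam)   # Jackson network model: time from B to C
mean_rounds = 1 / p              # mean of the geometric number of rounds
assert abs(W_bd - mean_rounds * W_jackson) < 1e-12
```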
FIGURE 6.9
Points of reference (arrivals PP(λ) enter at A, join the queue at B, complete exp(μ) service at C, and exit at D with probability p or return to the queue with probability 1 − p).
The exercises at the end of the chapter describe several more examples of
open Jackson networks. Next, we consider closed Jackson networks.
3. The service rate at node i when there are n customers in that node is
μi (n) with μi (0) = 0 and μi (n) > 0 for 1 ≤ n ≤ C. The service times are
exponentially distributed.
Analogous to the open Jackson network case, the generic balance equation takes the form

p(x) Σ_{i=1}^{N} μ_i (x_i ) = Σ_{j=1}^{N} Σ_{i=1}^{N} p(x + e_i − e_j ) p_ij μ_i (x_i + 1).
In this balance equation, the LHS describes the total rate of all the transitions
out of state x that includes all possible service completions. Likewise, the
RHS includes all the transitions into state x, that is, service completions from
various states that result in x.
Except for small C and N, solving the balance equations directly is diffi-
cult. However, like we saw in the open Jackson network case, here too we
will guess a p(x) solution and check if it satisfies the balance equations. If it
does, then we are done since there is only one solution to the balance equa-
tions. As an initial guess for p(x), we try the open-queueing network result
itself. For that, recall from Equation 6.2 that a(I − P) = λ; however, λ is a vec-
tor of zeros since there are no external arrivals. Hence, we define a as the
solution to a(I − P) = 0, in other words a solution to
a = aP.
Using such a solution a, define

φ_j (n) = Π_{k=1}^{n} (a_j / μ_j (k))   for n ≥ 1,   (6.4)

with φ_j (0) = 1.
Then our guess for the joint distribution is

p(x) = G(C) φ1 (x1 ) φ2 (x2 ) . . . φN (xN ) = G(C) Π_{i=1}^{N} φ_i (x_i ),

where the normalizing constant G(C) is chosen so that

Σ_{x: x1 +x2 +···+xN =C} p(x) = 1.
Next, we need to verify if the p(x) here satisfies the balance equation
p(x) Σ_{i=1}^{N} μ_i (x_i ) = Σ_{j=1}^{N} Σ_{i=1}^{N} p(x + e_i − e_j ) p_ij μ_i (x_i + 1).
For that, first of all if p(x) = G(C)φ1 (x1 )φ2 (x2 ) . . . φN (xN ), then

p(x + e_i − e_j )/p(x) = (φ_i (x_i + 1) φ_j (x_j − 1)) / (φ_i (x_i ) φ_j (x_j ))

for all i and j. In addition, from the definition of φi (n) in Equation 6.4, we can
obtain the following:
Σ_{i=1}^{N} μ_i (x_i ) = Σ_{j=1}^{N} Σ_{i=1}^{N} (p(x + e_i − e_j )/p(x)) μ_i (x_i + 1) p_ij

 = Σ_{j=1}^{N} Σ_{i=1}^{N} ((φ_i (x_i + 1) φ_j (x_j − 1)) / (φ_i (x_i ) φ_j (x_j ))) μ_i (x_i + 1) p_ij

 = Σ_{j=1}^{N} Σ_{i=1}^{N} (a_i / μ_i (x_i + 1)) (μ_j (x_j ) / a_j ) μ_i (x_i + 1) p_ij

 = Σ_{j=1}^{N} Σ_{i=1}^{N} (μ_j (x_j ) / a_j ) a_i p_ij

 = Σ_{j=1}^{N} (μ_j (x_j ) / a_j ) Σ_{i=1}^{N} a_i p_ij

 = Σ_{j=1}^{N} (μ_j (x_j ) / a_j ) a_j = Σ_{j=1}^{N} μ_j (x_j )
where the penultimate equality can be derived using Σ_{i=1}^{N} a_i p_ij = a_j , which
is directly from a = aP. Thus, p(x) = G(C)φ1 (x1 )φ2 (x2 ) . . . φN (xN ) satisfies the
balance equations for a closed Jackson network.
In other words, the steady-state joint probability distribution of having
x1 in node 1, x2 in node 2, . . ., xN in node N is equal to p(x), which is
the product of the φj (xj ) values for all j times a normalizing constant G(C).
Hence, this result is also a product form. Note that for this result, similar to the
other product-form cases that we will consider subsequently, the difficulty
arises in computing the normalizing constant G(C). In general, it is not
computationally trivial. However, once G(C) is obtained, one can compute the
marginal distribution that queue j has xj customers in steady state for some j
such that 1 ≤ j ≤ N: all we have to do is sum p(x) over all x keeping xj constant.
We proceed by first explaining a simple example.
Problem 57
Consider a closed Jackson network with three nodes and five customers,
that is, N = 3 and C = 5. The network structure is depicted in Figure 6.10.
Essentially all five customers behave in the following fashion: upon com-
pleting service at node 1, a customer rejoins node 1 with probability 0.5, or
joins node 2 with probability 0.1, or joins node 3 with probability 0.4; upon
completing service in node 2 or 3, a customer always joins node 1. Node 1
has a single server that serves at rate i if there are i customers at the node.
Node 2 has two servers each with service rate 1. Node 3 has one server with
service rate 2. Determine the joint as well as marginal probability distribu-
tion of the number of customers at each node in steady state. Also compute
the average number in each node as well as the network in steady state.
Solution
Although it is not explicitly stated, the service times are exponentially dis-
tributed (since it is a closed Jackson network). For such a system, to compute
the joint distribution of the steady-state number in each node we first solve
for vector a in
a = aP
where P is the routing probability matrix that can be obtained from the
problem statement as
P =
[ 0.5  0.1  0.4
  1    0    0
  1    0    0   ].
FIGURE 6.10
Closed Jackson network with C = 5 customers (from node 1, customers rejoin node 1 with probability 0.5, join node 2 with probability 0.1, and join node 3 with probability 0.4; nodes 2 and 3 always route back to node 1).
Solving a = aP yields, up to a multiplicative constant, a = [1 0.1 0.4] (the
scaling does not matter since the constant gets absorbed into G(C) upon
normalization). Next we obtain the state-dependent service rates. The single
server at node 1 serves at rate n when there are n customers; thus, μ1 (n) = n
for all n, which is the service rate when there are n customers
in node 1. Likewise, since node 2 has two servers each serving at rate 1, if
there is only 1 customer in node 2, the service rate is 1; however, if there are
2 or more customers in that node, the net service rate at the node (which is
the rate at which customers exit that node) is 2. Hence we have μ2 (1) = 1 and
μ2 (n) = 2 for all n ≥ 2. Since there is a single server serving at rate 2 in node 3,
the service rate when there are n customers in node 3 is μ3 (n) = 2 for all n ≥ 1.
Using these, from Equation 6.4, for j = 1, 2, 3 and n ≥ 1,

φ_j (n) = Π_{k=1}^{n} (a_j / μ_j (k)).
Writing q(x1 , x2 , x3 ) = φ1 (x1 )φ2 (x2 )φ3 (x3 ), the normalizing constant G(C)
satisfies

1/G(C) = Σ_{x1 ,x2 ,x3 : x1 +x2 +x3 =5} q(x1 , x2 , x3 ).

The resulting joint probabilities p(x1 , x2 , x3 ) = G(C) q(x1 , x2 , x3 ) are given in
Table 6.2.
TABLE 6.2
Example Joint Probability Distribution
x1 x2 x3 p(x1 , x2 , x3 ) x1 x2 x3 p(x1 , x2 , x3 ) x1 x2 x3 p(x1 , x2 , x3 )
0 0 5 0.0077 0 1 4 0.0039 0 2 3 0.0010
0 3 2 0.0002 0 4 1 0.0001 0 5 0 0.00002
1 0 4 0.0386 1 1 3 0.0193 1 2 2 0.0048
1 3 1 0.0012 1 4 0 0.0003 2 0 3 0.0964
2 1 2 0.0482 2 2 1 0.0121 2 3 0 0.0030
3 0 2 0.1607 3 1 1 0.0803 3 2 0 0.0201
4 0 1 0.2009 4 1 0 0.1004 5 0 0 0.2009
The marginal distributions can be computed as

p1 (x1 ) = Σ_{x2 =0}^{5} Σ_{x3 =0}^{5} p(x1 , x2 , x3 ),

p2 (x2 ) = Σ_{x1 =0}^{5} Σ_{x3 =0}^{5} p(x1 , x2 , x3 ),

p3 (x3 ) = Σ_{x1 =0}^{5} Σ_{x2 =0}^{5} p(x1 , x2 , x3 ).
For the preceding numerical values, the marginal probability vectors pi (for
i = 1, 2, 3) are approximately

p1 = [0.0129 0.0642 0.1597 0.2611 0.3013 0.2009],
p2 = [0.7051 0.2521 0.0379 0.0045 0.0004 0.0000],
p3 = [0.3247 0.2945 0.2140 0.1167 0.0424 0.0077],

where pi = [pi (0) pi (1) pi (2) pi (3) pi (4) pi (5)]. Let Li be the average number
of customers in node i in steady state for i = 1, 2, 3. We can compute Li as
Li = Σ_n n pi (n). Hence, we have L1 = 3.3764, L2 = 0.3429, and L3 = 1.2807.
The total number in the network is L1 + L2 + L3 = 5, as it must be since there
are C = 5 customers.
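The entire computation in this problem can be reproduced by brute-force enumeration of the 21 feasible states; a sketch:

```python
# Product form for Problem 57: enumerate states with x1 + x2 + x3 = 5,
# compute q(x) = phi1(x1)*phi2(x2)*phi3(x3), and normalize.
a = [1.0, 0.1, 0.4]                 # a solution of a = aP, scaled so a1 = 1

def mu(node, n):                    # state-dependent service rates
    if node == 0:
        return n                    # node 1: single server at rate n
    if node == 1:
        return min(n, 2)            # node 2: two servers of rate 1 each
    return 2                        # node 3: one server of rate 2

def phi(node, n):
    out = 1.0
    for k in range(1, n + 1):
        out *= a[node] / mu(node, k)
    return out

C = 5
states = [(x1, x2, C - x1 - x2) for x1 in range(C + 1)
          for x2 in range(C + 1 - x1)]
q = {s: phi(0, s[0]) * phi(1, s[1]) * phi(2, s[2]) for s in states}
G = 1.0 / sum(q.values())           # normalizing constant G(C)
p = {s: G * v for s, v in q.items()}
L = [sum(s[i] * p[s] for s in states) for i in range(3)]
print([round(x, 4) for x in L])     # -> [3.3764, 0.3429, 1.2807]
```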
where
y = [y1 , y2 , . . . , yN ] is such that y1 + · · · + yN = C − 1,
o(h) is a set of terms of the order h such that o(h)/h → 0 as h → 0,
pC (x + ei ) is the usual p(x + ei ) with C used for clarity to denote the total
number of customers, and
Σ_{i=1}^{N} p_ij a_i = a_j follows from aP = a.
The preceding result is called ASTA because the RHS of the equation is
the time-averaged probabilities and the LHS is as seen by arriving customers.
In fact, to be more precise, if one were to obtain the distribution of the system
state by averaging across those seen by arriving customers (note that arriving
customers do not include themselves in the system state), then this is iden-
tical to a time-averaged distribution of the system state when there is one
less customer. Furthermore, if one were to insert a “dummy” customer in
the system to obtain the system state every time this customer enters a node,
then it is possible to get the system state distribution without this dummy
customer. Sometimes, one is not necessarily interested in the entire vector
of states but just that of the entering node. This is the essence of the next
remark, sometimes also known as arrival theorem.
Remark 14
In a closed Jackson network with C customers, for any n, the probability that
there are n customers in node i, as seen at the time of arrival of an arbitrary
customer to that node, is equal to the probability that there are n customers
at this node with one less job in the network (i.e., C − 1).
This remark can be immediately derived from πj (x) = pC−1 (x) by summing
over all xj such that j ≠ i. For example, if we were to modify Problem
57 so that there are C = 6 customers, then the probability that an arriving
customer will see two customers in node 3 is 0.214. This can be obtained
by considering a network of C = 5 customers (as done in Problem 57),
computing p3 , the probability distribution of the number in node 3,
and then using the term corresponding to two customers in that node.
Problem 58
Consider a single-server queue where it takes exp(μ) amount of time to
serve a customer. Unlike most of the systems in the previous chapters, here
we assume that there is a finite population of C customers. Each customer
after completion of service returns to the queue after spending exp(λ) time
outside. First model the system as a birth and death process, and obtain
the steady-state probabilities. Then compute the arrival point probabilities.
Subsequently, model the system as a closed Jackson network and compare
the corresponding results.
Solution
The finite population single-server queue model is depicted in Figure 6.11.
The top of the figure is the queue under consideration and the box in the
bottom denotes the idle time before customers return to the queue. There
are several applications of such a system. One example is the client–server
model with C clients that submit requests to a server: once the server sends
a response, after a think time the client submits another request, and so on. The
request service time at the server is exp(μ) and different requests contend
for the server. The think times of the clients are exp(λ) distributed. Another
example is a bank that has C customers in total. Each customer visits the bank
(with a single teller, although the model and analysis can easily be extended
to multiple tellers), waits for service, gets served for exp(μ) time, and revisits
the bank after spending exp(λ) time outside.
To model this system as a birth and death process, let X(t) denote the
number of customers in the queue (including any at the server) at time t.
When X(t) = n, there are C − n customers outside the queue, and hence the
next arrival occurs when the first of those C − n customers returns to the
queue. Since each customer spends exp(λ) time outside, the arrival rate
when X(t) = n is (C − n)λ. Likewise, if there is a customer at the queue, the
service rate is μ. Therefore, we can show the {X(t), t ≥ 0} process is a birth
and death process with birth parameters λn = (C − n)λ for 0 ≤ n ≤ C − 1 and
death parameters μn = μ for 1 ≤ n ≤ C.
FIGURE 6.11
Finite population queue with C customers (exp(μ) service at the queue; each customer spends exp(λ) time outside before returning).
Solving the balance equations of this birth and death process yields the
steady-state probability of i customers in the queue:

p_i = (C choose i) i! (λ/μ)^i / [ Σ_{j=0}^{C} (C choose j) j! (λ/μ)^j ] .   (6.5)
To obtain the arrival point probabilities, that is, the probability π∗j that an
arriving customer sees j others in the queue, we use Bayes’ rule and compute

π∗_j = (C − j) λ p^C_j / Σ_{i=0}^{C} (C − i) λ p^C_i ,

where p^C_i denotes the p_i of Equation 6.5 with C used for clarity to denote
the total number of customers.
Using Equation 6.5 for both C and C − 1 customers, we have

p^C_j / p^{C−1}_j = (C / (C − j)) (p^C_0 / p^{C−1}_0).

Substituting this in the expression for π∗j ,

π∗_j = (C − j) p^C_j / Σ_{i=0}^{C} (C − i) p^C_i = [ C (p^C_0 / p^{C−1}_0) / Σ_{i=0}^{C} (C − i) p^C_i ] p^{C−1}_j = k p^{C−1}_j ,

where k does not depend on j. Since both π∗_j and p^{C−1}_j sum to one over j,
we must have k = 1. Hence

π∗_j = p^{C−1}_j ,   (6.6)
that is, the probability that there are j in the queue as seen by an arriving
customer is the same as the probability that there are j in the queue for a
similar system with one less customer.
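Equations 6.5 and 6.6 can be verified numerically; the sketch below (with assumed values C = 6, λ = 1, μ = 10) computes the time-stationary and arrival point probabilities and checks the arrival theorem:

```python
import math

def steady_probs(C, lam, mu):
    """Equation 6.5: p_i for the finite population single-server queue."""
    w = [math.comb(C, i) * math.factorial(i) * (lam / mu)**i for i in range(C + 1)]
    total = sum(w)
    return [x / total for x in w]

C, lam, mu = 6, 1.0, 10.0           # hypothetical parameter values
pC = steady_probs(C, lam, mu)
# Arrival point probabilities: state j is entered at rate (C - j)*lam*p_j
w = [(C - j) * lam * pC[j] for j in range(C + 1)]
pi_star = [x / sum(w) for x in w]
# Arrival theorem (Equation 6.6): same as stationary probabilities with C - 1
pC_minus_1 = steady_probs(C - 1, lam, mu)
assert all(abs(pi_star[j] - pC_minus_1[j]) < 1e-12 for j in range(C))
```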
Next, we model the system in the problem as a closed Jackson network
with N = 2 nodes and C customers. We denote the single-server queue as
node 1 and outside of the queue as node 2. The service rate at node 1 when
there are n (such that n > 0) customers in it is μ, hence μ1 (n) = μ. The service
rate at node 2 when there are n customers in it is nλ (which is essentially
the rate at which a departure occurs when there are n in node 2). Thus,
μ2 (n) = nλ. The routing probability matrix P can be obtained from the
problem statement as
P = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.
Solving a = aP, the visit ratios satisfy a_1 = a_2, and we can take a_1 = a_2 = 1. Then

\phi_1(n) = \prod_{k=1}^{n} \frac{a_1}{\mu_1(k)} = \frac{1}{\mu^n}

and

\phi_2(n) = \prod_{k=1}^{n} \frac{a_2}{\mu_2(k)} = \frac{1}{n!\,\lambda^n}.
Also for j = 1, 2 we have φj (0) = 1. Further, since we only have two nodes,
if one node has x1 , then necessarily the other node must have x2 = C − x1 .
Therefore, the joint probability distribution
p(x_1, C - x_1) = G(C)\,\phi_1(x_1)\,\phi_2(C - x_1) = G(C)\, \frac{1}{(C - x_1)!\, \lambda^{C - x_1}\, \mu^{x_1}}
= \frac{G(C)}{C!\,\lambda^C}\, \binom{C}{x_1}\, x_1!\, \left(\frac{\lambda}{\mu}\right)^{x_1}.

The normalizing constant (so that the probabilities sum to one) is

G(C) = \frac{C!\,\lambda^C}{\sum_{j=0}^{C} \binom{C}{j}\, j!\, (\lambda/\mu)^j}.

Hence, we have

p(x_1, C - x_1) = \frac{\binom{C}{x_1}\, x_1!\, (\lambda/\mu)^{x_1}}{\sum_{j=0}^{C} \binom{C}{j}\, j!\, (\lambda/\mu)^j},

which agrees with Equation 6.5.
We do not have an expression for any of the preceding measures and the
objective is to obtain them iteratively. However, before describing the itera-
tive algorithm, we first explain the relationship between those parameters.
On the basis of the arrival theorem described in Remark 14, in a network
with k customers (such that 1 ≤ k ≤ C), the expected number of customers that
an arrival to node i (for any i ∈ {1, . . . , N}) would see is Li (k − 1). Note that
Li (k − 1) is the steady-state expected number of customers in node i when
there are k − 1 customers in the system. Thereby, the net mean sojourn time experienced by that arriving customer in steady state is the average time to serve all those present upon arrival plus the customer's own average service time. Since the average service time at node i is 1/μ_i, we have

W_i(k) = \frac{1}{\mu_i}\,[1 + L_i(k-1)].
Let a be the solution to a = aP as usual with the only exception that the aj
values sum to one here. Thus, the aj values describe the fraction of visits
that are made into node j. The aggregate sojourn time weighted across the network using the fraction of visits is given by \sum_{i=1}^{N} a_i W_i(k) when there are
k customers in the network. One can think of an aggregate sojourn time as
the sojourn time for a customer about to enter a node. Hence, by conditioning on the node of entry as i (which happens with probability a_i) where the mean sojourn time is W_i(k), we can get the result \sum_{i=1}^{N} a_i W_i(k). Thereby we
derive the average flow in the network using Little’s law across the entire
network as
\lambda(k) = \frac{k}{\sum_{i=1}^{N} a_i W_i(k)}
when there are k customers in the network. Essentially, λ(k) is the average
rate at which service completion occurs in the entire network, taken as a
whole. Thereby applying Little’s law across each node i we get
L_i(k) = \lambda(k)\, a_i\, W_i(k),

since the arrival rate into node i is a_i \lambda(k). Thus, starting with L_i(0) = 0 for all i, the following can be iterated for k = 1, 2, \ldots, C:

W_i(k) = \frac{1}{\mu_i}\,[1 + L_i(k-1)], \quad \lambda(k) = \frac{k}{\sum_{i=1}^{N} a_i W_i(k)}, \quad L_i(k) = \lambda(k)\, a_i\, W_i(k).
FIGURE 6.12
Three-tier architecture for e-business websites (a user's browser connects over the Internet to the website's web server, application server, and database server).
their browsers by connecting to the first tier, namely, the web server. The
web server provides web pages and forms for users to enter requests or
information. When the users enter the information and send back to the
web server, it passes the information on to the application server (which is the second tier). The application server processes the information and communicates with the database server (in the third tier). The database server then searches its database and responds to the application server, which in turn passes it on to the web server, which transmits it to the user. For
example, consider running a website for a used car dealership (with URL
www.usedcar.com). When a user types www.usedcar.com on their browser,
the request goes to the first tier for which the web server responds with the
relevant web page. Say the user fills out a set of makes and models as well as
desirable years of manufacture. When the user submits this form expecting
to see the set of used cars available that meets his or her criteria, the web
server passes the set of criteria to the application server (second tier). The
application server processes this set of criteria to check if the form is filled
with all the required fields and then submits to the database server (third
tier). The database server queries the database of all cars available in the deal-
ership that meet the criteria and then responds with the appropriate set. Hav-
ing described some background for websites, we now describe a problem.
Problem 59
The bottleneck in many three-tier architecture websites is the database server
that does not scale up to handle a large number of users simultaneously. Let
us say that the database server can handle at most C connections simultane-
ously. During peak periods, one typically sees the database server handling
its full capacity of C connections at every instant of time. In practice, every
time one of the C connections is complete, instantaneously a new connection
is added to the database server thus maintaining C connections through-
out the peak period. Hence, we can model the database server system as
a closed queueing network with C customers. The database server system
consists of a processor and four disks as shown in Figure 6.13. All five nodes
are single-server queues with exponential service times. Each customer after
being processed at the processor goes to any of the four disks with equal
probability. The average service time (in milliseconds) at the processor is
6 and at the four disks are 17, 28, 13, and 23, respectively. For C = 16 use
the preceding algorithm to determine the expected number at each of the
five nodes as well as the throughput of the database–server system. What
happens to those metrics when C is 25, 50, 75, and 100?
Solution
Note that we have a Jackson network with N = 5 nodes and C = 16 customers
(we will later consider other C values). The P matrix corresponding to the
processor node and the four disks is
FIGURE 6.13
Closed queueing network inside database server (node 1 is the processor; nodes 2 through 5 are the disks).
P = \begin{bmatrix}
0 & 0.25 & 0.25 & 0.25 & 0.25 \\
1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0
\end{bmatrix}.

Solving a = aP with the a_i summing to one gives visit ratios a = (1/2, 1/8, 1/8, 1/8, 1/8); with service rates 1/6, 1/17, 1/28, 1/13, and 1/23 per millisecond, the iterative algorithm yields Table 6.3 for C = 16, and Table 6.4 for larger values of C.
TABLE 6.3
The Single-Server Closed Jackson Network Iterations
k L1 (k) L2 (k) L3 (k) L4 (k) L5 (k) λ(k)
1 0.2286 0.1619 0.2667 0.1238 0.2190 0.0762
2 0.4631 0.3102 0.5570 0.2294 0.4403 0.1256
3 0.7018 0.4452 0.8714 0.3195 0.6621 0.1599
4 0.9433 0.5674 1.2102 0.3962 0.8829 0.1848
5 1.1860 0.6776 1.5737 0.4615 1.1013 0.2034
6 1.4284 0.7765 1.9620 0.5173 1.3158 0.2178
7 1.6692 0.8650 2.3754 0.5649 1.5255 0.2291
8 1.9073 0.9439 2.8138 0.6057 1.7294 0.2382
9 2.1414 1.0142 3.2772 0.6406 1.9266 0.2455
10 2.3706 1.0766 3.7657 0.6706 2.1165 0.2515
11 2.5940 1.1321 4.2790 0.6964 2.2985 0.2565
12 2.8109 1.1812 4.8169 0.7187 2.4723 0.2607
13 3.0207 1.2246 5.3792 0.7379 2.6376 0.2642
14 3.2228 1.2631 5.9654 0.7546 2.7942 0.2672
15 3.4167 1.2970 6.5752 0.7690 2.9421 0.2697
16 3.6023 1.3270 7.2080 0.7815 3.0812 0.2719
TABLE 6.4
When C Is Increased to 25, 50, 75, and 100
C L1 L2 L3 L4 L5 λ
25 4.8794 1.4786 13.8299 0.8418 3.9704 0.2818
50 5.9284 1.5435 37.0910 0.8660 4.5711 0.2856
75 5.9972 1.5454 61.9915 0.8667 4.5992 0.2857
100 5.9999 1.5455 86.9880 0.8667 4.6000 0.2857
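The iteration that produces Tables 6.3 and 6.4 takes only a few lines. In the sketch below, the visit ratios a = (1/2, 1/8, 1/8, 1/8, 1/8) come from solving a = aP with the a_i summing to one, and the service rates are per millisecond:

```python
def mva(a, mu, C):
    """Mean value analysis for a closed Jackson network of single-server
    queues: a = visit ratios (summing to 1), mu = service rates, C = customers.
    Returns the mean queue lengths L_i(C) and the throughput lambda(C)."""
    N = len(a)
    L = [0.0] * N
    for k in range(1, C + 1):
        W = [(1.0 + L[i]) / mu[i] for i in range(N)]       # W_i(k)
        lam = k / sum(a[i] * W[i] for i in range(N))       # Little's law, whole network
        L = [lam * a[i] * W[i] for i in range(N)]          # Little's law, node i
    return L, lam

# Problem 59: processor (6 ms) and four disks (17, 28, 13, 23 ms), C = 16.
a = [0.5, 0.125, 0.125, 0.125, 0.125]
mu = [1 / 6, 1 / 17, 1 / 28, 1 / 13, 1 / 23]
L, lam = mva(a, mu, 16)
# Last row of Table 6.3, to the printed precision:
assert abs(lam - 0.2719) < 5e-4
assert abs(L[0] - 3.6023) < 5e-4
assert abs(L[2] - 7.2080) < 5e-4
```

As C grows, λ(C) saturates at μ_3/a_3 = (1/28)/(1/8) ≈ 0.2857 per millisecond, the rate of the bottleneck disk, which is exactly the limit visible in Table 6.4.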
b_j = u_j + \sum_{i=1}^{N} b_i\, p_{ij}.
In matrix form, b = u[I − P]^{-1}, where u is a row vector of the u_j values, b is a row vector of the b_j values, I is the identity matrix, and P is the routing matrix.
Next, define the following for all i ∈ {1, . . . , N}: φ_i(0) = 1 and

\phi_i(n) = \prod_{j=1}^{n} \frac{b_i}{\mu_i(j)} \quad \text{for } n \ge 1.
Define \hat{x} = \sum_{i=1}^{N} x_i. Using this notation it is possible to show that the steady-state probability p(x) is given by

p(x) = c \prod_{i=1}^{N} \phi_i(x_i) \prod_{j=0}^{\hat{x}} \lambda(j),

where c is the normalizing constant.
This can be used to get the distribution of the number of customers in the
system as well as the mean (and higher moments). Then using Little’s law,
the mean sojourn times (across a node and the network itself) can also be
obtained.
An immediate extension to this model is to allow service rates to depend
on the number of customers in each node of the network. Therefore, the
service rate at node i when there are x1 , . . . , xN customers, respectively, at
nodes 1, . . . , N instead of being μi (xi ), is now μi (x). A network with that
extension is known as a Whittle network for which Serfozo [96] describes con-
ditions for a product-form solution. In the next few sections, we will describe
other networks where the steady-state probabilities can be represented as
product form.
Let y_i denote the total number of customers at node i summed across all routes, that is,

y_i = \sum_{r=1}^{R} x_{ri}
(when y_i is zero, the service completion rate is zero as well). All three service disciplines mentioned earlier satisfy that condition, and other disciplines that do can also be included in the list of allowed service disciplines. With that understanding, we will next just describe p(x) as a product form without going into details of the derivation.
Define the following for all i ∈ {1, . . . , N}: φi (0, 0, . . . , 0) = βi and
\phi_i(n_1, n_2, \ldots, n_R) = \beta_i\, n! \left(\prod_{j=1}^{n} \frac{1}{\mu_i \min(j, s_i)}\right) \left(\prod_{r=1}^{R} \frac{\lambda_r^{n_r}}{n_r!}\right) \quad \text{for } n \ge 1
where

n = \sum_{r=1}^{R} n_r \quad \text{and} \quad \beta_j^{-1} = 1 + \sum_{n=1}^{\infty} \prod_{k=1}^{n} \frac{\sum_{r=1}^{R} \lambda_r M_{rj}}{\mu_j \min(k, s_j)}.
Using this notation it is possible to show that the steady-state probability p(x)
is given by
p(x) = \begin{cases} \prod_{i=1}^{N} \phi_i(x_{1i}, x_{2i}, \ldots, x_{Ri}) & \text{if } x \in E, \\ 0 & \text{otherwise.} \end{cases}
One can consider several extensions to this model. In fact, Kelly net-
works, on which the preceding analysis is based, are a lot more general. The
reader is referred to the end of the next section on multiclass networks for a
description of some of the possible extensions as they are common to both
sections. After all what we saw here is just a special type of multiclass net-
work. Before forging ahead, we present a small example to illustrate these
results.
Problem 60
Consider the four-node network in Figure 6.14. There are three routes
described using three types of arrows. Route-1 uses path 1-3-4-2, route-2 uses
4-2-1, and route-3 uses 2-3. Route-1 customers arrive according to a Poisson
process with mean rate 4 per hour. Likewise, route-2 and route-3 customers
arrive according to Poisson processes with mean rates of 2 and 3 per hour,
respectively. Nodes 1, 2, and 3 have a single server that serves according
to an exponential distribution with rates 8, 10, and 9 per hour, respectively.
Node 4 has two servers each serving at rate 4 per hour. The service disci-
pline is FCFS in nodes 1, 2, and 4 but processor sharing in node 3. What is
the probability that there is one route-1 customer in each of the four nodes,
two route-2 customers in node 4, two in node 2 and one in node 1, and three
route-3 customers in node 2 and four in node 3?
Solution
The problem illustrates an example of a queueing network with fixed
routes. Using the notation of this section, N = 4 nodes and R = 3 routes.
Also, (λ1 , λ2 , λ3 ) = (4, 2, 3), (μ1 , μ2 , μ3 , μ4 ) = (8, 10, 9, 4), and (s1 , s2 , s3 , s4 ) =
(1, 1, 1, 2). In the problem, the vector of xrj values for route r and
node j describes x given by x = (x11 , x21 , x31 , x12 , x22 , x32 , x13 , x23 , x33 , x14 ,
x24 , x34 ). Using the numerical values in the problem, we have x = (1, 1, 0, 1, 2,
3, 1, 0, 4, 1, 2, 0) for which we need to compute p(x). Using the results in this
section we have
\phi_1(1, 1, 0) = \beta_1\, (2!) \left(\prod_{j=1}^{2} \frac{1}{\mu_1}\right) (\lambda_1 \lambda_2) = \frac{\beta_1}{4}
FIGURE 6.14
Queueing network with fixed routes (nodes 1 through 4; routes 1, 2, and 3 shown with different arrow types).
\phi_2(1, 2, 3) = \beta_2\, (6!) \left(\prod_{j=1}^{6} \frac{1}{\mu_2}\right) \left(\frac{\lambda_1 \lambda_2^2 \lambda_3^3}{12}\right) = \frac{81\,\beta_2}{3125}

\phi_3(1, 0, 4) = \beta_3\, (5!) \left(\prod_{j=1}^{5} \frac{1}{\mu_3}\right) \left(\frac{\lambda_1 \lambda_3^4}{24}\right) = \frac{20\,\beta_3}{729}

\phi_4(1, 2, 0) = \beta_4\, (3!) \left(\prod_{j=1}^{3} \frac{1}{\min(j, 2)\,\mu_4}\right) \left(\frac{\lambda_1 \lambda_2^2}{2}\right) = \frac{3\,\beta_4}{16}.
Thus, the only thing left is to obtain the βj values for j = 1, 2, 3, 4. Although
it is possible to directly use the formula, it is easier if we realize that βj is the
probability that node j is empty in steady state. Since nodes 1, 2, and 3 are
effectively M/M/1 queues with arrival rates 6, 9, and 7 as well as service
rates 8, 10, and 9, respectively, we have β1 = 1/4, β2 = 1/10, and β3 = 2/9.
Likewise, node 4 is effectively an M/M/2 queue with arrival rate 6 and
service rate 4 for each server. Hence, we have β_4 = 1/7. Thus, the probability that there is one route-1 customer in each of the four nodes, two route-2 customers in node 4, two in node 2 and one in node 1, and three route-3 customers in node 2 and four in node 3 is β_1 β_2 β_3 β_4 (1/30000) = 1/37800000.
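The product-form arithmetic can be verified directly. The sketch below recomputes each φ from the general formula and, as an independent sanity check on the two-server node, confirms that aggregating φ_4 over all splits of three customers between routes 1 and 2 reproduces the M/M/2 queue-length probability:

```python
from math import factorial

def phi(beta, mu, s, lam, counts):
    """phi_i(n_1,...,n_R) = beta_i * n! * prod_{j=1}^{n} 1/(mu_i min(j, s_i))
    * prod_r lambda_r^{n_r}/n_r!, for one node of the fixed-route network."""
    n = sum(counts)
    val = beta * factorial(n)
    for j in range(1, n + 1):
        val /= mu * min(j, s)
    for lr, nr in zip(lam, counts):
        val *= lr ** nr / factorial(nr)
    return val

lam = (4.0, 2.0, 3.0)                    # route arrival rates
mus = (8.0, 10.0, 9.0, 4.0)              # per-server service rates, nodes 1-4
servers = (1, 1, 1, 2)
betas = (1 / 4, 1 / 10, 2 / 9, 1 / 7)    # P(node empty): M/M/1 and M/M/2 results
counts = [(1, 1, 0), (1, 2, 3), (1, 0, 4), (1, 2, 0)]
phis = [phi(betas[i], mus[i], servers[i], lam, counts[i]) for i in range(4)]
p = phis[0] * phis[1] * phis[2] * phis[3]

# Sanity check: summing phi_4 over all splits of 3 customers between routes 1
# and 2 must give the M/M/2 (arrival rate 6, rate 4 per server) P(3 in node).
mm2_p3 = betas[3] * 6 ** 3 / (4 * 8 * 8)
agg = sum(phi(betas[3], 4.0, 2, lam, (n1, 3 - n1, 0)) for n1 in range(4))
assert abs(agg - mm2_p3) < 1e-12
assert abs(p * 37800000 - 1) < 1e-9
```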
6. The service discipline is one of the following: FCFS (in which case we require μ_{ki} to be independent of k, i.e., all K classes have the same service rate, which we call μ_i), processor sharing, or LCFS with preemptive resume.
7. When a class k customer completes service at node i, the customer departs the network with probability r_{ki} or joins the queue at node j as a class ℓ customer with probability p_{ki,jℓ}. The routing of a customer does not depend on the state of the network.
8. There is infinite waiting room at each node and the stability condition is satisfied at every node i.
As earlier, here we obtain the visit ratios a_{ℓj} for class ℓ customers into node j (i.e., the effective arrival rate of class ℓ customers into node j). We solve the following set of simultaneous equations:

a_{ℓj} = \lambda_{ℓj} + \sum_{i=1}^{N} \sum_{k=1}^{K} a_{ki}\, p_{ki,jℓ}

for all ℓ (such that 1 ≤ ℓ ≤ K) and j (such that 1 ≤ j ≤ N). For i = 1, . . . , N and
k = 1, . . . , K, let Xki (t) be the number of customers belonging to class k in node
i at time t. Let X(t) be a vector that captures a snapshot of the state of the
network at time t and is given by X(t) = [X11 (t), X21 (t), . . . , Xki (t), . . . , XKN (t)].
Let p(x) be the steady-state probability of being in state x = (x_{11}, x_{21}, \ldots, x_{KN}), that is, p(x) = \lim_{t \to \infty} P\{X(t) = x\}.
Note that the stochastic process {X(t), t ≥ 0} is not a CTMC if the discipline is FCFS (although for the other disciplines mentioned earlier, it would be a CTMC), since the state information does not include the class of the customer in service, and hence the transition rates would differ depending on the history. For the FCFS case we would have to keep track of the class of the customer in each position of every queue in the network, which would form a CTMC. But any permutation within a queue would result in the same probability. Thus, adding across all permutations we can obtain p(x). With that understanding in place, next we describe p(x), which would be a product form, without going into details of the derivation.
Let \Lambda_i be the total arrival rate into node i aggregated over all classes, that is,

\Lambda_i = \sum_{k=1}^{K} a_{ki}.
Likewise, define the aggregate service rate \bar{\mu}_i at node i through

\frac{1}{\bar{\mu}_i} = \sum_{k=1}^{K} \frac{a_{ki}}{\mu_{ki}\, \Lambda_i}.

Note that if the discipline is FCFS, \bar{\mu}_i = \mu_{ki} for all k since \mu_{ki} does not change with k. Next, define the following for all i ∈ {1, . . . , N}:
\phi_i(n_1, n_2, \ldots, n_K) = (1 - \rho_i) \left(\sum_{k} n_k\right)! \prod_{k=1}^{K} \frac{a_{ki}^{n_k}}{n_k!\, \mu_{ki}^{n_k}},

where \rho_i = \sum_{k=1}^{K} a_{ki}/\mu_{ki}.
Using this notation it is possible to show that the steady-state probability p(x)
is given by
p(x) = \prod_{i=1}^{N} \phi_i(x_{1i}, x_{2i}, \ldots, x_{Ki}).
Problem 61
Consider an open-queueing network with K = 2 classes and N = 3 nodes.
Node 1 uses processor sharing, while nodes 2 and 3 use LCFS preemptive
resume policy. The service rates μki for class k customers in node i are μ11 = 8,
μ21 = 24, μ12 = 12, μ22 = 32, μ13 = 16, μ23 = 36. Arrivals for both classes occur
externally into node 1 at rate 1 per unit time. The routing probabilities are
p12,11 = 0.6, p22,21 = 0.7, p11,12 = 0.4, p12,13 = 0.4, p21,22 = 0.3, p22,23 = 0.3,
p11,13 = 0.3, p13,11 = 0.5, p21,23 = 0.6, p23,21 = 0.4, p13,12 = 0.5, p23,22 = 0.6,
r11 = 0.3, and r21 = 0.1. Note that class switching is not allowed in the net-
work. Compute the probability that there are one class-1 and two class-2
customers in node 1, one class-1 and one class-2 customers in node 2, and
zero class-1 and one class-2 customer in node 3.
Solution
Note that since the external arrival rate is 1 into node 1 for both classes,
we have λ11 = 1, λ21 = 1, λ12 = 0, λ22 = 0, λ13 = 0, and λ23 = 0. Using
those and solving for the simultaneous equations for the visit ratios,
we get a11 = 3.3333, a21 = 10, a12 = 2.2917, a22 = 8.0488, a13 = 1.9167, and
a_{23} = 8.4146. In fact, since no class switching is allowed, we can solve the simultaneous equations one class at a time. Therefore, we can compute the net arrival rate into node i aggregated over all the classes as \Lambda_1 = 13.3333, \Lambda_2 = 10.3404, and \Lambda_3 = 10.3313. Likewise, we can obtain \bar{\mu}_1 = 16, \bar{\mu}_2 = 23.3684, and \bar{\mu}_3 = 29.2231. Using the formulae for \phi_i(n_1, n_2),
we get \phi_1(1, 2) = 0.0362, \phi_2(1, 1) = 0.0536, and \phi_3(0, 1) = 0.1511. Note that \phi_i(n_1, n_2) actually gives us the probability that in node i there are n_1 class-1 customers and n_2 class-2 customers. Of course, the answer to the question given in the problem is the joint probability, which is the product form p(1, 2, 1, 1, 0, 1) = \phi_1(1, 2)\, \phi_2(1, 1)\, \phi_3(0, 1) = 0.00029271.
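These numbers can be reproduced with a short sketch; solving the visit-ratio equations by fixed-point iteration is an implementation choice here, not something the text prescribes:

```python
from math import factorial

def visit_ratios(lam_ext, P, iters=2000):
    """Solve a_j = lam_j + sum_i a_i P[i][j] by fixed-point iteration;
    P is sub-stochastic (customers eventually exit), so this converges."""
    a = list(lam_ext)
    n = len(a)
    for _ in range(iters):
        a = [lam_ext[j] + sum(a[i] * P[i][j] for i in range(n)) for j in range(n)]
    return a

P1 = [[0.0, 0.4, 0.3], [0.6, 0.0, 0.4], [0.5, 0.5, 0.0]]   # class-1 routing
P2 = [[0.0, 0.3, 0.6], [0.7, 0.0, 0.3], [0.4, 0.6, 0.0]]   # class-2 routing
mu = [[8.0, 12.0, 16.0], [24.0, 32.0, 36.0]]                # mu[k][i]
a1 = visit_ratios([1.0, 0.0, 0.0], P1)
a2 = visit_ratios([1.0, 0.0, 0.0], P2)

def phi(i, n1, n2):
    """phi_i(n1, n2) = (1 - rho_i)(n1 + n2)! rho_1i^n1/n1! * rho_2i^n2/n2!"""
    r1, r2 = a1[i] / mu[0][i], a2[i] / mu[1][i]
    return ((1 - r1 - r2) * factorial(n1 + n2)
            * r1 ** n1 / factorial(n1) * r2 ** n2 / factorial(n2))

p = phi(0, 1, 2) * phi(1, 1, 1) * phi(2, 0, 1)
assert abs(a1[0] - 3.3333) < 1e-3 and abs(a2[2] - 8.4146) < 1e-3
assert abs(phi(0, 1, 2) - 0.0362) < 1e-4
assert abs(p - 0.00029271) < 1e-6
```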
One can consider several extensions to the preceding model that would still give us product-form solutions. As a matter of fact, BCMP networks, when they were first introduced, considered a few more generalizations. For example, if a node uses the FCFS discipline, any number of servers is allowed; infinite-server queues with general service times are an option (not just FCFS, processor sharing, and LCFS with preemptive resume); general service times are allowed for processor-sharing and LCFS-with-preemptive-resume nodes; and closed-queueing networks are also analyzable. Subse-
quently, several research studies further generalized the BCMP networks
to include state-dependent (local and network-wide) arrivals and service,
state-dependent routing, networks with negative customers, networks with
blocking, open networks with a limited total capacity, etc. Refer to some
recent books on queueing networks such as by Serfozo [96], Chao et al.
[18], and Chen and Yao [19] for results under those general cases. In fact,
both multiclass queueing networks with fixed routing (Kelly networks) and
BCMP networks are combined into a single framework (by using routing
probabilities of 0 or 1 for Kelly networks). It is also worthwhile to con-
sider algorithms for product-form networks (especially in the extensions, we
would require the use of normalizing constants that are harder to obtain).
That said, we conclude the product-form network analysis by describing loss
networks next.
network example. Say each accepted telephone call takes 1 unit of arc capac-
ity (this is about 60 kbps). Then on arc j you can have at most Cj calls
simultaneously. Let R be the set of routes in the telephone network such
that a route r is described by a set of arcs that are traversed. In this manner,
we only focus on the arcs and not on the nodes. For all r ∈ R, telephone calls
requesting route r arrive according to a Poisson process with mean rate λr .
A call requesting route r is blocked and lost if there is no capacity available
on any of the links in the route. If the call is accepted, it uses up 1 unit of
capacity in each of the arcs in the route. The holding time for accepted calls
of class r is generally distributed with mean 1/μr .
Define Xr (t) as the number of calls in progress on route r at time t,
for all r ∈ R. Let R = |R|, the number of routes in the network. Let the
R-dimensional vector X(t) be X(t) = (X1 (t), . . . , Xr (t), . . . , XR (t)). Then the
steady-state distribution of the stochastic process {X(t), t ≥ 0} can be com-
puted as a product form (note that the process is reversible). Let p(x) be the
steady-state probability of being in state x = (x_1, x_2, \ldots, x_R), that is, p(x) = \lim_{t \to \infty} P\{X(t) = x\}.
Let us define set E as the set of all possible x satisfying the criterion that
the total number of calls in each link is less than or equal to the capacity.
Therefore, p(x) > 0 if x ∈ E and p(x) = 0, otherwise. We can write down p(x)
as a product form given by
p(x) = G(C_1, \ldots, C_J) \prod_{r=1}^{R} \frac{1}{x_r!} \left(\frac{\lambda_r}{\mu_r}\right)^{x_r}, \quad \forall x \in E,

where G(C_1, \ldots, C_J) is the normalizing constant.
Problem 62
Consider the six-node network in Figure 6.15. There are four routes in the
network. Route-1 is through nodes A–C–D–E, route-2 is through nodes
A–C–D–F, route-3 is through nodes B–C–D–E, and route-4 is through nodes
B–C–D–F. The capacities of the five arcs are described below the arcs. Each
call on a route uses one unit of the capacity of all the arcs on the route. Calls
on routes 1, 2, 3, and 4 arrive according to a Poisson process with respective
rates 10, 16, 15, 20 per hour. The average holding time for all calls is 3 min.
FIGURE 6.15
Loss network (nodes A through F; arc capacities 2, 3, 4, 2, and 3 for arcs 1 through 5, respectively).
What is the probability that in steady state there is one call on each of the
four routes?
Solution
For r = 1, 2, 3, 4, let Xr (t) be the number of calls on route r. Let
x = (x1 , x2 , x3 , x4 ), and we would like to compute p(x). First we describe E,
the set of all feasible x values. Then
E = {(x1 , x2 , x3 , x4 ) : x1 + x2 ≤ 2, x1 + x2 + x3 + x4 ≤ 4, x3 + x4 ≤ 3,
x1 + x3 ≤ 2, x2 + x4 ≤ 3}.
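Because the feasible set E is small, the product form can be normalized by brute-force enumeration (the text would instead carry this out symbolically). The sketch below uses the offered loads λ_r/μ_r implied by the 3 min (= 1/20 h) holding time:

```python
from itertools import product
from math import factorial

# Offered loads lambda_r/mu_r: arrival rates 10, 16, 15, 20 per hour, each
# with mean holding time 1/20 hour.
loads = [10 / 20, 16 / 20, 15 / 20, 20 / 20]

def feasible(x):
    x1, x2, x3, x4 = x
    return (x1 + x2 <= 2 and x1 + x2 + x3 + x4 <= 4
            and x3 + x4 <= 3 and x1 + x3 <= 2 and x2 + x4 <= 3)

def weight(x):
    """Unnormalized product-form weight prod_r (lam_r/mu_r)^{x_r}/x_r!."""
    w = 1.0
    for xr, rho in zip(x, loads):
        w *= rho ** xr / factorial(xr)
    return w

E = [x for x in product(range(5), repeat=4) if feasible(x)]
total = sum(weight(x) for x in E)          # reciprocal of G(C_1, ..., C_J)
p = {x: weight(x) / total for x in E}
assert abs(sum(p.values()) - 1.0) < 1e-12
assert (2, 1, 0, 0) not in p               # infeasible: x1 + x2 > 2
print("p(1,1,1,1) =", p[(1, 1, 1, 1)])
```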
Reference Notes
We began this chapter with acyclic networks as well as open and closed
Jackson networks. Most of the results here were adapted from Kulkarni [67].
In fact, many standard texts on queues would also typically contain a few
chapters on these topics. The main emphasis of this chapter is product-form
solutions. There is a strong connection between product-form queueing networks, the notion of reversibility, and insensitivity to the service time distribution. Note that all our product-form results use only the mean interarrival time and mean service time at every node but not the entire distribution. Of
course, the link in itself is quasi-reversibility that results in partial balance
equations. We have not gone into any of those details in this chapter but
Exercises
6.1 Consider a queueing network of single-server queues shown in Figure 6.16. Note that the external arrival process is Poisson and service times are exponential. Derive the stability condition and compute (1) the expected number of customers in the network in steady state and (2) the fraction of time the network is completely empty in steady state.
6.2 Consider a seven-node single-server Jackson network where nodes
2 and 4 get input from the outside (at rate 5 per minute each on an
average). Nodes 1 and 2 have service rates of 85, nodes 3 and 4 have
service rates of 120, node 5 has a rate of 70, and nodes 6 and 7 have
rates of 20 (all in units per minute). The routing matrix is given by
P = [p_{ij}] = \begin{bmatrix}
1/3 & 1/4 & 0 & 1/4 & 0 & 1/6 & 0 \\
1/3 & 0 & 1/4 & 0 & 1/3 & 0 & 0 \\
0 & 0 & 1/3 & 1/3 & 1/3 & 0 & 0 \\
1/3 & 0 & 1/3 & 0 & 1/3 & 0 & 0 \\
0 & 0 & 0 & 4/5 & 0 & 0 & 1/6 \\
1/6 & 0 & 1/6 & 1/6 & 1/6 & 1/6 & 0 \\
0 & 1/6 & 1/6 & 1/6 & 1/6 & 0 & 1/6
\end{bmatrix}.
FIGURE 6.16
Single-server queueing network (external arrivals at rate λ; service rates μ_1, μ_2, . . . , μ_N; a customer exits with probability 1 − p or is fed back with probability p).
Find the average number in the network and the mean delay at
each node.
6.3 Consider a closed-queueing network of single-server stations. Let
ρi = ai /μi . Show that the limiting joint distribution is given by
p(x) = \frac{1}{\gamma(C)} \prod_{i=1}^{N} \rho_i^{x_i} \quad \text{when } \sum_{i=1}^{N} x_i = C,

where \gamma(C) is the normalizing constant, whose generating function is

\tilde{G}(z) = \sum_{C=0}^{\infty} \gamma(C)\, z^C = \prod_{i=1}^{N} \frac{1}{1 - \rho_i z}.
Next, define

B_j(z) = \prod_{i=1}^{j} \frac{1}{1 - \rho_i z} = \sum_{n=0}^{\infty} b_j(n)\, z^n, \quad j = 1, 2, \ldots, N,
and show that, since B_j(z) = B_{j-1}(z)(1 - \rho_j z)^{-1}, the coefficients satisfy

b_j(n) = b_{j-1}(n) + \rho_j\, b_j(n-1).

Then one can use this recursion to compute \gamma(C) = b_N(C). Thus, \gamma(C) can be computed in O(NC) time.
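Since B_j(z) = B_{j−1}(z)/(1 − ρ_j z), the coefficients satisfy b_j(n) = b_{j−1}(n) + ρ_j b_j(n−1); the following sketch implements this O(NC) computation (the ρ values used in the sanity checks are made up):

```python
def gamma_coeffs(rhos, C):
    """Coefficients gamma(0..C) via b_j(n) = b_{j-1}(n) + rho_j * b_j(n-1),
    starting from B_0(z) = 1; overall O(NC) time."""
    b = [1.0] + [0.0] * C                 # coefficients of B_0(z) = 1
    for rho in rhos:
        for n in range(1, C + 1):         # one in-place pass turns b_{j-1} into b_j
            b[n] += rho * b[n - 1]
    return b

# Sanity check, N = 1: gamma(C) = rho^C.
assert abs(gamma_coeffs([0.5], 6)[6] - 0.5 ** 6) < 1e-12
# Sanity check, N = 2, C = 5: compare with direct enumeration over (x1, C - x1).
direct = sum(0.5 ** x * 0.8 ** (5 - x) for x in range(6))
assert abs(gamma_coeffs([0.5, 0.8], 5)[5] - direct) < 1e-12
```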
FIGURE 6.17
Schematic of a six-station network.
FIGURE 6.18
Schematic of a repair shop.
(c) Consider a closed Jackson network with two nodes and C cus-
tomers. Let X(t) be the number of customers in one of the nodes
(the other node would have C − X(t) customers). Is the following
statement TRUE or FALSE? The CTMC {X(t), t ≥ 0} is reversible.
6.12 Answer the following multiple-choice questions:
(a) In a stable open Jackson network, which of the following is the
average time spent by an entity in the network?
"N "N
(i) i = 1 Li / i = 1 λi
"N
(ii) i = 1 Li /λi
"N "N
(iii) i = 1 Li / i = 1 ai
"N
(iv) i = 1 Li /ai
(b) Consider an open Jackson network with two nodes. Arrivals
occur externally according to PP(λ) into each node. There is a
single server that takes exp(μ) service time and there is infinite
waiting room at each node. At the end of service at a node, each
customer chooses the other node or exits the system, both with
probability 1/2. What is the average number of customers in the
entire network in steady state, assuming stability?
(i) 2λ/(μ − λ)
(ii) 2λ/(μ − 3λ)
(iii) 4λ/(μ − 2λ)
(iv) 4λ/(μ − 4λ)
6.13 Consider a series system of two single-server stations. Customers
arrive at the first station according to a PP(λ) and require exp(μ1 )
service time. Once service is completed, customers go to the second
station where the service time is exp(μ2 ), and exit the system after
service. Assume the following: both queues are of infinite capacity;
λ < μ1 < μ2 ; no external arrivals into the second station; and both
queues use FCFS and serve one customer at a time. Compute the
LST of the CDF of the time spent in the entire series system.
6.14 Consider an open Jackson network with N nodes and a single-server
queue in each node. We need to determine the optimal service
rate μi at each node i for i ∈ {1, . . . , N} subject to the constraint
μ1 + · · · + μN ≤ C. The total available capacity is C, which is a given
constant. Essentially we need to determine how to allocate the avail-
able capacity among the nodes. Use the objective of minimizing the
total expected number in the system in steady state. Formulate a
nonlinear program and solve it to obtain optimal μi values in terms
of the net arrival rates aj (for all j), which are also constants.
6.15 Consider a feed-forward tandem Jackson network of N nodes and
the arrivals to the first node is PP(λ). The service rate at node i is
exp(μi ). We have a pool of S workers that need to be assigned one
time to the N nodes. Formulate an optimization problem to determine s_i, the number of servers in node i (for i ∈ {1, . . . , N}) so that s_1 + · · · + s_N ≤ S, if the objective is to minimize the total expected number in the system in steady state. Describe an algorithm to derive the
optimal allocation.
7
Approximations for General Queueing Networks
Jackson networks and their extensions that we saw in the previous chapter lent themselves very nicely to performance analysis. In particular, they resulted in a product-form solution that enabled us to decompose the queueing network so that individual nodes can be analyzed in isolation. However,
a natural question to ask is: What if the conditions for Jackson networks are
not satisfied? In this chapter, we are especially interested in analyzing gen-
eral queueing networks where the interarrival times or service times or both
can be according to general distributions. In particular, how do you ana-
lyze open queueing networks if each node cannot be modeled as an M/M/s,
M/G/c/c, or M/G/∞ queue? For example, the departure process from an
M/G/1 FCFS queue is not even a renewal process, let alone a Poisson pro-
cess. So if this set of customers departing from an M/G/1 queue join another
queue, we would not know how to analyze that queue because we do not
have results for queues where the arrivals are not according to a renewal
process.
So how do we analyze general queueing networks? In the most general
case, our only resort is to develop approximations. It is worthwhile to men-
tion that one way is to use discrete-event simulations. There are several
computer-simulation software packages that can be used to obtain queue-
ing network performance measures numerically. At the time of writing this
book, the commonly used packages are Arena and ProModel especially
for manufacturing systems applications. However, although simulation
methodology is arguably the most popular technique in the industry, it is not
ideal for developing insights and intuition, performing quick what-if anal-
ysis, obtaining symbolic expressions that can be used for optimization and
control, etc. For those reasons we will mainly consider analytical models that
can suitably approximate general queueing networks. However, one of the
objectives of the analytical approximations is that they must possess under-
lying theory, must be exact under special cases or asymptotic conditions and
reasonably accurate under other conditions, and must be relatively easy to
implement.
In that spirit we will consider, for example, approximations based on
reflected Brownian motion (we will define and characterize this subse-
quently). One of the major benefits of reflected Brownian motion is that it
can be modeled using just the mean and variance of the interarrival time as
to make a scaling argument appropriately as done in Chen and Yao [19] (see
Chapter 8). However, before we proceed with the Brownian approximation,
we first need to write down the relevant performance measures for the queue
in terms of A(t) and S(t). We do that next.
We first describe some notation. Let X(t) denote the number of customers
in the G/G/1 queue at time t with X(0) = x0 , a given finite constant number
of customers initially. To write down X(t) in terms of A(t) and S(t), it is
important to know how long the server was busy and idle during the time
period 0 to t. For that, let B(t) and I(t), respectively, denote the total time the
server has been busy and idle from time 0 to t. We emphasize that the server is work conserving, which means the server would be idle if and only if there are no customers in the system. Note that
B(t) + I(t) = t.
Further,

X(t) = x_0 + A(t) - S(B(t)), \quad (7.1)

since the total number in the system at time t equals all the customers that were present at time 0, plus all those that arrived in time 0 to t, minus those that departed in time 0 to t. Note that while writing the number of departures we need to be careful to use only the time the server was busy. Hence we get the preceding result.
Equation 7.1 is not conducive to obtaining an expression for X(t). Hence we rewrite it as follows:

X(t) = U(t) + V(t),

where

U(t) = x_0 + A(t) - \mu t + \mu B(t) - S(B(t)) \quad \text{and} \quad V(t) = \mu I(t).
Verify that the preceding result yields Equation 7.1 realizing that
B(t) + I(t) = t. Note that we are ultimately interested in the steady-state dis-
tribution of X(t); however, to do that we start by computing the expected
value and variance of U(t), for large t. Thus we have for large t
E[U(t)] = x_0 + (\lambda - \mu)t,
Var[U(t)] \approx \lambda C_a^2\, t + \lambda C_s^2\, t,
since we have from renewal theory E[A(t)] = \lambda t and E[S(B(t)) | B(t)] = \mu B(t) for any B(t). However, the variance result is a lot more subtle. Note that for a large t, the total busy period can be approximated as B(t) \approx (\lambda/\mu)t since \lambda/\mu is the fraction of time the server would be busy. Hence we write down B(t) = (\lambda/\mu)t in the expression for U(t) and then take the variance. However, since we know that Var[A(t)] = \lambda C_a^2 t and Var[S((\lambda/\mu)t)] = \lambda C_s^2 t, we get the preceding approximate result for Var[U(t)]. Note that E[U(t)] is exact for any t; however, Var[U(t)] is reasonable only for large t, that is, in the asymptotic case.
It is straightforward to see that for large t, if A(t) and S(t) are normally distributed random variables, then U(t) is normally distributed with mean x_0 + (\lambda - \mu)t and variance \lambda(C_a^2 + C_s^2)t. Therefore, if {A(t), t ≥ T} and {S(t), t ≥ T} can be approximated as Brownian motions for some large T, then from the description of U(t), {U(t), t ≥ T} is also a Brownian motion with initial state x_0, drift \lambda - \mu, and variance \lambda(C_a^2 + C_s^2).
Next we seek to answer the question: If {U(t), t ≥ 0} is a Brownian motion, then what about {X(t), t ≥ 0}? To answer this we observe that the following relations ought to hold for all t ≥ 0:

X(t) \ge 0, \quad (7.2)

\frac{dV(t)}{dt} \ge 0 \text{ with } V(0) = 0, \quad (7.3)

and

X(t)\, \frac{dV(t)}{dt} = 0. \quad (7.4)
Problem 63
Given U(t), show that there exists a unique pair X(t) and V(t) such that X(t) = U(t) + V(t), which satisfy conditions (7.2 through 7.4). Also show that the unique pair X(t) and V(t) can be written in terms of U(t) as follows:

V(t) = \sup_{0 \le s \le t} \max\{-U(s), 0\}, \quad (7.5)

X(t) = U(t) + \sup_{0 \le s \le t} \max\{-U(s), 0\}. \quad (7.6)
Solution
We first show that X(t) and V(t) are a unique pair. For that, consider another pair X̂(t) and V̂(t) such that given U(t) for all t, X̂(t) = U(t) + V̂(t) and the pair X̂(t) and V̂(t) satisfy conditions (7.2 through 7.4). Hence X̂(t) ≥ 0, dV̂(t)/dt ≥ 0 with V̂(0) = 0, and X̂(t)dV̂(t)/dt = 0. We show that the only way this can happen is if X(t) = X̂(t) (and hence V(t) = V̂(t), because U(t) = X(t) − V(t) = X̂(t) − V̂(t)). For that, consider (1/2){X(t) − X̂(t)}² and write it as follows (the first equation is an artifact of integration and uses the fact that X(0) = X̂(0) = x_0; the second equation is due to substituting X(u) − X̂(u) by V(u) − V̂(u) since U(u) = X(u) − V(u) = X̂(u) − V̂(u) for all u; the last equation can be derived using condition (7.4), i.e., X(u)dV(u) = 0 and X̂(u)dV̂(u) = 0):
1 t
{X(t) − X̂(t)}2 = {X(u) − X̂(u)}d{X(u) − X̂(u)},
2
0
t
= {X(u) − X̂(u)}d{V(u) − V̂(u)},
0
t t
=− X(u)dV̂(u) − X̂(u)dV(u).
0 0
However, based on conditions (7.2) and (7.3), X(u), dV̂(u), X̂(u), and dV(u)
are all ≥ 0. Thus we have

(1/2){X(t) − X̂(t)}² ≤ 0.

But the LHS is nonnegative. So the only way this result holds is if
X(t) = X̂(t). Hence the pair X(t) and V(t) such that X(t) = U(t) + V(t), which
satisfy conditions (7.2) through (7.4), is unique.
Having shown that X(t) and V(t) is unique, we now proceed to show that
V(t) and X(t) defined in Equations 7.5 and 7.6 satisfy X(t) = U(t) + V(t) and
the conditions (7.2 through 7.4). Subtracting Equation 7.5 from Equation 7.6,
we get X(t) = U(t) + V(t). Since max{−U(s), 0} ≥ 0 for all s, from the defi-
nition of V(t), we have V(t) ≥ 0. Thus if U(t) ≥ 0, X(t) = U(t) + V(t) ≥ 0.
Now, if U(t) < 0, from the definition of the supremum we have V(t) ≥ −U(t)
since V(t) ≥ −U(s) for all s such that 0 ≤ s ≤ t based on Equation 7.5. Since
V(t) ≥ −U(t), U(t)+V(t) ≥ 0, hence X(t) ≥ 0. Thus condition (7.2) is verified.
Next, to show condition (7.3) is satisfied, we first note that since U(0) = x0,
which is nonnegative, we have V(0) = max{−U(0), 0} = 0. Also, for any
dt ≥ 0 we have V(t + dt) ≥ V(t), since the supremum over time 0 to t + dt
must be greater than or equal to the supremum over any interval within 0 to
t + dt, in particular, 0 to t. Thus we have dV(t)/dt ≥ 0, verifying condition
(7.3). Finally, condition (7.4) also holds: whenever dV(t)/dt > 0, the supremum
in Equation 7.5 is attained at time t, so V(t) = −U(t) and hence
X(t) = U(t) + V(t) = 0; conversely, while X(t) > 0, the supremum does not
change, so dV(t)/dt = 0.
Based on the characteristics of this result, X(t) is called the reflected pro-
cess of U(t) and V(t) the regulator of U(t). From the expression for X(t) in
Equation 7.6, we can conclude that if {U(t), t ≥ 0} is a Brownian motion
with initial state x0 , drift θ, and variance σ2 , then {X(t), t ≥ 0} is a reflected
Brownian motion (sometimes also called Brownian motion with reflecting
barrier on the x-axis). To illustrate the Brownian motion and the reflected
Brownian motion, we simulated a single sample path of U(t) for 1000 time
units sampled at discrete time points 1 time unit apart. Using numerical val-
ues for initial state x0 = 6, drift −0.01, and variance 0.09, a sample path of
U(t) is depicted in Figure 7.1. Using the relation between U(t) and X(t) in
Equation 7.6, we generated X(t) values corresponding to U(t). Although this
is only a sample path, note from Figure 7.2, the reflected Brownian motion
starts at x0 and then keeps getting reflected at the origin and behaves like
a Brownian motion at other points. Since the drift is negative, the reflected
Brownian motion hits the origin (i.e., X(t) = 0) infinitely often. In the spe-
cific case of the G/G/1 queue, we showed that the {U(t), t ≥ 0} process
especially for large t is a Brownian motion with drift (λ − μ) and vari-
ance λ(C2a + C2s ). Then the {X(t), t ≥ 0} process is a corresponding reflected
Brownian motion.
Having described an approximation for the number in the system process
{X(t), t ≥ 0} as a reflected Brownian motion, we remark that it is rather awk-
ward to approximate a discrete quantity X(t) by a continuous process such
FIGURE 7.1
Simulation of Brownian motion {U(t), t ≥ 0}, plotted for 0 ≤ t ≤ 1000.

FIGURE 7.2
Generated reflected Brownian motion {X(t), t ≥ 0}, plotted for 0 ≤ t ≤ 1000.
as a reflected Brownian motion. Bolch et al. [12] get around this by map-
ping the probability density function of the reflected Brownian motion to a
probability mass function of the number in the system in steady state. That
is certainly an excellent option. However, here we follow the literature on
diffusion approximations or heavy-traffic approximations. In particular, we
approximate the workload process in terms of the number in the system as

W(t) ≈ X(t)/μ.
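As a quick numerical illustration, the reflected path of Figure 7.2 can be regenerated from a simulated Brownian path by applying the regulator of Equation 7.5 directly. The following is a minimal sketch (using numpy; the parameter values mirror the text's illustration, x0 = 6, drift −0.01, variance 0.09, unit time steps):

```python
import numpy as np

# Generate a Brownian path U(t) and its reflection X(t) via the regulator
# V(t) = sup_{0<=s<=t} max(-U(s), 0), i.e., Equations 7.5 and 7.6.
rng = np.random.default_rng(seed=1)
x0, drift, variance, n = 6.0, -0.01, 0.09, 1000

steps = rng.normal(loc=drift, scale=np.sqrt(variance), size=n)
U = np.concatenate(([x0], x0 + np.cumsum(steps)))   # U(0), U(1), ..., U(n)
V = np.maximum.accumulate(np.maximum(-U, 0.0))      # regulator, Equation 7.5
X = U + V                                           # reflected process, Equation 7.6

print(X.min() >= 0.0, X[0] == x0)   # X stays nonnegative and starts at x0
```

Note that while V(t) = 0 (i.e., before the path first threatens to go negative), X(t) coincides with U(t), exactly as in Figures 7.1 and 7.2.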
Problem 64
Show that for any reflected Brownian motion {W(t), t ≥ 0} with drift θ (such
that θ < 0) and variance σ2 , the steady-state distribution is exponential with
parameter −2θ/σ2 .
Solution
Consider a reflected Brownian motion {W(t), t ≥ 0} with initial state w0, drift
θ, and variance σ². Let the cumulative distribution function F(t, x; w0) be
defined as F(t, x; w0) = P{W(t) ≤ x | W(0) = w0}. It satisfies the diffusion
equation

∂F(t, x; w0)/∂t = −θ ∂F(t, x; w0)/∂x + (σ²/2) ∂²F(t, x; w0)/∂x². (7.7)

In steady state, F(t, x; w0) converges to a limit F(x) that does not depend on
t or w0, so that ∂F/∂t → 0 and Equation 7.7 reduces to the ordinary
differential equation

−θ dF(x)/dx + (σ²/2) d²F(x)/dx² = 0
which can be solved by integrating once with respect to x and then using
standard differential equation techniques to yield

F(x) = a e^{2θx/σ²} − c/θ

for some constants a and c that are to be determined. Using the boundary
condition F(0) = 0 and the CDF property F(∞) = 1, we get c = −θ and a = −1.
Thus we have

F(x) = 1 − e^{2θx/σ²}.
Therefore, any reflected Brownian motion with drift θ (such that θ < 0) and
variance σ2 has a steady-state distribution that is exponential with parameter
−2θ/σ2 .
For the G/G/1 queue described earlier, since we approximated the workload
process {W(t), t ≥ 0} as a reflected Brownian motion with initial state
w0 = x0/μ, drift θ = (λ − μ)/μ, and variance σ² = λ(C²a + C²s)/μ², we have the
expected workload in steady state (using the preceding problem, where we
showed the steady-state distribution is exponential with mean −σ²/(2θ)) as

Wq = λ(C²a + C²s) / (2(1 − ρ)μ²). (7.8)
Remark 15
The expression for Wq in Equation 7.8 is exact when the arrivals are Poisson.
In other words, if we had an M/G/1 queue, then based on the preceding
result

Wq = λ(1 + C²s) / (2(1 − ρ)μ²).
Problem 65
Simulate a G/G/1 queue with mean arrival rate λ = 1, with interarrival
times and service times both following Pareto distributions. Generate 100
replications and in each run use 1 million customer departures to obtain the
time in the system for various values of ρ, C²a, and C²s. Compare the sim-
ulation results against the approximation for W that can be derived from
Equation 7.8.
Solution
For this problem, we are given λ, ρ, C²a, and C²s. Using Equation 7.8, we
can derive an analytical expression (in terms of those four quantities) for the
expected time in the system in steady state as

W = ρ/λ + ρ²(C²a + C²s) / (2(1 − ρ)λ).

To match the mean interarrival time 1/λ and SCOV C²a, we use a Pareto
distribution for the interarrival times with CDF F(x) = 1 − (ka/x)^βa for
x ≥ ka, whose parameters are

βa = 1 + √(1 + 1/C²a),

ka = (βa − 1)/(λβa).
Next, we can easily obtain the inverse distribution for the CDF as
F⁻¹(u) = ka(1 − u)^{−1/βa}. In a similar manner, one can obtain ks, βs, and the
inverse of the service time CDF by changing all the arrival subscripts from a
to s and λ to μ = λ/ρ.
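This parameter matching and inversion sampling can be sketched as follows (function names are ours, not from the text):

```python
import numpy as np

# Choose Pareto parameters (beta, k) to match a target rate lam (mean 1/lam)
# and SCOV C2, then sample using F^{-1}(u) = k (1 - u)^{-1/beta}.
def pareto_params(lam, C2):
    beta = 1.0 + np.sqrt(1.0 + 1.0 / C2)
    k = (beta - 1.0) / (lam * beta)
    return beta, k

def pareto_sample(lam, C2, size, rng):
    beta, k = pareto_params(lam, C2)
    return k * (1.0 - rng.random(size)) ** (-1.0 / beta)

rng = np.random.default_rng(seed=3)
x = pareto_sample(lam=1.0, C2=0.49, size=200_000, rng=rng)
print(x.mean())   # should be close to the target mean 1/lam = 1
```

For heavy-tailed cases such as C² = 2 (shape βa ≈ 2.22), sample averages converge very slowly, which foreshadows the wide confidence intervals discussed next.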
We perform 100 replications of the simulation algorithm for each set of
λ, ρ, C2a , and C2s values. In each replication, we run the simulation till 1 mil-
lion customers are served to obtain the average time in the system over the
1 million customers. Using the 100 sample averages we obtain a confidence
TABLE 7.1
Comparing Simulation Confidence Interval against Analytical Approximation

ρ     C²a    C²s    Confidence Interval for W (via Simulations)    Analytical Approx. for W
0.9   1.00   2.00   (8.0194, 12.2170)                              13.0500
0.6   1.00   2.00   (1.2347, 1.5313)                                1.9500
0.9   0.49   2.00   (1.1135, 22.0137)                              10.9845
0.9   2.00   2.00   (9.1947, 11.0142)                              17.1000
0.6   2.00   2.00   (0, 7.7464)                                     2.4000
0.6   0.49   2.00   (1.0702, 1.7480)                                1.7205
0.6   0.49   0.49   (0.7926, 0.8577)                                1.0410
0.9   0.49   0.49   (3.8001, 3.9382)                                4.8690
0.9   4.00   0.49   (5.8341, 5.9503)                               19.0845
0.6   4.00   0.49   (0.8837, 0.9007)                                2.6205
interval (three standard deviations on each side of the grand average across
the 100 sample averages). We tabulate the results in Table 7.1. Notice that
λ = 1 in all cases.
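The replication experiments above rest on simulating the G/G/1 queue customer by customer. A minimal single-run sketch is the Lindley recursion, a standard approach we adopt here (the text does not prescribe a particular method); it is sanity-checked on M/M/1, where the exact mean sojourn time is W = 1/(μ − λ):

```python
import random

# A minimal G/G/1 simulation via the Lindley recursion
#   Wq_{n+1} = max(Wq_n + S_n - A_{n+1}, 0),  sojourn_n = Wq_n + S_n.
# The two sampler arguments stand in for any interarrival/service generators
# (e.g., the Pareto samplers of this problem).
def gg1_mean_sojourn(interarrival, service, n_customers, seed=0):
    rng = random.Random(seed)
    wq, total = 0.0, 0.0
    for _ in range(n_customers):
        s = service(rng)
        total += wq + s                          # this customer's sojourn
        wq = max(wq + s - interarrival(rng), 0.0)
    return total / n_customers

# Sanity check on M/M/1 with lam = 1, mu = 2, where exactly W = 1/(mu - lam) = 1:
w = gg1_mean_sojourn(lambda r: r.expovariate(1.0),
                     lambda r: r.expovariate(2.0),
                     200_000)
print(w)   # close to 1
```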
From the results of the previous problem it appears that in most cases
the analytically predicted W does not even fall within the confidence inter-
val, leave alone being close to the grand average. However, it is not clear
from the preceding text whether that is because of the simulation with Pareto
distribution or the accuracy of the approximation. Looking at how wide
the confidence intervals are, there is an indication that it might be an inaccuracy
in the simulation. It is worthwhile to investigate this issue further by
considering an M/G/1 queue where the service times are according to Pareto
distribution where we know the exact steady-state mean sojourn time. For a
similar situation where we run 100 replications of 1 million customers in
each replication for ρ = 0.9, λ = 1, C2s = 2 with Pareto distribution for ser-
vice times, we get a confidence interval of (10.2088, 14.2815). Although the
exact result using the Pollaczek–Khintchine formula in Equation 4.6 yields
W = 13.05, which is within the confidence interval, the grand average (of
12.2452) using the simulation runs is still significantly away considering it
is averaged over as many as 100 million customers. Another thing to notice
is that the grand average is smaller than the expected analytical value. One
reason for this is that the extremely rare event of seeing a humongous ser-
vice time has not been realized in the simulations but has been accounted
for in the analytical models. There are some research papers that have
been addressing similar problems of determining how many simulation
runs would be needed to predict performance under Pareto distributions
with reasonable accuracy. Having said that, we will move along with the approximation.
7.1.2.1 Superposition
With that we begin with the first step, namely superposition of flows. Con-
sider m flows with known characteristics that are superposed into a single
flow that acts as the arrival stream to a queue. For i = 1, . . . , m, let θi and C²i be the
average arrival rate of customers as well as the squared coefficient of
variation of interarrival times on flow i. Likewise, let θ and C² be the effective
arrival rate as well as the effective squared coefficient of variation of inter-
arrival times obtained as a result of superposition of the m flows. Given
these, let Ni(t) denote the number of arrivals on flow i during (0, t], and let
N(t) = N1(t) + ··· + Nm(t) for all t.

FIGURE 7.3
Modeling a node of the network: superposition of inflows, flow through the queue, and splitting of outflows.

FIGURE 7.4
Superposition of flows (θ1, C²1), . . . , (θm, C²m) into a single flow (θ, C²).

For large t, we know from renewal theory that Ni(t) for all
i ∈ {1, . . . , m} is normally distributed with mean θi t and variance θi C²i t. Also,
since N(t) is the sum of m independent normally distributed quantities
(for large t), N(t) is also normally distributed. Taking the expectation and
variance of N(t) we get
E[N(t)] = E[N1(t)] + ··· + E[Nm(t)] = Σ_{i=1}^{m} θi t,

Var[N(t)] = Var[N1(t)] + ··· + Var[Nm(t)] = Σ_{i=1}^{m} θi C²i t.

Matching these with the counting process of a renewal process whose interrenewal
times have rate θ and SCOV C², we get

θ = θ1 + ··· + θm,

C² = Σ_{i=1}^{m} (θi/θ) C²i.
Notice how this is derived by going backward, that is, originally we started
with a renewal process with a mean and squared coefficient of variation of
interrenewal times, and then derived the distribution of the counting process
N(t) for large t; but here we reverse that. It is crucial to notice that the actual
interevent times of the aggregated superposed process are not IID, and
hence the process is not truly a renewal process. However, we use the results
as an approximation, treating the superposed process as a renewal
process.
θd = θa .
Note that we have derived an expression for C²d in Chapter 4 using mean
value analysis (MVA) in Equation 4.16. We now rewrite that expression
in terms of θa, θs, ρ = θa/θs, C²a, and C²s as

C²d = (1 − ρ²)C²a + ρ²C²s.

FIGURE 7.5
Flow through a queue: arrivals (θa, C²a) served at (θs, C²s) produce departures (θd, C²d).
FIGURE 7.6
Bernoulli splitting of a flow (θ, C²) into n streams, where stream i is selected with probability pi and has parameters (θi, C²i).
If each customer of a flow with parameters (θ, C²) is independently routed to
stream i with probability pi, then stream i has

θi = pi θ,

C²i = pi C² + 1 − pi.
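The three building blocks can be collected into small helper functions (a sketch; the function names are ours). A useful sanity check is that superposing independent Poisson flows, flowing through a queue with exponential service, and Bernoulli splitting all preserve an SCOV of 1:

```python
# Building blocks of Section 7.1.2: superposition of renewal flows,
# flow through a G/G/1 queue, and Bernoulli splitting.
def superpose(rates, scovs):
    theta = sum(rates)
    C2 = sum(r * c for r, c in zip(rates, scovs)) / theta
    return theta, C2

def flow_through_queue(Ca2, Cs2, rho):
    # SCOV of interdeparture times (Equation 4.16 rewritten)
    return (1.0 - rho**2) * Ca2 + rho**2 * Cs2

def split(theta, C2, p):
    return p * theta, p * C2 + 1.0 - p

# Poisson in, exponential service, Bernoulli split -- all SCOVs stay 1:
print(superpose([1.0, 2.0], [1.0, 1.0]))      # (3.0, 1.0)
print(flow_through_queue(1.0, 1.0, 0.8))      # 1.0
print(split(3.0, 1.0, 0.4))                   # (1.2, 1.0) up to rounding
```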
For the open queueing network, the effective arrival rates aj into node j satisfy

aj = λj + Σ_{i=1}^{N} pij ai. (7.9)

The condition for stability of node i is ai < μi, and the traffic intensity of
node i is

ρi = ai/μi.
Once the ai ’s are obtained, the only parameter left to compute to use in the
G/G/1 result is the squared coefficient of variation of the interarrival times
into node i that we denote as C2a,i . We use an approximation that the net
arrivals into node i (for all i) is according to a renewal process and obtain
an approximate expression for C2a,i using the results for superposition, flow
through a queue, and splitting that we saw in the previous section. But we
require a feed-forward network for that so that we can perform superposi-
tion, flow, and splitting as we go forward in the network. However, as an
approximation, we consider any generic network and show (see following
problem) that for all j such that 1 ≤ j ≤ N,
C²a,j = (λj/aj) C²Aj + Σ_{i=1}^{N} (ai pij/aj) [1 − pij + pij((1 − ρ²i)C²a,i + ρ²i C²Si)]. (7.10)
Therefore, there are N such equations for the N unknowns C2a,1 , . . . , C2a,N ,
which can be solved either by writing as a matrix form or by iterating start-
ing with an arbitrary initial C2a,j for each j. Once they are solved, we can
Note that this result is exact for the single-server Jackson network. Before pro-
gressing further, we take a moment to derive the expression for C2a,j described
in Equation 7.10 as a problem.
Problem 66
Using the terminology, notation, and expressions derived until Equa-
tion 7.10, show that for a feed-forward network
C²a,j = (λj/aj) C²Aj + Σ_{i=1}^{N} (ai pij/aj) [1 − pij + pij((1 − ρ²i)C²a,i + ρ²i C²Si)].
Solution
Consider the results for superposition of flows, flow through a queue, as well
as Bernoulli splitting, all described in Section 7.1.2. Based on that we know
that if node i has renewal interarrival times with mean 1/ai and squared
coefficient of variation C2a,i (and service times with mean 1/μi and squared
coefficient of variation C2Si ), then the interdeparture times have mean 1/ai
and squared coefficient of variation (1 − ρ²i)C²a,i + ρ²i C²Si. Since the probability
that a departing customer from node i will join node j is pij, the interarrival
times of customers going from node i to node j have mean 1/(ai pij) and squared
coefficient of variation 1 − pij + pij[(1 − ρ²i)C²a,i + ρ²i C²Si]. Since the aggregate
arrivals to node j are from all such nodes i as well as external arrivals, the
effective interarrival times into node j have a squared coefficient of variation
(defined as C²a,j) given by

C²a,j = (λj/aj) C²Aj + Σ_{i=1}^{N} (ai pij/aj) [1 − pij + pij((1 − ρ²i)C²a,i + ρ²i C²Si)]
probabilities pij . The algorithm to obtain Lj for all j ∈ {1, 2, . . . , N} given mean
and squared coefficient of variation of IID service times (i.e., 1/μi and C2Si ) as
well as mean and squared coefficient of variation of IID external interarrival
times into node i (i.e., 1/λi and C2Ai ) for all i = 1, 2, . . . , N is as follows:
Problem 67
For a single-server open queueing network described in Figure 7.7, cus-
tomers arrive externally only into node 0 and exit the system only from
node 5. Assume that the interarrival times for customers coming externally
has a mean 1 min and standard deviation 2 min. The mean and stan-
dard deviation of service times (in minutes) at each node is described in
Table 7.2. Likewise, the routing probabilities pij from node i to node j are provided

FIGURE 7.7
Single-server open queueing network with nodes 0 through 5.
TABLE 7.2
Mean and Standard Deviation of Service Times (min)

Node i                           0     1      2       3    4    5
Mean 1/μi                        0.8   1.25   1.875   1    1    0.5
Standard deviation √(C²Si/μ²i)   1     1      1       1    1    1
TABLE 7.3
Routing Probabilities from Node on Left to Node on Top
0 1 2 3 4 5
0 0 0.5 0.5 0 0 0
1 0 0 0 1 0 0
2 0 0.2 0 0.3 0.5 0
3 0 0 0 0 0 1
4 0 0 0 0.2 0 0.8
5 0 0 0 0 0.4 0
in Table 7.3. Using this information compute the steady-state average num-
ber of customers at each node of the network as well as the mean sojourn
time spent by customers in the network.
Solution
Since this is an open queueing network of single-server queues, to solve the
problem we use the algorithm described earlier (with the understanding that
the network is not a feed-forward network). Note the slight change from the
original description where nodes moved from 1 to N; however, here it is
from 0 to N − 1, where N = 6. From the problem description we can directly
obtain P from Table 7.3 for nodes ordered {0, 1, 2, 3, 4, 5} as
P =
[ 0  0.5  0.5  0    0    0
  0  0    0    1    0    0
  0  0.2  0    0.3  0.5  0
  0  0    0    0    0    1
  0  0    0    0.2  0    0.8
  0  0    0    0    0.4  0 ]
Also, the external arrival rate is 1 per minute for node 0 and zero for all
other nodes. Based on Equation 7.9 we have a = [a0 a1 a2 a3 a4 a5 ] =
[λ0 λ1 λ2 λ3 λ4 λ5 ][I − P]−1 with λ0 = 1, and with λi = 0 for i > 0, we
have a = [1.0000 0.6000 0.5000 0.9333 0.9167 1.6667] effective arrivals per
minute into the various nodes. Using the service rates μi for every node i
described in Table 7.2 we can obtain the traffic intensities for various nodes
as [ρ0 ρ1 ρ2 ρ3 ρ4 ρ5 ] = [0.8000 0.7500 0.9375 0.9333 0.9167 0.8333]. Clearly,
all the nodes are stable; however, note how some nodes have fairly high
traffic intensities.
Next we obtain the squared coefficients of variation for the effective inter-
arrival times. For that, from the problem description we have the squared
coefficients of variation for external arrivals as

[C²A0 C²A1 C²A2 C²A3 C²A4 C²A5] = [4 0 0 0 0 0]
and those of the service times we can easily compute from Table 7.2 as

[C²S0 C²S1 C²S2 C²S3 C²S4 C²S5] = [1.5625 0.6400 0.2844 1.0000 1.0000 4.0000].
Thereby we use Equation 7.10 for all j to derive C²a,j using the following steps:
First obtain the row vector ψ = [ψj] for j ∈ {0, 1, 2, 3, 4, 5} as

ψj = λj C²Aj + Σ_i ai pij(1 − pij) + Σ_i ai ρ²i C²Si p²ij

so that Equation 7.10 reduces to the linear system
C²a,j = ψj/aj + Σ_i (ai p²ij/aj)(1 − ρ²i)C²a,i. Using the preceding computation
we can derive

[ψj/aj] = [4.0000 0.9750 1.0000 0.5461 1.4149 0.8716]

and, solving the linear system,

[C²a,j] = [4.0000 1.5819 1.7200 1.0107 1.5349 1.0308].
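These computations can be reproduced numerically. The sketch below (using numpy) solves the traffic equations and iterates Equation 7.10 to a fixed point rather than forming a linear system explicitly; both routes give the same solution:

```python
import numpy as np

# Reproducing the single-class computation for Problem 67: solve the traffic
# equations a = lam0 (I - P)^{-1}, then iterate Equation 7.10 to a fixed point
# for the interarrival-time SCOVs.
P = np.array([[0, .5, .5, 0, 0, 0],
              [0, 0, 0, 1., 0, 0],
              [0, .2, 0, .3, .5, 0],
              [0, 0, 0, 0, 0, 1.],
              [0, 0, 0, .2, 0, .8],
              [0, 0, 0, 0, .4, 0]])
lam0 = np.array([1.0, 0, 0, 0, 0, 0])            # external arrival rates
mu = 1.0 / np.array([0.8, 1.25, 1.875, 1, 1, 0.5])
C2A = np.array([4.0, 0, 0, 0, 0, 0])             # external interarrival SCOVs
C2S = mu ** 2                                    # std dev of service is 1 min

a = lam0 @ np.linalg.inv(np.eye(6) - P)          # Equation 7.9
rho = a / mu
C2a = np.ones(6)
for _ in range(100):                             # fixed point of Equation 7.10
    bracket = (1 - rho**2) * C2a + rho**2 * C2S
    total = ((a[:, None] * P) * (1 - P + P * bracket[:, None])).sum(axis=0)
    C2a = (lam0 * C2A + total) / a
print(np.round(rho, 4))   # [0.8    0.75   0.9375 0.9333 0.9167 0.8333]
print(np.round(C2a, 4))   # [4.     1.5819 1.72   1.0107 1.5349 1.0308]
```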
single server. There are C customers in total in the network, with no external
arrivals or departures. For i = 1, . . . , N, service times of customers at node
i are IID random variables with mean 1/μi and squared coefficient of varia-
tion C2Si . When a customer completes service at node i, the customer joins the
queue at node j with probability pij so that the routing matrix P = [pij ] has all
rows summing to one. We present two algorithms based on Bolch et al. [12],
one for large C called bottleneck approximation and the other for small C that
uses MVA. There are other algorithms such as maximum entropy method
(see Bolch et al. [12]) that are not described here.
The visit ratios vj satisfy

vj = Σ_{i=1}^{N} pij vi (7.12)

for all j. Let b denote the bottleneck node, that is, the node with the largest
vi/μi value among all i. Define λ as λ = μb/vb. We obtain the traffic intensities
ρj (for all j) using the following approximation:

ρj = λvj/μj.
Of course that would result in ρb = 1 for the bottleneck node and hence we
have to be careful as we will see subsequently.
We can also immediately obtain ai = λvi for all i. Then, the only param-
eter left to compute to use in the G/G/1 result is the squared coefficient of
variation of the interarrival times into node i that we denote as C2a,i for all i.
Since the external arrival rate is zero, this would just be a straightforward
adjustment of Equation 7.10 for all j such that 1 ≤ j ≤ N as:
C²a,j = Σ_{i=1}^{N} (ai pij/aj) [1 − pij + pij((1 − ρ²i)C²a,i + ρ²i C²Si)]. (7.13)
Here too there are N such equations for the N unknowns C2a,1 , . . . , C2a,N , which
can be solved either by writing as a matrix form or by iterating starting with
an arbitrary initial C2a,j for each j. Once they are solved, we can use Equa-
tion 7.8 to obtain an approximate expression for the steady-state average
number of customers in node j for all j ≠ b as

Lj ≈ ρj + ρ²j (C²a,j + C²Sj) / (2(1 − ρj)). (7.14)
Note that if we used this equation for node b, the denominator would go to
infinity. However, since the total number in the entire network is C, we can
easily obtain Lb using
Lb = C − Σ_{j≠b} Lj.
1. Obtain visit ratios vj into node j by solving Equation 7.12 for all j.
2. Identify the bottleneck node b as the node with the largest vi /μi value
among all i ∈ [1, N].
3. Let λ = μb /vb and obtain aggregate customer arrival rate aj into node
j as aj = λvj for all j.
4. Using aj , obtain the traffic intensity ρj at node j as ρj = aj /μj .
5. Solve Equation 7.13 for all j and derive C2a,j using the N simultaneous
equations.
6. Using the derived values of ρj and C²a,j for all j ≠ b, plug them into
Equation 7.14 and obtain Lj approximately.
7. Finally, Lb = C − Σ_{i≠b} Li.
• Wi (k): Average sojourn time in node i when there are k customers (as
opposed to C) in the closed queueing network
• Li (k): Average number in node i when there are k customers (as
opposed to C) in the closed queueing network
• λ(k): Measure of average flow (sometimes also referred to as
throughput) in the closed queueing network when there are k
customers (as opposed to C) in the network
We do not have an expression for any of the preceding metrics, and the objec-
tive is to obtain them iteratively. However, before describing the iterative
algorithm, we first explain the relationship between those parameters.
As a first approximation, we assume that the arrival theorem described
in Remark 14 holds here too. Thus in a network with k customers (such that
1 ≤ k ≤ C) the expected number of customers that an arrival to node i (for
any i ∈ {1, . . . , N}) would see is Li (k − 1). Note that Li (k − 1) is the steady state
expected number of customers in node i when there are k − 1 customers in
the system. Further, the net mean sojourn time experienced by that arriving
customer in steady state is the average time to serve all those in the system
upon arrival plus that of the customer. Note that the average service time is
1/μi for all customers waiting and (1 + C2Si )/(2μi ) for the customer in service
(using the remaining time for an event in steady state for a renewal process).
Thus we have

Wi(k) = 1/μi + Li(k − 1)(1 + C²Si)/(2μi).

Applying Little's law across the whole network with visit ratios vi, the
throughput is

λ(k) = k / Σ_{i=1}^{N} vi Wi(k)

when there are k customers in the network. Thereby applying Little's law
across each node i we get

Li(k) = λ(k) vi Wi(k).
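The recursion can be sketched as follows, assuming the update Wi(k) = 1/μi + Li(k − 1)(1 + C²Si)/(2μi) as reconstructed above; note that with C²Si = 1 (exponential service) the update collapses to exact MVA for product-form networks, which provides a check:

```python
# Approximate MVA for a closed single-server network (a sketch).
def approximate_mva(mu, v, C2S, C):
    N = len(mu)
    L = [0.0] * N                 # L_i(0) = 0
    lam = 0.0
    for k in range(1, C + 1):
        W = [1.0 / mu[i] + L[i] * (1.0 + C2S[i]) / (2.0 * mu[i])
             for i in range(N)]
        lam = k / sum(v[i] * W[i] for i in range(N))
        L = [lam * v[i] * W[i] for i in range(N)]   # Little's law per node
    return lam, L

# Balanced two-node cyclic network, exponential service, C = 2 customers;
# exact MVA gives throughput 2/3 and one customer at each node.
lam, L = approximate_mva(mu=[1.0, 1.0], v=[1.0, 1.0], C2S=[1.0, 1.0], C=2)
print(lam, L)
```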
Problem 68
Consider a manufacturing system with five machines numbered 1, 2, 3, 4,
and 5. The machines are in three stages as depicted in Figure 7.8. Four types
of products are produced in the system, 32% are processed on machines 1,
2, and 4; 8% on machines 1, 2, and 5; 30% on machines 1, 3, and 4; and
FIGURE 7.8
Single-server closed queueing network of machines 1 through 5 with the routing probabilities indicated.
TABLE 7.4
Mean and Standard Deviation of Service Times (min)

Machine i                        1   2   3   4   5
Mean 1/μi                        3   4   5   6   2
Standard deviation √(C²Si/μ²i)   6   2   1   3   2
and the MVA (assuming C is small). From the problem description we can
directly obtain P ordered {1, 2, 3, 4, 5} as
P =
[ 0  0.4  0.6  0    0
  0  0    0    0.8  0.2
  0  0    0    0.5  0.5
  1  0    0    0    0
  1  0    0    0    0 ]
Next we obtain the squared coefficients of variation for the effective inter-
arrival times. For that, we use the squared coefficients of variation of the
service times, which we can easily compute from Table 7.4 as

[C²S1 C²S2 C²S3 C²S4 C²S5] = [4.0000 0.2500 0.0400 0.2500 1.0000].

Thereby we use Equation 7.13 for all j to derive C²a,j using the following steps:
First obtain the row vector ψ = [ψj] for j ∈ {1, 2, 3, 4, 5} as

ψj = Σ_i ai pij(1 − pij) + Σ_i ai ρ²i C²Si p²ij

so that Equation 7.13 reduces to the linear system
C²a,j = ψj/aj + Σ_i (ai p²ij/aj)(1 − ρ²i)C²a,i. Using the preceding computation,
we can derive

[ψj/aj] = [0.1709 1.6406 1.9609 0.3706 0.5754]
and, solving the linear system,

[C²a,j] = [0.5056 1.7113 2.0669 1.1213 0.9194].
Now, using the values of ρj , C2a,j , and C2Sj for j = 1, 2, 3, 5 in Equation 7.14,
we can obtain the row vector of the mean number in each node in steady
state as L1 = 8.3764, L2 = 0.7484, L3 = 4.3464, and L5 = 0.2546. We can
obtain L4 = C − L1 − L2 − L3 − L5 = 16.2742.
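The bottleneck-approximation numbers above can be reproduced; a sketch follows (using numpy). Note that the population size C is not restated in this excerpt, so C = 30 is inferred from the reported L4 = C − L1 − L2 − L3 − L5 = 16.2742:

```python
import numpy as np

# Reproducing Problem 68's bottleneck approximation.
P = np.array([[0, .4, .6, 0, 0],
              [0, 0, 0, .8, .2],
              [0, 0, 0, .5, .5],
              [1., 0, 0, 0, 0],
              [1., 0, 0, 0, 0]])
mean_service = np.array([3.0, 4.0, 5.0, 6.0, 2.0])
mu = 1.0 / mean_service
C2S = np.array([6.0, 2, 1, 3, 2]) ** 2 / mean_service ** 2
C = 30                                      # inferred population size

v = np.array([1.0, 0.4, 0.6, 0.62, 0.38])   # visit ratios solving v = vP, v1 = 1
b = int(np.argmax(v / mu))                  # bottleneck node (machine 4, index 3)
lam = mu[b] / v[b]
a = lam * v                                 # arrival rates, so that rho[b] = 1
rho = a / mu
C2a = np.ones(5)
for _ in range(200):                        # fixed-point iteration of (7.13)
    bracket = (1 - rho**2) * C2a + rho**2 * C2S
    C2a = ((a[:, None] * P) * (1 - P + P * bracket[:, None])).sum(axis=0) / a
nb = np.arange(5) != b                      # Equation 7.14 at non-bottleneck nodes
L = np.zeros(5)
L[nb] = rho[nb] + rho[nb]**2 * (C2a[nb] + C2S[nb]) / (2 * (1 - rho[nb]))
L[b] = C - L[nb].sum()
print(np.round(L, 4))   # approx. [ 8.3764  0.7484  4.3464 16.2742  0.2546]
```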
is for two reasons. The first reason is to be able to derive expressions for
the second moment of the interdeparture time from a queue (which is why
Section 7.1 also used FCFS). The second reason is that it would enable us to
analyze each queue as an aggregated single-class queue. As described in Section
5.2.1 for multiclass M/G/1 queues, here too we can aggregate customers of
all classes in a node in a similar fashion. Further, as we saw in Section 7.1.4,
the approximations for general closed queueing networks were either rather
naive or available only for networks with single-server queues. Therefore,
we restrict our attention to only open queueing networks. With these intro-
ductory remarks, we proceed to analyze multiclass and multiserver open
queueing networks with FCFS discipline.
1. There are N service stations (or nodes) in the open queueing net-
work. The outside world is denoted by node 0 and the other nodes
are 1, 2, . . . , N. It is critical to point out that node 0 is used purely for
notational convenience and we are not going to model it as a “node”
for the purposes of analysis.
2. There are mi servers at node i (such that 1 ≤ mi ≤ ∞), for all i
satisfying 1 ≤ i ≤ N.
3. The network has multiple classes of traffic and class switching is not
allowed. Let R be the total number of classes in the entire network
and each class has its unique external arrival process, service times
at each node, as well as routing probabilities. They are explained
next.
4. Externally, customers of class r (such that r ∈ {1, . . . , R}) arrive at
node i according to a renewal process such that the interarrival time
has a mean 1/λ0i,r and a squared coefficient of variation (SCOV) of
C20i,r . All arrival processes are independent of each other, the service
times and the class.
Σ_{j=0}^{N} pij,r = 1
The preceding notations are summarized in Table 7.5 for easy reference.
Our objective is to develop steady-state performance measures for such a
TABLE 7.5
Parameters Needed as Input for Multiserver and Multiclass Open Queueing
Network Analysis
N Total number of nodes
R Total number of classes
i Node index with i = 0 corresponding to external world, otherwise i ∈ {1, . . . , N}
j Node index with j = 0 corresponding to external world, otherwise j ∈ {1, . . . , N}
r Class index with r ∈ {1, . . . , R}
pij,r Fraction of traffic of class r that exits node i and join node j
mi Number of servers at node i for i ≥ 1
μi,r Mean service rate of class r customers at node i
C²Si,r SCOV of service time of class r customers at node i
λ0i,r Mean external arrival rate of class r customers at node i
C²0i,r SCOV of external interarrival time of class r customers at node i
Wiq ≈ (αmi/μi) (1/(1 − ρi)) (C²Ai + C²Si)/(2mi), (7.15)

where

αmi = (ρi^mi + ρi)/2   if ρi > 0.7,
αmi = ρi^((mi+1)/2)    if ρi < 0.7.
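A sketch of Equation 7.15 as a function (interpreting μ as the per-server service rate, so that ρ = λ/(mμ)): with m = 1 the expression reduces to ρ/(μ(1 − ρ)) · (C²A + C²S)/2, which is exact for M/M/1, giving a ready sanity check:

```python
# Multiserver waiting-time approximation of Equation 7.15 (a sketch).
def ggm_wq(lam, mu, m, C2A, C2S):
    rho = lam / (m * mu)
    if rho > 0.7:
        alpha = (rho ** m + rho) / 2.0
    else:
        alpha = rho ** ((m + 1) / 2.0)
    return (alpha / mu) / (1.0 - rho) * (C2A + C2S) / (2.0 * m)

# M/M/1 checks against the exact Wq = rho / (mu - lam):
print(ggm_wq(1.0, 2.0, 1, 1.0, 1.0))   # 0.5
print(ggm_wq(0.9, 1.0, 1, 1.0, 1.0))   # 9.0 (approximately)
```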
Problem 69
Consider a single G/G/m queue with interarrival and service times according
to gamma distributions. Compare against simulations the expression for Wq
given by

Wq ≈ (αm/μ) (1/(1 − ρ)) (C²A + C²S)/(2m),

where αm = (ρ^m + ρ)/2 if ρ > 0.7 and αm = ρ^((m+1)/2) if ρ < 0.7.
TABLE 7.6
Comparison of Simulation’s 95% Confidence Interval against Analytical
Approximation for Wq
Experiment m ρ C2A C2S Approximation Simulation
C²Di = 1 + ρ²i (C²Si − 1)/√mi + (1 − ρ²i)(C²Ai − 1)
where ρi = λi/(mi μi). Note that the result is identical to that of a G/G/1 queue
given in Section 7.1 if we let mi = 1. Also, since the departures from M/G/∞
and M/M/mi queues are Poisson, we can verify the formula for those two
queues: using C²Ai = 1 together with mi → ∞ for the first, and C²Ai = 1
together with C²Si = 1 for the second, we can show that C²Di is one in both
cases.
ρi,r = λi,r/(mi μi,r)

since it is nothing but the ratio of the class r arrival rate to the net service
rate. Thus we can aggregate over all classes and obtain the effective traffic
intensity of node i, ρi, as
ρi = Σ_{r=1}^{R} ρi,r.
It is worthwhile to point out that the condition for stability of node i is given
by ρi < 1. We can also immediately obtain the aggregate mean arrival rate
into node i, λi, as the sum of the arrival rates over all classes. Hence

λi = Σ_{r=1}^{R} λi,r.
Also, we can obtain the SCOV of the aggregate arrivals into node i (by
aggregating over all classes) as

C²Ai = (1/λi) Σ_{r=1}^{R} C²Ai,r λi,r.
This result can be derived directly from the SCOV of a flow as a result of
superpositioning described in Section 7.1.2.
Having obtained all the expressions for the input to queue i, next we
obtain the aggregate service parameters and the split output from node i. In
particular, μi , the aggregate mean service rate of node i, can be obtained from
its definition using
μi = [Σ_{r=1}^{R} (λi,r/λi) (1/(mi μi,r))]⁻¹ = λi/ρi.
This result and the next one on the aggregate SCOV of service times across all
classes at a node can be derived using that in Section 5.2.1 for M/G/1 queue
with FCFS service discipline. Thus the aggregate SCOV of service time of
node i, C2Si , is given by
C²Si = −1 + Σ_{r=1}^{R} (λi,r/λi) (μi/(mi μi,r))² (C²Si,r + 1).
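The aggregation formulas for a single node can be sketched as follows (the function name is ours); as a sanity check, identical classes must collapse to the common single-class parameters:

```python
# Class aggregation at one node: aggregate lam_i, rho_i, mu_i = lam_i/rho_i,
# and the aggregate service-time SCOV across R classes.
def aggregate_node(lam_r, mu_r, C2S_r, m):
    lam = sum(lam_r)
    rho = sum(l / (m * u) for l, u in zip(lam_r, mu_r))
    mu = lam / rho                       # aggregate service rate of the node
    C2S = -1.0 + sum((l / lam) * (mu / (m * u)) ** 2 * (c + 1.0)
                     for l, u, c in zip(lam_r, mu_r, C2S_r))
    return lam, rho, mu, C2S

# Two identical classes collapse to the single-class values:
print(aggregate_node([1.0, 1.0], [4.0, 4.0], [1.5, 1.5], m=1))   # (2.0, 0.5, 4.0, 1.5)
```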
Finally, the average flow rate for departures from node i into node j using
splitting of flows can be computed. As defined earlier, λij,r is the mean
departure rate of class r customers from node i that end up in node j. Since
pij,r is the fraction of traffic of class r that depart from node i join node j,
we have
λi,r = λ0i,r + Σ_{j=1}^{N} λj,r pji,r.
In fact, we would have to solve N such equations to obtain λi,r for all i. In the
single-class case recall that we solved that by inverting I − P and multiplying
that by external arrivals. Although something similar can be done here, care
must be taken to ensure that only the set of nodes that class r traffic traverses
is considered (otherwise the generic I − P is not invertible). Further,
using the superposition of flows result in Section 7.1.2, we can derive C2Ai,r ,
the SCOV of class r interarrival times into node i as
C²Ai,r = (1/λi,r) Σ_{j=0}^{N} C²ji,r λj,r pji,r.
To obtain the SCOV of aggregate interarrival times into node i, C2Ai , we once
again use the superposition result in Section 7.1.2 to get
C²Ai = (1/λi) Σ_{r=1}^{R} C²Ai,r λi,r.
We had seen earlier that we can obtain the effective SCOV of the departures
from node i using C2Si , the aggregate SCOV of service time of node i (also
derived earlier), as
C²Di = 1 + ρ²i (C²Si − 1)/√mi + (1 − ρ²i)(C²Ai − 1).
Thus we can get C2ij,r , which is the SCOV of time between two consecutive
class r customers going from node i to node j in steady state as
C²ij,r = 1 + pij,r (C²Di − 1).
This result is directly from the splitting of flows departing a queue described
in Section 7.1.2. With that we have defined all the parameters necessary for
the algorithm to obtain steady-state performance measures for all classes of
customers at all nodes.
TABLE 7.7
Parameters Obtained as Part of the QNA Algorithm
μi Aggregate mean service rate of node i
C²Si Aggregate SCOV of service time of node i
λij,r Mean arrival rate of class r customers from node i to node j
λi,r Mean class r arrival rate to node i (or mean departure rate from node i)
λi Mean aggregate arrival rate to node i
ρi,r Traffic intensity of node i due to customers of class r
ρi Traffic intensity of node i across all classes
C²ij,r SCOV of time between two consecutive class r customers going from node i to node j
C²Ai,r SCOV of class r interarrival times into node i
C²Ai Aggregate SCOV of interarrival times into node i
C²Di Aggregate SCOV of interdeparture times from node i
Li,r Expected number of class r customers in node i in steady state
Wiq Expected time waiting before service in node i in steady state
Likewise, we assume that upon service completion, there is only one stream
that gets split into multiple streams. The following are the three basic steps
in the algorithm.
Step 1: Calculate the mean arrival rates, utilizations, and aggregate service
rate parameters using the following:
λi,r = λ0i,r + Σ_{j=1}^{N} λj,r pji,r
C²Si = −1 + Σ_{r=1}^{R} (λi,r/λi) (μi/(mi μi,r))² (C²Si,r + 1).
r=1
(1) Superposition:

C²Ai,r = (1/λi,r) Σ_{j=0}^{N} C²ji,r λj,r pji,r

C²Ai = (1/λi) Σ_{r=1}^{R} C²Ai,r λi,r

(2) Flow:

C²Di = 1 + ρ²i (C²Si − 1)/√mi + (1 − ρ²i)(C²Ai − 1)

(3) Splitting:

C²ij,r = 1 + pij,r (C²Di − 1).
λi
Note that the splitting formula is exact if the departure process is a renewal
process. However, the superposition and flow formulae are approximations.
Several researchers have provided other expressions for the flow and super-
position. As mentioned earlier, the preceding is QNA, described in Whitt
[103].
Step 3: Obtain performance measures such as mean queue lengths and
mean waiting times by treating each node as an independent G/G/m queue.
Choose α_{m_i} such that
\alpha_{m_i} = \begin{cases} \dfrac{\rho_i^{m_i} + \rho_i}{2} & \text{if } \rho_i > 0.7 \\[6pt] \rho_i^{(m_i+1)/2} & \text{if } \rho_i < 0.7. \end{cases}

Then the mean waiting time for class r customers in the queue (not including
service) of node i is approximately

W_{iq} \approx \frac{\alpha_{m_i}}{\mu_i}\, \frac{1}{1 - \rho_i}\, \frac{C^2_{A_i} + C^2_{S_i}}{2}.
Then the expected number of class r customers in node i in steady state is

L_{i,r} = \frac{\lambda_{i,r}}{\mu_{i,r}} + \lambda_{i,r}\, W_{iq}.
Problem 70
Consider an e-commerce system where there are three stages of servers. In
the first stage there is a single queue with four web servers; in the second
stage there are four application servers, two of which are on the same node
and share a queue; and in the third stage there are three database servers (two
on one node sharing a queue and one on another node). The e-commerce
system caters to two classes of customers but serves them in an FCFS manner
(both across classes and within a class). This e-commerce system at the
server end can be modeled as an N = 6 node and R = 2 class open queueing
network with multiple servers. This multiserver and multiclass open queue-
ing network is described in Figure 7.9. There are two classes of customers
and both classes arrive externally only into node 1. Class-1 customers exit the
system only from node 5 and class-2 customers only from node 6. Assume
that the interarrival times for customers coming externally have a mean 1/3
units of time and standard deviation 2/3 time units for class-1 and a mean 1/6
units of time and standard deviation 1/4 units of time for class-2. The mean
and standard deviation of service times (in the same time units as arrivals)
at each node for each class are described in Table 7.10. Likewise, the routing
probabilities pij,r from node i to j are provided in Table 7.8 for r = 1 (i.e.,
class-1) and Table 7.9 for r = 2 (i.e., class-2). Using this information compute
the steady-state average number of each class of customers at each node.
Solution
Based on the problem description we first cross-check to see that all input
metrics described in Table 7.5 are given. Clearly, we have N = 6 and R = 2.
FIGURE 7.9
Multiserver and multiclass open queueing network.
TABLE 7.8
Class-1 Routing Probabilities [pij,1 ] from Node
on Left to Node on Top
1 2 3 4 5 6
1 0 0.8 0.2 0 0 0
2 0 0 0 0 1 0
3 0 0 0 0 1 0
4 0 0 0 0 0 0
5 0 0 0.6 0 0 0
6 0 0 0 0 0 0
TABLE 7.9
Class-2 Routing Probabilities [pij,2 ] from Node
on Left to Node on Top
1 2 3 4 5 6
1 0 0 0.1 0.9 0 0
2 0 0 0 0 0 0
3 0 0 0 0 0 1
4 0 0 0 0 0 1
5 0 0 0 0 0 0
6 0 0 0.75 0 0 0
TABLE 7.10
Number of Servers, Mean Service Rates, and SCOVs of Service Times for
Each Class

Node i                             1    2     3     4     5     6
No. of servers mi                  4    1     2     1     1     2
Class-1 service rate μi,1          2    2.5   6     N/A   8     N/A
SCOV class-1 service C²Si,1        2    0.64  1.44  N/A   0.81  N/A
Class-2 service rate μi,2          4    N/A   20    6     N/A   15
SCOV class-2 service C²Si,2        4    N/A   2     0.49  N/A   0.64
The routing probabilities (for all i and j) pij,1 and pij,2 are provided in
Tables 7.8 and 7.9, respectively. Also, Table 7.10 lists mi , μi,1 , μi,2 , C2Si,1 , and
C2Si,2 for i = 1, 2, 3, 4, 5, 6. Finally, λ01,1 = 3, C201,1 = 4, λ01,2 = 6, and C201,2 = 2.25,
with all other λ0i,r = 0 and C20i,r = 0. Now, we go through the three steps of
the algorithm.
Step 1: To solve

\lambda_{i,r} = \lambda_{0i,r} + \sum_{j=1}^{N} \lambda_{j,r}\, p_{ji,r}
for all i ∈ [1, . . . , 6] and r = 1, 2, we can simply consider the subset of nodes
that class r customers traverse. Then using the approach followed in Jackson
networks we can get the λi,r values. However, in this example since there is
only one loop in the network, one can obtain λi,r in a rather straightforward
fashion. In particular, we get

[ λ1,1 λ2,1 λ3,1 λ4,1 λ5,1 λ6,1 ] = [ 3 2.4 5.1 0 7.5 0 ]

and

[ λ1,2 λ2,2 λ3,2 λ4,2 λ5,2 λ6,2 ] = [ 6 0 18.6 5.4 0 24 ].

Using these results we can immediately obtain, for all i ∈ {1, . . . , 6},
j ∈ {1, . . . , 6}, and r = 1, 2,

\lambda_{ij,r} = \lambda_{i,r}\, p_{ij,r}

using the preceding λi,r values. Thus the aggregate arrival rate into node i
across all classes, λi, can be obtained by summing over λi,r for r = 1, 2. Hence
we have

[ λ1 λ2 λ3 λ4 λ5 λ6 ] = [ 9 2.4 23.7 5.4 7.5 24 ].
For all i ∈ [1, . . . , 6] and r = 1, 2, we can write down ρi,r = λi,r/(mi μi,r) and
thereby obtain ρi = Σ²r=1 ρi,r as

[ ρ1 ρ2 ρ3 ρ4 ρ5 ρ6 ] = [ 0.75 0.96 0.89 0.9 0.9375 0.8 ].
Clearly, since ρi < 1 for all i ∈ {1, 2, 3, 4, 5, 6}, all queues are stable. Further,
the last computation in step 1 of the algorithm is to obtain the aggregate
service rate and SCOV of service times at node i, which can be computed
using
\mu_i = \frac{\lambda_i}{\rho_i}

C^2_{S_i} = -1 + \sum_{r=1}^{R} \frac{\lambda_{i,r}}{\lambda_i} \left(\frac{\mu_i}{m_i\, \mu_{i,r}}\right)^2 \left(C^2_{S_{i,r}} + 1\right).

Thus we get
[ μ1 μ2 μ3 μ4 μ5 μ6 ] = [ 12 2.5 26.6292 6 8 30 ]
and
[ C²S1 C²S2 C²S3 C²S4 C²S5 C²S6 ] = [ 3.125 0.64 2.6291 0.49 0.81 0.64 ].
For step 2 of the algorithm we initialize all C2ij,r = 1. Then, for all
i ∈ [1, . . . , 6] and r = 1, 2, we obtain
C^2_{A_{i,r}} = \frac{1}{\lambda_{i,r}} \sum_{j=0}^{N} C^2_{ji,r}\, \lambda_{j,r}\, p_{ji,r}

C^2_{A_i} = \frac{1}{\lambda_i} \sum_{r=1}^{R} C^2_{A_{i,r}}\, \lambda_{i,r}

C^2_{D_i} = 1 + \frac{\rho_i^2\,(C^2_{S_i} - 1)}{\sqrt{m_i}} + (1 - \rho_i^2)\,(C^2_{A_i} - 1).

Iterating these until the SCOV values converge, we obtain

[ C²A1 C²A2 C²A3 C²A4 C²A5 C²A6 ] = [ 2.8333 2.1198 1.0449 2.2598 1.5487 1.6753 ].
Finally, in step 3 of the algorithm, we obtain α_{m_i} = (ρ_i^{m_i} + ρ_i)/2 since
ρi > 0.7 for all i. Then using the approximation for Wiq, namely

W_{iq} \approx \frac{\alpha_{m_i}}{\mu_i}\, \frac{1}{1 - \rho_i}\, \frac{C^2_{A_i} + C^2_{S_i}}{2},

we can compute Wiq at every node, and thereby L_{i,r} = \lambda_{i,r}/\mu_{i,r} + \lambda_{i,r} W_{iq},
the steady-state average number of each class of customers at each node.
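The three steps above are mechanical enough to script. The following sketch (our own code, not from the book; variable names are ours, and the flow-balance equations are solved by fixed-point iteration rather than matrix inversion) carries out steps 1 through 3 on the data of Problem 70:

```python
# QNA sketch for Problem 70 (N = 6 nodes, R = 2 classes).
# Service entries marked N/A in Table 7.10 are encoded as None.
N, R = 6, 2
m = [4, 1, 2, 1, 1, 2]
mu = [[2, 2.5, 6, None, 8, None],
      [4, None, 20, 6, None, 15]]
c2s_cls = [[2, 0.64, 1.44, None, 0.81, None],
           [4, None, 2, 0.49, None, 0.64]]
lam0 = [[3, 0, 0, 0, 0, 0], [6, 0, 0, 0, 0, 0]]    # external rates
c20 = [[4, 0, 0, 0, 0, 0], [2.25, 0, 0, 0, 0, 0]]  # external SCOVs
P = [[[0.0] * N for _ in range(N)] for _ in range(R)]
for (i, j, pr) in [(0, 1, 0.8), (0, 2, 0.2), (1, 4, 1), (2, 4, 1), (4, 2, 0.6)]:
    P[0][i][j] = pr                                 # class-1 routing (Table 7.8)
for (i, j, pr) in [(0, 2, 0.1), (0, 3, 0.9), (2, 5, 1), (3, 5, 1), (5, 2, 0.75)]:
    P[1][i][j] = pr                                 # class-2 routing (Table 7.9)

# Step 1: flow balance lam = lam0 + lam P (as a fixed point), utilizations,
# and the aggregate service rate and service-time SCOV at every node.
lam = [row[:] for row in lam0]
for _ in range(300):
    lam = [[lam0[r][i] + sum(lam[r][j] * P[r][j][i] for j in range(N))
            for i in range(N)] for r in range(R)]
lam_i = [sum(lam[r][i] for r in range(R)) for i in range(N)]
rho = [sum(lam[r][i] / (m[i] * mu[r][i]) for r in range(R) if mu[r][i])
       for i in range(N)]
mu_i = [lam_i[i] / rho[i] for i in range(N)]
c2s = [-1 + sum((lam[r][i] / lam_i[i]) * (mu_i[i] / (m[i] * mu[r][i])) ** 2
                * (c2s_cls[r][i] + 1) for r in range(R) if mu[r][i])
       for i in range(N)]

# Step 2: iterate superposition, flow, and splitting until convergence.
c2ij = [[[1.0] * N for _ in range(N)] for _ in range(R)]   # initialize to 1
for _ in range(100):
    c2a_cls = [[0.0] * N for _ in range(R)]
    for r in range(R):
        for i in range(N):
            if lam[r][i] > 1e-12:       # superposition (j = 0 term is external)
                tot = c20[r][i] * lam0[r][i] + sum(
                    c2ij[r][j][i] * lam[r][j] * P[r][j][i] for j in range(N))
                c2a_cls[r][i] = tot / lam[r][i]
    c2a = [sum(c2a_cls[r][i] * lam[r][i] for r in range(R)) / lam_i[i]
           for i in range(N)]
    c2d = [1 + rho[i] ** 2 * (c2s[i] - 1) / m[i] ** 0.5
           + (1 - rho[i] ** 2) * (c2a[i] - 1) for i in range(N)]   # flow
    for r in range(R):                  # splitting
        for i in range(N):
            for j in range(N):
                c2ij[r][i][j] = 1 + P[r][i][j] * (c2d[i] - 1)

# Step 3: G/G/m approximation for waiting times and mean queue lengths.
alpha = [(rho[i] ** m[i] + rho[i]) / 2 if rho[i] > 0.7
         else rho[i] ** ((m[i] + 1) / 2) for i in range(N)]
wq = [alpha[i] / mu_i[i] / (1 - rho[i]) * (c2a[i] + c2s[i]) / 2
      for i in range(N)]
L = [[(lam[r][i] / mu[r][i] if mu[r][i] else 0) + lam[r][i] * wq[i]
      for i in range(N)] for r in range(R)]
```

Running this reproduces the intermediate quantities reported above, for example [ρ1 ... ρ6] = [0.75 0.96 0.89 0.9 0.9375 0.8], C²S1 = 3.125, and C²A2 ≈ 2.1198.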
Before moving ahead with other policies for serving multiclass traffic in
queueing networks, we present a case study. This case study is based on
the article by Chrukuri et al. [20] and illustrates an application of FCFS mul-
ticlass network approximation where many of the conditions required for
the analysis presented earlier are violated. In particular, the system (a) is a
multiclass queueing network with class switching, (b) is a polling system with
a limited service discipline, and (c) has finite-capacity queues with blocking.
However, since these features are not at the bottleneck node, the results are
not terribly affected if we continue to use QNA. Further, the presentation of
this case study is fairly different from the previous case studies in some
ways.
FIGURE 7.10
Components of a Myrinet VIA NIC: the LANai processor and the SRAM (memory)
reside on the NIC, which connects to the host computer or workstation through
an HDMA engine and to the network through Net Send and Net Receive DMA engines.

Among these components are a Net Send DMA (NSDMA) engine to transfer data
from the SRAM onto the network and a Net Receive DMA (NRDMA) engine to
transfer data onto the SRAM from the network. An example of such a VIA NIC
is depicted in Figure 7.10 and its functioning is described next.
The LANai goes through the following operations cyclically: polling the
doorbell queue to know if there is data that needs to be transferred, polling
the descriptor queue on SRAM that associates a doorbell with its data, and
polling the data queue. In addition, it programs NSDMA and NRDMA to
send and receive the data to and from the network, respectively. LANai polls
the doorbell queue and makes them available for HDMA to obtain the cor-
responding descriptors. Polled doorbells wait in a queue at HDMA to get
serviced on an FCFS basis. They are processed by the HDMA and the corre-
sponding descriptors are stored in the descriptor queue on the SRAM. The
descriptors in this queue are polled by LANai and it makes them available
for HDMA to obtain the corresponding data. In the case of a send descrip-
tor, LANai initiates the transfer of data from the host memory on to the data
queue on SRAM using HDMA. In the case of a receive descriptor, LANai
initiates the transfer of data from the network queue at NRDMA to the data
queue on SRAM using the NRDMA. LANai polls the data queue and if the
polled data is of type “send,” it checks whether NSDMA is busy. If not, it
initiates the transfer of send data from SRAM data queue to NSDMA. If the
polled data is of type “receive,” it initiates the transfer of data from SRAM
data queue to host memory using HDMA.
In summary, the operation of a VIA NIC can be modeled as a multi-
class queueing network. An experiment was performed where only the send
messages were considered (without any data received from the network) to
measure interarrival times and service times. Based on this, the send process
is depicted in Figure 7.11. There are three stations in the queueing network
corresponding to the LANai, HDMA, and NSDMA. At the LANai there are
three queues. Entities arrive externally into one of the queues according to
PP(λ), where λ is in units of per microsecond. It takes 22 μs to serve those
FIGURE 7.11
Multiclass queueing network model of an NIC with send: entities arrive
according to PP(λ) at the LANai station (polling, with service times 22, 0.12,
and 10 μs at its three queues), visit the HDMA station (FCFS, with service
times 21 and 68.3 μs on the first and second visits), and exit through the
NSDMA station (blocking, with service time 52.7 μs).
entities. Then the entities go into the HDMA. Although there are two queues
presented in the figure for illustration, there is really only one queue and
entities are served according to FCFS. When an entity arrives for the first
time it takes 21 μs to serve it and the entities go back to the LANai station
where they are served in just 0.12 μs and they return to the HDMA for a sec-
ond time, this time to be served in 68.3 μs. The entities return to the LANai.
The LANai would spend 10 μs serving the entity if the NSDMA is idle, oth-
erwise the LANai would continue polling its other queues. Note that the
LANai knows whether the NSDMA is idle because it also polls the NSDMA,
although that polling time is negligible. Once the entity reaches the idle NSDMA,
it takes 52.7 μs to process and it exits the system. Note that all service times
are deterministic.
In summary, the model of the system is that of a reentrant line. The first
station LANai uses a limited polling policy where it polls each queue, serves
at most one entity, and moves to the next queue. Also, entities in one of the
queues (with 10 μs service time) can begin service only if the NSDMA sta-
tion is idle. Thus the LANai would serve zero entities in that queue if the
NSDMA is busy. Then the second station is HDMA, which uses a pure FCFS
strategy. And the third station, NSDMA has no buffer, so it would get an
entity only if it is idle, in other words, it blocks an entity in the correspond-
ing LANai queue. Therefore, the system is a multiclass queueing network
with reentrant lines (or class switching and deterministic routing). It uses
a polling system with limited service discipline as opposed to FCFS at one
of the nodes. There is a finite-capacity node that blocks one of the queues.
Thus several of the conditions we saw in this section are violated. How-
ever, a quick glance would reveal that the bottleneck station is the HDMA.
In particular, the utilizations of the NSDMA and HDMA are 52.7λ and 89.3λ,
respectively. The utilization of the LANai is trickier to compute because the
LANai could be idling due to being blocked by the NSDMA. Nonetheless,
the fraction of time the LANai would have one or more entities would be
only a little over 32.12λ.
Thus undoubtedly, the HDMA station would be the bottleneck. To ana-
lyze the system considering the significant difference in utilizations at the
1. There are N service stations (or nodes) in the open queueing network
indexed 1, 2, . . . , N.
2. There is one server at node i, for all i satisfying 1 ≤ i ≤ N.
3. The network has multiple classes of traffic and class switching is not
allowed. Let R be the total number of classes in the entire network.
There is a global priority order across the entire network with class-
1 having highest priority and class R having lowest priority. Each
class has its unique external arrival process, service times at each
node, as well as routing probabilities. They are explained next.
4. Externally, customers of class r (such that r ∈ {1, . . . , R}) arrive at
node i according to a Poisson process such that the interarrival time
has a mean 1/λi,r . All arrival processes are independent of each
other, the service times, and the class.
5. Service times of class r customers at node i are IID exponential ran-
dom variables with mean 1/μi,r . They are independent of service
times at other nodes.
6. The service discipline at all nodes is a static and global preemptive
resume priority (with class a having higher priority than class b if
a < b). Within a class the service discipline is FCFS.
7. There is infinite waiting room at each node and stability condition is
satisfied at every node.
8. When a customer of class r completes service at node i, the customer
joins the queue at node j (such that j ∈ {1, . . . , N}) with probability
pij,r . We require that pii,r = 0, although we eventually remark that this
requirement can be relaxed. Since a class r customer may also exit the network
upon completing service at node i, we only require

\sum_{j=1}^{N} p_{ij,r} \leq 1.
Step 1: Calculate the mean effective entering rates ai,r and utilizations ρi,r
using the following for all i ∈ [1, . . . , N] and r ∈ [1, . . . , R]:

a_{i,r} = \lambda_{i,r} + \sum_{j=1}^{N} a_{j,r}\, p_{ji,r}

\rho_{i,r} = \frac{a_{i,r}}{\mu_{i,r}}.
To solve the first set of equations, if we select the subset of nodes that class
r traffic visits, then we can create a traffic matrix P̂r for that subset of nodes.
Then we can obtain the entering rate vector for that subset of nodes (âr ) in
terms of the external arrival rate vector at those nodes (λ̂r ) as âr = λ̂r (I− P̂r )−1 ,
where I is the corresponding identity matrix. Thereby we can obtain ai,r for
all i ∈ [1, . . . , N] and r ∈ [1, . . . , R]. Also, the condition for stability is that
\sum_{r=1}^{R} \rho_{i,r} < 1
for every i.
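The matrix computation âr = λ̂r(I − P̂r)⁻¹ amounts to solving the linear system (I − P̂rᵀ) a = λ̂r. Here is a small self-contained sketch (our own code, with a hand-rolled Gaussian elimination so that no libraries are needed), using as data the class-1 inputs of Problem 71 later in this section (λi,1 = 0.5 at every node and the routing matrix of Table 7.12):

```python
def entering_rates(lam, P):
    """Solve a_i = lam_i + sum_j a_j p_ji, i.e., (I - P^T) a = lam,
    by Gaussian elimination with partial pivoting."""
    n = len(lam)
    # augmented matrix [I - P^T | lam]
    A = [[(1.0 if i == j else 0.0) - P[j][i] for j in range(n)] + [lam[i]]
         for i in range(n)]
    for col in range(n):                       # forward elimination
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= f * A[col][c]
    a = [0.0] * n                              # back substitution
    for i in reversed(range(n)):
        a[i] = (A[i][n] - sum(A[i][j] * a[j]
                              for j in range(i + 1, n))) / A[i][i]
    return a

# Class-1 routing probabilities of Problem 71 (Table 7.12)
P1 = [[0, 0.2, 0.2, 0.2],
      [0.25, 0, 0.25, 0.25],
      [0.25, 0.25, 0, 0.25],
      [0.2, 0.2, 0.2, 0]]
a1 = entering_rates([0.5] * 4, P1)
```

For these inputs a1 works out to [1.5625, 1.5, 1.5, 1.5625], which can be confirmed by substituting back into the flow-balance equations.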
Step 2: Sequentially compute for each node i (from i = 1 to i = N) and each
r (from r = 1 to r = R)
L_{i,r} = \rho_{i,r} + a_{i,r} \sum_{k=1}^{r} \frac{L_{i,k}}{\mu_{i,k}} + L_{i,r} \sum_{k=0}^{r-1} \rho_{i,k}, \qquad (7.16)
where ρi,0 = 0 for all i. Notice that it is important to solve for the r val-
ues sequentially because for the case r = 1 it is possible to derive Li,1 using
Equation 7.16, then for r = 2 to derive Li,2 one needs Li,1 in Equation 7.16,
and so on.
Before illustrating the preceding algorithm using an example, we first
explain the derivation of Equation 7.16 and also make a few remarks. Define
Wi,r as the sojourn time for a class r customer during a single visit to node i.
Of course, due to Little’s law we have for all i ∈ [1, . . . , N] and r ∈ [1, . . . , R]
W_{i,r} = \sum_{k=1}^{r} \frac{L_{i,k}}{\mu_{i,k}} + W_{i,r} \sum_{k=0}^{r-1} \frac{a_{i,k}}{\mu_{i,k}} + \frac{1}{\mu_{i,r}}.
Using Little’s law Li,r = ai,r Wi,r we can rewrite this expression in terms of Li,k
and obtain Equation 7.16. Having explained the algorithm, next we present
a couple of remarks and then illustrate it using an example.
Remark 16
The algorithm is exact for the special cases of N = 1 with any R and R = 1
with any N. That is because for N = 1 and any R it reduces to the single-station
multiclass M/G/1 queue considered in Section 5.2.3. Also, for R = 1 and
any N we get a Jackson network. It may be a worthwhile exercise to check
the results for the preceding two special cases. Further, notice that under
those special cases due to PASTA, arriving customers do see time-averaged
number in the system.
Remark 17
Although we required that pii,r be zero for every i and r, the algorithm can
certainly be used as an approximation even when pii,r > 0. Also, the algo-
rithm can be seamlessly extended to non-preemptive priorities and closed
queueing networks by suitably approximating what arrivals see. The results
can be found in Bolch et al. [12].
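Before the example, the sequential solution of Equation 7.16 at a single node can be sketched in a few lines (our own illustrative code, with a hypothetical function name); rearranging Equation 7.16 to isolate Li,r on the left gives the update used below:

```python
def mva_node(a, mu):
    """Sequentially solve Eq. (7.16) at one node for classes 1..R.
    a[r], mu[r] are the entering rate and service rate of class r+1,
    with class 0 (index 0) the highest priority."""
    rho = [a[r] / mu[r] for r in range(len(a))]
    L = []
    for r in range(len(a)):
        higher = sum(L[k] / mu[k] for k in range(r))   # classes above r
        # Eq. (7.16) rearranged:
        # L_r (1 - rho_r - sum_{k<r} rho_k) = rho_r + a_r sum_{k<r} L_k/mu_k
        L.append((rho[r] + a[r] * higher) / (1 - rho[r] - sum(rho[:r])))
    return L
```

For R = 1 this returns ρ/(1 − ρ), the M/M/1 mean, as Remark 16 below indicates; for a two-class preemptive M/M/1 queue with equal service rates it reproduces the known results L1 = ρ1/(1 − ρ1) and L2 = ρ2/((1 − ρ1)(1 − ρ1 − ρ2)).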
Problem 71
Consider a small Internet service provider that can be modeled as a network
with N = 4 nodes. There are R = 3 classes of traffic. Class-1 traffic essentially
is control traffic that monitors the network states, and it is given highest pri-
ority with an external arrival rate of λi,1 = 0.5 at every node i. Class-2 traffic
arrives at node 1 at rate 3, then gets served at node 2, and exits the net-
work through node 4. Likewise, class-3 traffic arrives into node 1 at rate 2,
then gets served at node 3, and exits the network after being served in node
4. Assume that all nodes have a single server, infinite waiting room, and the
priority order is class-1 (highest) to 2 (medium) to 3 (lowest) at all nodes.
The policy for priority is preemptive resume. Assume that external arrivals
are according to Poisson processes and service times are exponentially dis-
tributed. The service rates (in the same time units as arrivals) at each node
for each class are described in Table 7.11. Likewise, the routing probabilities
pij,r from node i to j are provided in Table 7.12 for r = 1 (i.e., class-1), Table
TABLE 7.11
Mean Service Rates for Each Class at Every Node
Node i 1 2 3 4
Class-1 service rate μi,1 10 8 8 10
Class-2 service rate μi,2 8 5 N/A 6
Class-3 service rate μi,3 7 N/A 4 8
TABLE 7.12
Class-1 Routing Probabilities [pij,1 ] from Node
on Left to Node on Top
1 2 3 4
1 0 0.2 0.2 0.2
2 0.25 0 0.25 0.25
3 0.25 0.25 0 0.25
4 0.2 0.2 0.2 0
TABLE 7.13
Class-2 Routing Probabilities [pij,2 ] from Node
on Left to Node on Top
1 2 3 4
1 0 1 0 0
2 0 0 0 1
3 0 0 0 0
4 0 0 0 0
TABLE 7.14
Class-3 Routing Probabilities [pij,3 ] from Node
on Left to Node on Top
1 2 3 4
1 0 0 1 0
2 0 0 0 0
3 0 0 0 1
4 0 0 0 0
7.13 for r = 2 (i.e., class-2), and Table 7.14 for r = 3 (i.e., class-3). Using
this information compute the steady-state average number of each class of
customers at each node.
Solution
To solve the problem we go through the two steps of the algorithm. For
step 1, note that class-1 customers go through nodes 1, 2, 3, and 4; class-2
customers use nodes 1, 2, and 4; whereas class-3 customers use nodes 1, 3,
a_{i,r} = \lambda_{i,r} + \sum_{j=1}^{N} a_{j,r}\, p_{ji,r}

and obtain

[ a1,1 a2,1 a3,1 a4,1 ] = [ 1.5625 1.5 1.5 1.5625 ],

[ a1,2 a2,2 a3,2 a4,2 ] = [ 3 3 0 3 ],

[ a1,3 a2,3 a3,3 a4,3 ] = [ 2 0 2 2 ].

For all i ∈ [1, . . . , 4] and r = 1, 2, 3, we can write down ρi,r = ai,r/μi,r.
Hence we can verify that the stability condition

\sum_{r=1}^{R} \rho_{i,r} < 1

holds for every i. Then, sequentially applying Equation 7.16 at each node
yields the Li,r values. Upon running simulations with 50 replications, it was
found that the results matched exactly for class-1, that is,
[ L1,1 L2,1 L3,1 L4,1 ] = [ 0.1852 0.2308 0.2308 0.1852 ], since for class-1 the
system is a standard open Jackson network. For classes 2 and 3, since the
results are approximations, the 95% confidence interval for the simulations
yielded
[ L1,2 L2,2 L3,2 L4,2 ] = [ 0.9423 ± 0.0022 3.4073 ± 0.1613 0 1.7050 ± 0.0041 ]
and
[ L1,3 L2,3 L3,3 L4,3 ] = [ 3.1390 ± 0.0178 0 2.3621 ± 0.0108 9.4006 ± 0.0894 ].
Step 1: Calculate the mean effective entering rates ai,r and utilizations ρi,r
using the following for all i ∈ [1, . . . , N] and r ∈ [1, . . . , R]:
a_{i,r} = \lambda_{i,r} + \sum_{j=1}^{N} a_{j,r}\, p_{ji,r}

\rho_{i,r} = \frac{a_{i,r}}{\mu_{i,r}}.
To solve the first set of equations, if we select the subset of nodes that class
r traffic visits, then we can create a traffic matrix P̂r for that subset of nodes.
Then we can obtain the entering rate vector for that subset of nodes (âr ) in
terms of the external arrival rate vector at those nodes (λ̂r ) as âr = λ̂r (I −
P̂r )−1 , where I is the corresponding identity matrix. Thereby we can obtain
ai,r for all i ∈ [1, . . . , N] and r ∈ [1, . . . , R]. Also, the condition for stability
is that
\sum_{r=1}^{R} \rho_{i,r} < 1
for every i. Note that this step is identical to step 1 in Section 7.3.1.
Step 2: Sequentially compute for each node i (from i = 1 to i = N) and each
r (from r = 1 to r = R)
L_{i,r} = \rho_{i,r} + a_{i,r} \sum_{k=1}^{r} \frac{a_{i,k}}{2}\left(\sigma^2_{i,k} + \frac{1}{\mu^2_{i,k}}\right) + a_{i,r} \sum_{k=1}^{r} \frac{L_{i,k} - \rho_{i,k}}{\mu_{i,k}} + L_{i,r} \sum_{k=0}^{r-1} \rho_{i,k}, \qquad (7.17)
where ρi,0 = 0 for all i. Note that it is important to solve for the r val-
ues sequentially because for the case r = 1 it is possible to derive Li,1 using
Equation 7.17, then for r = 2 to derive Li,2 one needs Li,1 in Equation 7.17,
and so on.
Before illustrating the preceding algorithm using an example, we first
explain the derivation of Equation 7.17. Recall from Section 7.3.1 that Wi,r is
the sojourn time for a class r customer during a single visit to node i. For all
i ∈ [1, . . . , N] and r ∈ [1, . . . , R] we have

W_{i,r} = \sum_{k=1}^{r} \frac{a_{i,k}}{2}\left(\sigma^2_{i,k} + \frac{1}{\mu^2_{i,k}}\right) + \sum_{k=1}^{r} \frac{L_{i,k} - \rho_{i,k}}{\mu_{i,k}} + W_{i,r} \sum_{k=0}^{r-1} \rho_{i,k} + \frac{1}{\mu_{i,r}}.

Using Little's law L_{i,r} = a_{i,r} W_{i,r} we can rewrite this expression in terms of Li,k
and obtain Equation 7.17.
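As a quick consistency check on Equation 7.17 (a sketch in our own notation, not the book's code): applying the recursion at a single node with a single class must recover the Pollaczek-Khinchine mean L = ρ + a²E[S²]/(2(1 − ρ)) of the M/G/1 queue, where E[S²] = σ² + 1/μ².

```python
def mva_node_general(a, mu, sigma):
    """Sequentially solve Eq. (7.17) at one node with general service times
    (rates mu[r], standard deviations sigma[r]) under preemptive resume."""
    R = len(a)
    rho = [a[r] / mu[r] for r in range(R)]
    L = []
    for r in range(R):
        resid = sum(a[k] / 2 * (sigma[k] ** 2 + 1 / mu[k] ** 2)
                    for k in range(r + 1))                      # k = 1..r
        lower = sum((L[k] - rho[k]) / mu[k] for k in range(r))  # k = 1..r-1
        # Eq. (7.17) with the k = r term of the middle sum moved left:
        # L_r (1 - rho_r - sum_{k<r} rho_k)
        #     = rho_r - rho_r^2 + a_r (resid + lower)
        L.append((rho[r] - rho[r] ** 2 + a[r] * (resid + lower))
                 / (1 - rho[r] - sum(rho[:r])))
    return L

# Single class, exponential service (sigma = 1/mu): M/M/1, L = rho/(1 - rho)
L_mm1 = mva_node_general([0.5], [1.0], [1.0])
# Single class, deterministic service (sigma = 0): M/D/1, P-K formula
L_md1 = mva_node_general([0.5], [1.0], [0.0])
```

With ρ = 0.5 the exponential case gives L = 1.0 and the deterministic case gives L = 0.75, exactly the M/M/1 and Pollaczek-Khinchine values.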
Problem 72
Consider an emergency ward of a hospital with four stations: reception
where all arriving patients check in; triage where a nurse takes health-
related measurements; lab area where blood work, x-rays, etc. are done;
and an operating room. At each of the four stations, only one patient
can be served at any time. Thus the system can be modeled as a single-
server queueing network with N = 4 nodes. There are R = 2 classes of
patients: class-1 corresponds to critical cases (and hence given preemp-
tive priority) and class-2 corresponds to stable cases (hence lower priority).
Both classes of patients arrive according to a Poisson process straight to
node 1, that is, the reception. Note that node 2 is triage, node 3 is lab,
and node 4 is the operating room. External arrival rates for class-1 and class-2
patients are 0.001 and 0.08, respectively. The service rates (in the same time
units as arrivals) and standard deviation of service time at each node for
each class are described in Table 7.15. Likewise, the routing probabilities
pij,r from node i to j are provided in Table 7.16 for r = 1 (i.e., class-1)
and Table 7.17 for r = 2 (i.e., class-2). Using this information compute
the steady-state average number of each class of customers in the system
as a whole.
TABLE 7.15
Mean Service Rates and Standard Deviation of Service Times for
Each Class at Every Node
Node i 1 2 3 4
Class-1 service rate μi,1 1 0.2 0.05 0.01
Class-1 service time std. dev. σi,1 1 2.5 10 20
Class-2 service rate μi,2 0.5 0.1 0.025 0.02
Class-2 service time std. dev. σi,2 1 8 10 40
TABLE 7.16
Class-1 Routing Probabilities [pij,1 ] from Node
on Left to Node on Top
1 2 3 4
1 0 1 0 0
2 0 0 0.4 0.3
3 0 0 0 0.5
4 0 0 0 0
TABLE 7.17
Class-2 Routing Probabilities [pij,2 ] from Node
on Left to Node on Top
1 2 3 4
1 0 1 0 0
2 0 0 0.2 0.1
3 0 0 0 0.1
4 0 0 0 0
Solution
To solve the problem we go through the two steps of the algorithm. For
step 1, note that both classes of customers have the possibility of going
through all four nodes. Thus we solve
a_{i,r} = \lambda_{i,r} + \sum_{j=1}^{N} a_{j,r}\, p_{ji,r}

and obtain

[ a1,1 a2,1 a3,1 a4,1 ] = [ 0.001 0.001 0.0004 0.0005 ],

[ a1,2 a2,2 a3,2 a4,2 ] = [ 0.08 0.08 0.016 0.0096 ].

For all i ∈ [1, . . . , 4] and r = 1, 2, we can write down ρi,r = ai,r/μi,r.
Using that we can write down

[ ρ1,1 + ρ1,2  ρ2,1 + ρ2,2  ρ3,1 + ρ3,2  ρ4,1 + ρ4,2 ] = [ 0.161 0.805 0.648 0.53 ].
The stability condition

\sum_{r=1}^{R} \sum_{k} \rho_{i,r,k} < 1

must be met for every i. It is critical to note that this condition may not be
sufficient for stability.
Step 2: At node i let qi (r, k) be the priority given to class r traffic at node i
when it enters it for the kth time. Thus if qi (r, k) < qi (s, n), then class r traffic
entering node i for the kth time is given higher priority than class s traffic that
enters it for the nth time. Sequentially compute for each node i (from i = 1 to
i = N) each r and appropriate k (in the exact priority order qi (r, k))
L_{i,r,k} = \rho_{i,r,k} + a_{i,r,k} \sum_{j=1}^{q_i(r,k)} \frac{L_{i,q_i^{-1}(j)}}{\mu_{i,q_i^{-1}(j)}} + L_{i,r,k} \sum_{j=0}^{q_i(r,k)-1} \rho_{i,q_i^{-1}(j)} \qquad (7.18)

where ρi,0 = 0 for all i and q_i^{-1}(j) is the inverse function of q_i(\cdot,\cdot) such that
q_i^{-1}(j) = (s, n) if q_i(s, n) = j. Note that it is important to solve for the (r, k)
values in a sequence corresponding to the priority order qi (r, k) at each node i.
than one entry (with different service times) of each class into a node, we do
not need the k subscript we used for the MVA-based algorithm. However,
the total number of classes may have increased. We can obtain ai,r and ρi,r
using the QNA in step 1. Now, we switch back to the preemptive resume
priority policy. Let ρi,r̂ be the sum of traffic intensities of all classes strictly
higher priority than r at node i. We consider the following two cases:
• If r is not the lowest priority in node i, then Li,r ≈ ρi,r /(1 − ρi,r̂ ) due
to the state-space collapse assumption.
• If r is the lowest priority in node i, then Li,r ≈ ρi,r + ai,r Wiq /(1 − ρi,r̂ )
since due to state-space collapse, we assume that all the workload
belongs to this lowest class.
Problem 73
Consider a manufacturing system with three single-server workstations A,
B, and C, as described in Figure 7.12. There are three classes of traffic. Class-
1 jobs arrive externally into node A according to PP(λA,1 ) with λA,1 = 5 jobs
per day. They get served in node A, then at node B, and then they come back
to node A for another round of service before exiting the network. Class-2
jobs arrive externally into node B according to PP(λB,2 ) with λB,2 = 4 jobs per
day. After service in node B, with probability 0.75 a class-2 job joins node C
for service and then exits the network, whereas with probability 0.25 some
FIGURE 7.12
Single-server queueing network with local priorities: class-1 jobs enter node A
at rate 5 (service rates 10 and 20 at node A on the first and second visits, and
15 at node B); class-2 jobs enter node B at rate 4 (service rate 24 at node B
and 5 at node C, which they join with probability 0.75, exiting otherwise with
probability 0.25); class-3 jobs enter node C at rate 3 (service rates 10 at node
C, 15 at node A, and 8 at node B).
class-2 jobs exit the network after service in node B. Class-3 jobs arrive exter-
nally into node C according to PP(λC,3 ) with λC,3 = 3 jobs per day. They
get served in node C, then at node A, and then node B before exiting the
network. The service times to process a job at every node is exponentially
distributed. The service rates are described in Figure 7.12 in units of number
of jobs per day. In particular, the server in node A serves class-1 jobs dur-
ing their first visit at rate μA,1,1 = 10 and second visit at rate μA,1,2 = 20,
whereas it serves class-3 jobs at rate μA,3,1 = 15. Likewise, from the figure, at
node B, we have μB,1,1 = 15, μB,2,1 = 24, and μB,3,1 = 8, and at node C, we
have μC,2,1 = 5 and μC,3,1 = 10. The server at each node uses a preemptive
resume priority scheme with priority order determined using the shortest-
expected-processing-time-first rule. Thus each server gives highest priority
to the highest μ·,·,· . For such a system compute the steady-state expected
number of each class of customer at every node.
Solution
We will stick to the notation used throughout this section, although it is
worthwhile to point out that it may be easier to map the eight three-tuples
of (node, class, visit number) to eight single-dimension quantities as done
in most texts and articles. Note that the priority order (highest to lowest)
is (A, 1, 2), (A, 3, 1), and (A, 1, 1) in node A; (B, 2, 1), (B, 1, 1), and (B, 3, 1) in
node B; and (C, 3, 1) and (C, 2, 1) in node C. Since the flows are relatively sim-
ple in this example we can quickly compute the effective entering rates as
aA,1,2 = aA,1,1 = aB,1,1 = 5, aA,3,1 = aB,3,1 = aC,3,1 = 3, aB,2,1 = 4, and aC,2,1 = 3
(due to the Bernoulli splitting only 75% of class-2 reach node C). Since the
utilization (or relative traffic intensities) ρi,r,k can be computed as ai,r,k /μi,r,k ,
we have ρA,1,2 = 0.25, ρA,3,1 = 0.2, ρA,1,1 = 0.5, ρB,2,1 = 1/6, ρB,1,1 = 1/3,
ρB,3,1 = 0.375, ρC,3,1 = 0.3, and ρC,2,1 = 0.6 (they are presented so that the
traffic intensities at the same node are together and within each node they
are presented from the highest to lowest priority). Note that the necessary
condition for stability is satisfied since the effective traffic intensity at nodes
A, B, and C are ρA = 0.95, ρB = 0.875, and ρC = 0.9, respectively. Now, we
proceed using the two different algorithms.
MVA-based algorithm: Note that we have already completed step 1 of the
algorithm. In step 2, we just explain the q_i(r, k) and q_i^{-1}(j) but do not use
them explicitly. For example, for node A, q_A(1, 2) = 1, being the highest pri-
ority at node A. Likewise, q_A(3, 1) = 2 and q_A(1, 1) = 3. Also, for node B,
q_B^{-1}(1) = (2, 1), q_B^{-1}(2) = (1, 1), and q_B^{-1}(3) = (3, 1). Thus using Equation 7.18,
we get

L_{A,1,2} = \frac{\rho_{A,1,2}}{1 - \rho_{A,1,2}} = 0.3333,

L_{A,3,1} = \frac{\rho_{A,3,1} + a_{A,3,1}\, L_{A,1,2}/\mu_{A,1,2}}{1 - \rho_{A,1,2} - \rho_{A,3,1}} = 0.4545,
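The node-by-node computation can be scripted; below is a sketch in our own notation (not from the text), where each node is described by its (entering rate, service rate) pairs listed from highest to lowest priority, and Equation 7.18 is applied down the priority order:

```python
def mva_priority_node(entries):
    """Apply Eq. (7.18) at one node.  entries = [(a, mu), ...] lists the
    (class, visit) pairs served at the node from highest to lowest
    priority; returns the mean number present for each pair."""
    rho, L = [], []
    for a, mu in entries:
        rho.append(a / mu)
        higher = sum(L[j] / entries[j][1] for j in range(len(L)))
        # Rearranged Eq. (7.18):
        # L (1 - sum of rho down to own priority) = rho + a * higher
        L.append((rho[-1] + a * higher) / (1 - sum(rho)))
    return L

# Problem 73 data, in priority order: (A,1,2), (A,3,1), (A,1,1) at node A;
# (B,2,1), (B,1,1), (B,3,1) at node B; and (C,3,1), (C,2,1) at node C.
LA = mva_priority_node([(5, 20), (3, 15), (5, 10)])
LB = mva_priority_node([(4, 24), (5, 15), (3, 8)])
LC = mva_priority_node([(3, 10), (3, 5)])
```

The first two node-A values reproduce L_{A,1,2} = 0.3333 and L_{A,3,1} = 0.4545 computed above; the remaining entries complete the MVA-based part of the computation at all three nodes.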
Reference Notes
One of the key foundations of this chapter is approximations based on mod-
eling nodes in a queueing network using a reflected Brownian motion. We
began this chapter by giving a flavor for how the Brownian motion argu-
ment is made in a single node G/G/1 setting and derived an approximation
for the number in the system. The analysis relies heavily on the excellent
exposition in Chen and Yao [19] as well as Whitt [105]. A terrific resource
for key elements of Brownian motion (that have been left out in this chap-
ter) is Harrison [52]. Subsequently, we extend the single node to an open
network of single-server nodes assuming each queue behaves as if it were
a reflected Brownian motion. Then we provide approximations based on
Bolch et al. [12] for closed queueing networks. The second portion of this
chapter on multiclass and multiserver general open queueing networks is
entirely from Whitt [103]. Finally, the topic of queueing networks with pri-
orities is mainly based on Chen and Yao [19]. It is worth pointing out that
there are several research studies focused mainly on aspects of stability;
they will be dealt with in the next chapter. The semi-martingale
reflected Brownian motion offers the ability to derive approximations for the
performance measures including the state-space collapse.
Exercises
7.1 Consider a stable G/G/1 queue with arrival rate 1 per hour, traf-
fic intensity 0.8, C2a = 1.21, and C2s = 1.69. Obtain Wq using the
reflected Brownian motion approximation in Equation 7.8 and
also using the approximations in Chapter 4. Compare the approx-
imations against simulations. For the simulations use either
gamma or hyperexponential distribution.
7.2 Consider the queueing network of single-server queues shown in
Figure 7.13 under a special case of N = 3 nodes, p = 0.6, and λ = 1
per minute. Note that external arrival is Poisson. Service times at
node i are according to gamma distribution with mean 1/(i + 2)
minutes and SCOV i − 0.5. Compute an approximate expression
for the expected number of customers in each node of the network
in steady state.
7.3 Consider the queueing network of single-server queues shown in
Figure 7.13. Consider a special case of N = 3 nodes, p = 1, λ = 0
(hence a closed queueing network), and C = 50 customers. Service
times at node i are according to gamma distribution with mean
1/(i + 2) minutes and SCOV i − 0.5. Compute an approximate
expression for the expected number of customers in each node of
the network in steady state using both the bottleneck approxima-
tion (for large C) and the MVA approximation (for small C).
7.4 A queueing network with six single-server stations is depicted in
Figure 7.14. Externally, arrivals occur into node A according to a
renewal process with average rate 24 per hour and SCOV 2. The
service time at each station is generally distributed with mean (in
minutes) 2, 3, 2.5, 2.5, 2, and 2, and SCOV 1.5, 2, 0.25, 0.36, 1, and
1.44, respectively, at stations A, B, C, D, E, and F. A percent-
age on any arc (i, j) denotes the probability that a customer after
completing service in node i joins the queue in node j. Compute
the average number of customers in each node of the network in
steady state.
7.5 Consider a seven-node single-server queueing network where
nodes 2 and 4 get arrivals from the outside (at rate 5 per minute
each on an average). Nodes 1 and 2 have service rates of 85,
FIGURE 7.13
Single-server queueing network: external arrivals at rate λ enter the first
node; nodes with service rates μ1, μ2, . . . , μN−1, μN are in series; after the
last node a customer returns to the first node with probability p and exits
the network with probability 1 − p.
FIGURE 7.14
Schematic of six-station network: stations A through F, with the percentage on
each arc (i, j) denoting the probability that a customer completing service at
station i joins station j.
TABLE 7.18
Mean and SCOV of Service Times for Each Class
Node A B C D E F
Class-1 mean service time 0.2 N/A 0.1 0.05 0.22 N/A
SCOV class-1 service time 0.25 N/A 0.64 2 0.75 N/A
Class-2 mean service time N/A 0.15 0.05 0.1 N/A 0.14
SCOV class-2 service time N/A 0.36 0.49 2.25 N/A 0.81
TABLE 7.19
Mean and SCOV of Service Times for Each Class
Node i 1 2 3 4 5 6
Class-1 service time 2 2 3 1 2 4
SCOV class-1 service 1 1.44 1.69 1 0.81 0.49
Class-2 service time 3 1 2 3 4 2
SCOV class-2 service 0.25 1 0.81 0.49 2 0.64
TABLE 7.20
Class-1 Routing Probabilities [pij,1 ] from Node on Left to
Node on Top
1 2 3 4 5 6
1 0 0.1 0.1 0.1 0.1 0.1
2 0.2 0 0.2 0.2 0.2 0.2
3 0.3 0.4 0 0.1 0.1 0.1
4 0.1 0.1 0.1 0 0.2 0.5
5 0.2 0.1 0.5 0.1 0 0.1
6 0.3 0.3 0.2 0.1 0.1 0
TABLE 7.21
Class-2 Routing Probabilities [pij,2 ] from Node on Left to
Node on Top
1 2 3 4 5 6
1 0 0.5 0 0 0 0
2 0 0 1 0 0 0
3 0 0 0 1 0 0
4 0 0 0 0 1 0
5 0 0 0 0 0 1
6 1 0 0 0 0 0
FIGURE 7.15
Rybko–Stolyar–Kumar–Seidman-type network.
exit the network. Class-2 jobs arrive externally into node B accord-
ing to PP(λB ) with λB = 4 jobs per hour. After service in node
B, class-2 jobs get served at node A and then exit the network.
The service rates are described in Figure 7.15 in units of num-
ber of jobs per hour. In particular, the server in node A serves
class-1 jobs at rate μA,1 = 14, whereas it serves class-2 jobs at
rate μA,2 = 16. Likewise, from the figure, at node B, we have
μB,1 = 21 and μB,2 = 9. There is a single server at each node.
The servers use a preemptive resume priority scheme with prior-
ity order determined using shortest expected processing time first
rule. Thus each server gives highest priority to the highest μ·,· in
that node. For such a system compute the steady-state expected
number of each class of customer at every node. Use both MVA
and state-space collapse techniques. Note: The necessary condi-
tions for stability of this network are that λA /μA,1 + λB /μA,2 < 1
and λA /μB,1 + λB /μB,2 < 1. However, the sufficient condition
(assuming class-2 has high priority in node A and class-1 has high
priority in node B) is that λA /μB,1 + λB /μA,2 < 1.
8
Fluid Models for Stability, Approximations,
and Analysis of Time-Varying Queues
In this chapter and in the next two chapters, we will consider the notion
of fluid models or fluid queues. However, there is very little commonality
between what is called fluid queues here and what we will call fluid queues
in the next two chapters. In fact they have evolved in the literature rather
independently, although one could fathom putting them together in a uni-
fied framework. We will leave them in separate chapters in this book with
the understanding that in this chapter we are interested in the fluid limit of
a discrete queueing network whereas in the next two chapters we will directly
consider queueing networks with fluid entities (as opposed to discrete entities)
flowing through them. Another key distinction is that the resulting fluid
network in this chapter is deterministic, lending itself to straightforward ways
of determining stability of queueing networks, developing performance measures
approximately, and studying transient and time-varying queues. In the
next two chapters, we will study stochastic fluid networks.
Deterministic fluid models have been applied to many other systems
besides queueing networks. In pure mathematics the deterministic fluid
models are called hydrodynamic limits and in physics they fall under mean
field theory. The key idea is to study systems using only mean values by scal-
ing metrics appropriately. We begin by considering a single queue with a
single server to explain the deterministic fluid model concept and flesh out
details such as the functional strong law of large numbers, a concept central
to the theory developed. Subsequently, we will use fluid limits in a network
setting to analyze stability, obtain performance metrics approximately, and
finally to study nonstationary queues under transient conditions.
448 Analysis of Queues
λ = lim_{t→∞} A(t)/t   and   μ = lim_{t→∞} S(t)/t.
With that description we are now ready to take the fluid limits of the
discrete system described here. Define An (t) as
An(t) = A(nt)/n
Problem 74
Consider a G/G/1 queue with interarrival times as well as service times
according to Pareto distributions. The coefficient of variation for interarrival
times is 5 and for the service time it is equal to 2. The average arrival rate is
1 per unit time and the average service rate is 1.25 per unit time. The depar-
tures from this queue act as arrivals to a downstream queue. Let A(t) be the
number of entities that arrive at the downstream node during (0, t]. For t = 0
to 10 time units, graph three sample paths of An (t) = A(nt)/n versus t for
n = 1, 10, 100, and 1000.
Solution
It is crucial to note that the A(t) process is the arrivals to the downstream
node which is the same as the departures from the G/G/1 node described in
the question. Also the average arrival rate is λ = 1. By writing a simulation
using the algorithm in Problem 37 in Chapter 4, we can obtain sample paths
of the output process from the G/G/1 queue, in particular the number of
departures during any interval of time. Using this for various values of n = 1,
10, 100, and 1000, we can plot three sample paths of An (t) = A(nt)/n versus t
as shown in Figure 8.1(a)–(d).
From the figure, note that in (a) where n = 1, the sample paths are piece-
wise constant graphs for the number of arrivals until that time. We expect
to get about 10 arrivals during t = 10 time units. Also notice in (a) that the
sample paths are quite variable. Now, when n = 10, as seen in (b), the sample
paths are still piecewise constant, but they are closer together than in case (a).
This trend is more prominent in case (c), where the sample paths have closed
in and the piecewise constant graph has started to look more like a straight
line. Finally, in (d), when n = 1000, which for this example is sufficiently
large, the sample paths merge with one another, and thereby the entire stochastic
process converges in the limit to a deterministic one.
Notice that the numerical example is for small t; the convergence would
only be faster if we used larger t values. Thus we can conclude that An(t)
converges to λt as n grows to infinity. In addition, if we were to choose a
smaller coefficient of variation, we would see much faster convergence.
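Problem 74 specifies Pareto interarrival and service times; as a rough sketch of the same scaling experiment, the code below substitutes exponential times (an M/M/1 queue with λ = 1 and μ = 1.25), which displays the same fluid-scaling behavior. The function names and the Lindley-style recursion are our illustrative choices, not from the text.

```python
import random

def mm1_departure_times(lam, mu, horizon, rng):
    """Departure epochs of an M/M/1 queue over (0, horizon] using the
    recursion D_k = max(A_k, D_{k-1}) + S_k (arrival or previous departure,
    whichever is later, plus the service time)."""
    t_arr, dep_prev, deps = 0.0, 0.0, []
    while True:
        t_arr += rng.expovariate(lam)      # next arrival epoch
        if t_arr > horizon:
            return deps
        dep_prev = max(t_arr, dep_prev) + rng.expovariate(mu)
        deps.append(dep_prev)

def scaled_count(deps, n, t):
    """A_n(t) = A(nt)/n for the departure-counting process A."""
    return sum(1 for d in deps if d <= n * t) / n
```

For n = 1000 and t = 10, `scaled_count` lands close to λt = 10, mirroring the merging of sample paths in Figure 8.1(d); repeating the run with different seeds shows the spread shrinking as n grows.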
FIGURE 8.1
Sample paths of scaled arrival process An (t). (a) n = 1, (b) n = 10, (c) n = 100, and (d) n = 1000.
8.1.2 Functional Strong Law of Large Numbers and the Fluid Limit
One version of the well-known strong law of large numbers (SLLN) is that
if X1 , X2 , . . . are IID random variables with mean m, then Sn = (X1 + X2 + · · · + Xn )/n
converges to m almost surely as n → ∞.
Stability, Fluid Approximations, and Non-stationary Queues 451
FIGURE 8.2
Snapshot of a discrete stochastic queue and a scaled deterministic fluid queue.
Thus we have Xn (0) = x0 /n which in the limit goes to zero. All the results in
this section use 0 ≤ x0 < ∞.
To derive the scaled process we use a reflection mapping argument
where the steps are identical to that in Section 7.1.1 with similar notation
as well. Let B(t) and I(t) denote the total time the server has been busy and
idle, respectively, from time 0 to t. The corresponding fluid limits by def-
inition are Bn (t) = B(nt)/n and In (t) = I(nt)/n. Of course B(t) + I(t) = t and
Bn (t) + In (t) = t. We can apply scaling to Equation 7.1 and obtain
Xn(t) = x0 /n + An(t) − Sn(Bn(t)) = Un(t) + Vn(t),

where

Un(t) = x0 /n + (λ − μ)t + (An(t) − λt) − (Sn(Bn(t)) − μBn(t)),   (8.1)
Vn(t) = μ In(t).
Then, by scaling the expressions for U(t) and V(t) defined in Section 7.1.1,
we know that given Un(t), there exists a unique pair Xn(t) and Vn(t) such that
Xn(t) = Un(t) + Vn(t), which satisfies the following three conditions (obtained
by rewriting conditions (7.2), (7.3), and (7.4) by scaling for any t ≥ 0):
Xn(t) ≥ 0,

dVn(t)/dt ≥ 0 with Vn(0) = 0, and

Xn(t) dVn(t)/dt = 0.
We also showed in Section 7.1.1 that the unique pair Xn(t) and Vn(t) can be
written in terms of Un(t) as
Vn(t) = sup_{0≤s≤t} max(−Un(s), 0),   (8.2)

Xn(t) = Un(t) + sup_{0≤s≤t} max(−Un(s), 0).   (8.3)
But how do we use this fluid limit? We will see that in the remainder of this
chapter especially in the context of stability of networks, approximations,
and analyzing time-varying systems.
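Equations (8.2) and (8.3) are the one-sided reflection (Skorokhod) map applied to the netput Un. On a sampled path, the supremum reduces to tracking a running minimum; the sketch below is our own illustrative code, not from the text.

```python
def reflection_map(u):
    """One-sided reflection map on a sampled path: given u[k] = U(t_k),
    return (v, x) with v[k] = max(0, -min_{j<=k} u[j]) and x[k] = u[k] + v[k],
    matching equations (8.2) and (8.3) on a discrete time grid."""
    v, x, running_min = [], [], 0.0
    for uk in u:
        running_min = min(running_min, uk)  # tracks inf of U over [0, t_k]
        vk = max(-running_min, 0.0)         # pushing process keeping x >= 0
        v.append(vk)
        x.append(uk + vk)
    return v, x
```

As a check, for the fluid limit U(t) = x0 + (λ − μ)t with x0 = 2 and λ − μ = −1 on the grid t = 0, 1, . . . , 5, the input u = [2, 1, 0, −1, −2, −3] yields x = [2, 1, 0, 0, 0, 0], that is, max(x0 + (λ − μ)t, 0), exactly the drained fluid level one expects.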
FIGURE 8.3
Reentrant line example.
Consider a multiclass network depicted in Figure 8.3 (Dai [25]). Jobs enter
node A at average rate λ, after service (mean service time of 1/μ1 ) they go to
node B for a service that takes 1/μ2 time on average. Then they reenter node
A for another round of service (with mean 1/μ3 ) and then go to node B again
for a service that takes an average time of 1/μ4 . They finally visit node A
for a service that takes an average 1/μ5 time before exiting the system. Such
a system is called a queueing network with reentrant lines and is typical in
semiconductor manufacturing facilities. Although there is only a single flow,
for ease of explanation we call the first visit to node A as class-1, first visit to
node B as class-2, second visit to node A as class-3, second visit to node B as
class-4, and final visit to node A as class-5. Notice that the subscripts of the
service rates match the respective classes.
There is a single server at node A that uses a priority order class-5 (high-
est) then class-3 and then class-1 (lowest priority). Likewise, there is a single
server at node B as well that gives higher priority to class-2 than class-4.
However, at all nodes jobs within a class are served FCFS. Also, the priori-
ties are preemptive resume (see Section 5.2.3 for a definition) type. We are not
specifying the probability distributions for the interarrival times or service
times for two reasons: (i) stability can be determined by just knowing their
means; (ii) we do not want to give the impression that the interarrival times
or the service times are IID. Having said that, when we run simulations we
do need to specify distributions and for that reason we would do so in the
examples.
The stability conditions in terms of traffic intensities at nodes A and B are
λ/μ1 + λ/μ3 + λ/μ5 < 1   and   λ/μ2 + λ/μ4 < 1   (8.5)
Problem 75
Consider the reentrant line in Figure 8.3. Let the arrivals be according to a
Poisson process with mean λ = 1. Also all service times are exponentially
distributed with μ1 = 10, μ2 = 2, μ3 = 8, μ4 = 2.5, and μ5 = 1.5. Verify that
conditions in (8.5) are satisfied. Then simulate the system for about 2000 time
units with an initially empty system to obtain the number of jobs in each of
the two nodes A and B over time. Also state whether either server is
underutilized for the duration of the simulation.
Solution
We can immediately verify that conditions in (8.5) are satisfied because
λ/μ1 + λ/μ3 + λ/μ5 = 0.89167 < 1   and   λ/μ2 + λ/μ4 = 0.9 < 1.
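These two node loads can be checked in a couple of lines, together with a third quantity whose significance the rest of this discussion establishes: the combined load of classes 2 and 5, which (as argued later) can never be in service simultaneously. The variable names below are ours.

```python
# Loads for Problem 75: lam = 1, service rates indexed by class
lam = 1.0
mu = {1: 10.0, 2: 2.0, 3: 8.0, 4: 2.5, 5: 1.5}

rho_A = lam/mu[1] + lam/mu[3] + lam/mu[5]  # node A serves classes 1, 3, 5
rho_B = lam/mu[2] + lam/mu[4]              # node B serves classes 2, 4
rho_virtual = lam/mu[2] + lam/mu[5]        # classes 2 and 5 combined
```

Both node loads are below 1 (about 0.8917 and 0.9), yet the combined class-2/class-5 load is about 1.167 > 1, which is the quantity that dooms the system.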
We start with an empty system and simulate arrivals and service accord-
ing to the description given. By keeping track of the number of jobs in each
of the nodes, we plot Figure 8.4(a) and (b). In Figure 8.4a, notice how the
number of jobs in node A rises and then falls, then rises higher and then
crashes to zero with the high queue length periods increasing in size. A sim-
ilar trend can also be observed in Figure 8.4b. However, a curious finding
is the fact that when there are jobs in one node, the other is more or less
empty. In other words, notice that the number in node A and that in node B
are negatively correlated. Further, if we add the number of jobs in nodes A
and B, we can plot the total number of jobs in the entire system and that is
given in Figure 8.5. Although it is true from Figure 8.4a and b that the number
in each queue hits zero often, if one were to add the number of jobs
in node A to that in node B, the total number of jobs shows an increasing
trend (Figure 8.5). We can hence conclude that the system is indeed unstable
because the total number in the entire system shows a rising trend over
time. Also, in terms of the utilization during the course of the simulation, we find
the following. Although the utilization of node A is close to the traffic intensity
of 0.89167, the utilization of node B is only 0.7836, which is significantly
lower than the traffic intensity of 0.9 that we expect to see.
Next we investigate why, in Problem 75, the system becomes unstable
and the nodes do not reach their expected utilizations. Let Xi (t) be the number of class-i jobs in the system
at time t for i = 1, 2, 3, 4, 5. In particular, consider jobs of class-2 and class-5.
Could node A be working on a class-5 job at the same time when node B is
working on a class-2 job? Say that is possible and there is one job of class-2
and one job of class-5 in the system. Since class-2 and class-5 get preemp-
tive priorities at their respective nodes, both must be in service. However,
the service times could not have started simultaneously. So let us say that
FIGURE 8.4
Sample path of number of jobs over time in each node of the reentrant line example. (a) Number
of jobs in A versus t. (b) Number of jobs in node B versus t.
the class-2 job was in the system when the class-5 job entered and began
service (the argument would not change if we went the other way around).
But that is impossible because the class-5 job would have been a class-4 job
that completed service (but the server would be processing class-2 and hence
a class-4 cannot have completed). In other words, before becoming a class-
2 and a class-5 job, they were class-1 and class-4 jobs, respectively, and a
FIGURE 8.5
Total number in the entire system in the reentrant line example.
class-1 job cannot be completed when there is a class-5 job in the system and a
class-4 job cannot be completed when there is a class-2 job in the system.
Hence we make the crucial observation that

X2 (t)X5 (t) = 0

for all t, that is, class-2 and class-5 jobs are never in service at the same
time. In Problem 75, notice that

λ/μ2 + λ/μ5 = 1.167 > 1.
The system cannot spend a fraction λ/μ2 time serving class-2 and another
λ/μ5 fraction of time serving class-5 since both cannot be served simultane-
ously. Thus a crucial condition for stability is
λ/μ2 + λ/μ5 < 1.   (8.6)
Remark 18
The conditions for the network represented in Figure 8.3 (with priority policy
described earlier) to be stable are
λ/μ2 + λ/μ5 < 1,

λ/μ1 + λ/μ3 + λ/μ5 < 1, and

λ/μ2 + λ/μ4 < 1.
At this juncture, a natural question to ask is whether reentrant lines or the
priority policy is needed to observe such virtual stations and have additional
conditions for stability. In the next two examples we will relax one of those
two conditions. First we present the example that is popularly known as
Kumar–Seidman–Rybko–Stolyar network in the literature. Sometimes it is
also referred to as Rybko–Stolyar–Kumar–Seidman network and is depicted
in Figure 8.6. Around the same time Kumar and Seidman as well as Rybko
and Stolyar wrote articles considering the network in Figure 8.6. The only
difference is that Kumar and Seidman considered a deterministic system
whereas Rybko and Stolyar a stochastic system.
Since we are only interested in the average rates, the deterministic and
the stochastic versions are identical from a stability standpoint. Class-1 jobs
enter node A externally at an average rate of λ1 per unit time. They get served
at node A for an average time of 1/μ1 and then go to node B where they are
called class-2. Class-2 jobs get served for an average time of 1/μ2 and exit
the system. Class-3 jobs arrive externally at an average rate of λ3 per unit
time into node B. They require an average processing time of 1/μ3 and upon
completion they go to node A and get served for an average time of 1/μ4 (as
class-4) before exiting the network. There is a single server at node A and a
FIGURE 8.6
Rybko–Stolyar–Kumar–Seidman network.
single server at node B. Notice that this is not a reentrant line. However, like
the previous example, here too we consider a preemptive resume priority
scheme. Class-2 and class-4 jobs are given higher priority at their respective
nodes. This is natural because by giving priority to them, we could purge
jobs out of the system (with the hope that it would reduce the number of
jobs in the system).
The stability conditions in terms of traffic intensities at nodes A and B are
λ1 /μ1 + λ3 /μ4 < 1   and   λ1 /μ2 + λ3 /μ3 < 1   (8.7)
Problem 76
Consider the network in Figure 8.6. Let the arrivals be according to a Poisson
process with mean λ1 = λ3 = 1. Also all service times are exponentially dis-
tributed with μ1 = 5, μ2 = 10/7, μ3 = 4, and μ4 = 4/3. Verify that conditions
in (8.7) are satisfied. Then simulate the system for about 4000 time units with
an initially empty state to obtain the number of jobs in each of the two nodes
A and B over time.
Solution
It is relatively straightforward to verify that conditions in (8.7) are satisfied
since
λ1 /μ1 + λ3 /μ4 = 0.95 < 1   and   λ1 /μ2 + λ3 /μ3 = 0.95 < 1.
The reason for instability is very similar to that we saw in Problem 75,
although here too the traffic intensity conditions (8.7) are satisfied. Let X2 (t)
FIGURE 8.7
Number in each node of Kumar–Seidman–Rybko–Stolyar network example. (a) Number of jobs
in node A versus t. (b) Number of jobs in node B versus t.
and X4 (t) be the number of class-2 and class-4 jobs, respectively, in the sys-
tem at time t. It is impossible for node A to be working on a class-4 job at
the same time when node B is working on a class-2 job. This is because with
respect to the class-2 and class-4 jobs, if they were both being served simultaneously,
the previous event would have been the start of a class-2 job or the start of a
class-4 job. For that, it would be necessary for a class-1 or a class-3 job, respectively,
to have been completed. But that is impossible because the respective
FIGURE 8.8
Number of jobs in the entire Kumar–Seidman–Rybko–Stolyar network.
nodes would be working on the higher priority jobs. In other words, before
becoming a class-2 job, a job would have been a class-1 job that would have
just completed. But for a class-1 job to be complete there could be no class-4
jobs in the system. So there would be a class-2 job in the system only if there
is no class-4 job in the system. Likewise, we can see using an identical argu-
ment that there would be a class-4 job in the system only if there are no
class-2 jobs in the system.
Hence we conclude that
X2 (t)X4 (t) = 0
for all t if we started with an empty system at t = 0. This means that the
system as a whole cannot process a class-2 and a class-4 job simultaneously.
Therefore, if the load brought by class-2 and class-4 jobs is too high, then the
system will not be stable. In Problem 76, notice that
λ1 /μ2 + λ3 /μ4 = 1.45 > 1.
The system cannot spend a fraction λ1 /μ2 time serving class-2 and another
λ3 /μ4 fraction of time serving class-4 since both cannot be served simultane-
ously. Thus a crucial condition for stability is
λ1 /μ2 + λ3 /μ4 < 1.   (8.8)
This condition is as though there exists a virtual station into which class-2
and class-4 flow and that station also needs to have a traffic intensity of less
than 1.
Remark 19
The conditions for the network represented in Figure 8.6 (with priority policy
described earlier) to be stable are
λ1 /μ2 + λ3 /μ4 < 1,

λ1 /μ1 + λ3 /μ4 < 1, and

λ1 /μ2 + λ3 /μ3 < 1.
To illustrate that the network is stable if the conditions in Remark 19
are satisfied, we consider a set of numerical values different from those in
Problem 76. Although λ1 = λ3 = 1, we have μ1 = 2, μ2 = 2.5, μ3 = 20/11,
and μ4 = 20/9. Notice that the conditions in Remark 19 are satisfied. In
particular, similar to the numerical values in Problem 76, here too
λ1 /μ1 + λ3 /μ4 = 0.95   and   λ1 /μ2 + λ3 /μ3 = 0.95.
However,
λ1 /μ2 + λ3 /μ4 = 0.85.
For this set of numerical values we simulate the system and obtain the total
number of customers in this stable system over time in Figure 8.9. By con-
trasting with that of the unstable network in Figure 8.8, notice how the
number in the system does not blow up and keeps hitting zero from time
to time. Thus clearly the standard traffic intensity conditions are only necessary
but not sufficient. Having made a case for that, we present one final
example of an unstable network.
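The three conditions of Remark 19 are easy to check mechanically for any parameter set. The helper below (our own sketch; the function name is assumed, not from the text) evaluates them for both sets of numerical values used above.

```python
def ksrs_stability(lam1, lam3, mu1, mu2, mu3, mu4):
    """Check the three stability conditions of Remark 19 for the
    Kumar-Seidman-Rybko-Stolyar network with preemptive priority
    to class-4 at node A and class-2 at node B."""
    node_a = lam1 / mu1 + lam3 / mu4    # load on the server at node A
    node_b = lam1 / mu2 + lam3 / mu3    # load on the server at node B
    virtual = lam1 / mu2 + lam3 / mu4   # virtual station of classes 2 and 4
    return max(node_a, node_b, virtual) < 1

ksrs_stability(1, 1, 5, 10/7, 4, 4/3)      # -> False: virtual load is 1.45
ksrs_stability(1, 1, 2, 2.5, 20/11, 20/9)  # -> True: all three loads at most 0.95
```

Both parameter sets load each station at 0.95; only the second also keeps the virtual station below 1, matching the stable sample path in Figure 8.9.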
We present as a last example an FCFS network with reentrant lines. This
network is depicted in Figure 8.10 and is identical to the example considered
in Chen and Yao [19]. Chen and Yao [19] describe this network as a Bramson
network since it is a simplification of the network considered by Bramson
[13]. Like the previous examples here too we only describe the network in
terms of the average rates. Also, the deterministic and the stochastic versions
FIGURE 8.9
Number of jobs in a stable Kumar–Seidman–Rybko–Stolyar network.
are identical from a stability standpoint. Class-1 jobs enter node A externally
at an average rate of λ per unit time. They get served at node A for an average
time of 1/μ1 and then go to node B where they are called class-2. Class-2 jobs
get served for an average time of 1/μ2 and go for another round of service
at node B as class-3 jobs. Class-3 jobs take an average 1/μ3 time for service
and convert to class-4 jobs at the end of service. Class-4 jobs are also served
at node B at an average rate of μ4 per unit time. Upon service completion,
class-4 jobs convert to class-5 and get served at node A before exiting the
system. Average class-5 service time is 1/μ5 . There is a single server at node
A and a single server at node B, and each server uses the FCFS discipline
at its node. Notice that this is indeed a reentrant line. However, the
main difference is that the discipline here is FCFS (and not a priority scheme
as in the two previous examples).
FIGURE 8.10
Network with FCFS at all stations.
The stability conditions in terms of traffic intensities at nodes A and B are

λ/μ1 + λ/μ5 < 1   and   λ/μ2 + λ/μ3 + λ/μ4 < 1   (8.9)
Problem 77
Consider the network in Figure 8.10. Let the arrivals be according to a
Poisson process with mean rate λ = 1. Also all service times are exponen-
tially distributed with 1/μ1 = 0.02, 1/μ2 = 0.8, 1/μ3 = 0.05, 1/μ4 = 0.05, and
1/μ5 = 0.88. Verify that conditions in (8.9) are satisfied. Then simulate the
system for about 50,000 time units with an initially empty state to obtain the
number of jobs in each of the two nodes A and B over time.
Solution
It is relatively straightforward to verify that conditions in (8.9) are satisfied
since
λ/μ1 + λ/μ5 = 0.9 < 1   and   λ/μ2 + λ/μ3 + λ/μ4 = 0.9 < 1.
The reason for instability is very similar to that we saw in the previous
two problems, that is, Problems 75 and 76. However, the virtual station
FIGURE 8.11
Sample path for number in each node of FCFS network example. (a) Number of jobs in node A
versus t. (b) Number of jobs in node B versus t.
condition is a lot more subtle and hard to explain. However, here too the
traffic intensity conditions (8.9) are satisfied. But the servers end up idling
for longer than they can afford and then keep trying to catch up. As that
happens, the queues pile up, causing a cascading effect. Nonetheless, it is not easy
to write down explicit sufficient conditions for stability. As one would
expect, for larger networks it would be even more complicated to test for
stability using virtual stations. Hence we use fluid models to analyze
stability, which is the focus of the remainder of this section.
FIGURE 8.12
Total number in the entire system in the FCFS network example.
fluid network we will address only subsequently (and contrast it against the
notion of “weakly stable”). At this time we just say “stable” to not get dis-
tracted by technical details. There are some excellent texts and monographs
that interested readers are encouraged to consider for a fully rigorous treat-
ment of this subject. They include Dai [25], Meyn [82], Chen and Yao [19],
and Bramson [13], to name a few.
We first describe the network setting. It is crucial to realize that the nota-
tion is somewhat different from those in the previous chapters. The setting
as well as converting from a discrete network to a fluid network has been
adapted from Meyn [82]. Consider a network with many single-server nodes
or stations. Henceforth we will use the terms node and station interchange-
ably. There could be one or more queues or buffers at each station (we use the
terms buffers and queues interchangeably). The key difference in the nota-
tion in this section is that the flow, routing, and service are with respect to
the buffers and not the nodes unlike previous chapters. However, as always,
the flow in this network is discrete and stochastic in terms of arrivals and
service. But the routing from buffer to buffer is deterministic. Next we explic-
itly characterize these networks and describe the inputs for our analysis as
follows:
Problem 78
Consider a network with single servers in each node that has buffers as
depicted in Figure 8.13. There are three products that flow in the network.
The buffers have infinite size. One product has a deterministic route of
buffers 1, 2, and 3 before exiting the network; another goes through buffers
4, 5, and then 6; and the last one enters buffer-7, gets served, and exits after
being served at buffer-8. The external arrival rates and the service rates are
provided. Say the servers at each node use a preemptive priority policy giv-
ing highest priority to the shortest expected processing time among all types
of jobs waiting at its node. Assume that μi < μj if i < j. Can this system be
modeled using the network setting described earlier?
FIGURE 8.13
Network with nodes with multiple buffers and deterministic routes.
Solution
The system can be modeled using the network setting described as follows.
The network has N = 3 service stations (or nodes) called node 1, node 2, and
node 3. There is one server at each of the nodes 1, 2, and 3. There are ℓ = 8
buffers in the entire network and at least one in each node. Notice that ℓ > N.
Using the buffer numbers we can see that C11 = C13 = C17 = 1 since node 1 has
buffers 1, 3, and 7. Likewise, we have C22 = C24 = 1 and C35 = C36 = C38 = 1
for the same reason. Thus we have the C matrix as
C = [ 1 0 1 0 0 0 1 0
      0 1 0 1 0 0 0 0
      0 0 0 0 1 1 0 1 ].
The service rates at the buffers are specified in Figure 8.13. Also, the service
discipline is described in the problem statement (although we would not use
that here, we just verify that it is nonidling). There is infinite waiting room
at each buffer.
External arrival rate of customers into buffers 1, 2, 3, 4, 5, 6, 7, and 8 are
λ1 , 0, 0, λ4 , 0, 0, λ7 , and 0, respectively. The routing matrix from buffer to
buffer is given by
R = [ 0 1 0 0 0 0 0 0
      0 0 1 0 0 0 0 0
      0 0 0 0 0 0 0 0
      0 0 0 0 1 0 0 0
      0 0 0 0 0 1 0 0
      0 0 0 0 0 0 0 0
      0 0 0 0 0 0 0 1
      0 0 0 0 0 0 0 0 ].
network with deterministic routing and reentrant lines. Another way to con-
sider this network is that there is a resource constraint that forces each server
to work on multiple buffers. For example, each queue could be correspond-
ing to a buffer of a machine and each node an operator. So each operator is
responsible for a set of machines and the operator switches between jobs on
all the machines he or she is assigned to work on. Thus the whole node can
be thought of as either a machine or a resource.
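To make the setting concrete, the incidence matrix C and routing matrix R of Problem 78 can be encoded directly, and the effective arrival rate into each buffer, that is, the solution a of the flow balance a = λ + aR used later in the chapter, computed by iteration. This is our own sketch; the external rates chosen below are illustrative placeholders, not values from the text.

```python
# Node-buffer incidence matrix C (rows: nodes 1-3; columns: buffers 1-8)
C = [[1, 0, 1, 0, 0, 0, 1, 0],
     [0, 1, 0, 1, 0, 0, 0, 0],
     [0, 0, 0, 0, 1, 1, 0, 1]]

# Deterministic buffer-to-buffer routes: 1->2->3 (exit), 4->5->6 (exit), 7->8 (exit)
R = [[0] * 8 for _ in range(8)]
for i, j in [(1, 2), (2, 3), (4, 5), (5, 6), (7, 8)]:
    R[i - 1][j - 1] = 1

def effective_rates(lam, R):
    """Solve the flow balance a = lam + a R by repeated substitution;
    since the routes are acyclic, the iteration reaches a fixed point
    after at most len(lam) steps."""
    n = len(lam)
    a = lam[:]
    for _ in range(n):
        a = [lam[j] + sum(a[i] * R[i][j] for i in range(n)) for j in range(n)]
    return a

# Hypothetical external rates into buffers 1, 4, and 7 (others zero)
lam = [1.0, 0, 0, 1.5, 0, 0, 2.0, 0]
effective_rates(lam, R)   # -> [1.0, 1.0, 1.0, 1.5, 1.5, 1.5, 2.0, 2.0]
```

Each product's route carries its external rate unchanged through every buffer it visits, as one expects for deterministic routing with no merging flows.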
With that motivation, the next question to ask is how we convert
such a discrete and stochastic network into a fluid and deterministic network
by scaling it appropriately. It turns out that the procedure is relatively
straightforward: we decompose the network into individual buffers
and replace the discrete arrivals by fluids, with valves for emptying the buffers.
Thus, for the fluid model, all we need to state is the fluid entering rate and
emptying rate for every buffer at all times. That would specify our fluid
model. We explain next what a deterministic fluid model of a stochastic
discrete network looks like. For that we first consider a small example and
subsequently generalize it to any network. Recall the Rybko–
Stolyar–Kumar–Seidman network in Figure 8.6. Priority is given to buffer-4
at node A and buffer-2 at node B. The fluid model of that network would
be constructed in the following manner. Fluid would arrive at constant
rates λ1 and λ3 continuously into buffers 1 and 3, respectively. If buffer-2
is nonempty, then node B would drain it at rate μ2 . Notice that if buffer-2
is empty, that does not mean it is not getting any inputs; it is just that the
input rate is smaller than μ2 . So if buffer-2 is empty, then whatever capacity
buffer-2 is not using will be used to drain buffer-3. Likewise, at node A,
if buffer-4 is nonempty then all of the node’s capacity will be used to drain
buffer-4. However, if buffer-4 is empty, then the node will offer just the nec-
essary amount of capacity to buffer-4 to ensure it continues to be empty, and
the remaining capacity to drain out buffer-1.
We formalize that mathematically. Let ζj (t) be the processing capacity
allocated to buffer j for j = 1, 2, 3, 4 for the network in Figure 8.6 (we will
subsequently define ζj (t) more precisely for a generic network). For exam-
ple, if at time t node A is draining a nonempty buffer-4, then ζ4 (t) = 1
and ζ1 (t) = 0. However, at time t if buffer-4 is empty but it gets arrivals at
rate a4 and buffer-1 is nonempty, then ζ4 (t) = a4 /μ4 and ζ1 (t) = 1 − a4 /μ4
(we need the condition a4 /μ4 < 1 for buffer-4 to be empty). Finally if both
buffers 1 and 4 are empty at time t and arrival rates into them are λ1 and
a4 , respectively, then ζ4 (t) = a4 /μ4 and ζ1 (t) = λ1 /μ1 (we need the condition
λ1 /μ1 + a4 /μ4 < 1 for both buffers 1 and 4 to be empty). In a similar fash-
ion, one could consider node B and describe ζ2 (t) and ζ3 (t). With that one
could decompose the network into individual buffers and write down the
arrival as well as emptying rates for each buffer at time t, as described in
Table 8.1.
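The allocation rules just described can be turned into a small deterministic simulation by Euler integration. The sketch below is our own rough implementation (the function name, time step, sequential update of the two nodes, and the one-step lag in the node coupling are all our choices); it reproduces the qualitative behavior seen earlier: the Problem 76 rates leave a growing fluid level, while the stable rates from the Remark 19 illustration drain it.

```python
def ksrs_fluid(x0, lam1, lam3, mu, T, dt=0.001):
    """Euler integration of the KSRS fluid model with preemptive
    priority to buffer-4 at node A and buffer-2 at node B."""
    m1, m2, m3, m4 = mu
    x1, x2, x3, x4 = (float(v) for v in x0)
    z3 = 0.0  # node B's allocation to buffer-3 from the previous step
    for _ in range(int(T / dt)):
        # Node A: buffer-4 first; a4 is its inflow rate mu3 * zeta3
        a4 = m3 * z3
        if x4 > 0:
            z4, z1 = 1.0, 0.0
        else:
            z4 = min(a4 / m4, 1.0)
            z1 = (1.0 - z4) if x1 > 0 else min(lam1 / m1, 1.0 - z4)
        # Node B: buffer-2 first; a2 is its inflow rate mu1 * zeta1
        a2 = m1 * z1
        if x2 > 0:
            z2, z3 = 1.0, 0.0
        else:
            z2 = min(a2 / m2, 1.0)
            z3 = (1.0 - z2) if x3 > 0 else min(lam3 / m3, 1.0 - z2)
        # Fill/drain each buffer per the rates in Table 8.1, clamping at zero
        x1 = max(x1 + (lam1 - m1 * z1) * dt, 0.0)
        x2 = max(x2 + (m1 * z1 - m2 * z2) * dt, 0.0)
        x3 = max(x3 + (lam3 - m3 * z3) * dt, 0.0)
        x4 = max(x4 + (m3 * z3 - m4 * z4) * dt, 0.0)
    return x1 + x2 + x3 + x4
```

Starting from one unit of fluid in each buffer, the rates (5, 10/7, 4, 4/3) of Problem 76 leave a large and growing total at T = 50, whereas the rates (2, 2.5, 20/11, 20/9) satisfying Remark 19 empty the network.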
TABLE 8.1
Arrival and Drainage Rates in Fluid-Scaled Network
Buffer   Arrival Rate at Time t   Draining Rate at Time t
1        λ1                       μ1 ζ1 (t)
2        μ1 ζ1 (t)                μ2 ζ2 (t)
3        λ3                       μ3 ζ3 (t)
4        μ3 ζ3 (t)                μ4 ζ4 (t)

Therefore, notice that it is relatively straightforward to convert a discrete
stochastic network into a fluid deterministic one. Now we formalize
that for a generic discrete stochastic network with N nodes, ℓ buffers, node-
buffer incidence matrix C, and buffer-to-buffer routing matrix R. Also, for all
j ∈ {1, . . . , ℓ}, λj is the external average arrival rate into buffer j and 1/μj is the
average service time for a job in buffer j. This can be converted into a fluid
deterministic network and decomposed into individual buffers so that all we
need to specify is the input and drainage rate of each buffer. To explain the
conversion process, we use some extra notation to keep the presentation less
cumbersome. Let Ji be the set of buffers in node i, that is, Ji = {j : Cij = 1} for
all i ∈ {1, . . . , N}. In Figure 8.13, for example, J1 = {1, 3, 7}, J2 = {2, 4}, and
J3 = {5, 6, 8}. Likewise, let s(j) be the node where buffer j resides. Again,
in the example in Figure 8.13, s(3) = 1 since buffer-3 is in node 1, and
s(8) = 3 since buffer-8 is in node 3. This gives us a mapping between buffers
and nodes.
For a given buffer j such that j ∈ {1, . . . , ℓ}, let zj(t) be the cumulative time
allocated by node s(j) to process buffer j in time (0, t]. For all t ≥ 0, let ζj(t) be
the right derivative of zj(t), written as

ζj(t) = d⁺zj(t)/dt.

Since a node cannot allocate more than its full capacity, we require

Σ_{j ∈ Ji} ζj(t) ≤ 1

for all i ∈ {1, . . . , N}. This inequality would be an equality if at least one of
the buffers in the set Ji of node i is nonempty.
Stability, Fluid Approximations, and Non-stationary Queues 473
Xj(nt)/n → xj(t) as n → ∞.
This is what we meant in the previous paragraph: not only do the arrival
and service processes converge to their fluid limits, but so does the number in
each buffer. Notice that xj(t) is a deterministic quantity. In some articles xj(t)
is also written as X̄j(t) to specifically denote the fluid limit. Next we define
various “degrees” of stability for the discrete as well as the fluid network in
terms of Xj(t) and xj(t) for all j ∈ {1, . . . , ℓ}.
For the discrete stochastic network, we define two “degrees” of stability
as follows:
• Stable: A discrete stochastic network is called stable if Σj Xj(t) < ∞
for all t, especially as t → ∞. For that one typically shows that the
stochastic process {X(t), t ≥ 0} is positive Harris recurrent, where
X(t) = (X1(t), . . . , Xℓ(t)).
• Rate stable: A discrete stochastic network is called rate-stable if for
every buffer j, the steady-state departure rate equals the steady-
state “effective” arrival rate obtained by solving the flow balance.
To mathematically state that, let Dj (t) be the number of jobs that
depart buffer j in time (0, t]. Also let a = [a1 . . . aℓ] be a row vector
The reason we presented the degrees of stability for the discrete and fluid
networks in a “corresponding” fashion is that as the title of this section states,
the discrete network is stable if the fluid network is stable in a corresponding
manner. We formalize this in the next remark.
Remark 20
If the fluid deterministic network is weakly stable, then the discrete stochastic
network is rate stable. Also, if the fluid deterministic network is stable, then
the discrete stochastic network is positive Harris recurrent (hence stable).
To explain this remark as well as stability notions, let us consider the sim-
plest example of a single buffer on a single node, as done in Section 8.1. The
deterministic fluid model has an inflow rate λ and an orifice capacity μ. If
λ < μ, no matter how much fluid there was in the system initially, as long
as it was finite, the buffer would empty in a finite time. Therefore, the fluid
model is stable if λ < μ. Remark 20 states that if the fluid model is stable then
the original discrete queue is stable. This can be easily verified because we
know that the discrete stochastic system is stable (or positive Harris recur-
rent) if λ < μ. Now if λ = μ, the fluid queue would remain at the initial level
at all times. Thus if there is a nonzero initial fluid level, then the time to
empty is infinite. But if the initial fluid level is zero, then it would remain
zero throughout. Hence when λ = μ, the queue is only weakly stable but not
stable. Thus when λ = μ the discrete queue is only rate stable. Of course if
λ > μ the fluid queue is unstable and so is the discrete queue.
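The single-buffer fluid argument above reduces to a two-line computation (a sketch of ours with illustrative rates; not code from the book):

```python
def fluid_level(x0, lam, mu, t):
    """Fluid level at time t: inflow rate lam, drain capacity mu, initial level x0.
    The level changes at net rate lam - mu until it hits zero, then stays at zero."""
    return max(0.0, x0 + (lam - mu) * t)

def time_to_empty(x0, lam, mu):
    """Finite iff lam < mu (stable fluid model); infinite when lam >= mu and x0 > 0."""
    return x0 / (mu - lam) if lam < mu else float("inf")

print(time_to_empty(10.0, 1.0, 1.25))   # stable case: empties at t = 40
print(time_to_empty(10.0, 1.0, 1.0))    # lam = mu: weakly stable only -> inf
```

The λ = μ branch returning infinity is exactly the weakly-stable-but-not-stable case discussed above.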
Remark 21
The corresponding fluid networks in Problems 75, 76, and 77 are all weakly
stable since if we started with an empty system they would remain empty.
But they are not stable since if we have a finite nonzero amount of fluid in
the buffer initially, then the time to empty becomes infinite. As evident from
the simulations, the discrete stochastic networks are not positive Harris
recurrent, but they are all rate stable.
Usually, if these necessary conditions are satisfied, then the fluid model is
at least weakly stable with allocation rates ζj(t) = ρj at buffer j for all t. But
those conditions may not be sufficient to ensure stability (beyond weak stability).
We will address the sufficient conditions later, but first we explain the
necessary conditions with an example problem.
Problem 79
Consider the networks in Figures 8.3, 8.6, and 8.10. For all three networks,
derive the necessary conditions for stability that would result in the fluid
models being at least weakly stable, if not stable.
Solution
For each of the Figures 8.3, 8.6, and 8.10 using their respective R and C, as
well as λj and μj values for each buffer j, we derive the conditions for the
fluid models to be weakly stable in the following manner.
        ⎡ 0 1 0 0 0 ⎤
        ⎢ 0 0 1 0 0 ⎥
    R = ⎢ 0 0 0 1 0 ⎥
        ⎢ 0 0 0 0 1 ⎥
        ⎣ 0 0 0 0 0 ⎦

and

    C = ⎡ 1 0 1 0 1 ⎤
        ⎣ 0 1 0 1 0 ⎦ .
λ/μ1 + λ/μ3 + λ/μ5 < 1 and λ/μ2 + λ/μ4 < 1.
The effective arrival rate vector is a = λ(I − R)^{−1} = [λ1 λ1 λ3 λ3]. Thus
we can obtain ρ = [λ1/μ1 λ1/μ2 λ3/μ3 λ3/μ4]. The conditions for the fluid model
to be at least weakly stable are Cρᵀ < ê, which results in

λ1/μ1 + λ3/μ4 < 1 and λ1/μ2 + λ3/μ3 < 1.
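This computation mechanizes easily; the sketch below redoes it with numpy (the numeric rates are our own choices — only the structure, routes 1→2 and 3→4 with node A serving buffers 1 and 4 and node B serving buffers 2 and 3, follows the text):

```python
import numpy as np

R = np.array([[0, 1, 0, 0],      # buffer 1 feeds buffer 2
              [0, 0, 0, 0],
              [0, 0, 0, 1],      # buffer 3 feeds buffer 4
              [0, 0, 0, 0]], dtype=float)
C = np.array([[1, 0, 0, 1],      # node A serves buffers 1 and 4
              [0, 1, 1, 0]])     # node B serves buffers 2 and 3
lam = np.array([1.0, 0.0, 1.0, 0.0])    # external arrivals to buffers 1 and 3 only
mu = np.array([4.0, 3.0, 4.0, 3.0])

a = lam @ np.linalg.inv(np.eye(4) - R)  # effective rates: [lam1, lam1, lam3, lam3]
rho = a / mu
weakly_stable = bool(np.all(C @ rho < 1))   # necessary conditions C rho^T < e
print(a, np.round(rho, 4), weakly_stable)
```

For these rates, Cρᵀ = [0.5833, 0.5833], so both necessary conditions hold.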
For example, if we started with an initial fluid level of 1 in buffer-1 and zero in all other buffers, then the
fluid system would never empty if λ1/μ2 + λ3/μ4 > 1. In fact we will
show subsequently that the sufficient condition to ensure stability is
λ1/μ2 + λ3/μ4 < 1.
• For the network in Figure 8.10, we have N = 2, ℓ = 5, λ1 = λ, and
λ2 = λ3 = λ4 = λ5 = 0. Thus λ = [λ 0 0 0 0]. Also, the routing matrix is

        ⎡ 0 1 0 0 0 ⎤
        ⎢ 0 0 1 0 0 ⎥
    R = ⎢ 0 0 0 1 0 ⎥
        ⎢ 0 0 0 0 1 ⎥
        ⎣ 0 0 0 0 0 ⎦

Proceeding as earlier, the conditions for the fluid model to be at least
weakly stable are

λ/μ1 + λ/μ5 < 1 and λ/μ2 + λ/μ3 + λ/μ4 < 1.
Having described the necessary conditions for stability, our next goal is
to obtain the sufficient conditions. Unfortunately, unlike the necessary con-
ditions, the sufficient conditions cannot be stated in a generic fashion and
would have to be addressed on a case-by-case basis. However, knowledge
of the dynamics of the network would certainly aid in the process of obtain-
ing some conditions and all we need to do is to check if those conditions are
sufficient. As an example, recall the virtual station conditions described in
Section 8.2.1. Are those virtual station conditions sufficient to ensure stability
or would more conditions be needed? To answer that question, we consider
a specific example, namely the Kumar–Seidman–Rybko–Stolyar network in Figure 8.6.
λ1/μ1 + λ3/μ4 < 1 and λ1/μ2 + λ3/μ3 < 1
otherwise the condition λ1 /μ2 + λ3 /μ4 < 1 would always be satisfied if the
necessary conditions are satisfied. That is because if μ1 ≤ μ2 then λ1 /μ2 +
λ3 /μ4 ≤ λ1 /μ1 + λ3 /μ4 < 1 (similarly when μ3 ≤ μ4 ). Hence we make that
assumption to avoid the trivial solution.
Without loss of generality we assume that all four buffers are nonempty
initially with the understanding that other cases can be handled in a simi-
lar fashion. Node 2 would drain buffer-2 at rate μ2 and node 1 would drain
buffer-4 at rate μ4 since buffers 2 and 4 have priority. This would continue
until one of buffers 2 or 4 becomes empty. Say that is buffer-4 (the argu-
ment would not be different if it was buffer-2). Now that buffer-4 is empty
and is not receiving any inputs from buffer-3 to process, buffer-1 can now
be drained at rate μ1 and at the same time buffer-2 is being drained at μ2 .
Notice that buffers 1 and 3 have been getting input fluids at rates λ1 and λ3 ,
respectively, since time t = 0. Also currently, buffer-2 is getting input at rate
μ1 . Since μ2 < μ1 , contents in buffer-2 would only grow while that in buffer-1
would shrink until buffer-1 becomes empty. Now we have buffers 1 and 4
empty and the other two nonempty.
However, since buffer-1 is empty and gets input at rate λ1, its departure rate is also λ1. Since
buffer-2 now has a smaller input rate than its output rate, it will drain out all its fluid
and become empty. Thus buffers 1, 2, and 4 are now empty. Now buffer-
3 would start draining at rate μ3 . Since μ3 > μ4 , buffer-4 would now start
building up. Because of that buffer-1 would stop draining and it would also
start accumulating. But buffer-2 would continue to remain empty. Thus the
next event is buffer-3 would empty out. At this time buffers 1 and 4 would
be nonempty. But buffer-4 would now receive input only at rate λ3 from
buffer-3, which would result in buffer-4 draining out but buffer-1 would
continue building up. This would continue till buffer-4 becomes empty at
which time the only nonempty buffer would be buffer-1. At this time buffer-
1 would start draining at rate μ1 into buffer-2 which in turn would drain
at a slower rate μ2 . Thus buffer-1 would drain out, buffer-4 would remain
empty, while buffers 2 and 3 would accumulate. This would continue till
FIGURE 8.14
Cycling through buffer conditions in fluid model of Rybko–Stolyar–Kumar–Seidman network.
buffer-1 becomes empty. Thus buffers 2 and 3 are nonempty while buffers
1 and 4 are empty. This is the same situation as the beginning of this para-
graph. In essence this process would cycle through until all buffers empty,
as depicted in Figure 8.14.
Notice that irrespective of the initial finite amount of fluid in the four
buffers, the system would reach one of the four conditions in Figure 8.14.
Then it would cycle through them. A natural question to ask is: would the
cycle continue indefinitely or would it eventually lead to an empty system
and stay that way? Since this is a deterministic system, if we could show that
if in every cycle the total amount of fluid strictly reduces, then the system
would eventually converge to an empty one. Therefore, all we need to show
is if we started in one of the four conditions in Figure 8.14, then the next time
we reach it there would be lesser fluid in the system. Say we start in the state
where buffer-3 is nonempty (with a units of fluid) and buffers 1, 2, and 4
are empty. If we show that the next time we reach that same situation, the
amount of fluid in buffer-3 would be strictly less than a, then the condition
that enables that would be sufficient for the fluid model to be stable. Using
that argument we present the next problem which can be used to show that
the condition λ1 /μ2 +λ3 /μ4 < 1 is sufficient for the fluid model of the network
in Figure 8.6 to be stable.
Problem 80
Consider the fluid model of the network in Figure 8.6. Assume that the
necessary conditions λ1 /μ1 + λ3 /μ4 < 1 and λ1 /μ2 + λ3 /μ3 < 1 are satis-
fied. Also assume that μ1 > μ2 and μ3 > μ4 . Let the initial fluid levels be
x1 (0) = x2 (0) = x4 (0) = 0 and x3 (0) = a for some a > 0. Further, let T be the first
passage time defined as the next time that buffers 1, 2, and 4 are empty and
buffer-3 is nonempty.
λ3(T − t2) < a
⇒ λ3λ1t2/(μ2 − λ1) < a,
⇒ λ3λ1t1(μ3 − λ3)/((μ2 − λ1)(μ4 − λ3)) < a,
⇒ λ3λ1a/((μ2 − λ1)(μ4 − λ3)) < a.
If we cancel out a, which is positive, on both sides and rewrite the expression,
we get

λ3λ1/((μ2 − λ1)(μ4 − λ3)) < 1,

which is equivalent to λ1/μ2 + λ3/μ4 < 1. Furthermore, the first passage time is

T = αa

where

α = μ2/((μ2 − λ1)(μ4 − λ3))

for any a > 0. So if we started with a amount of fluid in buffer-3 and all other
buffers empty, then after time T = αa we would have βa amount of fluid in
buffer-3 and all other buffers empty, where

β = λ3λ1/((μ2 − λ1)(μ4 − λ3)).
Now if we started with βa, then after time αβa we would have β2 a amount
of fluid in buffer-3 and all other buffers empty. In this manner if we were to
continue, then the total time to empty the system (that started with a amount
of fluid in buffer-3 and all other buffers empty) is
αa + βαa + β²αa + β³αa + · · · = αa/(1 − β) = μ2a/(μ2μ4 − λ1μ4 − λ3μ2).
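The geometric-series bookkeeping above is easy to verify numerically (the rates below are our own illustrative choices, picked to satisfy the stated assumptions μ1 > μ2, μ3 > μ4 and λ1/μ2 + λ3/μ4 < 1):

```python
lam1, lam3 = 1.0, 1.0
mu1, mu2, mu3, mu4 = 4.0, 3.0, 4.0, 3.0

alpha = mu2 / ((mu2 - lam1) * (mu4 - lam3))
beta = lam1 * lam3 / ((mu2 - lam1) * (mu4 - lam3))
assert lam1 / mu2 + lam3 / mu4 < 1 and beta < 1   # sufficient condition holds

a = 5.0
series_total = alpha * a / (1 - beta)             # alpha*a + beta*alpha*a + ...
closed_form = mu2 * a / (mu2 * mu4 - lam1 * mu4 - lam3 * mu2)
print(series_total, closed_form)                  # the two expressions agree
```

Note that β < 1 is exactly the condition λ1/μ2 + λ3/μ4 < 1 after cross-multiplying.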
This shows that if the condition λ1 /μ2 + λ3 /μ4 < 1 is satisfied, then the
amount of fluid in the system converges to zero in a finite time. Thus we
can see that if we started with some arbitrary amount of fluid x1 (0), x2 (0),
x3 (0), and x4 (0) in buffers 1, 2, 3, and 4, respectively, then there exists a finite
time δ after which the system would remain empty. Therefore, under that
condition the fluid network is stable. That guarantees that the correspond-
ing stochastic discrete network originally depicted in Figure 8.6 would also
be stable. In a similar manner one could derive the sufficient conditions for
stability of other deterministic fluid networks and thereby the corresponding
stochastic discrete network on a case-by-case basis.
Having said that, it is important to point out that there are other ways
to derive the conditions for stability. In particular, Lyapunov functions pro-
vide an excellent way to check if fluid networks are stable. Although we
do not go into details of Lyapunov functions in this book, it is worthwhile
describing them for the sake of completeness. Lyapunov functions have been
used extensively to study the stability of deterministic dynamical systems.
out that they are all based on diffusion approximations, which is the main
technique we consider in this section. What is interesting is that in many
situations it is more appealing to use the G/G/s approximation than the exact
result even for an M/M/s queue! The reason is that the exact M/M/s result
is not easy to use. For example, if one were to design (or control) the number
of servers, the mean sojourn time formula is such a complicated expression in
terms of s that one would rather use the simpler G/G/s approximation. To
add to the mix if we were to also consider abandonments, retrials, and server
breakdowns, diffusion approximations may be the only alternative even for
Markovian systems.
With that motivation in the next few sections we present a brief intro-
duction to diffusion approximations without delving into great detail with
respect to all the technical aspects. There is a rich literature with some
excellent books and articles on this topic. The objective of this section is to
merely provide a framework, perhaps some intuition and also fundamental
background for the readers to access the vast literature on diffusion approxi-
mation (which is also sometimes referred to as heavy-traffic approximations
especially in queues). For technical details on weak convergence, which is
the foundation of diffusion approximations, readers are referred to Glynn
[46] and Whitt [105]. We merely present the scaling procedure which results
in what is called diffusion limit. Similar to the fluid limit we presented in
Section 8.1, next we present the diffusion limit which is based on Chen
and Yao [19]. Subsequently, we will describe diffusion approximations for
multiserver queues.
λ = lim_{t→∞} A(t)/t.
Recall that to obtain the fluid limit of the discrete arrival process {A(t), t ≥ 0},
we defined An (t) as
An(t) = A(nt)/n.

Now define the diffusion-scaled version

Ân(t) = √n (An(t) − λt) = (A(nt) − nλt)/√n
for any n > 0 and t ≥ 0. We would like to study Ân (t) as n → ∞ which we
will call the diffusion scaling (because the resulting process is a diffusion
process). We first illustrate the diffusion scaling using the same example as
in Section 8.1. Recall that to illustrate the strength of the results we con-
sider (i) an arrival process with an extremely high coefficient of variation,
(ii) a fairly small t, and (iii) analyze arrivals to the second node of a tandem
network (hence arrivals are not IID).
Problem 81
Consider a G/G/1 queue with interarrival times as well as service times
according to Pareto distributions. The coefficient of variation for interarrival
times is 5 and for the service time it is equal to 2. The average arrival rate is
1 per unit time and the average service rate is 1.25 per unit time. The depar-
tures from this queue act as arrivals to a downstream queue. Let A(t) be the
number of entities that arrive at the downstream node during (0, t]. For t = 0
to 10 time units, graph three sample paths of Ân(t) = (A(nt) − nλt)/√n versus t for
n = 1, 10, 100, and 1000.
Solution
It is crucial to note that the A(t) process is the arrivals to the downstream
node which is the same as the departures from the G/G/1 node described
in the question. Also the average arrival rate is λ = 1. By writing a simula-
tion using the algorithm in Problem 37 in Chapter 4, we can obtain sample
paths of the output process from the G/G/1 queue, in particular the number
of departures during any interval of time. Using this for various values of
n = 1, 10, 100, and 1000, we can plot three sample paths of Ân(t) = (A(nt) − nλt)/√n
versus t, as shown in Figure 8.15(a)–(d).
From the figure, note that in (a) where n = 1, the sample paths are similar
to the workload process with jumps and constant declining sample paths,
except the values go below zero. When n = 10 as seen in (b), the sample paths
are still similar but they are closer than in case (a) because we have about
100 arrivals as opposed to 10 in case (a). We see this trend more prominent
in case (c) where the sample paths have closed in and the jumps are not so
prominent. Finally in (d) when n = 1000 which for this example is sufficiently
large, the sample paths essentially look like Brownian motions. There are a
couple of things to notice. Unlike the fluid limits, the diffusion limit does not
go to a deterministic value but it appears to be a normal random variable
(and the whole process converges to a Brownian motion). Also, the range
FIGURE 8.15
Sample paths of scaled arrival process Ân (t). (a) n = 1. (b) n = 10. (c) n = 100. (d) n = 1000.
of values for all four cases is more or less the same. In other words, the
variability does not appear to depend on n.
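The scaling behavior seen in the figure can be reproduced for a plain renewal process (a simulation sketch of ours: we substitute gamma interarrivals with a chosen SCOV for the Pareto tandem setup of the problem, and check the marginal distribution of Ân(t) at one time point):

```python
import numpy as np

def scaled_count(rng, lam, scv, n, t):
    """One sample of A_hat_n(t) = (A(nt) - n*lam*t)/sqrt(n) for a renewal
    process with rate lam and interarrival SCOV scv (gamma interarrivals)."""
    k, scale = 1.0 / scv, scv / lam     # gamma shape/scale: mean 1/lam, SCOV scv
    horizon = n * t
    m = int(lam * horizon + 10.0 * np.sqrt(scv * lam * horizon) + 100)
    arrivals = np.cumsum(rng.gamma(k, scale, size=m))
    count = np.searchsorted(arrivals, horizon)   # A(nt)
    return (count - n * lam * t) / np.sqrt(n)

rng = np.random.default_rng(0)
lam, scv, n, t = 1.0, 4.0, 400, 5.0
samples = np.array([scaled_count(rng, lam, scv, n, t) for _ in range(4000)])
# FCLT prediction: A_hat_n(t) is approximately Normal(0, lam * scv * t) for large n
print(samples.mean(), samples.var(), lam * scv * t)
```

The sample mean comes out near 0 and the sample variance near λC²a t = 20, independent of n once n is large — matching the observation above that the variability does not depend on n.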
While this result appears to be reasonable for a general arrival process, for
the description of the key results we only consider renewal processes (refer
to Whitt [105] for a rigorous description in the more general case of arrival
processes, not necessarily renewal). Now, let {A(t), t ≥ 0} be a renewal pro-
cess with average interrenewal time 1/λ and squared coefficient of variation
of interarrival time C2a . We know that for a large n and any t, A(nt) would be
approximately normally distributed with mean λnt and variance λC2a nt. Thus
for a large n, Ân(t) = (A(nt) − nλt)/√n would be normally distributed with mean 0
and variance λC2a t. Also, based on the previous example, we can conjecture
that the stochastic process {Ân (t), t ≥ 0} converges to a Brownian motion with
drift 0 and variance term λC2a as n → ∞. The theory that supports this conjec-
ture is an extension of the well-known central limit theorem to functionals.
Sn = Z1 + Z2 + · · · + Zn

for any n ≥ 1. The central limit theorem essentially states that as n → ∞,
(Sn − mn)/(σ√n) converges to a standard normal random variable. In practice, for large n
one approximates Sn as a normal random variable with mean nm and vari-
ance nσ2 . Donsker’s theorem essentially generalizes this to functionals thus
resulting in the FCLT. Define Yn (t) as
Yn(t) = (S⌊nt⌋ − m⌊nt⌋)/(σ√n) = (1/(σ√n)) Σ_{i=1}^{⌊nt⌋} (Zi − m)
Rn(t) = (N(nt) − nt/m)/((σ/m)√(n/m))
for any t ≥ 0. Chen and Yao [19] show that by applying Donsker’s theorem
and random change theorem as n → ∞, the stochastic process {Rn (t), t ≥ 0}
also converges to the standard Brownian motion. To develop an intuition
it may be worthwhile to show that for large t, N(t) is a normal random
variable with mean t/m and variance σ2 t/m3 (see Exercises at the end of
the chapter). In summary, if {B(t), t ≥ 0} is a Brownian motion with drift 0
and variance term 1, then as n → ∞, {Rn (t), t ≥ 0} converges in distribution
to {B(t), t ≥ 0}. Now, we put this in perspective with respect to the arrival
process {A(t), t ≥ 0} which is a renewal process with average interrenewal
time 1/λ and squared coefficient of variation of interarrival time C2a . Using
the preceding result we can verify our conjecture that {Ân (t), t ≥ 0} defined
earlier converges to a Brownian motion with drift 0 and variance term λC2a
as n → ∞.
It is not difficult to see that similar to the arrival process, the service
time process when scaled in a similar fashion also converges to a Brownian
motion. Thus the next natural step is to use the results in a G/G/1 setting
where the average arrival rate is λ and the SCOV of the interarrival times
is C2a , and the service rate is μ with service time SCOV C2s . The analysis
would be identical to that in Section 7.1.1. There we showed the results
using the normal approximation which would follow in a very similar fash-
ion, albeit more rigorous, if we modeled the underlying stochastic processes
as Brownian motions. For sake of completeness we simply restate those
results here. As the traffic intensity ρ (recall that ρ = λ/μ) approaches 1, the
workload in the system converges to a reflected Brownian motion with drift
(λ − μ)/μ and variance term λ(C²a + C²s)/μ². Thus the steady-state distribution
of the workload is exponential with parameter γ per unit time, where

γ = 2(1 − ρ)μ² / (λ(C²a + C²s)).

Since an arriving customer in steady state would wait for a
time equal to the workload for service to begin, the waiting time before
service is also distributed as exp(γ) when ρ ≈ 1.
Although we did not explicitly state it in Section 7.1.1, this is an extremely
useful result. For example, we could answer questions such as: what is the
probability that the service for an arriving customer would begin within
the next t0 time (answer: 1 − e−γt0 ). This is also extremely useful in design-
ing the system. For example if the quality-of-service metric is that not more
than 5% of the customers must wait longer than 5 time units (e.g., minutes),
then we can write that constraint as e−5γ ≤ 0.05. Thus it is possible to obtain
approximate expressions for the distribution of waiting times and sojourn
times using the diffusion approximation when the traffic intensity is close
to one (for that reason these approximations are also referred to as heavy-
traffic approximations). That said, in the next two sections we will explore
the use of diffusion approximations in multiserver queue settings. However,
the approach, scaling, and analysis are significantly different from what was
considered for the G/G/1 case.
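The heavy-traffic waiting-time approximation above translates into a few lines of code (a sketch of ours; the parameter values are illustrative, and the comparison against the exact M/M/1 rate is our own sanity check):

```python
import math

def gamma_param(lam, mu, ca2, cs2):
    """Exponential parameter of the heavy-traffic (diffusion) approximation to
    the steady-state waiting time before service in a G/G/1 queue."""
    rho = lam / mu
    return 2.0 * (1.0 - rho) * mu**2 / (lam * (ca2 + cs2))

lam, mu, ca2, cs2 = 0.95, 1.0, 1.0, 1.0   # nearly saturated M/M/1-like case
g = gamma_param(lam, mu, ca2, cs2)
print(g)                      # compare: the exact M/M/1 rate is mu - lam = 0.05
print(math.exp(-5.0 * g))     # P(wait > 5), the kind of QoS constraint above
```

With ρ = 0.95 the approximate rate 0.0526 is already close to the exact M/M/1 value 0.05, illustrating why the approximation is trusted near saturation.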
scaled by a factor “n” across time and “√n” across “space” so that we define
Ẑn(t) as

Ẑn(t) = (Z(nt) − Z̄(nt))/√n
for any n > 0 and t ≥ 0. The term Z̄(nt) is the deterministic fluid model
(potentially different from the fluid scaling we saw earlier in this chapter)
of the stochastic process {Z(t), t ≥ 0}. Usually, Z̄(nt) = E[Z(nt)] or a heuristic
approximation for it. However, if that is not possible, then the usual fluid
scaling (via a completely different scale) can be applied, that is, Z̄(nt) is taken
to be the usual fluid limit, where we let the fluid scale go to ∞.
Assuming that the deterministic fluid model Z̄(nt) can be computed,
the main objective here is to study Ẑn(t). In particular, by applying the
scaling “n,” the aim is to show that as n → ∞, the stochastic process
{Ẑn (t), t ≥ 0} converges to a diffusion process (that is the reason this method
is called diffusion approximation or diffusion scaling). A diffusion process is
a continuous-time stochastic process with almost surely continuous sample
paths and satisfies the Markov property. Examples of diffusion processes
are Brownian motion, Ornstein–Uhlenbeck process, Brownian bridge pro-
cess, branching process, etc. It is beyond the scope of this book to show the
convergence of the stochastic process {Ẑn (t), t ≥ 0} as n → ∞ to a diffusion
process {Ẑ∞ (t), t ≥ 0}. However, we do provide an intuition and interested
readers are referred to Whitt [105] for technical details. The key idea of diffu-
sion approximation is to start by using the properties of {Ẑ∞ (t), t ≥ 0}, such as
the distribution of Ẑ∞ (∞). Then for large n, Ẑn (∞) is approximately equal in
distribution to Ẑ∞ (∞). Thereby, we can approximately obtain a distribution
for Z(∞) using
Z(∞) = Z̄(∞) + √n Ẑn(∞)
interested readers are referred to the literature, especially Whitt [105], for the
G/G/s case. In M/M/s queues, the Markov property leads to diffusion processes;
however, in the G/G/s case, although the marginal distribution at any time in
steady state converges to Gaussian, the process itself may not be a diffusion
(since the Markov property would not be satisfied). Nonetheless there is merit in
considering the M/M/s case. Thus for the remainder of this section we only
consider M/M/s queues, that is, Poisson arrivals (at rate λ) and exponential
service times (with mean 1/μ at every server).
For such an M/M/s queue, let X(t) be the number of customers in the sys-
tem at time t. We are interested in applying diffusion scaling to the stochastic
process {X(t), t ≥ 0}. Further, define X̂n (t) as
X̂n(t) = (X(nt) − X̄(nt))/√n
L = λ/μ + p0 (λ/μ)^s λ / (s! sμ [1 − λ/(sμ)]²)

where

p0 = [ Σ_{n=0}^{s−1} (λ/μ)^n/n! + ((λ/μ)^s/s!) (1/(1 − λ/(sμ))) ]^{−1}.
X̂n(t) = (X(nt) − L)/√n
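The exact M/M/s quantities just displayed can be evaluated directly; this sketch computes p0 and L from the formulas above (the function name is ours, and the parameter values echo the λ = 3.6, μ = 1, s = 4 case used in Figure 8.16):

```python
from math import factorial

def mms_p0_L(lam, mu, s):
    """Steady-state p0 and mean number in system L for M/M/s (requires lam < s*mu)."""
    r = lam / mu                  # offered load
    rho = lam / (s * mu)          # traffic intensity
    p0 = 1.0 / (sum(r**n / factorial(n) for n in range(s))
                + r**s / (factorial(s) * (1.0 - rho)))
    L = r + p0 * r**s * rho / (factorial(s) * (1.0 - rho)**2)
    return p0, L

print(mms_p0_L(3.6, 1.0, 4))   # lambda = 3.6, mu = 1, s = 4 (rho = 0.9)
```

As a sanity check, with s = 1 the function reduces to the familiar M/M/1 values p0 = 1 − ρ and L = ρ/(1 − ρ).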
FIGURE 8.16
Sample paths of scaled process X̂n (t/n) vs. t for μ = 1 and s = 4. (a) λ = 3.2, ρ = 0.8, n = 25.
(b) λ = 3.6, ρ = 0.9, n = 100. (c) λ = 3.8, ρ = 0.95, n = 400. (d) λ = 3.96, ρ = 0.99, n = 10, 000.
X̂n(t/n) = (X(t) − L)/√n
versus t for various increasing values of λ in Figure 8.16. Notice that the
diffusion-scaled process is a little different and not scaled across time (we use
t/n as opposed to t). From Figure 8.16 it is clear that the {X̂n(t/n), t ≥ 0} process
converges to a diffusion process as n is scaled. This would be more pow-
erful if we were to have scaled time as well, that is, plotted X̂n (t) instead of
X̂n (t/n). One could use this scaling when the system has high traffic intensity
but not a large number of servers.
FIGURE 8.17
Sample paths of scaled process X̂n (t/n) vs. t for μ = 1 and ρ = 0.9. (a) λ = 3.6, s = 4, n = 4.
(b) λ = 14.4, s = 16, n = 16. (c) λ = 36, s = 40, n = 40. (d) λ = 90, s = 100, n = 100.
X̂n(t/n) = (X(t) − L)/√n
versus t for various increasing values of λ and s in Figure 8.17. Notice that
the diffusion-scaled process is a little different and not scaled across time
(we use t/n as opposed to t). From Figure 8.17 it is clear that the {X̂n(t/n), t ≥ 0}
process converges to a diffusion process as n is scaled. This would be more
powerful if we were to have scaled time as well, that is, plotted X̂n (t) instead
of X̂n (t/n). One could use this scaling when the system has a large number
of servers but not a very high traffic intensity.
we consider the Halfin–Whitt regime (due to Halfin and Whitt [50]) in which
ρ → 1 but β is held a constant where
β = (1 − ρ)√s.
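One common reading of holding β fixed (our gloss, not stated in the text here) is square-root safety staffing: with offered load R = λ/μ, the server count satisfies s = R + β√s ≈ R + β√R for large s. The (λ, s) quadruples used in Figure 8.18 do keep β constant, as a quick check confirms:

```python
import math

def halfin_whitt_beta(lam, mu, s):
    """beta = (1 - rho) * sqrt(s) with rho = lam / (s * mu)."""
    return (1.0 - lam / (s * mu)) * math.sqrt(s)

# (lambda, s) pairs from Figure 8.18, with mu = 1: beta stays at 0.2 throughout
for lam, s in [(0.8, 1), (3.6, 4), (15.2, 16), (62.4, 64)]:
    print(s, round(halfin_whitt_beta(lam, 1.0, s), 10))
```

Every pair yields β = 0.2 even as ρ climbs from 0.8 to 0.975, which is precisely the Halfin–Whitt scaling.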
We use the scale n = s (the choice of n = 1/(1 − ρ)2 would have also worked)
so that n increases as s increases. We plot
X̂n(t/n) = (X(t) − L)/√n
versus t for various increasing values of λ and s in Figure 8.18. Notice that the
diffusion-scaled process is a little different and not scaled across time (we use
t/n as opposed to t). From Figure 8.18 it is clear that the {X̂n(t/n), t ≥ 0} process
converges to a diffusion process as n is scaled. This would be more powerful
if we were to have scaled time as well, that is, plotted X̂n(t) instead of
X̂n(t/n). One could use this scaling when the system has both high traffic
intensity and a large number of servers.
FIGURE 8.18
Sample paths of scaled process X̂n (t/n) vs. t for μ = 1 and β = 0.2. (a) λ = 0.8, ρ = 0.8, n = s = 1.
(b) λ = 3.6, ρ = 0.9, n = s = 4. (c) λ = 15.2, ρ = 0.95, n = s = 16. (d) λ = 62.4, ρ = 0.975, n = s = 64.
X̂n(t) = (X(nt) − s)/√n
P{X(∞) ≥ s} → θ
following the results in Whitt [106]. In fact the references in Whitt [106] point
to articles that consider the other two regimes.
To obtain the diffusion limits for the ED regime of the M/M/s + M model,
we begin with the Erlang-A M/M/s/K+M model. The waiting space K−s will
be chosen to be large enough and scaled in a manner that would approach
infinity. Thus we consider a sequence of M/M/s/K + M queues indexed by s,
the number of servers which we would use to scale. In particular, let λs and
Ks be the scaled arrival rate and system capacity. However the service rate
μ and abandonment rate α are not scaled. Also, the traffic intensity ρ is not
scaled and remains fixed for the entire sequence of queues with ρ > 1 (since
the regime is ED). Let
μ(ρ − 1)
q= . (8.10)
α
λs = ρsμ (8.11)
Ks = s(η + 1) (8.12)
X̂s(t) = (Xs(t) − X̄s(t))/√s

by making the realization that X̄s(t) must be greater than s (as ρ > 1 results
in λs > min{i, s}μ for any i ≥ 0). Thus we have

X̄s(t) = (λs − sμ)/α + s = (1 + q)s    (8.13)
where the last equality is by substituting for λs from Equation 8.11 and using
Equation 8.10 for q. Thus we represent the diffusion term as

X̂s(t) = (Xs(t) − s(1 + q))/√s    (8.14)
for all t ≥ 0.
Whitt [106] shows that the stochastic process {X̂s (t), t ≥ 0} as s → ∞ con-
verges to an Ornstein–Uhlenbeck diffusion process. In state x, the infinites-
imal mean or state-dependent drift of the Ornstein–Uhlenbeck process is
−αx and infinitesimal variance 2μρ. Further, the steady-state distribution
of X̂s (∞) converges to a normal distribution with mean 0 and variance
ρμ/α. Next we explain that briefly. We showed earlier in this section via
simulations how processes like {X̂s (t), t ≥ 0} converge to diffusion processes
(hence the term diffusion limit) as s → ∞. Thus that is not a surprising result.
Further, it is possible to show a weak convergence of the birth and death
process {Xs(t), t ≥ 0} to an Ornstein–Uhlenbeck process by appropriate scaling
(akin to how a birth and death process with constant rates converges to a
Brownian motion). That is because beyond state s(1 + q), since the death rate exceeds
the birth rate, the process gets pulled back toward s(1 + q). Likewise, below state
s(1 + q), where the birth rate exceeds the death rate, the process gets pushed
up toward s(1 + q). This results in a convergence to the Ornstein–Uhlenbeck
process centered around s(1 + q). Also, the steady-state distribution of an
Ornstein–Uhlenbeck diffusion process centered at zero (with mean drift rate
−m and infinitesimal variance v) is zero-mean normal with variance equal
to v/(2m). Notice that the drift rate of −m implies that the drift in state x is
−mx. That said, the only things remaining to be shown are that the drift rate
for our process is m = α and infinitesimal variance equal to 2μρ. This is the
focus of the next problem.
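Before the formal argument, the claimed centering s(1 + q) and limiting variance sρμ/α can be checked against the exact birth–death stationary distribution of an M/M/s/K + M queue (a numerical sketch of ours; the rates and the ad hoc capacity K = 3s are our own choices):

```python
import numpy as np

def stationary_dist(lam, mu, alpha, s, K):
    """Stationary distribution of the M/M/s/K+M birth-death chain: birth rate lam,
    death rate min(i, s)*mu + max(i - s, 0)*alpha in state i (log-space for safety)."""
    log_p = np.zeros(K + 1)
    for i in range(1, K + 1):
        death = min(i, s) * mu + max(i - s, 0) * alpha
        log_p[i] = log_p[i - 1] + np.log(lam / death)
    p = np.exp(log_p - log_p.max())
    return p / p.sum()

s, mu, alpha, rho = 200, 1.0, 0.5, 1.2    # ED regime: rho > 1
lam = rho * s * mu                         # Equation 8.11
q = mu * (rho - 1) / alpha                 # Equation 8.10
p = stationary_dist(lam, mu, alpha, s, K=3 * s)
states = np.arange(3 * s + 1)
mean = p @ states
var = p @ (states - mean) ** 2
print(mean, s * (1 + q))           # mean is close to s(1+q) = 280
print(var, s * rho * mu / alpha)   # variance is close to s*rho*mu/alpha = 480
```

Already at s = 200 the exact mean and variance sit within a fraction of a percent of the diffusion predictions.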
Problem 82
Show that the Ornstein–Uhlenbeck diffusion process that results from scal-
ing {X̂s (t), t ≥ 0} as s → ∞ has a drift of m(x) = −αx and infinitesimal variance
v(x) = 2μρ for any feasible state x.
Solution
Notice that since Xs(t) takes on any integer value k ≥ 0, X̂s(t) would correspondingly take on discrete values [k − s(1 + q)]/√s for k = 0, 1, 2, . . . but we
are ultimately interested in any real-valued state x. For that we first consider
an arbitrary real value x and a sequence of xs values for X̂s (t) for each s so
that xs → x as s → ∞. Assuming that s is sufficiently large, we can consider
the following choice for xs so that the preceding condition is met:
498 Analysis of Queues
xs = (⌊s(1 + q) + x√s⌋ − s(1 + q))/√s.
For any s, the infinitesimal mean (corresponding to the drift) ms(xs) can
be computed as follows (with the first equation being the definition):

ms(xs) = lim_{h→0} E[(X̂s(t + h) − X̂s(t))/h | X̂s(t) = xs]
= lim_{h→0} E[(Xs(t + h) − Xs(t))/(h√s) | Xs(t) = √s xs + s(1 + q)]
= lim_{h→0} [λs h − μsh − αh(√s xs + sq) + o(h)]/(h√s)
= (λs − μs − αsq)/√s − αxs = −αxs,

where the last equality uses λs = sρμ (Equation 8.11) and q = μ(ρ − 1)/α (Equation 8.10), which together give λs − μs − αsq = 0. Since xs → x as s → ∞, we get

m(x) = −αx.
It is worthwhile pointing out that this expression was derived assuming that
xs ≥ −q√s. But what if xs < −q√s? It turns out that for sufficiently large s it
is not even feasible to reach states xs that are smaller than −q√s. Even if one
were to reach such a state, the calculation would result in m(x) = ∞, which
would imply an instantaneous drift to a higher x. That is the reason
the problem is worded as m(x) = −αx for any feasible state x. Whitt [106] shows
that P{Xs(∞) ≤ s} → 0 as s → ∞ using a fluid model, which implies that in
steady state there is no chance for Xs(∞) to be less than s, that is, for xs to be
less than −q√s.
Next we consider the infinitesimal variance. For any s, the infinitesimal
variance vs(xs) can be computed as follows (with the first equation being the
definition):

vs(xs) = lim_{h→0} E[(X̂s(t + h) − X̂s(t))²/h | X̂s(t) = xs]
= lim_{h→0} E[(Xs(t + h) − Xs(t))²/(hs) | Xs(t) = √s xs + s(1 + q)]
= lim_{h→0} [λs h + μsh + αh(√s xs + sq) + o(h)]/(hs),

Stability, Fluid Approximations, and Non-stationary Queues 499

for any xs ≥ −q√s, where o(h) is a collection of terms of order less than h such
that o(h)/h → 0 as h → 0 but different from the o(h) defined in ms(xs).
By taking the limit h → 0 and substituting for λs using Equation 8.11
we get
vs(xs) = μρ + μ + αxs/√s + αq = 2ρμ + αxs/√s,
where the second equality uses expressions for q from Equation 8.10. Now
we let s → ∞ resulting in vs (xs ) → v(x) such that
v(x) = 2μρ.
Thus the steady-state distribution of the diffusion process, that is, X̂s (∞)
as s → ∞ converges to a normal distribution with mean 0 and variance ρμ/α.
This involves a rigorous argument taking stochastic process limits appropri-
ately (see Whitt [106] and Whitt [105] for further details). Therefore, as an
approximation we can use for fairly large s values that X̂s (∞) is approxi-
mately normally distributed with mean 0 and variance ρμ/α. Hence using
Equation 8.14 we can state that Xs (∞) is approximately normally distributed
with mean s(1 + q) and variance sρμ/α. Assuming a reasonably significant abandonment rate α so that μ/α << s, we can see that Xs(∞) would be
greater than s with a very high probability (approximately 1). Hence we
can write down Lq ≈ L − s using our usual definition of L and Lq being the
steady-state number of customers in the system and in the queue waiting for
service to begin, respectively. Thus we have Lq ≈ sq since L ≈ s(1 + q). Now
define Pab as the probability that an arriving customer in steady state would
abandon without service. Using Little’s law for abandoning customers,
we have
Lq = (1/α) λs Pab.
Using the fact that Lq ≈ sq = sμ(ρ − 1)/α and λs = sρμ, we can compute
Pab ≈ (ρ − 1)/ρ.
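This approximation is easy to sanity-check. The sketch below (our illustration; the parameter values are arbitrary choices with ρ > 1) computes the exact stationary distribution of the birth–death chain for the M/M/s queue with abandonment — birth rate λs = sρμ in every state, death rate min{k, s}μ + max{k − s, 0}α in state k — and compares the exact Pab = αLq/λs (Little's law, as above) with the approximation (ρ − 1)/ρ:

```python
# Exact stationary analysis of the M/M/s queue with abandonment: birth
# rate lam = s*rho*mu in every state, death rate min(k, s)*mu +
# max(k - s, 0)*alpha in state k. Parameter values are illustrative.
s, rho, mu, alpha = 100, 1.25, 1.0, 0.5
lam = s * rho * mu

# Birth-death stationary distribution, truncated far above the fluid
# level s(1 + q); detailed balance gives pi[k] up to normalization.
K = 600
pi = [1.0]
for k in range(1, K + 1):
    death = min(k, s) * mu + max(k - s, 0) * alpha
    pi.append(pi[-1] * lam / death)
total = sum(pi)
pi = [p / total for p in pi]

Lq = sum(p * max(k - s, 0) for k, p in enumerate(pi))
Pab = alpha * Lq / lam        # Little's law for abandoning customers
approx = (rho - 1) / rho      # diffusion-based approximation
print(round(Pab, 4), round(approx, 4))
```

For s = 100 the exact and approximate abandonment probabilities already agree closely.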
waiting to begin service, the variance of Xs (∞) must be equal to the vari-
ance of the number of customers waiting to begin service. Using that we
can obtain for the preceding numerical example that the standard deviation of the number of customers waiting to begin service is approximately
√(sρμ/α) = 33.1662, which is remarkably close to the exact result of 33.1 customers. The key point to make is that these approximations are conducive
to use in design and control (as opposed to the exact results). This is typical
for most diffusion approximations where the results are surprisingly simple
although the process to obtain them is fairly intensive. That said, in the
next section we leverage upon both fluid and diffusion approximations for
transient analysis.
The other set of events that are responsible for the dynamics of an Mt /M/st
queue are the departures. We let Nd (·) denote the departure process,
which is also a Poisson process that is not only nonhomogeneous in time but also
dependent on the state of the system.
Thus, for this Mt /M/st queue, we can write down X(t) in terms of the
initial state of the system X(0), as well as the two nonhomogeneous Poisson
processes Na (·) and Nd (·) as
X(t) = X(0) + Na(∫₀ᵗ λu du) − Nd(∫₀ᵗ min{X(u), su}μ du).  (8.15)
To this we now add two additional situations: (i) customers renege after
exp(β) time if their service does not start; (ii) there is a new stream of
customers that arrive according to a homogeneous Poisson process with
parameter α but could balk upon arrival resulting in a queue joining prob-
ability qX(t) if the customer arrives at time t and sees X(t) others in the
system. Clearly the reneging occurs according to a nonhomogeneous Poisson
process, let us call it Nr (·). Likewise, let Nb (·) denote the nonhomoge-
neous Poisson process corresponding to the second stream of customers that
potentially balk. For this modified system we can write down X(t) as
X(t) = X(0) + Na(∫₀ᵗ λu du) − Nd(∫₀ᵗ min{X(u), su}μ du)
− Nr(∫₀ᵗ β max{X(u) − su, 0} du) + Nb(∫₀ᵗ α qX(u) du).
F(t, x) = Σᵢ₌₁ᵏ li fi(t, x).  (8.16)
Xn(t) = [ nXn(0) + Σᵢ₌₁ᵏ li Yi(∫₀ᵗ n fi(s, Xn(s)) ds) ] / n  (8.17)
with nXn (0) = X(0) so that the initial state is also scaled. The scaled process
{Xn (t), t ≥ 0} is obtained essentially by taking n times faster rates of events.
Such a scaling is also called uniform acceleration in the literature (see Massey
and Whitt [79]). Like in all the fluid models we have seen in this chapter, here
too as n → ∞, the scaled process {Xn (t), t ≥ 0} converges to a deterministic
process almost surely.
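The flavor of this convergence is easy to see for a single unit-rate Poisson process Y(·), for which the functional strong law of large numbers gives Y(nt)/n → t as n → ∞. A minimal sketch (the values of n, t, and the seed are arbitrary choices):

```python
# FSLLN behind uniform acceleration: for a unit-rate Poisson process
# Y(.), the scaled process Y(n*t)/n converges to the deterministic
# function t as n grows. The values of n, t, and the seed are arbitrary.
import random

def scaled_poisson(n, t, seed):
    """Return Y(n*t)/n for a freshly sampled unit-rate Poisson path."""
    rng = random.Random(seed)
    clock, count = 0.0, 0
    while True:
        clock += rng.expovariate(1.0)   # unit-rate interevent times
        if clock > n * t:
            break
        count += 1
    return count / n

vals = [scaled_poisson(n, 2.0, seed=1) for n in (10, 100, 10_000)]
print(vals)                             # approaches t = 2.0 as n grows
```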
The result once again is an artifact of the functional strong law of large numbers (FSLLN), which leads to what is called the strong approximation. Taking expectations in Equation 8.17, we get
E[Xn(t)] = [ nXn(0) + Σᵢ₌₁ᵏ li E[Yi(∫₀ᵗ n fi(s, Xn(s)) ds)] ] / n
= [ nXn(0) + Σᵢ₌₁ᵏ li E[∫₀ᵗ n fi(s, Xn(s)) ds] ] / n  (8.18)
= Xn(0) + Σᵢ₌₁ᵏ li ∫₀ᵗ E[fi(s, Xn(s))] ds  (8.19)
where Equation 8.18 is due to the Poisson process property recalling that
the expected value of a nonhomogeneous Poisson process N(Λ(t)) is Λ(t).
If we know the distribution of Xn (s) then we can write down Equation 8.19,
but we consider a nonparametric approach. For that we use the Lipschitz
property of the function fi (·, ·) due to which
|E[fi (s, Xn (s))] − fi (s, E[Xn (s)])| ≤ ME[|Xn (s) − E[Xn (s)]|].
If we let n → ∞ in this expression, then the RHS goes to zero. Thus we have
lim_{n→∞} E[Xn(t)] = Xn(0) + Σᵢ₌₁ᵏ li ∫₀ᵗ lim_{n→∞} fi(s, E[Xn(s)]) ds.
Since we consider X̄(t) = limn → ∞ E[Xn (t)], using the previous equation we
can write down X̄(t) as the solution to the equation
X̄(t) = X̄(0) + Σᵢ₌₁ᵏ li ∫₀ᵗ fi(s, X̄(s)) ds.  (8.20)
Note that using the previous expression it is possible to solve numerically for
X̄(t) for any t. Thus for large n one can approximate Xn (t) as the deterministic
quantity X̄(t). But what is the connection to E[X(t)] that we alluded to earlier
in this section? As it turns out, that would have to be done on a case-by-case
basis. We illustrate that process using an example next.
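To make the numerical-solution remark concrete, here is a forward-Euler pass over Equation 8.20 for a simple two-event instance of our own choosing (not from the text): arrivals with f1(t, x) = λ and l1 = +1, and services with f2(t, x) = μ min{x, s} and l2 = −1, so that X̄′(t) = λ − μ min{X̄(t), s}:

```python
# Forward-Euler solution of the fluid equation (8.20) for a two-event
# instance: x'(t) = lam - mu*min(x, s). All values are illustrative.
lam, mu, s = 3.0, 1.0, 4.0

def F(t, x):
    """Net drift: arrivals at rate lam minus service at rate mu*min(x, s)."""
    return lam - mu * min(x, s)

def euler(F, x0, T, dt=1e-3):
    x, t = x0, 0.0
    while t < T:
        x += F(t, x) * dt
        t += dt
    return x

xbar = euler(F, x0=0.0, T=20.0)
print(round(xbar, 3))   # -> 3.0, the equilibrium lam/mu since lam < s*mu
```

Since λ < sμ here, the fluid trajectory settles at λ/μ, which the printed value confirms.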
Problem 83
Consider an Mt /M/st system that models an inbound call center. The con-
stant mean service rate for this call center is μ = 4 customers per hour.
Table 8.2 describes the expected arrival rate λt per hour and number of
servers (st ) by discretizing into eight hourly intervals. Develop a fluid scaling
or uniform acceleration for this system by numerically describing the deter-
ministic process {X̄(t), 0 ≤ t ≤ 8}. Compare against a simulated sample path
of the number in the system process {X(t), 0 ≤ t ≤ 8}. Assume that X(0) = 80,
that is, at time zero there are already 80 customers in the system. Also, obtain
an approximation for E[X(t)] and compare against simulations by creating
100 replications and averaging them.
Solution
The problem description is that of a call center where the arrival rate and
number of servers (that is, representatives or call handlers) are time varying. However, within an hour we assume that they are held constant (the
TABLE 8.2
Hourly Arrival Rate and
Staffing at a Call Center
t λt st
(0, 1] 400 110
(1, 2] 440 120
(2, 3] 500 130
(3, 4] 720 170
(4, 5] 800 220
(5, 6] 720 200
(6, 7] 600 140
(7, 8] 400 120
analytical model does not need this assumption, it is there just for the
insights). It may be worthwhile going through Table 8.2. In essence during
the first hour 400 customers are expected and the number of servers is 110.
Likewise, during the seventh hour 600 customers are expected and the num-
ber of servers is 140. Notice that in time intervals (3, 4] and (6, 7] there is an
overload situation, that is, the arrival rate is larger than the service capacity,
λt > st μ. However, since this is a transient analysis, that is not much of an
issue but worth watching out for.
That said we now consider the Mt /M/st system with μ = 4 and λt and st
from Table 8.2. Let X(t) be the number of customers in the system at time t
with X(0) = 80. We can rewrite Equation 8.15 as
X(t) = X(0) + Y1(∫₀ᵗ λu du) − Y2(∫₀ᵗ μ min{su, X(u)} du),  (8.21)
where Y1 (·) and Y2 (·) are the nonhomogeneous Poisson arrival and depar-
ture processes, respectively. For some large n and all t ∈ [0, 8], let
at = λt/n  and  rt = st/n.
We will pretend rt is an integer for the fluid scaling but that will not be neces-
sary for the limiting deterministic process that will be defined subsequently.
Define Xn (t) as
Xn(t) = [ nXn(0) + Y1(∫₀ᵗ n au du) − Y2(∫₀ᵗ μn min{ru, Xn(u)} du) ] / n,
where Xn (0) = X(0)/n. Notice that this equation is identical in form to that of
the scaled process in Equation 8.17. More crucially notice that this equation
is also identical to Equation 8.21 if we let X(t) = nXn(t) for all t.
As n → ∞, Xn (t) converges to X̄(t) which is given by the solution to
X̄(t) = X̄(0) + ∫₀ᵗ au du − ∫₀ᵗ μ min{ru, X̄(u)} du,  (8.22)
factor n and say that the scaled process {Xn (t), t ≥ 0} converges to its fluid
limit {X̄(t), t ≥ 0}. However, in this problem we begin with λt and st , select an
n and then figure rt and at . Thus the approximation would work well when
λt is significantly larger than μ and st significantly larger than 1. In fact, the
choice of n is actually irrelevant.
We arbitrarily select n = 50 (any other choice would not change the
results). Then we solve for X̄(t) by performing a numerical integration for
Equation 8.22 via first principles in calculus. Using this fluid scaling we plot
an approximation for X(t) using X(t) ≈ nX̄(t) in Figure 8.19. To actually com-
pare against a sample path of X(t), we also plot a simulated sample path of
X(t) in that same figure (see the jagged line). The smooth line in that figure
corresponds to nX̄(t). The crucial thing to realize is that figure would not
change if a different n was selected. Notice how remarkably closely the simulated graph follows the fluid limit, giving us confidence that the approximation is performing well.
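The numerical integration used above can be sketched in a few lines. Since the choice of n cancels, one can integrate directly for nX̄(t) using dX/dt = λt − μ min{st, X} with the Table 8.2 data (the step size is an arbitrary choice):

```python
# Forward-Euler integration of the fluid limit (Equation 8.22) for the
# call center of Table 8.2: dX/dt = lam_t - mu*min(s_t, X), mu = 4,
# X(0) = 80; the scaling factor n cancels. Step size is illustrative.
mu, dt = 4.0, 1e-4
lam = [400, 440, 500, 720, 800, 720, 600, 400]   # hourly arrival rates
srv = [110, 120, 130, 170, 220, 200, 140, 120]   # hourly staffing levels

x, t, path = 80.0, 0.0, [80.0]
while t < 8.0 - 1e-12:
    hour = min(int(t), 7)
    x += (lam[hour] - mu * min(srv[hour], x)) * dt
    t += dt
    path.append(x)
print(round(x, 1), round(max(path), 1))  # X(8) and the peak congestion
```

The trajectory climbs through the overloaded intervals (3, 4] and (6, 7] and peaks near the end of hour 7, consistent with Figure 8.19.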
However, the next thing to check is whether E[X(t)] is close to X̄(t).
For that consider Figure 8.20. The smooth line is nX̄(t) which is identical
to that in Figure 8.19. By performing 100 simulations E[X(t)] can be esti-
mated for every t. The estimated E[X(t)] is the jagged line in Figure 8.20.
The two graphs are incredibly close suggesting that the approximation is
fairly reasonable. There are three ways this fit could improve even further: (1) if
λt and st were much higher; (2) if we used a parametric approach instead
of Lipschitz to resolve Equation 8.19; and (3) if we used well over 100 replications.
FIGURE 8.19
Number in the system after fluid scaling (smooth line) vs. single simulation sample path (jagged
line).
FIGURE 8.20
Mean number in the system using 100 replications of simulation (jagged line) vs. fluid
approximation (smooth line).
X(t) (defined in Equation 8.15) like we did in the previous section. Now, for
the distribution of Xn (t), we consider the “usual” diffusion scaling. Define
the scaled process {X̂n (t), t ≥ 0} where X̂n (t) is given by
X̂n(t) = √n (Xn(t) − X̄(t)).
Besides the assumptions made in the previous section, we also require that
F satisfies
dF(t, x)/dx ≤ M,
for some finite M and 0 ≤ t ≤ T. Kurtz [71] shows that under those conditions
X̂(t) = Σᵢ₌₁ᵏ li ∫₀ᵗ √(fi(s, X̄(s))) dWi(s) + ∫₀ᵗ F′(s, X̄(s)) X̂(s) ds,  (8.23)

where the Wi(·)'s are independent standard Brownian motions, and F′(t, x) = dF(t, x)/dx.
It is crucial to note that this result requires that F(t, x) is differentiable every-
where with respect to x but that is often not satisfied in many queueing
models. There are a few ways to get around this, which is the key fine-tuning
by Mandelbaum et al. [77] that we alluded to earlier. However, here we
take the approach in Mandelbaum et al. [78] which states that as long as the
deterministic process X̄(t) does not linger around the nondifferentiable point
or points, Equation 8.23 would be good to use.
With that understanding we now describe the diffusion approximation
for Xn (t). For that we use the result in Ethier and Kurtz [33] which states
that if X̂(0) is a constant or a Gaussian random variable, then {X̂(t), t ≥ 0} is a
Gaussian process. Since a Gaussian process is characterized by its mean and
variance, we only truly require the mean and variance of X̂(t) which can be
obtained from Equation 8.23. We will subsequently use that but for now note
that we have a diffusion approximation. In particular, for large n,

Xn(t) ≈ X̄(t) + X̂(t)/√n.  (8.24)

Thus we only require E[Xn(t)] and Var[Xn(t)] to characterize Xn(t).
Assuming that X̄(0) = Xn(0) = X(0)/n where X(0) is a deterministic known
constant quantity, we can see that X̂(0) = 0. Further, by taking the expected
value and variance of approximate Equation 8.24, we get
E[Xn(t)] ≈ X̄(t) + E[X̂(t)]/√n,  (8.25)
Var[Xn(t)] ≈ Var[X̂(t)]/n.  (8.26)
However, we had shown earlier that E[Xn(t)] → X̄(t) as n → ∞. Using that or
by showing E[X̂(t)] = 0 for all t since X̂(0) = 0 using Equation 8.23, we can say
that E[Xn (t)] ≈ X̄(t). Now, for Var[X̂(t)] we use the result in Arnold [6] for
linear stochastic differential equations. In particular by taking the derivative
with respect to t of Equation 8.23 and using the result in Arnold [6], we get
Var[X̂(t)] as the solution to the differential equation:
dVar[X̂(t)]/dt = Σᵢ₌₁ᵏ li² fi(t, X̄(t)) + 2F′(t, X̄(t)) Var[X̂(t)],  (8.27)
with initial condition Var[X̂(0)] = 0. Once we solve for this ordinary differ-
ential equation, we can obtain Var[X̂(t)] which we can use in Equation 8.26
to get Var[Xn (t)] and subsequently Var[X(t)]. We illustrate this by means of
an example, next.
Problem 84
Consider the Mt /M/st system that models an inbound call center described
in Problem 83. This is a continuation of that problem and it is critical to
go over that before proceeding ahead. Using the results of Problem 83,
develop a diffusion model for that system. Then, obtain an approximation
for Var[X(t)] and compare against simulations by creating 100 replications
and obtaining sample variances.
Solution
Several of the details are based on the solution to Problem 83. Recall X(t), the
number in the system at time t is described in Equation 8.21 as
X(t) = X(0) + Y1(∫₀ᵗ λu du) − Y2(∫₀ᵗ μ min{su, X(u)} du),
where Y1 (·) and Y2 (·) are the nonhomogeneous Poisson arrival and depar-
ture processes, respectively. For some large n and all t ∈ [0, 8], let
at = λt/n  and  rt = st/n.
dVar[X̂(t)]/dt = at + μ min{rt, X̄(t)} − 2 I(rt ≥ X̄(t)) μ Var[X̂(t)],
where the indicator function I(A) is one if A is true and zero if A is false. This
ordinary differential equation can be solved by numerically integrating it to
get Var[X̂(t)] for all t in 0 ≤ t ≤ 8.
From Equation 8.26 we have Var[Xn (t)] ≈ Var[X̂(t)]/n and from an earlier
equation we know Var[X(t)] = n2 Var[Xn (t)]. Hence we get the approximation
Var[X(t)] ≈ nVar[X̂(t)].
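The fluid and variance equations can be integrated jointly. Substituting at = λt/n, rt = st/n, and W(t) = nVar[X̂(t)] ≈ Var[X(t)] shows that n again drops out, leaving dW/dt = λt + μ min{st, X} − 2I(st ≥ X)μW with W(0) = 0 (our rearrangement of the equation above). A forward-Euler sketch with the Table 8.2 data:

```python
# Joint forward-Euler pass for the fluid limit X(t) and the variance
# approximation W(t) = Var[X(t)], Problem 84. In unscaled terms:
#   dX/dt = lam_t - mu*min(s_t, X)
#   dW/dt = lam_t + mu*min(s_t, X) - 2*I(s_t >= X)*mu*W,  W(0) = 0,
# so the choice of n drops out. Step size is an illustrative choice.
mu, dt = 4.0, 1e-4
lam = [400, 440, 500, 720, 800, 720, 600, 400]
srv = [110, 120, 130, 170, 220, 200, 140, 120]

x, w, t, wpath = 80.0, 0.0, 0.0, [0.0]
while t < 8.0 - 1e-12:
    h = min(int(t), 7)
    ind = 1.0 if srv[h] >= x else 0.0    # indicator I(s_t >= X(t))
    dx = lam[h] - mu * min(srv[h], x)
    dw = lam[h] + mu * min(srv[h], x) - 2.0 * ind * mu * w
    x, w, t = x + dx * dt, w + dw * dt, t + dt
    wpath.append(w)
print(round(w), round(max(wpath)))       # Var[X(8)] and its running peak
```

The variance grows steeply during the overloaded hours, where the indicator is zero and the damping term vanishes, reaching levels of the order seen in Figure 8.21.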
FIGURE 8.21
Variance of number in system using 100 replications of simulation (jagged line) vs. diffusion
approximation (smooth line).
Reference Notes
In the last two decades, one of the most actively researched topics in the
analysis-of-queues area is arguably the concept of fluid scaling. Fluid limits
results are based on some phenomenally technical underpinnings that use
stochastic process limits. This chapter does not do any justice in that regard
and interested readers are encouraged to refer to Whitt [105]. The objective of
this chapter was to present some background material that would familiarize
a reader with this approach to analyzing queues using fluid models. The author is
extremely thankful to several colleagues who posted handouts on the world
wide web that were tremendously useful while preparing this material. In
particular, Balaji Prabhakar’s handouts on fluid models, Lyapunov func-
tions, and Foster–Lyapunov criterion; Varun Gupta’s gentle introduction to
fluid and diffusion approximations; Gideon Weiss’ treatment of stability of
fluid networks; and John Hasenbein’s collection of topics on fluid queues
were all immensely useful in developing this manuscript.
As the title suggests, this chapter is divided into three key pieces: stabil-
ity, fluid-diffusion approximations, and time-varying queues. Those topics
have somewhat independently evolved in the literature and usually do not
appear in the same chapter of any book. Thus it is worthwhile describing
them individually from a reference notes standpoint. The topic of stability of multiclass queueing networks is a fascinating one; years ago, most
researchers assumed that the first-order traffic intensity conditions were sufficient for stability. However, in the case of multiclass queueing networks with
reentrant lines using deterministic routing and/or local priorities among
classes, more conditions are necessary for stability due to the virtual station
condition. This is articulated nicely in several books and monographs with
numerous examples. In particular, this chapter benefited greatly from: Chen
and Yao [19] with the clear exposition of fluid limits and all the excellent
multiclass network examples; Dai [25] for describing the fluid networks and
conditions for stability; Meyn [82] for the explanation of how to go about
showing a fluid network is stable; and Bramson [13] for the technical details
and plenty of examples.
Moving on to the next section, it was a somewhat familiar topic for this
book, that is, fluid and diffusion approximations. In fact, we used those
approximations in Chapter 4 without proof. However, only in Chapter 7 did we show, via approximations based on reflected Brownian motion, how one
could get good approximations for general queues. However, those methods were similar to the traditional (Kobayashi [64]) diffusion approximation.
On the other hand, this section provides a diffusion approximation by scal-
ing, in fact first by performing a fluid scaling and subsequently a diffusion
one. There are numerous books and monographs that go into great details
regarding fluid and diffusion scaling. This chapter benefited greatly from
Whitt [105] as well as Chen and Yao [19], especially in terms of construct-
ing scaled processes. However, for an overview of the mathematical details
and example of diffusion processes, Glynn [46] is an excellent resource. This
chapter also benefited from several articles such as Halfin and Whitt [50] as
well as Whitt [106].
The last section of this chapter was on uniform acceleration and strong
approximations. This topic was the focus of the author’s student Young
Myoung Ko’s doctoral dissertation. Most of the materials in that section are
from Young’s thesis, in particular from preliminary results of his papers.
The pioneering work on strong approximations was done by Kurtz [71]. The
concept also appears in the book by Ethier and Kurtz [33]. Subsequently,
Mandelbaum et al. [77] described some of the difficulties in using the strong
approximation results available in the literature due to issues regarding
differentiability while obtaining the diffusion limits. The limits, both fluid
and diffusion, are based on the topic of uniform acceleration that can be
found in Massey and Whitt [79]. Numerical studies that circumvent the dif-
ferentiability requirement can be found in Mandelbaum et al. [78]. Young
Myoung Ko has found ways to significantly improve both fluid and diffusion
approximations so that they are extremely accurate (article forthcoming).
Exercises
8.1 Consider an extension to the Rybko–Stolyar–Kumar–Seidman net-
work in Figure 8.6 with a node C in between A and B. The first flow
gets served in nodes A, C, and then B whereas the second flow has a
reverse order. Priority is given to the second flow in node C, hence in
terms of priority it is identical to that of A. Obtain the condition for
stability and verify using simulations. This network is taken from
Bramson [13] which has the figure and the stability condition.
8.2 Solve Problem 76 assuming that: (i) interarrival times and ser-
vice times are deterministic constants; (ii) interarrival times are the
same as in Problem 76 but service times are according to a gamma
distribution with coefficient of variation 2.
8.3 Show that the virtual station condition described in Section 8.2.1
along with the necessary conditions are sufficient to ensure stabil-
ity for the network in Figure 8.3. Follow a similar argument as
the one for the Rybko–Stolyar–Kumar–Seidman network outlined
in Section 8.2.3.
8.4 Let Z1 , Z2 , Z3 , . . ., be a sequence of IID random variables with finite
mean m and finite variance σ2 . Define Sn as the partial sum
Sn = Z1 + Z2 + · · · + Zn
8.6 Consider an M/M/s queue with arrival rate λ = 10 per minute and
number of servers s = 5. For μ = 2.5, 2, and 1.8 (all per minute) plot
X̂n (t) versus t for t ∈ [0, 1] minutes using n = 2, 20, 200, and 2000,
where X̂n (t) is defined in Section 8.4.2 in terms of Xn (t) and X̄(t).
Use multiple sample paths to illustrate the diffusion approximation.
Also use X(0) = 0.
8.7 Consider a finite population queueing system with s = 50 servers
each capable of serving at rate μ = 5 customers per hour. Assume
service times are exponentially distributed. Also upon service com-
pletion each customer spends an exponential time with mean 1 h
before returning to the queueing system. Assume that there are
400 customers in total but at time 0, the queue is empty. Obtain
an approximation for the mean and variance of the number of cus-
tomers in the system during the first hour. Perform 100 simulations
and evaluate the approximations.
8.8 Consider the following extension to Problem 83. In addition to all the
details in the problem description, say that customers renege from
the queue (that is, abandon before service starts) after an exp(β) if
their service does not begin. Use 1/β = 5 min. Use fluid and diffu-
sion approximations and numerically obtain E[X(t)] and Var[X(t)].
Compare the results by performing 100 simulations.
8.9 Solve the previous problem under the following additional condi-
tion (note that reneging still occurs): some customers access the call
center from their web browser using Internet telephony. These cus-
tomers arrive according to a homogeneous Poisson process at rate
α = 10 per hour. Only these customers also have real time access to
their position in the wait line (e.g., position-1 implies next in line
for service). However, because of that if there are i total customers
waiting for service to begin, then with probability (0.9)i an arriving
customer joins the system (otherwise the customer would balk).
8.10 Consider an Mt /M/st /st queue, that is, there is no waiting room. If
an arriving customer finds all servers busy, then the customer retries
after exp(θ) time. At this time we say that the customer is in an orbit.
Assume that λt alternates between 100 and 120 each hour for a four
hour period. Also μ = 1 and st during the four hour-long slots are: 90,
125, 125, and 150, respectively. Compute approximately the mean
and variance of the number of customers in the queue as well as in
the orbit. Care is to be taken to derive expressions when X(t) is a 2D
vector.
9
Stochastic Fluid-Flow Queues: Characteristics
and Exact Analysis
In the previous chapter, we saw deterministic fluid queues where the flow
rates were mostly constant and toward the end we saw a case where the
rates varied deterministically over time. In this and the next chapter, we
focus on stochastic fluid queues where flow rates are piecewise constant
and vary stochastically over time. We consider only the flow rates from a
countable set. On a completely different note, in some sense the diffusion
limits we saw in previous chapters can be thought of as a case of flow rates
from an uncountable set that are continuously varying (as opposed to being
piecewise constant). Thus from a big-picture standpoint, metaphorically the
models in this chapter fall somewhere between the deterministically time-
varying fluid queues and diffusion queues. However, here we will not be
presenting any formal scaling of any discrete queueing system to result in
these fluid queues. We focus purely on performance analysis of these queues
to obtain workload distributions.
For the performance analysis we start by describing a queueing system
where the entities are fluids. For example, a sink in a kitchen or bathroom
can be used to explain the nuances. Say there is a fictitious tap or faucet that
has a countable number of settings (as opposed to a continuous set which is
usually found in practice). At each discrete setting, water flows into the sink
at a particular rate. Typically the sojourn time in each setting is random and
the setting changes stochastically over time. This results in a piecewise con-
stant flow rate that changes randomly over time. The sink itself is the queue
or buffer that holds fluid (in this case water), which flows into it. The drain
is analogous to a server that empties the fluid off the sink (however, unlike
a real bathtub or sink, here we assume the drainage rates are not affected by
the weight of the fluid). For our performance analysis, we assume that we
know the stochastic process that governs the input to the sink as well as the
drainage. Using that our aim is to obtain the probability distribution of the
amount of fluid in the sink.
Naturally, these models can be used in hydrology such as analyzing
dams, reservoirs, and water bodies, as well as in process industries such as
chemicals and petrochemicals. However, a majority of the results presented
here have been motivated by applications in computer, communication, and
information systems. Interestingly these are truly discrete systems, but there
are so many discrete entities that flow in an extremely small amount of
9.1 Introduction
The objective of this section is to provide some introductory remarks regard-
ing stochastic fluid-flow queues. We begin by contrasting fluid-flow queues
against discrete queues to understand their fundamental differences as well
as underlying assumptions. Once we put things in perspective, we describe
some more applications and elaborate on others described previously. Then
we go over some preliminary material such as inputs that go into the per-
formance analysis as well as a description of the condition for stability.
We conclude the section with a characterization of the stochastic flow-rate
processes that govern the flow of fluids.
FIGURE 9.1
Comparing workloads in (a) discrete and (b) fluid queues.
The key difference between discrete queues and fluid queues is that in
the discrete case the entire file arrives instantaneously, thus creating a jump
or discontinuity in the workload process. Whereas in the fluid case, the file
arrives gradually over time at a certain flow rate. Therefore, the workload
process would not have jumps in fluid queues, that is, they are continuous.
For both the discrete and the continuous cases, let X(t) be the amount of fluid
(i.e., workload) in the buffer at time t. Figure 9.1 illustrates this difference
between the X(t) process versus t in the discrete and fluid queues. In particular, it is crucial to point out that although in practice the entire file does
not actually arrive instantaneously, in the discrete model (Figure 9.1a) we
essentially take the arrival time of the file (A1, A2, . . .) to be when the entire
file has completely arrived at the gateway.
In many systems, this is a necessity as the entire entity is necessary for
processing to begin. Notice that with every arrival, the workload jumps up
by an amount equal to the file size. The workload is depleted at rate c. We
have seen such a workload process in the discrete queues; the only differences here are that (i) the notation for X(t) is not what we used for the discrete
case; and (ii) the workload depletion rate is c and not 1 as is usually done.
However, what enables fluid model analysis is the fact that the gateway does
not have to wait for the entire file to arrive. As soon as the first packet of the
file arrives, it sends it off without waiting for the whole file to arrive. Alter-
natively one could think of the discrete queue as a “bulk” arrival of a batch of
packets whereas in the fluid queue this batch arrives slowly over time. Since
the batch arrives back to back, modeling at the “packet” level is tricky. In a
similar fashion, although not represented in the workload process, we con-
sider a discrete entity’s departure from the system when all of its service is
completed which is not the case for fluids. In fact thus the concept of sojourn
times at the granularity of a whole file is not so easy in fluid queues.
Now we describe the workload process for the fluid queue. In particular,
refer to Figure 9.1b. We assume that information flows into the system as
an on–off fluid. We will explain on–off fluid subsequently, however, for the
purposes of this discussion it would suffice to think of an “on” time as when
there is one or more files back to back that gets processed by the server at
rate c. When there is information, it flows in at rate R. However, when there
is no information flow, we call that period as “off.” From the figure, notice
that the workload gradually increases at rate R − c when the source is on.
It is because fluid enters at rate R and is removed at rate c, resulting in an
effective growth rate of R − c. Also, when the fluid entry is off, the workload
reduces at rate c (provided there is workload to be processed, otherwise it
would be zero). Notice that in Figure 9.1 there is no relationship between the
discrete case’s arrival times and file sizes against the fluid case’s on and off
times.
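To preview the kind of quantity this chapter computes, the on–off dynamics just described can be simulated directly. For a single source with exp(α) on times, exp(β) off times, input rate R when on, and drain rate c, a classical exact result of the type derived later in this chapter states that in steady state P(X > x) = p_on(R/c)e^{zx}, where p_on = β/(α + β) and z = β/c − α/(R − c) (valid when p_on R < c). The event-driven sketch below (all parameter values are our illustrative choices) estimates the left-hand side empirically and compares:

```python
# Event-driven simulation of a single on-off fluid source feeding a
# buffer drained at rate c. On times ~ exp(alpha), off times ~ exp(beta),
# input rate R while on. All parameter values are illustrative.
import math
import random

alpha, beta = 1.0, 0.5         # 1/(mean on time), 1/(mean off time)
R, c, xlev = 2.0, 1.0, 1.0     # input rate, drain rate, threshold level
rng = random.Random(7)

x, T, above = 0.0, 0.0, 0.0    # buffer level, clock, time spent above xlev
for _ in range(200_000):
    d = rng.expovariate(alpha)              # on period: X rises at R - c
    if x >= xlev:
        above += d
    elif x + (R - c) * d > xlev:
        above += d - (xlev - x) / (R - c)
    x += (R - c) * d
    T += d
    d = rng.expovariate(beta)               # off period: X falls at c, floor 0
    above += min(d, max(0.0, (x - xlev) / c))
    x = max(0.0, x - c * d)
    T += d

emp = above / T                 # empirical P(X > xlev)
p_on = beta / (alpha + beta)
z = beta / c - alpha / (R - c)
exact = p_on * (R / c) * math.exp(z * xlev)
print(round(emp, 3), round(exact, 3))
```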
In summary, fluid-flow models are applicable in systems where the
workload arrives gradually over time (as opposed to instantaneously) and
we are interested in aggregate performance such as workload distribution.
In most cases the entities are themselves either fluids or can be approximated
as fluids. In these situations the analysis based on fluid models is extremely
accurate, whereas discrete models can be way off.
In particular, when the discrete entities arrive in a back-to-back fashion, it is
extremely conducive to model them as fluids as opposed to discrete point
masses with constant interarrival times. We will next see some applications
where fluid models would be appropriate for their analysis.
9.1.2 Applications
Here we present some scenarios where stochastic fluid-flow models can be
used for performance analysis. The idea is to give a flavor for the kind of
systems that can be modeled using fluid queues. We begin by presenting
some examples from computer and communication systems where informa-
tion flow is modeled as fluids. Then we present an example from supply
chain, followed by one in hydrology and finally a transportation setting. As
described earlier, most of the fluid model results have been motivated by
applications in computer-communication networks. Thus it is worthwhile to
describe a few examples at different granularities of space, size, and time.
However, what is common in all the cases is that entities arrive in a bursty
fashion, that is, things happen in bursts at different rates as opposed to one
by one in a fairly uniform fashion.
For example, consider a CPU on a computer that processes tasks from a
software agent that is part of a multi-agent system. An agent is a complex
piece of software that can autonomously send, receive, and process information on
behalf of a user. The agent sends tasks to its CPUs to aid in decision-making.
In Aggarwal et al. [3] we consider software agents that are hosted on comput-
ers that perform military logistic operations. We show (see the first figure in
Aggarwal et al. [3]) that the tasks are indeed generated in bursts by the agents
and submitted to the CPU for processing. The bursty nature is due to the fact
that the agent receives a trigger externally and submits a set of jobs to the
CPU to determine its course of action. The CPU does not wait for all the tasks
of the agent to arrive to begin processing but processes them as they arrive.
It is important to notice that these tasks are fairly atomic in nature and can
be processed independently and rather quickly, with roughly similar processing
times. This makes the system ideal to analyze as a fluid queue.
Fluid models have been successfully used in modeling packet-level traf-
fic in both wired (also called wire-line) and wireless networks. Irrespective
of whether it is end systems such as computers and servers, or inside the
network such as routers, switches, and relays, or both such as multi-hop
wireless nodes and sensors, information flow in the form of packets can
be nicely modeled as fluids. In all these systems information stochastically
flows into buffers in the form of tiny packets. These packets are processed
and the stored packets are forwarded to a downstream node in the network.
This process, called store-and-forward, results in an extremely efficient
network. Contrast this to the case where entire files are transferred hop
by hop as a whole (as opposed to packetizing them); a significant amount
of time would be wasted just waiting for entire files to arrive. In the store-
and-forward case, some packets of a file would already be at the destination
while other packets are still at the origin, even if the origin and destination are
in two extremes of the world.
At a much coarser granularity, consider users that access a server farm for
web or other application processing. The users enter a session and within a
session they send requests (usually through browsers for web applications)
and receive responses. The users alternate between periods of activity and
quiet times during a session. Also the servers can process requests independently
of other requests. Thus within a session requests arrive in a bursty
fashion much like a bunch of packets that are part of a file in the previous
example. These requests are stored in a buffer and processed one by one by
the server. One can model each user as a source that stochastically toggles
between bursting requests and idling; this can nicely be analyzed
using fluid queues. In summary, there are several computer and communication
systems that can be modeled using fluid queues. The key elements
are: bursty traffic, the ability to process smaller elements of the traffic, and
finally (although not emphasized earlier) the smaller elements must have
FIGURE 9.2
Buffer with environment process Z(t) and output capacity c. (From Gautam, N., Quality of
service metrics, in Frontiers in Distributed Sensor Networks, S.S. Iyengar and R.R. Brooks, Eds.,
Chapman & Hall/CRC Press, Boca Raton, FL, 2004, pp. 613–628. With permission.)
and Rolski [66]. In other words, the buffer content process {X(t), t ≥ 0}
(when B = ∞) is stable if the mean traffic arrival rate in steady state is less
than c. The on times (or “up” times) are according to a general distribution
with CDF U(·), and the mean on time τU can be calculated as

τU = ∫₀^∞ t dU(t).
Likewise, the off times (or “down” times) are according to a general distribution
with CDF D(·). The mean off time τD can be calculated in a similar
manner as

τD = ∫₀^∞ t dD(t).
For the rest of this book we assume that the CDFs U(·) and D(·) are such
that we can either compute their LSTs directly or they can be suitably
approximated as phase-type distributions whose LSTs can be computed.
When the buffer size B = ∞, the system would be stable if

rτU/(τU + τD) < c.
Let

Zn = Z(Sn+).

Note that {Zn, n ≥ 0} is a DTMC, which is embedded in the SMP. Assume that
this DTMC is irreducible and recurrent with transition probability matrix
P = G(∞). Let

Gi(x) = P{S1 ≤ x | Z0 = i} = Σ_{j=1}^ℓ Gij(x)

and let

πi = lim_{n→∞} P{Zn = i},

which can be obtained by solving

[π1 π2 . . . πℓ] = [π1 π2 . . . πℓ]P and Σ_{i=1}^ℓ πi = 1.

Then, with τi denoting the mean sojourn time of the SMP in state i,

pi = lim_{t→∞} P{Z(t) = i} = πi τi / (Σ_{m=1}^ℓ πm τm),

and the stability condition can be written as

Σ_{i=1}^ℓ pi r(i) < c.
With that description we are now ready for performance analysis of fluid
queues.
S = {1, 2, . . . , ℓ}. The number of states is finite, that is, ℓ < ∞. The infinitesimal
generator matrix for the CTMC {Z(t), t ≥ 0} is Q = [qij], which is an ℓ × ℓ
matrix. Let pi be the steady-state probability that the environment is in state
i, that is,

pi = lim_{t→∞} P{Z(t) = i}

for all i ∈ S. Since the CTMC is ergodic, we can compute the steady-state
probability row vector p = [p1 p2 . . . pℓ] by solving

pQ = [0 0 . . . 0] and Σ_{i=1}^ℓ pi = 1.
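Numerically, solving pQ = [0 0 . . . 0] together with Σ_{i=1}^ℓ pi = 1 amounts to replacing one (redundant) balance equation by the normalization. The following is a minimal sketch of ours (the function name and the two-state generator are illustrative, not from the text):

```python
import numpy as np

def ctmc_stationary(Q):
    """Solve pQ = 0 with sum(p) = 1 for an ergodic CTMC generator Q."""
    n = Q.shape[0]
    A = Q.T.copy()      # each column of Q gives one balance equation
    A[-1, :] = 1.0      # one equation is redundant (rows of Q sum to 0),
    b = np.zeros(n)     # so overwrite it with the normalization
    b[-1] = 1.0
    return np.linalg.solve(A, b)

# Illustrative two-state on-off chain: off -> on at rate beta, on -> off at alpha
alpha, beta = 1.0, 2.0
Q = np.array([[-beta, beta],
              [alpha, -alpha]])
p = ctmc_stationary(Q)  # equals [alpha, beta] / (alpha + beta)
```

The balance equations always sum to zero (since each row of Q sums to zero), which is why one of them can safely be replaced by the normalization.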
Having described the buffer and its input, next we consider the output.
The output capacity of the buffer is c. This means that whenever there is fluid
in the buffer it gets removed at rate c. However, if the buffer is empty and
the input rate is smaller than c, then the output rate would be the same as the
input rate. For that reason it is called output capacity, as the actual output
rate could be smaller than c. The units of c are the same as that of r(Z(t)).
Thus both would be in terms of liters per second or bytes per second, etc.
The term output capacity is also sometimes referred to as channel capacity
or processor capacity. Unless stated otherwise, we assume that c remains a
constant. Before proceeding, it may be worthwhile to familiarize yourself with
the input, buffer, and output using Figure 9.2.
Next we describe the buffer contents and its dynamics. Let X(t) be the
amount of fluid in the buffer at time t. We first assume that B = ∞ (and later
relax that assumption). Whenever X(t) > 0 and Z(t) = i, X(t) either increases
at rate r(i) − c or decreases at rate c − r(i) depending on whether r(i) is greater
or lesser than c, respectively. To capture that we define the drift d(i) when
the CTMC {Z(t), t ≥ 0} is in state i (i.e., Z(t) = i) as
d(i) = r(i) − c

so that

dX(t)/dt = d(i)
if X(t) > 0 and Z(t) = i for any t ≥ 0. Next, when X(t) = 0 it stays at 0 as long
as the drift is non-positive, that is,
dX(t)/dt = 0
if X(t) = 0, Z(t) = i, and r(i) ≤ c for any t ≥ 0. However, as soon as the drift
becomes positive, the buffer contents would start increasing from 0.
Given the dynamics of X(t), a natural question to ask is if X(t) would drift
off to infinity (considering B = ∞). As it turns out, based on Equation 9.1, we
can write down the stability condition as
Σ_{i=1}^ℓ r(i)pi < c.    (9.2)
In other words, the LHS of this expression is the steady-state average input
rate by conditioning on the state of the environment and unconditioning. We
need the average input rate to be smaller than the output capacity. Another
way of stating the stability condition is
Σ_{i=1}^ℓ d(i)pi < 0,

where the drift matrix is defined as

D = diag[d(i)].
TABLE 9.1
List of Notations

B      Buffer size (default is B = ∞)
Z(t)   State of environment (CTMC) modulating buffer input at time t
S      State space of CTMC {Z(t), t ≥ 0}
ℓ      Number of states in S, i.e., ℓ = |S| and is finite
qij    Transition rate from state i to j in CTMC {Z(t), t ≥ 0}
Q      Infinitesimal generator matrix, i.e., Q = [qij]
pi     Stationary probability for the ergodic CTMC {Z(t), t ≥ 0}
r(i)   Fluid arrival rate when Z(t) = i
R      Rate matrix, i.e., R = diag[r(i)]
c      Output capacity of the buffer
d(i)   Drift in state Z(t) = i, i.e., d(i) = r(i) − c
D      Drift matrix, i.e., D = diag[d(i)] = R − cI
X(t)   Amount of fluid in the buffer at time t
R = diag[r(i)]
It is not hard to see that the CTMC is ergodic with p = [0.0668 0.2647 0.4118
0.2567]. The fluid arrival rates in states 1, 2, 3, and 4 are 20, 15, 10, and 5 kbps,
respectively. In other words, r(1) = 20, r(2) = 15, r(3) = 10, and r(4) = 5 with
    ⎡ 20   0   0  0 ⎤
R = ⎢  0  15   0  0 ⎥ .
    ⎢  0   0  10  0 ⎥
    ⎣  0   0   0  5 ⎦
Notice that the system is stable since Σ_{i=1}^4 r(i)pi = 10.7086, which is less
than c = 12. For this numerical example, a sample path of Z(t) and X(t) is
depicted in Figure 9.3(a) with X(0) = 0 and Z(0) = 1. Notice that when the
drift is positive, fluid increases in the buffer and when the drift is negative,
fluid is nonincreasing. It is also important to pay attention to the slopes
(although not drawn to scale) as they correspond to the drifts.
Now we consider the case when the buffer size B is finite, that is, B < ∞.
Most of what we have described thus far holds. We just state the differences
here. In particular, when the buffer is full, that is, X(t) = B, if the drift is pos-
itive, then fluid enters the buffer at rate c and a fraction of fluid is dropped
at rate r(Z(t)) − c. This would result in the X(t) process staying at B until the
drift becomes negative. However, when X(t) < B, the dynamics are identical
FIGURE 9.3
(a) Sample path of environment process and buffer contents when B = ∞ and (b) Sample paths
when B < ∞.
to the infinite size case. The only other difference is that the system is always
stable when B < ∞; hence the stability condition described earlier is irrelevant.
To illustrate the B < ∞ case, we draw a sample path of Z(t) and X(t) for the
same example considered earlier, that is, c = 12 kbps and {Z(t), t ≥ 0} has
ℓ = 4 states, S = {1, 2, 3, 4}, and
    ⎡ −10   2   3   5 ⎤
Q = ⎢   0  −4   1   3 ⎥
    ⎢   1   1  −3   1 ⎥
    ⎣   1   2   3  −6 ⎦
with r(1) = 20, r(2) = 15, r(3) = 10, and r(4) = 5. The sample path is illustrated
in Figure 9.3(b). Notice that for the sake of comparison, the Z(t) sample
paths are identical in Figures 9.3(a) and (b). However, when the X(t) process
reaches B in Figure 9.3(b), it stays flat till the system switches to a negative
drift state.
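As a numerical sanity check of this example (a sketch of ours using numpy, not part of the text), one can recompute p from Q and verify the stability condition:

```python
import numpy as np

# Generator and fluid arrival rates of the numerical example (c = 12 kbps)
Q = np.array([[-10.0, 2.0, 3.0, 5.0],
              [0.0, -4.0, 1.0, 3.0],
              [1.0, 1.0, -3.0, 1.0],
              [1.0, 2.0, 3.0, -6.0]])
r = np.array([20.0, 15.0, 10.0, 5.0])
c = 12.0

A = Q.T.copy()
A[-1, :] = 1.0               # replace one balance equation by sum(p) = 1
b = np.zeros(4)
b[-1] = 1.0
p = np.linalg.solve(A, b)    # approximately [0.0668 0.2647 0.4118 0.2567]

mean_input = p @ r           # steady-state average input rate
stable = mean_input < c      # should hold: 10.7086 < 12
```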
The next step is to analyze the process {X(t), t ≥ 0}. In other words, we
would like to capture the dynamics of X(t) and characterize it by deriving a
probability distribution for X(t) at least as t → ∞. For that it is important to
notice that unlike most of the random variables we have seen thus far, X(t) is
a mixture of discrete and continuous parts. Notice from Figures 9.3(a) and (b)
that X(t) has a mass (i.e., takes on a discrete value) at 0. Also if B < ∞, then
X(t) has a mass at B as well. Everywhere else X(t) takes on continuous values.
Thus we will see that X(t) will have a mixture of discrete and continuous parts
with a mass at 0 and possibly at B (if B < ∞). With that in mind we proceed
with analyzing the dynamics of X(t) next.
We seek to obtain an analytical expression for Fj (t, x) for all j ∈ S. For that we
consider some j ∈ S and derive the following expressions:
Using the definition of Fj(t, x) in Equation 9.3 we can then write down this
equation as

(Fj(t + h, x) − Fj(t, x − (r(j) − c)h))/h = qjj Fj(t, x − (r(j) − c)h) + Σ_{i∈S,i≠j} qij Fi(t, x − (r(i) − c)h) + o(h)/h.
Now we take the limit as h → 0 and the above equation results in the
following partial differential equation:

∂Fj(t, x)/∂t + (r(j) − c) ∂Fj(t, x)/∂x = Σ_{i∈S} qij Fi(t, x).    (9.8)

Define the row vector F(t, x) = [F1(t, x) F2(t, x) . . . Fℓ(t, x)].    (9.9)
Then the vector F(t, x) satisfies the following partial differential equation
∂F(t, x)/∂t + (∂F(t, x)/∂x) D = F(t, x)Q,    (9.10)
where D is the drift matrix. Verify that the jth vector element of the equation
is identical to that of Equation 9.8. Having described the partial differential
equation, the next step is to write down the initial and boundary conditions.
We assume that X(0) = x0 and Z(0) = z0 for some given finite and allowable
x0 and z0. Hence the initial conditions for all j ∈ S are

Fj(0, x) = 1 if j = z0 and x ≥ x0, and Fj(0, x) = 0 otherwise.
The first boundary condition is that, for all j with d(j) > 0, P{X(t) = 0, Z(t) = j} = 0,
which is the same as Fj(t, 0) = 0 since X(t) is nonnegative. Notice that if the drift is
negative at time t (i.e., Z(t) = j and d(j) < 0), there is a nonzero probability of
having X(t) = 0. In other words, X(t) has a mass at zero and when X(t) = 0
the drift is negative.
The second boundary condition is a little more involved. However, it
applies only when the buffer size is finite, that is, B < ∞. Just like how X(t)
has a mass at zero, it would also have a mass at B, that is, P{X(t) = B} would
be non-zero if the drift at time t is positive, that is, d(Z(t)) > 0. Thus as x
approaches B from below, Fj(t, x) governed by the partial differential
equation would not include the mass at B and would truly be representing
P{X(t) < B, Z(t) = j}. However, if the drift is negative, there would be no mass
at B for the X(t) process. Thus if d(j) < 0, P{X(t) < B, Z(t) = j} would be equal
to P{X(t) ≤ B, Z(t) = j} which would just be P{Z(t) = j} since X(t) is bounded
by B. When d(j) > 0 we will have Fj (t, B) + P{X(t) = B, Z(t) = j} = P{Z(t) = j}.
For that reason we do not have the second boundary condition for all j but
only for d(j) < 0. We will revisit this case subsequently through an example.
But it is worthwhile pointing out that this indeed was not an issue when
X(t) = 0 but only when X(t) = B because of the way the CDF is defined as a
right-continuous function.
That said, now we have a partial differential equation for the unknown
vector F(t, x) with initial and boundary conditions. The next step is to solve
it and obtain F(t, x). There are two approaches. One is to use a numer-
ical approach which is effective when numerical values are available for
all the parameters. There are software packages that can solve such partial
differential equations. The second approach is to analytically solve for F(t, x)
that we explain briefly. Let F̃j (w, x) be the LST of Fj (t, x) defined as
F̃j(w, x) = ∫₀^∞ e^{−wt} ∂Fj(t, x)/∂t dt.
Also, the row vector of LSTs is just the LST of the individual elements; hence

wF̃(w, x) − wF(0, x) + (dF̃(w, x)/dx) D = F̃(w, x)Q,

which is an ordinary differential equation in x.
We would like to obtain a closed-form analytical expression for Fj(x) for all
j ∈ S. For that we require the system to be stable if B = ∞, and the condition
of stability is

Σ_{i∈S} pi d(i) < 0.

Since

lim_{t→∞} ∂Fj(t, x)/∂t = 0,

the partial differential equation reduces in steady state to the ordinary
differential equation

(dF(x)/dx) D = F(x)Q,    (9.12)
where F(x) = [F1(x) F2(x) . . . Fℓ(x)] with Fj(x) = lim_{t→∞} Fj(t, x).
Since this is steady-state analysis, initial conditions would not matter. The
boundary conditions reduce to Fj(0) = 0 if d(j) > 0 and, when B < ∞,
Fj(B) = pj if d(j) < 0. To solve the ordinary differential equation

(dF(x)/dx) D = F(x)Q,

we try as solution

F(x) = e^{λx} φ,
where φ is a 1 × ℓ row vector. The solution F(x) = e^{λx} φ works if and only if

φ(λD − Q) = [0 0 . . . 0],

since (dF(x)/dx) D would be φλD e^{λx} and F(x)Q would be φQ e^{λx}. Essentially,
φ is a left eigenvector corresponding to an eigenvalue λ satisfying the
characteristic equation

det(λD − Q) = 0,    (9.14)
where det(A) is the determinant of square matrix A. Upon solving the equa-
tion, we would get the eigenvalues λ. Then using φ(λD − Q) = [0 0 . . . 0]
for each solution λ, we can obtain the corresponding left eigenvectors φ.
Before forging ahead, we describe some properties and notation. We
partition the state space S into three sets, S+, S0, and S−, that denote the
states where the drift is positive, zero, and negative, respectively. Also, ℓ+,
ℓ0, and ℓ− are the number of states with positive, zero, and negative drift,
respectively, such that ℓ+ + ℓ0 + ℓ− = ℓ. Thus we have

S+ = {i ∈ S : d(i) > 0},
S0 = {i ∈ S : d(i) = 0},
S− = {i ∈ S : d(i) < 0},
ℓ+ = |S+|, ℓ0 = |S0|, ℓ− = |S−|.
Using that we can write down some properties. Firstly notice that Equation
9.14 would have ℓ+ + ℓ− solutions {λi, i = 1, 2, . . . , ℓ+ + ℓ−}, which
could include multiplicities. The crucial property is that when
Σ_{i∈S} pi d(i) < 0, exactly ℓ+ of the λi values have negative real parts, one is
zero, and ℓ− − 1 have positive real parts.
For sake of convenience we number the λi’s so that

Re(λ1) ≤ Re(λ2) ≤ · · · ≤ Re(λ_{ℓ+}) < λ_{ℓ++1} = 0 < Re(λ_{ℓ++2}) ≤ · · · ≤ Re(λ_{ℓ++ℓ−}),    (9.15)

where Re(ω) is the real part of a complex number ω. Using this specific
order we are now ready to state the general solution to the differential
equation (9.12) as

F(x) = Σ_{i=1}^{ℓ++ℓ−} ai e^{λi x} φi,    (9.16)
where ai values are some constants that need to be obtained (recall that we
know how to compute λi and φi for all i).
To compute ai values, we explicitly consider two cases depending on
whether the size of the buffer is infinite or finite. Hence we have the
following:
• If B = ∞ with Σ_{i∈S} pi d(i) < 0, then ai = 0 for all i > ℓ+ + 1 (so that
F(x) remains bounded), and the remaining ai values are given by the
solution to

a_{ℓ++1} = 1/(φ_{ℓ++1} 1),    (9.18)

Σ_{i=1}^{ℓ++1} ai φi(j) = 0 if j ∈ S+,    (9.19)

where 1 is an ℓ × 1 column vector of ones.
• If B < ∞, then the ai values are given by the solution to

Σ_{i=1}^{ℓ++ℓ−} ai φi(j) = 0 if j ∈ S+,    (9.20)

Σ_{i=1}^{ℓ++ℓ−} ai φi(j) e^{λi B} = pj if j ∈ S−,    (9.21)
where φi(j) is the jth element of vector φi. Equation 9.20 is due to the
boundary condition Fj(0) = 0 if d(j) > 0 for all j ∈ S, which is equivalent
to Σ_{i=1}^{ℓ++ℓ−} ai φi(j) = 0 if j ∈ S+ since the elements in S+ are all
those with positive drift, that is, d(j) > 0. Likewise, Equation 9.21 is
due to the boundary condition Fj(B) = pj if d(j) < 0 for all j ∈ S, which
is equivalent to Σ_{i=1}^{ℓ++ℓ−} ai φi(j)e^{λi B} = pj if j ∈ S− since the elements in
S− are all those with negative drift, that is, d(j) < 0.
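The eigenvalues λi and left eigenvectors φi can be computed numerically, since φ(λD − Q) = [0 0 . . . 0] is equivalent to the generalized eigenproblem φQ = λφD. A sketch of ours for the four-state example of Section 9.2.1, using scipy:

```python
import numpy as np
from scipy.linalg import eig

Q = np.array([[-10.0, 2.0, 3.0, 5.0],
              [0.0, -4.0, 1.0, 3.0],
              [1.0, 1.0, -3.0, 1.0],
              [1.0, 2.0, 3.0, -6.0]])
D = np.diag([8.0, 3.0, -2.0, -7.0])   # drift matrix for c = 12

# phi (lambda D - Q) = 0  <=>  Q^T w = lambda D^T w  with  phi = w^T
lam, W = eig(Q.T, D.T)
order = np.argsort(lam.real)          # order as in Equation 9.15
lam = lam[order].real                 # eigenvalues are real in this example
phi = W[:, order].T.real              # rows are the left eigenvectors phi_i
```

With ℓ+ = 2 and ℓ− = 2 here, two eigenvalues have negative real parts, one is zero, and ℓ− − 1 = 1 is positive, as the theory above predicts.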
9.2.4 Examples
In this section, we present a few examples to illustrate the approach to obtain
buffer content distribution and describe relevant insights. For that we require
characteristics of the environment process, namely, the generator matrix Q
and rate matrix R as well as buffer characteristics such as size B and output
capacity c. Then using Q, R, B, and c we can obtain the joint distribution
Fj (x) as well as the marginal limiting distribution of X(t). The notation, ter-
minology, and methodology used here are described in Sections 9.2.1, 9.2.2,
and 9.2.3. All but the last example are steady-state analyses, and they are all
presented in a problem-solution format.
Problem 85
Consider the example described in Section 9.2.1 where there is an infinite-sized
buffer with output capacity c = 12 kbps and the input is driven by an
environment CTMC {Z(t), t ≥ 0} with ℓ = 4 states, S = {1, 2, 3, 4}, and
    ⎡ −10   2   3   5 ⎤
Q = ⎢   0  −4   1   3 ⎥
    ⎢   1   1  −3   1 ⎥
    ⎣   1   2   3  −6 ⎦
and fluid arrival rates in states 1, 2, 3, and 4 are 20, 15, 10, and 5 kbps,
respectively. Obtain the joint distribution vector F(x) as well as a graph of
the CDF limt→∞ P{X(t) ≤ x} versus x.
Solution
For this problem we have Q described earlier,
    ⎡ 20   0   0  0 ⎤
R = ⎢  0  15   0  0 ⎥ ,
    ⎢  0   0  10  0 ⎥
    ⎣  0   0   0  5 ⎦

and the drift matrix D = R − cI is

    ⎡ 8  0   0   0 ⎤
D = ⎢ 0  3   0   0 ⎥ .
    ⎢ 0  0  −2   0 ⎥
    ⎣ 0  0   0  −7 ⎦

We solve for λ in the characteristic equation

det(λD − Q) = 0,
to obtain λ1 = −1.3715, λ2 = −0.5993, λ3 = 0, and λ4 = 1.7472. Notice that
the λi values are ordered according to Equation 9.15. Then
using φ(λD − Q) = [0 0 0 0] for each solution λ, we can obtain
the corresponding left eigenvectors φ1 = [−0.2297 0.9600 0.1087 0.1179],
φ2 = [0.1746 0.7328 0.5533 0.3555], φ3 = [0.1201 0.4754 0.7396 0.4610], and
φ4 = [−0.0317 − 0.0660 − 0.9741 0.2138].
Thereby using Equation 9.16 we can write down F(x) as

F(x) = a1 e^{λ1 x} φ1 + a2 e^{λ2 x} φ2 + a3 φ3,

with a4 = 0 since λ4 > 0 and B = ∞. The resulting CDF lim_{t→∞} P{X(t) ≤ x}
is plotted in Figure 9.4.
FIGURE 9.4
Graph of P{X ≤ x} vs. x for the infinite buffer case (Problem 85).
To present the similarities and differences between the cases when the
buffer size is infinite versus finite, in the next problem we consider the exact
same numerical values as the previous problem, except for the size of the
buffer. That is described next.
Problem 86
Consider Problem 85 with the only exception that the buffer size is finite
with B = 2. Obtain the joint distribution vector F(x) for 0 ≤ x < B as well as
the distribution for X(t) as t → ∞.
Solution
Recall that the analysis in Section 9.2.3 does not make any assumptions about
B until obtaining the constants ai. Thus from the solution to Problem 85 we
have, for 0 ≤ x < B, the same expression for F(x) with the same eigenvalues λi
and eigenvectors φi as before. All we need to compute are a1, a2, a3, and a4.
For that we use Equations 9.20 and 9.21.
From Equation 9.20 we get a1φ1(1) + a2φ2(1) + a3φ3(1) + a4φ4(1) =
−0.2297a1 + 0.1746a2 + 0.1201a3 − 0.0317a4 = 0 and a1φ1(2) + a2φ2(2) + a3φ3(2) + a4φ4(2) = 0.96a1 + 0.7328a2 + 0.4754a3 − 0.0660a4 = 0. Likewise, from
Equation 9.21 we get a1φ1(3)e^{λ1 B} + a2φ2(3)e^{λ2 B} + a3φ3(3)e^{λ3 B} + a4φ4(3)e^{λ4 B} =
0.0070a1 + 0.1669a2 + 0.7396a3 − 32.0307a4 = p3 and a1φ1(4)e^{λ1 B} + a2φ2(4)e^{λ2 B}
+ a3φ3(4)e^{λ3 B} + a4φ4(4)e^{λ4 B} = 0.0076a1 + 0.1072a2 + 0.4610a3 + 7.0293a4 = p4.
Using the fact that p3 = 0.4118 and p4 = 0.2567, these four equations can be
solved to obtain a1 = 0.0097, a2 = −0.4397, a3 = 0.6581, and a4 = 0.000050924.
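These four equations form a small linear system in (a1, a2, a3, a4); a quick numerical sketch of ours, using the rounded coefficients quoted above:

```python
import numpy as np

# Rows 1-2: Equation 9.20 for j = 1, 2; rows 3-4: Equation 9.21 for j = 3, 4
M = np.array([[-0.2297, 0.1746, 0.1201, -0.0317],
              [0.9600, 0.7328, 0.4754, -0.0660],
              [0.0070, 0.1669, 0.7396, -32.0307],
              [0.0076, 0.1072, 0.4610, 7.0293]])
rhs = np.array([0.0, 0.0, 0.4118, 0.2567])   # [0, 0, p3, p4]
a = np.linalg.solve(M, rhs)
# a is close to (0.0097, -0.4397, 0.6581, 0.000051)
```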
Next, using the notation X(t) → X as t → ∞, we have the distribution of X,
whose CDF P(X ≤ x) is plotted in Figure 9.5.
Problem 87
(CTMC on–off source) Consider a source that inputs fluid into an infinite-
size buffer. The source toggles between on and off states. The on times are
according to exp(α) and off times according to exp(β). Traffic is generated
at rate r when the source is in the on-state and no traffic is generated when
the source is in the off-state. Assume that r > c, where c is the usual output
FIGURE 9.5
Graph of P{X ≤ x} vs. x for the finite buffer case of Problem 86.
capacity. Find the condition of stability. Assuming that the stability condi-
tion is satisfied what is the steady-state distribution of the buffer contents in
terms of r, c, α, and β?
Solution
The environment process {Z(t), t ≥ 0} is a CTMC with ℓ = 2 states and
S = {1, 2}, with 1 representing off and 2 representing the on-state. Therefore,

Q = ⎡ −β   β ⎤   and   R = ⎡ 0  0 ⎤ .
    ⎣  α  −α ⎦             ⎣ 0  r ⎦
Using Equation 9.2, the stability condition is rβ/(α + β) < c.
State 1 has negative drift and state 2 has positive drift. Hence by solving
for Equation 9.14 we would get one λ value with negative real part and one
that is zero. To find them, we solve the characteristic equation

det(λD − Q) = 0,

which yields

(−λc + β)(λr − λc + α) − αβ = 0,

whose solutions are λ1 = β/c − α/(r − c) and λ2 = 0.
From the stability condition rβ/(α + β) < c we have rβ − c(α + β) < 0 and
dividing that by the positive quantity c(r − c), we get β/c − α/(r − c) < 0.
Hence λ1 < 0. Thus we have verified that one λ value has negative real part
and the other one is zero.
Next, using φ(λD − Q) = [0 0] for each λ, we can obtain the correspond-
ing left eigenvectors as φ1 = [(r − c)/c 1] and φ2 = [α/(α + β) β/(α + β)].
Thereby using Equation 9.16 we can write down F(x) as F(x) = a1 e^{λ1 x} φ1 + a2 e^{λ2 x} φ2 = a1 e^{λ1 x} φ1 + a2 φ2.
All we need to compute are a1 and a2. For that we use Equations 9.18 and
9.19. From Equation 9.18, a2 = 1/(φ2 1) = 1. Also, from Equation 9.19 we get
a1 φ1(2) + a2 φ2(2) = 0, since that corresponds to the state with a positive drift,
and that results in a1 = −β/(α + β). Hence we have

F(x) = [F1(x) F2(x)] = [ (αc − (r − c)β e^{λ1 x})/(c(α + β))   (β/(α + β))(1 − e^{λ1 x}) ].
lim_{t→∞} P{X(t) ≤ x} = F1(x) + F2(x) = 1 − (βr/(c(α + β))) e^{λ1 x},    (9.22)
where λ1 = β/c − α/(r − c). Hence

lim_{t→∞} P{X(t) > x} = (βr/(c(α + β))) e^{λ1 x}.    (9.24)
In particular,

lim_{t→∞} P{X(t) > 0} = βr/(c(α + β)),
which makes sense since in a cycle of one busy period and one idle period,
on average a quantity proportional to rβ/(α + β) fluid arrives, and that fluid
is depleted during a busy period at rate c. Thus the ratio of the mean busy
period to the mean cycle length must equal rβ/[c(α + β)], which is also the
expression for the fraction of time there is a non-zero amount of fluid in the buffer.
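Equation 9.24 translates directly into code. A sketch of ours (the parameter values below are an arbitrary illustration):

```python
import math

def onoff_tail(x, r, c, alpha, beta):
    """Steady-state P{X > x} for a CTMC on-off fluid source (Equation 9.24)."""
    assert r * beta / (alpha + beta) < c, "stability condition violated"
    lam1 = beta / c - alpha / (r - c)   # negative under stability
    return (beta * r) / (c * (alpha + beta)) * math.exp(lam1 * x)

# Illustration: r = 2, c = 1.5, alpha = beta = 1 (stable: 1 < 1.5)
p_busy = onoff_tail(0.0, 2.0, 1.5, 1.0, 1.0)  # = beta r / (c (alpha + beta)) = 2/3
```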
Problem 88
Consider an on–off source that generates fluid into a buffer of infinite size
with output capacity 8 units per second. The on times are IID random variables
with CDF U(t) = 1 − 0.6e^{−3t} − 0.4e^{−2t} and the off times are IID Erlang
random variables with mean 0.5 and variance 1/12 in appropriate time units
compatible with the on times. When the source is on, fluid is generated at
rate 16 units per second and no fluid is generated when the source is off.
Compute the probability that there would be more than 10 units of fluid in
the buffer in steady state.
Solution
Notice that the on times correspond to a two-phase hyperexponential distri-
bution. So the on time would be exp(3) with probability 0.6 and it would be
exp(2) with probability 0.4, which can be deduced from U(t). The off times
correspond to the sum of three IID exp(6) random variables. Thus we can
write down the environment process {Z(t), t ≥ 0} as an ℓ = 5 state CTMC with
states 1 and 2 corresponding to on and states 3, 4, and 5 corresponding to off.
Thus the Q matrix corresponding to S = {1, 2, 3, 4, 5} is
    ⎡ −3    0   3   0   0 ⎤
    ⎢  0   −2   2   0   0 ⎥
Q = ⎢  0    0  −6   6   0 ⎥ .
    ⎢  0    0   0  −6   6 ⎥
    ⎣ 3.6  2.4  0   0  −6 ⎦
Using that and the fact that c = 8, we have the drift matrix
    ⎡ 8  0   0   0   0 ⎤
    ⎢ 0  8   0   0   0 ⎥
D = ⎢ 0  0  −8   0   0 ⎥ .
    ⎢ 0  0   0  −8   0 ⎥
    ⎣ 0  0   0   0  −8 ⎦
Since B = ∞, we need to first check if the buffer is stable. For that we obtain
[p1 p2 p3 p4 p5] = [0.2222 0.2222 0.1852 0.1852 0.1852]. We have

Σ_{i=1}^5 Dii pi = −0.8889 < 0,

hence the system is stable.
The rest of the analysis proceeds very similar to Problem 85 with the only
exception being the final expression to compute, which here is the probabil-
ity that there would be more than 10 units of fluid in the buffer in steady
state. Nevertheless for the sake of completion we go through the entire pro-
cess. Notice that S+ = {1, 2} since states 1 and 2 have positive drift. Likewise
S− = {3, 4, 5} since states 3, 4, and 5 have negative drift. Also since there are
no zero-drift states, S0 is a null set. Also ℓ+ = 2 and ℓ− = 3. Thus by solving
Equation 9.14 we would get two λ values with negative real parts, one λ
value would be zero, and two with positive real parts. We solve for λ in the
characteristic equation
det(λD − Q) = 0,
to obtain the λi values, the corresponding eigenvectors φi, and the coefficients
ai, which yield the limiting CDF

lim_{t→∞} P{X(t) ≤ x} = 1 − 0.0235e^{−0.3227x} − 0.8654e^{−0.0836x}
for all x ≥ 0. Thus the probability that there would be more than 10 units of
fluid in the buffer in steady state is P(X > 10) = 0.0235e^{−3.227} + 0.8654e^{−0.836} = 0.3761.
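The numbers in this solution can be verified mechanically; the sketch below (ours) recomputes p and the drift sum from Q and evaluates the tail probability from the CDF just obtained:

```python
import numpy as np

Q = np.array([[-3.0, 0.0, 3.0, 0.0, 0.0],
              [0.0, -2.0, 2.0, 0.0, 0.0],
              [0.0, 0.0, -6.0, 6.0, 0.0],
              [0.0, 0.0, 0.0, -6.0, 6.0],
              [3.6, 2.4, 0.0, 0.0, -6.0]])
d = np.array([8.0, 8.0, -8.0, -8.0, -8.0])  # drifts r(i) - c with c = 8

A = Q.T.copy()
A[-1, :] = 1.0                # normalization row
b = np.zeros(5)
b[-1] = 1.0
p = np.linalg.solve(A, b)     # about [0.2222 0.2222 0.1852 0.1852 0.1852]

drift_sum = p @ d             # about -0.8889 < 0, hence stable
tail10 = 0.0235 * np.exp(-0.3227 * 10) + 0.8654 * np.exp(-0.0836 * 10)
```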
FIGURE 9.6
Graph of P{X > x} vs. x for the infinite buffer case of Problem 88.
Problem 89
The speed of a particular vehicle on a highway is modulated by a five-state
CTMC {Z(t), t ≥ 0} with S = {1, 2, 3, 4, 5}. When the CTMC is in state i, the
speed of the vehicle is Vi = 75/i miles per hour for i ∈ S. The infinitesimal
generator matrix of the CTMC is
    ⎡ −919.75   206.91    264.85    238.67   209.32 ⎤
    ⎢  223.01  −971.71    301.98    232.73   213.98 ⎥
Q = ⎢  343.04   277.78  −1283.57    392.72   270.03 ⎥
    ⎢  353.91   232.27    213.69  −1059.47   259.59 ⎥
    ⎣  370.92   200.89    216.80    225.60 −1014.21 ⎦
in units of h^{−1}. Assume that the CTMC is in state 1 at time 0. Obtain a method
to compute the CDF of the amount of time it would take for the vehicle to
travel 1 mile. Also numerically compute for sample values of t the proba-
bility that the vehicle would travel one mile before time t. Verify the results
using simulations.
Solution
Let T(x) be the random time required for the vehicle to travel a distance
x miles. We are interested in P{T(x) ≤ t} for x = 1; however, we provide
an approach for a generic x. Now let X(t) be the distance the vehicle trav-
eled in time t. A crucial observation needs to be made which is that the
events {T(x) ≥ t} and {X(t) ≤ x} are equivalent. In other words, the event
that a vehicle travels in time t a distance less than or equal to x is the
same as saying that the time taken to reach distance x is greater than or
equal to t. Therefore, P{T(x) ≥ t} = P{X(t) ≤ x} and hence the CDF of T(x) is
P{T(x) ≤ t} = 1 − P{X(t) ≤ x}.
Next, we will show a procedure to compute P{X(t) ≤ x}, and thereby
obtain P{T(x) ≥ t}. Notice that, X(t) can be thought of as the amount of fluid
in a buffer at time t with X(0) = 0 modulated by an environment process
{Z(t), t ≥ 0} with Z(0) = 1. The buffer size is infinite (B = ∞) and the output
capacity c = 0. Fluid flows in at rate r(Z(t)) = VZ(t) = 75/Z(t) at time t. Essen-
tially the amount of fluid at time t corresponds to the distance traveled by
the vehicle at time t.
Now define the joint probability distribution Fj(t, x) = P{X(t) ≤ x, Z(t) = j},
which is identical to that of Equation 9.3. If we know Fj(t, x), then we can
immediately obtain the required P{T(x) ≤ t} using

P{T(x) ≤ t} = 1 − P{X(t) ≤ x} = 1 − Σ_{i∈S} Fi(t, x).
Using the representation in Equation 9.9 we define the row vector F(t, x) = [F1(t, x) F2(t, x) . . . F5(t, x)].
Based on Equation 9.10 we know that the vector F(t, x) satisfies the partial
differential equation
∂F(t, x)/∂t + (∂F(t, x)/∂x) D = F(t, x)Q,
Note that D is the drift matrix (essentially diagonal matrix of r(i) − c values
with r(i) = 75/i and c = 0).
Let F*_i(s2, x) be the Laplace transform of Fi(t, x) with respect to t, that is,

F*_i(s2, x) = ∫₀^∞ e^{−s2 t} Fi(t, x) dt,

and let F̃*_i(s2, s1) be its LST with respect to x, that is,

F̃*_i(s2, s1) = ∫₀^∞ e^{−s1 x} dF*_i(s2, x).

Writing in matrix form, we have F̃*(s2, s1) = [F̃*_i(s2, s1)]_{i∈S} as the 1 × 5 row
vector of transforms. Likewise, F*(s2, x) = [F*_i(s2, x)]_{i∈S} is the row vector of
Laplace transforms of Fi(t, x) with respect to t.
To solve the partial differential equation, we take the LT and then the
LST of the partial differential equation to get an expression in the transform
space
where Ã(s1) is a 1 × 5 row vector of the LSTs of the initial condition. There
are several software packages that can be used to numerically invert this
transform. An additional complication is the 2D nature of the transform. Readers
are referred to Kharoufeh and Gautam [62] for a numerical inversion algorithm as well
as a list of references for different inversion techniques. Using x = 1 and the
initial condition X(0) = 0 and Z(0) = 1, giving rise to

Ã(s1) = [1 0 0 0 0],

we numerically invert the transform to obtain the CDF values in Table 9.2.
TABLE 9.2
Travel Time CDF to Traverse x = 1
Mile for Sample t Values
t P{T(x) ≤ t} P{T(x) ≤ t}
(min) Inversion Simulation
1.25 0.0786 0.0777
1.47 0.3335 0.3352
1.70 0.6859 0.6865
1.92 0.9141 0.9136
2.14 0.9873 0.9872
2.37 0.9991 0.9991
2.59 1.0000 0.9999
2.81 1.0000 1.0000
Source: Kharoufeh, J.P. and Gautam, N.,
Transp. Sci., 38(1), 97, 2004. With
permission.
Thus T denotes the time it takes for the buffer content to reach a or b for
the first time. Our objective is to obtain a distribution for T as a CDF (i.e.,
P{T ≤ t}) or its LST (i.e., E[e^{−wT}]). For that let

Hij(x, t) = P{T ≤ t, Z(T) = j | X(0) = x, Z(0) = i}.

Conditioning on the environment over a small interval of length h, we have

Hij(x, t + h) = (1 + qii h) Hij(x + h(r(i) − c), t) + Σ_{k∈S,k≠i} qik h Hkj(x + h(r(i) − c), t) + o(h),
where o(h) represents higher order terms of h. Subtracting Hij(x, t) on both
sides, dividing by h, and rearranging terms, then taking the limit as h → 0
(so that o(h)/h → 0), we have
∂H(x, t)/∂t − D ∂H(x, t)/∂x = QH(x, t).    (9.26)
The first boundary condition (i.e., Equation 9.27) is so because if the initial
buffer content is b and source is in state j (assuming r(j) > c), then essentially
the first passage time has occurred. Therefore, the probability that the first
passage time occurs before time t and the source is in state j when it occurred
is 1. The second boundary condition (Equation 9.28) is based on the fact that
if the initial buffer content is a and the source is in state j such that r(j) < c,
then the first passage time is zero. Hence, the probability that the first pas-
sage time happens before time t and the source is in state j when it occurred
is 1. The third boundary condition (i.e., Equation 9.29) is due to the fact that
although the first passage time is zero, the probability that the source is in state
j when the first passage time occurs is zero (since at time t = 0 the source is in
state i with r(i) > c and i ≠ j). For exactly the same reason, the last boundary
condition (Equation 9.30) is the way it is, that is, the first passage time is zero
but it cannot occur when the source is in state j, given that the source was in
state i at time t = 0 with r(i) < c and i ≠ j.
Next we solve the partial differential equation (PDE), that is, Equation 9.26. First we take the LST across the PDE with respect to t. That reduces to the following ordinary differential equation (ODE):

D dH̃(x, w)/dx = (wI − Q)H̃(x, w)   (9.31)
where H̃(x, w) is the LST of H(x, t) with respect to t and that in turn equals
the LST of each element of H(x, t). Not only is the ODE easier to solve, but we
can also immediately obtain the LST of the CDF of the first passage time T.
We first solve the ODE. For that let S1(w), . . . , Sℓ(w), where ℓ = |S| is the number of environment states, be the scalar solutions to the characteristic equation

det(DS(w) − wI + Q) = 0.
Stochastic Fluid-Flow Queues: Characteristics and Exact Analysis 553
For each Sj(w) we can find column vectors φj(w) that satisfy

(DSj(w) − wI + Q)φj(w) = 0.

Thus given w, the Sj(w) values are eigenvalues and φ1(w), . . . , φℓ(w) are
the corresponding right eigenvectors. Using those we can write down the
solution to Equation 9.31.
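In practice, since det(DS − wI + Q) = 0 means S is an eigenvalue of D⁻¹(wI − Q) whenever D is invertible, the Sj(w) and φj(w) can be obtained with a standard eigensolver. A minimal sketch (hypothetical two-state on–off parameters α, β, r, c; numpy assumed available):

```python
import numpy as np

# Hypothetical exponential on-off source: on->off rate alpha, off->on rate
# beta, fluid input rate r when on, output capacity c, transform variable w
alpha, beta, r, c, w = 1.0, 2.0, 3.0, 2.0, 0.5

Q = np.array([[-beta, beta],
              [alpha, -alpha]])   # generator of the environment CTMC
D = np.diag([-c, r - c])          # drift matrix: diag(r(i) - c)

# det(D S - w I + Q) = 0  <=>  S is an eigenvalue of D^{-1}(w I - Q)
S, Phi = np.linalg.eig(np.linalg.inv(D) @ (w * np.eye(2) - Q))
# S holds S_1(w), S_2(w); the columns of Phi are the right eigenvectors
# phi_1(w), phi_2(w) (up to scaling)
print(np.sort(S.real))
```

For the two-state source the same roots also come out of the quadratic characteristic equation in closed form (see Problem 94), which makes the eigensolver output easy to cross-check.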
The solution to this ODE is given by

H̃·,j(x, w) = a1,j(w)e^{S1(w)x}φ1(w) + a2,j(w)e^{S2(w)x}φ2(w) + · · · + aℓ,j(w)e^{Sℓ(w)x}φℓ(w),   (9.32)

where ai,j(w) values are constants to be determined and H̃·,j(x, w) is the column
vector

H̃·,j(x, w) = (H̃1j(x, w), H̃2j(x, w), . . . , H̃ℓj(x, w))ᵀ.
We can obtain the ℓ² ai,j(w) values using the ℓ² equations corresponding to
the LSTs of the boundary condition equations (9.27) through (9.30), which for
all i ∈ S and j ∈ S are

H̃ij(b, w) = 1 if i = j and r(i) > c,   H̃ij(a, w) = 1 if i = j and r(i) < c,
H̃ij(b, w) = 0 if i ≠ j and r(i) > c,   H̃ij(a, w) = 0 if i ≠ j and r(i) < c.
Thereby, using Equation 9.32 we can write down the LST of the distribution of T. In particular, given X(0) = x0 and Z(0) = z0, the LST of the first passage time distribution can be computed as

E[e^{−wT}] = Σ_{j=1}^{ℓ} H̃z0,j(x0, w).   (9.33)
Although in most instances this equation cannot be inverted to get the CDF
of T, one can quickly get moments of T. Specifically for r = 1, 2, 3, . . .,

E[T^r] = (−1)^r (d^r/dw^r) E[e^{−wT}]

at w = 0. We can also obtain the probability that the first passage time ends
in state j. In particular,

P{Z(T) = j | X(0) = x0, Z(0) = z0} = Hz0,j(x0, ∞) = H̃z0,j(x0, 0),

since the CDF in the limit t → ∞ is equivalent to its LST in the limit w → 0.
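Numerical differentiation of the LST at w = 0 is exactly how the problems below extract E[T]. A tiny sketch on a case where the answer is known in closed form (T ~ exp(λ), so E[e^{−wT}] = λ/(λ + w) and E[T] = 1/λ; λ here is an illustrative value):

```python
lam = 2.0

def lst(w):
    # E[exp(-w T)] for T ~ exp(lam)
    return lam / (lam + w)

# First moment via a forward difference of the LST at w = 0
h = 1e-7
mean_T = -(lst(h) - lst(0.0)) / h
print(mean_T)  # close to E[T] = 1/lam = 0.5
```

The same forward difference, applied to H̃ instead of a closed-form LST, is what Problems 90 through 92 use.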
In the next section we present some examples to illustrate the approach. We
now present two remarks for the cases when a = b ≤ x and a = b ≥ x.
Remark 22
since the first passage time would be zero if we started at a in a state with negative drift. However, we cannot solve for all ai,j(w) values in Equation 9.32 using the given boundary conditions alone, as there are fewer equations than unknowns. For that we use some additional conditions, including ai,j(w) = 0 if Si(w) > 0, since as x → ∞ we require Hij(x, t) to remain a (bounded) joint probability distribution. In addition, it is worthwhile noting that since a first passage time can only end in a state with negative drift,

H̃ij(x, w) = 0 whenever r(j) > c,

for any x.
Remark 23
For the case a = b ≥ x as well the analysis is exactly the same as the case
a ≤ x ≤ b, especially the definition of T in Equation 9.25 (again the “or” being
redundant) and the PDE in Equation 9.26. The only exception is that the
boundary conditions would now be
since the first passage time would be zero if we started at b in a state with
positive drift. However, we cannot solve for all ai,j (w) values in Equation 9.32
using the given boundary conditions as there are not enough equations as
unknowns. For that notice that if the fluid level reached zero in state i (for
that r(i) must be less than c) then it stays at zero till the environment process
changes state to some state k = i. Thus the first passage time is equal to the
stay time in state i plus the remaining time from k till the first passage time
starting in k. By conditioning on k and unconditioning, we can write down
in LST format for all i ∈ S such that r(i) < c
H̃ij(0, w) = Σ_{k∈S, k≠i} H̃kj(0, w) (qik/(−qii)) (qik/(qik + w)).
In addition, it is worthwhile noting that since a first passage time can only
end in a state with positive drift,

H̃ij(x, w) = 0 whenever r(j) < c,

for any x.
9.3.2 Examples
To explain some of the nuances described in the previous section on first
passage times, we consider a few examples here. They are presented in a
problem–solution format.
Problem 90
Consider a reservoir from which water is emptied out at a constant rate
of 10 units per day. Water flows into the reservoir according to a CTMC
H̃ij(x, w) = ∫₀^∞ e^{−wt} (∂Hij(x, t)/∂t) dt.
To compute the expected number of days from t = 0 for the water level to
become excessive or concerning, all we need is

E[T|X(0) = 30, Z(0) = 2] = (−1) (d/dw) Σ_{j=1}^{5} H̃2j(30, w)

at w = 0. For that we can compute (d/dw)H̃ij(x, w) at w = 0 by taking a very small
h > 0 and obtaining (H̃ij(x, h) − H̃ij(x, 0))/h. Before explaining how to obtain that, consider the other question, that is, the probability that at the end of a nominal
spell the water level would be excessive. In other words, we need

P{Z(T) ∈ {3, 4, 5}|X(0) = 30, Z(0) = 2} = Σ_{j=3}^{5} H2j(30, ∞) = Σ_{j=3}^{5} H̃2j(30, 0).
Thus if we know for all i and j the values of H̃ij (x, h) for some small h
and H̃ij (x, 0), we can immediately compute both E[T|X(0) = 30, Z(0) = 2] and
P{Z(T) ∈ {3, 4, 5}|X(0) = 30, Z(0) = 2}.
To compute H̃ij (x, h) and H̃ij (x, 0), we can write down from Equation 9.32,
for j = 1, 2, 3, 4, 5,
(H̃1j(x, w), H̃2j(x, w), H̃3j(x, w), H̃4j(x, w), H̃5j(x, w))ᵀ
   = a1,j(w)e^{S1(w)x}φ1(w) + a2,j(w)e^{S2(w)x}φ2(w) + · · · + a5,j(w)e^{S5(w)x}φ5(w),   (9.34)
where ai,j (w), Sj (w), and φj (w) values need to be determined especially for
w = 0 and w = h for some small h.
We can obtain S1 (w), . . . , S5 (w) as the five scalar solutions to the charac-
teristic equation
det(DS(w) − wI + Q) = 0.
Thus we get
φ1(0) = (0.3067, −0.9393, −0.1234, −0.0783, −0.0479)ᵀ,
φ2(0) = (−0.0675, −0.0590, −0.9832, 0.1425, 0.0705)ᵀ,
φ3(0) = (−0.4472, −0.4472, −0.4472, −0.4472, −0.4472)ᵀ,
φ4(0) = (0.4064, 0.4159, 0.4359, 0.4774, 0.4939)ᵀ, and
φ5(0) = (−0.0801, −0.0637, −0.1184, −0.8042, 0.5733)ᵀ.
Here, the values of φj(h) for j = 1, 2, 3, 4, 5 and small h are so close to the
respective φj(0) values that they are not reported.
To obtain the ai,j (w) values for i = 1, 2, 3, 4, 5, j = 1, 2, 3, 4, 5, and for two
sets namely w = 0 and w = h, we solve the following 25 boundary condition
equations for each w:
[ a1,1(0)  a2,1(0)  a3,1(0)  a4,1(0)  a5,1(0) ]     [ −2416.2    0.1129 × 10⁻⁹   −4.5941   −2.3770    0.0541 × 10⁻³ ]
[ a1,2(0)  a2,2(0)  a3,2(0)  a4,2(0)  a5,2(0) ]     [  2516.0    0.0369 × 10⁻⁹   −1.5004   −0.7763    0.0177 × 10⁻³ ]
[ a1,3(0)  a2,3(0)  a3,3(0)  a4,3(0)  a5,3(0) ]  =  [    −9.2   −0.5470 × 10⁻⁹    0.3563    0.2910   −0.0490 × 10⁻³ ]
[ a1,4(0)  a2,4(0)  a3,4(0)  a4,4(0)  a5,4(0) ]     [   −36.2    0.2120 × 10⁻⁹    1.4369    1.1714   −0.6466 × 10⁻³ ]
[ a1,5(0)  a2,5(0)  a3,5(0)  a4,5(0)  a5,5(0) ]     [   −54.4    0.1851 × 10⁻⁹    2.0652    1.6909    0.6238 × 10⁻³ ]
[ a1,1(h)  a2,1(h)  a3,1(h)  a4,1(h)  a5,1(h) ]     [ −2416.2    0.1129 × 10⁻⁹   −4.5929   −2.3757    0.0541 × 10⁻³ ]
[ a1,2(h)  a2,2(h)  a3,2(h)  a4,2(h)  a5,2(h) ]     [  2516.0    0.0369 × 10⁻⁹   −1.5000   −0.7759    0.0177 × 10⁻³ ]
[ a1,3(h)  a2,3(h)  a3,3(h)  a4,3(h)  a5,3(h) ]  =  [    −9.2   −0.5470 × 10⁻⁹    0.3562    0.2909   −0.0490 × 10⁻³ ]
[ a1,4(h)  a2,4(h)  a3,4(h)  a4,4(h)  a5,4(h) ]     [   −36.2    0.2120 × 10⁻⁹    1.4364    1.1709   −0.6466 × 10⁻³ ]
[ a1,5(h)  a2,5(h)  a3,5(h)  a4,5(h)  a5,5(h) ]     [   −54.4    0.1851 × 10⁻⁹    2.0645    1.6901    0.6238 × 10⁻³ ]
The values of H̃ij (30, h) for some small h do not differ in the first four sig-
nificant digits from the corresponding H̃ij (30, 0) values for i = 1, 2, 3, 4, 5 and
j = 1, 2, 3, 4, 5.
Thus we have the expected number of days from t = 0 (with initial water
level X(0) = 30) for the water level to become excessive or concerning as

E[T|X(0) = 30, Z(0) = 2] = (−1) (d/dw) Σ_{j=1}^{5} H̃2j(30, w)|w=0
                         = − lim_{h→0} Σ_{j=1}^{5} (H̃2j(30, h) − H̃2j(30, 0))/h ≈ 5.4165.

Likewise, the probability that at the end of the nominal spell the water level
would be excessive is

P{Z(T) ∈ {3, 4, 5}|X(0) = 30, Z(0) = 2} = Σ_{j=3}^{5} H2j(30, ∞) = Σ_{j=3}^{5} H̃2j(30, 0) = 0.306.
Next we consider a problem that builds on the previous problem but
uses the conditions in Remark 22. The objective is to provide a contrast
with the previous problem under a similar setting.
Problem 91
Consider the setting in Problem 90 where the first passage time ends with
the water level becoming excessive and the environment in one of the three
positive drift states 3, 4, or 5 with probabilities 0.0945, 0.3911, or 0.5144,
respectively (these are the probabilities that the first passage time would
end in states 3, 4, or 5 given that it ended with water level becoming excessive). Compute how long the water level will stay excessive before becoming nominal.
Solution
We let t = 0 be the time when the water level just crossed over from
nominal to excessive. Using the same notation as in Problem 90 for X(t)
and Z(t), we have X(0) = 40, P{Z(0) = 3} = 0.0945, P{Z(0) = 4} = 0.3911, and
P{Z(0) = 5} = 0.5144. Let T be the time when the water level crosses back to
becoming nominal. Then all we need is
E[T] = (−1) (d/dw) Σ_{i=3}^{5} Σ_{j=1}^{5} H̃ij(40, w) P{Z(0) = i}

at w = 0. To compute (d/dw)H̃ij(x, w) at w = 0, here too we consider a very small
h > 0 and obtain it approximately as (H̃ij(x, h) − H̃ij(x, 0))/h. Now, to evaluate H̃ij(x, h)
and H̃ij (x, 0), we can write down from Equation 9.32, for j = 1, 2, 3, 4, 5,
(H̃1j(x, w), H̃2j(x, w), H̃3j(x, w), H̃4j(x, w), H̃5j(x, w))ᵀ
   = a1,j(w)e^{S1(w)x}φ1(w) + a2,j(w)e^{S2(w)x}φ2(w) + · · · + a5,j(w)e^{S5(w)x}φ5(w),   (9.35)
where ai,j (w), Sj (w), and φj (w) values need to be determined for w = 0 and
w = h for some small h.
We can obtain Sj (w) for j = 1, 2, 3, 4, 5 as the scalar solutions to the
characteristic equation
det(DS(w) − wI + Q) = 0.
But this is identical to that in Problem 90. Likewise φj(w) can be computed
as the column vectors that satisfy

(DSj(w) − wI + Q)φj(w) = 0,

which is also identical to that in Problem 90. Further, note that

(H̃1j(x, w), H̃2j(x, w), H̃3j(x, w), H̃4j(x, w), H̃5j(x, w))ᵀ = (0, 0, 0, 0, 0)ᵀ
for j = 3, 4, 5 since the first passage time can never end in states 3, 4, or 5
as the drift is positive in those states (only when the drift is negative, it is
possible to cross over into a particular buffer content level from above). Thus
we need only ai,j (w) values for i = 1, 2, 3, 4, 5 and j = 1, 2. But ai,j (w) = 0 for
all i where Si (w) > 0, otherwise as x → ∞, the expression for H̃ij (x, w) in
Equation 9.35 would blow up. Hence, we have a2,j (w) = 0, a4,j (w) = 0, and
a5,j (w) = 0 for j = 1, 2. Thus all we are left with is to obtain a1,1 (w), a1,2 (w),
a3,1 (w), and a3,2 (w). For that we have four boundary conditions, namely,
H̃11(40, w) = 1,
H̃22(40, w) = 1,
H̃21(40, w) = 0, and
H̃12(40, w) = 0.
Thus we have
(H̃11(40, w), H̃21(40, w), H̃31(40, w), H̃41(40, w), H̃51(40, w))ᵀ
   = a1,1(w)e^{40S1(w)}φ1(w) + a3,1(w)e^{40S3(w)}φ3(w) = (1, 0, ·, ·, ·)ᵀ
and
(H̃12(40, w), H̃22(40, w), H̃32(40, w), H̃42(40, w), H̃52(40, w))ᵀ
   = a1,2(w)e^{40S1(w)}φ1(w) + a3,2(w)e^{40S3(w)}φ3(w) = (0, 1, ·, ·, ·)ᵀ
where the · entries in the column vectors denote unknown quantities. Once we
know a1,1 (w), a1,2 (w), a3,1 (w), and a3,2 (w), the unknown quantities can be
obtained. Solving the four equations at w = 0 and w = h = 0.000001 we
get a1,1 (0) = 7.7343 × 106 , a3,1 (0) = − 1.6856, a1,2 (0) = − 7.7343 × 106 , and
a3,2 (0) = − 0.5505; also a1,1 (h) = 7.7344 × 106 , a3,1 (h) = − 1.6858, a1,2 (h) =
−7.7344 × 106 , and a3,2 (h) = − 0.5505.
Now using ai,j (w), Si (w), and φi (w) values for i = 1, 3, j = 1, 2 at w = 0
and w = h in Equation 9.35 we can compute H̃ij (x, w). In particular for x = 40
(which is what we need here) we get
[ H̃11(40, 0)  H̃12(40, 0) ]     [ 1       0      ]
[ H̃21(40, 0)  H̃22(40, 0) ]     [ 0       1      ]
[ H̃31(40, 0)  H̃32(40, 0) ]  =  [ 0.6548  0.3452 ]
[ H̃41(40, 0)  H̃42(40, 0) ]     [ 0.6910  0.3090 ]
[ H̃51(40, 0)  H̃52(40, 0) ]     [ 0.7153  0.2847 ]
The values of H̃ij (40, h) for some small h do not differ in the first four sig-
nificant digits from the corresponding H̃ij (40, 0) values for i = 1, 2, 3, 4, 5 and
j = 1, 2, hence they are not reported.
Thus we have the expected number of days from t = 0 (with initial water level X(0) = 40 as well as initial environmental conditions
P{Z(0) = 3} = 0.0945, P{Z(0) = 4} = 0.3911, and P{Z(0) = 5} = 0.5144) for the
water level to become nominal as

E[T] = (−1) (d/dw) Σ_{i=3}^{5} Σ_{j=1}^{2} H̃ij(40, w)|w=0 P{Z(0) = i}
     = − lim_{h→0} Σ_{i=3}^{5} Σ_{j=1}^{2} ((H̃ij(40, h) − H̃ij(40, 0))/h) P{Z(0) = i}.
Having considered the case in Remark 22, next we solve a problem that
uses the conditions in Remark 23. It is worthwhile to contrast it with the
previous two problems since they are under a similar setting.
Problem 92
Consider the setting in Problem 90 with the exception that at time t = 0 we
just enter the concerning level and Z(0) = 1. What is the expected sojourn
time for the water level to stay at the concerning level before moving
to nominal?
Solution
At t = 0 water level just crosses over from nominal to concerning. Using the
same notation as in Problem 90 for X(t) and Z(t), we have X(0) = 20 and
Z(0) = 1. Let T be the time when the water level crosses back to becoming
nominal from concerning. Then all we need is
E[T] = (−1) (d/dw) Σ_{j=1}^{5} H̃1j(20, w)

at w = 0. To compute (d/dw)H̃ij(x, w) at w = 0, here too we consider a very small
h > 0 and obtain it approximately as (H̃ij(x, h) − H̃ij(x, 0))/h. Now, to evaluate H̃ij(x, h)
and H̃ij (x, 0), we can write down from Equation 9.32, for j = 1, 2, 3, 4, 5,
(H̃1j(x, w), H̃2j(x, w), H̃3j(x, w), H̃4j(x, w), H̃5j(x, w))ᵀ
   = a1,j(w)e^{S1(w)x}φ1(w) + a2,j(w)e^{S2(w)x}φ2(w) + · · · + a5,j(w)e^{S5(w)x}φ5(w),   (9.36)
where ai,j (w), Sj (w), and φj (w) values need to be determined for w = 0 and
w = h for some small h.
We can obtain Sj (w) for j = 1, 2, 3, 4, 5 as the scalar solutions to the
characteristic equation
det(DS(w) − wI + Q) = 0.
But this is identical to that in Problem 90. Likewise φj(w) can be computed
as the column vectors that satisfy

(DSj(w) − wI + Q)φj(w) = 0,

which is also identical to that in Problem 90. Thus refer to Problem 90 for
φj(w) and Sj(w) for j = 1, 2, 3, 4, 5 at w = 0 and w = h. What remains in Equation 9.36 are the ai,j(w) values for w = 0 and w = h. For that refer back to the
approach in Remark 23. First of all,
(H̃1j(x, w), H̃2j(x, w), H̃3j(x, w), H̃4j(x, w), H̃5j(x, w))ᵀ = (0, 0, 0, 0, 0)ᵀ
for j = 1, 2 since the first passage time can never end in states 1 or 2 as the
drift is negative in those states (only when the drift is positive is it possible
to cross over into a particular buffer content level from below). Thus we need
only ai,j (w) values for i = 1, 2, 3, 4, 5 and j = 3, 4, 5.
In addition,

H̃1j(0, w) = Σ_{k=2}^{5} H̃kj(0, w) (q1k/(−q11)) (q1k/(q1k + w)),

H̃2j(0, w) = H̃1j(0, w) (q21/(−q22)) (q21/(q21 + w)) + Σ_{k=3}^{5} H̃kj(0, w) (q2k/(−q22)) (q2k/(q2k + w)),
where qij corresponds to the element in the ith row and jth column of Q.
Solving the 15 equations we get for w = 0,
[ a1,3(0)  a2,3(0)  a3,3(0)  a4,3(0)  a5,3(0) ]     [  0.0093 × 10⁻³   −0.2211 × 10⁻⁴   −0.2109   −0.0033   −0.0014 ]
[ a1,4(0)  a2,4(0)  a3,4(0)  a4,4(0)  a5,4(0) ]  =  [  0.1326 × 10⁻³    0.1117 × 10⁻⁴   −0.8945   −0.0466   −0.0208 ]
[ a1,5(0)  a2,5(0)  a3,5(0)  a4,5(0)  a5,5(0) ]     [ −0.1419 × 10⁻³    0.1094 × 10⁻⁴   −1.1307    0.0500    0.0223 ]
Also, for w = h = 0.000001, the values of ai,j(h) are the same as those when w = 0
to the first few significant digits; hence we do not present them here.
Now using ai,j (w), Si (w), and φi (w) values for i = 1, 2, 3, 4, 5 and j = 3, 4, 5
at w = 0 and w = h in Equation 9.36 we can compute H̃ij (x, w). In particular
for x = 20 (which is what we need here) we get
[ H̃13(20, 0)  H̃14(20, 0)  H̃15(20, 0) ]     [ 0.1583  0.3995  0.4422 ]
[ H̃23(20, 0)  H̃24(20, 0)  H̃25(20, 0) ]     [ 0.1497  0.3913  0.4590 ]
[ H̃33(20, 0)  H̃34(20, 0)  H̃35(20, 0) ]  =  [ 1       0       0      ]
[ H̃43(20, 0)  H̃44(20, 0)  H̃45(20, 0) ]     [ 0       1       0      ]
[ H̃53(20, 0)  H̃54(20, 0)  H̃55(20, 0) ]     [ 0       0       1      ]
The values of H̃ij (20, h) for some small h do not differ in the first four sig-
nificant digits from the corresponding H̃ij (20, 0) values for i = 1, 2, 3, 4, 5 and
j = 3, 4, 5, hence they are not reported.
Thus we can compute the expected number of days from t = 0 (with initial water level X(0) = 20 as well as initial environmental condition Z(0) = 1) for the water level to become nominal. In fact, we can also immediately write down the time spent at the concerning level as 25.1175 days if we were to start in state 2 (instead of state 1 as in this problem). Having seen a set of numerical problems, we
next focus on some exponential on–off source cases where we can obtain
closed-form algebraic expressions.
Problem 93
Consider an exponential on–off source that inputs fluid into a buffer. The on
times are according to exp(α) and off times according to exp(β). When the
source is on, fluid enters the buffer at rate r and no fluid enters the buffer
when the source is off. The output capacity is c. Assume that initially there
is x amount of fluid in the buffer. Define the first passage time as the time it
would take for the buffer contents to reach level x∗ or 0, whichever happens
first with x∗ ≥ x ≥ 0. Let states 1 and 2 represent the source being off and on,
respectively. For i = 1, 2, find the LST of the first passage time given that the
environment is initially in state i. Also for i = 1, 2, find the probability that
the first passage time occurs with x∗ or 0 amount of fluid, given that initially
the environment is in state i.
Solution
The setting is identical to that in Problem 87. Recall that the environment
process {Z(t), t ≥ 0} is a CTMC with

Q = [ −β    β ]    and    D = [ −c    0    ]
    [  α   −α ]               [  0   r − c ].
Next we use the definition of Hij (x, t) and its LST H̃ij (x, w) in Section 9.3 for
i = 1, 2 and j = 1, 2. Then the solution to Equation 9.31 is given by
(H̃11(x, w), H̃21(x, w))ᵀ = a11(w)e^{S1(w)x}φ1(w) + a21(w)e^{S2(w)x}φ2(w),

(H̃12(x, w), H̃22(x, w))ᵀ = a12(w)e^{S1(w)x}φ1(w) + a22(w)e^{S2(w)x}φ2(w).
Here S1(w) and S2(w) are the scalar solutions to the characteristic equation det(DS(w) − wI + Q) = 0.
For i = 1, 2 we have

φi(w) = ( (w + α − Si(w)(r − c))/α , 1 )ᵀ = ( β/(w + β + Si(w)c) , 1 )ᵀ.

For notational convenience, let

ψi(w) = β/(w + β + Si(w)c)

and thus

φi(w) = ( ψi(w) , 1 )ᵀ.
Finally, we solve for a11(w), a21(w), a12(w), and a22(w) using the LSTs of
the boundary conditions H̃22(x∗, w) = 1, H̃11(0, w) = 1, H̃21(x∗, w) = 0, and
H̃12(0, w) = 0, resulting in

a11(w) = e^{S2(w)x∗}/δ(w),   a21(w) = −e^{S1(w)x∗}/δ(w),
a12(w) = −ψ2(w)/δ(w),   a22(w) = ψ1(w)/δ(w),

where δ(w) = e^{S2(w)x∗}ψ1(w) − e^{S1(w)x∗}ψ2(w).
Now that we have expressions for H̃ij (x, w) for i = 1, 2 and j = 1, 2, the LST
of the first passage time given that the environment is initially in state 1 (i.e.,
off) can be computed as
H̃11(x, w) + H̃12(x, w) = (a11(w) + a12(w))e^{S1(w)x}ψ1(w) + (a21(w) + a22(w))e^{S2(w)x}ψ2(w).
Likewise, the LST of the first passage time given that the environment is
initially in state 2 (i.e., on) can be computed as
H̃21(x, w) + H̃22(x, w) = (a11(w) + a12(w))e^{S1(w)x} + (a21(w) + a22(w))e^{S2(w)x}.
Also for i = 1, 2, the probability that the first passage time occurs with 0
amount of fluid, given that initially the environment is in state i is H̃i1 (x, 0).
Likewise for i = 1, 2, the probability that the first passage time occurs with x∗
amount of fluid, given that initially the environment is in state i is H̃i2 (x, 0).
To compute H̃ij(x, 0) for i = 1, 2 and j = 1, 2 when we let w = 0, we need to be
careful about the fact that √(b̂²) = |b̂|. Notice that if rβ < c(α + β), then b̂ < 0;
otherwise b̂ ≥ 0.
Assume that rβ < c(α + β), which would be necessary if we require the
queue to be stable (note that it is straightforward to write down the expres-
sions to follow even for the case rβ ≥ c(α + β) but is not presented here).
Continuing with the assumption that rβ < c(α + β), we can see by letting
w = 0 that

S1(0) = 0   and   S2(0) = (cα − β(r − c))/(c(r − c)).
a11(0) = e^{S2(0)x∗}/δ(0) = e^{S2(0)x∗} / (e^{S2(0)x∗} − β(r − c)/(cα)),

a12(0) = −ψ2(0)/δ(0) = −(β(r − c)/(cα)) / (e^{S2(0)x∗} − β(r − c)/(cα)),

a21(0) = −e^{S1(0)x∗}/δ(0) = −1 / (e^{S2(0)x∗} − β(r − c)/(cα)),

a22(0) = ψ1(0)/δ(0) = 1 / (e^{S2(0)x∗} − β(r − c)/(cα)).
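A quick numerical check of these w = 0 expressions (illustrative parameters chosen so that rβ < c(α + β)): starting from the off state, the probabilities of the first passage ending at level 0 and at level x∗ must sum to one.

```python
import math

alpha, beta, r, c = 1.0, 1.0, 3.0, 2.0   # r*beta = 3 < c*(alpha+beta) = 4
x_star, x = 5.0, 2.0                      # levels with 0 <= x <= x_star

S2_0 = (c * alpha - beta * (r - c)) / (c * (r - c))   # S2(0); S1(0) = 0
psi1 = 1.0                                # psi_1(0)
psi2 = beta * (r - c) / (c * alpha)       # psi_2(0)
delta = math.exp(S2_0 * x_star) - psi2    # delta(0)

a11 = math.exp(S2_0 * x_star) / delta
a12 = -psi2 / delta
a21 = -1.0 / delta
a22 = psi1 / delta

# H~11(x,0): P{first passage occurs at level 0 | start off with x fluid}
# H~12(x,0): P{first passage occurs at level x* | start off with x fluid}
H11 = a11 * psi1 + a21 * math.exp(S2_0 * x) * psi2
H12 = a12 * psi1 + a22 * math.exp(S2_0 * x) * psi2
print(H11, H12)  # two probabilities that sum to 1
```

With these stable parameters and x well below x∗, H11 dominates, as one would expect: the buffer is far more likely to empty first.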
In the next example, we continue with the setting of the previous exam-
ple, however, with the restriction of the first passage time occurring only
when the amount of fluid in the buffer becomes empty.
Problem 94
Consider an exponential on–off source that inputs fluid into an infinite-sized
buffer. The on times are according to exp(α) and off times according to
exp(β). When the source is on, fluid enters the buffer at rate r and no fluid
enters the buffer when the source is off. The output capacity is c. Assume that
the system is stable. Define the first passage time as the time it would take
for the buffer contents to empty for the first time given that initially there
is x amount of fluid in the buffer and the source is in state i, for i = 1 and 2
representing off and on, respectively. Then using that result derive the LST
of the busy period distribution, that is, the consecutive period of time there
is nonzero fluid in the buffer.
Solution
Notice that the setting is identical to that of Remark 22 where a = b ≤ x with
a = b = 0. Also, the stability condition is that rβ < c(α + β). Following the
procedure in Remark 22, define T as the first time the amount of fluid in
the buffer reaches 0 and thereby Hij (x, t) = P{T ≤ t, Z(T) = j|X(0) = x, Z(0) = i}.
Using the results in Problem 93 we can write down H̃ij (x, w), the LST of
Hij (x, t) as follows:
(H̃11(x, w), H̃21(x, w))ᵀ = a11(w)e^{S1(w)x}φ1(w) + a21(w)e^{S2(w)x}φ2(w),
where S1(w) and S2(w) are the solutions to the characteristic equation det(DS(w) − wI + Q) = 0, that is,

S1(w) = (−b̂ − √(b̂² + 4w(w + α + β)c(r − c))) / (2c(r − c)),

S2(w) = (−b̂ + √(b̂² + 4w(w + α + β)c(r − c))) / (2c(r − c)),
where b̂ = (r − 2c)w + (r − c)β − cα. Also, the column vectors φi(w) are given by

φi(w) = ( (w + α − Si(w)(r − c))/α , 1 )ᵀ = ( β/(w + β + Si(w)c) , 1 )ᵀ.
To solve for a11 (w) and a21 (w), we use the LST of the boundary conditions
H̃11(0, w) = 1 and H̃12(0, w) = 0. Of course, H̃12(0, w) = 0 anyway, hence
that boundary condition is not useful. We use the additional condition that
aij (w) = 0 if Si (w) > 0 for i = 1, 2 and j = 1, 2. Since S1 (w) ≤ 0 and S2 (w) > 0,
we have a21 (w) = 0. Thus the only term that is nonzero is a11 (w) which is
given by
a11(w) = (w + β + S1(w)c)/β.

Therefore,

(H̃11(x, w), H̃21(x, w))ᵀ = e^{S1(w)x} ( 1 , (w + β + S1(w)c)/β )ᵀ.
Now we compute the LST of the busy period distribution. Notice that a
busy period begins with the environment in state 2 with zero fluid in the
buffer (i.e., x = 0) and ends when the environment is in state 1. Hence the
LST of the busy period distribution is H̃21(0, w) = (w + β + S1(w)c)/β.
Next we consider the case where we start with x amount of fluid and find
the distribution for the time it would take for the buffer contents to reach x∗
with x∗ ≥ x.
Problem 95
Consider an on–off source with on times according to exp(α) and off times
according to exp(β). When the source is on, fluid enters the buffer at rate r
and no fluid enters the buffer when the source is off. The output capacity
is c. Define the first passage time as the time it would take for the buffer
contents to reach a level x∗ for the first time given that initially there is x
amount of fluid in the buffer (such that x ≤ x∗ ) and the source is in state i, for
i = 1 and 2 representing off and on, respectively. Derive the LST of the first
passage time.
Solution
This setting is identical to that of Remark 23 where a = b ≥ x with a = b = x∗ .
We define T as the first time the amount of fluid in the buffer reaches x∗
and thereby Hij (x, t) = P{T ≤ t, Z(T) = j|X(0) = x, Z(0) = i}. Using the results in
Problem 93 we can solve

det(DS(w) − wI + Q) = 0

to obtain S1(w) and S2(w) as in Problem 94, where b̂ = (r − 2c)w + (r − c)β − cα. Also, since the column vectors φj(w) must satisfy

(DSj(w) − wI + Q)φj(w) = 0,

they too are as given in Problem 93.
Thus we can write down H̃ij(x, w), the LST of Hij(x, t), as H̃11(x, w) =
H̃21(x, w) = 0, and

(H̃12(x, w), H̃22(x, w))ᵀ = a12(w)e^{S1(w)x}φ1(w) + a22(w)e^{S2(w)x}φ2(w).
To solve for ai2 (w) for i = 1, 2, we use the LST of the boundary conditions
H̃22(x∗, w) = 1 and H̃21(x∗, w) = 0. Of course, H̃21(x, w) = 0 anyway, hence
that boundary condition is not useful. We use the additional condition that

H̃12(0, w) = H̃22(0, w) β/(β + w).
Thus,

(a12(w) + a22(w)) β/(β + w) = a12(w) β/(w + β + S1(w)c) + a22(w) β/(w + β + S2(w)c)

since H̃12(0, w) = H̃22(0, w) β/(β + w).
Solving these, we get

a12(w) = [ e^{S1(w)x∗} − e^{S2(w)x∗} (S1(w)/S2(w)) ((w + β + S2(w)c)/(w + β + S1(w)c)) ]⁻¹,

a22(w) = [ e^{S2(w)x∗} − e^{S1(w)x∗} (S2(w)/S1(w)) ((w + β + S1(w)c)/(w + β + S2(w)c)) ]⁻¹.
Thus the LST of the first passage time H̃ij(x, w) is given by H̃11(x, w) =
H̃21(x, w) = 0, and

(H̃12(x, w), H̃22(x, w))ᵀ = a12(w)e^{S1(w)x}φ1(w) + a22(w)e^{S2(w)x}φ2(w),
where a12 (w), a22 (w), φ1 (w), φ2 (w), S1 (w), and S2 (w) are described earlier.
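These coefficient expressions can be verified numerically: by construction a12(w) and a22(w) must satisfy the boundary condition H̃22(x∗, w) = 1 and the level-zero relation H̃12(0, w) = H̃22(0, w)β/(β + w). A sketch with illustrative parameters:

```python
import math

alpha, beta, r, c = 1.0, 1.0, 3.0, 2.0
x_star, w = 4.0, 0.5

b_hat = (r - 2*c)*w + (r - c)*beta - c*alpha
disc = math.sqrt(b_hat**2 + 4*w*(w + alpha + beta)*c*(r - c))
S1 = (-b_hat - disc) / (2*c*(r - c))
S2 = (-b_hat + disc) / (2*c*(r - c))

K = (w + beta + S2*c) / (w + beta + S1*c)
a12 = 1.0 / (math.exp(S1*x_star) - math.exp(S2*x_star) * (S1/S2) * K)
a22 = 1.0 / (math.exp(S2*x_star) - math.exp(S1*x_star) * (S2/S1) / K)

# Boundary condition at x*: H~22(x*, w) = a12 e^{S1 x*} + a22 e^{S2 x*} = 1
bc = a12 * math.exp(S1*x_star) + a22 * math.exp(S2*x_star)

# Level-zero relation: H~12(0, w) = H~22(0, w) * beta/(beta + w)
psi1 = beta / (w + beta + S1*c)
psi2 = beta / (w + beta + S2*c)
lhs = a12 * psi1 + a22 * psi2          # H~12(0, w)
rhs = (a12 + a22) * beta / (beta + w)  # H~22(0, w) * beta/(beta + w)
print(bc, lhs - rhs)
```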
Problem 96
Consider an infinite-sized buffer with fluid input according to an on–off
source that has on and off times exponentially distributed with parameters
α and β, respectively. When the source is on fluid flows in at rate r and no
fluid flows in when off. As soon as the buffer content reaches level a, fluid is
removed from the buffer at rate c. When the buffer becomes empty, the out-
put valve is shut. It remains shut until the buffer content reaches a. In other
words the output also behaves like an alternating on–off sink. Assume that
rβ/(α + β) < c < r. Obtain LSTs of the consecutive time the buffer is drained at rate
c as well as the time the buffer takes to reach a. Also derive the expected on
and off times for the output valve or sink.
Solution
Let T1 be the time for the output channel to empty the contents in the buffer
starting with a. Also let T2 be the time for the buffer contents to rise from 0
to a.
First we obtain the distribution of T1 and then that of T2. Let O1(t) be the
CDF of the random variable T1, that is, O1(t) = P{T1 ≤ t}.
Due to the definition of T1, the source is “on” initially with a amount of fluid
in the buffer, so that T1 is the first passage time to reach zero amount of fluid
in the buffer. For that we can directly substitute for expressions in Problem
94 to obtain the results. The LST Õ1(w) is

Õ1(w) = e^{s0(w)a} (w + β + s0(w)c)/β,
where

w∗ = (2√(cαβ(r − c)) − (r − c)β − cα)/r,   (9.39)

s0(w) = (−b − √(b² + 4w(w + α + β)c(r − c))) / (2c(r − c))   (9.40)
and b = (r − 2c)w + (r − c)β − cα. Note that the LST is defined for all w ≥ w∗
where w∗ is essentially the point where s0 (w) becomes imaginary because the
term inside the square-root is negative. However, the fact that w∗ < 0 ensures
that this would not be a problem for w ≥ 0.
Now let O2(t) be the CDF of the random variable T2, that is, O2(t) = P{T2 ≤ t}.
During time T2, the output from the buffer is zero and therefore the buffer
content X(t) is nondecreasing.
For all t ∈ [0, ∞), let Z(t) = 1 denote that the source is off and Z(t) = 2 denote
that the source is on at time t. Define for i = 1, 2

Hi(x, t) = P{X(t) ≤ x, Z(t) = i}.

Also define the vector H(x, t) = [H1(x, t) H2(x, t)] and let R = diag(0, r) be the diagonal matrix of input rates (no fluid leaves the buffer during T2). Then H(x, t) satisfies the
following partial differential equation:

∂H(x, t)/∂t + (∂H(x, t)/∂x) R = H(x, t)Q.   (9.42)
Now, taking the Laplace transform of Equation 9.42 with respect to t, we get

sH∗(x, s) − H(x, 0) + (∂H∗(x, s)/∂x) R = H∗(x, s)Q.   (9.43)

Since X(0) = 0 and Z(0) = 1, we have H(x, 0) = [1 0] for x ≥ 0, and hence

sH∗(x, s) − [1 0] + (∂H∗(x, s)/∂x) R = H∗(x, s)Q.
Taking the LST of this equation with respect to x (with parameter w), we get

sH̃∗(w, s) − [1 0] + wH̃∗(w, s)R − wH∗(0, s)R = H̃∗(w, s)Q.
Since P{X(t) ≤ 0, Z(t) = 2} = 0, we have H∗(0, s) = [H1∗(0, s) 0] and, therefore,
wH∗(0, s)R = [0 0]. Hence this equation reduces to

sH̃∗(w, s) − [1 0] + wH̃∗(w, s)R = H̃∗(w, s)Q.
Plugging in for R and Q, and taking the inverse of the matrix, yields

H̃∗(w, s) = [ s + wr + α   β ] / (wr(s + β) + αs + βs + s²).   (9.44)
However,

Õ2(s) = (β/(β + s)) e^{−a(αs + βs + s²)/(rs + rβ)}.   (9.45)
Differentiating the LSTs at zero, we get

E[T1] = (r + a(α + β)) / (cα + cβ − rβ),   (9.46)

E[T2] = (r + a(α + β)) / (rβ).   (9.47)
Notice that the ratio E[T1]/E[T2] is independent of a. This indicates that
no matter what a is, the ratio of time spent by the sink in on and off times
remains the same. Also, E[T1]/E[T1 + T2] = rβ/(c(α + β)). This is not surprising because in
every on–off cycle of the sink an average of E[T1 + T2] rβ/(α + β) fluid enters the
buffer, all of which exits the buffer during a time whose mean is E[T1]; hence
the average amount of fluid exiting the buffer in one cycle is E[T1]c. Hence we
have E[T1]/E[T1 + T2] = rβ/(c(α + β)).
Problem 97
Consider an exponential on–off source with on times according to exp(α)
and off times according to exp(β). When the source is on, fluid enters at rate
r into an infinite-sized buffer and no fluid flows into the buffer when the
source is off. Fluid is drained from the buffer using two rates according to a
threshold policy. When the amount of fluid in the buffer is less than x∗ the
output capacity is c1 , whereas if there is more than x∗ fluid, it is removed at
rate c1 + c2. For such a buffer, derive an expression for

lim_{t→∞} P{X(t) > B̂}

where 0 < x∗ < B̂ < ∞. Assume that r > c1 + c2 > c1 > rβ/(α + β).
Solution
The aim is to compute the limiting distribution (as t → ∞) of X(t), the
amount of fluid in the buffer at time t. Note that the output capacity is c1
when X(t) < x∗ and it is c1 + c2 when X(t) ≥ x∗ . The output capacity of the
system can be modeled as an alternating renewal process that stays at c1 for
a random time T1 and switches to c1 + c2 for a random time T2 before mov-
ing back to c1 . Essentially T1 is the first passage time for the buffer content
to reach x∗ (from below) given that initially the source is off and there is
x∗ amount of fluid. During the entire time T1 , the output capacity is c1 and
hence we can directly use the results from Problem 95. Likewise T2 is the
first passage time for the amount of fluid to reach x∗ (from above) given that
initially there is x∗ amount of fluid and the source is on. Also, during the
time T2 , the output capacity is c1 + c2 , which enables us to use results from
Problem 94.
To compute the limiting probability (as t → ∞) that X(t) is greater than
B̂ we condition on the region above or below x∗ to obtain
lim_{t→∞} P{X(t) > x∗} = E[T2] / (E[T1] + E[T2])   (9.49)
since the X(t) process being above x∗ is stochastically equivalent to the X̂(t)
process being above 0. We can immediately write down
P{X̂(t) > B̂ − x∗ | X̂(t) > 0} = P{X̂(t) > B̂ − x∗, X̂(t) > 0} / P{X̂(t) > 0}.

Using the limiting distribution

lim_{t→∞} P{X̂(t) > x} = (βr/((c1 + c2)(α + β))) e^{λ1 x},

we get

lim_{t→∞} P{X̂(t) > B̂ − x∗} / P{X̂(t) > 0} = e^{λ1(B̂ − x∗)},
where

S1(w) = (−b̂ − √(b̂² + 4w(w + α + β)c1(r − c1))) / (2c1(r − c1)),

S2(w) = (−b̂ + √(b̂² + 4w(w + α + β)c1(r − c1))) / (2c1(r − c1)),

with b̂ = (r − 2c1)w + (r − c1)β − c1α. Then

E[T1] = −(d/dw) E[e^{−wT1}] |w=0 = 1/β + (c1(α + β)/(β(c1(α + β) − rβ))) (e^{S2(0)x∗} − 1),   (9.51)

where S2(0) = (c1α − β(r − c1))/(c1(r − c1)) > 0.
Now we derive E[T2 ]. Recall that T2 is the first passage time to reach
fluid level x∗ from above given that initially there is x∗ amount of fluid and
the source is on. During this time the output capacity is c1 + c2 . Notice that
this first passage time is the same as the busy period of a buffer with CTMC
on–off source input and output capacity c1 + c2 . Thus using the LST of the
busy period distribution described toward the end of the solution to Problem
94, we can see that
E[e^{−wT2}] = (w + β + S0(w)(c1 + c2))/β,

where

S0(w) = (−b − √(b² + 4w(w + α + β)(c1 + c2)(r − c1 − c2))) / (2(c1 + c2)(r − c1 − c2))

with b = (r − 2(c1 + c2))w + (r − c1 − c2)β − (c1 + c2)α. Thus

E[T2] = −(d/dw) E[e^{−wT2}] |w=0 = −1/β + (c1 + c2)(α + β)/(β((c1 + c2)(α + β) − rβ)).   (9.52)
c1 α−β(r−c1 )
where S2 (0) = c1 (r−c1 ) and λ1 = β/(c1 + c2 ) − α/(r − c1 − c2 ).
In the next example, we will consider a few aspects that will give a flavor
for the analysis in the next chapter. In particular, we will consider: (i) multi-
ple sources that superpose traffic into a buffer, (ii) a network situation where
the departure from one node acts as input to another node, and (iii) a case of
non-CTMC-based environment process.
Problem 98
Consider two infinite-sized buffers in tandem as shown in Figure 9.7. Input
to the first buffer is from N independent and identical exponential on–off
sources with on time parameter α, off time parameter β and rate r. The
output from the first buffer is directly fed into the second buffer. The out-
put capacities of the first and second buffers are c1 and c2 , respectively.
What is the stability condition? Assuming that is satisfied, characterize the
environment process governing input to the second buffer.
Solution
Let Z1 (t) be the number of sources that are in the “on” state at time t. Clearly
{Z1 (t), t ≥ 0} is a CTMC with N + 1 states. When Z1 (t) = i the input rate is ir.
For notational convenience we assume that c1 is not an integral multiple of
r. Thus every state in the CTMC {Z1 (t), t ≥ 0} has strictly positive or strictly
negative drifts. Let
ℓ = ⌈c1 / r⌉.
Thus whenever Z1(t) ∈ {0, . . . , ℓ − 1}, the drift is negative, that is, the
first buffer's contents would be nonincreasing. Likewise, whenever
Z1(t) ∈ {ℓ, . . . , N}, the drift is positive, that is, the first buffer's contents would
be increasing. The first buffer is stable if the average fluid arrival rate in
steady state is less than the service capacity, that is, Nrβ/(α + β) < c1. If the
first buffer is stable then the steady-state average departure rate from that
buffer is also Nrβ/(α + β). Thus the second buffer is also stable if Nrβ/(α + β) < c2.
Although not needed for this problem's analysis, note that unless c2 < c1, the
second buffer would never accumulate any fluid. Hence we assume

Nrβ/(α + β) < c2 < c1.

FIGURE 9.7
Tandem buffers with multiple identical sources. (From Gautam, N. et al., Prob. Eng. Inform. Sci.,
13, 429, 1999. With permission.)

Stochastic Fluid-Flow Queues: Characteristics and Exact Analysis 581
The departures from the first buffer are modulated by a process {Z2(t), t ≥ 0},
a semi-Markov process (SMP) on {0, 1, . . . , ℓ} with kernel G(t) = [Gij(t)],
which is to be derived. By definition, Gij(t) is the joint probability that, given
that the current state is i, the next state is j and the sojourn time in state i is
less than t. For i = 0, 1, . . . , ℓ − 1 and j = 0, 1, . . . , ℓ, it is relatively
straightforward to obtain Gij(t) as follows (since the sojourn times in state i
are exponentially distributed):
Gij(t) = (iα/(iα + (N − i)β)) [1 − exp{−(iα + (N − i)β)t}]        if j = i − 1,
Gij(t) = ((N − i)β/(iα + (N − i)β)) [1 − exp{−(iα + (N − i)β)t}]  if j = i + 1,
Gij(t) = 0                                                        otherwise.
The only tricky part in the kernel is to describe Gℓj(t). For that, we define
a first passage time in the {X1(t), t ≥ 0} process. We now derive an expression
for G̃ℓj(w), the LST of Gℓj(t). Note that Gℓℓ(t) = 0.
Let
Q =
[ −Nβ        Nβ            0        · · ·        0              0   ]
[   α    −α − (N − 1)β  (N − 1)β     0      · · ·               0   ]
[   ·         ·            ·         ·           ·              ·   ]
[   0         0         · · ·   (N − 1)α   −(N − 1)α − β        β   ]
[   0         0         · · ·       0           Nα            −Nα   ]
and
Let sk(w) and χk(w) be the kth eigenvalue and corresponding eigenvector,
respectively, of R−1(wI − Q). As we described earlier, there are ℓ states
with negative drift. Without loss of generality we let s0, s1, . . . , sℓ−1 be the
negative eigenvalues and χ0, χ1, . . . , χℓ−1 be the corresponding eigenvectors,
written in that form suppressing that they are functions of w for compact
notation. Define

H̃j(x, w) = Σ_{k=0}^{ℓ−1} akj e^{sk x} χk.
In other words,

G̃ℓj(w) = [A χ∗]j,

where, in matrix notation,

A χ = I,

with χ = [χ0 χ1 · · · χℓ−1]
whose LST we describe above. When the environment is in state Z2 (t) at time
t, the fluid enters the second buffer at rate min(Z2 (t)r, c1 ).
We can also compute the mean sojourn time τi in state i, for i = 0, 1, . . . , ℓ, as

τi = 1/(iα + (N − i)β)                           if i = 0, 1, . . . , ℓ − 1,
τℓ = −Σ_{j=0}^{ℓ−1} (d/dw) G̃ℓj(w) |_{w=0}        if i = ℓ,

so that

pi = lim_{t→∞} P{Z2(t) = i} = (ai τi) / (Σ_{k=0}^{ℓ} ak τk),     (9.53)

where a = [a0 a1 . . . aℓ] solves

a = a G(∞) = a G̃(0).
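The modulating CTMC {Z1(t), t ≥ 0} above is straightforward to build numerically. The following Python sketch (all parameter values are made up for illustration, not taken from the text) constructs its generator, computes the stationary distribution, and checks the stability ordering Nrβ/(α + β) < c2 < c1.

```python
import numpy as np

def onoff_superposition_generator(N, alpha, beta):
    """Generator of Z1(t): number of 'on' sources among N iid exponential
    on-off sources (on -> off at rate alpha, off -> on at rate beta)."""
    Q = np.zeros((N + 1, N + 1))
    for i in range(N + 1):
        if i < N:
            Q[i, i + 1] = (N - i) * beta  # one more source turns on
        if i > 0:
            Q[i, i - 1] = i * alpha       # one source turns off
        Q[i, i] = -Q[i].sum()
    return Q

# all numbers below are illustrative
N, alpha, beta, r, c1, c2 = 4, 1.0, 0.5, 3.0, 8.5, 7.0
Q = onoff_superposition_generator(N, alpha, beta)

# stationary distribution: solve pi Q = 0 with pi summing to one
A = np.vstack([Q.T, np.ones(N + 1)])
b = np.append(np.zeros(N + 1), 1.0)
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

mean_rate = float(sum(i * r * pi[i] for i in range(N + 1)))  # N r beta/(alpha+beta)
stable = mean_rate < c2 < c1  # stability ordering assumed in the text
```

The stationary distribution is binomial with on-probability β/(α + β), so the mean input rate reduces to Nrβ/(α + β), matching the stability condition derived above.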
Reference Notes
Stochastic fluid flow models or fluid queues have been around for almost
three decades but have not received the attention that the deterministic fluid
queues have received from researchers. In fact this may be the first textbook
that includes two chapters on fluid queues. Pioneering work on fluid queues
was done by Debasis Mitra and colleagues. In particular, the seminal article
by Anick, Mitra, and Sondhi [5] is a must read for anyone interested in the
area of fluid queues. At the end of the next chapter, there is a more exten-
sive reference to the fluid queue literature. This chapter has been mainly
an introductory one and we briefly describe the references relevant to its
development.
The single buffer fluid model setting, notation, and characterization
described in Sections 9.1.3, 9.1.4, and 9.2.1 are adapted from Kulkarni [69].
The main buffer content analysis for CTMCs in Sections 9.2.2 and 9.2.3 first
appeared in Anick, Mitra, and Sondhi [5] and the version in this chapter has
been mostly derived from Vidhyadhar Kulkarni’s course notes. Section 9.3
on first passage time analysis with examples is from a collection of papers
including Narayanan and Kulkarni [84], Aggarwal et al. [3], Gautam et al.
[39], Kulkarni and Gautam [70], and Mahabhashyam et al. [76].
There are several extensions to the setting considered in this section.
Some of these we will see in the next chapter. It is worthwhile mentioning
others that we will not see here. In particular, Kulkarni and Rolski
[66] extend the fluid model analysis to continuous state space processes
such as the Ornstein–Uhlenbeck process. Krishnan et al. [65] consider
fractional Brownian motion driving input to a buffer. Kella [58] uses Lévy
process inputs to analyze non-product-form stochastic fluid networks. From
a methodological standpoint other techniques are possible. For example,
Ahn and Ramaswami [4] consider matrix-analytic methods for transient
analysis, steady-state analysis, and first passage times of both finite- and
infinite-sized buffers.
Exercises
9.1 Consider an infinite-sized buffer into which fluid entry is modulated
by a six-state CTMC {Z(t), t ≥ 0} with S = {1, 2, 3, 4, 5, 6} and
Q =
[ −1   1   0   0   0   0 ]
[  1  −2   1   0   0   0 ]
[  0   1  −2   1   0   0 ]
[  0   0   1  −2   1   0 ]
[  0   0   0   1  −2   1 ]
[  0   0   0   0   1  −1 ] .
The constant output capacity is c = 18 kbps and the fluid arrival rates
in states 1, 2, 3, 4, 5, and 6 are 30, 25, 20, 15, 10, and 5 kbps, respec-
tively. Obtain the joint probability that in steady state there is more
than 20 kb of fluid in the buffer and the CTMC is in state 1. Also
write down an expression for limt→∞ P{X(t) ≤ x}.
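As a hedged first step toward this exercise (stability verification only, not the full buffer content analysis of Section 9.2.3), one can compute the stationary distribution of the modulating CTMC and the mean input rate in Python:

```python
import numpy as np

# generator and state-dependent input rates (kbps) from Exercise 9.1
Q = np.array([
    [-1.0,  1.0,  0.0,  0.0,  0.0,  0.0],
    [ 1.0, -2.0,  1.0,  0.0,  0.0,  0.0],
    [ 0.0,  1.0, -2.0,  1.0,  0.0,  0.0],
    [ 0.0,  0.0,  1.0, -2.0,  1.0,  0.0],
    [ 0.0,  0.0,  0.0,  1.0, -2.0,  1.0],
    [ 0.0,  0.0,  0.0,  0.0,  1.0, -1.0]])
rates = np.array([30.0, 25.0, 20.0, 15.0, 10.0, 5.0])
c = 18.0

# stationary distribution: p Q = 0 and p sums to one
A = np.vstack([Q.T, np.ones(6)])
b = np.append(np.zeros(6), 1.0)
p, *_ = np.linalg.lstsq(A, b, rcond=None)

mean_rate = float(rates @ p)  # must be below c for stability
drifts = rates - c            # sign pattern used by the exact analysis
```

Because the chain is a symmetric birth–death process, the stationary distribution is uniform and the mean input rate is 17.5 kbps, just below the 18 kbps capacity, so the spectral solution of Section 9.2.3 applies.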
9.2 Consider a finite-sized buffer whose input is modulated by a five-
state CTMC {Z(t), t ≥ 0} with S = {1, 2, 3, 4, 5} and
Q =
[ −3   2   1   0   0 ]
[  2  −5   2   1   0 ]
[  1   2  −6   2   1 ]
[  0   0   2  −4   2 ]
[  0   0   1   2  −3 ] .
The output capacity for the buffer is c = 15 kbps and the fluid arrival
rates in states 1, 2, 3, 4, and 5 are 20, 16, 12, 8, and 4 kbps, respec-
tively. Consider three values of buffer size B in kb, namely, B = 2,
B = 4, and B = 6. For the three cases obtain the probability there is
more than 1 kb of fluid in the buffer in steady state. Also obtain the
fraction of fluid that is lost because of a full buffer in all three cases.
The output capacity for the buffer is c = 4 kbps and the fluid arrival
rate is (7 − i) kbps in state i for all i ∈ S. Derive an expression for
limt→∞ P{X(t) ≤ x} for x ≥ 0.
9.4 “Leaky Bucket” is a control mechanism for admitting data into a
network. It consists of a data buffer and a token pool, as shown in
Figure 9.8. Tokens in the form of fluid are generated continuously
at a fixed rate γ into the token pool of size BT . The new tokens
are discarded if the token pool is full. External data traffic enters
the infinite-sized data buffer in fluid form from a source modulated
by an environment process {Z(t), t ≥ 0}, which is an ℓ-state CTMC.
Data traffic is generated at rate r(Z(t)) at time t. If there are
tokens in the token pool, the incoming fluid takes an equal amount
of tokens and enters the network. If the token pool is empty then
the fluid waits in the infinite-sized data buffer for tokens to arrive.
Let X(t) be the amount of fluid in the data buffer at time t and
Y(t) the amount of tokens in the token buffer at time t. Assume
that at time t = 0 the token buffer and data buffer are both empty,
that is, X(0) = Y(0) = 0. Draw a sample path of Z(t), X(t), Y(t),
and the output rate R(t). What is the stability condition? Assuming
stability, using the results in this chapter, derive an expression
FIGURE 9.8
Single leaky bucket. (From Gautam, N., Telecommun. Syst., 21(1), 35, 2002. With permission.)
The output capacity for the buffer is c = 4 kbps and the fluid arrival
rate is (7 − i) kbps in state i for all i ∈ S. Initially there is 0.1 kb of
fluid and the environment is in state 3. Define the first passage time
as the time to reach a buffer level of 0.2 or 0 kb, whichever happens
first. Obtain the LST of the first passage time and derive the mean
first passage time.
9.7 Consider a wireless sensor node that acts as a source that generates
fluid at a constant rate r. This fluid flows into a buffer of infinite size.
The buffer is emptied by a channel that toggles between capacity c
and 0, staying for exp(γ) and exp(δ) amounts of time, respectively. Assume that c > r
and that the system is stable. Derive an expression for the steady-
state distribution of buffer contents. Also, characterize the output
rate process from the queue (notice that the output rate is 0, r, or c).
FIGURE 10.1
Single buffer with a single environment process and output capacity c. (From Gautam, N., Quality of service metrics. In: Frontiers in Distributed Sensor Networks, S.S. Iyengar and R.R. Brooks
(eds.), Chapman & Hall/CRC Press, Boca Raton, FL, pp. 613–628, 2004.)
A(t) = ∫₀ᵗ r(Z(u)) du.
This uses the fact that A(t) is the cumulative workload generated from time
0 to t, and r(Z(u)) is the instantaneous rate at which workload is gener-
ated. Thus, from the first principles of integration we have the expression
Stochastic Fluid-Flow Queues: Bounds and Tail Asymptotics 591
h(v) = lim_{t→∞} (1/t) log E{exp(vA(t))},

where
rmean = E(r(Z(∞))) is the mean traffic flow rate,
rpeak = supz{r(z)} is the peak traffic flow rate, and
h′(v) denotes the derivative of h(v) with respect to v.

The effective bandwidth is

eb(v) = lim_{t→∞} (1/(vt)) log E{exp(vA(t))} = h(v)/v.
Thus, the ALMGF and the effective bandwidth are related much like the
LST and the Laplace transform. There are benefits of both

FIGURE 10.2
Graph of h(v) versus v.
and thus we will continue using both. Using the definition of eb(v), it can be
shown that eb(v) is an increasing function of v with

lim_{v→0} eb(v) = rmean.

Also,

lim_{v→∞} eb(v) = rpeak.
These properties are depicted in Figure 10.3. Although we will see their
implications subsequently, it is worthwhile explaining the properties related
to rmean and rpeak , which are the mean and peak input rates, respectively.
Essentially, eb(v) summarizes the workload generation rate process. Two
obvious summaries are rmean and rpeak that correspond to the average case
and worst-case scenarios. The eb(v) parameter captures those as well as
everything in between. This would become more apparent when we con-
sider the workload flowing into an infinite-sized buffer (with X denoting the
steady-state buffer content level assuming it exists). If the output capacity
of this buffer is eb(v), then as x → ∞, P{X > x} ≈ e^{−vx}. Naturally, the output
capacity of the buffer must be greater than rmean to ensure stability and it
must be less than rpeak to have any fluid buildup. We expect the probabil-
ity of the buffer content being greater than x to be higher when the output
capacity is closer to rmean than when it is closer to rpeak . That intuition can be
verified.
FIGURE 10.3
Graph of eb(v) versus v.
Having defined the ALMGF and the effective bandwidth (as well as briefly
alluded to how we will use them for describing the steady-state buffer content
distribution), the next question is whether, given the environment process
{Z(t), t ≥ 0} and the workload rates r(Z(t)) for all t, we can obtain expressions
for h(v) and eb(v). That will be the focus of the following section.
Next, through Problem 99, we show how to derive the previous
expression.
Problem 99
Derive the expression for h(v) in Equation 10.1 using the definition of h(v)
for a CTMC environment process {Z(t), t ≥ 0} with state space S, generator
matrix Q = [qij ], and fluid rate matrix R.
Solution
From the definition of h(v) given by

h(v) = lim_{t→∞} (1/t) log E[e^{vA(t)}],
we consider

gi(t) = E[e^{vA(t)} | Z(0) = i]

for some i ∈ S. We can immediately write down the following for some
infinitesimally small positive h:

gi(t + h) = E[e^{vA(t+h)} | Z(0) = i]
          = Σ_{j∈S} E[e^{vA(t+h)} | Z(h) = j, Z(0) = i] P{Z(h) = j | Z(0) = i}
          = Σ_{j∈S} e^{vr(i)h} E[e^{vA(t)} | Z(0) = j] qij h + e^{vr(i)h} E[e^{vA(t)} | Z(0) = i] + o(h),
where o(h) are terms of the order higher than h such that o(h)/h → 0
as h → 0. Before proceeding, we explain the last equation. First of all,
P{Z(h) = j|Z(0) = i} = qij h + o(h) if i ≠ j and P{Z(h) = i|Z(0) = i} = 1 + qii h + o(h)
using standard CTMC transient analysis results. Also, from time 0 to h when
the CTMC is in state i, r(i)h amount of fluid is generated. Thus, A(t + h) is
stochastically identical to A(t) + r(i)h assuming that at time h the environ-
ment process toggles from i to j. Using that we can rewrite gi (t + h) in the
previous equation as
gi(t + h) = Σ_{j∈S} e^{vr(i)h} gj(t) qij h + e^{vr(i)h} gi(t) + o(h).
Subtracting gi (t) from both sides of the equation, dividing by h, and letting
h → 0, we get
dgi(t)/dt = gi(t) v r(i) + Σ_{j∈S} gj(t) qij,
using the fact that evr(i)h = 1 + vr(i)h + o(h). We can write the preceding
differential equation in vector form as
dg(t)/dt = [Rv + Q] g(t),
where g(t) is a |S| × 1 column vector of gi (t) values. The equation is similar
to several differential equations derived in Chapter 9. From there we can see
g(t) = Σ_{j∈S} aj e^{λj t} ψj,

where
the aj values are scalar constants, and
λj and ψj are the jth eigenvalue and corresponding right eigenvector,
respectively, of (Rv + Q).

Letting θ = e(Rv + Q) denote the eigenvalue of (Rv + Q) with the largest real
part, we can rewrite this as

g(t) = e^{θt} Σ_{j∈S} aj e^{(λj − θ)t} ψj.
Also,

h(v) = lim_{t→∞} (1/t) log[π0 g(t)].

Using the expression for g(t), we can immediately write down the following:

h(v) = lim_{t→∞} (1/t) log[ e^{θt} π0 Σ_{j∈S} aj e^{(λj − θ)t} ψj ]
     = lim_{t→∞} (1/t) { log e^{θt} + log[ π0 Σ_{j∈S} aj e^{(λj − θ)t} ψj ] }
     = θ + lim_{t→∞} (1/t) log[ π0 Σ_{j∈S} aj e^{(λj − θ)t} ψj ]
     = θ + 0.
For the last expression, we do need some j for which λj = θ so that the
summation itself does not go to zero or infinity as t → ∞. Since θ = e(Rv + Q),
we have h(v) = e(Rv + Q).
Problem 100
Consider an on-off source with on-times according to exp(α) and off-times
according to exp(β). Traffic is generated at rate r when the source is in the
on-state and no traffic is generated when the source is in the off-state. Obtain
a closed-form algebraic expression for eb(v) for such a source.
Solution
The environment process {Z(t), t ≥ 0} is a two-state CTMC, where the first
state corresponds to the source being off and the second state corresponds to
the source being on. Therefore,
R =
[ 0   0 ]
[ 0   r ]      and      Q =
[ −β    β ]
[  α   −α ] .

To obtain h(v) = e(Rv + Q), consider

M = Rv + Q =
[ −β      β    ]
[  α   rv − α  ] .
The eigenvalues of M solve

|M − λI| = 0,

which yields

(rv − α − λ)(−β − λ) − βα = 0,

that is,

λ² + (β + α − rv)λ − βrv = 0.

Taking the larger root and dividing by v, we get

eb(v) = (rv − α − β + √((rv − α − β)² + 4βrv)) / (2v).     (10.2)

Note that here

rmean = rβ/(α + β)      and      rpeak = r.
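As a numerical cross-check, the closed form (10.2) should agree with the spectral characterization h(v) = e(Rv + Q) from Problem 99; the sketch below compares the two at a few arbitrary illustrative parameter values.

```python
import numpy as np

def eb_closed_form(v, alpha, beta, r):
    """Equation (10.2) for an exponential on-off source."""
    a = r * v - alpha - beta
    return (a + np.sqrt(a * a + 4.0 * beta * r * v)) / (2.0 * v)

def eb_spectral(v, alpha, beta, r):
    """eb(v) = e(Rv + Q)/v: largest real eigenvalue of Rv + Q, divided by v."""
    Q = np.array([[-beta, beta], [alpha, -alpha]])
    R = np.diag([0.0, r])
    return float(max(np.linalg.eigvals(R * v + Q).real)) / v

alpha, beta, r = 1.0, 2.0, 5.0  # illustrative values
vals = [(eb_closed_form(v, alpha, beta, r), eb_spectral(v, alpha, beta, r))
        for v in (0.1, 0.5, 1.0, 2.0)]
```

Both computations produce an increasing function of v that stays between rmean = rβ/(α + β) and rpeak = r, as the properties in Figure 10.3 require.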
Problem 101
Water flows into a reservoir according to a CTMC {Z(t), t ≥ 0} with
S = {1, 2, 3, 4, 5} and
Q =
[ −1     0.4    0.3    0.2    0.1 ]
[  0.4  −0.7    0.1    0.1    0.1 ]
[  0.5   0.4   −1.1    0.1    0.1 ]
[  0.2   0.3    0.3   −1.0    0.2 ]
[  0.3   0.3    0.3    0.3   −1.2 ] .
When Z(t) = i, the inflow rate is 4i. Graph eb(v) versus v for the water flow.
Solution
For the CTMC source we have
R =
[ 4   0    0    0    0 ]
[ 0   8    0    0    0 ]
[ 0   0   12    0    0 ]
[ 0   0    0   16    0 ]
[ 0   0    0    0   20 ] .
Also rpeak = 20 and rmean = Σ_{i=1}^{5} 4i pi = 9.6672 since the solution to
[p1 p2 p3 p4 p5] Q = [0 0 0 0 0] and p1 + p2 + p3 + p4 + p5 = 1 is
[p1 p2 p3 p4 p5] = [0.2725 0.3438 0.1652 0.1315 0.0870]. Using eb(v) = e(Q/v + R),
we plot eb(v) versus v in Figure 10.4.
FIGURE 10.4
Graph of eb(v) versus v for Problem 101.
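The computations in Problem 101 can be reproduced in a few lines of Python; the sketch below recomputes the stationary distribution, rmean, and sample points of the eb(v) curve (plotting is omitted).

```python
import numpy as np

Q = np.array([
    [-1.0,  0.4,  0.3,  0.2,  0.1],
    [ 0.4, -0.7,  0.1,  0.1,  0.1],
    [ 0.5,  0.4, -1.1,  0.1,  0.1],
    [ 0.2,  0.3,  0.3, -1.0,  0.2],
    [ 0.3,  0.3,  0.3,  0.3, -1.2]])
R = np.diag([4.0, 8.0, 12.0, 16.0, 20.0])

def eb(v):
    """eb(v) = e(Q/v + R): largest real eigenvalue of Q/v + R."""
    return float(max(np.linalg.eigvals(Q / v + R).real))

# stationary distribution, mean and peak rates quoted in the solution
A = np.vstack([Q.T, np.ones(5)])
p, *_ = np.linalg.lstsq(A, np.append(np.zeros(5), 1.0), rcond=None)
r_mean = float(np.diag(R) @ p)   # approximately 9.667
r_peak = 20.0
curve = [eb(v) for v in (0.25, 0.5, 1.0, 2.0)]
```

The sampled curve is increasing and stays between rmean and rpeak, consistent with the plot described in Figure 10.4.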
G̃ij(w) = ∫₀^∞ e^{−wx} dGij(x),
then we have
and
FIGURE 10.5
(a) e(Λ(u, v)) versus u and (b) e(Λ(u, v)) versus u.
Problem 102
Fluid flows into a buffer according to a three-state SMP {Z(t), t ≥ 0} with
state space {1, 2, 3}. The elements of the kernel of this SMP are given as
follows: G12 (t) = 1 − e−t − te−t , G21 (t) = 0.4(1 − e−0.5t ) + 0.3(1 − e−0.2t ),
G23 (t) = 0.2(1 − e−0.5t ) + 0.1(1 − e−0.2t ), G32 (t) = 1 − 2e−t + e−2t , and
G11 (t) = G13 (t) = G22 (t) = G31 (t) = G33 (t) = 0. Also, the flow rates in the three
states are r(i) = i for i = 1, 2, 3. Graph eb(v) versus v for v ∈ [0, 3].
Solution
For the SMP source we have the LST of the kernel as
G̃(w) =
[ 0                                1/(w + 1)²          0                              ]
[ 0.2/(0.5 + w) + 0.06/(0.2 + w)   0                   0.1/(0.5 + w) + 0.02/(0.2 + w) ]
[ 0                                2/((2 + w)(1 + w))  0                              ] .
Now, using the elements of the LST of the kernel, we can easily write down,
for i = 1, 2, 3 and j = 1, 2, 3,

Λij(u, v) = G̃ij(u − r(i)v).

Notice that the LSTs are such that G̃ij(w) would shoot off to infinity only
if their denominators (if any) become zero. However, the shooting off to
infinity would not be sudden but gradual. Hence, for all v, e∗(v) ≥ 1 (in fact
e∗(v) = ∞ and u∗(v) = max{r(1)v − 1, r(2)v − 0.2, r(3)v − 1, 0} because, for
example, at u = r(1)v − 1, the denominator of G̃12(w) goes to zero, and so on
for the other LSTs). Thus, to compute eb(v), all we need is the unique solution
to e(Λ(v eb(v), v)) = 1.
Next, we explain how to numerically obtain the unique solution. Essentially,
for a given v, e(Λ(u, v)) decreases with respect to u from u∗(v) to
infinity. Using that and the bounds on eb(v), that is, rmean ≤ eb(v) ≤ rpeak for all
v, we can perform a binary search for eb(v) between max{rmean, u∗(v)/v} and
rpeak to find the unique solution to e(Λ(v eb(v), v)) = 1. Notice that rpeak = r(3) = 3
and rmean = Σ_{i=1}^{3} i pi = 1.8119, since the stationary distribution of the SMP
can be computed as [p1 p2 p3] = (1/(π1τ1 + π2τ2 + π3τ3)) [π1τ1 π2τ2 π3τ3]
= (1/(0.35 × 2 + 0.5 × 3.2 + 0.15 × 1.5)) [0.35 × 2  0.5 × 3.2  0.15 × 1.5] = [0.2772
0.6337 0.0891].
FIGURE 10.6
Graph of eb(v) versus v for Problem 102.
Using eb(v) as the solution to e(Λ(v eb(v), v)) = 1, we plot eb(v) versus v for
v ∈ [0, 3] in Figure 10.6. The ALMGF h(v) can be obtained as v eb(v).
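The binary search described above can be sketched in Python as follows; the entry mapping Λij(u, v) = G̃ij(u − r(i)v) is the SMP formulation assumed in this section, and 200 bisection steps are an arbitrary choice.

```python
import numpy as np

# LSTs of the kernel entries in Problem 102
def G12(w): return 1.0 / (w + 1.0) ** 2
def G21(w): return 0.2 / (0.5 + w) + 0.06 / (0.2 + w)
def G23(w): return 0.1 / (0.5 + w) + 0.02 / (0.2 + w)
def G32(w): return 2.0 / ((2.0 + w) * (1.0 + w))

r_mean, r_peak = 1.8119, 3.0

def spectral_radius(u, v):
    """Perron root of Lambda(u, v) with entries G~_ij(u - r(i) v), r(i) = i."""
    L = np.zeros((3, 3))
    L[0, 1] = G12(u - 1.0 * v)
    L[1, 0] = G21(u - 2.0 * v)
    L[1, 2] = G23(u - 2.0 * v)
    L[2, 1] = G32(u - 3.0 * v)
    return float(max(abs(np.linalg.eigvals(L))))

def eb(v):
    """Binary search for eb(v): root of e(Lambda(v eb(v), v)) = 1."""
    u_star = max(1.0 * v - 1.0, 2.0 * v - 0.2, 3.0 * v - 1.0, 0.0)
    lo, hi = max(r_mean, u_star / v), r_peak
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if spectral_radius(mid * v, v) > 1.0:  # radius decreases in u
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

eb1 = eb(1.0)
```

The bisection never evaluates its left endpoint, so the pole at u = u∗(v) causes no trouble; every candidate u stays strictly beyond the poles, keeping all kernel entries positive.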
Problem 103
Consider an on-off source that generates fluid so that the on-times are IID
random variables with CDF U(t) = 1 − 0.6e−3t − 0.4e−2t and the off-times are
IID Erlang random variables with mean 0.5 and variance 1/12 in appropriate
time units compatible with the on-times. When the source is on, fluid is gen-
erated at the rate of 16 units per second; and no fluid is generated when the
source is off. This is identical to the source described in Problem 88. Graph
h(v) versus v for v ∈ [0, 1].
Solution
For the on-off source, we have the LST of the on-times as

E[e^{−wU}] = ∫₀^∞ e^{−wt} dU(t) = 1.8/(3 + w) + 0.8/(2 + w).

Similarly, the LST of the off-times is

E[e^{−wD}] = ∫₀^∞ e^{−wt} dD(t) = (6/(6 + w))³

since the CDF of the off-times is D(t) = 1 − e^{−6t} − 6te^{−6t} − 18t²e^{−6t} for all t ≥ 0.
Using these we can write down

Λ(u, v) = E[e^{(−u+rv)U}] E[e^{−uD}] = (1.8/(3 − rv + u) + 0.8/(2 − rv + u)) (6/(6 + u))³

with r = 16.
Notice that the LSTs are such that Λ(u, v) would shoot off to infinity
only if the denominator becomes zero. Also, the shooting off to infinity
would not be abrupt but gradual. Hence, for all v, e∗(v) ≥ 1 (in fact e∗(v) = ∞
and u∗(v) = max{rv − 2, 0} because for all u > rv − 2, the denominator of
Λ(u, v) is nonzero). Thus, to compute h(v) all we need is the unique solution
to Λ(h(v), v) = 1. To numerically obtain the unique solution, note that
for a given v, Λ(u, v) decreases with respect to u from u∗(v) to infinity.
Using that and the bounds on eb(v), that is, rmean ≤ eb(v) ≤ rpeak for all v, we
can perform a binary search for h(v) between max{v rmean, u∗(v)} and v rpeak
to find the unique solution to Λ(h(v), v) = 1. Here we have rpeak = r = 16
and rmean = rE[U]/(E[U] + E[D]) = 7.1111. Using h(v) as the solution to
Λ(h(v), v) = 1, we plot h(v) versus v for v ∈ [0, 1] in Figure 10.7.
FIGURE 10.7
Graph of h(v) versus v for Problem 103.
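The bisection for h(v) is equally short in code; the sketch below reproduces the two values h(0.5) = 6.1205 and h(1) = 14.0226 that are also quoted in Problem 104.

```python
def lst_on(w):
    """LST of the on-time CDF U(t) = 1 - 0.6 exp(-3t) - 0.4 exp(-2t)."""
    return 1.8 / (3.0 + w) + 0.8 / (2.0 + w)

def lst_off(w):
    """LST of the Erlang(3, 6) off-time distribution."""
    return (6.0 / (6.0 + w)) ** 3

r = 16.0

def Lam(u, v):
    """Lambda(u, v) for this on-off source."""
    return lst_on(u - r * v) * lst_off(u)

def h(v):
    """Bisection for h(v): the unique root of Lambda(h(v), v) = 1."""
    r_mean = r * 0.4 / 0.9              # r E[U]/(E[U] + E[D]) = 7.1111
    lo = max(v * r_mean, r * v - 2.0)   # at or above u*(v); never evaluated
    hi = v * r                          # v * r_peak
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if Lam(mid, v) > 1.0:           # Lambda decreases in u
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Because Λ(u, v) is strictly decreasing in u beyond u∗(v), the bisection converges to the unique root at machine precision within a few dozen iterations.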
Problem 104
Consider Problem 103 and obtain h(v) for v = 0.5 and v = 1 using the CTMC
source results.
Solution
We follow the analysis outlined in the solution to Problem 88. Notice that the
on-times correspond to a two-phase hyperexponential distribution. Hence
the on-time would be exp(3) with probability 0.6 and it would be exp(2) with
probability 0.4, which can be deduced from U(t). The off-times correspond
to the sum of three IID exp(6) random variables. Thus, we can write down
the environment process {Z(t), t ≥ 0} as an ℓ = 5 state CTMC with states 1 and
2 corresponding to on and states 3, 4, and 5 corresponding to off. Thus, the
Q matrix corresponding to S = {1, 2, 3, 4, 5} is
Q =
[ −3     0     3    0    0 ]
[  0    −2     2    0    0 ]
[  0     0    −6    6    0 ]
[  0     0     0   −6    6 ]
[ 3.6   2.4    0    0   −6 ] .
Since h(v) = e(Q + vR), we obtain the eigenvalues of (Q + vR). For v = 0.5,
we get the eigenvalues as −9.3707, −4.4931 + 3.4689i, −4.4931 − 3.4689i,
5.2364, and 6.1205. When some of the eigenvalues are complex, software
packages asked to compute the maximum might compute the maximum of
the absolute values. However, in this case, that would return the wrong
value. The correct approach is to determine the maximum of the real parts.
If one does that here, we get h(0.5) = 6.1205, which matches exactly
with the approach used in Problem 103. There is a perfect match for h(1)
as well yielding a value 14.0226, again with two eigenvalues with imaginary
parts. It is worthwhile to spend a few moments contrasting the two methods,
that is, the one used in this problem and that of Problem 103.
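The caveat about complex eigenvalues is easy to demonstrate numerically; the sketch below recomputes h(0.5) and h(1) from this problem's matrices and shows that a maximum over absolute values would instead return 9.3707.

```python
import numpy as np

Q = np.array([
    [-3.0,  0.0,  3.0,  0.0,  0.0],
    [ 0.0, -2.0,  2.0,  0.0,  0.0],
    [ 0.0,  0.0, -6.0,  6.0,  0.0],
    [ 0.0,  0.0,  0.0, -6.0,  6.0],
    [ 3.6,  2.4,  0.0,  0.0, -6.0]])
R = np.diag([16.0, 16.0, 0.0, 0.0, 0.0])  # states 1, 2 on; 3, 4, 5 off

def h(v):
    """h(v) = e(Q + vR): maximum of the REAL PARTS of the eigenvalues."""
    return float(max(np.linalg.eigvals(Q + v * R).real))

h_half, h_one = h(0.5), h(1.0)
# the naive maximum over absolute values picks the wrong eigenvalue:
max_abs_half = float(max(abs(np.linalg.eigvals(Q + 0.5 * R))))
```

The values agree with the bisection approach of Problem 103, while the absolute-value maximum corresponds to the eigenvalue −9.3707 and is clearly wrong.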
There are several other stochastic process sources for which one can
obtain the effective bandwidths. As described earlier for an MRGP and
regenerative process, we can use the results in Kulkarni [68]. See Krishnan
et al. [65] for the calculation of effective bandwidths for traffic modeled by
fractional Brownian motion. In fact, we can even obtain effective bandwidths
for discrete sources. We do not present any of those here but the reader is
encouraged to do a literature search to find out more about those cases. Next
we consider some quick extensions.
FIGURE 10.8
Single infinite-sized buffer with multiple input sources. (From Gautam, N. et al., Prob. Eng.
Inform. Sci., 13, 429, 1999. With permission.)
fluid at rate rk (Zk (t)) into the buffer. Let ebk (v) be the effective bandwidth of
source k such that
ebk(v) = lim_{t→∞} (1/(vt)) log E{exp(vAk(t))},

where

Ak(t) = ∫₀ᵗ rk(Zk(u)) du.
If the stochastic process {Zk (t), t ≥ 0} is an SMP (or one of the processes that
is tractable to get the effective bandwidths), then we can obtain ebk (v). Then
the net effective bandwidth of the fluid arrival into the buffer due to all the K
sources is, say, eb(v). Since the net fluid input rate is just the sum of the input
rates of the K superposed sources, we have A(t) = A1(t) + · · · + AK(t). By
definition,
eb(v) = lim_{t→∞} (1/(vt)) log E{exp(vA(t))}
      = lim_{t→∞} (1/(vt)) log E[exp(v{A1(t) + · · · + AK(t)})].

Since the K sources are independent, the expectation factorizes:

eb(v) = lim_{t→∞} (1/(vt)) log( E[exp(vA1(t))] E[exp(vA2(t))] · · · E[exp(vAK(t))] )
      = lim_{t→∞} (1/(vt)) Σ_{k=1}^{K} log E[exp(vAk(t))].
Thus, we have
eb(v) = Σ_{k=1}^{K} ebk(v).
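For instance, for K homogeneous exponential on-off sources (the parameter values below are illustrative), the aggregate effective bandwidth is just K times the single-source closed form (10.2):

```python
import math

def eb_onoff(v, alpha, beta, r):
    """Single-source effective bandwidth, Equation (10.2)."""
    a = r * v - alpha - beta
    return (a + math.sqrt(a * a + 4.0 * beta * r * v)) / (2.0 * v)

def eb_total(v, K, alpha, beta, r):
    """K iid superposed sources: effective bandwidths simply add."""
    return K * eb_onoff(v, alpha, beta, r)

K, alpha, beta, r = 10, 1.0, 0.5, 3.0  # illustrative values
agg = eb_total(0.4, K, alpha, beta, r)
```

The aggregate value sits between K·rmean and K·rpeak, mirroring the single-source bounds.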
A(t) = ∫₀ᵗ r(Z(u)) du,

and its ALMGF and effective bandwidth are

hA(v) = lim_{t→∞} (1/t) log E{exp(vA(t))}

and

ebA(v) = lim_{t→∞} (1/(vt)) log E{exp(vA(t))}.

Assume the buffer is stable, that is,
E[r(Z(∞))] < c.
We ask the question: What is the ALMGF as well as the effective band-
width of the output traffic from the buffer? For that let D(t) be the total
output from the buffer over (0, t]. By definition, the ALMGF of the output is
hD(v) = lim_{t→∞} (1/t) log E{exp(vD(t))}.
Recall that

lim_{v→0} h(v)/v = rmean      and      lim_{v→∞} h(v)/v = rpeak

for any ALMGF h(v). For an infinite-sized stable buffer, since rmean would be
the same for the input and output due to no loss, we have

lim_{v→0} hD(v)/v = lim_{v→0} hA(v)/v = rmean,

and, since the output rate never exceeds the capacity c,

lim_{v→∞} hD(v)/v = c.
Figure 10.9 pictorially depicts this where there is a v∗, which is the value of
v for which h′A(v) = c. In other words, for v > v∗, hD(v) essentially follows the
tangent at point v∗. We can write down the relationship between hA(v) and
hD(v) as

hD(v) = hA(v)                        if 0 ≤ v ≤ v∗,
hD(v) = hA(v∗) − cv∗ + cv            if v > v∗,

where v∗ is the unique solution to

(d/dv)[hA(v)] = c.
FIGURE 10.9
Relationship between ALMGF of the input and the output of a buffer. (From Kulkarni, V.G. and
Gautam, N., Queueing Syst. Theory Appl., 27, 79, 1997. With permission.)
Similarly, if we define

ebD(v) = lim_{t→∞} (1/(vt)) log E{exp(vD(t))},

then to obtain the relationship between ebD(v) and ebA(v), we can write down
the effective bandwidth ebD(v) of the output as

ebD(v) = ebA(v)                          if 0 ≤ v ≤ v∗,
ebD(v) = c − (v∗/v){c − ebA(v∗)}         if v > v∗.
Notice from the preceding expression that ebD (v) ≤ ebA (v). That is mainly
because D(t) ≤ A(t) for all t if the queue was empty initially. Further, the peak
rate for the input is rpeak whereas it is c for the output. Since rpeak > c, there
would be values of v where ebD(v) is strictly less than ebA(v) (as for very large
v, the effective bandwidths approach their respective peak values, i.e., rpeak
for input and c for output). The main implication of ebD (v) ≤ ebA (v) is that as
fluid is passed from queue to queue, it will eventually get more and more
smooth approaching closer to the mean rate.
For more details regarding effective bandwidths of output processes,
refer to Chang and Thomas [16], Chang and Zajic [17], and de Veciana
et al. [23]. In the following section, we will find out how to use effective
bandwidths to obtain bounds and approximations for the steady-state buffer
contents.
E[r(Z(∞))] < c.
A common feature for the bounds and approximations of P{X > x} is that
all of them use the effective bandwidth of the fluid input. In fact, the struc-
ture of all the approximations and bounds would also be somewhat similar.
The key differences among the different bounds and approximations are the
values of x for which the methods are valid and whether the result is con-
servative. By conservative, we mean that our expression for P{X > x} is higher
than the true P{X > x}. The reason we call that conservative is if one were to
design a system based on our expression for P{X > x}, then what is actually
observed in terms of performance would only be better. With that notion, we
first summarize the various methods for bounds and approximations. Later
we describe them in detail.
• Exact computation:
Expressions for P{X > x} of the type

P{X > x} = Σ_i bi e^{−ηi x}.
Based on: Anick et al. [5], Elwalid and Mitra [28, 29], and
Kulkarni [69].
Described in: Section 9.2.3.
Valid for: Any x and any CTMC environment process {Z(t), t ≥ 0} or
environment processes that can easily be modeled as CTMCs.
Drawback: Not easily extendable to other environment processes.
• Effective bandwidth approximation:
Estimates of the tail probabilities P{X > x} using just the effective
bandwidth calculations as P{X > x} ≈ e^{−ηx}, where η solves eb(η) = c.
Based on: Elwalid and Mitra [30], Kesidis et al. [61], Krishnan et al.
[65], and Kulkarni [68].
Described in: Section 10.2.1.
Valid for: Large x and a wide variety of stochastic processes.
Drawback: Could be off by an order of magnitude for not-so-large x.
• Chernoff dominant eigenvalue approximation:
An improvement to the effective bandwidth approximation for
P{X > x} of the form L e^{−ηx}, with the constant L obtained via Chernoff's bound.
Based on: Palmowski and Rolski [87, 88] and Gautam et al. [39].
Described in: Section 10.2.3.
Valid for: Any x and any SMP environment process {Z(t), t ≥ 0}.
Drawback: Computationally harder than other methods.
Notice that the η described in the exponent of the last three methods
are in fact equal. So essentially the methods eventually only differ in the
constant that multiplies e^{−ηx}. However, the approaches are somewhat
different and their scopes are different too, as we will see in the following
sections.
We already discussed the computation of P{X > x} for CTMC environment
processes in Chapter 9. The others are described in the following.
eb(η) = c. (10.5)
Problem 105
Consider an infinite-sized buffer with output capacity c and input regulated
by an on-off source with on-times according to exp(α) and off-times accord-
ing to exp(β). Traffic is generated at rate r, when the source is in the on-state,
and no traffic is generated when the source is in the off-state. Assume that
rβ/(α + β) < c < r. Using the effective bandwidth approximation, develop an
expression for the tail distribution. Compare that with the exact expression
for the buffer contents.
Solution
Recall from Equation 10.2 that the effective bandwidth of such a CTMC
on-off source is

eb(v) = (rv − α − β + √((rv − α − β)² + 4βrv)) / (2v).

The effective bandwidth approximation requires an η such that eb(η) = c, that is,

(rη − α − β + √((rη − α − β)² + 4rβη)) / (2η) = c,
which results in

η = −β/c + α/(r − c).

Thus, the effective bandwidth approximation to the tail distribution is

P{X > x} ≈ e^{−ηx},

where

η = −β/c + α/(r − c).
From the exact analysis in Equation 9.22 in Problem 87, we can see that

P{X > x} = (βr/(c(α + β))) e^{λx},

where

λ = β/c − α/(r − c).

Notice that λ = −η: the exact tail and the approximation decay at the same
rate and differ only in the multiplicative constant βr/(c(α + β)).
Problem 106
Consider the system described in Problem 85, where there is an infinite-sized
buffer with output capacity c = 12 kbps and the input is driven by a four-state
CTMC with
Q =
[ −10    2    3    5 ]
[   0   −4    1    3 ]
[   1    1   −3    1 ]
[   1    2    3   −6 ] .
The fluid rate matrix is

R =
[ 20    0    0   0 ]
[  0   15    0   0 ]
[  0    0   10   0 ]
[  0    0    0   5 ] .

Using

eb(v) = e(Q/v + R),

we can solve eb(η) = c = 12 for η and compare the resulting approximation
e^{−ηx} against the exact expression
for all x ≥ 0. The second term in this example is practically negligible for
almost any x value. Thus, the approximation is off by a constant factor of
about 0.6757, which would be reasonable when x is large, and we would get
the right order of magnitude.
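The η for this example can be found by bisection using the monotonicity of eb(v); the matrices below are the ones given in the problem.

```python
import numpy as np

Q = np.array([
    [-10.0,  2.0,  3.0,  5.0],
    [  0.0, -4.0,  1.0,  3.0],
    [  1.0,  1.0, -3.0,  1.0],
    [  1.0,  2.0,  3.0, -6.0]])
R = np.diag([20.0, 15.0, 10.0, 5.0])
c = 12.0

def eb(v):
    """eb(v) = e(Q/v + R): largest real eigenvalue of Q/v + R."""
    return float(max(np.linalg.eigvals(Q / v + R).real))

# eb(v) increases from r_mean (about 10.7) toward r_peak = 20, so a
# bisection brackets the unique eta with eb(eta) = c
lo, hi = 1e-4, 50.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if eb(mid) < c:
        lo = mid
    else:
        hi = mid
eta = 0.5 * (lo + hi)
```

The resulting η is the decay rate in the approximation e^{−ηx}; the exact analysis of Chapter 9 would additionally supply the multiplicative constant.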
(i.e., ℓ − m states with strictly positive drifts). Exact analysis (using the
notation in Section 9.2.2) yields

P{X > x} = 1 − Σ_{j=1}^{ℓ} F(x, j) = Σ_{i=1}^{ℓ−m} ki e^{λi x},

where each ki is in terms of all the φj and aj values. However, the effective
bandwidth approximation for large values of x yields P{X > x} ≈ e^{−ηx} with

η = − max_{i : Re(λi) < 0} λi.
Problem 107
Consider a stable M/M/1 queue with arrival rate λ and service rate μ. Derive
an expression for the steady-state workload distribution. Then use the effec-
tive bandwidth approximation to obtain the probability that the steady-state
workload is greater than x for some very large x.
Solution
Let W be a random variable denoting the steady-state workload in the sys-
tem for an M/M/1 queue. By conditioning on the steady-state number in the
system, the LST of the workload is

E[e^{−sW}] = Σ_{j=0}^{∞} (1 − λ/μ)(λ/μ)^j (μ/(μ + s))^j
           = (1 − λ/μ) (μ + s)/(μ + s − λ)
           = (1 − λ/μ) (1 + λ/(μ + s − λ)).

Inverting this LST, W = 0 with probability 1 − λ/μ and is an exp(μ − λ)
random variable otherwise, so that

P{W > x} = (λ/μ) e^{−(μ−λ)x}.
Next, the total work arriving in time (0, t] is

A(t) = Σ_{i=1}^{N(t)} Si,

where Si ∼ exp(μ) and N(t) is the number of events (i.e., arrivals) in time (0, t]
of a Poisson process with rate λ.
Then

E[e^{vA(t)}] = E[ E[e^{vA(t)} | N(t)] ] = E[ (μ/(μ − v))^{N(t)} ]

since E[e^{vSi}] = μ/(μ − v). Also, by computing the generating function for a
Poisson random variable N(t) at z = μ/(μ − v), we can write

E[e^{vA(t)}] = e^{−(1−z)λt},

so that

eb(v) = lim_{t→∞} (1/(vt)) log E[e^{vA(t)}] = (z − 1)λ/v = λ/(μ − v).
Since the server depletes the workload at unit rate, η solves eb(η) = 1, that is,

η = μ − λ.

Thus, we have the tail distribution of the workload for very large x as P{W > x} ≈ e^{−ηx}, where η = μ − λ. Notice from the exact analysis that P{W > x} = (λ/μ) e^{−ηx}. Thus, similar to the fluid queue, here too the exponent term agrees perfectly, which would make the approximation excellent as x grows.
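The comparison above can be sanity-checked numerically. The sketch below (not code from the text) estimates P{W > x} by simulating waiting times via Lindley's recursion, which by PASTA have the steady-state workload distribution; the rates λ = 1, μ = 2 and level x = 1.5 are illustrative choices.

```python
import random
import math

def mm1_workload_tail(lam, mu, x, n=200_000, seed=1):
    """Estimate P{W > x} for the steady-state M/M/1 workload by simulating
    Lindley's recursion W_{n+1} = max(W_n + S_n - A_n, 0); by PASTA the
    waiting time seen by an arrival is distributed like the workload."""
    rng = random.Random(seed)
    w, count = 0.0, 0
    for _ in range(n):
        if w > x:
            count += 1
        # add this customer's service time, subtract the next interarrival time
        w = max(w + rng.expovariate(mu) - rng.expovariate(lam), 0.0)
    return count / n

lam, mu, x = 1.0, 2.0, 1.5
exact = (lam / mu) * math.exp(-(mu - lam) * x)   # (lambda/mu) e^{-(mu-lambda)x}
ebw = math.exp(-(mu - lam) * x)                  # effective bandwidth approximation
sim = mm1_workload_tail(lam, mu, x)
print(exact, ebw, sim)
```

As expected, the simulated tail tracks the exact expression, while the effective bandwidth approximation overestimates by the constant factor μ/λ with the correct exponent.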
For stability, we need

Σ_{k=1}^{K} E{r_k(Z_k(∞))} < c.

The effective bandwidth of source k is

eb_k(v) = lim_{t→∞} (1/(vt)) log E{exp(v A_k(t))},

where

A_k(t) = ∫_0^t r_k(Z_k(u)) du.

Then η is the solution to

Σ_{k=1}^{K} eb_k(η) = c.
Problem 108
A wireless sensor system with seven nodes forms a feed-forward in-tree network as depicted in Figure 10.10. Every node of the network has an infinite-sized buffer (denoted by B1, . . . , B7) and information flows in and out of those buffers. Except for nodes 5 and 7, which only transmit sensed information, all other nodes sense as well as transmit information. We model sensed information arriving into buffers B1, B2, B3, B4, and B6 as independent and identically distributed exponential on-off fluids with parameters α per second, β per second, and r kBps. For i = 1, . . . , 7, the output capacity of buffer Bi is ci kBps. The sensor network operators would prefer not to have more than b kB of information stored in any buffer at any time in steady state. Using the effective bandwidth approximation, derive approximations for the probability of exceeding b kB of information in each of the seven buffers. Obtain
FIGURE 10.10
In-tree network.
The mean traffic generation rate of each source is

r_s^{mean} = rβ/(α + β).
The effective bandwidth of the output of buffer j is

eb_j^{out}(v) = eb_j^{in}(v)                                 if 0 ≤ v ≤ v*_j,
eb_j^{out}(v) = c_j − (v*_j/v)(c_j − eb_j^{in}(v*_j))        if v > v*_j,

where v*_j solves

(d/dv)[v eb_j^{in}(v)] = c_j.
The input effective bandwidths satisfy

eb_1^{in}(v) = eb_s(v),
eb_2^{in}(v) = eb_s(v),
eb_3^{in}(v) = eb_s(v),
eb_4^{in}(v) = eb_s(v) + eb_1^{out}(v),
eb_6^{in}(v) = eb_s(v) + eb_5^{out}(v),

and for each buffer j, η_j solves

eb_j^{in}(η_j) = c_j.
Next, to compute eb_j^{out}(v) for j = 1, 2, 3, we first need v*_j. For j = 1, 2, 3, we can solve for v in

(d/dv)[v eb_j^{in}(v)] = c_j

to get

v*_j = (β/r)(√(c_j α/(β(r − c_j))) − 1) + (α/r)(1 − √(β(r − c_j)/(c_j α))) = 0.4031.
At buffer B4 we have

eb_4^{in}(v) = eb_s(v) + eb_1^{out}(v) = eb_s(v) + eb_1^{in}(v)                                if 0 ≤ v ≤ v*_1,
eb_4^{in}(v) = eb_s(v) + eb_1^{out}(v) = eb_s(v) + c_1 − (v*_1/v)(c_1 − eb_1^{in}(v*_1))       if v > v*_1,

with v*_2 = v*_3 = 0.4031. Solving eb_5^{in}(η_5) = c_5, we get η_5 = 0.5738.
At buffer B6 we have

eb_6^{in}(v) = eb_s(v) + eb_5^{out}(v) = eb_s(v) + eb_5^{in}(v)                                if 0 ≤ v ≤ v*_5,
eb_6^{in}(v) = eb_s(v) + eb_5^{out}(v) = eb_s(v) + c_5 − (v*_5/v)(c_5 − eb_5^{in}(v*_5))       if v > v*_5.
The CDE (Chernoff dominant eigenvalue) approximation takes the form P{X > x} ≈ L e^{−ηx}, where L can be thought of as the fraction of the fluid that would be lost in steady state if there was no buffer, and η is the solution to

Σ_{k=1}^{K} eb_k(η) = c.
Let

s* = sup_{w≥0} { c w − Σ_{k=1}^{K} m_k(w) },

and let w* be the solution to

Σ_{k=1}^{K} m'_k(w*) = c,

where m'_k(w) denotes the derivative of m_k(w) with respect to w. Then the Chernoff estimate of L is

L ≈ exp(−s*) / (w* σ(w*) √(2π)),

where

σ²(w*) = Σ_{k=1}^{K} m''_k(w*),

with m''_k(w) denoting the second derivative of m_k(w) with respect to w. The main problem is in computing m_k(w). If {Z_k(t), t ≥ 0} can be modeled as a stationary and ergodic process with state space S_k and stationary probability vector p^k, then

m_k(w) = log { Σ_{j∈S_k} p_j^k e^{w r_k(j)} }.
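The recipe above can be sketched numerically. The snippet below (a sketch, not the book's code) uses the single CTMC-modulated source of Problem 106, whose stationary vector p and rates are stated later in this section, and recovers w*, s*, and σ²(w*) via bisection on m'(w) = c.

```python
import math

# stationary probabilities and rates from Problem 106 (a single source, K = 1)
p = [0.0668, 0.2647, 0.4118, 0.2567]
r = [20.0, 15.0, 10.0, 5.0]
c = 12.0

def m(w):        # m(w) = log sum_j p_j e^{w r_j}
    return math.log(sum(pj * math.exp(w * rj) for pj, rj in zip(p, r)))

def m1(w):       # m'(w): mean rate under the exponentially tilted distribution
    num = sum(pj * rj * math.exp(w * rj) for pj, rj in zip(p, r))
    den = sum(pj * math.exp(w * rj) for pj, rj in zip(p, r))
    return num / den

def m2(w):       # m''(w): variance of the rate under the tilted distribution
    den = sum(pj * math.exp(w * rj) for pj, rj in zip(p, r))
    num = sum(pj * rj * rj * math.exp(w * rj) for pj, rj in zip(p, r))
    return num / den - m1(w) ** 2

# w* solves m'(w) = c; m' is increasing, so bisection works
lo, hi = 1e-9, 1.0
for _ in range(200):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if m1(mid) < c else (lo, mid)
w_star = (lo + hi) / 2

s_star = c * w_star - m(w_star)
sigma2 = m2(w_star)
print(w_star, s_star, sigma2)
```

The printed values match the w*, s*, and σ²(w*) reported in the solution of Problem 109 below.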
Problem 109
Consider a source modulated by an ℓ-state irreducible CTMC {Z(t), t ≥ 0} with infinitesimal generator

Q = [q_ij]

and stationary probability vector p satisfying

pQ = 0 and Σ_{l=1}^{ℓ} p_l = 1.

When the CTMC is in state i, the source generates fluid at rate r_i. This source inputs traffic into an infinite capacity buffer with output channel capacity c. Assume that Σ_{l=1}^{ℓ} p_l r_l < c < max_l r_l. Using the CDE approximation, develop an expression for P(X > x). Then, illustrate the approach for the numerical example of Problem 106.
Solution
Using the CDE approximation, we have P{X > x} ≈ L e^{−ηx}. Also,

s* = cw* − m(w*),

where w* solves m'(w*) = c, and

σ²(w*) = [ (Σ_{j=1}^{ℓ} p_j e^{w* r_j})(Σ_{j=1}^{ℓ} p_j r_j² e^{w* r_j}) − (Σ_{j=1}^{ℓ} p_j r_j e^{w* r_j})² ] / (Σ_{j=1}^{ℓ} p_j e^{w* r_j})².

Then the Chernoff estimate of L is

L ≈ exp(−s*) / (w* σ(w*) √(2π)).

Recall from Problem 106 that the steady-state probability vector is p = [0.0668 0.2647 0.4118 0.2567] and η that solves e(R + Q/η) = c is η = 0.5994 per kB. To express L, we numerically obtain w* = 0.0649, σ²(w*) = 20.3605, and s* = 0.0423, thereby obtaining L = 0.5208. Thus, from the CDE approximation, we have

P{X > x} ≈ 0.5208 e^{−0.5994x}.
Problem 110
Consider two sources that input traffic into an infinite-sized buffer with output capacity c = 10. The first source is identical to that in Problem 102 and the second source is identical to that in Problem 103. In other words, from source-1, fluid flows into the buffer according to a three-state SMP {Z1(t), t ≥ 0} with state space {1, 2, 3}. The elements of the kernel of this SMP are G12(t) = 1 − e^{−t} − te^{−t}, G21(t) = 0.4(1 − e^{−0.5t}) + 0.3(1 − e^{−0.2t}), G23(t) = 0.2(1 − e^{−0.5t}) + 0.1(1 − e^{−0.2t}), G32(t) = 1 − 2e^{−t} + e^{−2t}, and G11(t) = G13(t) = G22(t) = G31(t) = G33(t) = 0. Also, the flow rates in the three states are r(i) = i for i = 1, 2, 3. Source-2 is an on-off source {Z2(t), t ≥ 0} with state space {u, d} that generates fluid so that the on-times are IID random variables with CDF U(t) = 1 − 0.6e^{−3t} − 0.4e^{−2t} and the off-times are IID Erlang random variables with mean 0.5 and variance 1/12 in appropriate time units compatible with the on-times. When the source is on, fluid is generated at the rate of 16 units per second and no fluid is generated when the source is off.
Solution
Solving eb1(η) + eb2(η) = c for η, we get

η = 0.0965,

with eb1(η) = 1.8535 and eb2(η) = 8.1465. To compute L, consider the functions m_k(w) (for k = 1, 2). We have

m_1(w) = log( p_1^1 e^{w r(1)} + p_2^1 e^{w r(2)} + p_3^1 e^{w r(3)} )

and

m_2(w) = log( p_u^2 e^{w r_u} + p_d^2 e^{w r_d} ),

where r(i) = i for i = 1, 2, 3, [p_1^1 p_2^1 p_3^1] = [0.2772 0.6337 0.0891] as described in Problem 102, r_u = 16, r_d = 0, and [p_u^2 p_d^2] = [4/9 5/9] based on Problem 103.
We can compute w* as the solution to

m'_1(w) + m'_2(w) = c

using a binary search. Thus, we have w* = 0.0168 for the numerical values stated in the problem. Also,

s* = cw* − m_1(w*) − m_2(w*),

and

σ²(w*) = [ (Σ_{i=1}^{3} p_i^1 e^{w* r(i)})(Σ_{i=1}^{3} p_i^1 r(i)² e^{w* r(i)}) − (Σ_{i=1}^{3} p_i^1 r(i) e^{w* r(i)})² ] / (Σ_{i=1}^{3} p_i^1 e^{w* r(i)})²
        + [ (p_u^2 e^{w* r_u} + p_d^2 e^{w* r_d})(p_u^2 r_u² e^{w* r_u} + p_d^2 r_d² e^{w* r_d}) − (p_u^2 r_u e^{w* r_u} + p_d^2 r_d e^{w* r_d})² ] / (p_u^2 e^{w* r_u} + p_d^2 e^{w* r_d})²
        = 64.2977.

Hence, the Chernoff estimate is

L ≈ exp(−s*) / (w* σ(w*) √(2π)) = 1.1708.
Notice that the procedure for obtaining L does not depend on the stochas-
tic process governing the sources once we know the steady-state distribution.
This makes it convenient since a single approach can be used for any discrete
stochastic process. However, as described earlier, the method is an approxi-
mation that is usually suitable for the tail distribution. Next we will consider
an approach for bounds on the entire distribution, not just the tail.
Consider an infinite capacity buffer into which fluid is generated from K sources according to environment processes {Z_k(t), t ≥ 0} for k = 1, 2, …, K that are independent SMPs. When the SMP {Z_k(t), t ≥ 0} for some k ∈ {1, 2, …, K} is in state i, fluid is generated into the buffer at rate r_k(i). The SMP {Z_k(t), t ≥ 0} for all k ∈ {1, 2, …, K} has a state space S_k = {1, 2, …, ℓ_k} and kernel G_k(x) = [G_ij^k(x)]. Using that we can calculate the expected time spent in state i, which we call τ_i^k. Also assume that we can compute p^k, the stationary vector of the kth SMP {Z_k(t), t ≥ 0}. Further, using the SMP source characteristics, say we can compute the effective bandwidth of the kth source, eb_k(v). Then, as always, we let η be the solution to

Σ_{k=1}^{K} eb_k(η) = c.
Now we describe how to compute bounds for the steady-state buffer content distribution. The bounds are of the form

C_* e^{−ηx} ≤ lim_{t→∞} P{X(t) > x} ≤ C^* e^{−ηx},

where X(t) is the amount of fluid in the buffer at time t, and C^* and C_* are constants that we describe how to compute next. Denote by Φ^k(η) the matrix with (i, j) entry φ_ij^k(η) = G̃_ij^k(−η(r_k(i) − eb_k(η))). Let h^k be the left eigenvector of Φ^k(η) corresponding to the eigenvalue 1, that is,

h^k = h^k Φ^k(η). (10.6)
Also define

H^k = Σ_{i∈S_k} ( h_i^k / (η(r_k(i) − eb_k(η))) ) ( Σ_{j∈S_k} φ_ij^k(η) − 1 ), (10.7)

ψ_min^k(i, j) = inf_x { h_i^k e^{−η(r_k(i)−eb_k(η))x} ∫_x^∞ e^{η(r_k(i)−eb_k(η))y} dG_ij^k(y) / [ (p_i^k/τ_i^k) ∫_x^∞ dG_ij^k(y) ] }, (10.8)

ψ_max^k(i, j) = sup_x { h_i^k e^{−η(r_k(i)−eb_k(η))x} ∫_x^∞ e^{η(r_k(i)−eb_k(η))y} dG_ij^k(y) / [ (p_i^k/τ_i^k) ∫_x^∞ dG_ij^k(y) ] }. (10.9)

Then

C^* = Π_{k=1}^{K} H^k / min_A Π_{k=1}^{K} ψ_min^k(i_k, j_k),    C_* = Π_{k=1}^{K} H^k / max_A Π_{k=1}^{K} ψ_max^k(i_k, j_k),

where

A = { ((i_1, j_1), (i_2, j_2), …, (i_K, j_K)) : i_k, j_k ∈ S_k, Σ_{k=1}^{K} r_k(i_k) > c and ∀k, P^k(i_k, j_k) > 0 }.

In these expressions, the only unknown terms are ψ_max^k and ψ_min^k. Next, we describe how to compute them for some special cases. For that, we drop k with the understanding that all the expressions pertain to k.
First consider a nonnegative random variable Y with CDF G_ij(x)/G_ij(∞) and density

g_ij(x) = (dG_ij(x)/dx) (1/G_ij(∞)).

Its hazard rate function is

λ_ij(x) = g_ij(x) / (1 − G_ij(x)/G_ij(∞)).

The random variable Y is said to have an increasing failure rate (IFR) if λ_ij(x) is increasing in x, and a decreasing failure rate (DFR) if λ_ij(x) is decreasing in x.
It is possible to obtain closed-form algebraic expressions for ψ_max(i, j) and ψ_min(i, j) if the random variable Y with distribution G_ij(x)/G_ij(∞) is an IFR or DFR random variable. The following result describes how to compute ψ_max(i, j) and ψ_min(i, j) in those cases. Let x^* and x_* be such that

x^* = arg sup_x { h_i ∫_x^∞ e^{η(r_i−c)y} dG_ij(y) / [ (p_i/τ_i) e^{η(r_i−c)x} ∫_x^∞ dG_ij(y) ] }

and

x_* = arg inf_x { h_i ∫_x^∞ e^{η(r_i−c)y} dG_ij(y) / [ (p_i/τ_i) e^{η(r_i−c)x} ∫_x^∞ dG_ij(y) ] }.
Then ψ_max(i, j) and ψ_min(i, j) occur at the x values given by Table 10.1, with the understanding that λ_ij(∞) = lim_{x→∞} λ_ij(x).

TABLE 10.1
Computing ψ_max(i, j) and ψ_min(i, j)

IFR, r_i > c:  x^* = 0,  ψ_max(i, j) = φ̃_ij(−η(r_i − c)) τ_i h_i / (p_ij p_i);  x_* = ∞,  ψ_min(i, j) = τ_i h_i λ_ij(∞) / [p_i (λ_ij(∞) − η(r_i − c))].
IFR, r_i ≤ c:  x^* = ∞,  ψ_max(i, j) = τ_i h_i λ_ij(∞) / [p_i (λ_ij(∞) − η(r_i − c))];  x_* = 0,  ψ_min(i, j) = φ̃_ij(−η(r_i − c)) τ_i h_i / (p_ij p_i).
DFR, r_i > c:  x^* = ∞,  ψ_max(i, j) = τ_i h_i λ_ij(∞) / [p_i (λ_ij(∞) − η(r_i − c))];  x_* = 0,  ψ_min(i, j) = φ̃_ij(−η(r_i − c)) τ_i h_i / (p_ij p_i).
DFR, r_i ≤ c:  x^* = 0,  ψ_max(i, j) = φ̃_ij(−η(r_i − c)) τ_i h_i / (p_ij p_i);  x_* = ∞,  ψ_min(i, j) = τ_i h_i λ_ij(∞) / [p_i (λ_ij(∞) − η(r_i − c))].
Problem 111
Consider a source modulated by an ℓ-state irreducible CTMC {Z(t), t ≥ 0} with infinitesimal generator

Q = [q_ij]

and stationary probability vector p satisfying

pQ = 0 and Σ_{l=1}^{ℓ} p_l = 1.

When the CTMC is in state i, the source generates fluid at rate r_i. Let

R = diag[r_i].

This source inputs traffic into an infinite capacity buffer with output channel capacity c. Develop upper and lower bounds for P(X > x). Then, for the numerical example of Problem 106, compare the bounds against exact results.
Solution
We obtain η by solving

eb(η) = c,

where eb(·) is the effective bandwidth of the CTMC source, given by

eb(v) = e(R + Q/v).

Viewed as an SMP, the CTMC has kernel G_ij(t) = (q_ij/q_i)(1 − e^{−q_i t}), where q_i = −q_ii = Σ_{j≠i} q_ij. The expected amount of time the CTMC spends in state i is

τ_i = 1/q_i.

For a given v, the LSTs G̃_ij(·) yield

φ_ij(η) = G̃_ij(−η(r_i − c)) = q_ij / (q_i − η(r_i − c)),

since eb(η) = c and the remaining terms can be obtained from G̃_ij(cη − r_i η). Equation 10.6 without the superscript k reduces to

h_j = Σ_{i≠j} h_i φ_ij(η) = Σ_{i≠j} h_i q_ij / (q_i − η(r_i − c)). (10.10)

Also,

H = Σ_{i=1}^{ℓ} ( h_i / (η(r_i − c)) ) ( q_i / (q_i − η(r_i − c)) − 1 ) = Σ_{i=1}^{ℓ} h_i / (q_i − η(r_i − c)). (10.11)
From Equation 10.9 without the superscript k, we have for i ≠ j and q_ij > 0

ψ_max(i, j) = sup_x { h_i e^{−η(r_i−c)x} ∫_x^∞ e^{η(r_i−c)y} dG_ij(y) / [ (p_i/τ_i) ∫_x^∞ dG_ij(y) ] }
            = sup_x { h_i e^{−η(r_i−c)x} ∫_x^∞ e^{η(r_i−c)y} e^{−q_i y} dy / [ p_i q_i ∫_x^∞ e^{−q_i y} dy ] }
            = sup_x { (1/p_i) h_i / (q_i − η(r_i − c)) }
            = (1/p_i) h_i / (q_i − η(r_i − c)),

and similarly

ψ_min(i, j) = inf_x { (1/p_i) h_i / (q_i − η(r_i − c)) } = (1/p_i) h_i / (q_i − η(r_i − c)),

since the expression inside the sup and inf does not depend on x. Now define the vector g by

g = [g_i] = [ h_i / (q_i − η(r_i − c)) ].

We claim that

g (R + Q/η) = c g.

To see this, note that

( g (R + Q/η) )_j = (r_j − q_j/η) g_j + (1/η) Σ_{i≠j} g_i q_ij
                 = (r_j − q_j/η) h_j / (q_j − η(r_j − c)) + (1/η) Σ_{i≠j} h_i q_ij / (q_i − η(r_i − c))
                 = (r_j − q_j/η) h_j / (q_j − η(r_j − c)) + h_j/η
                 = c h_j / (q_j − η(r_j − c))
                 = c g_j,

where the third equality uses Equation 10.10.
At any rate, we can write down bounds for the steady-state buffer content distribution as

C_* e^{−ηx} ≤ P{X > x} ≤ C^* e^{−ηx},

where

C^* = H / min_{i: r_i > c, j: p_ij > 0} {ψ_min(i, j)}
    = Σ_{i=1}^{ℓ} h_i/(q_i − η(r_i − c)) / min_{i: r_i > c} { (1/p_i) h_i/(q_i − η(r_i − c)) }
    = Σ_{i=1}^{ℓ} g_i / min_{i: r_i > c} (g_i/p_i)

and

C_* = H / max_{i: r_i > c, j: p_ij > 0} {ψ_max(i, j)}
    = Σ_{i=1}^{ℓ} h_i/(q_i − η(r_i − c)) / max_{i: r_i > c} { (1/p_i) h_i/(q_i − η(r_i − c)) }
    = Σ_{i=1}^{ℓ} g_i / max_{i: r_i > c} (g_i/p_i).
Recall from Problem 106 that the steady-state probability vector is p = [0.0668 0.2647 0.4118 0.2567] and η that solves e(R + Q/η) = c is η = 0.5994 per kB. To obtain C^* and C_*, we solve for g as the left eigenvector of (R + Q/η) that corresponds to the eigenvalue c, that is, g satisfies g(R + Q/η) = cg. Using that we get g = [0.1746 0.7328 0.5533 0.3555] (although g is not unique, it appears in both the numerator and denominator of C^* and C_*, and hence the scaling is a nonissue). Next, notice that only in states i = 1 and i = 2 do we have r_i > c. Thus, we have

C^* = (g_1 + g_2 + g_3 + g_4) / min(g_1/p_1, g_2/p_2) = 0.6953

and

C_* = (g_1 + g_2 + g_3 + g_4) / max(g_1/p_1, g_2/p_2) = 0.6561.
TABLE 10.2
Comparing the Exact Results against Approximations and Bounds

Method                          Result
Exact computation               lim_{t→∞} P{X(t) > x} = 0.6757e^{−0.5994x} − 0.0079e^{−1.3733x}
Effective bandwidth approx.     lim_{t→∞} P{X(t) > x} ≈ e^{−0.5994x}
CDE approx.                     lim_{t→∞} P{X(t) > x} ≈ 0.5208e^{−0.5994x}
Bounds                          0.6561e^{−0.5994x} ≤ lim_{t→∞} P{X(t) > x} ≤ 0.6953e^{−0.5994x}
Problem 112
Consider a source modulated by a two-state (on and off) process that alter-
nates between the on and off states. The random amount of time the process
spends in the on state (called on-times) has CDF U(·) with mean τU and the
corresponding off-time CDF is D(·) with mean τD . The successive on and off-
times are independent and on-times are independent of off-times. Fluid is
generated continuously at rate r during the on state and at rate 0 during
the off state. The source inputs traffic into an infinite-capacity buffer. The
output capacity of the buffer is a constant c. State the stability condition,
and assuming it is true obtain bounds for the steady-state buffer content
distribution.
Solution
The stability condition is

rτ_U/(τ_U + τ_D) < c.

We assume that the preceding condition is true and c < r (otherwise the buffer would be empty in steady state). Following the notation described for the SMP bounds, we obtain the following. Define

Φ(v) = [ 0                 D̃(vc)
         Ũ(−v(r − c))      0      ],

where Ũ(·) and D̃(·) are the LSTs of U(t) and D(t), respectively. We assume that e(Φ(η)) = 1 has a solution, which implies that

e(Φ(η)) = √( Ũ(−η(r − c)) D̃(ηc) ) = 1,

with corresponding left eigenvector

h = [1  D̃(ηc)].

Further,

H = (1 − D̃(ηc)) r / (c(r − c)η).
The matrices ψ_min and ψ_max have zero diagonal entries, with

ψ_min(1, 2) = inf_x { (τ_U + τ_D) ∫_x^∞ e^{−ηc(y−x)} dD(y) / (1 − D(x)) },
ψ_min(2, 1) = inf_x { D̃(ηc) (τ_U + τ_D) ∫_x^∞ e^{η(r−c)(y−x)} dU(y) / (1 − U(x)) },

and

ψ_max(1, 2) = sup_x { (τ_U + τ_D) ∫_x^∞ e^{−ηc(y−x)} dD(y) / (1 − D(x)) },
ψ_max(2, 1) = sup_x { D̃(ηc) (τ_U + τ_D) ∫_x^∞ e^{η(r−c)(y−x)} dU(y) / (1 − U(x)) }.
Thus, we can derive bounds for the steady-state buffer content distribution X as

C_* e^{−ηx} ≤ P{X > x} ≤ C^* e^{−ηx},

where

C^* = (Ũ(−η(r − c)) − 1) r / [ (τ_U + τ_D) c(r − c)η inf_x { ∫_x^∞ e^{η(r−c)(y−x)} dU(y) / (1 − U(x)) } ] (10.13)

and

C_* = (Ũ(−η(r − c)) − 1) r / [ (τ_U + τ_D) c(r − c)η sup_x { ∫_x^∞ e^{η(r−c)(y−x)} dU(y) / (1 − U(x)) } ]. (10.14)
Next we consider a special case of the earlier problem (namely, the Erlang
on-off source) to explore and explain the general on-off source.
Problem 113
The Erlang on-off source is one with Erlang(NU , α) on-time distribution,
Erlang(ND , β) off-time distribution, and fluid is generated at rate r when the
source is on. Assuming stability, obtain bounds for the steady-state buffer
content distribution. For a numerical example with r = 15, c = 10, τU = 1/70,
and τD = 1/30, illustrate the bounds.
Solution
Note that τ_U = N_U/α and τ_D = N_D/β. Assume that the condition of stability is satisfied and r > c. Thus, we have

Φ(v) = [ 0                  D̃(vc) ]  =  [ 0                            (β/(β + vc))^{N_D} ]
       [ Ũ(−v(r − c))       0     ]     [ (α/(α − v(r − c)))^{N_U}     0                  ]

and

h = [ 1   (β/(β + ηc))^{N_D} ].
Using the fact that the Erlang random variable has an increasing hazard rate function, we see that

ψ_min(1, 2) = (N_U/α + N_D/β)(β/(β + ηc))^{N_D},
ψ_min(2, 1) = (β/(β + ηc))^{N_D} (N_U/α + N_D/β) (α/(α − η(r − c))),

and

ψ_max(1, 2) = (N_U/α + N_D/β)(β/(β + ηc)),
ψ_max(2, 1) = (β/(β + ηc))^{N_D} (N_U/α + N_D/β) (α/(α − η(r − c)))^{N_U},

with all diagonal entries zero.
Thus, the bounds are

C_* e^{−ηx} ≤ P{X > x} ≤ C^* e^{−ηx},

where

C^* = ( (α/(α − η(r − c)))^{N_U} − 1 ) r / [ (τ_U + τ_D) c(r − c)η (α/(α − η(r − c))) ] (10.16)

and

C_* = ( (α/(α − η(r − c)))^{N_U} − 1 ) r / [ (τ_U + τ_D) c(r − c)η (α/(α − η(r − c)))^{N_U} ]. (10.17)
Next consider the numerical example of an Erlang on-off source with on-
time distribution Erlang(NU , α) and off-time distribution Erlang(ND , β) with
r = 15, c = 10, τU = 1/70, and τD = 1/30. We keep the means constant (i.e.,
τU and τD are held constant) but decrease the variances by increasing NU
and ND . In Figure 10.11, we illustrate for four pairs of (NU , ND ) (namely,
(1, 1), (4, 3), (9, 8), and (16, 14)) the logarithm of the upper and lower bounds
on the limiting distribution of the buffer-content process. From the figure
we notice that as the variance decreases, the bounds move further apart. Also note that C^* increases and C_* decreases with decreasing variance. Since η increases as the variance decreases, the tail of the limiting distribution approaches zero more rapidly.
FIGURE 10.11
Logarithm, log10(P(X > x)), of the upper and lower bounds as a function of x for (N_U, N_D) = (1, 1), (4, 3), (9, 8), and (16, 14). (From Gautam, N. et al., Prob. Eng. Inform. Sci., 13, 429, 1999. With permission.)
Remark 24
For the exponential on-off source, which is a special case of the Erlang on-off source, we get C^* = C_* = rβ/[c(α + β)]. Hence, the upper and lower bounds are equal, resulting in

P{X > x} = (rβ/(c(α + β))) e^{−ηx},

where

η = (cα + cβ − βr)/(c(r − c)).
Problem 114
There are K independent fluid sources that input traffic into an infinite
capacity buffer. Each source k is modulated by a CTMC {Zk (t), t ≥ 0} with
infinitesimal generator Qk on state space {1, 2, …, ℓ_k}. Also,

p^k Q_k = 0 and Σ_{i=1}^{ℓ_k} p_i^k = 1.
Fluid is generated at rate rk (Zk (t)) by source k at time t. Let Rk be the cor-
responding rate matrix. Fluid is removed from the buffer by a channel with
constant capacity c. Let X(t) be the amount of fluid in the buffer at time t.
Obtain bounds for P{X(t) > x} as t → ∞ assuming that the buffer is stable.
Solution
For k = 1, 2, …, K, let the effective bandwidth of source k be

eb_k(v) = e(R_k + Q_k/v),

and let η be the solution to

Σ_{k=1}^{K} eb_k(η) = c.

The stability condition is

Σ_{k=1}^{K} Σ_{l=1}^{ℓ_k} r_k(l) p_l^k < c.
Then the bounds are

C_* e^{−ηx} ≤ lim_{t→∞} P{X(t) > x} ≤ C^* e^{−ηx},

where

C^* = Π_{k=1}^{K} ( Σ_{l=1}^{ℓ_k} h_l^k ) / min_{(i_1,…,i_K): Σ_k r_k(i_k) > c} Π_{k=1}^{K} (h_{i_k}^k / p_{i_k}^k)

and

C_* = Π_{k=1}^{K} ( Σ_{l=1}^{ℓ_k} h_l^k ) / max_{(i_1,…,i_K): Σ_k r_k(i_k) > c} Π_{k=1}^{K} (h_{i_k}^k / p_{i_k}^k).
Problem 115
An exponential on-off source with on-time parameter α, off-time parameter
β, and rate r (fluid generation rate when on) generates traffic into an infinite-
capacity buffer with output capacity c1 . The output from the buffer acts as an
input to another infinite-capacity buffer whose output capacity is c2 . Assume
for stability and nontriviality that

rβ/(α + β) < c2 < c1 < r.
FIGURE 10.12
Exponential on-off input to buffers in tandem. (From Gautam, N. et al., Prob. Eng. Inform. Sci., 13, 429, 1999. With permission.)
The effective bandwidth of the source is

eb(v) = ( rv − α − β + √((rv − α − β)² + 4βrv) ) / (2v). (10.18)

For the first buffer, from Remark 24,

P{X1 > x} = (rβ/(c1(α + β))) e^{−η1 x}, (10.19)

where

η1 = (c1α + c1β − βr)/(c1(r − c1)).
The output of buffer-1 is an on-off process whose off-times are exponentially distributed with CDF D(t) = 1 − e^{−βt}.
Recall Problem 94 where we derived the LST of the busy period distribution U(·) as

Ũ(w) = (w + β + c1 s0(w))/β    if w ≥ w*,
Ũ(w) = ∞                        otherwise,

where w* = (2√(c1αβ(r − c1)) − rβ − c1α − c1β)/r, s0(w) = (−b − √(b² + 4w(w + α + β)c1(r − c1)))/(2c1(r − c1)), and b = (r − 2c1)w + (r − c1)β − c1α. The LST of the distribution D(·) is

D̃(w) = β/(β + w)    if w > −β,
D̃(w) = ∞             otherwise.
For this general on-off “pseudo” source that inputs traffic into the second buffer, we can compute its effective bandwidth, eb2(v), as

eb2(v) = eb1(v)                         if 0 ≤ v ≤ v*,
eb2(v) = (eb1(v*) − c1)(v*/v) + c1      if v > v*,

where

v* = (β/r)(√(c1α/(β(r − c1))) − 1) + (α/r)(1 − √(β(r − c1)/(c1α))) (10.20)

and eb1(v) is from Equation 10.18. Note that η2 is obtained by solving

eb2(η2) = c2.
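With illustrative (assumed) parameters satisfying rβ/(α + β) < c2 < c1 < r, the sketch below computes η2 by bisection on eb2(η2) = c2. For these values η2 falls in the region η2 ≤ v*, so it also solves eb1(η2) = c2, which gives a closed-form cross-check.

```python
import math

def eb1(v, alpha, beta, r):
    """Effective bandwidth of the exponential on-off input (Equation 10.18)."""
    a = r * v - alpha - beta
    return (a + math.sqrt(a * a + 4 * beta * r * v)) / (2 * v)

def v_star(alpha, beta, r, c1):
    return (beta / r) * (math.sqrt(c1 * alpha / (beta * (r - c1))) - 1) \
         + (alpha / r) * (1 - math.sqrt(beta * (r - c1) / (c1 * alpha)))

def eb2(v, alpha, beta, r, c1):
    """Effective bandwidth of buffer-1's output (piecewise, using v*)."""
    vs = v_star(alpha, beta, r, c1)
    if v <= vs:
        return eb1(v, alpha, beta, r)
    return (eb1(vs, alpha, beta, r) - c1) * vs / v + c1

# assumed numeric values with r*beta/(alpha + beta) < c2 < c1 < r
alpha, beta, r, c1, c2 = 1.0, 0.5, 15.0, 12.0, 8.0

# eb2 is increasing in v, so bisection on eb2(v) = c2 finds eta2
lo, hi = 1e-9, 5.0
for _ in range(200):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if eb2(mid, alpha, beta, r, c1) < c2 else (lo, mid)
eta2 = (lo + hi) / 2
print(eta2)   # for these values, solving eb1(eta2) = c2 in closed form gives 18/224
```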
If η2 ≤ v*, we have

Φ(η2) = [ 0                     D̃(η2c2)
          Ũ(−η2(c1 − c2))       0        ],

and we can solve

[1  h2] Φ(η2) = [1  h2]

to get h2 = D̃(η2c2). If η2 > v*, we use the same h2, since D̃(·) only gradually goes to infinity and the condition is mainly because of Ũ(·). The situation is similar to the one in Figure 10.5(b). Hence, it would not cause any concerns. With this we proceed to obtain the bounds for the distribution of X2.
In particular, C_{2*} e^{−η2 x} ≤ P{X2 > x} ≤ C2^* e^{−η2 x}, where C2^* and C_{2*} follow from Equations 10.13 and 10.14 with U(·) taken as the busy period distribution, D(·) as above, r replaced by c1, and c replaced by c2.
Problem 116
Consider the tandem buffers model in Figure 10.13. Input to the first buffer is
from N independent and identical exponential on-off sources with on-time
parameter α, off-time parameter β, and rate r. The output from buffer-1 is
directly fed into buffer-2. The output capacities of buffer-1 and buffer-2 are
c1 and c2 , respectively. Assuming stability, obtain bounds on the limiting
distributions of the contents of the two buffers.
Solution
We first obtain bounds on the contents of buffer-1 and then of buffer-2.
FIGURE 10.13
Tandem buffers model with multiple sources: N exponential on-off sources feed buffer-1 (content X1(t), capacity c1), whose output feeds buffer-2 (content X2(t), capacity c2). (From Gautam, N. et al., Prob. Eng. Inform. Sci., 13, 429, 1999. With permission.)
Buffer-1: Let Z1(t) be the number of sources that are in the on state at time t. Clearly, {Z1(t), t ≥ 0} is an SMP (more specifically, a CTMC). Assume

Nrβ/(α + β) < c1 < Nr

for stability (ensured by the first inequality) and nontriviality (the second inequality ensures that buffer-1 is not always empty). We can show that Φ(δ) is given by

φ_ij(δ) = iα/(iα + (N − i)β − (ir − c1)δ)          if j = i − 1,
φ_ij(δ) = (N − i)β/(iα + (N − i)β − (ir − c1)δ)    if j = i + 1,
φ_ij(δ) = 0                                         otherwise,

and e(Φ(δ)) = 1 always has solutions. Using the expression for eb(v) in Equation 10.18 and solving for η1 in N eb(η1) = c1, we get

η1 = N(c1α + c1β − Nβr)/(c1(Nr − c1)).
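The η1 formula admits a quick numerical sanity check (a sketch with assumed values of N, α, β, r, c1, not from the text): by construction, N eb(η1) should equal c1, with eb(·) from Equation 10.18.

```python
import math

def eb(v, alpha, beta, r):
    """Effective bandwidth of one exponential on-off source (Equation 10.18)."""
    a = r * v - alpha - beta
    return (a + math.sqrt(a * a + 4 * beta * r * v)) / (2 * v)

# assumed values satisfying N r beta/(alpha + beta) < c1 < N r
N, alpha, beta, r, c1 = 10, 1.0, 0.5, 2.0, 12.0

eta1 = N * (c1 * alpha + c1 * beta - N * beta * r) / (c1 * (N * r - c1))
print(eta1, N * eb(eta1, alpha, beta, r))   # second value should equal c1
```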
Let h = [h_i] satisfy

h = h Φ(η1).

Then the bounds on the contents of buffer-1 are

C_{1*} e^{−η1 x} ≤ lim_{t→∞} P{X1(t) > x} ≤ C1^* e^{−η1 x},

where

C1^* = Σ_{i=0}^{N} (h_i/(η1(ir − c1))) ( Σ_{j=0}^{N} φ_ij(η1) − 1 ) / min_{i: ir > c1} { (h_i/p_i) (1/(iα + (N − i)β − η1(ir − c1))) },

C_{1*} = Σ_{i=0}^{N} (h_i/(η1(ir − c1))) ( Σ_{j=0}^{N} φ_ij(η1) − 1 ) / max_{i: ir > c1} { (h_i/p_i) (1/(iα + (N − i)β − η1(ir − c1))) },

and

p_i = a_i τ_i / Σ_{m=0}^{N} a_m τ_m = ( N!/(i!(N − i)!) ) α^{N−i} β^i / (α + β)^N.
Buffer-2: The input to buffer-2 is modulated by a process {Z2(t), t ≥ 0} constructed from {Z1(t), t ≥ 0}, where Z1(t) is the number of sources on at time t. Let R1(t) be the output rate from the first buffer at time t. We assume that

Nrβ/(α + β) < c2 < c1.

We can see that the {Z2(t), t ≥ 0} process is an SMP on state space {0, 1, …, ℓ} with kernel

G(t) = [G_ij(t)],

and, analogous to Equation 10.20,

v* = (β/r)(√(c1α/(β(Nr − c1))) − 1) + (α/r)(1 − √(β(Nr − c1)/(c1α))).
Hence, solving

eb2(η2) = c2,

we get

η2 = min { N(c2α + c2β − Nβr)/(c2(Nr − c2)),  (h(v*) − c1v*)/(c2 − c1) },

where

h(v*) = N ( rv* − α − β + √((rv* − α − β)² + 4βrv*) ) / 2. (10.24)
If η2 ≤ v*, then

φ_ij(η2) = G̃_ij(−η2(ir − c2))      if 0 ≤ i ≤ ℓ − 1,
φ_ij(η2) = G̃_ij(−η2(c1 − c2))      if i = ℓ,

and h satisfies h = h Φ(η2). It can be shown that the random variables associated with the distributions G_{ℓj}(x)/G_{ℓj}(∞) have a decreasing failure rate. Hence, from Table 10.1, ψ_min(ℓ, j) and ψ_max(ℓ, j) occur at x = 0 and x = ∞, respectively. Thus, we can find bounds for the steady-state distribution of the buffer-content process {X2(t), t ≥ 0} as follows. On the basis of Equations 10.7 through 10.9, removing k since K = 1, we can write

H = Σ_{i=0}^{ℓ} ( h_i/(η2(min(ir, c1) − c2)) ) ( Σ_j φ_ij(η2) − 1 ),

and obtain ψ_min(i, j) as well as ψ_max(i, j) using Table 10.1.
Then the limiting distribution of the buffer content process satisfies

C_{2*} e^{−η2 x} ≤ lim_{t→∞} P{X2(t) > x} ≤ C2^* e^{−η2 x},

where

C2^* = H / min_{i,j: ir > c2, j = i±1} ψ_min(i, j),    C_{2*} = H / max_{i,j: ir > c2, j = i±1} ψ_max(i, j).
In Figure 10.14, we illustrate the upper and lower bounds on the limiting distribution of the buffer-content process {X2(t), t ≥ 0}.

FIGURE 10.14
The upper and lower bounds on log10(P(X2 > x)) as a function of x. (From Gautam, N. et al., Prob. Eng. Inform. Sci., 13, 429, 1999. With permission.)
FIGURE 10.15
Multiclass fluid system: for each class j = 1, …, N, K_j sources feed buffer B_j (content X_j(t)), and all N buffers are emptied by a single channel of capacity c. (From Kulkarni, V.G. and Gautam, N., Queueing Syst. Theory Appl., 27, 79, 1997. With permission.)
at rate rij (Zij (t)) into buffer j. All the classes of fluids are emptied by a sin-
gle channel of constant capacity c. At this time we do not specify the service
scheduling policy for emptying the N buffers.
For example, we will consider policies such as: the timed round-robin (polling) policy, where the scheduler serves the N buffers in a round-robin fashion; the static priority service policy, where there is a priority order for the classes and a class is served only when all higher-priority buffers are empty; the generalized processor sharing (GPS) policy, where a fraction of the channel capacity c is offered to all buffers simultaneously; and threshold policies, where both the buffer to serve and the fractions of capacity to be assigned depend on the amount of fluid in the buffers (using threshold values or switching curves). Notice that the buffers do not necessarily have a constant
output capacity. However, all the results we have seen thus far have had
constant output capacity. We will subsequently use a fictitious compensating
source to address this.
However, we first describe the main objective, which is to analyze the
buffer content levels in steady state. For that, let Xj (t) be the amount of fluid
in buffer j at time t. Assume that all N buffers are of infinite capacity. Assume
that we can use the source characteristics to obtain the effective bandwidth
of source i of class j as ebij (v) for i = 1, 2, . . . , Kj and j = 1, . . . , N. We use that
for the performance analysis. The quality-of-service (QoS) criterion is mainly
based on the tail distribution of the buffer contents. In particular, for a given set of buffer levels B1, …, BN, the probability of exceeding those levels must be less than ε1, …, εN, respectively. That is, for j = 1, …, N,

P{Xj > Bj} < εj.
Note that the preceding QoS can indirectly be used for bounds on delay
as well.
The analysis in this section can be used not only in obtaining the tail probabilities but also for admission control. For that we assume that all sources of a particular class are stochastically identical. We assume that sources arrive at buffers, spend a random sojourn time generating traffic according to the respective environment process, and then depart from the buffers. We assume that the number of sources varies slowly compared to the sources changing states as well as the buffer contents. In particular, we assume that steady state is attained well before the number of sources of each class changes. For such a system, our objective is to determine the feasible region K given by

K = { (K1, …, KN) : P{Xj > Bj} < εj for all j = 1, …, N }.

When a new (Kj + 1)st source arrives, we can admit the new source if admitting it would result in the QoS criterion being satisfied for all sources. Otherwise, the source is rejected. This can easily be accomplished by maintaining a precomputed look-up table of the feasible region K.
For the rest of this section, we will consider performance analysis for var-
ious service scheduling policies (such as timed round robin, static priority,
generalized processor sharing, and threshold based). We will obtain admis-
sible regions wherever appropriate and solve the admission control problem.
However, prior to this we first describe how to analyze a buffer whose out-
put capacity is varying as this is a recurring theme for all policies. In fact,
the goal of the next section is to describe a unified approach to address
time-varying output capacities so that they could be used subsequently in
performance analysis.
The buffer content process {X̂(t), t ≥ 0} satisfies

dX̂(t)/dt = r(Z(t)) − c(Y(t))        if X̂(t) > 0,
dX̂(t)/dt = {r(Z(t)) − c(Y(t))}^+    if X̂(t) = 0.

FIGURE 10.16
Original system (left) and equivalent fictitious system (right): a buffer with input rate r(Z(t)) and time-varying output capacity c(Y(t)) is equivalent to a buffer with inputs r(Z(t)) and c − c(Y(t)) and constant output capacity c.
Assume that tso does not change with time. The cycle time T is defined as the amount of time the scheduler takes to complete a cycle, and is given by

T = tso + Σ_{j=1}^{N} τj.

The stability condition for buffer j is

Σ_{i=1}^{Kj} E[rij(Zij(∞))] < c τj/T.
FIGURE 10.17
Transforming buffer j to a constant output capacity one using a compensating source. (From Gautam, N. and Kulkarni, V.G., Queueing Syst. Theory Appl., 36, 351, 2000. With permission.)
The effective bandwidth of the compensating source for buffer j is

ebsj(v) = c(T − τj)/T,

and ηj is the solution to

Σ_{i=1}^{Kj} ebij(ηj) + c(T − τj)/T = c.

Thus, based on the effective bandwidth approximation, the QoS criteria for all the classes of traffic are satisfied if for all j = 1, 2, …, N,

e^{−Bj ηj} < εj.

Similarly, it is also possible to check if the QoS criterion is satisfied using the upper SMP bound as Cj^* e^{−Bj ηj} < εj, since P(Xj > Bj) ≤ Cj^* e^{−Bj ηj}.
Although the preceding results assume we know ebij (v) and also how to com-
pute C∗j , next we present a specific example to clarify these aspects. Also, for
the sake of obtaining closed-form algebraic expressions, we consider a rather
simplistic set of sources.
Problem 117
Consider a multiclass fluid queueing system with N buffers, one for each class. For all j = 1, …, N, the input to buffer j is from Kj independent and identical alternating on-off sources that stay on for an exponential amount of time with parameter αj and off for an exponential amount of time with parameter βj. When a source is on, it generates traffic continuously at rate rj into buffer j, and when it is off, it does not generate any traffic. The scheduler serves buffer j for a deterministic time τj at a maximum rate c and stops serving the buffer for a deterministic time T − τj. Using the effective bandwidth approximation and bounds, obtain expressions for P(Xj > Bj). Then, for the following numerical values: αj = 3, βj = 0.2, rj = 3.4, Bj = 30, τj/T = 3/13, Kj = 10, and c = 15.3, where T varies from 0.01 to 0.40 while τj/T is fixed, graph the two approximate expressions for the fraction of fluid lost, assuming that the size of the buffer is Bj.
Solution
The effective bandwidth of all the Kj sources combined is

Kj ebj(v) = Kj ( rjv − αj − βj + √((rjv − αj − βj)² + 4βjrjv) ) / (2v).

Solving for ηj in

Kj ebj(ηj) = cτj/T,

we get

ηj = ( cτj(αj + βj) − rjKjβjT ) / [ (cτj/(KjT)) (rjTKj − cτj) ].
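The closed form for ηj can be checked with the values stated in the problem: per source, the capacity share is cτj/(KjT), and ηj must satisfy Kj ebj(ηj) = cτj/T. A sketch (not the book's code):

```python
import math

# numerical values from Problem 117
alpha, beta, r, B = 3.0, 0.2, 3.4, 30.0
K, c, tau_over_T = 10, 15.3, 3.0 / 13.0

cpp = c * tau_over_T / K       # per-source capacity share, c*tau_j/(K_j*T)
eta = (cpp * (alpha + beta) - beta * r) / (cpp * (r - cpp))

def eb(v):
    """Effective bandwidth of one on-off source (Equation 10.18)."""
    a = r * v - alpha - beta
    return (a + math.sqrt(a * a + 4 * beta * r * v)) / (2 * v)

loss_ebw = math.exp(-eta * B)
print(eta, K * eb(eta), loss_ebw)   # K*eb(eta) should equal c*tau_j/T
```

The resulting log10 of loss(ebw) is about −5.45, consistent with the level shown in Figure 10.18 below.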
Recall that the effective bandwidth approximation yields P(Xj > Bj) ≈ e^{−Bjηj} and the SMP bounds result in P(Xj > Bj) ≤ Cj^* e^{−ηjBj}. Thus, for the numerical values listed in the problem, if we let Bj be the size of the buffer, then the fraction of fluid lost can be approximated as P(Xj > Bj). With that understanding, the loss probability estimate using the effective-bandwidth technique is

loss(ebw) = e^{−ηjBj},

and that using the SMP bounds is

loss(smp) = Cj^* e^{−ηjBj}.

Figure 10.18 shows the results for loss(ebw) and loss(smp), given the numerical values in the problem, by varying T from 0.01 to 0.40 while keeping τj/T fixed.
Intuitively, we expect the loss probability to increase with T since an
increase in T would increase the time the server does not serve the buffer.
The SMP bounds estimate, loss(smp), increases with T and hence confirms
our intuition. The effective-bandwidth estimate, loss(ebw), does not change
with T. For small T, since loss(smp) < loss(ebw), we can conclude that the
effective-bandwidth technique produces a conservative result. For large T,
the estimate of the loss probability is smaller using the effective-bandwidth
technique than the SMP bounds technique. This indicates that there may be a
risk in using the effective-bandwidth technique as it could result in the QoS
criteria not being satisfied.
FIGURE 10.18
Estimates of the logarithms of loss probability, log10[loss(smp)] and log10[loss(ebw)]. (From Gautam, N. and Kulkarni, V.G., Queueing Syst. Theory Appl., 36, 351, 2000. With permission.)
On the other hand, using the upper bound for an SMP, we choose the largest integer Kj,max^{smp} that satisfies

Cj^* e^{−ηjBj} < εj.

Figure 10.19 shows the results for Kj,max^{ebw} and Kj,max^{smp} when εj = 10^{−5} and T varies from 0.01 to 10.00 while τj/T is fixed. As T increases, we expect fewer sources to be allowable into the buffer so that long bursts of traffic can be avoided when the server is not serving. From the figure, Kj,max^{smp} clearly conforms to our intuition. For large T, we may end up admitting more sources if we used the effective-bandwidth technique, and hence the QoS criterion may not be satisfied.
FIGURE 10.19
Estimate of the maximum number of sources, Kj,max^{ebw} and Kj,max^{smp}, as a function of T. (From Gautam, N. and Kulkarni, V.G., Queueing Syst. Theory Appl., 36, 351, 2000. With permission.)
The loss probability estimate using the SMP bounds decreases as c increases. Therefore, we perform a search using the bisection method to pick a c between the mean and peak input rates that satisfies

C∗j e^{−ηj Bj} = εj,

and we denote the c value obtained as c^smp_min since it is the smallest output capacity that would result in satisfying the QoS criterion.
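This bisection is easy to sketch in code. The sketch below is illustrative rather than the book's exact computation: it replaces the SMP-bound loss estimate with the simpler exponential estimate e^{−ηB} (with η in the form of Equation 10.32 for K IID exponential on-off sources), which is also decreasing in c; all parameter names and values are assumptions.

```python
import math

def eta(c, K, r, alpha, beta):
    # Decay rate for K IID exponential on-off sources (form of Equation 10.32)
    return K * (c * alpha + c * beta - K * beta * r) / (c * (K * r - c))

def loss_estimate(c, K, r, alpha, beta, B):
    # Exponential estimate of P(X > B); decreasing in c on (mean rate, peak rate)
    return math.exp(-eta(c, K, r, alpha, beta) * B)

def c_min(K, r, alpha, beta, B, eps, tol=1e-9):
    """Smallest capacity c with loss_estimate(c) <= eps, by bisection."""
    lo = K * r * beta / (alpha + beta)  # mean input rate (below this: unstable)
    hi = K * r                          # peak input rate (above this: no loss)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if loss_estimate(mid, K, r, alpha, beta, B) > eps:
            lo = mid  # loss still too large: need more capacity
        else:
            hi = mid
    return hi
```

Any other monotone loss estimate (such as the SMP bound itself) can be substituted for `loss_estimate` without changing the search structure.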
FIGURE 10.20
Estimates of the required bandwidth, c^smp_min and c^ebw_min. (From Gautam, N. and Kulkarni, V.G., Queueing Syst. Theory Appl., 36, 351, 2000. With permission.)
Problem 118
Consider a multiclass fluid queueing system with N = 2. For j = 1, 2, class j
fluid enters into buffer j from Kj exponential on-off sources with mean on-
time 1/αj and mean off-time 1/βj . Fluid is generated by each source at rate
rj when the source is in the on-state. Fluid is emptied by a channel with
capacity c that serves buffer j for τj time and has a total switch-over time of
tso per cycle. State an algorithm to determine the feasible region for this timed
round-robin policy, Ktrr, so that if (K1, K2) ∈ Ktrr, then P(Xj > Bj) < εj for j = 1
and j = 2. Graph the feasible region for the following numerical values:
To begin with, assume that the cycle time T and the switch-over time tso are
fixed known constants. However, the values τ1 and τ2 vary and are appro-
priately chosen such that τ1 + τ2 + tso = T. Subsequently, consider the case
where T is varied so that it is under different orders of magnitude compared
to tso .
Solution
An algorithm to compute the feasible region:
1. Set K = ∅.
2. Let τ1 = T and τ2 = 0. (The scheduler always serves only buffer-
1, and hence there are no switch-over times and no compensating
source.)
3. Obtain the maximum number of admissible class-1 sources K1max as
the maximum value of K1 such that
where
and
η1 = [c(α1 + β1) − r1 K1 β1] / [(c/K1)(r1 K1 − c)].
where
and
η2 = [c(α2 + β2) − r2 K2 β2] / [(c/K2)(r2 K2 − c)].
8. Set K1 = 1.
9. While K1 < K1max :
(i) Compute the minimum required τ1 (≤ T − tso ) such that the
loss probability is less than ε1.
(ii) Compute the available τ2 ( = T − tso − τ1 ).
(iii) Given τ2 , compute the maximum possible K2 value by
minimizing over the set A2 for K2 + 1 sources.
(iv) K = K ∪ {(K1 , 1), (K1 , 2), . . . , (K1 , K2 )}.
(v) K1 = K1 + 1.
10. Return Ktrr = K.
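The algorithm above can be sketched as follows. The SMP-bound loss computation is the involved part; as a stand-in, the hypothetical `loss_prob` below uses a crude effective-bandwidth surrogate in which buffer j is treated as if it were served at the averaged rate c·τj/T. The loop structure mirrors steps 8 through 10; all names and values are illustrative assumptions.

```python
import math

def loss_prob(K, tau, T, c, r, alpha, beta, B):
    """Crude surrogate for P(Xj > Bj): K IID exponential on-off sources
    served at the averaged rate c * tau / T (replace with the SMP bound)."""
    ceff = c * tau / T
    if K * r <= ceff:
        return 0.0            # peak rate below capacity: buffer stays empty
    if K * r * beta / (alpha + beta) >= ceff:
        return 1.0            # mean rate above capacity: unstable
    eta = K * (ceff * (alpha + beta) - K * beta * r) / (ceff * (K * r - ceff))
    return math.exp(-eta * B)

def feasible_region(T, tso, c, r, alpha, beta, B, eps, K1cap=50):
    """Steps 8-10 of the algorithm: sweep K1, split T - tso into tau1, tau2."""
    region = set()
    for K1 in range(1, K1cap + 1):
        # (i) smallest tau1 <= T - tso meeting the class-1 QoS (grid search)
        tau1 = next((f / 100 * (T - tso) for f in range(1, 101)
                     if loss_prob(K1, f / 100 * (T - tso), T, c,
                                  r[0], alpha[0], beta[0], B[0]) < eps[0]), None)
        if tau1 is None:
            break              # class 1 infeasible; larger K1 only gets worse
        tau2 = T - tso - tau1  # (ii) remaining service time for buffer 2
        K2 = 0                 # (iii) largest K2 meeting the class-2 QoS
        while loss_prob(K2 + 1, tau2, T, c, r[1], alpha[1], beta[1], B[1]) < eps[1]:
            K2 += 1
        region.update((K1, k) for k in range(1, K2 + 1))  # (iv)
    return region
```

Replacing `loss_prob` by the SMP-bound computation of Section 10.2.3 recovers the algorithm as stated.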
FIGURE 10.21
Admissible region Ktrr (K2 versus K1). (From Gautam, N. and Kulkarni, V.G., Queueing Syst. Theory Appl., 36, 351, 2000. With permission.)
FIGURE 10.22
Ktrr as a function of T (curves shown for T = 0.14, T = 1.22, and T = 12.02). (From Gautam, N. and Kulkarni, V.G., Queueing Syst. Theory Appl., 36, 351, 2000. With permission.)
That said, we move on to the next service scheduling policy, the static
priority policy.
Buffer-1: η1 is obtained as the solution of

Σ_{i=1}^{K1} eb_{i1}(η1) = c.   (10.28)
The constants L1 and C∗1 can be obtained using the CDE approximation
(Section 10.2.2) and SMP bounds (Section 10.2.3), respectively. Thereby,
we could use P(X1 > x) to determine if the QoS criterion P{X1 > B1} ≤ ε1 is
satisfied.
Buffer-j (1 < j ≤ N): The capacity available to buffer j is 0 when at least one of the buffers 1, . . . , j − 1 is nonempty, and it is c − Σ_{k=1}^{j−1} Σ_{i=1}^{Kk} r_{ik}(Z_{ik}(t)) if all the buffers 1, . . . , j − 1 are empty. Let R_{j−1}(t) be the sum of the output rates of the buffers 1, . . . , j − 1 at time t, with R_0(t) = 0. Therefore

R_{j−1}(t) = { c                                                      if Σ_{k=1}^{j−1} X_k(t) > 0,
             { min[ c, Σ_{k=1}^{j−1} Σ_{i=1}^{Kk} r_{ik}(Z_{ik}(t)) ]  if Σ_{k=1}^{j−1} X_k(t) = 0.    (10.29)
Thus, the (time varying) channel capacity available for buffer j is c − Rj−1 (t)
at time t. Any sample path of the buffer content process {Xj (t), t ≥ 0} remains
unchanged if we transform the model for buffer j into one that gets served at
a constant capacity c and an additional compensating source producing fluid
at rate Rj−1 (t) at time t. Note that the compensating source j is independent
of the Kj sources of priority j.
A critical observation to make is that the compensating source is indeed
the output from a buffer whose input is the aggregated input of all 1, . . . , j−1
priority traffic and constant output capacity c. This observation is made in
Elwalid and Mitra [32] and is immensely useful in the analysis. Consider the
transformed model for the case N = 2 (a 2-priority model for ease of expla-
nation) depicted in Figure 10.23. The sample paths of the buffer content
processes {X1 (t), t ≥ 0} and {X2 (t), t ≥ 0} in this model are identical to those
in the original system. Similarly, in the case of N priorities, such a tandem
model is used.
Thus, buffer j can be equivalently modeled as one that is served at a con-
stant rate c, but has an additional compensating source as described earlier.
Let the effective bandwidth of this compensating source (which is the out-
put traffic from a fictitious buffer with input corresponding to all j − 1 higher
FIGURE 10.23
Equivalent N = 2 priority system. (From Gautam, N. and Kulkarni, V.G., Queueing Syst. Theory Appl., 36, 351, 2000. With permission.)
priority sources) be eb^j_o(v). Then ηj is given by

Σ_{i=1}^{Kj} eb_{ij}(ηj) + eb^j_o(ηj) = c,   ∀ j = 1, 2, . . . , N,

where eb^1_o(·) = 0 and eb^j_o(·) is as in Equation 10.30. If it is possible to characterize the output process as a tractable stochastic process, we could use CDE
approximation (Section 10.2.2) or SMP bounds (Section 10.2.3) to derive an
expression for P(Xj > x). Otherwise, we could always use the effective band-
width approximation P(Xj > x) ≈ e−ηj x , for j = 2, . . . , N. Thereby, we could
use P(Xj > x) to determine if the QoS criterion P{Xj > Bj} ≤ εj is satisfied. We
will use the SMP bounds to obtain an approximation for P(Xj > x) in the
example we present next.
Problem 119
Consider two classes of traffic. The Kj class-j sources, for j = 1, 2, are indepen-
dent and identical on-off sources with exponential on and off times, on-time
K1 r1 β1/(α1 + β1) + K2 r2 β2/(α2 + β2) < c.
For the analysis, we first consider buffer-1. If K1 ≤ c/r1 , then P{X1 > x} = 0,
since buffer-1 will always be empty. Now for the case K1 > c/r1 , let η1 be
the solution to K1 eb1 (η1 ) = c. Then the steady-state distribution of the buffer-
content process is bounded as
where

η1 = K1(cα1 + cβ1 − K1 β1 r1) / (c(K1 r1 − c)),   (10.32)

C∗1 = [ (K1 r1/(K1 r1 − c)) (α1/(α1 + β1)) ]^{K1} / [ cα1/(β1(K1 r1 − c)) ]^{c/r1},

and

C∗1 = [ K1 r1 β1/(c(α1 + β1)) ]^{K1}.
or

(v∗/η2) K1 eb1(v∗) + K2 eb2(η2) = c v∗/η2   and   η2 > v∗,

where

v∗ = (β1/r1) [ −1 + (cα1/(β1(K1 r1 − c))) (1 − √( α1 β1 (K1 r1 − c)/(r1 c α1) )) ].
c1 = K2 eb2 (η2 )
and
c2 = c − K2 eb2 (η2 ).
τ1_i = 1/(iα2 + (K2 − i)β2),

and

p1_i = a1_i τ1_i / Σ_{m=0}^{K2} a1_m τ1_m = [K2!/(i!(K2 − i)!)] α2^{K2−i} β2^i / (α2 + β2)^{K2}.
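The closed form for p1_i is a binomial distribution (each source is on with probability β2/(α2 + β2) in steady state). The sketch below rebuilds p_i = a_i τ_i / Σ_m a_m τ_m from the embedded jump chain of the on-off environment (using reversibility of the birth-death chain) and checks it against the closed form; the parameter values are illustrative.

```python
import math

def smp_state_probs(K, alpha, beta):
    """p_i = a_i * tau_i / sum_m a_m * tau_m for the on-off environment SMP."""
    rate = [i * alpha + (K - i) * beta for i in range(K + 1)]  # exit rates
    tau = [1.0 / x for x in rate]                              # mean sojourn times
    # embedded birth-death jump chain is reversible: build a_i by detailed balance
    a = [1.0]
    for i in range(K):
        up = (K - i) * beta / rate[i]          # P(i -> i + 1)
        down = (i + 1) * alpha / rate[i + 1]   # P(i + 1 -> i)
        a.append(a[i] * up / down)
    z = sum(ai * ti for ai, ti in zip(a, tau))
    return [ai * ti / z for ai, ti in zip(a, tau)]

def binomial_probs(K, alpha, beta):
    """Closed form from the text: C(K, i) alpha^(K-i) beta^i / (alpha + beta)^K."""
    return [math.comb(K, i) * alpha ** (K - i) * beta ** i / (alpha + beta) ** K
            for i in range(K + 1)]
```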
h1 = h1 Φ1(η2).
Therefore,

H1 = Σ_{i=0}^{K2} [ h1_i/(η2(i r2 − c1)) ] ( Σ_{j=0}^{K2} φ1_ij(η2) − 1 )

and

ψ1_max(i, j) = ψ1_min(i, j)
  = [ h1_i e^{−η2(i r2 − c1)x} ∫_x^∞ e^{η2(i r2 − c1)y} dG1_ij(y) ] / [ (p1_i/τ1_i) ∫_x^∞ dG1_ij(y) ]
  = (h1_i/p1_i) · 1/( iα2 + (K2 − i)β2 − η2(i r2 − c1) ).
G2_{i,j}(t) = { [iα1/(iα1 + (K1 − i)β1)] (1 − exp{−(iα1 + (K1 − i)β1)t})        if j = i − 1,
             { [(K1 − i)β1/(iα1 + (K1 − i)β1)] (1 − exp{−(iα1 + (K1 − i)β1)t})  if j = i + 1,
             { 0                                                                 otherwise.
Let

G2_{i,j}(∞) = { iα1/(iα1 + (K1 − i)β1)         if j = i − 1,
              { (K1 − i)β1/(iα1 + (K1 − i)β1)   if j = i + 1,
              { 0                                otherwise,
G2M,j (∞) = G̃2M,j (0),
where G̃2M,j (s) is the LST of G2M,j (t) that we have shown in Problem 116.
We also need the expression for the sojourn time τ2i in state i, for
i = 0, 1, . . . , M. We have
τ2_i = { 1/(iα1 + (K1 − i)β1)                          if i = 0, 1, . . . , M − 1,
       { −Σ_{j=0}^{M−1} [dG̃2_{M,j}(s)/ds]_{s=0}        if i = M,
p2_i = a2_i τ2_i / Σ_{k=0}^{M} a2_k τ2_k,
where
a2 = a2 G2 (∞).
Define

φ2_ij(η2, m) = { G̃2_ij(−η2(i r1 − c2))    if 0 ≤ i ≤ M − 1,
              { m G̃2_ij(−η2(c − c2))      if i = M.

Solve for m such that the Perron–Frobenius eigenvalue of Φ2(η2, m) is 1. Hence, we obtain h2 from

h2 Φ2(η2, m) = h2.
It can be shown that random variables with distribution G2_{Mj}(x)/G2_{Mj}(∞) have a decreasing failure rate. Hence, ψ2_min(M, j) and ψ2_max(M, j) occur at x = ∞ and x = 0, respectively. Thus, we have for (i, j) ∈ {0, 1, . . . , M},
H2 = Σ_{i=0}^{M} [ h2_i/(η2(i r1 − c2)) ] ( Σ_{j=0}^{M} φ̄2_ij(η2, m) − 1 ),
ψ2_min(i, j) = inf_x { [ h2_i e^{−η2(i r1 − c2)x} ∫_x^∞ e^{η2(i r1 − c2)y} dG2_ij(y) ] / [ (p2_i/τ2_i) ∫_x^∞ dG2_ij(y) ] },

and

ψ2_max(i, j) = sup_x { [ h2_i e^{−η2(i r1 − c2)x} ∫_x^∞ e^{η2(i r1 − c2)y} dG2_ij(y) ] / [ (p2_i/τ2_i) ∫_x^∞ dG2_ij(y) ] }.
C∗2 = H1 H2 / [ min_{(i1,j1),(i2,j2): min{i1 r1, c} + i2 r2 > c, p_{i1 j1} > 0, p_{i2 j2} > 0} ψ1_min(i1, j1) ψ2_min(i2, j2) ]

and

C∗2 = H1 H2 / [ max_{(i1,j1),(i2,j2): min{i1 r1, c} + i2 r2 > c, p_{i1 j1} > 0, p_{i2 j2} > 0} ψ1_max(i1, j1) ψ2_max(i2, j2) ].
Problem 120
Consider the N = 2 class system described in Problem 119, where all sources
of a class are IID exponential on-off sources. Obtain admissible region K
using effective bandwidth approximation, CDE approximation, as well as
SMP bounds so that if (K1, K2) ∈ K, then P(X1 > B1) ≤ ε1 and P(X2 > B2) ≤ ε2.
Compare the approaches for the following numerical values:
Solution
Using the three different methodologies (effective bandwidth, CDE, and
SMP bounds), we can obtain different admissible regions based on the
expressions used for approximating P(Xj > x) for j = 1, 2. We first describe
them and provide notation:
FIGURE 10.24
Regions N, Kebw, Ksmp, K(1)cde, and K(2)cde. (From Gautam, N. and Kulkarni, V.G., Queueing Syst. Theory Appl., 36, 351, 2000. With permission.)
Next, we compare the region Ksmp with the regions obtained using the CDE approximation, K(1)cde and K(2)cde, as well as the regions obtained by
effective-bandwidth approximation Kebw and N . We represent the regions
under consideration in Figure 10.24 using the numerical values stated in the
problem.
The region obtained by the SMP bounds, Ksmp , is conservative. There-
fore, if an admissible region has points in Ksmp , then those points are
guaranteed to satisfy the QoS criteria. Thus, the effective-bandwidth approx-
imation produces overly conservative results for these parameter values.
It is crucial to point out that although the effective bandwidth produces
conservative results usually, it is not guaranteed to be conservative, unlike
the results from SMP bounds. But in general, on the one hand, the effective-bandwidth approximation is computationally easy; on the other hand, it could either be too conservative (and hence lead to underutilization of resources) or be nonconservative (and hence leave it unclear whether the QoS criteria are met). The CDE approximation, although computationally slower than
the effective-bandwidth approximation, is typically faster than the SMP
bounds technique. However, there are examples where we can show that
the CDE approximation produces regions K(1)cde and K(2)cde with points (K1, K2)
that would actually result in the QoS criteria not being satisfied. Using SMP
Problem 121
Consider a multiclass fluid queueing system with N = 2 and IID exponen-
tial on-off sources for each class. For j = 1, 2, class j fluid enters into buffer
j from Kj exponential on-off sources with mean on-time 1/αj and mean off-
time 1/βj . Fluid is generated by each source at rate rj when the source is in
the on-state. Fluid is emptied by a channel with capacity c. Use the following
numerical values:
Consider two policies: (1) timed round robin with tso = 0.02 and T = c(B1 +
B2 ) + tso ; and (2) static priority. Compare the two policies by viewing the
admissible regions.
Solution
In Figure 10.25, we compare the two policies, timed round-robin and static
priority, by viewing their respective admissible regions (using SMP bounds,
hence the region corresponds to Ksmp in the previous problem and Ktrr
in Section 10.3.2) for two-class exponential on-off sources with parameters
given in the problem.
From the figure, we see the timed round-robin policy results in a smaller
admissible region. This is because unlike the static priority service policy,
the timed round-robin policy is not a work-conserving service discipline. In
particular, there is time switching between buffers as well as time spent in
buffers (recall that τ1 and τ2 are always spent) even if there is no traffic.
However, static priority service policy does not achieve fairness among the
classes of traffic. Therefore, it may not be an appropriate policy to use at
all times.
There are many other policies one could consider besides timed round-
robin and static priority. We briefly describe three of them in the following
section.
FIGURE 10.25
Timed round-robin versus static priority (admissible regions, K2 versus K1). (From Gautam, N. and Kulkarni, V.G., Queueing Syst. Theory Appl., 36, 351, 2000. With permission.)
• φ1 + φ2 + · · · + φN = 1
• If all the input buffers have nonzero fluid, the scheduler allocates
output capacity c in the ratio φ1 : φ2 : · · · : φN to each of the N
buffers
• If some buffers are empty, the scheduler allocates just enough capac-
ity to those buffers (equal to the rates of traffic entering) so that
The GPS policy is in some sense the limiting timed round-robin policy,
where tso = 0 and for all j, τj → 0 such that τj /T → φj . The only exception to
timed round-robin is when a buffer is empty and here we assume that the
system is work conserving. So empty buffers are served only for a fraction
of their slot. The discrete version of the GPS is called the packetized general
processor sharing (PGPS) or weighted fair queueing, which is well-studied
in the literature. The quality-of-service aspects, effective bandwidths, and
admission control for the GPS and PGPS have been addressed in detail in
de Veciana et al. [24] and [22]. We recapitulate those results for GPS in the next problem for the case N > 2; for the N = 2 case, the reader is referred to those articles.
Problem 122
What is the condition of stability? Assuming a stable system, let Xj be the amount of fluid in buffer j in steady state for all j ∈ [1, . . . , N]. Using effective bandwidth analysis, obtain an approximation for P(Xj > Bj) for all
j ∈ [1, . . . , N].
Solution
Notice that the scheduler is work conserving, in other words it is impossible
that there is fluid in at least one buffer and the scheduler is draining at a rate
lower than c. Thus, the condition for stability is that the sum of the mean input rates of all the sources of all N classes be strictly less than c.
Using the effective bandwidth analysis is a little tricky since the compen-
sating source is not easy to characterize except for some very special cases.
For some j ∈ [1, . . . , N], take buffer j. It is guaranteed a minimum bandwidth
of φj c at all times. However, the remaining (1 − φj )c (or greater, if buffer
j is empty) is shared among all the sources in ratios according to the GPS
scheme.
This is a little tricky to capture using a compensating source. Hence, we
consider a fictitious compensating source that is essentially the output from
a fictitious buffer with capacity (1 − φj )c and input being all the sources from
all the classes except j. Thus, when the fictitious buffer is nonempty, buffer
j gets exactly φj c capacity; however, when the fictitious buffer is empty,
all unutilized capacity is used by buffer j. It is not difficult to check that
this compensating source is rather conservative, that is, in the real setting,
a lesser amount of fluid flows from the compensating source. In the spe-
cial case when N = 2 and K1 = K2 = 1 on-off source with on rates of class j
source being larger than φj c, this fictitious compensating source is identical
to the real compensating source. One could certainly develop other types
of compensating sources. The key idea here is a conservative one where
unless all the other buffers are empty, the remaining capacity is not allocated
to buffer j.
For such a compensating source, we solve for ηj as the unique solution to
Σ_{i=1}^{Kj} eb_{ij}(ηj) + min{ (1 − φj)c, Σ_{k≠j} Σ_{i=1}^{Kk} eb_{ik}(ηj) } = c.
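Since each eb_ik(·) is increasing, the left-hand side is increasing in ηj and the root can be found by bisection. A sketch for classes of IID exponential on-off sources follows (function names and parameter values are illustrative assumptions):

```python
import math

def eb_onoff(v, r, alpha, beta):
    """Effective bandwidth of an exponential on-off source (cf. Equation 10.18)."""
    t = r * v - alpha - beta
    return (t + math.sqrt(t * t + 4 * beta * r * v)) / (2 * v)

def gps_eta(j, K, r, alpha, beta, phi, c, v_hi=200.0):
    """Bisection for eta_j in:
       sum_i eb_ij(eta) + min((1 - phi_j) c, sum_{k != j} sum_i eb_ik(eta)) = c."""
    def g(v):
        own = K[j] * eb_onoff(v, r[j], alpha[j], beta[j])
        others = sum(K[k] * eb_onoff(v, r[k], alpha[k], beta[k])
                     for k in range(len(K)) if k != j)
        return own + min((1 - phi[j]) * c, others) - c
    lo, hi = 1e-9, v_hi
    if g(lo) >= 0 or g(hi) <= 0:
        raise ValueError("root not bracketed: check stability and phi_j")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```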
In fact, instead of the preceding expression one could have been more strict
and written down the output effective bandwidth from the fictitious buffer.
Thereby, we can obtain an approximation for the probability that there is
more than Bj amount of fluid in buffer j as P(Xj > Bj) ≈ e^{−Bj ηj}. Also, the QoS criteria are satisfied if, for all j = 1, 2, . . . , N,

e^{−Bj ηj} < εj.
That said, we move on to the next set of policies. Both are based on thresh-
olds of buffer contents. They leverage upon results from this chapter as well
as Chapter 9.
Problem 123
Consider a fluid queueing system with two infinite-sized buffers as shown
in Figure 10.26. For j = 1, 2, fluid enters buffer j according to an alternating
on-off process such that for an exponentially distributed time (with mean
1/αj ) fluid enters continuously at rate rj and then no fluid enters for another
FIGURE 10.26
Two-buffer system. (From Aggarwal, V. et al., Perform. Eval., 59(1), 19, 2004. With permission.)
exponentially distributed time (with mean 1/βj ). When the off-time ends,
another on-time starts, and so on. Let Xj (t) be the amount of fluid in buffer j
(for j = 1, 2) at time t. A scheduler alternates between buffer-1 and buffer-2 while draining out fluid continuously at rate c. Assume that r1 > c and
r2 > c. The policy adopted by the scheduler is as follows: as soon as buffer-
1 becomes empty (i.e., X1 (t) = 0), the scheduler switches from buffer-1 to
buffer-2. When the buffer contents in buffer-1 reaches a (i.e., X1 (t) = a), the
scheduler switches back from buffer-2 to buffer-1. We denote 0 and a as the
thresholds for buffer-1. What is the stability condition? Assuming stability,
derive an expression using SMP bounds for the steady-state distribution of
the contents of buffer-2.
Solution
Note that the scheduler’s policy is dependent only on buffer-1. That means
even if buffer-2 is empty (i.e., X2 (t) = 0), as long as buffer-1 has less than a
(i.e., X1 (t) < a), the scheduler does not switch back to buffer-1. Also it is rela-
tively straightforward to model the dynamics of buffer-1 and obtain the state
probability P(X1 > x) for x > a assuming the buffer is stable (and as t → ∞,
X1 (t) → X1 ). This analysis is described in Chapter 9 (Problem 96). Here we
only consider bounds for P(X2 > x) assuming the system is stable. In fact, the
stability condition (for limiting distributions of the buffer contents X1 (t) and
X2 (t) to exist) is
r1 β1/(α1 + β1) + r2 β2/(α2 + β2) < c.
Õ1(w) = { [(w + β1 + c s0(w))/β1] e^{a s0(w)}   if w ≥ w∗,
        { ∞                                     otherwise,
where

w∗ = [2√(cα1β1(r1 − c)) − r1β1 − cα1 − cβ1]/r1,

s0(w) = [−b − √(b² + 4w(w + α1 + β1)c(r1 − c))] / (2c(r1 − c)),
and b = (r1 −2c)w+(r1 −c)β1 −cα1 . The mean on-time E[T1 ] can be computed
as E[T1 ] = − dÕ1 (w)/dw at w = 0. Hence we have
E[T1] = [r1 + a(α1 + β1)] / (cα1 + cβ1 − r1β1).
Õ2(s) = [β1/(β1 + s)] e^{−a(α1 s + β1 s + s²)/(r1 s + r1 β1)}.
Hence, the mean off-time E[T2 ] can be derived as E[T2 ] = − dÕ2 (s)/ds at s = 0
and is given by
E[T2] = [r1 + a(α1 + β1)] / (r1β1).
A detailed derivation of Õ1 (·) and Õ2 (·) is described in Aggarwal et al. [3].
Using this compensating source model and the SMP bounds analysis, we
can derive the limiting distribution of the buffer contents of buffer-2. We first
obtain the effective bandwidth of source 2 (the original source into buffer-2)
using Equation 10.18 as
eb2(v) = [ r2 v − α2 − β2 + √((r2 v − α2 − β2)² + 4β2 r2 v) ] / (2v).
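This expression is increasing in v, tends to the mean rate r2β2/(α2 + β2) as v → 0, and tends to the peak rate r2 as v → ∞, which is easy to verify numerically (a sketch with illustrative values):

```python
import math

def eb_onoff(v, r, alpha, beta):
    """Effective bandwidth of an exponential on-off source with on-rate r,
    mean on-time 1/alpha, and mean off-time 1/beta (cf. Equation 10.18)."""
    t = r * v - alpha - beta
    return (t + math.sqrt(t * t + 4 * beta * r * v)) / (2 * v)
```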
FIGURE 10.27
Two-buffer system. (From Mahabhashyam, S. et al., Oper. Res., 56(3), 728, 2008. With permission.)
Problem 124
Consider a two-buffer fluid flow system illustrated in Figure 10.27. For
j = 1, 2, class j fluid enters buffer j according to an alternating on-off process
so that fluid enters continuously at rate rj for an exponentially distributed
time (on-times) with mean 1/αj and then no fluid enters (off-times) for
another exponential time with mean 1/βj . The on and off times continue
alternating one after the other. The buffers can hold an infinite amount of
fluid; however, the contents of only one buffer is observed, buffer-1. There
are two schedulers that drain fluid from the two buffers. Scheduler-1 has
a capacity of c1 and scheduler-2 has a capacity c2 , which are the maximum
rates the respective schedulers can drain fluid. Let Xj (t) be the amount of
fluid in buffer j (for j = 1, 2) at time t. Fluid is drained from the two buffers
in the following fashion. When X1 (t) is nonzero, scheduler-1 serves buffer-1
and when X1 (t) = 0, scheduler-1 serves buffer-2. Also, if X1 (t) is less than a
threshold x∗ , scheduler-2 removes fluid from buffer-2, otherwise it drains out
buffer-1. Assuming stability, derive bounds for the steady-state fluid level in
buffer-2.
Solution
Notice that when X1 (t) = 0, both schedulers serve buffer-2 and when
0 < X1 (t) < x∗ scheduler-1 serves buffer-1 and scheduler-2 serves buffer-2,
whereas, when X1 (t) ≥ x∗ , both schedulers serve buffer-1. Since only X1 (t)
is observed, the buffer-emptying scheme depends only on it. If Cj (t) is
the capacity available for buffer j at time t, then C1 (t) = 0 when X1 (t) = 0,
C1 (t) = c1 when 0 < X1 (t) < x∗ , and C1 (t) = c1 +c2 whenever X1 (t) ≥ x∗ . Capac-
ity available for buffer-2 at any time t is C2 (t) = c1 + c2 − C1 (t). The stability
condition for the two-buffer system in Figure 10.27 is given by:
r1 β1 r2 β2
+ < c1 + c2 .
α1 + β1 α2 + β2
It is possible to obtain the state probability P(X1 > x) for x > x∗ assum-
ing the buffer is stable (and as t → ∞, X1 (t) → X1 ). This analysis is described
in Problem 97 (see Chapter 9). Here we only consider bounds for P(X2 > x),
where X2 is the amount of fluid in buffer-2 in steady state. For buffer-2, the
output capacity is not only variable but also inherently dependent on con-
tents of buffer-1. The input for buffer-2 is from an exponential on-off source
but the output capacity varies from zero to (c1 + c2 ) depending on the buffer
content in buffer-1. The variation of output capacity over time (say Ô(t)) with
respect to content of buffer-1 is as follows:
Ô(t) = { 0         when X1(t) ≥ x∗,
       { c2        when 0 < X1(t) < x∗,
       { c1 + c2   when X1(t) = 0.
Consider the queueing system (as depicted in Figure 10.28), where there
are two input streams and a server with a constant output capacity c1 + c2 .
The first input stream is a compensating source, where fluid enters the queue
at rate c1 +c2 − Ô(t) at time t. The second input stream is identical to source-2,
where fluid enters according to an exponential on-off process with rates r2
when on and 0 when off. The environment process that drives traffic gener-
ation for the compensating source can be modeled as a four-state SMP. Let
Z1 (t) be the environment process denoting the on-off source for buffer-1. If
source-1 is on at time t, Z1 (t) = 1 and if source-1 is off at time t, Z1 (t) = 0. Con-
sider the Markov regenerative sequence {(Yn , Sn ), n ≥ 0} where Sn is the nth
regenerative epoch, corresponding to X1 (t) equaling either x∗ or 0, and Yn
is the state immediately following the nth Markov regenerative epoch such
that
Yn = { 1   if X1(Sn) = 0 and Z1(Sn) = 0,
     { 2   if X1(Sn) = 0 and Z1(Sn) = 1,
     { 3   if X1(Sn) = x∗ and Z1(Sn) = 0,
     { 4   if X1(Sn) = x∗ and Z1(Sn) = 1.
FIGURE 10.28
Buffer-2 with compensating source (served at constant rate c1 + c2). (From Mahabhashyam, S. et al., Oper. Res., 56(3), 728, 2008. With permission.)
The expressions G12 (t), G21 (t), G24 (t), G31 (t), G34 (t), and G43 (t) need to
be obtained. Two of them are relatively straightforward to obtain, namely,
G12 (t) and G43 (t). First consider G12 (t). This is the probability that Yn changes
from 1 to 2 before time t, which is the same as the probability of the source-1
going from off to on. Hence G12 (t) is given by
Next consider G43 (t). This is the probability that the buffer-2 content goes up
from x∗ and reaches x∗ in time t. This is identical to the probability that the
buffer content starts at zero, goes up, and comes back to zero within time t,
that is, equivalent to the busy period distribution. The LST of G43 (t) can be
obtained by substituting appropriate terms in the busy period distribution
of Problem 94. Hence,
G̃43(w) = { (w + β1 + c s0(w))/β1   if w > w∗,
          { ∞                       otherwise,
where

s0(w) = [−b − √(b² + 4w(w + α1 + β1)c(r1 − c))] / (2c(r1 − c)),

and b = (r1 − 2c)w + (r1 − c)β1 − cα1, w∗ = (2√(cα1β1(r1 − c)) − r1β1 − cα1 − cβ1)/r1, and c = c1 + c2.
To obtain expressions for the remaining terms in the kernel of the SMP,
namely, G21 (t), G24 (t), G31 (t), and G34 (t), turn to Problem 93. Using
a subscript of 1 for α, β, r, and c, it is relatively straightforward to
see that
G̃31(w) = a11(w) e^{S1(w)x∗} ψ1(w) + a21(w) e^{S2(w)x∗} ψ2(w),
where

S1(w) = [−b̂ − √(b̂² + 4w(w + α1 + β1)c1(r1 − c1))] / (2c1(r1 − c1)),

S2(w) = [−b̂ + √(b̂² + 4w(w + α1 + β1)c1(r1 − c1))] / (2c1(r1 − c1)),

ψi(w) = β1 / (w + β1 + Si(w)c1),
and finally,

a11(w) = e^{S2(w)x∗}/δ(w),
a12(w) = −ψ2(w)/δ(w),
a21(w) = −e^{S1(w)x∗}/δ(w),
a22(w) = ψ1(w)/δ(w),

with δ(w) = e^{S2(w)x∗} ψ1(w) − e^{S1(w)x∗} ψ2(w).
Now that we have characterized the compensating source as an SMP,
next we consider that and the original source-2 and obtain the effective
bandwidths. The effective bandwidth of the compensating source can be
computed as follows. For a given v such that v > 0, define the matrix χ(v, u)
such that
χ(v, u) =
⎡ 0                 G̃12(vu)          0                         0               ⎤
⎢ G̃21(vu − c1 v)   0                 0                         G̃24(vu − c1 v) ⎥
⎢ G̃31(vu − c1 v)   0                 0                         G̃34(vu − c1 v) ⎥
⎣ 0                 0                 G̃43(vu − c1 v − c2 v)    0               ⎦ .
Using the effective bandwidths of the original source 2 (eb2(v)) and the compensating source (ebc(v)), η can be obtained as the unique solution to eb2(η) + ebc(η) = c1 + c2. Define

γ2 = eb2(η),
γc = ebc(η).
Define Φ(η) = χ(η, ebc(η)) such that φij(η) is the ijth element of Φ(η). Let h be the left eigenvector of Φ(η) corresponding to the eigenvalue 1, that is,

h = h Φ(η).
where

K∗ = [ (r2 β2/(γ2(α2 + β2))) Σ_{i=1}^{4} [ h_i/(η(r(i) − γc)) ] ( Σ_{j=1}^{4} φ_ij(η) − 1 ) ] / [ min_{(i,j): p_ij > 0} ψ_min(i, j) ],

and

K∗ = [ (r2 β2/(γ2(α2 + β2))) Σ_{i=1}^{4} [ h_i/(η(r(i) − γc)) ] ( Σ_{j=1}^{4} φ_ij(η) − 1 ) ] / [ max_{(i,j): p_ij > 0} ψ_max(i, j) ],

with ψ_min(i, j) and ψ_max(i, j) derived using the values in Table 10.3.
TABLE 10.3
Values of ψ_max(i, j) and ψ_min(i, j)

IFR & r(i) > γc:   ψ_max(i, j) = φ̃ij(−η(r(i) − γc)) τi hi / (pij pi);   ψ_min(i, j) = τi hi λij(∞) / (pi(λij(∞) − η(r(i) − γc)))
IFR & r(i) ≤ γc:   ψ_max(i, j) = τi hi λij(∞) / (pi(λij(∞) − η(r(i) − γc)));   ψ_min(i, j) = φ̃ij(−η(r(i) − γc)) τi hi / (pij pi)
DFR & r(i) > γc:   ψ_max(i, j) = τi hi λij(∞) / (pi(λij(∞) − η(r(i) − γc)));   ψ_min(i, j) = φ̃ij(−η(r(i) − γc)) τi hi / (pij pi)
DFR & r(i) ≤ γc:   ψ_max(i, j) = φ̃ij(−η(r(i) − γc)) τi hi / (pij pi);   ψ_min(i, j) = τi hi λij(∞) / (pi(λij(∞) − η(r(i) − γc)))

Source: Mahabhashyam, S. et al., Oper. Res., 56(3), 728, 2008. With permission.
Reference Notes
The main focus of this chapter was to determine approximations and bounds
for steady-state fluid levels in infinite-sized buffers. However, we did not
present the underlying theory of large deviations that enabled this. Inter-
ested readers can refer to Shwartz and Weiss [98] as well as Ganesh et al.
[38] for an excellent treatment of large deviations. The crucial point is that
the tail events are extremely rare and, in fact, only analytical models can be
used to estimate their probabilities suitably. There are simulation techniques
too but they are typically based on a change of measure argument follow-
ing the Radon–Nikodym theorem. Details regarding change of measures can
be found in textbooks such as by Ethier and Kurtz [33]. In fact, the bounds
described in this chapter are based on some exponential change of measure
arguments in Ethier and Kurtz [33].
The common theme in this chapter is the concept of effective band-
widths (also called effective capacity). The theoretical underpinnings for
effective bandwidth is based on large deviations and we briefly touched
upon the Gärtner–Ellis condition. Further details on the Gärtner–Ellis con-
ditions can be found in Kesidis et al. [61] and the references therein. An
excellent tutorial on effective bandwidths is Kelly [60]; it takes a somewhat different approach, defining the effective bandwidth at time t, whereas what we present is based on letting t → ∞.
[30], Kesidis et al. [61], and Kulkarni [68] that show how to compute effec-
tive bandwidths of several types of traffic flows. Further, Chang and Thomas
[16], Chang and Zajic [17], and de Veciana et al. [23] explain effective
bandwidth computations for outputs from queues and extend the results
to networks.
Once we know how to compute effective bandwidths, they can be used
for approximating buffer content distributions. Recall from Section 9.2.3
that buffer content distributions can be obtained only when the sources are
CTMCs (based on Anick et al. [5], Elwalid and Mitra [28, 29], and Kulkarni
[69]). The effective bandwidth approximation lends itself well for computing
the tail distributions. The results presented in this chapter (Section 10.2.1)
are summarized from Elwalid and Mitra [30], Kesidis et al. [61], Krishnan
et al. [65], and Kulkarni [68]. These results were fine-tuned by Elwalid et
al. [31, 32] by considering Chernoff bounds (hence the CDE approximation
in Section 10.2.2). Although effective bandwidth and CDE approximations
are mainly for the tail probabilities, exponential bounds on the buffer con-
tent analysis (called SMP bounds because these require the sources to be
Stochastic Fluid-Flow Queues: Bounds and Tail Asymptotics 687
SMPs) described in Section 10.2.3 are based on Palmowski and Rolski [87, 88]
and Gautam et al. [39].
These results can be suitably extended to multiclass fluid queues, where
each class of fluid has a dedicated buffer. Perhaps the most well-studied pol-
icy is the priority rule that gained popularity to aid differentiated services
and is based on its discrete counterpart. For a comprehensive study on effec-
tive bandwidths with priorities, see Berger and Whitt [9, 10], and Gautam
and Kulkarni [70]. Policies such as generalized processor sharing are consid-
ered in de Veciana et al. [22, 24]. The results on timed round-robin policy and
its comparison with static priority are based on Gautam and Kulkarni [40].
The analysis on threshold-based policies is based on Mahabhashyam et al.
[76] and Aggarwal et al. [3].
Exercises
10.1 Consider a fluid source driven by a three-state CTMC environ-
ment process {Z(t), t ≥ 0} with generator matrix and rate matrix
given by
Q = ⎡ −β    β      0  ⎤               ⎡ 0   0   0  ⎤
    ⎢  γ   −γ−δ    δ  ⎥   and   R =   ⎢ 0   r1  0  ⎥ .
    ⎣  0    α     −α  ⎦               ⎣ 0   0   r2 ⎦
Q = ⎡ −β1−β2    β2        β1         0       ⎤
    ⎢  α2      −α2−β1     0          β1      ⎥
    ⎢  α1       0        −α1−β2      β2      ⎥
    ⎣  0        α1        α2        −α1−α2   ⎦

and

R = ⎡ 0   0    0    0       ⎤
    ⎢ 0   r2   0    0       ⎥
    ⎢ 0   0    r1   0       ⎥ .
    ⎣ 0   0    0    r1+r2   ⎦
Note that the four states correspond to: both sources on, source-
1 off and 2 on, source-2 off and source-1 on, and both sources
off. Compute the effective bandwidth of this source, call it eb(v).
(c) Show that the algebraic expression for eb(v) is identical to the
effective bandwidth of the net input to the buffer eb1 (v) + eb2 (v).
10.3 Consider a single buffer fluid model with input from an on-off
source with hyperexponential on-time CDF (for x ≥ 0)
When the source is on, fluid enters the buffer at rate r = 3 Mbps and
when the source is off, no fluid enters. The output channel capacity
c = 2 Mbps.
(a) Compute Ũ(s) and D̃(s), the LSTs of U(x) and D(x), respec-
tively.
(b) The tail probability of the limiting buffer contents P{X > x} for
very large x can be obtained using effective bandwidths as
eb(v) = lim_{n→∞} (1/(vn)) log E{exp(v An)}.
1 and standard deviation 0.5/√i. Graph the effective bandwidth of
this source eb(v) versus v for v ∈ [0, 2].
10.8 Recall the in-tree network in Figure 10.10 considered in Problem
108. It is desired that the probability of exceeding buffer level of
b = 14 kB must be less than 0.000001 in all seven buffers. Using effective bandwidth approximation, design the smallest output capacity
cj for j = 1, . . . , 7 to achieve such a quality of service. Use the same
numerical values of α = 5, β = 1, and r = 6.
10.9 Consider Problem 110 and obtain bounds for P(X > x) using SMP
bounds. Compare the results against those based on CDE approxi-
mation described in Problem 110.
10.10 Solve Problem 117 using CDE approximation. In particular, using
CDE approximation, obtain expressions for P(Xj > Bj ) for all
j = 1, . . . , N. Then graph the fraction of fluid lost assuming that the
size of the buffer is Bj . Compare against SMP bounds and effective
bandwidth approximation results presented in Problem 117.
10.11 Consider a static priority policy to empty fluids from three buffers
with buffer-1 given the highest priority. Into buffer i, fluid enters
from a general on-off source with on-time CDF pi (1 − e−3t ) + (1 −
pi )(1 − e−4t ) and off-time CDF 1 − e−t − te−t for i = 1, 2, 3. Also,
p1 = 0.5, p2 = 0.4, and p3 = 0.2. Traffic is generated at the rate of 2
per unit time when any source is on. The output capacity c = 1.8.
Using effective bandwidth approximations determine expressions
for the probability that each of the buffers would exceed a level x
in steady state.
10.12 For the setting in Problem 123, assume that there is a cost of
Cs to switch from one buffer to another. What is the optimal
value of a that would minimize the long-run average cost per
unit time subject to satisfying the constraints that P(Xj > Bj) < εj for j = 1, 2. Use UB to ensure the constraint is satisfied. Illustrate the optimal solution for the following numerical values: β1 = 2, α1 = 8, r1 = 2.645, β2 = 3, α2 = 9, r2 = 1.87, c = 1.06, and Cs = 100. Also, B1 = 2.5, B2 = 8, ε1 = 0.001, and ε2 = 0.01.
Appendix A: Random Variables
FX (x) = P{X ≤ x}
for all x ∈ (−∞, ∞). In this book, we drop the subscript X from FX(x) and simply write the CDF as F(x), especially when only one random variable is being considered. There are two basic types of random variables: discrete and continuous.
Discrete random variables are defined on a set of discrete points on the real line, whereas continuous random variables are defined on a set of open intervals of the real line and not on any discrete points. Of course, there is
a class of random variables called mixture or hybrid random variables that
are a combination of discrete and continuous random variables. The CDF of such a random variable has jumps (or discontinuities) at the discrete points. Let D be the set of discrete points for a mixture random variable X, and for all x ∈ D, let px = P{X = x} be the magnitude of the jump at x.
We proceed with a generic mixture random variable with the understanding that both continuous and discrete random variables are special cases corresponding to D = ∅ and Σ_{x∈D} px = 1, respectively. Thus, let X be a mixture random variable with CDF F(x) and discrete-point set D. The
expected value of X is defined as
E[X] = ∫_{−∞}^{∞} x dF(x) + Σ_{x∈D} x px
where the integral is Riemann type (however, it is indeed derived using the
Lebesgue integral). We present an example to illustrate.
Problem 125
Let X be a random variable that denotes Internet packet size for TCP trans-
missions (in bytes). On the basis of empirical evidence, say the CDF of X is
modeled as
F(x) = 0 if x < 40;  a√x + b if 40 ≤ x < 576;  0.0001x + 0.6424 if 576 ≤ x < 1500;  1 if x ≥ 1500,

where
a = 0.3/(24 − √40)
b = 0.25 − a√40
Notice that there are jumps or discontinuities in the CDF at 40, 576, and
1500 bytes. Hence, D = {40, 576, 1500} with px given by 0.25, 0.15, and 0.2076
for x = 40, x = 576, and x = 1500, respectively. Compute E[X].
Solution
Using the definition
E[X] = ∫_{−∞}^{∞} x dF(x) + Σ_{x∈D} x px

with D = {40, 576, 1500} and px given in the problem statement, we have

E[X] = ∫_{−∞}^{40} x dF(x) + ∫_{40}^{576} x dF(x) + ∫_{576}^{1500} x dF(x) + ∫_{1500}^{∞} x dF(x) + 40 p40 + 576 p576 + 1500 p1500

     = 0 + ∫_{40}^{576} 0.5 a √x dx + ∫_{576}^{1500} 0.0001 x dx + 0 + 40(0.25) + 576(0.15) + 1500(0.2076)

     = 580.4901 bytes.
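The arithmetic above is easy to verify with a short script; the antiderivatives of the two continuous pieces are written out in the comments.

```python
import math

# Numerical check of E[X] = 580.4901 bytes in Problem 125. The continuous
# part of the CDF has density 0.5*a/sqrt(x) on (40, 576) and 0.0001 on
# (576, 1500); the jumps at 40, 576, 1500 have sizes 0.25, 0.15, 0.2076.
a = 0.3 / (24 - math.sqrt(40))

# int_40^576 x * 0.5*a*x^(-1/2) dx = (a/3) * (576^1.5 - 40^1.5)
piece1 = (a / 3) * (576 ** 1.5 - 40 ** 1.5)
# int_576^1500 x * 0.0001 dx = 0.00005 * (1500^2 - 576^2)
piece2 = 0.00005 * (1500 ** 2 - 576 ** 2)
# Discrete part: sum over the jump points of x * p_x.
discrete = 40 * 0.25 + 576 * 0.15 + 1500 * 0.2076

EX = piece1 + piece2 + discrete
```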
Similarly, the rth moment of X is

E[X^r] = ∫_{−∞}^{∞} x^r dF(x) + Σ_{x∈D} x^r px
2. Bernoulli distribution
• Description: A Bernoulli trial can result in a success with
probability p and a failure with probability q with q = 1 − p.
Then the random variable X, which takes on 0 if the trial is
a failure and 1 if the trial is a success, is called the Bernoulli
random variable with parameter p.
• PMF:

p(x) = p^x q^(1−x),  x = 0, 1.

• Mean:

E[X] = p.

• Variance:

V[X] = pq.
3. Binomial distribution
• Description: A Bernoulli trial can result in a success with
probability p and a failure with probability q with q = 1 − p.
Then the random variable X, the number of successes in n
independent Bernoulli trials is called the binomial random
variable with parameters n and p.
• PMF:
p(x) = (n choose x) p^x q^(n−x),  x = 0, 1, 2, . . . , n.
• Mean:
E[X] = np.
• Variance:
V[X] = npq.
4. Geometric distribution
• Description: A Bernoulli trial can result in a success with
probability p and a failure with probability q with q = 1 − p.
Then the random variable X, denoting the number of
Bernoulli trials until a success is obtained is the geometric
random variable with parameter p.
• PMF:
p(x) = p q^(x−1),  x = 1, 2, . . . .
• Mean:

E[X] = 1/p.

• Variance:

V[X] = (1 − p)/p².
5. Negative binomial distribution
• Description: A Bernoulli trial can result in a success with probability p and a failure with probability q with q = 1 − p. Then the random variable X, denoting the number of Bernoulli trials until the kth success is obtained, is the negative binomial random variable with parameters k and p.
• PMF:

p(x) = (x−1 choose k−1) p^k q^(x−k),  x = k, k + 1, . . . .

• Mean:

E[X] = k/p.

• Variance:

V[X] = k(1 − p)/p².
6. Hypergeometric distribution
• Description: A random sample of size n is selected without
replacement from N items. Of the N items, k may be classi-
fied as successes and N − k are classified as failures. The
number of successes, X, in this random sample of size n is
a hypergeometric random variable with parameters N, n,
and k.
• PMF:

p(x) = (k choose x)(N−k choose n−x) / (N choose n),  x = 0, 1, 2, . . . , n.
• Mean:

E[X] = nk/N.

• Variance:

V[X] = n ((N − n)/(N − 1)) (k/N)(1 − k/N).
7. Poisson distribution
• Description: X is a Poisson random variable with parameter λ if its PMF is given by

p(x) = e^(−λ) λ^x / x!,  x = 0, 1, 2, . . . .
• Mean:
E[X] = λ.
• Variance:
V[X] = λ.
8. Zipf distribution
• Description: A random variable X with Zipf distribution taking on values 1, 2, . . ., n has a PMF of the form

p(x) = (1/x^s) / (Σ_{i=1}^{n} 1/i^s),  x = 1, 2, . . . , n.

• Mean:

E[X] = (Σ_{i=1}^{n} 1/i^(s−1)) / (Σ_{i=1}^{n} 1/i^s).

• Variance:

V[X] = (Σ_{i=1}^{n} 1/i^(s−2)) / (Σ_{i=1}^{n} 1/i^s) − {E[X]}².
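The Zipf expressions are easy to check numerically; the sketch below uses the arbitrary illustration values n = 10 and s = 2.

```python
# Verifying the Zipf PMF, mean, and variance formulas numerically, with
# the arbitrary illustration values n = 10 and s = 2.
n, s = 10, 2.0
norm = sum(1.0 / i ** s for i in range(1, n + 1))
pmf = [(1.0 / x ** s) / norm for x in range(1, n + 1)]

support = list(range(1, n + 1))
mean_direct = sum(x * p for x, p in zip(support, pmf))
var_direct = sum(x * x * p for x, p in zip(support, pmf)) - mean_direct ** 2

# Formulas in terms of power sums:
mean_formula = sum(1.0 / i ** (s - 1) for i in range(1, n + 1)) / norm
var_formula = (sum(1.0 / i ** (s - 2) for i in range(1, n + 1)) / norm
               - mean_formula ** 2)
```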
It is worthwhile for the reader to verify that the PMF properties are satisfied, as well as to verify the E[X] and V[X] expressions for the various discrete random variables. Note that, unlike the other distributions, the Poisson and Zipf distributions are not described via a random experiment. With these few examples, we move to continuous distributions.
P{x1 < X < x2} = ∫_{x1}^{x2} f(x) dx = F(x2) − F(x1).
E[X^r] = ∫_{−∞}^{∞} x^r f(x) dx.
• CDF:

F(x) = 1 − e^(−λx) for x > 0, and F(x) = 0 elsewhere.
• Mean:

E[X] = 1/λ.

• Variance:

V[X] = 1/λ².
f(x) = λ e^(−λx) (λx)^(k−1)/(k − 1)! for x > 0, and f(x) = 0 elsewhere.
• CDF:

F(x) = 1 − Σ_{r=0}^{k−1} e^(−λx) (λx)^r / r! for x > 0, and F(x) = 0 elsewhere.
• Mean:

E[X] = k/λ.

• Variance:

V[X] = k/λ².
f(x) = x^(α−1) e^(−x/β) / (β^α Γ(α)) for x > 0, and f(x) = 0 elsewhere,

where Γ(α) = ∫_0^∞ x^(α−1) e^(−x) dx. If α is an integer, then Γ(α) = (α − 1)!.
• CDF: There is no closed-form expression for the CDF in
the generic case (exception is when α is an integer). For
numerical values, use tables or software packages.
• Mean:

E[X] = αβ.

• Variance:

V[X] = αβ².
• CDF:

F(x) = 0 if x < a;  F(x) = (x − a)/(b − a) if a ≤ x ≤ b;  F(x) = 1 if x > b.
• Mean:

E[X] = (a + b)/2.

• Variance:

V[X] = (b − a)²/12.
• CDF:

F(x) = 1 − e^(−αx^β) for x > 0, and F(x) = 0 elsewhere.

• Mean:

E[X] = α^(−1/β) Γ(1 + 1/β).
f(x) = (1/(σ√(2π))) e^(−(1/2)((x−μ)/σ)²),  −∞ < x < ∞.

• Mean:

E[X] = μ.

• Variance:

V[X] = σ².
• Mean:

E[X] = v.

• Variance:

V[X] = 2v.
• Variance:

V[X] = e^(2μ+σ²) (e^(σ²) − 1).
• Variance:

V[X] = αβ / ((α + β)²(α + β + 1)).
• CDF:

F(x) = 1 − (K/x)^β for x ≥ K > 0, and F(x) = 0 elsewhere.
• Mean:

E[X] = Kβ/(β − 1) if β > 1, while E[X] = ∞ if β ≤ 1.

• Variance:

V[X] = K²β/((β − 1)²(β − 2)) if β > 2, while V[X] = ∞ if β ≤ 2.
It is worthwhile to verify for each of the distributions that the PDF properties are satisfied. Also, compute E[X] and V[X] using the definitions and verify the expressions given for the various distributions. Before wrapping up, we briefly mention another metric that is frequently used in queueing called the coefficient of variation (COV). For a random variable X with nonzero mean, the COV is the ratio of the standard deviation to the mean, that is, COV[X] = √V[X]/E[X].
For example, the notion of COV does not exist for a normal random
variable, say with mean 0 and variance 1.
• Exponential is not the only distribution with a COV of 1. For example, a Pareto random variable with β = 1 + √2 and any K > 0 has a COV of 1. It is incorrect to use M/G/1 results for a G/G/1 queue with
COV of arrivals equal to 1. The results will match only when the
interarrival times are exponential.
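As a quick check of the first claim, the sketch below evaluates the COV of a Pareto random variable from the mean and variance formulas given earlier (K = 1 is an arbitrary scale choice).

```python
import math

# COV of a Pareto random variable from the mean and variance formulas:
# E[X] = K*beta/(beta-1) and V[X] = K^2*beta/((beta-1)^2*(beta-2)),
# valid for beta > 2. K = 1 is an arbitrary scale choice.
def pareto_cov(beta, K=1.0):
    mean = K * beta / (beta - 1)
    var = K * K * beta / ((beta - 1) ** 2 * (beta - 2))
    return math.sqrt(var) / mean

cov_at_critical = pareto_cov(1 + math.sqrt(2))  # should equal 1
```

For β > 1 + √2 the COV drops below 1, consistent with the hazard rate discussion that follows.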
• The relationship between COV and the hazard rate function of a
positive-valued continuous random variable needs to be stated very
carefully. First, let us define the hazard (or failure) rate function h(x)
h(x) = f(x)/(1 − F(x))
for all x where f (x) > 0. Several references (e.g., Tijms [102] on page
438 and Wierman et al. [107] in Lemma 1) state that for a positive-
valued continuous random variable X with hazard rate function
h(x), if h(x) is increasing with x, then COV[X] ≤ 1, and if h(x) is decreasing with x, then COV[X] ≥ 1. The preceding result is extremely useful
and intuitive, but one has to be careful to interpret it and use it. For
example, if one considers a Pareto random variable, h(x) is decreasing for all x ≥ K, but if β > 1 + √2, then the COV is less than 1.
Although it appears to be contradicting the preceding result, if one
defines h(x) for all x ≥ 0, not just x ≥ K where f (x) > 0 for Pareto, then
h(x) is not decreasing for all x > 0 and the preceding result is valid.
Also, it is crucial to realize that the result goes only in one direction
(i.e., if h(x) is increasing or decreasing for all x ≥ 0, then COV would
be <1 or >1). Knowing the COV does not reveal the monotonicity of
the hazard rate function.
Ψ(z) = p0 + p1 z + p2 z² + · · · = Σ_{j=0}^{∞} pj z^j = E[z^X].
Ψ(1) = 1,  Ψ′(1) = E[X],  and  V[X] = Ψ′′(1) + Ψ′(1) − [Ψ′(1)]²,

where Ψ′(z) and Ψ′′(z) are the first and second derivatives of Ψ(z) with respect to z. Also, in many instances, when we obtain Ψ(z), there would be an unknown parameter that can be resolved by using the result Ψ(1) = 1.
A few other properties of a GF are described as follows:

P{X = k} = (1/k!) d^kΨ(z)/dz^k |_{z=0},

E[X(X − 1) · · · (X − r + 1)] = d^rΨ(z)/dz^r |_{z=1}  (the rth factorial moment),

lim_{k→∞} (1/k) Σ_{i=0}^{k} P{X = i} = lim_{z→1} (1 − z)Ψ(z).
See Chapter 2 for several examples of GFs used in contexts of queues. Next,
we move to continuous random variables, which can also be described using
GFs; however, for the purposes of this book, we mainly use transforms.
F̃X(s) = E[e^(−sX)] = ∫_0^∞ e^(−sx) dFX(x) = ∫_0^∞ e^(−sx) fX(x) dx
where fX (x) is the PDF of the random variable X. However, if X has a mixture
of discrete and continuous parts, then E[e−sX ] computation must be suitably
adjusted as described in Section A.1. For the remainder of this section, we
will assume X is continuous without any discrete parts.
Some examples of continuous random variables where LST of their
CDF can be computed are as follows: If X ∼ exp(λ), then F̃X (s) = λ/(λ + s);
if X ∼ Erlang(λ, k), then F̃X (s) = (λ/(λ + s))k ; if X ∼ Unif (0, 1), then F̃X (s) =
(1 − e^(−s))/s. Similar to the PDF and CDF, for the LST too we drop the X and simply write F̃(s).
A few properties of LSTs are described as follows:
E[X^r] = (−1)^r d^r F̃(s)/ds^r |_{s=0}.
3. The following properties of LSTs can be extended to any function
F(x) defined for x ≥ 0 (not just CDFs):
a. Let F, G, and H be functions with nonnegative domain and range.
Further, for scalars a and b, let H(x) = aF(x) + bG(x). Then
H̃(s) = aF̃(s) + bG̃(s).
b. Let F(x), G(x), and H(x) be functions of x with nonnegative
domain and range such that F(0) = G(0) = H(0) = 0. In addition,
assume that F(x), G(x), and H(x) either grow slower than esx or
are bounded. Letting

H(x) = ∫_0^x F(x − u) dG(u) = ∫_0^x G(x − u) dF(u),

we have H̃(s) = F̃(s)G̃(s).
c. Final value result:

lim_{t→∞} F(t)/t = lim_{s→0} s F̃(s).
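The convolution property can be illustrated numerically: the convolution of two exp(λ) CDFs is the Erlang(λ, 2) CDF, so its LST should equal (λ/(λ + s))². The sketch below checks this at one arbitrary point s by approximating the LST integral with a Riemann sum.

```python
import math

# The convolution of two exp(lam) CDFs is the Erlang(lam, 2) CDF, so its
# LST should equal the product of the individual LSTs, (lam/(lam+s))^2.
# We approximate int_0^inf e^{-s x} dH(x) with a Riemann sum over the
# Erlang(lam, 2) density h(x) = lam^2 * x * e^{-lam x}.
lam, s = 2.0, 1.5  # arbitrary illustration values

def erlang2_pdf(x):
    return lam * lam * x * math.exp(-lam * x)

dx = 1e-4
numeric_lst = sum(math.exp(-s * i * dx) * erlang2_pdf(i * dx) * dx
                  for i in range(1, 200000))
product_of_lsts = (lam / (lam + s)) ** 2
```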
LSTs are used throughout this book starting with Chapter 2. Examples of
their use can be found there. The main purpose is that in many instances the transform is much easier to work with than the distribution itself. Closely related to the LST is the Laplace transform (LT) of a function F(x) defined for x ≥ 0:

F∗(s) = ∫_0^∞ e^(−sx) F(x) dx,

provided F(0) = 0 and F(x) either grows slower than e^(sx) or is bounded.
A few properties of LTs are described as follows: if R(x) = ∫_0^x F(u) du, then

R∗(s) = F∗(s)/s,

and if T(x) = x^n F(x), then

T∗(s) = (−1)^n d^n F∗(s)/ds^n.
This result is essentially the law of total probability that is typically dealt with in an elementary probability book or course. We illustrate this expression using
some examples. In many texts one would find examples where X and Y are
either both discrete or both continuous and we encourage the readers to refer
to them. However, we present examples where one of X or Y is discrete and
the other is continuous.
Problem 126
A call center receives calls from three classes of customers. The service times
are class-dependent: for class-1 calls they are exponentially distributed with
mean 3 min; for class-2 calls they are according to an Erlang distribution
with mean 4 min and standard deviation 2 min; and for class-3 calls they
are according to a uniform distribution between 2 and 5 min. Compute the
distribution of the service time of an arbitrary caller if we know that the
probability the caller is of class i is i/6 for i = 1, 2, 3.
Solution
Let Y be a continuous time random variable denoting the service time in
minutes of the arbitrary caller and X be a discrete random variable denot-
ing the class of that caller. From the problem statement, we know that
P(X = 1) = 1/6, P(X = 2) = 1/3, and P(X = 3) = 1/2. We also can write down
(based on the problem description) that for any y ≥ 0,
FY(y) = (1/6)(1 − e^(−y/3)) + (1/3)[1 − (1 + y + y²/2 + y³/6) e^(−y)] + (1/2) min{(y − 2)/3, 1} I(y > 2)
for all y ≥ 0.
In general, if Y is discrete, then

pY(y) = P{Y = y} = Σ_x P{Y = y | X = x} pX(x)  if X is discrete, and
pY(y) = P{Y = y} = ∫_{−∞}^{∞} P{Y = y | X = x} fX(x) dx  if X is continuous.
Problem 127
The price of an airline ticket on a given day is modeled as a continuous ran-
dom variable X, which is according to a Pareto distribution with parameters
K and β (where β is an integer greater than 1 in this problem). The demand
for leisure tickets during a single day follows a Poisson distribution with
parameter C/X. What is the probability that the demand for leisure tickets
on a given day is r?
Solution
Let Y be a random variable that denotes the demand for leisure tickets on a
given day. We want P{Y = r}. To compute that, we use the fact that we know
P{Y = r | X = x} = e^(−C/x) (C/x)^r / r!.
Also, the PDF of X is fX(x) = βK^β / x^(β+1) for x ≥ K.
Then, unconditioning,

P{Y = r} = ∫_{−∞}^{∞} P{Y = r | X = x} fX(x) dx

         = ∫_K^∞ e^(−C/x) ((C/x)^r / r!) (βK^β / x^(β+1)) dx

         = ∫_0^{1/K} e^(−Ct) ((Ct)^r / r!) βK^β t^(β−1) dt   (substituting t = 1/x)

         = ((r + β − 1)!/r!) (βK^β/C^β) ∫_0^{1/K} e^(−Ct) C (Ct)^(r+β−1)/(r + β − 1)! dt

         = ((r + β − 1)!/r!) (βK^β/C^β) (1 − e^(−C/K) Σ_{j=0}^{r+β−1} (C/K)^j / j!).
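A numerical sanity check of this closed form, with the arbitrary illustration values K = 1, β = 3, C = 2, and r = 2, comparing against direct numerical integration:

```python
import math

# Check of the closed-form P{Y = r} against direct numerical integration
# of P{Y = r | X = x} * f_X(x) over [K, infinity). Illustration values
# K = 1, beta = 3, C = 2, r = 2 are arbitrary (beta must be an integer).
K, beta, C, r = 1.0, 3, 2.0, 2

def integrand(x):
    return (math.exp(-C / x) * (C / x) ** r / math.factorial(r)
            * beta * K ** beta / x ** (beta + 1))

# Midpoint rule on [K, K + 200]; the tail beyond is negligible here.
dx = 1e-3
numeric = sum(integrand(K + (i + 0.5) * dx) * dx for i in range(200000))

# Closed form derived above.
m = r + beta - 1
closed = (math.factorial(m) / math.factorial(r) * beta * K ** beta / C ** beta
          * (1 - math.exp(-C / K)
             * sum((C / K) ** j / math.factorial(j) for j in range(m + 1))))
```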
E[g(Y)] = E[E[g(Y)|X]].

Thus, we can easily obtain moments of Y, the LST of Y, etc. We illustrate that via a few examples.
Problem 128
The probability that a part produced by a machine is non-defective is p. By
conditioning on the outcome of the first part type (defective or not), compute
the expected number of parts produced till a non-defective one is obtained.
Solution
Let X be the outcome of the first part produced with X = 0 denoting a defec-
tive part and X = 1 denoting a non-defective part. Also, let Y be the number
of parts produced till a non-defective one is obtained. The question asks to
compute E[Y]. Although this problem can be solved by realizing that Y is a
geometrically distributed random variable with probability of success p, the
question specifically asks to condition on the outcome of the first part type.
Conditioning on X and then unconditioning,

E[Y] = E[Y | X = 1] P{X = 1} + E[Y | X = 0] P{X = 0} = (1)(p) + (1 + E[Y])(1 − p),

and by solving for E[Y] we get E[Y] = 1/p. This is consistent with the expected value of a geometric random variable with probability of success p.
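The first-step conditioning argument can be sketched as follows (p = 0.3 is an arbitrary illustration value):

```python
# First-step conditioning: E[Y] = p*1 + (1-p)*(1 + E[Y]), whose solution
# is E[Y] = 1/p. Checked against a truncated geometric series; p = 0.3
# is an arbitrary illustration value.
p = 0.3
EY = 1.0 / p

# E[Y] = sum_{k>=1} k * p * (1-p)^(k-1), truncated deep into the tail.
EY_sum = sum(k * p * (1 - p) ** (k - 1) for k in range(1, 2000))
```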
Problem 129
The average bus ride for Michelle from school to home takes b minutes; and it
takes her on average w minutes to walk from school to home. One day when
Michelle reached her bus stop to go home she found out that it would take a
random time for the next bus to arrive and that random time is according to
an exponential distribution with mean 1/λ minutes. Michelle decides to wait
for a maximum of t minutes at the bus stop and then walk home if the bus
does not arrive within t minutes. What is the expected time for Michelle to
reach home from the time she arrived at the bus stop? If Michelle would like
to minimize this, what should her optimal time t be?
Solution
Let X be the time the bus would arrive after Michelle gets to the bus stop
and Y be the time she would reach home from the time she arrived at the bus
stop. It is known that X is exponentially distributed with parameter λ. Also,
E[Y | X = x] = x + b if x ≤ t, and E[Y | X = x] = t + w if x > t,
since x ≤ t implies Michelle would ride the bus and vice versa. Thus, by
unconditioning, we have
E[Y] = ∫_0^∞ E[Y | X = x] λe^(−λx) dx

     = ∫_0^t (x + b) λe^(−λx) dx + ∫_t^∞ (t + w) λe^(−λx) dx

     = (1/λ)(1 − e^(−λt) − λt e^(−λt)) + b(1 − e^(−λt)) + (t + w) e^(−λt).
Problem 130
The orders received for grain by a farmer add up to X tons, where X is an
exponential random variable with mean 1/β tons. Every ton of grain sold
brings a profit of p, and every ton that is not sold is destroyed at a loss of
l. Let T be the tons of grains produced by the farmer, which is according to
an Erlang distribution with parameters α and k. Any portion of orders that
are not satisfied are lost without any penalty cost. What is the expected net
profit for the farmer?
Solution
Let R be the net profit for the farmer, which is a function of X. The expected
net profit conditioned on T is
E[R|T] = ∫_0^T [px − l(T − x)] βe^(−βx) dx + ∫_T^∞ pT βe^(−βx) dx

       = ((p + l)/β)(1 − e^(−βT) − βT e^(−βT)) − lT(1 − e^(−βT)) + pT e^(−βT)

       = ((p + l)/β)(1 − e^(−βT)) − lT.
Unconditioning,

E[R] = E[E[R|T]] = ((p + l)/β)(1 − E[e^(−βT)]) − l E[T].

Since T ∼ Erlang(α, k), we have E[e^(−βT)] = (α/(α + β))^k and E[T] = k/α.
Notice that from the last two problems, we obtain some intriguing results mainly due to properties of the exponential distribution. In that light, we next recapitulate the exponential distribution and its properties.
A.4.1 Characteristics
Before describing the properties of exponential random variables, we first
recapitulate their characteristics so that they are in one location for easy
reference. A nonnegative continuous random variable X is distributed exponentially with parameter λ if any of the following can be shown to hold: the CDF is F(x) = 1 − e^(−λx) for x ≥ 0; the PDF is f(x) = λe^(−λx) for x ≥ 0; or the LST of the CDF is F̃(s) = λ/(λ + s).
In other words, the three are equivalent. In fact in many instances, show-
ing the LST form is simpler than showing the CDF or PDF. Now, if X is
an exponentially distributed random variable with parameter λ, then we
symbolically state that as X ∼ exp(λ).
Another useful result to remember is that P{X > x} = e−λx for x ≥ 0. In
fact the hazard rate function (defined as fX (x)/(1 − FX (x)) for any nonnega-
tive random variable X) of the exponential random variable with parameter
λ is indeed λ. Further, in terms of moments, the expected value of X is E[X] = 1/λ and the variance of X is V[X] = 1/λ². Thus, the COV is 1 for
the exponential random variable. Next, we describe some useful properties.
A.4.2 Properties
The following is a list of useful properties of the exponential random vari-
able. They are presented without derivation. Interested readers are encour-
aged to refer to standard texts on probability and stochastic processes such
as Kulkarni [67].
The LST can be inverted and the CDF obtained in closed form in two cases: (1) when all the αi values are equal, which results in the Erlang distribution, and (2) when all the αi values are different, which results in the hypoexponential distribution.
X̄n = (X1 + X2 + · · · + Xn)/n
for any n. Then based on the results in the previous paragraph, we have
E[X̄n] = τ  and  V[X̄n] = σ²/n.
X̄n → τ as n → ∞
Sn = X1 + · · · + Xn
N(t) = max{n ≥ 0 : Sn ≤ t}
P{N(t) = k} = e^(−λt) (λt)^k / k!
P{N(t + s) − N(s) = k} = e^(−λt) (λt)^k / k!
for any t > 0 and s > 0. Notice that it is identical to P{N(t) = k}.
• Independent increments: If {N(t), t ≥ 0} is a PP(λ), 0 ≤ t1 ≤ t2 ≤ · · · ≤
tn are fixed real numbers, and 0 ≤ k1 ≤ k2 ≤ · · · ≤ kn are fixed
integers, then
Λ(t) = ∫_0^t λ(u) du.

P{N(t + s) − N(s) = k} = exp{−[Λ(t + s) − Λ(s)]} [Λ(t + s) − Λ(s)]^k / k!,

where, similar to the regular Poisson process, N(u) is the number of events that occur in time (0, u). Also, E[N(t)] = Λ(t).
The concept of batch or bulk events can be modeled using CPP, which
essentially is the same as a regular Poisson process with the exception
that with every event, the counting process need not increase by one. Let
{N(t), t ≥ 0} be a PP(λ). Let {Zn , n ≥ 1} be a sequence of IID random variables
that is also independent of {N(t), t ≥ 0}. Define
Z(t) = Σ_{n=1}^{N(t)} Zn.

Then {Z(t), t ≥ 0} is called a compound Poisson process, and

E[Z(t)] = λt E[Z1],
Var[Z(t)] = λt E[Z1²].
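A small simulation check of these two moment formulas, with the assumed values λ = 3, t = 10, and Zn ∼ Uniform(0, 1) (so E[Z1] = 1/2 and E[Z1²] = 1/3):

```python
import random

# Simulation check of E[Z(t)] = lam*t*E[Z1] and Var[Z(t)] = lam*t*E[Z1^2]
# for a compound Poisson process. Assumed illustration: lam = 3, t = 10,
# and Zn ~ Uniform(0, 1), so E[Z1] = 1/2 and E[Z1^2] = 1/3.
random.seed(7)
lam, t, runs = 3.0, 10.0, 20000

samples = []
for _ in range(runs):
    # Count Poisson events in (0, t] by accumulating exponential gaps.
    n, s = 0, random.expovariate(lam)
    while s <= t:
        n += 1
        s += random.expovariate(lam)
    samples.append(sum(random.random() for _ in range(n)))

mean_hat = sum(samples) / runs
var_hat = sum((z - mean_hat) ** 2 for z in samples) / runs
mean_th = lam * t * 0.5   # lam * t * E[Z1]
var_th = lam * t / 3.0    # lam * t * E[Z1^2]
```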
Let pk(t) = P{N(t) = k}. Then the LST of pk(t) satisfies

p̃k(s) = ∫_0^∞ e^(−st) dpk(t) = (G̃(s))^k (1 − G̃(s)),
where G̃(s) = E[e−sYn ], the LST of the CDF of Yn . For example, if Yn ∼ exp(λ);
G(y) = P{Yn ≤ y} = 1 − e−λy for y ≥ 0. Also, G̃(s) = λ/(λ + s). In this case,
{N(t), t ≥ 0} process is indeed a Poisson process with parameter λ. One can
get by inverting the LST the following:
pk(t) = P{N(t) = k} = e^(−λt) (λt)^k / k!.
As another example, let Yn ∼ Erlang(m, λ); then G(y) = P{Yn ≤ y} = 1 − Σ_{r=0}^{m−1} e^(−λy) (λy)^r / r! for y ≥ 0. Also, G̃(s) = λ^m/(λ + s)^m. Clearly,

p̃k(s) = (λ/(λ + s))^(mk) − (λ/(λ + s))^(m(k+1)).
In general, using the LST and inverting it to get the distribution of N(t) is
tricky. However, there are several results that can be derived without need-
ing to invert. We present them here. For the remainder of this section, we
assume that E[Yn] = τ and V[Yn] = σ², such that both τ and σ² are finite. The main results (many of them being asymptotic) are as follows:
• Renewal function: Letting M(t) = E[N(t)], the LST satisfies

M̃(s) = G̃(s)/(1 − G̃(s)),

and the elementary renewal theorem states that

lim_{t→∞} M(t)/t = 1/τ.
The next set of results have their roots in reliability theory, which would
explain the terminology. We define the following variables: A(t) = t − SN(t) ,
which is the time since the previous event, in reliability this would be the age;
B(t) = SN(t)+1 − t, which is the time the next event would occur, in reliability
this is the remaining life; and C(t) = A(t) + B(t), which is the time between
the previous and the next events, that is, in reliability this would be total life.
It is possible to derive:
lim_{t→∞} P{B(t) ≤ x} = (1/τ) ∫_0^x [1 − G(u)] du,

lim_{t→∞} E[B(t)] = (τ² + σ²)/(2τ),

lim_{t→∞} E[A(t)] = (τ² + σ²)/(2τ).
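These limits can be checked by simulation. The sketch below uses Erlang(2, 1) interevent times (so τ = 2, σ² = 2, and the limiting mean remaining life is (τ² + σ²)/(2τ) = 1.5); the observation time and replication count are arbitrary choices.

```python
import random

# Simulation check of lim E[B(t)] = (tau^2 + sigma^2) / (2 tau) for a
# renewal process with Erlang(2, 1) interevent times, for which tau = 2
# and sigma^2 = 2, so the limit is 1.5. The observation time t_obs and
# the number of replications are arbitrary choices.
random.seed(42)
t_obs, runs = 200.0, 5000

def erlang2():
    # Sum of two independent exp(1) random variables.
    return random.expovariate(1.0) + random.expovariate(1.0)

total = 0.0
for _ in range(runs):
    s = 0.0
    while s <= t_obs:
        s += erlang2()
    total += s - t_obs  # remaining life B(t_obs)

b_mean = total / runs
limit = (2.0 ** 2 + 2.0) / (2 * 2.0)  # = 1.5
```

Note that b_mean is noticeably smaller than the naive guess τ/2 would suggest for the age; the inspection paradox is exactly what the (τ² + σ²)/(2τ) formula captures.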
Reference Notes
The contents of this chapter are a result of teaching various courses on probability and stochastic processes both at the undergraduate level and at the
graduate level. There are several excellent books on the topics covered in
this chapter. For example, the elementary material on probability, random
variables, and expectations can be found in Ross [93]. However, the nota-
tions used in this chapter and a majority of results are directly from Kulkarni
[67]. That would be a wonderful resource to look into the proofs and deriva-
tions for some of the results in this chapter. The notable exceptions not found
in either of those texts are as follows: the discussion on mixture distribu-
tions, coefficient of variation (COV), some special distributions, as well as
numerical inversion of LSTs and LTs. A description of relevant references
is provided for those topics. Finally, for a rigorous treatment of probability, yet not abstract, an excellent source is Resnick [90], which also nicely
explains topics such as law of large numbers and central limit theorem.
Exercises
A.1 Let X be a continuous random variable with PDF
fX(x) = (1/π) x sin x if 0 < x < π, and fX(x) = 0 otherwise.
Prove that
Γ(x) = ∫_0^∞ t^(x−1) e^(−t) dt.
Using the PDF of the normal distribution, show that Γ(1/2) = √π.
A.3 The conditional variance of X, given Y, is defined by
A.4 Suppose given a, b > 0, and let X, Y be two random variables with
values in Z+ (set of nonnegative integers) and R+ (set of nonnega-
tive real numbers), respectively. The joint distribution of X and Y is
characterized by
P[X = n, Y ≤ y] = ∫_0^y b e^(−(a+b)t) (at)^n / n! dt.
where Γ(a) = ∫_0^∞ x^(a−1) e^(−x) dx. Also, E[X] = α/(α + β). Bus B will
arrive at the same station at a random time uniformly distributed
between the arrival time of bus A and 11:00 a.m. Find the expected
value of the arrival time of bus B.
A.7 Two parts (call them A and B) are manufactured in parallel on two
machines (call them machine 1 and machine 2). The processing time
for part A on machine 1 is distributed exponentially with parame-
ter α. Similarly, part B takes an exp(β) amount of time to process
on machine 2. If the processing starts at the same time on both
machines, what is the expected time to complete processing of both
parts (i.e., the expected time for both machines to become idle)?
Hint: Let XA and XB be random variables denoting the time to pro-
cess jobs A and B on machines 1 and 2, respectively. Then define
Z = max(XA , XB ). Compute E[Z].
g(t) = λ³ t² e^(−λt) / 2,  t ≥ 0.
for any i ∈ S, j ∈ S, and n ≥ 0. Clearly, pij is the probability of going from state
i to state j from one observation to the next. The matrix of pij values is called
the transition probability matrix
P = [pij ]
(note that for the transition probability matrix, each row sums to
one)
6. Draw a transition diagram, that is, draw a directed network by
drawing the node set (the state space S) and the arcs (i, j) if pij > 0,
with arc cost pij for all i ∈ S and j ∈ S
Problem 131
Packets arriving at a router are classified into two types: real-time (RT) and
non-real-time (NR) packets. An RT packet follows another RT packet with
probability 0.7 (therefore, the probability of an NR packet following an RT
packet is 0.3). Similarly, an NR packet follows another NR packet with prob-
ability 0.6 (therefore, the probability of an RT packet following an NR packet
is 0.4). Model the type of packets arriving at a router as a DTMC.
Solution
Let Xn denote the type of the nth packet (RT or NR) arriving at the router.
Clearly, Xn can take only one of two values, RT or NR. Thus the state space
is S = {RT, NR}. From the problem description to predict the type of the
next packet, we only need to know the type of the current packet but noth-
ing about the history. Also, the transition probabilities are time-invariant.
Therefore, Markov and time-homogeneity properties are satisfied.
Now we can write down the transition probability matrix as follows:
         RT    NR
P = RT [ 0.7   0.3 ]
    NR [ 0.4   0.6 ]
Thus the probability of the next packet being RT given the current one is RT is 0.7, which is the northwest corner of the P matrix. Notice the rows adding to one. We can also draw the transition diagram as described in Figure B.1.
Thus the system is modeled as a DTMC.
This is perhaps one of the simplest examples of a DTMC with two states.
Next we state a slightly bigger example.
Problem 132
Consider three cell-phone companies A, B, and C. Every time a sale
is announced, a thrifty graduate student switches from one company to
another. If the student is with company A before a sale, he switches to B
or C with probability 0.4 or 0.3, respectively. Likewise, if he is with B, he switches to A or C with probability 0.5 or 0.2, respectively; and if he is with C, he switches to A or B with probability 0.4 or 0.1, respectively. Model the cell-phone company the student is with as a DTMC.
FIGURE B.1
Transition diagram for the RT/NR problem.
FIGURE B.2
Transition diagram for the cell-phone switching problem.
Solution
The state space is S = {A, B, C} and the transition probability matrix is

        A     B     C
P = A [ 0.3   0.4   0.3 ]
    B [ 0.5   0.3   0.2 ]
    C [ 0.4   0.1   0.5 ]
Also, the transition diagram can easily be drawn as shown in Figure B.2.
Thus the system is modeled as a DTMC.
Next we present a case where the state space has infinitely many elements.
Problem 133
Consider a time division multiplexer from which packets are transmitted at
times 0, 1, 2, etc. Packets arriving between time n and n + 1 have to wait
until time n + 1 to be transmitted. However, at most one packet can be trans-
mitted at a time. Let Yn be the number of packets that arrive during time n
to n + 1. Assume that ai = P{Yn = i}. Model the number of packets awaiting
transmission as a DTMC.
Solution
Let Xn be the number of packets awaiting transmission just before time n
(i.e., just before an opportunity to transmit). Clearly, S = {0, 1, 2, . . .}. Based on the problem description, Xn+1 = max(Xn − 1, 0) + Yn. Hence, for i = 0, p0j = P{Yn = j} = aj for all j ≥ 0. Similarly, for i ≥ 1, pij = P{Yn = j − i + 1} = a_{j−i+1} for all j ≥ i − 1, and pij = 0 otherwise.
Notice in this example that we did not provide the transition diagram.
This is fairly typical since there is a one-to-one correspondence between the
transition probability matrix and the transition diagram. See the exercises for
more example problems as well as Chapter 4.
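Since Xn+1 = max(Xn − 1, 0) + Yn, the transition probabilities are p0j = aj and pij = a_{j−i+1} for i ≥ 1. The sketch below builds a truncated version of this (infinite) matrix; the arrival PMF aj is taken to be Poisson(0.8) purely for illustration.

```python
import math

# Truncated transition probability matrix for the time division
# multiplexer: X_{n+1} = max(X_n - 1, 0) + Y_n, so p_{0j} = a_j and
# p_{ij} = a_{j-i+1} for i >= 1. The arrival PMF a_j is assumed to be
# Poisson(0.8) purely for illustration; N truncates the infinite state
# space.
N, lam = 20, 0.8
a = [math.exp(-lam) * lam ** j / math.factorial(j) for j in range(N + 1)]

P = [[0.0] * (N + 1) for _ in range(N + 1)]
for j in range(N + 1):
    P[0][j] = a[j]              # empty buffer: next state is just Y_n
for i in range(1, N + 1):
    for j in range(i - 1, N + 1):
        P[i][j] = a[j - i + 1]  # one packet leaves, Y_n packets arrive
```

Because of the truncation, rows only sum to 1 up to the (here negligible) Poisson tail mass beyond state N.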
S = {A, B, C}
and
P = [ 0.3   0.4   0.3 ]
    [ 0.5   0.3   0.2 ]
    [ 0.4   0.1   0.5 ]
If at time 0 the student is with cell-phone company B, then the probability that the student is with company C after the third sale (i.e., n = 3) can be computed as follows. First,
P³ = [ 0.3860   0.2770   0.3370 ]
     [ 0.3930   0.2760   0.3310 ]
     [ 0.3870   0.2590   0.3540 ]
Thus we have P{X3 = C | X0 = B} = 0.3310, which is the element corresponding to row B and column C (second row and third column).
Continuing with this example, notice that as n → ∞,
Pⁿ → [ 0.3882   0.2706   0.3412 ]
     [ 0.3882   0.2706   0.3412 ]
     [ 0.3882   0.2706   0.3412 ]
It appears as though in the long run, the state of the DTMC is independent
of its initial state. In other words, irrespective of which phone company the
graduate student started with, he/she would eventually be with A, B, or C
with probability 0.3882, 0.2706, and 0.3412, respectively. Notice how the P∞
matrix has all identical rows. This is the focus of the following section.
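Both computations, P³ and the convergence of Pⁿ, are easy to reproduce with plain matrix multiplication:

```python
# Reproducing the P^3 computation and the convergence of P^n for the
# cell-phone DTMC with plain 3x3 matrix multiplication.
P = [[0.3, 0.4, 0.3],
     [0.5, 0.3, 0.2],
     [0.4, 0.1, 0.5]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

P3 = matmul(matmul(P, P), P)  # P3[1][2] = P{X_3 = C | X_0 = B}

Pn = P
for _ in range(49):
    Pn = matmul(Pn, P)        # P^50: all rows essentially identical
```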
π = πP  and  Σ_{j∈S} πj = 1
Thus we have (πRT, πNR) = (4/7, 3/7). Therefore, in the long run, four-sevenths of the packets will be real-time and three-sevenths non-real-time.
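The balance equation behind this result is 0.3 πRT = 0.4 πNR (probability flow out of RT equals flow in), which together with πRT + πNR = 1 gives the answer; as a sketch:

```python
# Steady-state probabilities for the two-state RT/NR chain: the balance
# equation 0.3*pi_RT = 0.4*pi_NR together with pi_RT + pi_NR = 1.
p_rt_to_nr, p_nr_to_rt = 0.3, 0.4

pi_rt = p_nr_to_rt / (p_rt_to_nr + p_nr_to_rt)  # = 4/7
pi_nr = p_rt_to_nr / (p_rt_to_nr + p_nr_to_rt)  # = 3/7
```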
Such an analysis is also very useful to describe the performance of sys-
tems in steady state. We use a terminology of cost; however, it is not
necessary that the cost has a financial connotation. With that understand-
ing, say when the system (DTMC) enters state i, it incurs a cost c(i) on an
average. Then the long-run average cost per unit time (or per observation or
per slot) is
Σ_{i∈S} c(i) πi,
which can be computed once the steady-state probabilities πi for all i ∈ S are
known.
As an example, consider Problem 133 describing a time-division mul-
tiplexer. Let πj be the limiting probability that there are j packets in the
multiplexer just before the nth attempted transmission (we assume that πj
for all j ∈ S can be calculated). Let us answer the question: what is the average
multiplexing delay for a packet that arrives immediately after an attempted
transmission in the long run? If there are i packets in the multiplexer when
this packet arrives, then this packet faces a latency (or delay) of (i + 1)τ units
of time. Note that τ is the time between successive multiplexing attempts.
Therefore, the average multiplexing delay is

Σ_{i=0}^{∞} (i + 1) τ πi = τ + τ Σ_{i=1}^{∞} i πi.
for any i ∈ S, j ∈ S, and s ≥ 0, where S is the state space. Thereby the stochastic
process {X(t), t ≥ 0} would be a CTMC.
A CTMC is typically characterized by its so-called infinitesimal generator
matrix Q with rows and columns corresponding the current state and the
next state in S. In other words, we keep track of epochs when the system state
changes, that is, X(t) changes with the understanding that between epochs
the state remains a constant. Thus in some sense if we considered the epochs
as observation times we indeed would have a DTMC. Next, to describe an
element qij of the Q matrix, it is the rate at which an event that would take
the system from state i to state j would occur. In other words, the triggering
event that drives the system from state i to state j happens after exp(qij ) time.
However, it is crucial to realize that the transition to j does not have to occur; another event may take place before it. With that description, next we describe how
to model a system as a CTMC and state a few examples to clarify the earlier
description.
1. Define X(t), the state of the system at time t (this must be selected
appropriately so that Markov and time-homogeneity properties are
satisfied)
2. Write down the state space S, which is a set of all possible values
X(t) can take
3. Verify if Markov and time-homogeneity properties are satisfied
(this is straightforward if the interevent times are exponentially
distributed with time-invariant parameters)
4. Construct the generator matrix Q = [qij ] as follows:
a. For i ≠ j, qij is the rate of transitioning from state i to state j (this
means that if no other event occurs then it would take an exponential
amount of time with mean 1/qij to go from state i to state j;
also, if there are multiple events that can take the CTMC from i to
j, then the rate qij is the sum of the rates of all the events)
b. For i = j, set qii = −Σ_{j≠i} qij so that every row of Q sums to zero
Problem 134
Consider a machine that toggles between two states, up and down. The
machine stays up for an exponential amount of time with mean 1/α hours
and then goes down. Then the machine stays down for an exponential
amount of time with mean 1/β hours before it gets back up. Model the
machine states using a CTMC.
Solution
Let X(t) be the state of the machine at time t. Therefore, if X(t) = 0, then
the machine is down at time t. Also, if X(t) = 1, the machine is up at time t.
Clearly, we have the state space as S = {0, 1}. With rows and columns
ordered as {0, 1}, the generator matrix is

      ⎡ −β    β ⎤
  Q = ⎣  α   −α ⎦ .
The rate diagram is provided in Figure B.3. Hence the system is modeled as
a CTMC.
Problem 135
Consider a telephone switch that can handle at most N calls simultaneously.
Assume that calls arrive according to PP(λ) to the switch. Any call arriving
FIGURE B.3
Rate diagram for up/down machine.

FIGURE B.4
Rate diagram for telephone switch (arrivals at rate λ; departures at rates μ, 2μ, . . . , Nμ).
when there are N other calls in progress receives a busy signal (and hence
rejected). Each accepted call lasts for an exponential amount of time with
mean 1/μ amount of time (this is the duration of a phone call, also called
hold times). Model the number of ongoing calls at any time as a CTMC.
Solution
Let X(t) be the number of ongoing calls in the switch at time t. Clearly, there
could be anywhere between 0 and N calls. Hence we have the state space
as S = {0, 1, . . . , N}. In many problem instances including this one, it is eas-
ier to draw the rate diagram and use it for analysis. In that light, the rate
diagram is illustrated in Figure B.4. To explain that, consider some i such
that 0 < i < N. When X(t) = i, one of two events can occur: either a new call
could arrive (this happens after exp(λ) time) or an existing call could com-
plete (this happens after exp(iμ) time). Of course if X(t) = 0, the only event
that could occur is a new call arrival. Likewise if X(t) = N, the only event of
significance is a call completing. Notice how the memoryless property and
the minimum-of-exponentials property of exponential random variables are
used in the description.
Then the generator matrix is

      ⎡ −λ        λ          0         0    . . .    0     0  ⎤
      ⎢  μ    −(λ + μ)       λ         0    . . .    0     0  ⎥
  Q = ⎢  0       2μ     −(λ + 2μ)      λ    . . .    0     0  ⎥ .
      ⎢  .        .          .         .    . . .    .     .  ⎥
      ⎣  0        0          0         0    . . .   Nμ   −Nμ  ⎦
Notice how easy the transition is between the rate diagram and the generator
matrix.
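The translation from the rate diagram to the generator matrix can also be done in code. Below is a minimal sketch in Python; the function name and the parameter values are illustrative, not from the text:

```python
import numpy as np

def switch_generator(N, lam, mu):
    """Generator matrix Q for the telephone switch CTMC of Problem 135.

    State i (number of ongoing calls) moves to i+1 at rate lam (a new
    call arrives) and to i-1 at rate i*mu (one of the i calls completes);
    the diagonal entry makes every row of Q sum to zero.
    """
    Q = np.zeros((N + 1, N + 1))
    for i in range(N + 1):
        if i < N:
            Q[i, i + 1] = lam       # new call accepted
        if i > 0:
            Q[i, i - 1] = i * mu    # one of i ongoing calls completes
        Q[i, i] = -Q[i].sum()       # row sums to zero
    return Q

Q = switch_generator(3, lam=2.0, mu=1.0)
print(Q)
```

Checking a row against the display above: for state 1 with λ = 2 and μ = 1, the code produces the row (1, −3, 2, 0), matching (μ, −(λ + μ), λ, 0).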
Problem 136
Consider a system where messages arrive according to PP(λ). As soon as a
message arrives, it attempts transmission. The message transmission times
are exponentially distributed with mean 1/μ units of time. If no other mes-
sage tries to transmit during the transmission time of this message, the
transmission is successful. If any other message tries to transmit during
this transmission, a collision results and all transmissions are terminated
instantly. All messages involved in a collision are called backlogged and
are forced to retransmit. All backlogged messages wait for an exponential
amount of time (with mean 1/θ) before starting retransmission. Model the
system called “unslotted Aloha” as a CTMC.
Solution
Let X(t) denote the number of backlogged messages at time t and Y(t) be
a binary variable that denotes whether or not a message is under transmis-
sion at time t. Then we model the stochastic process {(X(t), Y(t)), t ≥ 0} as a
CTMC. Notice that the state of the system is a two-tuple vector and the state
space is
S = {(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1), (3, 0), (3, 1), . . .}.
Say the state of the system at time t is (i, j) for some (i, j) ∈ S. If j = 1, then
one of three events can change the state of the system: a new arrival at rate λ
would take the system to (i+2, 0) due to a collision; a retransmission attempt
at rate iθ would take the system to (i + 1, 0) due to a collision; and a trans-
mission completion at rate μ would take the system to (i, 0). However, if
j = 0, then one of two events can change the state of the system, a new arrival
at rate λ would take the system to (i + 1, 1), and a retransmission at rate iθ
would take the system to (i − 1, 1). Based on that we can show that the rate
diagram would be as described in Figure B.5.
Notice in this example that we did not provide the Q matrix since it can
easily be inferred from the rate diagram. See the exercises for more example
problems as well as Chapters 2 and 3. Next, we move on to some analysis of
CTMCs. Much like DTMCs, here too, we first present transient analysis and
then move on to steady-state analysis.
FIGURE B.5
Rate diagram for unslotted Aloha.
pij (t) = P{X(s + t) = j|X(s) = i}

for any i ∈ S and j ∈ S. This is the same pij (t) described for the Markov and
time-homogeneity properties. The matrix P(t) = [pij (t)] satisfies the following
matrix differential equation:

dP(t)/dt = P(t)Q = QP(t)

with initial condition P(0) = I and boundary condition Σ_{j∈S} pij (t) = 1 for
every i ∈ S and any t ≥ 0. The solution to the differential equation can be
written as
P(t) = exp(Qt)

where, for a square matrix A,

exp(A) = I + A + A²/2! + A³/3! + · · ·
which in the scalar special case reduces to the usual exponential. It is
crucial to notice that this solution works only if the CTMC has a finite
number of states. There are efficient ways of computing it, especially when
the entries of Q are numerical (and not symbolic). However, other techniques
must be used when Q is symbolic or when S has infinitely many elements.
As an example, consider a four-state CTMC {X(t), t ≥ 0} with S =
{1, 2, 3, 4} and
      ⎡ −5    1    2    2 ⎤
  Q = ⎢  0   −2    1    1 ⎥ .
      ⎢  1    3   −5    1 ⎥
      ⎣  2    0    0   −2 ⎦

Then we can compute

                     ⎡ 0.1842  0.2895  0.1316  0.3947 ⎤
P(10) = exp(10Q) =   ⎢ 0.1842  0.2895  0.1316  0.3947 ⎥ .
                     ⎢ 0.1842  0.2895  0.1316  0.3947 ⎥
                     ⎣ 0.1842  0.2895  0.1316  0.3947 ⎦
Notice how the rows are identical, that is, each column has the same element
repeated. In other words, irrespective of the current state, eventually (here
already by t = 10) the system will be in state 1 with probability 0.1842. This
is similar to the steady-state behavior we saw with DTMCs. Next, we describe
steady-state analysis of CTMCs.
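The four-state example above is easy to reproduce numerically. A sketch using SciPy's matrix exponential (the exact values 0.1842, 0.2895, 0.1316, 0.3947 are 7/38, 11/38, 5/38, and 15/38):

```python
import numpy as np
from scipy.linalg import expm

# Generator of the four-state CTMC from the text
Q = np.array([[-5.,  1.,  2.,  2.],
              [ 0., -2.,  1.,  1.],
              [ 1.,  3., -5.,  1.],
              [ 2.,  0.,  0., -2.]])

P10 = expm(10 * Q)       # P(10) = exp(10 Q)
print(np.round(P10, 4))  # every row is (0.1842, 0.2895, 0.1316, 0.3947)
```

By t = 10 the rows have already converged to the stationary distribution, since the nonzero eigenvalues of Q have strongly negative real parts.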
The steady-state probabilities pj = lim_{t→∞} P{X(t) = j} (when they exist)
are obtained by solving

pQ = 0 and Σ_{j∈S} pj = 1,

where p is the vector (p0 , p1 , . . . , pj , . . .) and 0 is a row vector of zeros.
Also, when the system (CTMC) is in state i, it incurs a cost c(i) per unit
time on average. Again, the cost need not be a monetary ("dollar") cost; it
can represent other performance measures as well. Then the long-run average
cost incurred per unit time is

Σ_{i∈S} c(i)pi .
For the up/down machine of Problem 134, the steady-state equations are

            ⎡ −β    β ⎤
  (p0  p1 ) ⎣  α   −α ⎦ = (0  0)  and  p0 + p1 = 1.
Thus (p0 p1 ) = (α/(α + β) β/(α + β)). Further, when the machine is up,
it produces products at a rate of ρ per second and no product is produced
when the machine is down. Then the long-run average production rate is
0 × p0 + ρ × p1 = ρβ/(α + β).
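In practice these balance equations are solved numerically. A minimal sketch in Python; the parameter values α = 0.25, β = 1, and ρ = 12 are illustrative, not from the text:

```python
import numpy as np

def ctmc_steady_state(Q):
    """Solve pQ = 0 with sum(p) = 1 for a finite irreducible CTMC.

    One balance equation is redundant, so replace it with the
    normalization condition and solve the resulting linear system.
    """
    n = Q.shape[0]
    A = Q.T.copy()
    A[-1, :] = 1.0              # replace last equation by sum(p) = 1
    b = np.zeros(n)
    b[-1] = 1.0
    return np.linalg.solve(A, b)

# Up/down machine of Problem 134: alpha = 0.25 (mean up time 4 h),
# beta = 1.0 (mean down time 1 h); states ordered (0 = down, 1 = up)
alpha, beta, rho = 0.25, 1.0, 12.0
Q = np.array([[-beta, beta], [alpha, -alpha]])
p = ctmc_steady_state(Q)        # (alpha, beta)/(alpha + beta) = (0.2, 0.8)
production_rate = rho * p[1]    # long-run average production rate
```

The numerical answer matches the closed form: p = (α/(α + β), β/(α + β)) = (0.2, 0.8), and the production rate is ρβ/(α + β).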
Next, consider the telephone switch system in Problem 135. Let
(p0 , p1 , . . . , pN ) be the solution to pQ = 0 and Σ_{i=0}^N pi = 1. For
this system, the probability that an arriving call in steady state receives a
busy signal (or is rejected) is pN . Also, the long-run average rate of call
rejection is λpN and the long-run average switch utilization is Σ_{i=0}^N i pi .
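For a birth–death chain such as this one, pQ = 0 can also be solved in closed form: the balance equations λ p_{i−1} = iμ p_i give p_i ∝ (λ/μ)^i / i!, the truncated-Poisson (Erlang loss) form. This standard result is not derived in the text, so the sketch below cross-checks it against the classical Erlang-B recursion; the values N = 5, λ = 3, μ = 1 are illustrative:

```python
import math
import numpy as np

def erlang_loss_probs(N, lam, mu):
    """Steady-state distribution of the telephone switch CTMC.

    Balance for this birth-death chain, lam*p[i-1] = i*mu*p[i],
    yields p[i] proportional to a**i / i! with offered load a = lam/mu.
    """
    a = lam / mu
    w = np.array([a**i / math.factorial(i) for i in range(N + 1)])
    return w / w.sum()

p = erlang_loss_probs(N=5, lam=3.0, mu=1.0)
blocking = p[-1]                          # P(arrival is rejected) = p_N
utilization = (np.arange(6) * p).sum()    # long-run average busy lines
```

The blocking probability p_N here is exactly the Erlang-B formula evaluated at offered load a = λ/μ.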
Finally, for Problem 136, let (p00 , p01 , p10 , p11 , p20 , p21 , . . .) be the
solution to pQ = 0 and Σ_{(i,j)∈S} pij = 1. Then the average number of
backlogged messages in steady state is Σ_{i=0}^∞ i(pi0 + pi1 ) and the
long-run system throughput is Σ_{i=0}^∞ pi1 μ²/(μ + iθ + λ).
P{Yn+1 = j, Sn+1 − Sn ≤ x|Yn = i, Yn−1 , . . . , Y0 , Sn , . . . , S0 }
= P{Yn+1 = j, Sn+1 − Sn ≤ x|Yn = i}
= P{Y1 = j, S1 ≤ x|Y0 = i}

for any i ∈ J , j ∈ J , and x ≥ 0. The two equations here are similar to the
Markov and time-homogeneity properties, respectively. Now, for any i ∈ J ,
j ∈ J , and x ≥ 0, define

Gij (x) = P{Y1 = j, S1 ≤ x|Y0 = i},

and call G(x) = [Gij (x)] the kernel of the Markov renewal sequence (MRS)
{(Yn , Sn ), n ≥ 0}.
Problem 137
Consider a G/M/1 queue with independent and identically distributed (IID)
interarrival times continuously distributed with common CDF A(·) and exp(μ)
service times. Let Z(t) be the number of customers in the system at time t, Sn
be the time of the nth arrival into the system with S0 = 0, and Yn = Z(Sn −)
be the number of customers in the system just before the nth arrival. For any
i ≥ 0 and j ≥ 0 obtain Gij (x) for the MRS {(Yn , Sn ), n ≥ 0}.
Solution
Clearly, we have S0 = 0 and S0 ≤ S1 ≤ S2 ≤ S3 ≤ . . ., since arrivals
occur one by one and the (n + 1)st arrival occurs after the nth. Then define
J = {0, 1, 2, . . .}. Then for any i ∈ J and any j ∈ J , we have

Gij (x) = ∫_0^x e^{−μt} ((μt)^{i+1−j}/(i + 1 − j)!) dA(t)   if 0 < j ≤ i + 1,

Gi0 (x) = ∫_0^x (1 − Σ_{k=0}^{i} e^{−μt} (μt)^k /k!) dA(t),

and Gij (x) = 0 if j > i + 1.
This result is due to the fact that when i + 1 ≥ j > 0, Gij (x) is the probability
of having exactly i + 1 − j service completions in time S1 and S1 ≤ x (where
S1 is an interarrival time). Likewise if j = 0, then Gi0 (x) is the probability that
the i + 1st service completion occurs before S1 and S1 ≤ x. Finally, if there are
i customers just before time Sn , then just before time Sn+1 it is not possible to
have more than i + 1 customers in the system, so j must be less than or equal
to i + 1 and thus Gij (x) = 0 if j > i + 1.
There are some properties MRSs satisfy that are important to address. Say
we are given an MRS {(Yn , Sn ), n ≥ 0} with kernel G(x). Then the stochastic
process {Yn , n ≥ 0} is a DTMC with transition probability matrix P = G(∞).
In fact, in many analysis situations we may only have the LST of the kernel,
G̃(s); then one can easily obtain G(∞) as G̃(0) using one of the LST properties.
Then, it is also crucial to notice that if we know the initial distribution
a = [P{Y0 = i}], then the MRS is completely characterized by a and G(x). With
this, we move on to two stochastic processes that are driven by MRSs.
Yn = Z(Sn +).
Let
The kernel of the SMP (which is the same as that of the MRS) is
For the DTMC {Yn , n ≥ 0}, let the transition probability matrix be
P = G(∞).
and the expected time the SMP spends in state i continuously before
transitioning out be

τi = E(S1 |Y0 = i).

Also, let the stationary distribution of the embedded DTMC be

πi = lim_{n→∞} P{Yn = i}.

Then the steady-state probabilities of the SMP are

pi = lim_{t→∞} P{Z(t) = i} = (πi τi ) / (Σ_m πm τm ).
Problem 138
Consider a system with two components, A and B. The lifetime of the system
has a CDF F0 (x) and mean μ0 . When the system fails, with probability q it
is a component A failure and with probability (1 − q) it is a component B
failure. As soon as a component fails, it gets repaired and the repair time
for components A and B have CDF FA (x) and FB (x), respectively, as well
as means μA and μB , respectively. Model the system as an SMP and obtain
the steady-state probabilities that the system is up, that component A is
under repair, and that component B is under repair.
Solution
Let Z(t) be the state of the system with state space {0, A, B} such that Z(t) = 0
implies the system is up and running at time t, whereas if Z(t) is A or B,
then at time t the system is down with component A or B, respectively,
under repair. Let Sn denote the epoch when Z(t) changes for the nth time
Hence we get

pi = lim_{t→∞} P{Z(t) = i} = (πi μi ) / (π0 μ0 + πA μA + πB μB ).
Thus we have

[p0 pA pB ] = (1/(μ0 + qμA + (1 − q)μB )) [μ0  qμA  (1 − q)μB ].
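These formulas are easy to check numerically. A sketch with illustrative parameter values (q = 0.3, μ0 = 10, μA = 2, μB = 5, none of which are from the text):

```python
import numpy as np

# Illustrative values: mean lifetime mu0 = 10, mean repair times
# muA = 2 and muB = 5, and q = 0.3 (chance a failure is component A)
mu0, muA, muB, q = 10.0, 2.0, 5.0, 0.3

# Embedded DTMC on states (0, A, B): from 0 go to A w.p. q and to B
# w.p. 1 - q; from A or B the next state is always 0
P = np.array([[0.0,   q, 1 - q],
              [1.0, 0.0,   0.0],
              [1.0, 0.0,   0.0]])
pi = np.array([0.5, q / 2, (1 - q) / 2])   # solves pi = pi P

# SMP steady state: weight pi by the mean sojourn times (mu0, muA, muB)
tau = np.array([mu0, muA, muB])
p = pi * tau / (pi * tau).sum()
```

The factor 1/2 in π cancels in the ratio, which is why p is simply (μ0, qμA, (1 − q)μB) normalized.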
The kernel of the MRGP (which is the same as that of the MRS) is
For the DTMC {Yn , n ≥ 0}, let the transition probability matrix be
P = G(∞).
and

τi = E(S1 |Y0 = i),  πi = lim_{n→∞} P{Yn = i},

where π = πP and Σ_{i∈S} πi = 1.
Define αkj as the expected time spent in state j from time 0 to S1 , given
that Y0 = k, for all k ∈ S and j ∈ S. This could be tricky to compute in some
instances. The stationary distribution of the MRGP for any j ∈ S is given by
pj = lim_{t→∞} P{Z(t) = j} = (Σ_{k∈S} πk αkj ) / (Σ_{k∈S} πk τk ).
Of course all this assumes that the stationary distribution exists, which
only requires that the DTMC {Yn , n ≥ 0} is irreducible and positive recur-
rent (assuming that the epochs Sn occur continuously over time). With that
understanding, we move on to the final type of stochastic processes in
this chapter and the only one where the states are not countable (note that
other stochastic processes such as Ornstein–Uhlenbeck process and Gaussian
process are not considered although they have been used in chapters of
this book).
P{Xi = 1} = P{Xi = −1} = 1/2.
We let X(0) = 0, otherwise we will just look at X(t)−X(0). Note that E[Xi ] = 0
and Var[Xi ] = 1. From Equation B.1, we have
E[X(t)] = 0,

Var[X(t)] = (t/Δt)(Δx)².   (B.2)
Now let Δx and Δt go to zero. However, we must be careful to ensure that
the limit exists for Equation B.2. Therefore, we must have

(Δx)² = σ² Δt,
E[X(t)] = 0,
and
Var[X(t)] → σ2 t.
From the central limit theorem, X(t) is normally distributed with mean
0 and variance σ² t. With that we now formally define a Brownian motion.
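This scaling limit can be checked by simulation: with (Δx)² = σ²Δt, a scaled random walk observed at time t should look Normal(0, σ²t). A sketch (the step size, sample count, and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)
sigma, t = 2.0, 1.0
dt = 5e-4
dx = sigma * np.sqrt(dt)        # the required scaling (dx)^2 = sigma^2 dt
n_steps = int(t / dt)

# 5000 independent scaled random walks, each observed at time t
steps = dx * (2 * rng.integers(0, 2, size=(5_000, n_steps)) - 1)
X_t = steps.sum(axis=1)

# By the CLT, X_t is approximately Normal(0, sigma^2 t)
print(X_t.mean(), X_t.var())
```

The sample mean is near 0 and the sample variance near σ²t = 4, up to Monte Carlo error.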
• X(t) has independent increments; that is, for every pair of disjoint
time intervals (s, t) and (u, v), s < t ≤ u < v, the increments {X(t) −
X(s)} and {X(v) − X(u)} are independent random variables. There-
fore, the Brownian motion is a Markov process.
• Every increment {X(t) − X(s)} is normally distributed with mean 0
and variance σ2 (t − s).
where

α = (x − x0 )/(σ√(t − s))

and

Φ(y) = ∫_{−∞}^{y} (1/√(2π)) e^{−u²/2} du.
Now define a process {Y(t), t ≥ 0} such that

Y(t) = e^{X(t)} .
The process {Y(t), t ≥ 0} is called a geometric Brownian motion. Using the fact
that the moment generating function of a normal random variable X(t) with
mean 0 and variance t is
E[e^{sX(t)} ] = e^{ts²/2} ,

we have E[Y(t)] = e^{t/2} . Next, define

Z(t) = |X(t)|.
The process {Z(t), t ≥ 0} is called Brownian motion reflected at the origin. The
CDF of Z(t) can be obtained for z > 0 as

P{Z(t) ≤ z} = P{−z ≤ X(t) ≤ z} = 2P{X(t) ≤ z} − 1

            = (2/√(2πt)) ∫_{−∞}^{z} e^{−x²/(2t)} dx − 1.
Further, we have

E[Z(t)] = √(2t/π),

Var[Z(t)] = (1 − 2/π) t.
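A quick Monte Carlo check of these moments: since X(t) ~ Normal(0, t) for standard Brownian motion, Z(t) = |X(t)| can be sampled directly (the sample size, time point, and seed below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
t = 4.0
X = rng.normal(0.0, np.sqrt(t), size=200_000)  # X(t) ~ Normal(0, t)
Z = np.abs(X)                                  # reflected at the origin

print(Z.mean())  # approx sqrt(2t/pi) ~ 1.596
print(Z.var())   # approx (1 - 2/pi) t ~ 1.454
```

This is essentially Exercise B.8 done by simulation rather than by integrating the CDF.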
• X(0) = 0
• {X(t), t ≥ 0} has stationary and independent increments
• X(t) is normally distributed with mean μt and variance t
Note that the variance would be σ2 t if B(t) was not a “standard” Brownian
motion.
It can be shown that the CDF satisfies the following diffusion equation:
∂F(t, x; x0 )/∂t = −μ ∂F(t, x; x0 )/∂x + (σ²/2) ∂²F(t, x; x0 )/∂x².   (B.3)
Initial condition: X(0) = x0 implies
F(0, x; x0 ) = 0 if x < x0 , and F(0, x; x0 ) = 1 if x ≥ x0 .
Problem 139
Consider the Brownian motion {X(t), t ≥ 0} with drift coefficient μ and
variance parameter σ² (so that Var[X(t)] = σ² t). Assume there is a reflecting
barrier placed at the origin. Solve the PDE and obtain the steady-state
probabilities.
Solution
The solution to the PDE (Equation B.3) is

F(t, x; x0 ) = Φ((x − x0 − μt)/(σ√t)) − e^{2xμ/σ²} Φ((−x − x0 − μt)/(σ√t)).
Letting t → ∞ (with μ < 0), F(t, x; x0 ) converges to a limit F(x) satisfying

0 = −μ dF(x)/dx + (σ²/2) d²F(x)/dx².

The solution is

F(x) = 1 − e^{2xμ/σ²} for x ≥ 0,

which is an exponential CDF with parameter −2μ/σ².
f (x + dx) = f (x) + f ′(x)dx + (1/2) f ″(x)(dx)² + o((dx)²).   (B.4)

Hence

df (x) = f ′(x)dx + (1/2) f ″(x)(dx)² + o((dx)²).

For an ordinary differentiable function, since

lim_{dx→0} (f (x + dx) − f (x))/dx = f ′(x),

only the first-order term matters and

df (x) = f ′(x)dx.
When Brownian motion is involved, things turn out a little differently. Let
Bt be a standard Brownian motion (the same as B(t) but to avoid too many
parentheses in our formulae we use Bt ). Consider a function f (Xt ), where
Xt = μt + σBt .
df (Xt ) = f ′(Xt )dXt + (1/2) f ″(Xt )(dXt )² + o((dXt )²).
Using (dBt )² = dt, we get

(dXt )² = (μ dt + σ dBt )² = σ² dt,
where on the last line we have ignored terms of higher order than dt. Substi-
tuting in the Taylor’s series expansion, and omitting higher-order terms, we
have
df (Xt ) = f ′(Xt )dXt + (1/2) f ″(Xt )σ² dt.   (B.5)
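Equation B.5 can be sanity-checked by simulation: for f(x) = x², Itô's formula gives d(X_t²) = 2X_t dX_t + σ² dt, so integrating the right-hand side along simulated paths should reproduce X_t². A sketch using a simple Euler discretization (all parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma, t, dt = 1.0, 0.5, 2.0, 2e-3
n_paths, n = 5_000, int(t / dt)

dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n))
dX = mu * dt + sigma * dB            # increments of X_t = mu t + sigma B_t
X = dX.cumsum(axis=1)

# Integrate df = 2 X dX + sigma^2 dt (Ito's formula for f(x) = x^2)
X_prev = np.hstack([np.zeros((n_paths, 1)), X[:, :-1]])
f_T = (2 * X_prev * dX + sigma**2 * dt).sum(axis=1)

# Both averages should be near E[X_t^2] = (mu t)^2 + sigma^2 t
print(f_T.mean(), (X[:, -1] ** 2).mean())
```

The σ²dt correction term is exactly what an ordinary chain rule would miss; dropping it would bias the integral by σ²t.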
Now consider a process {St , t ≥ 0} satisfying

dSt /St = μ dt + σ dBt .   (B.6)
Now consider ln(St ). We would like to obtain d(ln(St )). Note that for f (x) =
ln(x), we have f ′(x) = 1/x and f ″(x) = −1/x². Therefore, we have (using
Equation B.6)
d(ln(St )) = dSt /St − (1/2)(dSt )²/St²
           = μ dt + σ dBt − (1/2)(μ dt + σ dBt )²
           = μ dt + σ dBt − (1/2)σ² dt.
Integrating from 0 to T and letting ν = μ − σ²/2, we get

ST = S0 e^{νT+σBT} .
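This solution can be simulated directly: since B_T ~ Normal(0, T), one can sample S_T in one shot and check, for instance, that E[S_T] = S_0 e^{μT} (the σ²/2 in ν exactly offsets E[e^{σB_T}] = e^{σ²T/2}). All parameter values below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
S0, mu, sigma, T = 100.0, 0.05, 0.2, 1.0
nu = mu - 0.5 * sigma**2                 # drift of ln(S_t)

B_T = rng.normal(0.0, np.sqrt(T), size=500_000)   # B_T ~ Normal(0, T)
S_T = S0 * np.exp(nu * T + sigma * B_T)

print(S_T.mean())           # approx S0 * exp(mu * T)
print(np.log(S_T).mean())   # approx ln(S0) + nu * T
```

Note that ln(S_T) has mean ln(S_0) + νT, not ln(S_0) + μT: the geometric Brownian motion grows slower in the log scale than its instantaneous drift suggests.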
Reference Notes
Like the previous chapter, this chapter is also mainly a result of teach-
ing various courses on stochastic processes especially at the graduate level.
The definitions, presentations, and notations for the first part of this chap-
ter (DTMC, CTMC, MRS, SMP, and MRGP) are heavily influenced by
Kulkarni [67]. Another excellent resource for those topics is Ross [92]. For
the Brownian motion part, the material presented is based on Ross [91] and
Medhi [80]. Several topics such as diffusion processes, Ornstein–Uhlenbeck
process, Gaussian process, and martingales have been left out. Some of these
such as martingales can be found in both Ross [91] and Resnick [90]. Also,
the topic of stochastic process limits (also not considered here) can be
found in Whitt [105].
Exercises
B.1 A discrete-time polling system consists of a single communication
channel serving N buffers in a cyclic order starting with buffer-1. At
time t = 0, the channel polls buffer-1. If it has any packets to trans-
mit, the channel transmits exactly one and then moves to buffer-2
at time t = 1. The same process repeats at each buffer until at time
t = N − 1 the channel polls buffer N. Then at time t = N, the chan-
nel polls buffer-1 and the cycle repeats. Now consider buffer-1. Let
Yt be the number of packets it receives during the interval (t, t + 1].
Assume that Yt = 1 with probability p and Yt = 0 with probability
1 − p. Let Xn be the number of packets available for transmission
at buffer-1 when it is polled for the nth time. Model {Xn , n ≥ 1} as a
DTMC.
B.2 Consider a DTMC with transition probability matrix
      ⎛ p0   p1   p2   p3   · · · ⎞
      ⎜  1    0    0    0   · · · ⎟
  P = ⎜  0    1    0    0   · · · ⎟
      ⎜  0    0    1    0   · · · ⎟
      ⎝  .    .    .    .    .    ⎠
where pj > 0 for all j and Σ_{j=0}^∞ pj = 1. Let M = Σ_{j=0}^∞ j pj and
assume M < ∞. Show that π0 = 1/(1 + M) and hence find the
stationary probability distribution π = (π0 π1 π2 . . .).
B.3 Conrad is a student who goes out to eat lunch every day. He eats
either at a Chinese food place (C), at an Italian food place (I), or
at a Burger place (B). The place Conrad chooses to go for lunch on
day n can be modeled as a DTMC with state space S = {C, I, B} and
transition matrix
      ⎡ 0.3  0.3  0.4 ⎤
  P = ⎢  1    0    0  ⎥ .
      ⎣ 0.5  0.5   0  ⎦
That means if Conrad went to the Chinese food place (C) yesterday,
he will choose to go to the Burger place (B) today with probabil-
ity 0.4, and also there is a 30% chance he will go to the Italian
⎡ ⎤
2 3
⎢ ⎥
Q=⎣ 1 −1 ⎦.
0 −2
two conditions: one repair person and two repair persons. Would
it better to employ one or two repair persons for this system? If a
machine is up, it generates a revenue of $r per unit time. However,
each repair person charges $c per unit time.
B.7 Consider a three-state SMP {Z(t), t ≥ 0} with state space {1, 2, 3}.
The elements of the kernel of this SMP are given as follows:
G12 (t) = 1 − e−t − te−t , G21 (t) = 0.4(1 − e−0.5t ) + 0.3(1 − e−0.2t ),
G23 (t) = 0.2(1 − e−0.5t ) + 0.1(1 − e−0.2t ), G32 (t) = 1 − 2e−t + e−2t , and
G11 (t) = G13 (t) = G22 (t) = G31 (t) = G33 (t) = 0. Obtain the probability
that the SMP is in state i in steady state for i = 1, 2, 3.
B.8 Let {X(t), t ≥ 0} be a standard Brownian motion. Define {Z(t), t ≥ 0}
such that Z(t) = |X(t)|, the Brownian motion reflected at the origin.
Using the CDF of Z(t), derive expressions for E[Z(t)] and Var[Z(t)].
B.9 For any constant k show that {Y(t), t ≥ 0} is a martingale if
Y(t) = exp{kB(t) − k2 t/2}, where B(t) is a standard Brownian motion.
For that, all you need to show is the following is satisfied:
E[Y(t)|Y(u), 0 ≤ u ≤ s] = Y(s).
1. S. Aalto, U. Ayesta, and R. Righter. On the Gittins index in the M/G/1 queue.
Queueing Systems, 63(1–4), 437–458, 2009.
2. J. Abate and W. Whitt. Numerical inversion of Laplace transforms of probability
distributions. ORSA Journal on Computing, 7, 36–43, 1995.
3. V. Aggarwal, N. Gautam, S.R.T. Kumara, and M. Greaves. Stochastic fluid-flow
models for determining optimal switching thresholds with an application to
agent task scheduling. Performance Evaluation, 59(1), 19–46, 2004.
4. S. Ahn and V. Ramaswami. Efficient algorithms for transient analysis of
stochastic fluid flow models. Journal of Applied Probability, 42(2), 531–549, 2005.
5. D. Anick, D. Mitra, and M.M. Sondhi. Stochastic theory of a data handling
system with multiple sources. Bell System Technical Journal, 61, 1871–1894, 1982.
6. L. Arnold. Stochastic Differential Equations: Theory and Applications, Krieger
Publishing Company, Melbourne, FL, 1992.
7. F. Baccelli and P. Bremaud. Elements of Queuing Theory: Palm Martingale Calculus
and Stochastic Recurrences, 2nd edn., Springer, Berlin, Germany, 2003.
8. F. Baskett, K.M. Chandy, R.R. Muntz, and F. Palacios. Open, closed and mixed
networks of queues with different classes of customers. Journal of the ACM, 22,
248–260, 1975.
9. A.W. Berger and W. Whitt. Effective bandwidths with priorities. IEEE/ACM
Transactions on Networking, 6(4), 447–460, August 1998.
10. A.W. Berger and W. Whitt. Extending the effective bandwidth concept to net-
work with priority classes. IEEE Communications Magazine, 36, 78–84, August
1998.
11. G.R. Bitran and S. Dasu. Analysis of the PHi /PH/1 queue. Operations Research,
42(1), 159–174, 1994.
12. G. Bolch, S. Greiner, H. de Meer, and K.S. Trivedi. Queueing Networks and Markov
Chains, 1st edn., John Wiley & Sons Inc., New York, 1998.
13. M. Bramson. Stability of queueing networks. Probability Surveys, 5, 169–345,
2008.
14. P.J. Burke. The output of a queuing system. Operations Research, 4(6), 699–704,
1956.
15. J.A. Buzacott and J.G. Shanthikumar. Stochastic Models of Manufacturing Systems,
Prentice-Hall, New York, 1992.
16. C.S. Chang and J.A. Thomas. Effective bandwidth in high-speed digital net-
works. IEEE Journal on Selected Areas in Communications, 13(6), 1091–1100,
1995.
17. C.S. Chang and T. Zajic. Effective bandwidths of departure processes from
queues with time varying capacities. In: Fourteenth Annual Joint Conference of the
IEEE Computer and Communication Societies, Boston, MA, pp. 1001–1009, 1995.
18. X. Chao, M. Miyazawa, and M. Pinedo. Queueing Networks: Customers, Signals,
and Product Form Solutions, John Wiley & Sons, New York, 1999.
78. A. Mandelbaum, W.A. Massey, M.I. Reiman, A. Stolyar, and B. Rider. Queue
lengths and waiting times for multiserver queues with abandonment and
retrials. Telecommunication Systems, 21(2–4), 149–171, 2002.
79. W.A. Massey and W. Whitt. Uniform acceleration expansions for Markov chains
with time-varying rates. The Annals of Applied Probability, 8(4), 1130–1155, 1998.
80. J. Medhi. Stochastic Models in Queueing Theory, Elsevier Science, Boston, MA,
2003.
81. D.A. Menasce and V.A.F. Almeida. Scaling for E-Business: Technologies, Mod-
els, Performance, and Capacity Planning, Prentice Hall, Upper Saddle River, NJ,
2000.
82. S.P. Meyn. Control Techniques for Complex Networks, Cambridge University Press,
New York, 2009.
83. M. Moses, S. Seshadri, and M. Yakirevich. HOM Software. http://www.stern.
nyu.edu/HOM
84. A. Narayanan and V.G. Kulkarni. First passage times in fluid models with
an application to two priority fluid systems. Proceedings of the IEEE Interna-
tional Computer Performance and Dependability Symposium, Urbana-Champaign,
IL, 1996.
85. M.F. Neuts. Matrix-Geometric Solutions in Stochastic Models—An Algorithmic
Approach, The Johns Hopkins University Press, Baltimore, MD, 1981.
86. T. Osogami and M. Harchol-Balter. Closed form solutions for mapping general
distributions to quasi-minimal PH distributions. Performance Evaluation, 63, 524–
552, 2006.
87. Z. Palmowski and T. Rolski. A note on martingale inequalities for fluid models.
Statistic and Probability Letter, 31(1), 13–21, 1996.
88. Z. Palmowski and T. Rolski. The superposition of alternating on-off flows and a
fluid model. Report no. 82, Mathematical Institute, Wroclaw University, 1996.
89. N.U. Prabhu. Foundations of Queueing Theory, Kluwer Academic Publishers,
Boston, MA, 1997.
90. S.I. Resnick. A Probability Path, Birkhauser, Boston, MA, 1998.
91. S.M. Ross. Stochastic Processes, John Wiley & Sons Inc., New York, 1996.
92. S.M. Ross. Introduction to Probability Models, 8th edn., Academic Press, New York
2003.
93. S.M. Ross. A First Course in Probability, 8th edn., Pearson Prentice Hall, Upper
Saddle River, NJ, 2010.
94. D. Sarkar and W.I. Zangwill. Expected waiting time for nonsymmetric cyclic
queueing systems–Exact results and applications. Management Science, 35(12),
1463–1474, 1989.
95. L.E. Schrage and L.W. Miller. The queue M/G/1 with the shortest remaining
processing time discipline. Operations Research, 14, 670–684, 1966.
96. R. Serfozo. Introduction to Stochastic Networks, Springer-Verlag, New York, 1999.
97. L.D. Servi. Fast algorithmic solutions to multi-dimensional birth-death pro-
cesses with applications to telecommunication systems. In: Performance Evalu-
ation and Planning Methods for the Next Generation Internet, A. Girard, B. Sanso,
and F. Vazquez-Abad (eds.), Springer, New York, pp. 269–295, 2005.
98. A. Shwartz and A. Weiss. Large Deviations for Performance Analysis, Chapman &
Hall, New York, 1995.
99. W.J. Stewart. Introduction to the Numerical Solution of Markov Chains, Princeton
University Press, Princeton, NJ, 1994.
100. S. Stidham Jr. Optimal Design of Queueing Systems, CRC Press, Boca Raton, FL,
2009.
101. H. Takagi. Queueing analysis of polling models. ACM Computing Surveys, 20(1),
5–28, 1988.
102. H.C. Tijms. A First Course in Stochastic Models, John Wiley & Sons Inc., Bognor
Regis, West Sussex, England, 2003.
103. W. Whitt. The queueing network analyzer. The Bell System Technical Journal,
62(9), 2779–2815, 1983.
104. W. Whitt. Departures from a queue with many busy servers. Operations Research,
9(4), 534–544, 1984.
105. W. Whitt. Stochastic-Process Limits, Springer, New York, 2002.
106. W. Whitt. Efficiency-driven heavy-traffic approximations for many-server
queues with abandonments. Management Science, 50(10), 1449–1461, 2004.
107. A. Wierman, N. Bansal, and M. Harchol-Balter. A note comparing response
times in the M/G/1/FB and M/G/1/PS queues. Operations Research Letters, 32,
73–76, 2003.
108. R.W. Wolff. Stochastic Modeling and the Theory of Queues, Prentice Hall, Engle-
wood Cliffs, NJ, 1989.
Manufacturing and Industrial Engineering
Analysis of Queues
Methods and Applications
“The breadth and scope of topics in this book surpass the books currently on the market. For most
graduate engineering or business courses on this topic the selection is perfect. … presented in
sufficient depth for any graduate class. I like in particular the “problems” presented at regular intervals,
along with detailed solutions. … excellent coverage of both classical and modern techniques in
queueing theory. Compelling applications and case studies are sprinkled throughout the text. For many
of us who teach graduate courses in queueing theory, this is the text we have been waiting for!”
—John J. Hasenbein, The University of Texas at Austin
“Dr. Gautam has an obvious passion for queueing theory. His delight in presenting queueing paradoxes
beams through the pages of the book. His relaxed conversational style makes reading the book
a pleasure. His introductory comments about having to account for a large variety of educational
backgrounds among students taking graduate courses indicate that he takes education very seriously. It
shows throughout the book. He has made an excellent choice of topics and presented them in his own
special style. I highly recommend this queueing text by an expert who clearly loves his field.”
—Dr. Myron Hlynka, University of Windsor, Ontario, Canada
Features
• Explains concepts through applications in a variety of domains such as production,
computer communication, and service systems
• Presents numerous solved examples and exercise problems that deepen students’
understanding of topics
• Includes discussion of fluid-flow queues, which is not part of any other textbook
• Contains prerequisite material to enhance methodological understanding
• Emphasizes methodology, rather than presenting a collection of formulae
• Provides 139 solved problems and 154 unsolved problems
• Promotes classroom discussions using case studies and paradoxes