Performance Modeling of Communication Networks with Markov Chains
Synthesis Lectures on
Communication Networks
Editor
Jean Walrand, University of California, Berkeley
Synthesis Lectures on Communication Networks is an ongoing series of 50- to 100-page
publications on topics in the design, implementation, and management of communication
networks. Each lecture is a self-contained presentation of one topic by a leading expert. The topics
range from algorithms to hardware implementations and cover a broad spectrum of issues from
security to multiple-access protocols. The series addresses technologies from sensor networks to
reconfigurable optical networks.
The series is designed to:
• Provide the best available presentations of important aspects of communication networks.
• Help engineers and advanced students keep up with recent developments in a rapidly
evolving technology.
• Facilitate the development of courses in this field.
Network Simulation
Richard M. Fujimoto, Kalyan S. Perumalla, George F. Riley
2006
Copyright © 2010 by Morgan & Claypool
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in
any form or by any means—electronic, mechanical, photocopy, recording, or any other except for brief quotations in
printed reviews, without the prior permission of the publisher.
DOI 10.2200/S00269ED1V01Y201004CNT005
Lecture #5
Series Editor: Jean Walrand, University of California, Berkeley
Series ISSN
Synthesis Lectures on Communication Networks
Print 1935-4185 Electronic 1935-4193
Performance Modeling of
Communication Networks
with Markov Chains
Jeonghoon Mo
Yonsei University, Seoul, Korea
Morgan & Claypool Publishers
ABSTRACT
This book is an introduction to Markov chain modeling with applications to communication networks. It begins with a general introduction to performance modeling in Chapter 1, where we introduce different performance models. We then introduce the basic ideas of Markov chain modeling: the Markov property, discrete time Markov chains (DTMC), and continuous time Markov chains (CTMC). We also discuss how to find the steady state distributions of these Markov chains and how they can be used to compute system performance metrics. The solution methodologies include the balance equation technique, the limiting probability technique, and uniformization. We try to minimize the theoretical aspects of the Markov chain so that the book is easily accessible to readers without deep mathematical backgrounds. We then show how to develop a Markov chain model with simple applications: a forwarding system, cellular system blocking, slotted ALOHA, a Wi-Fi model, and a multichannel-based LAN model. The examples cover CTMCs, DTMCs, birth-death processes, and non-birth-death processes. We then introduce more difficult examples in Chapter 4, which are related to wireless LAN networks: the Bianchi model and a multi-channel MAC model with fixed duration. These models are more advanced than those introduced in Chapter 3 because they require more advanced concepts such as the renewal-reward theorem and the queueing network model. We introduce these concepts in the appendix as needed so that readers can follow them without difficulty. We hope that this textbook will be helpful to students, researchers, and network practitioners who want to understand and use mathematical modeling techniques.
KEYWORDS
Markov Chain modeling, continuous time Markov chain, discrete time Markov chain,
performance modeling, communication networks, Markov property, queueing theory,
queueing network, balance equation, steady state distribution, uniformization, limiting
probability, product form solution, Jackson network, BCMP network, wireless LAN,
Wi-Fi, blocking probability, slotted ALOHA, Bianchi model, CSMA Markov chain,
Multi-channel MAC
Contents
Preface
1 Performance Modeling
A Exercises
Bibliography
Author's Biography
Index
Preface
A Markov chain is a very powerful and widely used tool in various fields including physics, economics, engineering, and so on. It is popular because of its simplicity, flexibility, and the ease of computing key metrics. The Markov property, which a Markov chain must satisfy, makes the chain very simple to describe. Though it is simple, it is flexible enough to model various systems with an arbitrary number of states and a transition matrix. Furthermore, the beauty of the Markov chain in performance modeling is that it provides a simple numerical method to compute performance metrics.
Even with the wide acceptance of Markov chain modeling, many students who lack background in mathematics and stochastic processes find Markov chain modeling very difficult. In particular, the deep mathematical background of the Markov process makes it hard for students to use Markov chain modeling. Even students with mathematical backgrounds may not know how to develop a model, since the model development process is more of an "art."
The purpose of this lecture is to provide concise, self-sufficient, and easy-to-read material for advanced undergraduate and graduate students to understand performance modeling methodologies based on Markov chains. The book is written as an introductory guide to Markov chain modeling for beginners. We answer how to define states and transition probabilities in Chapter 2 without detailing the theoretical aspects of a Markov chain. We provide many examples of modeling in Chapter 3 so that students can learn modeling by following the examples. The examples in Chapter 4 are more advanced because they require knowledge of queueing networks and the renewal-reward theorem. They are good examples for advanced students who are interested in performance research with Markov chains. We would like to show students from various backgrounds how to develop a Markov chain model using various examples from communication networks.
I am grateful to many reviewers for their valuable comments on earlier versions of the draft. In particular, I thank Jiwoong Lee, Daehan Kwak, Moonsoo Kang, Jin Guang, and two anonymous reviewers for providing constructive comments. I thank Mike Morgan of Morgan & Claypool for his encouragement and his help in getting this text reviewed and published. I thank Professor Jean Walrand for guiding me and encouraging me to start writing this lecture note. Without his guidance, this book would not have been possible. Most importantly, I would like to dedicate this book to my wife, Mijung Park, for her unwavering support, and to my two young kids, Sangwoo and Sunwoo.
Jeonghoon Mo
Sinchon, Seoul
April 2010
CHAPTER 1
Performance Modeling
1.1 SYSTEM, MODEL AND MODELING
1.1.1 WHAT ARE SYSTEM, MODEL AND MODELING?
A system, which is a target of modeling, is a collection of interacting entities. More formally, it is “a
set of components that are related by some forms of interaction, and which act together to achieve
some objective or purpose.” (25). For example, a computer system consists of hardware, operating
system (OS), application software, and a networking system. These entities interact with each other
to serve a common goal, which is to provide computing service to the computer user in this case.
Another example of a system can be a wireless LAN system that consists of Wi-Fi devices and an
access point. Wi-Fi devices are equipped with wireless LAN cards and send or receive packets to or
from the access point. They collectively form a network system.
A model is a simplified representation of a system. Models are used to improve the understanding of systems. There are different types of models: physical models, computer programs, and mathematical equations.
• DNA Model. An example of a physical model is the double-helix DNA model of Watson and Crick, which represents the structure of DNA in a cell. The two scientists developed the model to visualize the structure and improve the understanding of DNA.
• Mathematical Model. An example of a mathematical model is Newton's second law of motion,

F = ma,

which states that the force (F) is equal to mass (m) times acceleration (a). The equation is very simple, but it enables us to study the motion of objects.
Modeling is the process of developing a model from a target system. Figure 1.1 shows two
steps involved in modeling.
[Figure 1.1: The two steps involved in modeling: system definition and modeling.]
For example, the basic understanding from the DNA model made it possible to complete the Human Genome Project (1). Crash test dummies are used to improve the safety of vehicles. Mathematical models of Wi-Fi networks can be used to compare the performance of different MAC protocols or to optimize system parameters.
Possible reasons for using a model instead of the target system are as follows:
• A target system may not exist. This situation usually happens in engineering fields when a
company plans to develop a new product. For example, to understand the performance of its
next generation microprocessor, Intel uses a model such as a computer simulation.
• It can be dangerous to experiment with the target system. In the case of the car crash test, ex-
perimenting with human beings is too dangerous, and it can be fatal. So dummies are used
instead.
• It can be expensive to experiment with the target system. The flight simulator model is popularly
used for training purposes since the cost of flying an airplane is so high that training with
realistic simulation environments is preferred in airline companies.
• Models can be classified based on academic fields. Examples include economic models, biological models, chemical models, molecular models, queueing models, and so on.
• Another way of classifying models is based on how they are represented. Physical models use physical objects; mathematical models use mathematical languages; computer models use computer languages. A simulation model is a kind of computer model because most simulations are performed using computers and computer languages.
1.2 PERFORMANCE MODELS
• A model can be stochastic or deterministic. Stochastic models are used to represent uncertainty
in the real world. They are concerned with phenomena that vary as time advances, and where
the variation has a significant chance component. For instance, consider the availability of a fax
machine in the department office or the stock price of Samsung Electronics that varies over
time. Deterministic models do not have the chance component that the stochastic models
have. Their outcomes are precisely determined through combinations of events and states.
Therefore, deterministic models behave the same when the initial conditions are the same. An
example of a deterministic model is Newton’s law of motion.
• A model is static if it does not account for the time element; otherwise, it is dynamic. A Markov chain model, which we will explain in Chapter 2, is dynamic because it models the change of states over time. An example of a static model is Newton's model F = ma. A static model can be considered a snapshot. Dynamic models can be further classified into continuous time models and discrete time models based on how time advances. In a discrete time model, time advances only at discrete points, while in a continuous time model it advances continuously. A Markov chain can be either a discrete or a continuous time model.
Figure 1.3: An example of a simulation model: The computer code is shown on the right. The left part
of the figure shows the corresponding conceptual network topology.
Simulation languages save much time because they already include built-in tools for simulation event handling, time advancing, random number generation, data analysis, and more. A modeler may instead choose a general purpose language such as C/C++ because he/she is familiar with it. The advantage of a general purpose language is its flexibility.
Most simulators are based on the discrete event simulation (DES) technique, in which the operations of a system are represented as a chronological sequence of events. In DES, an event occurs whenever the state of the system changes. For performance evaluation purposes, we mostly use DES. The Monte Carlo simulation, named after the Monte Carlo casino in Monaco, is another popular method, used to study static systems. Unlike DES, it is used to model a system that does not change status with time. The method is useful when a deterministic algorithm cannot provide an exact solution.
It uses repeated random inputs to calculate the desired output. For example, calculation of definite
integrals is a common application of Monte Carlo simulation.
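As a minimal illustration (not from the original text), the sketch below estimates a definite integral with repeated random inputs, the Monte Carlo application mentioned above. The function name and sample count are our own choices; the target integral of x² over [0, 1], whose exact value is 1/3, is picked only so the result can be checked.

```python
import random

def mc_integrate(f, a, b, n=100_000, seed=0):
    """Estimate the definite integral of f over [a, b] by averaging
    f at uniformly random sample points and scaling by (b - a)."""
    rng = random.Random(seed)
    total = sum(f(rng.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n

# Example: the integral of x^2 over [0, 1] is exactly 1/3,
# so the estimate should be close to 0.3333.
estimate = mc_integrate(lambda x: x * x, 0.0, 1.0)
```

With 100,000 samples the statistical error is on the order of 0.001, illustrating why Monte Carlo is attractive when no closed-form answer is available.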
• Mathematical models use mathematical equations to describe a system. For example, a model can be a differential equation of the form

m (d²/dt²) x(t) = −grad(V(x(t))),   (1.1)

where x(t) is the position of a particle in a potential field in physics. It can also be an optimization model.
• Queueing models are used to analyze systems that provide services to customers whose arrival
times and service durations are random. They can be used to model diverse systems including
telecommunication, manufacturing, and more. Figure 1.4 shows a queueing model that mimics
Figure 1.4: A queueing model example (right) of uplink transmission of a cellular network system (left).
the behavior of a cellular data system. The system consists of a base station and two transmitters. The mobile transmitters send packets to the base station; these packets form the input traffic. Upon reception of packets, the base station stores them, then processes and forwards them in a first-come, first-served manner. The buffer in the base station corresponds to a queue, as shown on the right side, and the CPU in the base station corresponds to the server of the queue.
A queueing model can be a single queue model or a queueing network model, based on the number of queues in the model, as shown in Figure 1.5. Single queue models are further classified by arrival process, service distribution, number of servers, and so on. For example, M/M/1² is a queueing model in which the input traffic is a Poisson process, the service distribution is exponential, and the number of servers is 1. The M stands for memoryless, which we will explain in Section 2.6. Basic queueing models are combined to construct a queueing network model, which is a network of queues. Queueing network models can be used to analyze more complicated systems that consist of multiple queues. For more on models, refer to (14; 18; 15; 26).
The popularity of the queueing model comes from the fact that it provides useful steady state measures such as the average number in the queue, the average time in the queue, the server utilization, and more. These measures are related to user dissatisfaction, and a queueing model provides a way to analyze the system.
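To make the steady state measures above concrete, the following sketch evaluates the standard closed-form results for the M/M/1 queue: utilization ρ = λ/μ, mean number in the system L = ρ/(1 − ρ), and mean time in the system W = 1/(μ − λ). The arrival and service rates are made-up example values, and the function name is our own.

```python
def mm1_metrics(lam, mu):
    """Standard steady-state measures of an M/M/1 queue (requires lam < mu).
    Returns (utilization, mean number in system, mean time in system)."""
    if lam >= mu:
        raise ValueError("queue is unstable unless lam < mu")
    rho = lam / mu           # server utilization
    L = rho / (1.0 - rho)    # mean number of customers in the system
    W = 1.0 / (mu - lam)     # mean time a customer spends in the system
    return rho, L, W

# Illustrative rates: 8 packets/s arrive, the server handles 10 packets/s.
rho, L, W = mm1_metrics(lam=8.0, mu=10.0)
```

Here ρ = 0.8, L = 4 packets, and W = 0.5 s, showing how sharply delay grows as utilization approaches 1.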
• Stochastic models are popularly used in performance studies because they mimic the behavior of dynamic systems with uncertainties that evolve over time. Stochastic models use time-sequenced collections of random variables to represent the system. The random variables are employed to model uncertainties in real-world systems, and "collections of them" are needed to mimic the behaviors over time. Consider the packet arrivals in Figure 1.4. Packet generation times are uncertain because they depend on many unknown factors such as user behavior and more. Due to the uncertainties, we use a stochastic input traffic model with a few parameters. The stochastic input traffic model can be used to construct a queueing model of the system.

² A queueing model is usually specified by a formalism called Kendall notation, A/B/C, where A is the arrival process, B the service distribution, and C the number of servers.
The Markov chain and the renewal process are examples of stochastic models. A stochastic model can be Markovian or non-Markovian, as shown in Figure 1.6. The Markov model is subclassified into discrete time Markov chains and continuous time Markov chains. Popular stochastic models include the birth-death process and the Poisson process (refer to Section 2.6.3).
[Figure 1.6: Classification of stochastic models into Markov models (e.g., the Poisson process and the birth-death process) and non-Markov models.]
• Other analytical performance models include regression models, reliability models, and so on. A regression model shows the relationship between a dependent variable and one or more independent variables. It helps us understand how the independent variables impact the dependent variable. For example, y = ax + b is a linear regression model where x is an independent variable, y is a dependent variable, and a and b are constants. A reliability model is a probability model that helps us understand the failure rate or availability of systems. Prevention of failure is very important in systems such as nuclear plants, satellites, or airplanes, as one failure can cause tragic outcomes.
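As a small illustration of the linear regression model y = ax + b, the sketch below fits a and b by ordinary least squares. The data points are fabricated to lie exactly on y = 2x + 1, so the fit recovers a = 2 and b = 1; the function name is our own.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope a = covariance(x, y) / variance(x); intercept from the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

# Fabricated data lying exactly on y = 2x + 1.
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```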
1.3 PERFORMANCE STUDY STEPS

1. State Goals and Define the System. When we start a performance evaluation project, the first question to answer is "What do you want to know or understand from this study?" The system definition comes next and basically answers the question "What constitutes the system, or what are the components of the system?" by delineating the boundaries.

The goal selection is very important since not only can it change the system definition, but it also impacts the modeling step. Assume that you are a network administrator at Yonsei University. Given the campus network, the goal may be to decide whether or not to upgrade the access router. In this case, the system would consist of the access router, whose performance depends heavily on the input traffic. On the other hand, if the goal is to decide whether additional memory should be installed in the access router, the system may be limited to the memory unit in the access router and other components that interact with the memory.
2. Select Performance Metrics and Factors. A metric is a criterion used to compare system performance. The most popular performance metrics in network systems include throughput, delay, and blocking probability. Availability is another popular metric used in many system performance assessments. In the above example of Yonsei University, we can pick the delay and the throughput per user as the system metrics.

There are many parameters that can affect the system performance. To perform a thorough study, we need to list all of those parameters. Among them, some are more important than others, and some are controllable while others are not. The controllable parameters that we would like to vary to see their impact are called factors.
3. Select a Method of Study. A performance study can be done in many different ways, as shown in Figure 1.2. One can experiment with a real system, or play with a mathematical or a simulation model. Table 1.1 shows characteristics of the different evaluation methods. Regarding the accuracy of results, the real system is the best, the simulation model is second, and the mathematical model is last. This is because modeling typically introduces simplifying assumptions in the model development process. The mathematical model is more restrictive than the simulation model as it requires more simplifying assumptions in general. However, the analytical model is, in general, cheaper and takes less time than the other methods. The real system is more expensive than the mathematical model.
Therefore, the methodology selection is typically carried out in consideration of cost, required
accuracy of output, and available time. If you need an answer in a few days, a simple analytical
model can be considered at the sacrifice of accuracy. However, if accuracy of results is important,
real measurement can be selected.
Sometimes, the real system may not exist. In such cases, model development is a must. This
often happens in the new product development process. When a company develops a next
generation product, it must develop a model to predict the performance.
4. Develop a Model. The process of model development is an art since it involves subjective judgment. A modeler's intention is reflected when (s)he decides the scope, timescale, and complexity of a model.
The modeler needs to figure out the important aspects of the system in relation to the goal of a given study. More important parts are modeled in detail, while less important parts are simplified or left out. For example, if the goal is to understand the impact of a transport protocol, the details of the protocol should be kept, while the network topology can be simplified to a two-node network with a source and a destination. However, if the goal is to understand the performance of an Ethernet-based network, the transport protocol can be simplified, but the details of the Ethernet-based protocol should be kept in the model.
The timescale of a model should also be selected appropriately given the goal. A CPU chip designer would like to mimic the cycle-accurate behavior of a chip on the scale of nanoseconds, while a software designer can consider the CPU as a black box with a timescale of microseconds. Depending on the goal, the appropriate timescale can be different.
A model can be very accurate and close to a target system, or very coarse and simplified. A modeler decides the complexity of a model with consideration of the goals of the study. When the goal is to assess the performance of a computer, the important subsystems of the computer should be modeled. However, when the goal is to understand the performance of a computer network with 100 computers, each computer is considered a black box, with the details of the computers neglected.
Budget and time constraints also impact the decision on complexity. The less the available time and budget, the coarser the model. When accuracy is important, a detailed model can be a better choice. In general, as there are trade-offs between accuracy and cost, a good modeler balances the two aspects to achieve the goals of the study.
Developing a simulation model means writing computer code that mimics the behavior of a target system. An analytical model uses mathematical equations or stochastic processes to represent a performance model. An analytical model is in general more concise and simpler than computer simulation code. However, it is more restrictive, with stronger assumptions than a simulation model. On the other hand, simulation models are more flexible and can be more realistic. Another advantage of a mathematical model is that it provides valuable insight into system behavior while the model is being developed.
We discuss the details of Markov chain model development in Chapter 2.
5. Verify and Validate the Model. Once a model is developed, the modeler needs to verify and validate it. Verification and validation, though often used interchangeably, mean different things. Verification is the process of making sure the model does what it is intended to do, while validation is checking that the developed model is a good representation of the real-world target system. Suppose that you develop a conceptual model of your target system before writing a simulation code or a simulation model. The computer code that you have written may include logical errors or bugs. Checking that your computer code does not have these bugs is the process of verification. Validation is checking the model against the real target system. Even if your code does not have any logical error, it still may not be a good representation of the target system. Verification and validation are often blended in practice.
Oftentimes, we can observe cross-validation between simulation and analytical models in research papers: a simulation model is used to validate an analytical one, or vice versa.
6. Perform Experiments and Interpret Results. With a developed model or a real testbed,
you can perform experiments to see the impact of the selected factors. You can design your
experiments to assess the impact of factors. Once you collect the results, you try to interpret
them to meet the goals of the study.
1.5 SUMMARY
• A system, a collection of related entities, is the target of the modeling process, and it has a boundary.
CHAPTER 2

Markov Chain Modeling
A Markov chain, named after the Russian mathematician Andrey Markov (1856-1922), is one of the most popular mathematical tools for modeling a dynamic system that changes its state over time. It is used in various fields including engineering, economics, genetics, the social sciences, and more. Its popularity stems from various reasons, including its simplicity, flexibility, and ease of computation. In this chapter, we introduce background on the Markov chain needed to carry out performance modeling and evaluation. For more information on this subject, interested readers may refer to the textbook (22).
[Figure 2.2: A dynamical system X(t) can be modeled as a Markov model (CTMC or DTMC) or a non-Markov model.]
The number x(t) of customers, for example, is better represented by a continuous time model. Markov models are also classified into discrete time Markov chains (DTMC) and continuous time Markov chains (CTMC), depending on when the state changes.
Not all characteristics of dynamical systems can be represented by a Markov chain, only those that satisfy the so-called "Markov property." Basically, the Markov property says that the transition rule is simple enough to be represented by a matrix.¹ Characteristics without the Markov property are modeled with different methods such as simulation or non-Markov models.
With the help of the property, Markov models are specified by two components: the state space S of the system and the state transition rules. If the Markov model is a DTMC, the transition rules are represented by a transition probability matrix P; if the model is a CTMC, the rules are represented by a transition rate matrix Q.

Accordingly, the Markov chain is specified by (S, P) if the model is a DTMC and by (S, Q) if it is a CTMC.
• More formally, the Markov property can be defined as follows: a discrete-time stochastic process {X(n), n = 0, 1, 2, · · · } has the Markov property if

P(X(n + 1) = j | X(n) = i, X(n − 1) = i(n−1), . . . , X(0) = i(0)) = P(X(n + 1) = j | X(n) = i)

for all n and all states i, j, i(n−1), . . . , i(0) ∈ S.
• A Markov chain can be used to model a dynamical system whose state changes over time. More formally, if we let X(n) be the state of a system at time n, the sequence {X(n), n = 1, 2, 3, · · · } of states represents the dynamical system.
• The state space S is the set of all possible states. At any instance, the state of a Markov chain
belongs to the state space S . The set S can be finite or countably infinite, as we explain in
examples below.
In the weather example below, the state space is S = {Sunny, Rainy}.²
• In a DTMC, the transition rule between states is specified by transition probabilities. For the purpose of discussing a simple model, we make the simplifying assumption that the sequence of characteristics, Sunny or Rainy, of the weather in Seoul on successive days is a DTMC. The transition probability pij from state i ∈ S to state j ∈ S can be written as follows:

pij = P(X(n + 1) = j | X(n) = i),   (2.1)

where X(n) denotes the state of the Markov chain at time n. It is the conditional probability that the process is in state j at time n + 1 given that it is in state i at time n. The next state is probabilistically determined by the transition probabilities.
In the case of the weather example, there are four possible transitions: Sunny → Sunny, Sunny → Rainy, Rainy → Sunny, and Rainy → Rainy. Let pSS = a, pSR = b, pRS = c, and pRR = d be the transition probabilities for the four cases. The transition probabilities can be arranged in matrix form, called the "transition probability matrix," as follows:

    P = [ pSS  pSR ] = [ a  b ],   (2.2)
        [ pRS  pRR ]   [ c  d ]
Since the conditional probabilities must satisfy the axioms of probability,

1. a, b, c, d ≥ 0;
2. a + b = 1;
3. c + d = 1.

² We are making the simplifying assumptions that the weather does not change during a day and that there are only two types of weather, sunny and rainy.
In general, the transition probability matrix is the m × m matrix P = [pij], where pij := P(X(n + 1) = j | X(n) = i), ∀i, j ∈ S, and m is the size of S. For a matrix to be a transition probability matrix, its elements must satisfy

pij ≥ 0  and  Σj pij = 1,  ∀i ∈ S.
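These conditions can be checked numerically. The sketch below (not from the original text) verifies the stochastic-matrix conditions and, as a preview of the steady state computations treated later in the book, approximates the steady state distribution by power iteration. The weather-chain values a = 0.8 and d = 0.6 are assumptions chosen for illustration.

```python
def is_stochastic(P, tol=1e-9):
    """Check that P is a valid transition probability matrix:
    nonnegative entries and each row summing to 1."""
    return all(
        all(p >= 0 for p in row) and abs(sum(row) - 1.0) < tol
        for row in P
    )

def steady_state(P, iters=1000):
    """Approximate the steady-state distribution by repeatedly
    multiplying an initial distribution by P (power iteration)."""
    m = len(P)
    pi = [1.0 / m] * m
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(m)) for j in range(m)]
    return pi

# Weather chain with assumed values a = 0.8 (Sunny -> Sunny)
# and d = 0.6 (Rainy -> Rainy); rows sum to 1 by construction.
P = [[0.8, 0.2],
     [0.4, 0.6]]
pi = steady_state(P)  # pi[0]: long-run fraction of sunny days
```

For these values the chain settles to pi ≈ [2/3, 1/3], i.e., sunny about two days in three.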
• A state transition diagram can be used to visualize a Markov chain. Figure 2.3 shows the state transition diagram for the Markov chain of (2.3). Each node corresponds to a state in S, and a directed link from node i to node j corresponds to the transition probability pij. There exists a link from node i to node j iff pij > 0. In the figure, we can see that the chance of tomorrow being sunny, on the condition that today is sunny, is a, and that the chance of tomorrow being rainy is 1 − a. Similarly, the chance of tomorrow being rainy (sunny) given that today is rainy is d (1 − d).
• A Markov chain model is a probability model because its transition rule is probabilistic, described by the transition probability matrix. When a chain is in state i at time n, its next state is probabilistically determined by the i-th row [pij, ∀j ∈ S] of the transition probability matrix. In the weather example, if today's weather is 'Sunny', it will be 'Sunny' tomorrow with probability a or 'Rainy' with probability 1 − a.
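The row-sampling rule can be made concrete with a short simulation sketch (not from the original text). The values a = 0.8 and d = 0.6 are assumptions for illustration; state 0 stands for Sunny and state 1 for Rainy.

```python
import random

def next_state(P, i, rng):
    """Sample the next state of a DTMC given current state i,
    using row i of the transition probability matrix P."""
    u = rng.random()
    cum = 0.0
    for j, p in enumerate(P[i]):
        cum += p
        if u < cum:
            return j
    return len(P[i]) - 1  # guard against floating-point round-off

# Weather chain: state 0 = Sunny, 1 = Rainy, with assumed a = 0.8, d = 0.6.
P = [[0.8, 0.2],
     [0.4, 0.6]]
rng = random.Random(42)
path = [0]
for _ in range(10_000):
    path.append(next_state(P, path[-1], rng))

# Long-run fraction of sunny days; for this chain it approaches 2/3.
sunny_fraction = path.count(0) / len(path)
```

Each step looks only at the current state's row of P, which is exactly the Markov property in action.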
• Note that in a Markov chain model, the probability of the future state X(n + 1) depends on the current state X(n) but not on the history of state changes {X(m), m < n}, where n is the current time. If it is 'Sunny' today, the probability of being 'Rainy' tomorrow is (1 − a) in the example. Whether it was 'Rainy' or 'Sunny' yesterday, the probability is the same as long as it is 'Sunny' today. In other words, the current state provides enough information to predict the future state. Note also the dependence of the future state on the current state: the probability of being 'Rainy' tomorrow differs depending on whether today is 'Sunny' or 'Rainy'.

[Figure 2.3: State transition diagram of the weather chain. Node S has a self-loop with probability a and a link to node R with probability 1 − a; node R has a self-loop with probability d and a link to node S with probability 1 − d.]
Now suppose that, for this example, the Markov property does not really hold, i.e., the current state does not fully characterize the future state of the process. In such a case, the transition probability matrix is
not enough to specify the evolution of states. Tomorrow’s weather depends on today’s weather
but also on the weather of the previous days. Since the transition probability matrix does not
provide enough information, we cannot use the Markov chain model in this case. We should
understand that the ‘Markov property’ is key to a Markov chain model.
Figure 2.4: A Markov chain models a dynamic system (e.g., the weather in Seoul) by specifying (S, P), where S is the set of states and P represents the transition rules. A dynamical system changes its states over time.
When defining the states of a Markov chain model, two criteria should be satisfied:

• The information contained in the definition of each state must be sufficient to allow you to achieve the goals of the study.
• The states must contain enough information to allow you to construct the single-step transition probability matrix P or the transition rate matrix Q.
The first criterion specifies that we should be able to compute a required performance metric from the steady state solution of the Markov chain. Thus, the state definition is appropriate if it can serve the purposes of the study. Regarding the weather model, we cannot judge the appropriateness of the state definition because the goals are not specified.
The second criterion concerns the Markov property, which states that the current state pro-
vides sufficient information to determine the future. We should define states so that the state
transition rules satisfy the Markov property. In the weather model, recall that we assumed
that today’s weather is sufficient to predict tomorrow’s in a probabilistic manner. With this
assumption, we came up with the Markov chain model {X(n), n = 1, 2, 3, · · · }. However, if the
assumption does not hold, the model may not be a good one. Suppose that a careful study
reveals that the weather of tomorrow depends on that of today and also on that of yesterday.
If the modeler decides that it is important to reflect this new finding in the weather model, then
new states need to be defined. Note that if we define a new state Y (n) := (X(n + 1), X(n)), then
{Y (n), n = 1, 2, 3, · · · } satisfies the Markov property because knowing Y (n) provides sufficient
information to predict Y (n + 1). Hence, {(X(n + 1), X(n)), n = 1, 2, 3, · · · } is a Markov chain.
The state space is S = {(S, S), (S, R), (R, S), (R, R)}, and the transition probabilities need to be
changed appropriately.
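The state-augmentation idea can be sketched in code. The second-order conditional probabilities below are purely illustrative (they are not taken from the text); the sketch builds the transition matrix of the pair chain Y(n), in which a pair (x2, x1) can only move to a pair whose second component is x2.

```python
# Sketch of state augmentation: a second-order weather model
# P[X(n+1) | X(n), X(n-1)] becomes a first-order Markov chain on
# pairs Y(n) = (X(n+1), X(n)).  The conditional probabilities below
# are illustrative values, not taken from the text.
p_sunny = {              # P[Sunny tomorrow | (today, yesterday)]
    ('S', 'S'): 0.7,
    ('S', 'R'): 0.5,
    ('R', 'S'): 0.4,
    ('R', 'R'): 0.2,
}

states = [('S', 'S'), ('S', 'R'), ('R', 'S'), ('R', 'R')]

# Transition matrix over pairs: (x2, x1) can only move to (x3, x2).
P = {y: {z: 0.0 for z in states} for y in states}
for (x2, x1) in states:
    ps = p_sunny[(x2, x1)]
    P[(x2, x1)][('S', x2)] = ps
    P[(x2, x1)][('R', x2)] = 1 - ps

# Each row sums to one, as required of a transition probability matrix.
for y in states:
    assert abs(sum(P[y].values()) - 1.0) < 1e-12
```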
In general, we have multiple options on how to define states that satisfy the two criteria
discussed above. The simplest possible model is preferable because it makes computations of the
steady state solution easier.
As an example, suppose that we would like to understand the performance of a Wi-Fi system
with two Wi-Fi devices, say D1 and D2, and two available channels, say C1 and C2. Each device
works in a time-slotted manner. At the beginning of a slot, a device either decides to transmit a
packet on one of the channels or remains idle. Upon deciding to transmit, it checks the availability
of a randomly selected channel before transmission. If the channel is idle, then it starts to transmit
a packet. Otherwise, it waits for a random duration.
One possible definition of states that satisfies the two criteria is X1 = (x1 , x2 ) where xi is the
status of device Di (the channel it uses, or Idle) for i = 1, 2. This definition forms state space S1 given by:
S1 = {(Idle, Idle), (C1, Idle), (Idle, C1), (C2, Idle), (Idle, C2), (C1,C2), (C2,C1)}.
However, we can reduce the number of states by using the fact that the devices behave similarly.
For the purpose of analyzing the transmission rate of the system, it is not necessary to keep track of
the status of each device. Accordingly, we define the state to be the status of the two channels. Then
the state space is
S2 = {(Idle, Idle), (Busy, Idle), (Idle, Busy), (Busy, Busy)}.
We can further reduce the number of states by defining the state to be the number of busy channels.
Then the corresponding state space is
S3 = {0, 1, 2}.
With the three definitions of state above, we can compute the steady state throughput of the
system from the steady state probabilities. The simplest model is the last one for which the system
throughput is 1π1 + 2π2 where πi is the steady state probability of state i for i = 1, 2, respectively.
As another example, consider two models of voice traffic popularly used in a telephone per-
formance test as shown in Figure 2.5. The first model, shown on the left of the figure, has two states
whereas the second model shown on the right has six states. In the first model, a caller alternates
between talk and silence states. The second model considers two speakers, A and B. The six states
represent the state of the two speakers and the possible sequences of states. Since each speaker can be
in either a talk or a silence period, there are at least four states. The model further divides each of
the mutual silence and double-talk states into two. Hence, there are six states. The arrows in Figure 2.5
show the events that trigger state changes.
Figure 2.5: Two voice traffic models: two-state model (left) and six-state model (right).
steady state distribution if the process Xn can go from any state to any other state (not necessarily
in one step). Markov chains with this property are said to be irreducible.
From the obtained steady state distribution π, we can easily compute the metrics of our interest.
The umbrella manufacturer can utilize the fraction of Rainy days to forecast its demands. The bank
manager can estimate the fraction of idle bank tellers from the steady state probability that there
are no customers. Similarly, the blocking probability of a cell in a cellular network can be calculated
from the fraction of time that the number of on-going calls is equal to the capacity of the cell.
As these examples show, the steady state distribution of a Markov chain enables us to calculate useful
performance metrics of a system when the model is appropriate.
There are two standard methods for calculating the steady state distribution π of a Markov
chain model (S , P):
• Note first that, within each column, the entries of the matrices converge to a common value as
the power of P increases. The difference between the two values in the first column decreases from
0.35 (= 0.6 − 0.25) in P to 0 (= 0.3846 − 0.3846) in P(16). After convergence, any row of the
converged matrix is a row vector equal to the steady state distribution. In this example, the steady
state distribution is (πs , πr ) = (0.3846, 0.6154). The steady-state distribution also provides the
long-run probability of sunny weather (about 38%) and of rainy weather (about 62%).
• The theoretical result is as follows. Assume that the limit limn→∞ P(n) exists. Then all of its row
vectors are identical, and taking any row, say the first, gives a steady state distribution
π of P. In practice, multiply P until the maximum difference within every column is less than a given
threshold value, say 10^−5. This limit exists for irreducible Markov chains in which the process
Xn can return to a state in numbers of steps that are not all multiples of an
integer larger than one. Such a Markov chain is said to be aperiodic.
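This procedure can be sketched in a few lines of Python, using the weather transition matrix of this section; the loop repeatedly multiplies by P until the entries within each column agree to within the threshold.

```python
# Power method for the steady state of the weather chain, with the
# transition matrix used in the text: p_SS = 0.6, p_SR = 0.4,
# p_RS = 0.25, p_RR = 0.75 (states ordered S, R).
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

P = [[0.60, 0.40],
     [0.25, 0.75]]

Pn = P
while True:
    # Maximum difference within any column of the current power.
    spread = max(abs(Pn[0][j] - Pn[1][j]) for j in range(2))
    if spread < 1e-5:
        break
    Pn = matmul(Pn, P)

pi = Pn[0]          # any row of the converged matrix
print(pi)           # approximately (0.3846, 0.6154)
```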
• The n-th power P(n) of P is called the “n-step transition probability matrix” because its
element pij(n) corresponds to the probability that X(m+n) = j given that X(m) = i, i.e.,
P[X(m+n) = j | X(m) = i]. To see this, let us use the weather example again. We would like to
compute the probability that it will rain the day after tomorrow given that it rains today, i.e.,
P[X3 = R | X1 = R]. Since it can be ‘Sunny’ or ‘Rainy’ tomorrow, the two possible paths are
R → R → R and R → S → R, with P(R → R → R) = (3/4) · (3/4) = 9/16 and
P(R → S → R) = (1/4) · 0.4 = 0.1. Since the two events are exclusive,
P[X3 = R | X1 = R] = 9/16 + 0.1 = 0.6625. Note that this value is the same as the element
pRR(2) of P(2).
• The above result follows from the well-known Chapman-Kolmogorov equation

pij(m+n) = ∑k∈S pik(m) pkj(n), ∀n, m ≥ 0, ∀i, j, (2.7)

which provides a way to compute n-step transition probabilities. It says that the probability of
moving from state i to state j in m + n steps is the sum of the products pik(m) pkj(n) over
all possible intermediate states k. This result follows from the Markov property by conditioning on the
state k reached after m steps. In matrix form, it can be written as

P(m+n) = P(m) · P(n). (2.8)
That is, the n-step transition probability can be computed from simple matrix multiplications.
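The two-step weather computation above can be reproduced by one matrix multiplication; the (R, R) entry of P² is exactly the 0.6625 obtained from the path decomposition.

```python
# Chapman-Kolmogorov in action: the two-step probability
# P[X3 = R | X1 = R] is the (R, R) entry of P^(2) = P . P.
# States are ordered (S, R) as in the weather example.
P = [[0.60, 0.40],
     [0.25, 0.75]]

P2 = [[sum(P[i][k] * P[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]

# Path decomposition: R->R->R contributes 0.75*0.75 and R->S->R
# contributes 0.25*0.40, summing to 0.6625.
print(P2[1][1])
```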
• The limit of the n-step transition probability pij(n) as n → ∞ is πj. This shows that, as n increases, the
impact of the initial state i diminishes: the limit depends only on the final state j, not on the
initial state i (provided the Markov chain is irreducible and aperiodic).
πi = ∑j πj pji, or equivalently, πi (1 − pii) = ∑j≠i πj pji. (2.11)

The left-hand side πi (1 − pii) can be interpreted as the long-term transition rate out of state i, while
the right-hand side ∑j≠i πj pji is the long-term transition rate into state i. The term “balance” in
the balance equation comes from this observation that the rates into and out of state i are equal,
or balanced.
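The balance equations, together with the normalization ∑j πj = 1, form a linear system that can be solved directly; below is a minimal sketch for the weather chain of this chapter, replacing one redundant balance equation with the normalization condition.

```python
import numpy as np

# Solving the balance equations pi = pi P together with sum(pi) = 1
# as a linear system, for the weather chain (states ordered S, R).
P = np.array([[0.60, 0.40],
              [0.25, 0.75]])
n = P.shape[0]

# pi (I - P) = 0 gives n equations, one of which is redundant;
# replace the last one with the normalization sum(pi) = 1.
A = (np.eye(n) - P).T
A[-1, :] = 1.0
b = np.zeros(n)
b[-1] = 1.0

pi = np.linalg.solve(A, b)
print(pi)  # approximately [0.3846, 0.6154]
```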
• In Section 3.2, if X represents the number of packets in the buffer and we are interested in
the average number of packets in the system, then F(X) = X and

E[F(X)] = E[X] = ∑j≥1 j · πj.
• In Section 3.3, when a cellular system allows K simultaneous calls in a single cell, the blocking
probability of the system is the probability that an arriving call sees the system in state K. If
we assume a Poisson arrival process, then πK is the blocking probability.
Oftentimes, the reward at state X = j is an independent random variable Rj with mean value
E[Rj]. In this case, we can calculate the expected reward as

E[R] = ∑j πj E[Rj].
2.6.1 DEFINITION
A continuous time stochastic process {Xt , t ≥ 0} is a continuous time Markov chain (CTMC) if it
has the following Markov property:
P [Xt+s = j |Xs = i, Xu = xu , 0 ≤ u < s] = P [Xt+s = j |Xs = i] ∀i, j ∈ S , ∀s, ∀t, (2.15)
where S is the state space. This property, as in discrete-time, says that, given the current state Xs ,
the future state Xt+s and past states {Xu , u < s} are independent. That is, the current state Xs is
sufficient to determine the evolution of the process.
Suppose that a CTMC Xt enters state i at time 0 and that it does not leave the state for a
duration of s. What is the probability that this process stays in state i for the next t units of time? If
we let Yi denote the amount of time that the chain stays in state i before jumping to another state,
then
P [Yi ≥ s + t|Yi ≥ s],
is the answer to the question. Due to the Markov property, given the current state Xs = i, the history
of the process over [0, s] is independent of the future evolution. Hence, we have

P[Yi ≥ s + t | Yi ≥ s] = P[Yi ≥ t] = e^(−qi t),

where qi is a parameter that represents the average rate of event occurrences.3 That is, Yi is an
exponential random variable with rate qi; you can check that its expected duration is E[Yi] = 1/qi.
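The memoryless property can be checked empirically; the sketch below samples exponential holding times and compares P[Yi ≥ s + t | Yi ≥ s] with P[Yi ≥ t]. The rate q = 2.0 and the times s, t are illustrative values.

```python
import math
import random

random.seed(1)

# Empirical check of the memoryless property of the exponential
# holding time: P[Y >= s + t | Y >= s] should equal P[Y >= t].
q = 2.0              # holding rate q_i (illustrative value)
s, t = 0.5, 0.7

samples = [random.expovariate(q) for _ in range(200000)]

survived = [y for y in samples if y >= s]
lhs = sum(y >= s + t for y in survived) / len(survived)
rhs = sum(y >= t for y in samples) / len(samples)

print(lhs, rhs, math.exp(-q * t))  # all three are close together
```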
We learned that a CTMC stays in state i ∈ S for an exponentially distributed random duration with
rate qi before moving to another state. When it changes its state, where does it jump to? In a
DTMC, the transition probabilities pij determine the next state. Similarly, we can construct a CTMC by
specifying transition probabilities pij with ∑j≠i pij = 1: after staying in state i for an exponential
duration with rate qi, the process moves to state j with probability pij. Hence, one way to construct a CTMC
is by specifying (S , qi , pij , ∀i, j ).
Alternatively, we can specify a CTMC model by a state space S and transition rates qij , i, j ∈
S where
qij = qi · pij ∀i, j ∈ S .
3The geometric distribution is the discrete counterpart of the continuous exponential distribution and it also has the memoryless
property, expressed for discrete values of time.
Figure 2.7: Discrete Time Markov Chain vs. Continuous Time Markov Chain: In the CTMC, a
change of state can happen at any time, while in the DTMC it can happen only at the slot boundaries.
In the DTMC, the transition probability pij from state i to j is used, while in the CTMC the transition
rate qij from state i to j is used.
Then qij is the rate at which the process moves to state j when it is in state i. We call qij the transition
rate from i to j. As the rate qi is equal to ∑j≠i qij and pij is the same as qij/qi, specifying (S , qij) is
equivalent to specifying (S , qi , pij).
The memoryless property is very important in establishing the Markov property of the
CTMC. Because the Markov chain spends an exponentially distributed random time in each state,
the remaining lifetime is independent of the age. Accordingly, as long as the process is in state i
at time t, the remaining lifetime distribution remains the same, independently of the past of its
evolution.
• Consider a radio channel which can be either in a good state or a bad state, as in Figure 2.8.4 In
this case, the state space is
S = {Good, Bad} .
4 We simplified the channel by assuming that it has only two possible states.
• Assume that the transition rate qgb from the ‘Good’ state to the ‘Bad’ state is λ and that qbg = μ.
The transition rate matrix is

Q = [ qgg  qgb ]  =  [ −λ   λ ]
    [ qbg  qbb ]     [  μ  −μ ] .   (2.18)

The diagonal entries qgg and qbb are determined so that the sum of the elements of each row
is zero.
where qii = −∑j≠i qij and 0 ≤ qij < ∞ for all i ≠ j. The diagonal entry qii does not
provide any new information because it is a dependent term. It is chosen to be the negative
sum of all the other entries in its row for the sake of computational convenience: as we shall see, the
balance equations can then be written simply as πQ = 0 if we define qii = −qi = −∑j≠i qij.
Here, we assumed that the number of states is m. The component qij represents the transition
rate from i to j. Note that the sum of each row is zero.
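The construction of Q from the off-diagonal rates can be sketched as follows; the helper fills each diagonal entry with the negative row sum so that every row sums to zero. The rates used in the example call are illustrative.

```python
# Building a transition rate matrix Q from the off-diagonal rates
# q_ij, filling each diagonal entry with q_ii = -sum_{j != i} q_ij
# so that every row sums to zero.
def build_Q(rates, n):
    """rates: dict mapping (i, j) with i != j to the rate q_ij."""
    Q = [[0.0] * n for _ in range(n)]
    for (i, j), q in rates.items():
        Q[i][j] = q
    for i in range(n):
        Q[i][i] = -sum(Q[i][j] for j in range(n) if j != i)
    return Q

# Two-state 'Good'/'Bad' channel with q_gb = lam, q_bg = mu
# (illustrative numerical rates).
lam, mu = 1.5, 4.0
Q = build_Q({(0, 1): lam, (1, 0): mu}, 2)
print(Q)  # [[-1.5, 1.5], [4.0, -4.0]]
```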
Figure 2.9: The Markov property says that the current state is sufficient to predict the future evolution.
Knowing Xs is sufficient; the history of the evolution {Xu , u < s} does not help in predicting the future.
For a representative example, assume that there are k customers in some system. When a new
arrival enters the system, the state changes to k + 1, whereas when a departure happens, the state
becomes k − 1 (≥ 0). The arrival rate is λk and the departure rate is μk. The state space is

S = {0, 1, 2, · · · , L},

where L is the maximum number of customers allowed in the system (L can be ∞). The transition
rates qkl for k ≠ l are given by

qkl = λk   if l = k + 1, 0 ≤ k < L;
qkl = μk   if l = k − 1, k ≥ 1;          (2.20)
qkl = 0    otherwise.
The transition rate matrix Q for L = 4 is given by:
Q = [ −λ0        λ0           0            0           0   ]
    [  μ1   −(λ1 + μ1)        λ1           0           0   ]
    [   0        μ2      −(λ2 + μ2)       λ2           0   ]    (2.21)
    [   0         0           μ3      −(λ3 + μ3)      λ3   ]
    [   0         0            0           μ4         −μ4  ]
A birth-and-death process is a pure death process if λk = 0, for all k ≥ 0. It is a pure birth process if
μk = 0 for all k ≥ 0. A Poisson process is a pure birth process with L = ∞ and λk = λ for all k ≥ 0.
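A helper that builds the birth-and-death rate matrix of (2.21) for general L can be sketched as below; the constant rates passed in the example call are illustrative.

```python
# Transition rate matrix of a birth-and-death process, matching the
# structure of (2.21): births at rate lambda_k, deaths at rate mu_k.
def birth_death_Q(lams, mus, L):
    """lams[k] = lambda_k for 0 <= k < L; mus[k] = mu_k for 1 <= k <= L."""
    Q = [[0.0] * (L + 1) for _ in range(L + 1)]
    for k in range(L):
        Q[k][k + 1] = lams[k]          # birth: k -> k + 1
    for k in range(1, L + 1):
        Q[k][k - 1] = mus[k]           # death: k -> k - 1
    for k in range(L + 1):
        Q[k][k] = -sum(Q[k])           # each row sums to zero
    return Q

# Illustrative constant rates lambda_k = 1.0 and mu_k = 2.0, L = 4.
L = 4
Q = birth_death_Q([1.0] * L, [0.0] + [2.0] * L, L)
for row in Q:
    print(row)
```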
pij(t) := P[Xs+t = j | Xs = i], which denotes the probability that a process will be in state j after t units of time, given that it is
presently in state i.
We extend the definition of the limiting probability in a DTMC to the case of the CTMC,
assuming that the limiting probability exists:

πj := limt→∞ pij(t), j ∈ S.
One can show that a finite CTMC has a unique steady state distribution if it is irreducible,
which means that the process Xt can go from every state to every other state. If the chain is infinite and
irreducible, then the CTMC has at most one steady state distribution.
Getting a steady state probability π from a CTMC (S , Q ) is called ‘solving the CTMC.’
We can solve a CTMC (S , Q ) using two different approaches:
• Converting the CTMC (S , Q ) to a DTMC (S , P) and then applying methods in Section 2.4.
• Solving the balance equations directly:

πQ = 0, (2.24)

∑j πj = 1. (2.25)
πg λ = πb μ, πg + πb = 1. (2.27)
Solving the equations gives (πg , πb ) = (μ/(λ + μ), λ/(λ + μ)).
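The channel solution can be verified numerically; the sketch below checks that π = (μ/(λ+μ), λ/(λ+μ)) satisfies πQ = 0 component-wise. The numerical rates are illustrative.

```python
# Verifying the channel solution: with Q = [[-lam, lam], [mu, -mu]],
# the balance equation pi_g * lam = pi_b * mu and pi_g + pi_b = 1
# give pi = (mu/(lam+mu), lam/(lam+mu)).  Rates are illustrative.
lam, mu = 1.5, 4.0

pi_g = mu / (lam + mu)
pi_b = lam / (lam + mu)

# Check pi Q = 0 component-wise.
Q = [[-lam, lam], [mu, -mu]]
residual = [pi_g * Q[0][j] + pi_b * Q[1][j] for j in range(2)]
print(pi_g, pi_b, residual)  # residual is numerically zero
```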
−π0 λ0 + π1 μ1 = 0,
πk−1 λk−1 − πk (λk + μk ) + πk+1 μk+1 = 0, k ≥ 1.
qi πi = ∑j≠i qji πj,

where qi = ∑j≠i qij. The left-hand side is the rate at which the process leaves state i and
the right-hand side is the rate at which the process enters state i. The naming of the “balance
equations” comes from the fact that the two rates are equal or balanced. In the channel example,
the balance equation is πg λ = πb μ. The first term corresponds to the rate at which the process
leaves the ‘good’ state while the second term corresponds to the rate at which the process enters
the ‘good’ state.
2.7.2 UNIFORMIZATION: (S , Q ) → (S , P)
In general, it is easier to work with a discrete time model than with a continuous time one. Uni-
formization is a technique to convert a CTMC into a corresponding DTMC. The uniformization
procedure is as follows:
1. Pick a rate γ that is larger than or equal to the maximum of −qii over i, i.e.,

γ ≥ maxi {−qii }.

2. Define the transition probabilities pij := qij/γ for i ≠ j and pii := 1 + qii/γ; in matrix form,
P = I + Q/γ.

It is not difficult to see that pij ∈ [0, 1] for all i, j and ∑j pij = 1 for all i.
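The uniformization procedure can be sketched for the two-state channel; the steady state of the resulting DTMC coincides with that of the original CTMC. The rates are illustrative.

```python
import numpy as np

# Uniformization sketch: convert the CTMC (S, Q) into a DTMC (S, P)
# via P = I + Q / gamma with gamma >= max_i(-q_ii).  The steady state
# of P coincides with that of Q.  Rates are illustrative.
lam, mu = 1.5, 4.0
Q = np.array([[-lam, lam],
              [mu, -mu]])

gamma = max(-np.diag(Q))      # here gamma = 4.0; any larger value also works
P = np.eye(2) + Q / gamma

# P is a proper stochastic matrix.
assert np.all(P >= 0) and np.allclose(P.sum(axis=1), 1.0)

# Solve the DTMC: pi P = pi with sum(pi) = 1.
A = (np.eye(2) - P).T
A[-1, :] = 1.0
pi = np.linalg.solve(A, np.array([0.0, 1.0]))
print(pi)  # (mu, lam)/(lam + mu), approximately (0.7273, 0.2727)
```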
2.8 SUMMARY
• A Markov Chain is used to model a dynamical system that changes its state over time. Markov
chains are classified into continuous time Markov chains (CTMC) and discrete time Markov
chains (DTMC). The state of a DTMC is allowed to change only at discrete instants while
that of a CTMC can change at any time. The state evolution of a Markov chain depends only
on the current state, not on the past states. This is the Markov property.
• Because of the Markov property, a DTMC is specified by (S , P), where S is the state space
and P is the transition probability matrix of the DTMC. A state transition diagram helps
visualize a Markov chain model. In such a diagram, states are represented by circles and arrows
between circles correspond to transitions between states; arrows are usually marked with the
corresponding transition probability.
• The steady state distribution π is used to calculate performance metrics of interest, such as
average delay, average backlog, and blocking probability.
• A CTMC is specified by (S , Q ), where S is the state space and Q is the transition rate matrix
of the CTMC. As the holding time before a transition from i to j in a CTMC is exponentially
distributed, element qij of matrix Q is the rate of an exponential random variable. Due to the
memoryless property of exponential random variables, the Markov property holds for a
CTMC.
• A birth-and-death process is one of the most popular CTMC models in performance model-
ing. Because it has a closed form steady state solution, it is widely used for many applications.
CHAPTER 3

Developing Markov Chain Performance Models
Understanding the System. The first and most important step in any modeling effort is to have
a proper understanding of the system. Without it, the model could be misleading. Furthermore, it
is important for a good modeler to have a sufficient understanding of the modeling objectives. In
many cases, a team approach is appropriate. For example, a modeling expert would team up with a
system expert to execute the modeling project.
In the development of performance models, it is necessary to identify the potential bottleneck
components of the system. These are the components that impact the overall system performance
more than others. Oftentimes, capturing the potential impact of bottleneck components is the key
to the success of the model. However, the bottleneck components may not be obvious at
the beginning of the investigation, and modelers typically revise and improve the model after the
validation step. To reduce the cost and risk of model development, many modelers start with a very
simple high-level model. They use that model to confirm that the modeling direction is appropriate.
Once the simple model provides the initial understanding, the modelers revise it by adding details
that are important to the performance of the system.
Model Construction. After understanding the system, a modeler selects an appropriate model type.
Since there are many model types, the modeler chooses one based on the nature of the problem, taking
into account the available time and budget. For example, simulation models generally take a longer time
and more resources. Analytical modeling takes less effort but requires more assumptions
and approximations. In the discussion below, we suppose that the modeler selected a Markov chain
model.
To develop a Markov chain model, the following questions need to be answered:
• How does the state change and what are the transition rules?
Verification and Validation. After the model development, we validate the model against real data
or other types of model. Most commonly used is the cross validation of an analytical model against
a simulation model or real field data. The validation step is important and it makes the model more
precise. When we find noticeable differences, we revise the model to make it more accurate. The
modeler must think carefully to understand the gap. The thought process is very helpful in revising
the developed model.
The feedback arrows in Figure 3.1 show the revision process.
System Description Consider a campus network that has a group of local area networks (LANs)
and one access router as a gateway to the Internet. As students start complaining of slow network
connections, the network manager decides to investigate the performance issue and finds that the
access router is a potential bottleneck. To address the problem, the network manager is not sure
whether to upgrade the router or to replace it. If the decision is to purchase a new router, the manager
must determine its required characteristics. To study these questions, the manager decides to develop a
performance model to assess the different options.
Modeling The manager asks you to help model the network. As an expert, you decide to simplify
the model by focusing on the key aspects. Since the LANs are not the bottleneck, you focus on the
router as shown in Figure 3.2. You ignore the complexities of the LANs but model the total traffic
they generate for the router. You study historical data and identify the input traffic patterns such as
the number of packets. You also study the router CPU and memory.
You know that you can study this system either through computer simulation or with a math-
ematical model. You opt for a mathematical model as the system is not excessively complex and
a Markov chain model can provide a detailed understanding of the router performance. The next
question is whether to use a DTMC or CTMC. Both seem to be good candidates, and you decide
to use a discrete-time model.
States In the DTMC, you need to define the state Xn at time n. What should the state be,
and what should the time epoch n mean? As you are interested in the performance of the access router,
to observe the behavior of the buffer occupancy and the CPU, you let Xn be the number of packets in the
system at time n. Then

Xn ∈ S := {0, 1, 2, 3, · · · , L − 1, L}. (3.1)

Here, a time slot n is long enough to transmit one packet, and L is the size of the buffer, in number
of packets.
After studying the historical data, you find out that packets arrive at the router with probability
α in time slot n, independently of anything else. The packet that arrives at time slot n is available to
be forwarded in the next time slot n + 1. The CPU of the router is involved in many different tasks
and allocates only a fraction β of its cycles to the forwarding task. As a consequence, the router is
able to forward a packet with probability β in a given slot. With probability (1 − β), the CPU
is performing a different task. You further simplify the system by assuming that the arrivals and
departures are independent.
Transition Probabilities With this understanding of the arrivals and of the router behavior,
you can specify the transition probabilities. Note that state transitions happen when a packet arrives
or departs. When a packet arrives at time n, the buffer occupancy Xn either increases or stays the
same, depending on whether a departure happens. If there is a departure in the same slot, then
Xn+1 = Xn ; otherwise, Xn+1 = Xn + 1. Similarly, if there is no arrival in time slot n, Xn either
stays the same or decreases by one. So, the transition probability matrix P is as follows:
pij = p1 = α(1 − β),              if j = i + 1, i = 1, 2, · · · ;
      p2 = (1 − α)β,              if j = i − 1, i = 1, 2, · · · ;
      p3 = αβ + (1 − α)(1 − β),   if j = i,     i = 1, 2, · · · ;    (3.2)
      α,                           if i = 0, j = 1;
      1 − α,                       if i = 0, j = 0;
      0,                           otherwise.
The transition probability p0,1 from state 0 to 1 is α because there can be no departure when the
router buffer is empty.
Figure 3.3 shows the state transition diagram of the router buffer occupancy.
Solution of the model First, note that the Markov chain model that we have is a discrete
time birth-and-death process. The balance equations of the process are:
π0 · α = π1 · p 2 , (3.3)
π1 = π0 · α + π 1 · p 3 + π 2 · p 2 , (3.4)
πn = πn−1 · p1 + πn · p3 + πn+1 · p2 for n ≥ 2. (3.5)
These equations simplify to πn · p1 = πn+1 · p2 for n ≥ 1.
Figure 3.3: A Discrete Time Markov Chain Model of Router Buffer Occupancy.
From these equations and ∑n πn = 1, we find that the steady state distribution πn is given by:

πn = (α/p2) (p1/p2)^(n−1) π0, n ≥ 1, (3.6)

π0 = [1 + (α/p2) · p2/(p2 − p1)]^(−1). (3.7)
This invariant distribution is similar to that of the continuous time birth-and-death process that we
discussed earlier.
From the steady state distribution, the average backlog E[X] can be computed as:

E[X] = ∑n n · πn.
The average backlog is plotted in Figure 3.4 for different values of α and β. The x-axis
corresponds to the value of α and the y-axis to the average backlog. The three lines, from
left to right, correspond to the forwarding probabilities β = 0.3, 0.6, and 0.9, respectively.
Note that as the arrival probability α approaches β, the average buffer occupancy increases rapidly. We
can calculate the average delay D from the average backlog E[X] using Little’s result, which states that
E[X] = αD.
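The solution (3.6)-(3.7) can be evaluated numerically; the sketch below uses the illustrative values α = 0.3 and β = 0.6, truncates the infinite sum, and recovers the average delay from Little's result. (The formulas assume an effectively infinite buffer.)

```python
# Steady state of the forwarding model, (3.6)-(3.7), and the average
# backlog, for illustrative values alpha = 0.3, beta = 0.6 and an
# effectively infinite buffer (the sum is truncated numerically).
alpha, beta = 0.3, 0.6
p1 = alpha * (1 - beta)            # 0.12: arrival and no departure
p2 = (1 - alpha) * beta            # 0.42: departure and no arrival

pi0 = 1.0 / (1.0 + (alpha / p2) * p2 / (p2 - p1))
pi = [pi0]
for n in range(1, 200):            # truncate the infinite sum
    pi.append((alpha / p2) * (p1 / p2) ** (n - 1) * pi0)

assert abs(sum(pi) - 1.0) < 1e-9   # the distribution sums to one

EX = sum(n * p for n, p in enumerate(pi))   # average backlog
D = EX / alpha                              # average delay, by Little's result
print(pi0, EX, D)
```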
Validation You now have your first DTMC model of the campus access router, represented by (3.1)
and (3.2), whose analytical solution can be used to compute metrics of interest. Determining whether
the model is accurate enough or requires further modification can take much
more time and effort than the model development itself.
You can validate the model against benchmark cases. For example, you can measure the
average delay through the router and the arrival rate of packets. By comparing these numbers with
the prediction of the model, you can estimate the value of β. You can then repeat the measurements
Figure 3.4: The average backlog vs. the probability α of arrivals for different values of β ∈ {0.3, 0.6, 0.9}.
at different times of the day that correspond to different average arrival rates and delays. If the model
predictions are accurate enough, you can trust the model. Otherwise, you may need to revise it.
System Description Assume that you are a network designer for a cellular network company. As
there have been numerous complaints from customers about call blocking, your boss asks you to
initiate a performance study of the problematic area.
The network blocks calls because the number of channels is finite. Consider a cellular system
that consists of a single base station that serves incoming and outgoing phone calls. There are a limited
number of wireless channels for these phone calls, and if all the channels are being used when a call
is placed, the user receives a busy signal and must try to make a connection at a later time. Once a
connection is made, the channel is unavailable to other users until the current user terminates the
connection by hanging up. Figure 3.5 shows the cellular system with two on-going phone calls. The
number of channels is six in the figure.
The company has a policy of controlling the blocking probability below 1%. However, initial
investigation of the system reveals that the customers in the areas have experienced more frequent
call blocking than this threshold. It turns out that the population in the area has increased constantly
as new developments were built.
Figure 3.5: A Cellular System with Blocking: Two channels out of six are busy and the others are idle.
As an experienced engineer, you decide to reduce the size of cells in the area to lower the
blocking probability. With a smaller cell size, the blocking probability goes down as the rate of
incoming calls is reduced. But you are not sure how to select the reduced cell radius. To answer this
question, you decide to develop an analytical model.
Modeling The important factors that affect the blocking probability are the call arrival rate, the
average call duration, and the number of available channels. We can collect information on these
factors from historical data.
The study can be done by simulation or by an analytical model. In order to develop an analytical
model, many simplifying assumptions are needed.
Assumptions We make the following assumptions:

1. The durations of the phone calls are independent and exponentially distributed with mean 1/μ.

2. The time between phone calls to the system is distributed exponentially with mean 1/λ, even
when all the channels are busy.
The above assumptions seem very restrictive, but they are essential to develop a CTMC model.
The mean value of the call durations can be obtained from historical data and we further assume that
the call durations are exponentially distributed to make a CTMC model. The second assumption
is another way of saying that call arrivals form a Poisson process with rate λ. When the number of
potential users is large, the Poisson assumption is quite accurate.
States As discussed in the previous chapter, a CTMC model is determined by (S , Q ). In
the cellular system model, the state space is
S = {0, 1, 2, · · · , K − 1, K}, (3.8)
where K is the number of available channels. The state X(t) represents the number of busy channels
at time t and X(t) ∈ S .
Transition Rates The state changes when a new call arrives or when a call terminates, so the process
is a birth-and-death process. The arrival rate λi when i channels are busy is

λi = λ, i ∈ {0, 1, · · · , K − 1}. (3.9)

The death rate μi when the number of busy channels is i is:

μi = iμ, i ∈ {1, · · · , K}. (3.10)

This is because each conversation terminates at rate μ, independently of the other conversations;
accordingly, the first of the i on-going conversations terminates at rate iμ.
Figure 3.6 shows the state transition diagram of the Markov model.
Figure 3.6: A Markov Chain Model of the Cellular System with K channels.
where ρ = λ/μ. The value of π0 is obtained from ∑i πi = 1.
The blocking probability of the cellular system is πK . Indeed, the arrival rate of calls does
not depend on the state of the system. Consequently, the probability that a call arrives when all the
channels are busy is the probability πK that all channels are busy.
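The blocking probability πK can be evaluated either from the birth-and-death steady state directly (πi proportional to ρ^i / i!) or by the standard Erlang B recursion, which is numerically more robust for large K. The offered load ρ = 12 in the example call is illustrative.

```python
import math

# Erlang B blocking probability via the standard recursion
# B(0) = 1, B(k) = rho * B(k-1) / (k + rho * B(k-1)),
# equivalent to pi_K with pi_i proportional to rho^i / i!.
def erlang_b(rho, K):
    B = 1.0
    for k in range(1, K + 1):
        B = rho * B / (k + rho * B)
    return B

# Sanity check against the direct birth-and-death steady state:
# pi_K = (rho^K / K!) / sum_i (rho^i / i!).
def blocking_direct(rho, K):
    terms = [rho ** i / math.factorial(i) for i in range(K + 1)]
    return terms[-1] / sum(terms)

rho, K = 12.0, 20   # illustrative offered load, K = 20 channels
print(erlang_b(rho, K), blocking_direct(rho, K))  # the two agree
```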
Figure 3.7 shows the blocking probability πK of the cellular system when the number K of
channels is 20. We assume that the arrival rate per square kilometer is two calls per hour, so that the
arrival rate of a cell of radius r is λ = 3.14 × r² × 2 calls per hour. The x-axis corresponds to the cell
radius and the y-axis to the blocking probability. The solid line with square ticks is from the Erlang B
formula (3.15) and the dotted line with circle ticks is from the simulation. We can observe that the
blocking probability is almost zero when the cell radius is smaller than 4 km and that it reaches 9% as
the cell radius becomes 5.5 km. We can conclude from the model that the cell radius should be smaller
than 5 km to satisfy the blocking probability threshold.
Validation and Verification of the Model. The analytical models we developed seem restrictive
with many assumptions such as exponential duration of phone calls and the Poisson arrival of calls.
Because of these assumptions, validation and verification processes are needed. As an alternative to
the mathematical model, oftentimes simulation models are developed to cross-validate the analytical
model. An advantage of the simulation model is that we do not need to make the assumptions that
we made in the Markov model. However, it is more complex to develop the simulation model.
The dotted line in Figure 3.7 shows the results from the simulation model. The results are
very close, except when the radius is 5.5 km: the blocking probability of 14% from the simulation is
somewhat higher than the 9% from the Markov model. From this cross-check, we gain confidence in
the correctness of the model. When a simulation model does not exist, practitioners can rely on field
data and expert opinions for the validation effort.
ALOHA was a pioneering wireless networking system developed at the University of Hawaii
in the early 1970s. To enable communication between separated campuses of the University of
Hawaii, Professor Abramson developed the packet switched network with the first multiple access
scheme as shown in Figure 3.8. As multiple transmitters share a communication channel, if two
or more nodes transmit simultaneously, the transmissions fail. If no node transmits, the channel
is unused. The challenge in the multiple access protocol is to coordinate multiple transmitters to
achieve efficiency and fairness. Refer to (3; 27) for thorough coverage of the multiple access protocols
and issues.
3.4 SLOTTED ALOHA
System Description There are two versions of the ALOHA protocol: pure and slotted. In the
pure ALOHA protocol, each transmitter can send a packet at any time, while in the slotted ALOHA
protocol, packet transmission is restricted to the beginning of time slots. This restriction is known
to improve the throughput by a factor of two over the pure ALOHA protocol by reducing the
number of collisions.
The multiple access protocol of the slotted ALOHA protocol works as follows. When a
packet arrives at an unbacklogged node, the node simply transmits the packet in the first slot after
the arrival, thus risking occasional collisions but achieving very small delays if collisions are rare.
When a collision occurs, each node sending one of the colliding packets discovers the collision when
it does not receive an acknowledgment for its packet and it becomes backlogged. The backlogged
node waits for a random number of time slots and retransmits the packet. If the backlogged nodes
were to retry transmission in the next slot, another collision would be inevitable. A random delay is
included in the protocol to avoid repetitive collisions.
To understand the performance of slotted ALOHA, including the average delay or the max-
imum sustainable throughput, we develop a Markov chain model.
Modeling
As time is a sequence of slots in slotted ALOHA, it is natural to consider a discrete time
model. We model the system with a DTMC model, which was proposed in (3).
Assumptions We make the following assumptions:
• There are N nodes in the system; let n be the number of backlogged nodes at the beginning
of a given slot.
• Each backlogged node transmits a packet with probability p, independently of the other nodes.
Each one of the N − n unbacklogged nodes transmits a packet that arrived in the previous
slot.
• Packets arrive to each node according to a Poisson process with rate λ/N. Since the number of
such arrivals in one time unit is Poisson distributed, no packet arrives with probability e−λ/N ;
thus the probability that the unbacklogged node transmits in a given slot is
q := 1 − e−λ/N .
State Space. Let Xt be the number of backlogged nodes in the system at the beginning of
time slot t. Then the state space S of the system is
S = {0, 1, 2, · · · , N},
where N is the number of nodes in the system. The possible events that change the number of
backlogged nodes are: (1) packet arrivals to unbacklogged nodes, (2) successful transmission, and
(3) failure of transmission. The number of backlogged nodes increases by the number of new arrivals
to unbacklogged nodes. It decreases by one if a backlogged node transmits successfully in the time
slot. A successful transmission can happen (1) if there is one transmission from an unbacklogged
node and no transmission attempts from the backlogged nodes, or (2) if there are no transmissions
from the unbacklogged nodes and exactly one attempt from the backlogged nodes. Thus, we have
the following transition probabilities.
Transition Probabilities. Let ru (i, n) be the probability that i unbacklogged nodes transmit
packets in a given slot. Similarly, let rb (i, n) be the probability that i backlogged nodes transmit
packets in a given slot. Then,
$$r_u(i, n) = \binom{N-n}{i} q^i (1-q)^{N-n-i} \qquad (3.16)$$
$$r_b(i, n) = \binom{n}{i} p^i (1-p)^{n-i}. \qquad (3.17)$$
With these notations, the transition probability can be written as:
$$P_{n,n+i} = \begin{cases} r_u(0,n)\, r_b(1,n), & i = -1;\\ r_u(1,n)\, r_b(0,n) + r_u(0,n)\,[1 - r_b(1,n)], & i = 0;\\ r_u(1,n)\,[1 - r_b(0,n)], & i = 1;\\ r_u(i,n), & 2 \le i \le N - n. \end{cases} \qquad (3.18)$$
Note that the state n goes to n − 1 if only one of the backlogged nodes transmits a packet and no
unbacklogged nodes transmit. Since both events are independent, the probability is ru (0, n)rb (1, n).
The state n remains the same either (1) when there is one attempt from the unbacklogged nodes and
zero attempts from the backlogged nodes, or (2) when there are zero attempts from the unbacklogged
nodes and zero attempts or a collision from the backlogged nodes. The state n goes to n + 1 when one
attempt from the unbacklogged nodes collides with attempts from the backlogged nodes. The state
increases by i (≥ 2) when there are i attempts from the unbacklogged nodes, regardless of the
backlogged nodes.
Numerical Solution Note that P_{ij} = 0 for j ≤ i − 2, since the number of backlogged nodes
decreases by at most one per slot. The balance equations π = πP of the model are:
$$\pi_i = \sum_{j=0}^{i+1} \pi_j P_{ji}, \quad \text{for } i = 0, 1, \cdots, N-1; \qquad (3.19)$$
$$\pi_N = \sum_{j=0}^{N} \pi_j P_{jN}. \qquad (3.20)$$
Due to this structure, the steady state probabilities can be computed in an iterative way. Since π1, π2,
···, πN can be expressed in terms of π0, the value of π0 can be found numerically from Σ_i π_i = 1.
Unlike the previous two examples, we do not have a closed form solution for this model, but it can be
obtained numerically. With the steady state distribution, the average number of backlogged nodes
can be found, from which we compute the average delay experienced by packets.
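The iterative computation described above can be sketched in a few lines of Python. This is a minimal sketch, not the authors' code; N, p, and λ below are illustrative values, and the forward substitution expresses each π_{i+1} from the balance equation for state i.

```python
from math import comb, exp

def aloha_P(N, p, lam):
    """Transition matrix of the slotted ALOHA DTMC, following (3.16)-(3.18)."""
    q = 1.0 - exp(-lam / N)                      # unbacklogged transmit prob.
    ru = lambda i, n: comb(N - n, i) * q ** i * (1 - q) ** (N - n - i)
    rb = lambda i, n: comb(n, i) * p ** i * (1 - p) ** (n - i)
    P = [[0.0] * (N + 1) for _ in range(N + 1)]
    for n in range(N + 1):
        if n > 0:
            P[n][n - 1] = ru(0, n) * rb(1, n)
        P[n][n] = ru(1, n) * rb(0, n) + ru(0, n) * (1 - rb(1, n))
        if n < N:
            P[n][n + 1] = ru(1, n) * (1 - rb(0, n))
        for i in range(2, N - n + 1):
            P[n][n + i] = ru(i, n)
    return P

def steady_state(P):
    """Forward substitution: express pi_1..pi_N in terms of pi_0 using the
    balance equations (3.19), then normalize so the pi_i sum to one."""
    N = len(P) - 1
    x = [1.0] + [0.0] * N                        # x[i] = pi_i / pi_0
    for i in range(N):
        # pi_i = sum_{j<=i+1} pi_j P[j][i]  =>  solve for pi_{i+1}.
        x[i + 1] = (x[i] - sum(x[j] * P[j][i] for j in range(i + 1))) / P[i + 1][i]
    z = sum(x)
    return [v / z for v in x]

P = aloha_P(N=20, p=0.1, lam=0.5)
pi = steady_state(P)
avg_backlog = sum(i * q for i, q in enumerate(pi))
```

From the average backlog, the average packet delay follows by Little's law.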
Figure 3.9: The number of backlogged nodes vs. the transmission probability p for different values of
λ = {0.1, 0.2, 0.3}.
Figure 3.9 plots the average number of backlogged nodes for different values of p and λ when
the number N of nodes is 50. When the transmission probability is too small, the number of
backlogged nodes is large because the radio resource sits mostly idle. As p increases, the backlog
shrinks, but once p exceeds a certain value (e.g., 0.2 when λ = 0.1), the backlog grows very rapidly
because the number of collisions increases with a higher transmission probability p. The figure shows
the importance of the transmission probability p to the performance of the ALOHA system.
As the transmission probability p is important to the stability of the ALOHA networks,
people have tried to estimate an appropriate value of p. However, since the number of nodes n in
the system is unknown to the nodes, proposed algorithms tried to estimate it. The essence of such
algorithms is to increase p upon idle slots and decrease p upon collisions. For example, (10) proposed
a recursive control algorithm of p that achieves stable operation of the slotted ALOHA networks.
Heuristic Analysis of slotted ALOHA It is known that the efficiency of the slotted ALOHA
protocol is 36% under some idealized assumptions. This means that out of 100 time slots, only 36
slots can be used for data transmission and the remaining 64 slots are wasted. The waste comes from
collisions, from postponing due to random delay and so on.
To understand the efficiency of 36%, consider the following throughput equation of the slotted
ALOHA:
S = Ge−G . (3.21)
In this formula, S is the total throughput rate in packets per time slot and G is the total rate
of transmission attempts in packets per time slot. Multiplying the total attempts rate G by the
successful transmission probability e−G gives the total throughput rate S in (3.21). To see why e−G
is the successful transmission rate, we argue as follows. When a large number of nodes attempt to
transmit, each with a small probability, the transmission attempts form a Poisson process. A given
node is successful if no other node attempts to transmit in a duration equal to 1. The probability of
that event is the probability that the Poisson process with rate G makes no jump in one unit of time.
This is the probability that the time until the next transmission is larger than 1. Since that time
is exponentially distributed with rate G, this probability is e−G . Figure 3.10 shows the throughput
formula Ge−G . Note that the maximum throughput is e−1 , which is roughly 36%.
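A quick numerical check of (3.21) confirms where the peak sits: setting dS/dG = (1 − G)e^{−G} to zero gives G = 1 and S(1) = e^{−1}.

```python
import math

S = lambda G: G * math.exp(-G)          # slotted ALOHA throughput, eq. (3.21)

# Scan G over (0, 5]; the grid includes G = 1 exactly, where the peak is 1/e.
peak = max(S(g / 1000.0) for g in range(1, 5001))
print(round(peak, 4))   # 0.3679, i.e., roughly 36%
```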
3.5 WI-FI NETWORK - CSMA MARKOV CHAIN
System Description A group of Wi-Fi devices use the carrier sense multiple access with collision
avoidance (CSMA/CA) protocol to share a radio channel. When using this protocol, before trans-
mitting, each device checks whether the radio channel is busy or idle. If busy, the device waits until
the channel becomes idle. If a collision happens, each device waits (backs off) for a random time.
The device with the shortest random waiting time retransmits its packet. If two transmitters are
close enough, they cannot transmit simultaneously under the CSMA/CA protocol because one senses
that the radio channel is busy when the other transmits. We are interested in the average throughput
of transmitters in the Wi-Fi network.
Figure 3.11: The Bianchi model can be applied only to the left case. In a single collision domain model,
each link interferes with all other links. In a multiple collision domain model, since there is more than
one collision domain, a link interferes with only some of the links. In the right plot, link C interferes
with link D, but not with link E.
Figure 3.11 shows two simple network examples. Links A and B cannot be used simultaneously
as they neighbor each other. However, links C and E can be used simultaneously.
Contention Graph. The CSMA Markov chain uses an interference model called a contention graph.
The vertices of the contention graph are the links of the network and its edges connect the interfering
links, i.e., links that cannot transmit simultaneously.
Figure 3.12 shows the contention graphs that correspond to the networks of Figure 3.11. For
instance, the contention graph on the right of the figure shows that links C and D conflict and so
do links D and E. The graph also indicates that links C and E do not conflict since they are not
connected by an edge in that graph.
Figure 3.12: Contention graphs: a node in a contention graph corresponds to a link in the network; an
edge in the contention graph represents the contention relationship between two links of the network.
Modeling the CSMA Markov Chain
Assumptions:
1. The transmission time of each link is exponentially distributed with mean one.
2. Before transmitting, each idle link waits for an independent, exponentially distributed backoff time.
3. Each link always has a packet to send (saturation).
The exponential distributions in the first two assumptions are needed to model the link activities as
a CTMC. The mean transmission time is set to one for simplicity. The third assumption indicates
that we are interested in the saturated throughput, when the sources are not a bottleneck. With these
assumptions, we define a CTMC as follows.
State Space. The CSMA Markov chain represents the state of the links L in the network.
Let xl (t) be the state of link l at time t, which can be either active (i.e., transmitting) or idle. When
the link is idle, xl (t) = 0; otherwise, xl (t) = 1. Let X(t) = [xl (t), l ∈ L] be the vector of states of
all links in the network and S be the set of possible values of X(t). Because of conflicts between
links, S has fewer than 2^K elements, where K is the number of links.
Consider the network in the left of Figure 3.11. Since links A and B interfere with each other,
the state (1, 1) is not possible. The state space for this example is
S = {(0, 0), (1, 0), (0, 1)}.
Similarly, the state space for the network on the right of the figure is
S = {(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 0, 1)}.
The number of possible states in the CSMA Markov chain is the same as the number of
independent sets. Two links A and B are said to be independent when they can be simultaneously
active. In the conflict graph, an independent set is a set of vertices that are not connected by an edge.
Transition Probabilities. There are two types of transition in the CSMA Markov chain
model: a start of transmission and the end of a transmission. When a transmission starts, one
idle link becomes active and the state changes from x i to x i + ek with rate Rk . (Here, ek is a K-
dimensional vector whose k-th element is one and all others are zero.) This transition is possible if
link k is not active in state x i (i.e., xki = 0) and if all the links that conflict with k are not active in
x i . When a transmission ends, the state x i + ek becomes x i with rate 1.
Note that the rate Rk is the rate of the exponential waiting time of link k and it represents
the aggressiveness of that link. The higher the value of Rk , the more aggressive link k is in trying to
capture the radio channel.
Figure 3.13: Example state transition diagrams for the networks of Figure 3.11.
This model focuses on the multiple collision domains represented by a general conflict graph.
In contrast, the Bianchi model (4) focuses on the binary exponential backoff algorithm in a single
collision domain model.
Figure 3.13 shows the state transition diagram of the CSMA Markov chain model for two net-
works shown in Figure 3.11. In the model on the left, there are only three states {(0, 0), (1, 0), (0, 1)},
because two links are conflicting. State (0, 0) transitions to state (1, 0) with rate RA and to (0, 1)
with rate RB . The model on the right corresponds to the three-link network of Figure 3.11. In this
model, we do not show the transition rate 1 of transmission completions, for the sake of simplicity.
Solution. We can find the steady state probability π by solving the balance equations π Q = 0.
For the first example, the transition rate matrix Q is given by:
$$Q = \begin{bmatrix} -(R_A + R_B) & R_A & R_B \\ 1 & -1 & 0 \\ 1 & 0 & -1 \end{bmatrix}.$$
We find
$$\pi = \frac{1}{1 + R_A + R_B}\,\bigl(1,\; R_A,\; R_B\bigr).$$
Similarly, the transition rate matrix for the second example, with the states ordered as
(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 0, 1), is
$$Q = \begin{bmatrix} -(R_C + R_D + R_E) & R_C & R_D & R_E & 0 \\ 1 & -(1 + R_E) & 0 & 0 & R_E \\ 1 & 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & -(1 + R_C) & R_C \\ 0 & 1 & 0 & 1 & -2 \end{bmatrix},$$
and its stationary distribution is
$$\pi = \frac{(1,\; R_C,\; R_D,\; R_E,\; R_C R_E)}{1 + R_C + R_D + R_E + R_C R_E}.$$
We can see that the utilizations of the links are
$$(\rho_C, \rho_D, \rho_E) = \frac{(R_C + R_C R_E,\; R_D,\; R_E + R_C R_E)}{1 + R_C + R_D + R_E + R_C R_E}.$$
If R_C = R_D = R_E = 5, then (ρ_C, ρ_D, ρ_E) = (30/41, 5/41, 30/41). Observe that the middle
link D is much less utilized than the other two, which reveals the unfairness issue of the CSMA/CA
protocol of Wi-Fi networks.
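The pattern in the two examples above generalizes: the stationary distribution of the CSMA Markov chain has a product form over the independent sets of the conflict graph, with π(x) proportional to the product of R_l over the links active in x. A minimal Python sketch (not the authors' code) reproduces the three-link numbers:

```python
from itertools import product

def csma_stationary(R, edges):
    L = len(R)
    # Feasible states are the independent sets of the conflict graph.
    states = [x for x in product((0, 1), repeat=L)
              if all(not (x[a] and x[b]) for a, b in edges)]
    weight = {}
    for x in states:
        w = 1.0
        for l, active in enumerate(x):
            if active:
                w *= R[l]          # pi(x) proportional to prod of R_l over active links
        weight[x] = w
    Z = sum(weight.values())
    return {x: w / Z for x, w in weight.items()}

# Three-link chain C - D - E: link D conflicts with both C and E.
pi = csma_stationary([5.0, 5.0, 5.0], edges=[(0, 1), (1, 2)])
rho = [sum(p for x, p in pi.items() if x[l]) for l in range(3)]
print(rho)   # (30/41, 5/41, 30/41) as in the text
```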
3.6 A MULTI-CHANNEL MAC PROTOCOL MODEL
Operations of a Dedicated Control Channel MAC In this dedicated control channel MAC, every
device has two radios. (Refer to (19) for other multichannel MAC protocols; it classifies multichannel
MAC protocols into four approaches: dedicated control channel, split, common hopping, and
multiple rendezvous, which differ in how devices agree on the channel to be used for transmission
and how they resolve potential contention for a channel.) One radio is tuned to a channel dedicated
to control messages; the other radio can tune to any other channel. In principle, all devices can
overhear all the agreements made by other
devices, even during data exchange. This system’s efficiency is limited only by the contention for the
control channel and the number of available data channels. Figure 3.14 illustrates the operations of the
Dedicated Control Channel MAC. Note that channel 0 is the control channel and that channels 1,
2, and 3 are for data transmission. When device A wants to send a packet to device B, it transmits
an RTS (request-to- send) packet on the control channel. That RTS specifies the lowest-numbered
free channel. Upon receiving the RTS, B responds with a CTS (clear-to-send) packet on the control
channel, confirming the data channel suggested by A. The RTS and CTS packets also contain a
Network Allocation Vector (NAV) field, as in 802.11, to inform other devices of the duration for
which the sender, the receiver, and the chosen data channel are busy. Since all devices listen to the
control channel at all times, they can keep track of the busy status of other devices and channels
even during data exchange. Devices avoid busy channels when selecting a data channel.
The major advantage of the Dedicated Control Channel protocol is that it does not require
time synchronization; rendezvous always happens on the same channel. The main disadvantage is
that it requires a separate dedicated control radio and a dedicated channel, thereby increasing cost
and decreasing spectral efficiency when few channels are available.
State Space. Let Xn be the number of busy data channels at the beginning of time slot n. The
state space is S = {0, 1, · · · , M}, where M is the number of data channels of the system. The
number M of data channels is two in the example of Figure 3.15. The evolution of Xn, the number
of busy data channels, is 0, 0, 0, 1, 2, 2, 1, 1, 0, and 1 in the example.
Assumptions. To make the model fit into a DTMC, we make the following assumptions.
1. Upon making an agreement, the devices can transmit only one packet (one may think of a
“packet” as the amount of data that can be transferred per channel agreement);
2. The packet lengths, which are integer multiples of slot durations, are independent and geo-
metrically distributed with parameter q (i.e., packet duration has a mean of 1/q slots);
Figure 3.15: An example evolution of the dedicated control channel MAC with two data channels
(C: collision, S: success, I: idle on the control channel).
3. Devices always have packets to send to all the other devices; in each time slot, an idle device
attempts to transmit with probability p. The receiver of a sender is decided randomly with
equal probability among possible candidates.
The second and third assumptions are essential to make the model Markovian. The second
assumption, the geometrically distributed packet length, makes the termination events independent
of the past of Xn because the geometric distribution is memoryless. Similarly, the third assumption
makes the birth events Markovian because each device transmits with probability p independently
of the past evolution.
Transition Probabilities.
Xn+1 = Xn + An − Dn , n ≥ 0, (3.22)
where An is the number of new agreements at time n, and Dn is the number of terminations at
time n. Note that An = 1 if a new agreement is made in time slot n and An = 0, otherwise. If
Xn = M, which means that all channels are busy, then An = 0 with probability 1. The number
of departures Dn at time n ranges from 0 to Xn . If Xn = 0, then Dn = 0 with probability 1.
• We assume the slotted ALOHA model² to model the exchange of RTS/CTS. When a device
has a packet to transmit, it attempts to transmit the packet with probability p by sending an
RTS. An agreement is made when exactly one device attempts to transmit an RTS. Hence, the
success probability Sk in the next time slot, given that k pairs are communicating in the current
slot, is given below:
$$S_k = (N - 2k)\, p (1-p)^{N-2k-1}. \qquad (3.23)$$
² The ALOHA model assumes a single collision domain, as in the Bianchi model.
Here, N is the number of devices in the network. Since k pairs are currently communicating,
N − 2k devices are inactive. Also, p(1 − p)^{N−2k−1} is the probability that one specific inactive
device transmits an RTS with probability p in a given time slot while all other N − 2k − 1 inactive devices
do not. Since all N − 2k devices can try to transmit an RTS, we have the expression for Sk .
• The probability T_k^{(j)} that j transfers finish when the system is in state k is given by the
following expression:
$$T_k^{(j)} = \Pr[\,j \text{ transfers terminate at time } t \mid X_{t-1} = k\,] = \binom{k}{j} q^j (1-q)^{k-j}. \qquad (3.24)$$
Since each active device terminates its transmission with probability q independently, the
probability of j terminations out of k active transmissions is q^j (1 − q)^{k−j} for each choice of
the j terminating devices. Since there are (k choose j) possible combinations, we have the
probability shown in (3.24).
The system utilization ρ is given by
$$\rho = \frac{\sum_{i \in S} i \cdot \pi_i}{M}, \qquad (3.26)$$
where π_i is the limiting probability that the system is in state i and S is the state space of the Markov
chain. One obtains π_i by solving the balance equations of the Markov chain. We obtain the total
system throughput as
$$TH_{ded} = M \cdot C \cdot \rho, \qquad (3.27)$$
where C is the channel transmission rate per channel.
We evaluated a system with the parameters shown in Table 3.1. We assume that the
duration of each slot is the time to transmit one RTS and one CTS frame, which, according to the
standard, is equal to RTS + SIFS + CTS + SIFS = 288 + 10 + 240 + 10 ≈ 550 μsec. The success
probabilities Sk for k = 0, 1 are
S_0 = 20 · p(1 − p)^19 = 0.3774,
S_1 = 18 · p(1 − p)^17 = 0.3763.
Since we assume that the geometric termination probability is q = 0.1, the average packet
size is E[L] = 10 slot durations, or 10 × 550 μsec × 1 Mbps / 8 = 687.5 bytes.
With these parameters, we can formulate the transition probability matrix
$$P = \begin{bmatrix} 1 - S_0 & S_0 & 0 \\ (1-S_1)T_1^{(1)} & (1-S_1)T_1^{(0)} + S_1 T_1^{(1)} & S_1 T_1^{(0)} \\ T_2^{(2)} & T_2^{(1)} & T_2^{(0)} \end{bmatrix} = \begin{bmatrix} .6226 & .3774 & 0 \\ .0623 & .5990 & .3387 \\ .0100 & .1800 & .8100 \end{bmatrix}.$$
Multiplying P by itself 100 times gives
$$P^{100} = \begin{bmatrix} .0709 & .3339 & .5952 \\ .0709 & .3339 & .5952 \\ .0709 & .3339 & .5952 \end{bmatrix}.$$
Since each row is the steady state distribution π, the system utilization ρ is given by ρ = (.3339 ×
1 + .5952 × 2)/2 = 0.7622. The capacity of the system TH_ded is given by:
$$TH_{ded} = 2\,(\text{channels}) \times 1\,(\text{Mbps/channel}) \times 0.7622 \approx 1.52\ \text{Mbps}.$$
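The whole computation above can be reproduced in a few lines of Python. This is a sketch, not the authors' code; it assumes N = 20 devices along with p = 0.05, q = 0.1, and M = 2, the values implied by the numbers above (S0 = 20 · p(1 − p)^19 = 0.3774 requires p = 0.05).

```python
from math import comb

N, M, p, q = 20, 2, 0.05, 0.1       # devices, data channels, attempt and termination prob.

def S(k):                           # agreement probability, eq. (3.23)
    return (N - 2 * k) * p * (1 - p) ** (N - 2 * k - 1)

def T(k, j):                        # j of k transfers terminate, eq. (3.24)
    return comb(k, j) * q ** j * (1 - q) ** (k - j)

# Transition matrix over states 0..M (number of busy data channels):
# j transfers terminate in the slot, and at most one new agreement is made.
P = [[0.0] * (M + 1) for _ in range(M + 1)]
for k in range(M + 1):
    s = S(k) if k < M else 0.0      # no new agreement when all channels are busy
    for j in range(k + 1):
        P[k][k - j] += (1 - s) * T(k, j)
        if k - j + 1 <= M:
            P[k][k - j + 1] += s * T(k, j)

pi = [1.0 / (M + 1)] * (M + 1)
for _ in range(2000):               # power iteration: pi <- pi * P
    pi = [sum(pi[a] * P[a][b] for a in range(M + 1)) for b in range(M + 1)]

rho = sum(i * pi[i] for i in range(M + 1)) / M
print(round(rho, 4))                # about 0.762, matching the text
```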
3.7 SUMMARY
• A Markov chain model development follows these steps: (1) understanding the system,
(2) model construction, and (3) verification and validation. It is an iterative process: once a
model is constructed, it is revised through verification and validation until it is deemed
appropriate.
• We used a simple forwarding system to illustrate a discrete-time, birth-and-death process.
This process has an analytical solution, which can be found by solving the balance equations.
• The blocking model is a continuous-time, birth-and-death process whose blocking probability
is known as the Erlang formula.
• The DTMC model of slotted ALOHA is not a birth-and-death process, and it does not have
a closed form solution. However, we can compute the steady state distribution numerically.
• The CSMA Markov chain models the behavior of links in a wireless LAN as on-off processes
with exponential durations. This model captures the interference relationship between links
using a conflict graph. With appropriate selection of parameters, the model can be used to
assess the performance of large-scale wireless LAN networks that are not limited to a single
collision domain.
• The multichannel MAC model captures the behavior of a dedicated control channel protocol
with a DTMC. Though the model is not as accurate as the Bianchi model in that the backoff
timer is not modeled, it can be used to assess the throughput enhancements of a multiple
channel MAC protocol.
CHAPTER 4
ADVANCED MARKOV CHAIN MODELS
Figure 4.1: The Bianchi model consists of two parts: a DTMC model and a network model.
expires, the station attempts a transmission at the beginning of the next slot. If the transmission
is successful, the device repeats the same steps for the next packet. In the event of a collision, the
device doubles the value of CW, which explains the name “binary exponential backoff ” given to this
algorithm. The initial value of CW is set to CWmin (typically 32). When the value of CW reaches
a predetermined CWmax , (typically 1024), it stops increasing. The device then drops the packet if
three more transmission attempts again result in collision.
The motivation for the binary exponential backoff algorithm is that transmission failures tend
to indicate that many devices are attempting to transmit. Accordingly, to limit the chances of collision,
the devices should reduce the probability that they try in a given time slot. A device reduces its
probability of transmission by increasing CW.
It is important to understand the behavior of the backoff timer as it determines the average
transmission attempt rate of Wi-Fi devices. Bianchi uses a two-dimensional Markov chain model of
the backoff timer.
State Space. The backoff timer is modeled by a two-dimensional state vector (s(t), b(t)). In this
vector, b(t) is the value of the backoff timer at time t and s(t) is the backoff stage that represents
the number of previous unsuccessful transmission attempts of the current packet. The state space of
the DTMC model B(t) = (s(t), b(t)) is given by:
$$S = \{(s, b) : 0 \le s \le m - 1,\; 0 \le b \le CW_{max}(s) - 1\},$$
where m is the maximum number of allowed transmissions. The value of m is 7 in Figure 4.1.
The maximum contention window size CWmax(s) when the backoff stage is s is:
$$CW_{max}(s) = \begin{cases} 16, & s = 0; \\ 2^{4+s}, & s = 1, 2, \cdots, 6. \end{cases} \qquad (4.1)$$
Note that CWmax(s) starts from 16 and increases up to 1024. State (0, 0) is the initial state after a
successful transmission.
• The backoff timer value b(t) is decreased by one every clock tick as long as the channel is
sensed to be idle. Therefore, the probability of transition from state (s, b) to state (s, b − 1)
is one for b ≥ 1. The horizontal arrows in the left plot of Figure 4.1 correspond to this.
• When the backoff timer value b(t) reaches zero, the mobile device attempts transmission.
The model assumes that the attempt fails with probability p and succeeds with probability
(1 − p), independent of anything else. Upon failure, s(t + 1) = s(t) + 1 and the backoff timer
value b(t + 1) is selected uniformly from [0, CWmax(s + 1) − 1]; the contention window is
doubled upon each failed transmission attempt. The downward arrows in Figure 4.1 correspond
to a failure, and the probability p/CWmax(s + 1) is equally assigned to all next-level states, which
models the random backoff generation. The upward arrow from state (s, 0) to (0, 0) with
probability (1 − p) corresponds to a successful transmission attempt.
Solving the DTMC. We can solve the DTMC with the methods explained in Section 2.4 to
obtain the steady state distribution π_{s,b} for a given p, the conditional probability of collision given
that there is a transmission in the slot. The objective of the DTMC model is to find the probability τ
that a device transmits in a randomly chosen time slot. From the Markov chain, we can express the
probability τ(p) as a function of p:
$$\tau(p) = \sum_{s=0}^{m-1} \pi_{s,0}, \qquad (4.2)$$
which is the sum, over the backoff stages, of the probabilities that the backoff timer value is zero.
Since a device attempts transmission whenever its counter reaches zero, summing π_{s,0} over s gives
the probability that a device attempts transmission.
Since the value of p is unknown, Bianchi uses another equation relating p and τ:
$$p = 1 - (1 - \tau)^{N-1}, \qquad (4.3)$$
Figure 4.2: The channel activity as a sequence of cycles T1, T2, · · · (C: collision, S: success, I: idle).
which states that the probability p that a transmitted packet encounters a collision is the probability
that at least one of the N − 1 remaining devices transmits, which is given by the right-hand side
of (4.3). By using the two equations (4.2) and (4.3), we can find the values of τ and p, which turn
out to be unique.
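The fixed point of (4.2)-(4.3) can be computed numerically. The sketch below is an illustration, not Bianchi's code: it uses the standard closed-form solution of the backoff chain (π_{s,0} = p^s π_{0,0}, with the residual backoff uniform within each stage, so stage s carries (W_s + 1)/2 times the mass of π_{s,0}), a window schedule 16, 32, ..., 1024 with m = 7 read off Figure 4.1, and an assumed N = 20 devices; bisection finds the root since f(p) changes sign on (0, 1).

```python
def tau_of_p(p, W):
    # Backoff-chain solution: pi_{s,0} = p^s * pi_{0,0}; stage s contributes
    # pi_{s,0} * (W[s] + 1) / 2 to the normalization.
    m = len(W)
    pi00 = 1.0 / sum(p ** s * (W[s] + 1) / 2 for s in range(m))
    return pi00 * sum(p ** s for s in range(m))        # eq. (4.2)

W = [16 * 2 ** s for s in range(7)]                    # 16, 32, ..., 1024
N = 20

# Solve p = 1 - (1 - tau(p))^(N-1), i.e. eq. (4.3), by bisection on
# f(p) = p - (1 - (1 - tau(p))^(N-1)); f(0+) < 0 < f(1-).
lo, hi = 1e-12, 1.0 - 1e-12
for _ in range(200):
    mid = (lo + hi) / 2
    if mid < 1 - (1 - tau_of_p(mid, W)) ** (N - 1):
        lo = mid
    else:
        hi = mid
p = (lo + hi) / 2
tau = tau_of_p(p, W)
print(round(tau, 4), round(p, 4))
```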
The throughput is then obtained by a renewal-reward argument as the ratio E[P]/E[T], (4.4)
where T and P represent the length of a cycle and the size of the packet transmitted in a cycle. Since
there are three possible cases, the expected length E[T] of a cycle is given by
$$E[T] = \sum_{e \in \{idle,\, succ,\, coll\}} p_e T_e, \qquad (4.5)$$
where Te is the duration of event e ∈ {idle, succ, coll}. Similarly, the expected reward E[P], or the
expected packet size transmitted during one cycle, is
$$E[P] = p_{succ}\, L, \qquad (4.6)$$
where L is the length of a packet. Since a packet is transmitted only in the case of "success," we have
a zero reward in the two other cases.
The probabilities p_e of the events can be expressed as follows:
$$p_{idle} = (1 - \tau)^n; \qquad (4.7)$$
$$p_{succ} = n\tau(1 - \tau)^{n-1}; \qquad (4.8)$$
$$p_{coll} = 1 - p_{idle} - p_{succ}. \qquad (4.9)$$
The durations Tsucc and Tcoll are expressed in terms of δ, the propagation delay, H, the time to
transmit the PHY/MAC headers, and E[L], the mean time to transmit a data payload. Figure 4.3
illustrates how Tsucc and Tcoll are derived. RTS, CTS, and ACK are the MAC-level control frames,
and the corresponding terms in the expressions are the durations of those frames. Similarly, DIFS
and SIFS are the durations of the interframe spaces between frames. Table 4.1 shows the parameter
values for the 802.11b network.
Figure 4.4: The probability τ of transmission (left) and the saturated throughput of the Wi-Fi network
(right), from the Bianchi model.
4.2 A MULTICHANNEL MAC PROTOCOL WITH A FIXED DURATION
4.2.1 OVERVIEW
Assume that there are M data channels and one control channel, and that devices are equipped
with a single radio interface, which is shared for both control and data transmissions. The control
channel is used for making new agreements on data transmissions. As there is only one interface,
after making an agreement over the control channel, the device jumps to an agreed data channel and
transmits a packet. Once the transmission is done, the device returns to the control channel. As the
device cannot overhear the control channel while it is away for data transmission, it cannot know
the status of the data channels. If it starts transmission immediately, it is possible that the channel
it selected is busy.
One solution to this problem is to force devices to wait for a certain duration, say Tmax, before
making a new agreement. As long as Tmax is longer than the maximum packet transmission time,
the device can avoid collision because all ongoing transmissions must have ended by then. This idea
was used in the multichannel protocols of (24; 16). We will discuss the model of WiFlex, introduced
in (16).
The multichannel protocol consists of three phases: observation, contention and transmission
as shown in Figure 4.5. The observation phase is the duration of waiting to avoid collision, which
we just discussed. During the contention phase, devices try to make an agreement over the control
channel. Devices agree upon the channel and duration. Once agreement is made, both devices move
to the agreed channel and start transmission.
Assume that device A returns to the control channel after finishing a packet transmission.
Device A, after waiting for the duration Tmax, tries to send a packet to another device, say B.
A contends for the transmission chance over the control channel and makes an agreement with B.
Both devices jump to the agreed channel and exchange a packet. They then return to the control
channel and repeat the same process.
4.2.2 MODEL DESCRIPTION
Consider a network that consists of N single radio devices. The devices transmit packets using one
of M data channels and exchange control packets in a separate control channel. The devices are in
one of three phases: observation, contention, or transmission. The device stays in the observation
phase for the fixed duration of Tmax . It then contends for data transmission to make an agreement
in the control channel. After making an agreement, the device jumps to the agreed data channel and
transmits a packet. We consider two models: a three-queue model and an (L+2)-queue model. In the
first one, we simply assume that the duration of the observation phase is exponential, while in the
second one, we approximate the observation phase with L queues of exponential duration with mean
Tmax/L. The first model is a special case of the second with L = 1. We first explain the case L = 1 to
introduce a closed queueing network model.
Figure 4.5: Each device goes through three states: observe (fixed duration), contend (exponential
duration), and transmit (exponential duration). The fixed length of the observation phase makes
Markovian modeling difficult.
State Space. Let ni(t) be the number of devices in state i for i ∈ {o, c, t} at time t. The three-
dimensional vector (no, nc, nt) can be used to represent the state of the system.¹ So the state space
S is given by:
S = {(no, nc, nt) : no + nc + nt = N, ni ≥ 0}.
State Transition Rates. A typical device moves along the closed queueing network
with three stages as shown in the Figure 4.5.
• The first circle models the observe phase of devices. After finishing transmission, the device
takes a vacation at the first queue. Though it is of a fixed length, we assume that the first queue
has exponentially distributed independent service times with rate μo = (T )−1 . So its service
rate is
μo (i) := iμo , 0 ≤ i ≤ N. (4.11)
¹ Since no + nc + nt = N, a two-dimensional vector is enough.
4.2. A MULTICHANNEL MAC PROTOCOL WITH A FIXED DURATION
• The second circle models the contention for RTS/CTS² exchange in the (single) control
channel. We assume that the queue has an exponential service rate μc. The rate μc is introduced
to model the average CSMA/CA contention success rate. If the RTS/CTS duration is dts
and the success probability of a reservation in each RTS/CTS duration is psucc, then μc can be
approximated by:

μc ≈ psucc / dts. (4.12)
Then the service rate μc(i) of the contention queue with i devices in the queue is μc(i) = μc for i ≥ 1 and μc(0) = 0, since the control channel can carry only one RTS/CTS exchange at a time.
Therefore, the transition rate matrix of the continuous-time Markov chain is

q(n, n′) =
  no·μo,           if n′ = (no − 1, nc + 1, nt);
  μc,              if n′ = (no, nc − 1, nt + 1);
  min(nt, K)·μt,   if n′ = (no + 1, nc, nt − 1);    (4.18)
  0,               otherwise.
² RTS (Request to Send) and CTS (Clear to Send) are control messages exchanged between sender and receiver to make an agreement.
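To make the transition structure of (4.18) concrete, the generator matrix can be assembled numerically by enumerating all states. The sketch below is not from the text; the function names and parameter values are illustrative assumptions, and the diagonal is filled in so that each row sums to zero, as a generator requires.

```python
def states(N):
    """All states (n_o, n_c, n_t) with n_o + n_c + n_t = N."""
    return [(no, nc, N - no - nc)
            for no in range(N + 1) for nc in range(N - no + 1)]

def generator(N, K, mu_o, mu_c, mu_t):
    """Rate matrix Q of eq. (4.18); the diagonal is set so each row sums to 0."""
    S = states(N)
    idx = {s: i for i, s in enumerate(S)}
    Q = [[0.0] * len(S) for _ in S]
    for no, nc, nt in S:
        i = idx[(no, nc, nt)]
        moves = []
        if no > 0:   # a device finishes its observation phase
            moves.append(((no - 1, nc + 1, nt), no * mu_o))
        if nc > 0:   # one RTS/CTS agreement succeeds on the control channel
            moves.append(((no, nc - 1, nt + 1), mu_c))
        if nt > 0:   # a transmission finishes (at most K served in parallel)
            moves.append(((no + 1, nc, nt - 1), min(nt, K) * mu_t))
        for s2, rate in moves:
            Q[i][idx[s2]] += rate
            Q[i][i] -= rate
    return S, Q
```

For N = 2 this yields a chain with six states; each row of Q sums to zero and all off-diagonal entries are nonnegative.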
Solving a Queueing Network. The closed queueing network model is known to have a product
form solution because it belongs to a BCMP network, a generalization of a Jackson network. A
queueing network is a BCMP network if 1) all queues are one of four specific types discussed below
and 2) the next queue that a customer enters is random. The four types of queue are 1) a FCFS
queue with exponential service duration, 2) infinite server queue, 3) processor sharing queue, and
4) a single server queue with LCFS with pre-emptive resume.
As the observation queue is an infinite server queue and the contention and transmission queues
are FCFS queues with exponential service, the model is a BCMP network, which has a product
form solution. The stationary distribution with respect to n can be found as a product form:
π(no, nc, nt) = γ·fo(no)·fc(nc)·ft(nt),

fo(no) = (λ/μo)^no / no!,

fc(nc) = (λ/μc)^nc,

ft(nt) = (λ/μt)^nt / ( min(nt, M)! · M^((nt−M)^+) ),

where γ is a normalization constant,

γ = ( Σ_{n∈S} fo(no)·fc(nc)·ft(nt) )^(−1),    (4.19)

and λ is a stationary input rate to a queue, to be chosen appropriately to compute the system
performance.
The system throughput TH can be computed using the marginal distribution π(nt), which
we can obtain by summing π(no, nc, nt) over nc from 0 to N − nt. With the marginal distribution
π(nt), the system throughput TH is:

TH = Σ_{nt=1}^{min(N,K)} nt · π(nt) · R.    (4.20)
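For small N, the normalization (4.19), the marginal of nt, and the throughput (4.20) can be evaluated by direct enumeration of the state space. The following is a minimal sketch under illustrative parameter values (not from the text); it keeps the text's M-server form of ft and the min(N, K) cap in the throughput sum.

```python
from math import factorial

def product_form_throughput(N, K, M, lam, mu_o, mu_c, mu_t, R):
    """Evaluate TH of eq. (4.20) from the product form and eq. (4.19)."""
    def weight(no, nc, nt):
        f_o = (lam / mu_o) ** no / factorial(no)      # infinite-server observe queue
        f_c = (lam / mu_c) ** nc                      # FCFS contention queue
        excess = max(nt - M, 0)                       # exponent (n_t - M)^+
        f_t = (lam / mu_t) ** nt / (factorial(min(nt, M)) * M ** excess)
        return f_o * f_c * f_t

    S = [(no, nc, N - no - nc)
         for no in range(N + 1) for nc in range(N - no + 1)]
    w = {s: weight(*s) for s in S}
    gamma = 1.0 / sum(w.values())                     # normalization constant (4.19)
    marg = [0.0] * (N + 1)                            # marginal distribution of n_t
    for (no, nc, nt), v in w.items():
        marg[nt] += gamma * v
    return R * sum(nt * marg[nt] for nt in range(1, min(N, K) + 1))
```

Since the marginal probabilities sum to one, TH is bounded above by min(N, K)·R.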
State Space. The state space of the new model is similar to that of the three-queue model but has
L + 2 dimensions:

S = { (n1, ..., nL, nL+1, nL+2) : Σ_{i=1}^{L+2} ni = N, ni ≥ 0 }.

The first L queues model the observe phase and the last two queues model the contend and transmit
phases, respectively.
(Figure: the closed queueing network with L observation queues o1, o2, …, oL in series, followed by the contend and transmit queues.)
State Transition Rates. The state transition rate matrix of the Markov chain is as follows:

q(n, n − ei + e_{i+1}) =
  (1/L)·ni·μo,     for i = 1, 2, ..., L;
  μc,              if i = L + 1;
  min(nt, K)·μt,   if i = L + 2;    (4.21)
  0,               otherwise,

where nt = n_{L+2} and, for i = L + 2, e_{L+3} is read as e_1 since the network is closed.
Note that the transition rates are the same as those of the 3-queue model except for the first L queues. The
transition rate from n to n − ei + e_{i+1} is slowed down by a factor of L: it is (1/L)·ni·μo for i = 1, 2, ..., L.
Solving the Queueing Network. Note that the given CTMC model belongs to the class of BCMP queueing
networks because the first L queues are infinite server queues and the others are FCFS queues with
exponential service. The steady state distribution has the following product form solution:
π(n) = γ · ∏_{i=1}^{L+2} fi(ni),    (4.22)
where

fi(ni) =
  (λL/μo)^ni / ni!,                              i = 1, 2, ..., L;
  (λ/μc)^nc,                                     i = L + 1;
  (λ/μt)^nt / ( min(nt, M)! · M^((nt−M)^+) ),    i = L + 2,    (4.23)

with nc = n_{L+1} and nt = n_{L+2}.
The system throughput can be computed in the same way as for the 3-queue model: after
computing the marginal distribution π(nt), we compute the throughput from equation (4.20).
(Figure: state transition diagram over the five states (2,0,0), (1,1,0), (0,2,0), (1,0,1), and (0,1,1).)
We can convert the transition rate matrix Q into a transition probability matrix P by dividing
Q by μt + μo and adding an identity matrix:

P = I + Q/(μt + μo) =

              (2,0,0)  (1,1,0)  (0,2,0)  (1,0,1)  (0,1,1)
    (2,0,0)   0.8804   0.1196   0        0        0
    (1,1,0)   0        0        0.1196   0.8804   0
    (0,2,0)   0        0        0.1196   0        0.8804
    (1,0,1)   0.1196   0        0        0.7607   0.1196
    (0,1,1)   0        0.1196   0        0        0.8804      (4.27)
The steady state distribution of the system can then be obtained by solving π = πP together with Σ_n π(n) = 1.
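With P known numerically, π = πP and the normalization constraint form a small linear system. A minimal sketch, using the matrix of (4.27); since the entries of (4.27) are rounded to four digits, π = πP holds only up to that rounding.

```python
import numpy as np

# Matrix of eq. (4.27); state order: (2,0,0), (1,1,0), (0,2,0), (1,0,1), (0,1,1)
P = np.array([
    [0.8804, 0.1196, 0.0,    0.0,    0.0   ],
    [0.0,    0.0,    0.1196, 0.8804, 0.0   ],
    [0.0,    0.0,    0.1196, 0.0,    0.8804],
    [0.1196, 0.0,    0.0,    0.7607, 0.1196],
    [0.0,    0.1196, 0.0,    0.0,    0.8804],
])

# pi = pi P  <=>  (P^T - I) pi = 0; replace one redundant equation
# with the normalization sum(pi) = 1 to get a unique solution.
A = P.T - np.eye(5)
A[-1, :] = 1.0
b = np.zeros(5)
b[-1] = 1.0
pi = np.linalg.solve(A, b)   # stationary distribution over the five states
```

This replace-one-equation trick works because, for an irreducible chain, the rows of P^T − I sum to (approximately) zero, so dropping any single balance equation loses no information.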
4.3 SUMMARY
• Two advanced Markov chain models are introduced in this chapter: the Bianchi model and the
multi-channel MAC model.
• The Bianchi model, the first throughput model of a wireless LAN, is based on a DTMC
and the Aloha model. It computes the saturation throughput very accurately but is limited to a
single collision domain.
• The multi-channel MAC with fixed duration model is an application of a closed queueing
network. Because it belongs to the class of BCMP networks, a closed form solution exists.
APPENDIX A
Exercises
RENEWAL PROCESS AND THE RENEWAL REWARD
THEOREM
If an event happens repeatedly, a renewal process can be used to model the repeating events.
A renewal process is a counting process whose inter-event times are independent and identically
distributed random variables. For example, suppose that buses arrive at a station every five minutes,
independently of anything else. The counting process N(t) of bus arrivals is a renewal process.
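The definition can be made concrete with a small simulation sketch (the function name and parameters are illustrative, not from the text). By the elementary renewal theorem, the long-run rate N(t)/t tends to one over the mean inter-event time.

```python
import random

def renewal_count(t_end, draw_interarrival):
    """N(t_end): number of renewals in [0, t_end] with i.i.d. inter-event times."""
    t, n = 0.0, 0
    while True:
        t += draw_interarrival()
        if t > t_end:
            return n
        n += 1

# Deterministic buses every 5 minutes: exactly 20 arrivals in 100 minutes.
assert renewal_count(100.0, lambda: 5.0) == 20

# Exponential inter-arrival times with mean 5 (a Poisson process is also a
# renewal process): the empirical rate approaches 1/5.
random.seed(1)
rate = renewal_count(100_000.0, lambda: random.expovariate(1 / 5)) / 100_000.0
```

Both cases have mean inter-event time 5, so both rates are close to 0.2 even though one process is deterministic and the other is Poisson.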
QUEUEING NETWORKS
A queueing network consists of a group of queues and the customers that move among them; it is
widely used to model a system with multiple resources. A computer system with a hard disk, a CPU,
and processes can be modeled with a queueing network (15), as can manufacturing systems with
multiple machines (6).
With more queues, analyzing a queueing network is in general more difficult than analyzing a single
queue. However, Jackson, in his seminal paper (11), discovered a class of queueing networks that
are analytically tractable, named "Jackson networks" (11). A queueing network is a Jackson network if:
• The arrivals to queue i form a Poisson process with rate λi for all i = 1, 2, · · · , m, where m
is the number of queues and λi can be zero for a subset of queues.
• The service time is exponentially distributed with rate μi and customers are served in a FCFS
manner.
• Upon completion of service at queue i, the customer moves to queue j with probability pij or leaves
the system with probability 1 − Σ_j pij.
Jackson showed that the steady state queue distribution π exists when the utilization ρi is less
than one at every queue, and the distribution of state (k1, k2, ..., km) is given by:

π(k1, k2, ..., km) = ∏_i ρi^ki · (1 − ρi).    (A.1)
Note that it is a product of individual queue equilibria, and it is known as the "product form solution". The
utilization ρi at queue i is the total arrival rate divided by the service rate, or ρi := ri/μi. The total
arrival rate ri is the sum of the external and internal arrival rates to queue i, which is given by

ri = λi + Σ_k pki·rk, for i = 1, 2, ..., m.
Consider a computer system consisting of a CPU and a hard disk. The arrivals of processes
to the CPU are Poisson with rate λ and the processing time at the CPU is exponentially distributed
with rate μ1. With probability 1/4 the process leaves the system, and with probability 3/4 it moves
to the hard disk. The processing time at the hard disk is also exponential with rate μ2. All processes
return to the CPU upon completion at the hard disk.
The traffic equations give r1 = λ + r2 and r2 = (3/4)·r1, so r1 = 4λ and r2 = 3λ. The steady
state distribution is

π(k1, k2) = ∏_{i=1}^{2} ρi^ki · (1 − ρi),

where ρi = ri/μi for i = 1, 2.
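The computation can be checked numerically: the traffic equations are the linear system (I − pᵀ)r = λ. A sketch for the CPU/disk example, where the numeric λ and μ values are illustrative assumptions:

```python
import numpy as np

# Routing matrix p[i][j] for queues 0 (CPU) and 1 (hard disk):
# CPU -> disk w.p. 3/4 (leaves the system w.p. 1/4); disk -> CPU w.p. 1.
p = np.array([[0.0, 0.75],
              [1.0, 0.0 ]])
lam = np.array([1.0, 0.0])   # external Poisson arrival rates (illustrative)
mu  = np.array([6.0, 5.0])   # service rates (illustrative)

# Traffic equations r_i = lam_i + sum_k p_ki r_k  <=>  (I - p^T) r = lam
r = np.linalg.solve(np.eye(2) - p.T, lam)
rho = r / mu                 # utilizations; product form needs rho_i < 1

def pi(k1, k2):
    """Product-form probability of k1 jobs at the CPU and k2 at the disk, eq. (A.1)."""
    return (rho[0] ** k1) * (1 - rho[0]) * (rho[1] ** k2) * (1 - rho[1])
```

With these inputs the solve yields r = (4λ, 3λ), and the probabilities π(k1, k2) sum to one over all states.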
Closed Queueing Network A queueing network is open if there are external arrivals to one or
more queues and departures from the system. If there are no external arrivals to and no departures
from the system, it is called a closed queueing network. A Jackson network is closed if λi = 0 for all i
and Σ_j pij = 1 for all i. Assume that K customers are moving along a closed Jackson
network. As there are no arrivals and departures, the number of customers in the system is fixed at
K, or

Σ_i ki = K.
Let K⃗ be the set of states k = (k1, k2, ..., km) such that Σ_i ki = K, or

K⃗ := { k | Σ_i ki = K }.
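Because K⃗ is finite, closed-network quantities can be computed by brute-force enumeration of its states. A short sketch (the function name is illustrative); by the stars-and-bars argument, |K⃗| = C(K + m − 1, m − 1).

```python
from math import comb

def closed_states(m, K):
    """All k = (k1, ..., km) of nonnegative integers with k1 + ... + km = K."""
    if m == 1:
        return [(K,)]
    return [(k,) + rest
            for k in range(K + 1)
            for rest in closed_states(m - 1, K - k)]

states = closed_states(3, 4)
assert len(states) == comb(4 + 3 - 1, 3 - 1)   # 15 states for m = 3, K = 4
assert all(sum(k) == 4 for k in states)
```

Such an enumeration is how the normalization constant of a closed product-form network can be computed directly when K and m are small.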
BCMP networks The simplicity of the product form solution attracted many researchers, and they
tried to find more general queueing networks with the property. A group of four scientists, Baskett,
Chandy, Muntz and Palacios, found more general queueing networks that exhibit the product form
solution, which are named BCMP networks (8). A queueing network is a BCMP network if 1) all
queues are one of the four types and 2) the next queue that a customer enters is random. The four
types of queue are 1) a FCFS queue with exponential service duration, 2) an infinite server queue,
3) a processor sharing queue, and 4) a single server queue with LCFS with pre-emptive resume. Refer
to (26) for queueing network models.
Bibliography
[1] Human Genome Project Information.
http://www.ornl.gov/sci/techresources/Human_Genome/home.shtml. 2
[2] The Network Simulator 2, http://www.isi.edu/nsnam/ns. 3
[3] D. P. Bertsekas and R. Gallager, Data Networks, Prentice Hall, 2nd edition, 1992. 42, 43
[4] G. Bianchi, "Performance Analysis of the IEEE 802.11 Distributed Coordination
Function," IEEE Journal on Selected Areas in Communications, 18(3):535–547, 2000.
DOI: 10.1109/49.840210 49, 57
[5] R. R. Boorstyn, A. Kershenbaum, B. Maglaris, and V. Sahin, "Throughput Analysis in Multi-
hop CSMA Packet Radio Networks," IEEE Transactions on Communications, 35(3):267–274,
1987. DOI: 10.1109/TCOM.1987.1096769 46
[6] J. A. Buzacott and J. G. Shanthikumar, Stochastic Models of Manufacturing Systems, Prentice Hall,
1993. 72
[7] O.-J. Dahl and K. Nygaard, "SIMULA: An Algol-based Simulation Language," Communications
of the ACM, 9:671–678, 1966. DOI: 10.1145/365813.365819 3
[8] F. Baskett, K. M. Chandy, R. R. Muntz, and F. G. Palacios, "Open, Closed and Mixed Networks
of Queues with Different Classes of Customers," Journal of the ACM, 22:248–260, 1975.
DOI: 10.1145/321879.321887 73
[9] R. M. Fujimoto, K. Perumalla, A. Park, H. Wu, M. H. Ammar, and G. F. Riley, “Large-scale
Network Simulation: How Big? How Fast?,” In the 11th IEEE/ACM International Symposium
on Modeling, Analysis and Simulation of Computer Telecommunications Systems, pages 116–123,
2003. DOI: 10.1109/MASCOT.2003.1240649 3
[10] B. Hajek and T. van Loon, "Decentralized Dynamic Control of a Multiaccess Broad-
cast Channel," IEEE Transactions on Automatic Control, 27(3):559–569, 1982.
DOI: 10.1109/TAC.1982.1102978 45
[11] J. R. Jackson, "Jobshop-like Queueing Systems," Management Science, 10(1):131–142, October
1963. DOI: 10.1287/mnsc.10.1.131 72
[12] R. K. Jain, The Art of Computer Systems Performance Analysis: Techniques for Experimental
Design, Measurement, Simulation, and Modeling, Wiley, 1991.
[13] L. Jiang and J. Walrand, “A Distributed Algorithm for Optimal Throughput and Fairness in
Wireless Networks with a General Interference Model,” In Proc. of the Allerton Conference,
2008. 46
[16] J. Lee, J. Mo, T. Trung, J. Walrand, and H. So, “Wiflex: Multi-channel Cooperative MAC
Protocol for Heterogeneous Wireless Devices,” In Proc. of IEEE WCNC, 2008. 63
[18] J. Medhi, Stochastic Models in Queueing Theory, Academic Press, 2nd edition, 2002. 6
[19] J. Mo, W. So, and J. Walrand, “Comparison of Multichannel MAC Protocols,” Transactions
on Mobile Computing, 7(1):50–65, 2008. DOI: 10.1109/TMC.2007.1075 50
[22] S. M. Ross, Introduction to Probability Models, Academic Press, 7th edition, 2000. 13, 72
[23] S. M. Ross, Stochastic Processes, Wiley, New York, 2nd edition, 2005.
[24] J. Shi, T. Salonidis, and E. Knightly, “Starvation Mitigation Through Multi-channel Coor-
dination in CSMA based Wireless Networks,” In Proc. of ACM MobiHoc 2006, May 2006.
DOI: 10.1145/1132905.1132929 63
[25] W. Turner, J. Mize, K. Case, and J. Nazemetz, Introduction to Industrial and Systems Engineering,
Prentice Hall, New Jersey, 3rd edition, 1993. 1
[28] R. W. Wolff, Stochastic Modeling and the Theory of Queues, Prentice Hall, 1989. 72
Author’s Biography
JEONGHOON MO
Jeonghoon Mo received the B.S. degree from Seoul National University, Korea, and the M.S. and
Ph.D. degrees from the University of California, Berkeley. He is currently a professor in the Department
of Information and Industrial Engineering at Yonsei University, Korea. Before joining Yonsei,
he worked at AT&T Labs, Tera Blaze, and KAIST. His research interests include
network economics, wireless communications and mobile services, optimization, game theory, and
performance analysis.
Index