4DC10 Lecture Notes
Contents
1 Introduction 5
2 Basics 9
2.1 Permutations and combinations . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Standard series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3 Probability Models 11
3.1 Basic ingredients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.2 Conditional probabilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.3 Discrete random variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.4 Continuous random variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.5 Central limit theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.6 Joint random variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.7 Conditioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4 Manufacturing Models 53
4.1 Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.2 Key performance measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.3 Capacity, flow time and WIP . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.4 Little’s law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.5 Variability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.6 Process time variability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.6.1 Natural variability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.6.2 Preemptive outages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.6.3 Non-Preemptive outages . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.6.4 Rework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.7 Flow variability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.8 Variability interactions - Queueing . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.9 Zero-buffer model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Appendices 103
In the course Analysis of Production Systems we study the behavior of manufacturing systems.
Understanding their behavior and basic principles is critical to manufacturing managers and engineers
trying (i) to develop and control new systems and processes, or (ii) to improve existing systems.
The importance of manufacturing management should not be underestimated: the success of
a company strongly depends on the effectiveness of its management. One might say that
the future of the Dutch manufacturing industry, which constitutes about 17% of the Dutch
gross national product and employs about 15% of the Dutch work force, depends on how
well manufacturing managers and engineers are able to exploit the newest developments in
information and manufacturing technology, also referred to as Smart Industry or the fourth
industrial revolution.
Figure 1.1: Smart Industry – Dutch industry fit for the future
The aim of this course is to develop understanding of manufacturing operations, with a focus on
the material flow through the plant and key performance indicators, such as throughput, equipment
efficiency, investment in equipment and material, work in process and so on. Clearly, managers and
engineers need to rely on their “manufacturing intuition” in order to fully understand the
consequences of the decisions they make on system design, control and maintenance.
This course is based on the book Factory Physics [2]. This book provides the basis for
manufacturing science, by offering (i) a clear description of the basic manufacturing principles
(i.e., the physical laws underlying manufacturing operations), (ii) understanding and intuition
about system behavior, and (iii) a unified modeling framework to facilitate synthesis of complex
manufacturing systems.
As we will see, a crucial and disrupting element in manufacturing operations is variability.
Theories that effectively describe the sources of variability and their interaction are Probability
Theory and Queueing Theory. So, not surprisingly, Factory Physics is firmly grounded on these
theories, since a good probabilistic intuition is a powerful tool for the manager and engineer.
This also explains why the present course consists of:
• Manufacturing models.
This is the main part, which is devoted to Manufacturing Science. It is based on the
book Factory Physics [2], explaining the basic models and principles in manufacturing
operations. The goal of this part is to develop, through analytical manufacturing models,
understanding of the behavior and basic principles of manufacturing operations, with a
focus on the material flow through the plant.
• Probability models.
As mentioned above, a disrupting element in manufacturing operations is variability. As
force is driving acceleration in physics, so is variability driving production lead times and
work in process levels in manufacturing. The phenomenon of variability can be effectively
described by Probability Theory. Hence, this part treats elementary concepts in Probabil-
ity Theory, including probability models, conditional probabilities, discrete and continuous
random variables, expectation, central limit theorem and so on. It is based on the book
Understanding Probability [6], which puts emphasis on developing probabilistic intuition and
which is full of challenging examples. The goal of this part is to develop basic skills in
formulating and analyzing probability models.
Detailed models of real-life manufacturing systems are often too complicated for analytical
treatment. A powerful and, in manufacturing practice, very popular tool to analyse complicated
real-life models is discrete-event computer simulation. Therefore, this course also offers an
introduction to the use of simulation models, for which we will employ the python simulation
library Pych. We would like to emphasize that simulation is not only a tool to study complex
systems, but it is also ideal to develop probabilistic intuition. Most of the challenging problems
in [6] can be “solved” or verified by simulation and famous results such as the central limit
theorem can be demonstrated “in action” by using simulation. So the third part of this course
consists of:
• Simulation models.
The simulation library Pych is used as a vehicle to demonstrate simulation modeling and
analysis. We believe that simulation modeling should be learned by doing. Hence, this
part is based on self-study. To support self-study, an interactive learning environment
for Pych in Jupyter Notebook is made available, and during the lectures, many examples
of Pych models will be presented. The goal of this part is to obtain some hands-on
experience in developing and using simulation models through Pych.
We conclude the introduction with an illustrative example in the area of automated warehousing.
Example 1.1. (Kiva systems) A new technology in warehousing is mobile shelf-based order
picking, pioneered by Kiva Systems. The items are stored on movable storage racks, see Figure
1.2.
The racks, containing items ordered by a customer, are automatically retrieved from the storage
area and transported to an order picker by robots. These robots are small autonomous drive
units. An important design question is how many robots are needed per order picker to achieve
a desired picking rate.
To answer this question we consider an order picker, with an average pick time of 3 minutes
per rack. When the items for a customer order are picked, the robot stores the rack at some
place in the storage area and then retrieves the next rack from another place and brings it to
the order picker. The average time required by a robot to store and retrieve a rack from the
storage area depends on the layout of the storage area (i.e., length-to-width ratio), the storage
strategy (where to locate which items?) and the location of the pick station. Figure 1.3 shows
the layout of a storage area with 12 aisles, 14 cross-aisles and 5 pick stations.
Figure 1.3: Layout with 12 aisles, 14 cross-aisles and 5 pick stations (source: [4])
Given the layout and picker location we can calculate the average time to store and retrieve a
rack. Let us assume it is nearly 15 minutes. Clearly, the variability of the storage and retrieval
times may be high, since the racks can be located anywhere, and intensive traffic of robots may
cause unpredictable congestion in the storage area. On arrival at the pick station, the robot joins
the queue of robots waiting in the buffer in front of the pick station, see Figure 1.4.
Figure 1.4: Storage area, buffer and pick station
Figure 1.5: Throughput (racks/hour) as a function of the number of robots
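A Pych simulation model of this system, with N robots, is sketched below. The import of the Pych library and the definitions of the processes Generator, Storage, Buffer and Pick are omitted. The values filled in for N and la are illustrative: la corresponds to the mean storage and retrieval time of 15 minutes, just as mu = 20.0 corresponds to the mean pick time of 3 minutes; the comments reflect our reading of the model.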
N = 6            # number of robots under study (illustrative value)
la = 4.0         # storage and retrieval rate per robot (per hour), i.e. a mean of 15 minutes
mu = 20.0        # pick rate of the order picker (per hour), i.e. a mean pick time of 3 minutes
env = Environment()
a = Channel(env)                                   # free robots to the storage processes
b = Channel(env)                                   # loaded robots to the buffer
c = Channel(env)                                   # buffer to the pick station
G = Generator(env, a, N)                           # releases the N robots into the system
Ss = [Storage(env, a, b, la) for j in range(N)]    # robots storing and retrieving racks
B = Buffer(env, b, c)                              # buffer in front of the pick station
P = Pick(env, c, a, mu, 10000)                     # order picker; the last argument is presumably the number of picks simulated
env.run()
The graph in Figure 1.5 shows the relationship between number of robots and throughput. We
can conclude that, to achieve a throughput of picking 15 racks per hour, at least 6 robots
are required, and that the benefit of adding more robots sharply decreases (and merely results
in additional robots waiting at the picker). Note that the required number of robots may be
different for other order pickers, since the storage and retrieval times depend on the location of
the picker.
The results in Figure 1.5 are obtained by computer simulation, but under certain model as-
sumptions, it is also possible to derive a closed formula for the throughput TH as a function
of the number of robots N, where 1/μ is the average pick time and 1/λ the average storage and
retrieval time of a robot. This formula explicitly shows how the throughput depends on the
model parameters N, μ and λ.
TH = μ ( 1 − [ (μ/λ)^N / N! ] / [ Σ_{i=0}^{N} (μ/λ)^i / i! ] ).
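As a quick illustration (not part of the original notes), a few lines of Python evaluate this formula with the values used above, a mean pick time of 3 minutes (μ = 20 per hour) and a mean storage and retrieval time of 15 minutes (λ = 4 per hour); the result reproduces the conclusion from Figure 1.5 that at least 6 robots are needed to pick 15 racks per hour.

from math import factorial

def throughput(N, mu, la):
    # evaluates the closed formula above: TH = mu * (1 - p0), where p0 is the
    # fraction of time the pick station is idle
    rho = mu / la
    p0 = (rho ** N / factorial(N)) / sum(rho ** i / factorial(i) for i in range(N + 1))
    return mu * (1 - p0)

mu, la = 20.0, 4.0      # picks per hour, storage/retrieval trips per hour
for N in range(1, 11):
    print(N, round(throughput(N, mu, la), 2))
# throughput is about 14.3 racks/hour for N = 5 and about 16.2 for N = 6,
# so at least 6 robots are needed to reach 15 racks per hour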
Of course, at this point, we do not ask you to understand the above Pych simulation model nor
the formula for TH, but during this course we intend to teach you how models and formulas,
like the ones above, can be derived, where these models and formulas do apply and where not,
and what they teach us about the behavior of manufacturing operations.
2 Basics
In this chapter we briefly summarize some basic results, useful for studying probability models
(see also the Appendix in [6]).
Similarly we have
Σ_{i=0}^{∞} i² x^i = Σ_{i=0}^{∞} i(i − 1) x^i + Σ_{i=0}^{∞} i x^i = x² Σ_{i=0}^{∞} i(i − 1) x^{i−2} + x Σ_{i=0}^{∞} i x^{i−1} = 2x²/(1 − x)³ + x/(1 − x)² = (x² + x)/(1 − x)³.
A probability model consists of the following basic ingredients:
• The sample space S (often denoted by Ω) which is the set of all possible outcomes;
• Events are subsets of the possible outcomes in S;
• New events can be obtained by taking the union, intersection, complement (and so on) of
events.
The sample space S can be discrete or continuous, as shown in the following examples.
Example 3.1. Examples of the sample space:
• Flipping a coin, then the outcomes are Head (H) and Tail (T), so S = {H, T}.
• Rolling a die, then the outcomes are the number of points the die turns up with, so
S = {1, 2, . . . , 6}.
• Rolling a die twice, in which case the outcomes are the points of both dice, S = {(i, j ) |
i, j = 1, 2, . . . , 6}.
• Process time realizations on a machine, which can be any nonnegative (real) number, so
S = [0, ∞).
• Sometimes process times have a fixed off-set, say 5 (minutes), in which case S = [5, ∞).
The events are all subsets of the sample space S (though in case of a continuous sample space
S one should be careful and exclude “weird” subsets).
Example 3.2. Examples of events of the sample spaces mentioned in the previous example are:
• Flipping a coin, E = ∅ (i.e., the empty set), E = {H}, E = {T} and E = {H, T} = S (these
are all possible events).
• Rolling a die, E = {1, 2}, E = {1, 3, 5} (the odd outcomes).
• Rolling a die twice, E = {(1, 2), (3, 4), (5, 6)}, E = {(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6)}
(the first die turned up with 1).
• Process times, E = (0, 1), E = (1, ∞)
• Process times with offset, E = (10, 15)
• Number of failures, E = {3, 4, . . .}
• Throwing darts, E = {(x, y) | 0 ≤ x ≤ 14 , 0 ≤ y ≤ 14 } = [0, 14 ] × [0, 14 ].
The other ingredient of a probability model is, of course, probabilities. These are defined as
a function of events, and this function should obey the following elementary rules.
For each event E there is a number P(E) (the probability that event E occurs) such that:
• 0 ≤ P(E) ≤ 1;
• P(S) = 1;
• P(E1 ∪ E2 ∪ · · · ) = P(E1 ) + P(E2 ) + · · · for any sequence of mutually disjoint events E1 , E2 , . . .
Example 3.3.
Figure 3.2: Events E and F and their intersection EF
In case the sample space S is discrete, so S = {s1 , s2 , . . .}, then we can assign probabilities P(s)
to each s ∈ S, which should be between 0 and 1 and add up to 1. Then
P(E) = sum of the probabilities of the outcomes in the set E = Σ_{s∈E} P(s). (3.1)
Note that we are sloppy here as probability is defined on events, which are subsets of the sample
space, so really P(s) denotes P({s}). If S is a finite set, S = {s1 , s2 , . . . , sN }, and all outcomes
are equally likely, so P(si ) = 1/N, then (3.1) reduces to
P(E) = |E| / N
where |E| is the number of outcomes in the set E. An example is the experiment of rolling two
dice.
Based on the ingredients of a probability model, the following properties of probabilities can be
mathematically derived. Note that these properties all correspond to our intuition.
Property 3.1.
• If finitely many E1 , E2 , . . . , En are mutually disjoint (they have nothing in common), then
P(E1 ∪ E2 ∪ · · · ∪ En ) = P(E1 ) + P(E2 ) + · · · + P(En ).
• If the event Ec is the complement of E (all outcomes except the ones in E), so Ec = S \ E,
then
P(Ec ) = 1 − P(E).
• For the union of two events (so event E or F occurs),
P(E ∪ F) = P(E) + P(F) − P(EF),
where EF = E ∩ F is the intersection of both events (so event E and F occur), see Figure
3.2.
Figure 3.3: Randomly sampling points from the square [−1, 1] × [−1, 1]
It is remarkable that based on the simple ingredients of a probability model the frequency
interpretation of probabilities can be derived, i.e., the probability of event E can be estimated
as the fraction of times that E happened in a large number of (identical) experiments.
Property 3.2. (Law of large numbers)
If an experiment is repeated an unlimited number of times, and if the experiments are indepen-
dent of each other, then the fraction of times event E occurs converges with probability 1 to
P(E).
For example, if we flip a fair coin an unlimited number of times, then an outcome s of this
experiment is an infinite sequence of Heads and Tails, such as
s = (H, T, T, H, H, H, T, . . .).
Then, if Kn (s) is the number of Heads appearing in the first n flips of outcome s, we can
conclude that, according to the law of large numbers,
lim_{n→∞} Kn (s)/n = 1/2
with probability 1.
The method of computer simulation is based on this law.
Example 3.4. The number π can be estimated by the following experiment. We randomly
sample a point (x, y) from the square [−1, 1] × [−1, 1], see Fig. 3.3 and the experiment is
successful if (x, y) falls in the unit circle. Then by the law of large numbers:
(number of successful experiments)/(total number of experiments) ≈ (area of unit circle)/(area of square) = π/4.
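This is exactly how such probabilities are estimated on a computer. A minimal Python sketch (illustrative, not part of the notes) of the experiment above:

import random

def estimate_pi(n):
    # sample n points uniformly from the square [-1, 1] x [-1, 1] and count
    # how many of them fall inside the unit circle
    hits = sum(1 for _ in range(n)
               if random.uniform(-1, 1) ** 2 + random.uniform(-1, 1) ** 2 <= 1)
    return 4 * hits / n      # fraction of successes is approximately pi/4

print(estimate_pi(1_000_000))   # typically close to 3.14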
Exercise 1. (Problem 7.3 [6]) Four black socks and five white socks lie mixed up in a
drawer. You grab two socks at random from the drawer. What is the probability of having
grabbed one black sock and one white sock?
Exercise 2. (Problem 7.5 [6]) Two players A and B each roll one die. The absolute
difference of the outcomes is computed. Player A wins if the difference is 0, 1, or 2; otherwise,
player B wins. What is the probability that player A wins?
Exercise 3. (Problem 7.7 [6]) You have four mathematics books, three physics books
and two chemistry books. The books are put in random order on a bookshelf. What is the
probability of having the books ordered per subject on the bookshelf?
Exercise 4. (Problem 7.29 [6]) A small transport company has two vehicles, a truck and a
van. The truck is used 75% of the time. Both vehicles are used 30% of the time and neither
of the vehicles is used for 10% of the time. What is the probability that the van is used on
any given day?
Exercise 5. (Problem 7.33 [6]) You roll a fair die six times in a row. What is the
probability that all of the six face values will appear? What is the probability that one or more
sixes will appear?
P(E|F) = (1/36)/(1/6) = 1/6.
This corresponds to our intuition: knowing that the first roll is 4 does not tell us anything
about the outcome of the second roll.
• Given that one of the dice turned up with 6, what is the probability that the other one
turned up with 6 as well? Now F = {(6, j ) | j = 1, 2, . . . , 5}∪{(i, 6) | i = 1, 2, . . . , 5}∪{(6, 6)}
and EF = {(6, 6)}. Hence
P(E|F) = (1/36)/(11/36) = 1/11.
Example 3.6. Consider the experiment of throwing darts on a unit disk, so S = {(x, y) | x² + y² ≤ 1},
and P(E) is the area of E divided by the area of the unit disk, which is π. Given that the outcome is
in the right half of the unit disk, what is the probability that its distance to the origin is greater
than 1/2? For this conditional probability we get (see Figure 3.4)
P(distance of (x, y) to 0 > 1/2 | x > 0) = (π/2 − π/8)/(π/2) = 3/4.
Figure 3.4: Conditional probability that the distance is greater than 1/2, given that the outcome is in the right half of the unit disk
The formula for conditional probability can be rewritten in the intuitive form
P(EF) = P(E|F) P(F).
This form, which is also known as the product rule, is frequently used to calculate probabilities,
since in many situations, it simplifies calculations or the conditional probabilities are directly given.
Example 3.7. (Champions League [6]) Eight soccer teams reached the quarter finals, two teams
from each of the countries England, Germany, Italy and Spain. The matches are determined by
drawing lots, one by one. What is the probability that the two teams from the same country
play against each other in all four matches? This probability can be calculated by counting all
outcomes of the lottery. The total number of outcomes of the lottery, in which only teams from
the same country play against each other, is 2⁴ · 4! (for example, (E1 , E2 ; G2 , G1 ; I2 , I1 ; S1 , S2 ) is
a possible outcome). The total number of outcomes is 8!. Hence,
P(Only teams of the same country play against each other) = 2⁴ · 4! / 8!. (3.2)
It is, however, easier to use conditional probabilities. Imagine that the first lot is drawn. Then,
given that the first lot is drawn, the probability that the second lot is the other team of the
same country is 1/7. Then the third lot is drawn. Given that the first two lots are of the same
country, the probability that the fourth one is of the same country as the third one is 1/5, and
so on. This immediately gives
P(Only teams of the same country play against each other) = (1/7) · (1/5) · (1/3),
which is indeed the same answer as (3.2).
The special case that
P(E|F) = P(E),
means that knowledge about the occurrence of event F has no effect on the probability of E.
In other words, event E is independent of F. Substitution of P(E|F) = P(EF)/P(F) leads to the
conclusion that events E and F are independent if
P(EF) = P(E) P(F).
Consider the experiment of first rolling a fair die and then flipping a fair coin as many times as
the die shows, and let E be the event that no heads appears. To calculate P(E), we condition on
the number that turns up on the die. Let Fk be the event that k turns up. Clearly P(Fk ) = 1/6
and P(E|Fk ) = (1/2)^k . Hence,
P(E) = P(E|F1 ) P(F1 ) + P(E|F2 ) P(F2 ) + · · · + P(E|F6 ) P(F6 )
= (1/2) · (1/6) + (1/4) · (1/6) + · · · + (1/64) · (1/6)
= 0.1641.
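Results obtained by conditioning like this are easy to verify by simulation, in the spirit of [6]. A small illustrative Python sketch:

import random

def no_heads():
    k = random.randint(1, 6)                               # roll the die
    return all(random.random() < 0.5 for _ in range(k))    # all k flips come up tails

n = 100_000
print(sum(no_heads() for _ in range(n)) / n)   # close to 0.1641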
Example 3.12. (Batch production) Items are produced in batches on a machine, the size of which
ranges from 1 to n. The probability that the batch size is k is 1/n for k = 1, . . . , n. Immediately
after production, the quality of each item in the batch is checked, and with probability p the
quality of the item meets the threshold, and otherwise, it is scrapped. What is the probability that
all items in the batch meet the quality threshold? Let E be the event that none of the items
in the batch is scrapped, and let Fk be the event that the batch size is k. Then P(Fk ) = 1/n and
P(E|Fk ) = p^k . Hence,
P(E) = Σ_{k=1}^{n} P(E|Fk ) P(Fk ) = (1/n) Σ_{k=1}^{n} p^k = (p/n) · (1 − p^n)/(1 − p).
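As a quick numerical check (illustration only, with arbitrarily chosen values n = 5 and p = 0.9), the closed formula agrees with the direct sum over the batch sizes:

n, p = 5, 0.9
direct = sum(p ** k * (1 / n) for k in range(1, n + 1))   # sum of P(E|Fk) P(Fk)
formula = (p / n) * (1 - p ** n) / (1 - p)                # closed form derived above
print(direct, formula)   # both print approximately 0.7371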
Example 3.13. (Tour de France [6]) The Tour de France will take place from July 1 through
July 23 with 180 cyclists participating. What is the probability that none of them will have the
same birthday during the tournament? Let E be the event that none of them have the
same birthday. To calculate P(E) we first condition on the number of cyclists having their birthday
during the tournament. Let Fk be the event that exactly k of them have their birthday during the
tournament. Then F0 , . . . , F180 are disjoint events. To calculate P(Fk ), note that you can
choose \binom{180}{k} different groups of size k from the 180 cyclists, and the probability that each one in
the group has his birthday during the tournament of 23 days is (23/365)^k , while the probability that
the other 180 − k cyclists do not have their birthday during the tournament is ((365 − 23)/365)^{180−k} .
Hence,
P(Fk ) = \binom{180}{k} (23/365)^k ((365 − 23)/365)^{180−k} , k = 0, 1, . . . , 180.
Clearly P(E|F0 ) = 1, and
P(E|Fk ) = (23/23) · (22/23) · · · ((23 − k + 1)/23), k = 1, . . . , 23,
and P(E|Fk ) = 0 for k ≥ 24. Hence
P(E) = Σ_{k=0}^{180} P(E|Fk ) P(Fk ) = Σ_{k=0}^{23} P(E|Fk ) P(Fk ) = 0.8841.
In some applications, the probabilities P(E), P(F) and P(E|F) are given, while we are interested
in P(F|E). This probability can then be calculated as follows
P(F|E) = P(EF)/P(E) = P(E|F) P(F)/P(E).
This is also known as Bayes’ rule (see Section 8.3 in [6]).
Example 3.14. The reliability of a test for a certain disease is 99%. This means that the
probability that the outcome of the test is positive for a patient suffering from the disease, is
99%, and it is negative with probability 99% for a patient free from this disease. It is known
that 0.1% of the population suffers from this disease. Suppose that, for a certain patient, the
test is positive. What is the probability that this patient indeed suffers from this disease? Let E
be the event that the test is positive, and F is the event that the patient has the disease. Then
we need to calculate P(F|E). It is given that P(E|F) = P(Ec |Fc ) = 0.99, P(F) = 0.001 and thus
P(E) = P(E|F)P(F) + P(E|Fc )P(Fc ) = 0.99 · 0.001 + 0.01 · 0.999 = 0.01098. Hence
P(F|E) = P(E|F)P(F)/P(E) = (0.99 · 0.001)/0.01098 = 0.09.
Note that this number is much smaller than might have been suggested by the reliability of the
test!
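The numbers in this example are reproduced by a few lines of Python (illustrative sketch):

p_F = 0.001                 # P(F): patient has the disease
p_E_given_F = 0.99          # P(E|F): test positive given disease
p_E_given_notF = 0.01       # P(E|F^c): test positive given no disease

p_E = p_E_given_F * p_F + p_E_given_notF * (1 - p_F)   # P(E) by conditioning
p_F_given_E = p_E_given_F * p_F / p_E                  # Bayes' rule
print(p_E, p_F_given_E)     # 0.01098 and approximately 0.09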
Example 3.15. (System availability) Consider a system composed of n components, where the
status of each component (up or down) is independent of the other components. Component i
is either up with probability qi or down with probability 1 − qi . An important aspect of the system
design is the availability A, which is the probability that the system works. The availability A
depends on system configuration. In the serial system of Figure 3.6 all components have to be
up for the system to work (so each component is critical). Hence the availability is calculated as
A = P(all components are up) = P(component 1 is up) · · · P(component n is up) = q1 q2 · · · qn = Π_{i=1}^{n} qi
In a parallel (or redundant) system (see Figure 3.7) at least one component has to work for
the system to work. So in this case, the system availability A is calculated as
A = P(at least one component is up) = 1 − P(all components are down) = 1 − Π_{i=1}^{n} (1 − qi )
A combination of both systems is also possible, as shown in Figure 3.8. This is a serial system
of n critical components where component i is a parallel system of ki parts. The jth part of
component i is up with probability qi,j . So the probability that component i works is given by
Ai = 1 − Π_{j=1}^{ki} (1 − qi,j )
Figure 3.8: Serial system of n components, where component i consists of ki parallel parts
The availability of the whole system is then A = A1 A2 · · · An .
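The three availability formulas translate directly into Python; the sketch below (with hypothetical component probabilities, purely for illustration) computes the availability of a serial, a parallel and a combined system.

def serial(qs):
    # all components must be up: A = q1 * q2 * ... * qn
    A = 1.0
    for q in qs:
        A *= q
    return A

def parallel(qs):
    # at least one component must be up: A = 1 - (1 - q1) * ... * (1 - qn)
    down = 1.0
    for q in qs:
        down *= 1 - q
    return 1 - down

def combined(parts):
    # serial system in which component i is a parallel system of the parts in parts[i]
    return serial([parallel(qs) for qs in parts])

print(serial([0.9, 0.95, 0.99]))               # about 0.846
print(parallel([0.9, 0.9]))                    # 0.99
print(combined([[0.9, 0.9], [0.8], [0.95]]))   # about 0.752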
Exercise 6. (Problem 8.3 [6]) Every evening, two weather stations issue a weather forecast for the
next day. The weather forecasts of the two stations are independent of each other. On average,
the weather forecast of station 1 is correct in 90% of the cases, irrespective of the weather
type. This percentage is 80% for station 2. On a given day, station 1 predicts sunny weather
for the next day, whereas station 2 predicts rain. What is the probability that the weather
forecast of station 1 will be correct?
Exercise 7. (Problem 8.5 [6]) You simultaneously grab two balls at random from an urn
containing two red balls, one blue ball and one green ball. What is the probability that you
have grabbed two non-red balls given that you have grabbed at least one non-red ball? What
is the probability that you have grabbed two non-red balls given that you have grabbed the
green ball? Can you give an intuitive explanation of why the second probability is larger than
the first one?
Exercise 8. (Problem 8.17 [6]) Two fair coins are tossed. Let A be the event that heads
appears on the first coin and let B be the event that the coins display the same outcome. Are
the events A and B independent?
Exercise 9. (Problem 8.18 [6]) You have two identical boxes in front of you. One of
the boxes contains 10 balls numbered 1 to 10 and the other box contains 25 balls numbered
1 to 25. You choose at random one of the boxes and pick a ball at random from the chosen
box. What is the probability of picking the ball with number 7 written on it?
Exercise 10. (Problem 8.19 [6]) A bag contains three coins. One coin is two-headed and
the other two are normal. A coin is chosen at random from the bag and is tossed twice. What
is the probability of getting two heads? If two heads appear, what is the inverse probability that
the two-headed coin was chosen?
Exercise 11. Consider a serial system consisting of 3 critical components, numbered 1, 2 and
3 (see Figure 3.9). The probability that component 1 works is 0.6. Component 2 works with
probability 0.5 and component 3 works with probability 0.7. Component 2 has two copies, and
at least one of them has to work for the system to work. Denote by A the availability of the
system, that is, A is the probability that the system works.
1. Determine the availability A of the system in Figure 3.9.
Figure 3.9: Serial system consisting of components 1, 2 and 3, where component 2 has 2
copies
2. Suppose that one copy of one of the components may be added. To which component
should a copy be added so as to maximize the system availability?
3.3 Discrete random variables
A discrete random variable X can only take (possibly infinitely many) discrete values, say
x1 , x2 , . . ., and the function
pj = P(X = xj ), j = 1, 2, . . . ,
is called the probability mass function or probability distribution of X.
Example 3.17.
• If Y = max(i, j ) where i is the first and j the second roll of a die, then
P(Y = k) = (2k − 1)/36, k = 1, . . . , 6.
• If N is the number of independent trials until the first success, each trial being successful with probability p, then P(N = n) = (1 − p)^{n−1} p, n = 1, 2, . . .
For a random variable X with probability mass function pj = P(X = xj ), j = 1, 2, . . ., its expected
value (or expectation or first moment or “centre of probability mass”) is defined as
E [X] = Σ_{j=1}^{∞} xj pj ,
where we assume that the infinite sum exists. So the expected value of X is a weighted
average of possible values of X. It is not the same, however, as the most probable value, nor
is it restricted to possible values of X.
Example 3.18. Let X be the number of points rolled with a fair die. Then E [X] = (1 + 2 + · · · + 6)/6 = 3.5. Let N be the number of flips of a fair coin until the first Head. Then P(N = i) = 1/2^i for i = 1, 2, . . ., and E [N] = Σ_{i=1}^{∞} i/2^i = 2.
For example, by repeatedly rolling a die, the average value of the numbers that turn up, gets
closer and closer to 3.5 as the number of rolls increases. This is the law of large numbers for
the expected value. More generally, if Xk is the outcome of the kth repetition of an experiment,
then the average (1/n)(X1 + · · · + Xn ) over the first n repetitions converges with probability 1 to
E(X). Hence, the expected value E(X) can be interpreted as the long run average of X.
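This interpretation is easily illustrated with a short simulation (a sketch, not part of the notes):

import random

n = 100_000
rolls = [random.randint(1, 6) for _ in range(n)]
print(sum(rolls) / n)   # the long run average is close to E[X] = 3.5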
Example 3.19. Let Y be the total number that shows up by rolling a die twice. Then
E [Y] = Σ_{i=1}^{6} Σ_{j=1}^{6} (i + j ) · (1/36) = 7 = 2 × 3.5,
so E [Y] is two times the expected value of a single roll. This is no coincidence, since by writing
Y = X1 + X2 where X1 is first roll and X2 is second roll, we get
E [Y] = Σ_{i=1}^{6} Σ_{j=1}^{6} (i + j ) P(X1 = i, X2 = j )
= Σ_{i=1}^{6} Σ_{j=1}^{6} i P(X1 = i, X2 = j ) + Σ_{i=1}^{6} Σ_{j=1}^{6} j P(X1 = i, X2 = j )
= Σ_{i=1}^{6} i Σ_{j=1}^{6} P(X1 = i, X2 = j ) + Σ_{j=1}^{6} j Σ_{i=1}^{6} P(X1 = i, X2 = j )
= Σ_{i=1}^{6} i P(X1 = i) + Σ_{j=1}^{6} j P(X2 = j )
= E [X1 ] + E [X2 ] = 7.
Example 3.19 shows that E [X1 + X2 ] = E [X1 ] + E [X2 ] where X1 is the first roll of a die and
X2 the second one. This property is true in general. For any two random variables X and Y
we have
E [X + Y] = Σ_{i=1}^{∞} Σ_{j=1}^{∞} (xi + yj ) P(X = xi , Y = yj )
= Σ_{i=1}^{∞} Σ_{j=1}^{∞} xi P(X = xi , Y = yj ) + Σ_{i=1}^{∞} Σ_{j=1}^{∞} yj P(X = xi , Y = yj )
= Σ_{i=1}^{∞} xi Σ_{j=1}^{∞} P(X = xi , Y = yj ) + Σ_{j=1}^{∞} yj Σ_{i=1}^{∞} P(X = xi , Y = yj )
= Σ_{i=1}^{∞} xi P(X = xi ) + Σ_{j=1}^{∞} yj P(Y = yj ),
so
E [X + Y] = E [X] + E [Y] .
Hence the expectation of the sum is the sum of the expectations. More generally, for any number
of random variables X1 , X2 , . . . , Xn ,
E [X1 + · · · + Xn ] = E [X1 ] + · · · + E [Xn ] .
Example 3.20.
• (Tour de France [6]) What is the expected number of joint birthdays during the tour-
nament? Let Xi be 1 if there is a joint birthday on day i of the tournament, and 0
otherwise. If none or exactly one of the cyclists has his birthday on day i, then day i is
not a joint birthday, so Xi = 0. Hence,
P(Xi = 0) = 1 − P(Xi = 1) = (364/365)^{180} + 180 · (1/365) · (364/365)^{179} = 0.912,
so E [Xi ] = 0 · P(Xi = 0) + 1 · P(Xi = 1) = P(Xi = 1), and the expected number of joint
birthdays is
E [X1 ] + E [X2 ] + · · · + E [X23 ] = 23(1 − 0.912) = 2.02.
• (Tall children [6]) Suppose that n children of different heights are placed in line at random.
You start with the first child and then walk till the end of the line. Whenever you encounter a
child taller than all children seen so far, she joins you. Let X be the number of children that
join you. What is E(X)? Let Xi be 1 if child i in line joins you, and 0 otherwise. Then X = X1 + · · · + Xn .
Child i joins you if she is the tallest among the first i children in line. Since the first i children are
ordered at random, the probability that the tallest is at place i is 1/i. Hence
P(Xi = 1) = 1 − P(Xi = 0) = 1/i,
and E [Xi ] = 0 · P(Xi = 0) + 1 · P(Xi = 1) = 1/i, thus
E [X] = E [X1 ] + E [X2 ] + · · · + E [Xn ] = 1 + 1/2 + · · · + 1/n.
A convenient property is that for any function g(x), the expectation of the random variable
Y = g(X) can be directly calculated as
E [Y] = E [g(X)] = Σ_{xi} g(xi ) P(X = xi ) = Σ_{xi} g(xi ) pi .
Example 3.21. Let N be the number of flips of a fair coin until the first Head. Then
P(N = i) = 1/2^i and E(N) = 2 (see Example 3.18). For g(x) = x², we get
E [g(N)] = E [N²] = Σ_{i=1}^{∞} i² · (1/2^i ) = 6.
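The value 6 also follows from the standard series of Chapter 2: taking x = 1/2 in Σ_{i} i² x^i = (x² + x)/(1 − x)³ gives (1/4 + 1/2)/(1/8) = 6. A truncated sum confirms this numerically (illustrative check):

approx = sum(i ** 2 * 0.5 ** i for i in range(1, 200))   # truncation of the infinite sum
print(approx)   # very close to 6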
In general, E [g(X)] ≠ g(E [X]) (for the example above, we have E [N²] = 6 and (E [N])² =
2² = 4). However, linear functions g(x) = ax + b are an exception, since for any constants a
and b,
E [aX + b] = Σ_{xi} (a xi + b) P(X = xi ) = a Σ_{xi} xi P(X = xi ) + b Σ_{xi} P(X = xi ) = a E [X] + b. (3.4)
An important measure for the spread of the possible values of X is the variance, defined as the
expected squared deviation from the mean,
Var [X] = E [(X − E(X))²] .
In many situations it is, however, more convenient to consider the standard deviation of X,
which is the square root of the variance,
σ[X] = √(Var [X]).
This quantity has the same units as E(X). The variance Var [X] can be calculated as
Var [X] = E [(X − E [X])²]
= Σ_{xi} (xi − E [X])² P(X = xi )
= Σ_{xi} (xi² − 2 E [X] xi + (E [X])²) P(X = xi )
= Σ_{xi} xi² P(X = xi ) − 2 E [X] Σ_{xi} xi P(X = xi ) + (E [X])² Σ_{xi} P(X = xi )
= E [X²] − 2 E [X] E [X] + (E [X])²
= E [X²] − (E [X])²
and for Y = aX + b we then get
Var [aX + b] = Σ_{j=1}^{∞} (a xj + b − (a E [X] + b))² pj = a² Σ_{j=1}^{∞} (xj − E [X])² pj = a² Var [X] .
Example 3.22. Let N be the number of flips of a fair coin until the first head. Then
P(N = i) = 1/2^i , E(N) = 2 and E(N²) = 6 (see Example 3.21), so
Var [N] = E [N²] − (E [N])² = 6 − 4 = 2.
For the number of tails that appear before the first head, which is N − 1, we get
Var [N − 1] = Var [N] = 2.
We have seen that E [X + Y] = E [X] + E [Y] for any two random variables X and Y. However,
usually Var [X + Y] is not equal to Var [X] + Var [Y], though it is true for independent random
variables X and Y. This is shown below.
The random variables X and Y are independent if for all x, y, the events {X ≤ x} and {Y ≤ y}
are independent, thus
P(X ≤ x, Y ≤ y) = P(X ≤ x) P(Y ≤ y)
or equivalently, when X and Y are discrete random variables,
P(X = x, Y = y) = P(X = x) P(Y = y).
It is clear that when X and Y are independent, then so are the random variables f (X) and g(Y)
for any functions f (x) and g(y). Further, if X and Y are independent, then
E [XY] = Σ_{i=1}^{∞} Σ_{j=1}^{∞} xi yj P(X = xi , Y = yj )
= Σ_{i=1}^{∞} Σ_{j=1}^{∞} xi yj P(X = xi ) P(Y = yj )
= Σ_{i=1}^{∞} xi P(X = xi ) · Σ_{j=1}^{∞} yj P(Y = yj )
= E [X] E [Y] ,
and as a consequence,
Var [X + Y] = E [(X + Y)²] − (E [X] + E [Y])²
= E [X² + 2XY + Y²] − ((E [X])² + 2 E [X] E [Y] + (E [Y])²)
= E [X²] + 2 E [XY] + E [Y²] − ((E [X])² + 2 E [X] E [Y] + (E [Y])²)
= E [X²] + 2 E [X] E [Y] + E [Y²] − ((E [X])² + 2 E [X] E [Y] + (E [Y])²)
= E [X²] − (E [X])² + E [Y²] − (E [Y])²
= Var [X] + Var [Y] .
So the variance of the sum is the sum of the variances, provided the random variables are
independent!
Suppose that the random variables X1 , . . . , Xn have the same probability distribution, so in
particular
E [X1 ] = · · · = E [Xn ] , Var [X1 ] = · · · = Var [Xn ] .
For the mean of the sum X1 + · · · + Xn we get
E [X1 + · · · + Xn ] = E [X1 ] + · · · + E [Xn ] = n E [X1 ] ,
Bernoulli
A Bernoulli random variable X with success probability p, takes the values 0 and 1 with
probability
P(X = 1) = 1 − P(X = 0) = p.
Then
E [X] = 0 · P(X = 0) + 1 · P(X = 1) = p,
E [X²] = 0² · P(X = 0) + 1² · P(X = 1) = p,
and
Var [X] = E [X²] − (E [X])² = p − p² = p(1 − p).
Binomial
A Binomial random variable X is the number of successes in n independent Bernoulli trials
X1 , . . . , Xn , each with probability p of success,
P(X = k) = \binom{n}{k} p^k (1 − p)^{n−k} , k = 0, 1, . . . , n.
The Binomial distribution is shown in Figure 3.11 for n = 20 and p = 0.25, 0.5, 0.65
(which graph corresponds to which parameter p?). Since X = X1 + · · · + Xn , we get
E [X] = E [X1 ] + · · · + E [Xn ] = np, Var [X] = Var [X1 ] + · · · + Var [Xn ] = np(1 − p).
Figure 3.11: Binomial probability distribution with parameters n = 20 and p = 0.25, 0.5, 0.65
Poisson
A Poisson random variable X with parameter λ > 0, has probability distribution
P(X = k) = e^{−λ} λ^k / k! , k = 0, 1, 2, . . .
The Poisson distribution is shown in Figure 3.12 for λ = 1, 5, 10 (which graph corresponds
to which parameter λ?).
Then
E [X] = Σ_{k=0}^{∞} k P(X = k) = Σ_{k=1}^{∞} k e^{−λ} λ^k / k! = λ e^{−λ} Σ_{k=1}^{∞} λ^{k−1}/(k − 1)! = λ e^{−λ} e^{λ} = λ
and accordingly,
E [X(X − 1)] = Σ_{k=0}^{∞} k(k − 1) P(X = k) = Σ_{k=2}^{∞} k(k − 1) e^{−λ} λ^k / k! = λ² e^{−λ} Σ_{k=2}^{∞} λ^{k−2}/(k − 2)! = λ² .
Hence, since E [X(X − 1)] = E [X²] − E [X], we get
E [X²] = λ² + λ
and thus
Var [X] = E [X²] − (E [X])² = λ.
So, for a Poisson random variable, the variance is equal to the mean.
Figure 3.12: Poisson distribution with parameter λ = 1, 5, 10
Hypergeometric
A Hypergeometric random variable X is the number of red balls in a random selection of
n balls taken from an urn with R red balls and W white balls (so n ≤ R + W),
P(X = r ) = \binom{R}{r} \binom{W}{n−r} / \binom{R+W}{n} , r = 0, 1, . . . , n.
Geometric
A Geometric random variable X is the number of Bernoulli trials till the first success, each
trial with probability p of success, so
P(X = k) = (1 − p)^{k−1} p, k = 1, 2, . . .
The Geometric distribution is shown in Figure 3.13 for p = 0.9, 0.5, 0.2 (which graph
corresponds to which success probability p?).
Then
E [X] = Σ_{k=1}^{∞} k (1 − p)^{k−1} p = p Σ_{k=1}^{∞} k (1 − p)^{k−1} = p · 1/p² = 1/p,
and accordingly,
E [X(X − 1)] = Σ_{k=1}^{∞} k(k − 1)(1 − p)^{k−1} p = p(1 − p) Σ_{k=2}^{∞} k(k − 1)(1 − p)^{k−2} = p(1 − p) · 2/p³ = 2(1 − p)/p² .
Hence, since E [X(X − 1)] = E [X²] − E [X], we obtain
Var [X] = E [X²] − (E [X])² = 2(1 − p)/p² + 1/p − 1/p² = (1 − p)/p² .
Figure 3.13: Geometric distribution with success probability p = 0.9, 0.5, 0.2
Negative binomial
A Negative binomial random variable X is the number of Bernoulli trials till the rth success,
each trial with probability p of success,
P(X = k) = \binom{k−1}{r−1} (1 − p)^{k−r} p^r , k = r, r + 1, . . .
We can write X = X1 + · · · + Xr , where the Xi are independent and Geometric with success
probability p, so
E [X] = E [X1 ] + . . . + E [Xr ] = r/p, Var [X] = Var [X1 ] + . . . + Var [Xr ] = r (1 − p)/p² .
Remark 3.1. (Relation between Binomial and Poisson) The total number of successes X in n
independent Bernoulli trials, each with success probability p, is Binomial distributed, so
P(X = k) = \binom{n}{k} p^k (1 − p)^{n−k} , k = 0, 1, . . . , n.
If the number of trials n is large and the success probability p small, then
P(X = k) = \binom{n}{k} p^k (1 − p)^{n−k} ≈ e^{−np} (np)^k / k! .
Hence, for large n and small p, the Binomial distribution of X is approximately equal to the
Poisson distribution with parameter λ = np. This is demonstrated in Figure 3.14.
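The quality of this approximation, as illustrated in Figure 3.14, can be checked with a few lines of Python (illustrative sketch using the values of the figure, n = 50 and p = 1/10, so np = 5):

from math import comb, exp, factorial

n, p = 50, 0.1
lam = n * p
for k in range(11):
    binomial = comb(n, k) * p ** k * (1 - p) ** (n - k)
    poisson = exp(-lam) * lam ** k / factorial(k)
    print(k, round(binomial, 4), round(poisson, 4))
# the two columns are close; for k = 5, for instance, both are about 0.18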
Example 3.24. (k out of n system) Consider a technical system composed of n identical
components, and q is the probability that a component works (independent of the others), see
Figure 3.15.
At least k components have to work for the system to work. What is the probability that the
system works? Let Xi indicate whether component i works or not, so P(Xi = 1) = 1 − P(Xi =
0) = q, and let X be the total number of components that work, X = X1 + · · · + Xn . Then X
is Binomial with parameters n and q, and the probability Q that the system works is equal to
Q = P(X ≥ k) = Σ_{i=k}^{n} P(X = i) = Σ_{i=k}^{n} \binom{n}{i} q^i (1 − q)^{n−i} .
Figure 3.14: Binomial distribution with n = 10 and p = 1/2 (black), Binomial distribution with n = 50 and p = 1/10 (blue) and Poisson distribution with λ = 5 (red)
Example 3.25. (Coincidence) Two people, complete strangers to one another, both living
in Eindhoven, meet each other in the train. Each has approximately 200 acquaintances in
Eindhoven. What is the probability that these two strangers have an acquaintance in common?
Eindhoven has (approximately) 200000 inhabitants. To find the desired probability, imagine that
the acquaintances of the first stranger are colored red, and all other inhabitants of Eindhoven are
colored white. Assuming that the acquaintances of the second stranger are randomly distributed
among the population of Eindhoven, then the question can be translated into: What is the
probability by randomly selecting 200 inhabitants that at least one of them is red? Let X
denote the number of red inhabitants found in this random selection. Then X is Hypergeometric
with parameters n = R = 200 and W = 200000 − 200. So
P(X ≥ 1) = 1 − P(X = 0) = 1 − \binom{200000 − 200}{200} / \binom{200000}{200} = 1 − 0.818 = 0.182.
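The probability in this example can be evaluated exactly with a few lines of Python (illustrative); P(X = 0) is computed as a product of ratios to avoid very large binomial coefficients.

population, acquaintances = 200_000, 200

# P(X = 0) = \binom{199800}{200} / \binom{200000}{200}, written as a product of ratios
p_none = 1.0
for i in range(acquaintances):
    p_none *= (population - acquaintances - i) / (population - i)

print(1 - p_none)   # approximately 0.182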
Exercise 12. (Problem 9.3 [6]) A bag contains three coins. One coin is two-headed and
the other two are normal. A coin is chosen at random from the bag and is tossed twice. Let
the random variable X denote the number of heads obtained. What is the probability mass
function of X?
Exercise 13. (Problem 9.7 [6]) You throw darts at a circular target on which two concentric
circles of radius 1 cm and 3 cm are drawn. The target itself has a radius of 5 cm. You receive
15 points for hitting the target inside the smaller circle, 8 points for hitting the middle annular
region, and 5 points for hitting the outer annular region. The probability of hitting the target
at all is 0.75. If the dart hits the target, the hitting point is a completely random point on the
target. Let the random variable X denote the number of points scored on a single throw of
the dart. What is the expected value of X?
Exercise 14. (Problem 9.15 [6]) What is the expected number of distinct birthdays within
a randomly formed group of 100 persons? What is the expected number of children in a class
with r children sharing a birthday with some child in another class with s children? Assume that
the year has 365 days and that all possible birthdays are equally likely.
Exercise 15. (Problem 9.17 [6]) What is the expected number of times that two con-
secutive numbers will show up in a lotto drawing of six different numbers from the numbers
1, 2, . . . , 45?
Exercise 16. (Problem 9.33 [6]) In the final of the World Series Baseball, two teams play
a series consisting of at most seven games until one of the two teams has won four games.
Two unevenly matched teams are pitted against each other and the probability that the weaker
team will win any given game is equal to 0.45. Assuming that the results of the various games
are independent of each other, calculate the probability of the weaker team winning the final.
What are the expected value and the standard deviation of the number of games the final will
take?
Exercise 17. (Problem 9.42 [6]) In European roulette the ball lands on one of the numbers
0, 1, . . . , 36 in every spin of the wheel. A gambler offers at even odds the bet that the house
number 0 will come up once in every 25 spins of the wheel. What is the gambler’s expected
profit per dollar bet?
Exercise 18. (Problem 9.44 [6]) In the Lotto 6/45 six different numbers are drawn at
random from the numbers 1, 2, . . . , 45. What are the probability mass functions of the largest
number drawn and the smallest number drawn?
Exercise 19. (Problem 9.46 [6]) A fair coin is tossed until heads appears for the third
time. Let the random variable X be the number of tails shown up to that point. What is the
probability mass function of X? What are the expected value and standard deviation of X?
Exercise 20. (Homework exercises 1) Consider a bin with 10 balls: 5 are red, 3 are green
and 2 are blue. You randomly pick 3 balls from the bin. What is the probability that each of
these balls has a different color?
Exercise 21. (Homework exercises 2) Jan and Kees are in a group of 12 people that are
seated randomly at a round table (with 12 seats). What is the probability that Jan and Kees
are seated next to each other?
Exercise 22. A machine tool (as shown in Figure 3.16) has 50 different cutting tools available
for its machining operations. These cutting tools are stored in a tool carousel (see Figure 3.17),
from which they can be automatically loaded on and removed from the machine tool. The
machine can hold 4 cutting tools. Consider a work part that needs 3 cutting tools for its
machining operations and assume that the current tool configuration of the machine is arbitrary
(i.e., any configuration of 4 tools from the available 50 tools is equally likely).
1. What is the probability that none of the 3 required tools is loaded on the machine?
2. What is the probability that exactly one of the 3 required tools is loaded on the machine?
3. What is the expected number of the 3 required tools that is loaded on the machine?
Exercise 23. Batches of printed circuit boards (PCBs) are tested on their functionality. With
probability 1/10, a PCB fails the test, independent of the other PCBs in the batch. The batch
size N is equal to 10, 11 or 12 with equal probabilities (so P(N = 10) = P(N = 11) = P(N = 12) = 1/3).
1. What is the probability that in a batch of size N = 10 at least 2 PCBs fail the test?
2. What is the probability that in an arbitrary batch at least 2 PCBs fail the test?
the function f (x) is called the probability density of X. The probability that X takes a value in
the interval (a, b] can then be calculated as
P(a < X ≤ b) = P(X ≤ b) − P(X ≤ a ) = ∫_{−∞}^{b} f (x)dx − ∫_{−∞}^{a} f (x)dx = ∫_{a}^{b} f (x)dx.
The function f (x) can be interpreted as the density of the probability mass, that is, we have for
small ∆ > 0 that
P(x < X ≤ x + ∆) = ∫_{x}^{x+∆} f (y)dy ≈ f (x)∆,
from which we can conclude (or from (3.5)) that f (x) is the derivative of the distribution
function F(x),
f (x) = (d/dx) F(x).
Example 3.27. (Uniform) The Uniform random variable X on the interval [a, b] has density
f (x) = 1/(b − a ) for a ≤ x ≤ b, and f (x) = 0 else,
and the distribution function is given by
F(x) = 0 for x < a, F(x) = (x − a )/(b − a ) for a ≤ x ≤ b, F(x) = 1 for x > b.
A simple recipe for finding the density is to first determine the distribution F(x) = P(X ≤ x)
(which is in many cases easier) and then differentiate F(x).
Example 3.28.
• (Example 10.1 in [6]) Break a stick of length 1 at a random point in two pieces and let X
be the ratio of the shorter piece and the longer piece (so 0 < X < 1). What is F(x) = P(X ≤ x)
for 0 < x < 1? Let U be the point where the stick is broken. Then X = U/(1 − U) if U < 1/2
and X = (1 − U)/U otherwise. Hence the event X ≤ x occurs for all U satisfying 0 ≤ U ≤ x/(1 + x)
or 1/(1 + x) ≤ U ≤ 1 (see Figure 3.18), so
F(x) = P(X ≤ x) = P(U ≤ x/(1 + x)) + P(U ≥ 1/(1 + x)) = x/(1 + x) + 1 − 1/(1 + x) = 2x/(1 + x), 0 < x < 1.
The density of X is obtained by taking the derivative of F(x), yielding
f (x) = (d/dx) F(x) = 2/(1 + x)² , 0 < x < 1.
Figure 3.18: Random variable X = U/(1 − U) if U < 1/2 and X = (1 − U)/U if U > 1/2, where U is a random point on a stick of unit length
• Let X = − ln(U)/λ where U is random on (0, 1). What is the density of X? Clearly
X ≥ 0, so F(x) = 0 for x < 0. For x ≥ 0 we get
F(x) = P(X ≤ x) = P(− ln(U)/λ ≤ x) = P(ln(U) ≥ −λx) = P(U ≥ e^{−λx} ) = 1 − e^{−λx} .
Hence,
f (x) = (d/dx) F(x) = λe^{−λx} , x ≥ 0,
and f (x) = 0 for x < 0. This distribution is called the Exponential distribution; a small simulation
check of this construction is given after this example.
• The random variable X is the distance to the origin 0 of a random point in a disk of
radius r. What is the density of X? Clearly 0 ≤ X ≤ r, so F(x) = 0 for x < 0 and F(x) = 1
for x > r. In case 0 ≤ x ≤ r we get
F(x) = P(X ≤ x) = πx²/(πr²) = x²/r² .
Hence,
f (x) = (d/dx) F(x) = 2x/r² , 0 ≤ x ≤ r,
and f (x) = 0 otherwise.
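A quick simulation check of the construction X = − ln(U)/λ in the second bullet (an illustrative sketch, with λ = 2 chosen arbitrarily):

import random
from math import exp, log

lam = 2.0
n = 100_000
# 1 - random.random() is uniform on (0, 1], so the logarithm is always defined
samples = [-log(1.0 - random.random()) / lam for _ in range(n)]

print(sum(samples) / n)                    # close to the mean 1/lam = 0.5
print(sum(x <= 1 for x in samples) / n)    # close to F(1) = 1 - exp(-lam)
print(1 - exp(-lam))                       # 0.8647, for comparison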
For a continuous random variable X with density f (x), its expected value (or expectation or first
moment) is defined as
E [X] = ∫_{−∞}^{∞} x f (x)dx
where we assume that the integral exists, and for any function g(x), the expectation of Y = g(X)
can be calculated as
E [Y] = E [g(X)] = ∫_{−∞}^{∞} g(x) f (x)dx.
The variance of X is
Var [X] = E [(X − E(X))²] = ∫_{−∞}^{∞} (x − E [X])² f (x)dx,
Example 3.29. The random variable X is the distance to the origin 0 of a random point in a
disk of radius r. Then
E [X] = ∫_{0}^{r} x f (x)dx = ∫_{0}^{r} x · 2x/r² dx = (2/3) r
and
E [X²] = ∫_{0}^{r} x² f (x)dx = (1/2) r² .
Hence
Var [X] = E [X²] − (E [X])² = (1/18) r² .
Below we list some important continuous random variables.
Uniform
A Uniform random variable X on [a, b] has density
f (x) = 1/(b − a ), a < x < b,
and f (x) = 0 otherwise. Then
P(X ≤ x) = (x − a )/(b − a ), a < x < b, E [X] = (a + b)/2, Var [X] = (b − a )²/12.
Triangular
A Triangular random variable X on the interval [a, b] has density
f (x) = h (x − a )/(m − a ) for a ≤ x ≤ m, f (x) = h (b − x)/(b − m) for m ≤ x ≤ b,
and f (x) = 0 otherwise, see Figure 3.19. The height h follows from the requirement that
∫_{a}^{b} f (x)dx = (b − a )h/2 = 1, yielding h = 2/(b − a ). Then
E [X] = (a + b + m)/3, Var [X] = (a² + b² + m² − ab − am − bm)/18.
Exponential
An Exponential random variable X with parameter (or rate) λ > 0 has density
f (x) = λe−λx , x > 0,
and f (x) = 0 otherwise. Then
P(X ≤ x) = 1 − e^{−λx} , x > 0, E [X] = 1/λ, Var [X] = 1/λ² .
36 Probability Models
Erlang
An Erlang random variable X with parameters n and λ, is the sum of n independent
Exponential random variables X1 , . . . , Xn , each with parameter λ. Its density is given by
f (x) = λe^{−λx} (λx)^{n−1} /(n − 1)! , x > 0.
Then
E [X] = E [X1 ] + · · · + E [Xn ] = n/λ, Var [X] = Var [X1 ] + · · · + Var [Xn ] = n/λ² .
Gamma
A Gamma random variable X with parameters α > 0 and λ > 0, has density
f (x) = (1/Γ(α)) λ^α x^{α−1} e^{−λx} , x > 0.
Then
E [X] = α/λ, Var [X] = α/λ² .
Normal
A Normal random variable X with parameters μ and σ > 0, has density
f (x) = (1/(σ√(2π))) e^{−(x−μ)²/(2σ²)} , −∞ < x < ∞.
Then
E(X) = μ, Var(X) = σ² .
The density f (x) is denoted as the N(μ, σ²) density.
Standard Normal
A Standard Normal random variable X is a Normal random variable with mean μ = 0 and
standard deviation σ = 1. So it has the N(0, 1) density,
f (x) = φ(x) = (1/√(2π)) e^{−x²/2}
and
P(X ≤ x) = Φ(x) = (1/√(2π)) ∫_{−∞}^{x} e^{−y²/2} dy.
Thinking of X as the lifetime of a component, then this property states that, if at time s
the component is still working, then its remaining lifetime is again Exponential with exactly
the same mean as a new component. In other words, used is as good as new! This
memoryless property is important in many applications and can be straightforwardly derived
from the definition of conditional probabilities,
P(X > t + s|X > s) = P(X > t + s, X > s)/P(X > s) = P(X > t + s)/P(X > s) = e^{−λ(t+s)} /e^{−λs} = e^{−λt} .
A related property concerns the minimum of two independent Exponential random variables X1 and X2
with rates λ1 and λ2 : since P(min(X1 , X2 ) > t) = P(X1 > t) P(X2 > t) = e^{−λ1 t} e^{−λ2 t} = e^{−(λ1 +λ2 )t} ,
the minimum of X1 and X2 is again Exponential, and the rate of the minimum is the sum of the rates λ1 and λ2 .
Remark 3.2. By integrating the gamma function by parts we obtain
Γ(α) = (α − 1)Γ(α − 1).
So, if α = n, then
Γ(n) = (n − 1)Γ(n − 1) = · · · = (n − 1)!Γ(1) = (n − 1)!,
since Γ(1) = 1. Hence, the Gamma distribution with parameters n and λ is the same as the
Erlang distribution.
• Linearity: If X is Normal with parameters μ and σ and a > 0, then aX + b is Normal with parameters aμ + b and aσ.
• Additivity: If X and Y are independent and Normal, then X + Y is Normal.
• Excess probability: The probability that a Normal random variable X lies more than z
times the standard deviation above its mean is
P(X ≥ μ + zσ) = 1 − P(X ≤ μ + zσ) = 1 − P((X − μ)/σ ≤ z) = 1 − Φ(z),
since (X − μ)/σ is Standard Normal.
• Percentiles: The 100p% percentile zp of the Standard Normal distribution is the unique
point zp for which
Φ(zp ) = p.
For example, for p = 0.95 we have z0.95 = 1.64, and z0.975 = 1.96 for p = 0.975.
We conclude this section with the concept of failure rate for positive random variables. Thinking
of X as the lifetime of a component, the probability that the component will fail in the
next ∆ time units when it has reached the age of x time units is equal to
P(X ≤ x + ∆|X > x) = P(x < X ≤ x + ∆)/P(X > x) ≈ f (x)∆/(1 − F(x)).
Dividing this probability by ∆ yields the rate r (x) at which the component will fail at time x
given that it reached time x. This rate is called the failure rate or hazard rate of X, and it is
given by
r (x) = f (x)/(1 − F(x)).
Example 3.30. (Exponential) If X is Exponential with rate λ, then (see also (3.7))
r (x) = f (x)/(1 − F(x)) = λe^{−λx} /e^{−λx} = λ.
Hence, the failure rate of an Exponential random variable is constant: at any point in time
the component is equally likely to fail, or in other words, the component is always as good
as new. In fact, this is the only distribution for which the failure rate is constant. In practice,
however, many components have a bathtub-shaped failure rate, reflecting that the failure rate is
higher than average at the beginning and at the end of the lifetime of a component.
Exercise 24. (Problem 10.1 [6]) The life time of an appliance is a continuous random
variable X and has a probability density f (x) of the form f (x) = c(1 + x)^{−3} for x > 0 and
f (x) = 0 otherwise. What is the value of the constant c? Find P(X ≤ 0.5), P(0.5 < X ≤ 1.5)
and P(0.5 < X ≤ 1.5|X > 0.5).
Exercise 25. (Problem 10.3 [6]) Sizes of insurance claims can be modeled by a continuous
random variable with probability density f (x) = (1/50)(10 − x) for 0 < x < 10 and f (x) = 0 otherwise.
What is the probability that the size of a particular claim is larger than 5 given that the size
exceeds 2?
Exercise 26. (Problem 10.4 [6]) The lengths of phone calls (in minutes) made by a
travel agent can be modeled as a continuous random variable with probability density f (x) =
0.25 e^{−0.25x} for x > 0. What is the probability that a particular phone call will take more than
7 minutes?
Exercise 27. (Problem 10.5 [6]) Let X be a positive random variable with probability
density function f (x). Define the random variable Y by Y = X2 . What is the probability density
function of Y? Also, find the density function of the random variable W = V2 if V is a number
chosen at random from the interval (−a, a ) with a > 0.
Exercise 28. (Problem 10.8 [6]) A stick of unit length is broken at random into two
pieces. Let the random variable X represent the length of the shorter piece. What is the
probability density of X? Also, use the probability distribution function of X to give an
alternative derivation of the probability density of the random variable X/ (1 − X), which is the
ratio of the length of the shorter piece to that of the longer piece.
Exercise 29. (Problem 10.11 [6]) The javelin thrower Big John throws the javelin more
than x meters with probability P(x), where P(x) = 1 for 0 ≤ x < 50, P(x) = (1200 − (x − 50)²)/1200 for
50 ≤ x < 80, P(x) = (90 − x)²/400 for 80 ≤ x < 90, and P(x) = 0 for x ≥ 90. What is the expected
value of the distance thrown in his next shot?
Exercise 30. (Problem 10.19 [6]) A point Q is chosen at random inside a sphere with
radius r. What are the expected value and the standard deviation of the distance from the
center of the sphere to the point Q?
Exercise 31. (Problem 10.20 [6]) The lifetime (in months) of a battery is a random
variable X satisfying P(X ≤ x) = 0 for x < 5, P(X ≤ x) = [(x − 5)³ + 2(x − 5)]/12 for 5 ≤ x < 7
and P(X ≤ x) = 1 for x ≥ 7. What are the expected value and the standard deviation of X?
Exercise 32. (Problem 10.23 [6]) In an inventory system, a replenishment order is placed
when the stock on hand of a certain product drops to the level s, where the reorder point s
is a given positive number. The total demand for the product during the lead time of the
replenishment order has the probability density f (x) = λe−λx for x > 0. What are the expected
value and standard deviation of the shortage (if any) when the replenishment order arrives?
Exercise 33. (Problem 10.27 [6]) The lifetime of a light bulb has a uniform probability
density on (2, 12). The light bulb will be replaced upon failure or upon reaching the age 10,
whichever occurs first. What are the expected value and the standard deviation of the age of
the light bulb at the time of replacement?
40 Probability Models
Exercise 34. (Problem 10.28 [6]) A rolling machine produces sheets of steel of different
thickness. The thickness of a sheet of steel is uniformly distributed between 120 and 150
millimeters. Any sheet having a thickness of less than 125 millimeters must be scrapped. What
are the expected value and the standard deviation of a non-scrapped sheet of steel?
Exercise 35. (Problem 10.30 [6]) Limousines depart from the railway station to the airport
from the early morning till late at night. The limousines leave from the railway station with
independent inter-departure times that are exponentially distributed with an expected value of
20 minutes. Suppose you plan to arrive at the railway station at three o’clock in the afternoon.
What are the expected value and the standard deviation of your waiting time at the railway
station until a limousine leaves for the airport?
Exercise 36. (Homework exercises 12) The lifetimes of two components in an electronic
system are independent random variables X1 and X2 , where X1 and X2 are exponentially
distributed with expected values of 2 and 3 time units, respectively. Let the random variable Y
denote the time between the first failure and the second failure. What is the probability density
f (y) of Y?
Exercise 38. Demand during a single period is described by a continuous random variable X
with density
f (x) = c e^{−(1/10) x + 4} for x > 40, and f (x) = 0 elsewhere.
3. Suppose you put 50 units on stock to meet the demand during a single period. What
is the probability that a stock of 50 units is not enough to meet the demand in a single
period?
4. How much to put on stock such that the probability that the stock is not enough to meet
the demand in a single period is equal to 5%?
1. Determine the constant c. Hint: The primitives of the functions x e^{ax} and x² e^{ax} are given by
∫ x e^{ax} dx = (x/a − 1/a²) e^{ax},  ∫ x² e^{ax} dx = (x²/a − 2x/a² + 2/a³) e^{ax}.
2. What is the probability that the pick-and-place machine works during 4 consecutive hours
without any error?
3. What is the mean time till an error occurs?
Exercise 40. Consider two parallel systems, each with two components in series (see Figure
3.21). So system 1 works as long as both components A and B work. The lifetime of each
component is exponentially distributed. The mean lifetime is 1 hour for components A and B,
and 2 hours for components C and D. At time t = 0 all components work.
[Figure 3.21: system 1 consists of components A and B in series; system 2 of components C and D in series.]
But what is the distribution of X1 + · · · + Xn when n is large? The Central Limit Theorem
states that, for any a < b,
lim_{n→∞} P( a ≤ (X1 + · · · + Xn − nμ)/(σ√n) ≤ b ) = Φ(b) − Φ(a),
An important application of the Central Limit Theorem is the construction of confidence intervals.
Consider the problem of estimating the unknown expectation μ = E(X) of a random variable X.
Suppose n independent samples X1 , . . . , Xn are generated, then by the law of large numbers,
the sample mean
X(n) = (1/n) Σ_{k=1}^{n} Xk
is an estimator for the unknown μ = E(X).
By the Central Limit Theorem, for large n,
P( −z_{1−½α} ≤ (X(n) − μ)/(σ/√n) ≤ z_{1−½α} ) ≈ 1 − α, (3.8)
where the percentile z_{1−½α} is the point for which the area under the Standard Normal curve
between the points −z_{1−½α} and z_{1−½α} equals 100(1 − α)%. Rewriting (3.8) leads to the following
interval containing μ with probability 1 − α,
P( X(n) − z_{1−½α} σ/√n ≤ μ ≤ X(n) + z_{1−½α} σ/√n ) ≈ 1 − α. (3.9)
Thus, for large n, an approximate 100(1 − α)% confidence interval for μ is X(n) ± z_{1−½α} S(n)/√n,
that is, this interval covers the mean μ with probability 1 − α. Note that the confidence interval
is random, not the mean μ, see Figure 3.23.
Remark 3.3.
Figure 3.23: 100 confidence intervals to estimate the mean 0 of the Uniform random variable
on (−1, 1), where each interval is based on 100 samples
• If the standard deviation σ is unknown, it can be estimated by the square root of the sample
variance
S²(n) = (1/n) Σ_{k=1}^{n} (Xk − X(n))²,
and then this estimate S(n) can be used in (3.9) instead of σ.
• About 100 times as many samples are needed to reduce the width of a confidence interval
by a factor 10!
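As an illustration of (3.9), the following small sketch (plain NumPy, not part of the Pych models; the data are 100 hypothetical samples from the Uniform distribution on (−1, 1), as in Figure 3.23) computes the sample mean, the sample standard deviation S(n) and the approximate 95% confidence interval:

import numpy as np

# Hypothetical data: n = 100 samples from the Uniform distribution on (-1, 1),
# whose true mean is 0 (as in Figure 3.23).
rng = np.random.default_rng(1)
n = 100
x = rng.uniform(-1.0, 1.0, size=n)

z = 1.96                                  # percentile z_{1 - alpha/2} for alpha = 0.05
xbar = x.mean()                           # sample mean X(n)
s = np.sqrt(((x - xbar) ** 2).mean())     # sample standard deviation S(n)
half_width = z * s / np.sqrt(n)
print(f"95% confidence interval for mu: ({xbar - half_width:.3f}, {xbar + half_width:.3f})")

Since the half width z S(n)/√n shrinks like 1/√n, reducing it by a factor 10 indeed requires roughly 100 times as many samples.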
3.6 Joint random variables

Let X and Y be continuous random variables. Then
P(X ≤ x, Y ≤ y) = ∫_{u=−∞}^{x} ∫_{v=−∞}^{y} f(u, v) dv du
is the joint probability distribution function of X and Y, where f(x, y) is the joint density,
satisfying
f(x, y) ≥ 0,  ∫_{x=−∞}^{∞} ∫_{y=−∞}^{∞} f(x, y) dx dy = 1.
The function f(x, y) can be interpreted as the density of the joint probability mass, that is, for
small ∆ > 0,
P(x < X ≤ x + ∆, y < Y ≤ y + ∆) ≈ f(x, y)∆².
The joint density f(x, y) can be obtained from the joint distribution P(X ≤ x, Y ≤ y) by taking
partial derivatives,
f(x, y) = ∂²/∂x∂y P(X ≤ x, Y ≤ y).
The marginal densities of X and Y follow from
fX(x) = ∫_{y=−∞}^{∞} f(x, y) dy,  fY(y) = ∫_{x=−∞}^{∞} f(x, y) dx.
• The random variable X is the distance to the origin 0 and Y is the angle (in radians) of a
random point in a disk of radius r. What is the joint distribution and joint density of X
and Y? We have
P(X ≤ x, Y ≤ y) = (πx² · y/(2π)) / (πr²) = (x²/r²)(y/(2π)), 0 ≤ x ≤ r, 0 ≤ y ≤ 2π,
and by taking the partial derivatives, the joint density is found to be
f(x, y) = (2x/r²)(1/(2π)), 0 ≤ x ≤ r, 0 ≤ y ≤ 2π.
Hence the marginal densities are
fX(x) = ∫_{y=0}^{2π} f(x, y) dy = 2x/r², 0 ≤ x ≤ r,  fY(y) = ∫_{x=0}^{r} f(x, y) dx = 1/(2π), 0 ≤ y ≤ 2π,
so f(x, y) = fX(x) fY(y), which means that X and Y are independent.
• Let X1 and X2 be independent random variables, where Xi is Exponential with rate λi.
What is the probability that X1 is smaller than X2? We have
P(X1 ≤ X2) = ∫_{x1=0}^{∞} ∫_{x2=x1}^{∞} λ1 e^{−λ1 x1} λ2 e^{−λ2 x2} dx2 dx1
= ∫_{x1=0}^{∞} λ1 e^{−λ1 x1} e^{−λ2 x1} dx1
= λ1/(λ1 + λ2).
Thus the probability that X1 is the smallest one is proportional to the rates.
• Let X be random on (0, 1) and Y be random on (0, X). What is the density of the area
of the rectangle with sides X and Y? So we need to calculate the density of Z = XY.
First note that the joint density of X and Y is equal to
f(x, y) = 1/x, 0 ≤ y ≤ x ≤ 1,
and f(x, y) = 0 elsewhere. Hence, for 0 ≤ z ≤ 1,
P(Z ≤ z) = P(XY ≤ z)
= ∫_{x=0}^{√z} ∫_{y=0}^{x} f(x, y) dy dx + ∫_{x=√z}^{1} ∫_{y=0}^{z/x} f(x, y) dy dx
= ∫_{x=0}^{√z} ∫_{y=0}^{x} (1/x) dy dx + ∫_{x=√z}^{1} ∫_{y=0}^{z/x} (1/x) dy dx
= √z + ∫_{x=√z}^{1} (z/x²) dx
= √z + √z − z
= 2√z − z,
so the density of Z is fZ(z) = d/dz P(Z ≤ z) = 1/√z − 1 for 0 < z < 1.
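A quick Monte Carlo check of this result (an illustrative sketch, not part of the lecture material): draw X uniform on (0, 1) and Y uniform on (0, X), and compare the empirical distribution of Z = XY with 2√z − z:

import numpy as np

rng = np.random.default_rng(0)
n = 200_000
x = rng.uniform(0.0, 1.0, size=n)
y = rng.uniform(0.0, x)            # Y is uniform on (0, X)
z = x * y

for t in (0.1, 0.25, 0.5):
    empirical = (z <= t).mean()
    exact = 2 * np.sqrt(t) - t
    print(f"P(Z <= {t}): simulation {empirical:.4f}, formula {exact:.4f}")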
The expectation of the random variable g(X, Y), where g(x, y) is any function, can be calculated
as
E [g(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f(x, y) dx dy.
For example, taking g(x, y) = ax + by, where a and b are constants,
E [aX + bY] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} (ax + by) f(x, y) dx dy
= a ∫_{−∞}^{∞} ∫_{−∞}^{∞} x f(x, y) dy dx + b ∫_{−∞}^{∞} ∫_{−∞}^{∞} y f(x, y) dx dy
= a ∫_{−∞}^{∞} x fX(x) dx + b ∫_{−∞}^{∞} y fY(y) dy
= a E [X] + b E [Y] .
and similarly, if X and Y are independent (so f (x, y) = fX (x)fY (y)),
E [XY] = E [X] E [Y] . (3.10)
Example 3.34. Pick two points X and Y at random from the interval (0, 1). What is the mean
distance between these two points? We need to calculate E [|X − Y|]. The joint density of X
and Y is f (x, y) = 1 for 0 ≤ x, y ≤ 1, so
E [|X − Y|] = 2 ∫_{x=0}^{1} ∫_{y=0}^{x} (x − y) dy dx = 2 ∫_{x=0}^{1} ½x² dx = 1/3,
as expected.
Exercise 41. (Problem 11.8 [6]) Let the joint probability density function of the random
variables X and Y be given by f (x, y) = ce−2x for 0 < y ≤ x < ∞ and f (x, y) = 0 otherwise.
Determine the constant c. What is the probability density of X − Y?
Exercise 42. (Problem 11.12 [6]) The joint density of the random variables X and Y is
given by f (x, y) = 4xe−2x(1+y) for x, y > 0. What are the marginal densities of X and Y?
Exercise 43. (Homework exercises 17) We select a point (x, y) at random from a square
with sides 1. Let X be the random variable for the x-coordinate and Y the random variable
for the y-coordinate of that point. What is the probability that X > 1.5 − Y?
3.7 Conditioning
In this section we summarize the concepts of conditional probabilities and conditional expec-
tations for random variables. Conditioning is a fruitful technique to calculate probabilities and
expectations.
Let X and Y be discrete random variables. The conditional probability mass function of X
given Y = y is defined as
P(X = x|Y = y) = P(X = x, Y = y) / P(Y = y).
Then the unconditional probabilities P(X = x) can be calculated by conditioning on the possible
values of Y,
P(X = x) = Σ_y P(X = x|Y = y) P(Y = y).
Example 3.35. Simultaneously roll 24 dice, and then roll again with only those dice that showed
a six. Let X be the number of sixes in the first roll and let Y be the number of sixes in the second roll.
• What is P(Y = y|X = x)? This means that in the second roll, x dice are used. Hence, the
probability that y dice out of x show six is
P(Y = y|X = x) = C(x, y) (1/6)^y (5/6)^{x−y}, (3.11)
where C(x, y) = x!/(y!(x − y)!) denotes the binomial coefficient.
• What is P(Y = y)? To calculate this probability we condition on the outcome of X and
then use the above result, so
P(Y = y) = Σ_{x=y}^{24} P(Y = y|X = x) P(X = x)
= Σ_{x=y}^{24} P(Y = y|X = x) C(24, x) (1/6)^x (5/6)^{24−x}
= Σ_{x=y}^{24} C(x, y) (1/6)^y (5/6)^{x−y} C(24, x) (1/6)^x (5/6)^{24−x}. (3.12)
• And what is P(X = x|Y = y)? Using the definition of conditional probability,
P(X = x|Y = y) = P(X = x, Y = y)/P(Y = y) = P(Y = y|X = x) P(X = x)/P(Y = y),
so this probability can be calculated by substituting (3.11) and (3.12) and
P(X = x) = C(24, x) (1/6)^x (5/6)^{24−x}.
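The sums in (3.12) are straightforward to evaluate numerically. The sketch below (plain Python, not part of the course software) does this with math.comb; since each die shows a six in both rolls with probability 1/36, the result can be checked against the Binomial(24, 1/36) distribution:

from math import comb

def p_first(x, n=24, p=1/6):
    """P(X = x): number of sixes in the first roll of n dice."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def p_second_given_first(y, x, p=1/6):
    """P(Y = y | X = x): y sixes when re-rolling the x dice that showed six."""
    return comb(x, y) * p**y * (1 - p)**(x - y)

def p_second(y, n=24):
    """P(Y = y) by conditioning on X, as in (3.12)."""
    return sum(p_second_given_first(y, x) * p_first(x, n) for x in range(y, n + 1))

# Check against the Binomial(24, 1/36) distribution.
for y in range(4):
    direct = comb(24, y) * (1/36)**y * (35/36)**(24 - y)
    print(y, round(p_second(y), 6), round(direct, 6))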
Now let X and Y be continuous random variables with joint density f (x, y) and marginal densities
fX (x) and fY (y). The conditional density of X given Y = y is defined analogously to the discrete
case,
fX(x|y) dx = P(x < X ≤ x + dx | y < Y ≤ y + dy)
= P(x < X ≤ x + dx, y < Y ≤ y + dy) / P(y < Y ≤ y + dy)
= f(x, y) dx dy / (fY(y) dy)
= (f(x, y)/fY(y)) dx,
so the conditional density of X given Y = y is
fX(x|y) = f(x, y)/fY(y).
Then the conditional probability P(X ≤ x|Y = y) follows from
P(X ≤ x|Y = y) = ∫_{z=−∞}^{x} fX(z|y) dz,
and the unconditional probability P(X ≤ x) can be calculated by conditioning on the possible
values of Y,
P(X ≤ x) = ∫_{−∞}^{∞} P(X ≤ x|Y = y) fY(y) dy.
Example 3.36.
• Point (X, Y) is randomly chosen in the unit circle. What is the conditional density of X
given Y = y? We have
f(x, y) = 1/π, x² + y² ≤ 1,
and f(x, y) = 0 otherwise. So for −1 ≤ y ≤ 1,
fY(y) = ∫_{x=−√(1−y²)}^{√(1−y²)} (1/π) dx = (2/π)√(1 − y²).
Hence, for −√(1 − y²) ≤ x ≤ √(1 − y²),
fX(x|y) = f(x, y)/fY(y) = (1/π) / ((2/π)√(1 − y²)) = 1/(2√(1 − y²)),
which is the Uniform density on the interval (−√(1 − y²), √(1 − y²)), as expected.
• You are waiting for the metro. Once the metro has stopped, the distance to the nearest
metro door is Uniform between 0 and 2 metres. If the distance to the nearest door is y
metres, then you are able to find a place to sit with probability
1 − √(½y).
What is the probability that you find a place to sit? Let the random variable Y be the
distance to the nearest door and X indicate whether you find a place to sit (so X = 1 if
you can sit and X = 0 otherwise). To calculate the probability P(X = 1), note that Y is
Uniform on (0, 2), so
fY(y) = ½, 0 < y < 2,
and fY(y) = 0 elsewhere. Also,
P(X = 1|Y = y) = 1 − √(½y).
Hence
P(X = 1) = ∫_{y=0}^{2} P(X = 1|Y = y) fY(y) dy = ∫_{y=0}^{2} (1 − √(½y)) · ½ dy = 1/3.
Note that this probability is not equal to 1 − √(½ E[Y]) = 1 − √(½)!
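As a quick check on this conditioning argument, the following sketch (plain Python, not part of the lecture material) estimates P(X = 1) by simulation and compares it with 1/3:

import numpy as np

rng = np.random.default_rng(2)
n = 200_000
y = rng.uniform(0.0, 2.0, size=n)                         # distance to the nearest door
sit = rng.uniform(0.0, 1.0, size=n) < 1 - np.sqrt(0.5 * y)  # seat found with prob. 1 - sqrt(y/2)
print("simulation:", sit.mean(), "exact:", 1/3)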
Conditioning is a fruitful technique, not only to calculate probabilities, but also to calculate
expectations. For discrete random variables X and Y, the conditional expectation of X given
Y = y is defined as
E [X|Y = y] = Σ_x x P(X = x|Y = y),
and then the (unconditional) expectation of X can be calculated by conditioning on the possible
values of Y,
E [X] = Σ_y E [X|Y = y] P(Y = y).
Similarly, for continuous random variables X and Y, the conditional expectation of X given
Y = y is
E [X|Y = y] = ∫_{−∞}^{∞} x fX(x|y) dx,
and the unconditional expectation follows from
E [X] = ∫_{−∞}^{∞} E [X|Y = y] fY(y) dy.
Example 3.37.
• Generate two random numbers X1 and X2 from (0, 1). Let X be the smallest of X1
and X2 and Y the largest. What are E [X|Y = y] and E [X]? Note that
f (x, y) = fX1 (x)fX2 (y) + fX1 (y)fX2 (x) = 1 + 1 = 2, 0 < x < y < 1,
and
fY (y) = P(X1 < y)fX2 (y) + fX1 (y)P(X2 < y) = 2y, 0 < y < 1.
So
fX(x|y) = f(x, y)/fY(y) = 2/(2y) = 1/y, 0 < x < y,
which means that X is uniform on (0, y) if Y = y (as expected). Hence,
E [X|Y = y] = ∫_{x=0}^{y} x (1/y) dx = ½y,
and
E [X] = ∫_{y=0}^{1} E [X|Y = y] fY(y) dy = ∫_{y=0}^{1} y² dy = 1/3.
Of course, since fX(x) = 2(1 − x) for 0 < x < 1, the expectation of X can also be calculated
directly,
E [X] = ∫_{x=0}^{1} x fX(x) dx = ∫_{x=0}^{1} 2x(1 − x) dx = 1/3.
• A batch consists of a random number of items N, where P(N = n) = (1 − p)pn−1 , n ≥ 1.
The production time of a single item is Uniform between 4 and 10 minutes. What is the
mean production time of a batch? Let B denote the production time of the batch, and
Xi the production time of a single item (so E [Xi ] = 7 mins). Then we can write
B = Σ_{i=1}^{N} Xi.
Hence
E [B] = Σ_{n=1}^{∞} E [B|N = n] P(N = n) = Σ_{n=1}^{∞} 7n (1 − p)p^{n−1} = 7/(1 − p) (mins).
• Process time X is Exponential with rate Λ, where Λ itself is also random with density
fΛ(λ) = λ e^{−½λ²}, λ > 0.
What is the mean process time? To calculate E [X] we first condition on the rate Λ = λ,
which leads to
E [X|Λ = λ] = 1/λ,
and then we get
E [X] = ∫_{λ=0}^{∞} E [X|Λ = λ] fΛ(λ) dλ = ∫_{λ=0}^{∞} (1/λ) λ e^{−½λ²} dλ = √(2π) ∫_{λ=0}^{∞} (1/√(2π)) e^{−½λ²} dλ = √(2π) Φ(0) = √(π/2).
Example 3.38. (Random setups) Suppose that a machine needs a setup after having produced
on average k jobs. This means that, after having produced a job, the machine needs a setup
Y with probability p = 1/k. Let X denote the natural process time of a job, Y the setup time, and Z
the effective process time, i.e., Z is X plus the possible setup Y,
Z = X + Y with probability p, and Z = X with probability 1 − p.
What is the mean and variance of the effective process time Z? To calculate E [Z] and Var [Z]
we condition on having a setup or not,
E [Z] = E [X + Y] p + E [X] (1 − p)
= (E [X] + E [Y])p + E [X] (1 − p)
= E [X] + pE [Y] ,
and similarly
E [Z²] = E [(X + Y)²] p + E [X²] (1 − p)
= (E [X²] + 2E [X] E [Y] + E [Y²]) p + E [X²] (1 − p)
= E [X²] + 2pE [X] E [Y] + pE [Y²].
So
Var [Z] = E [Z²] − (E [Z])² = Var [X] + pVar [Y] + p(1 − p)(E [Y])².
Clearly, Var [Z] is greater than Var [X] (even when Var [Y] = 0, i.e. Y is constant).
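These two formulas are easily turned into a small helper. The sketch below (illustrative Python; the numbers are hypothetical) computes the mean and variance of the effective process time for a given setup probability p = 1/k:

def effective_process_time(EX, VarX, EY, VarY, p):
    """Mean and variance of Z = X + Y (with prob. p) or Z = X (with prob. 1 - p)."""
    EZ = EX + p * EY
    VarZ = VarX + p * VarY + p * (1 - p) * EY**2
    return EZ, VarZ

# Hypothetical example: natural process time 10 min (variance 4),
# a setup of 30 min (variance 25) after on average k = 10 jobs.
EZ, VarZ = effective_process_time(EX=10.0, VarX=4.0, EY=30.0, VarY=25.0, p=1/10)
print(EZ, VarZ)   # 13.0 and 87.5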
Exercise 44. (Problem 13.4 [6]) Two dice are rolled. Let the random variable X be the
smallest of the two outcomes and let Y be the largest of the two outcomes. What are the
conditional mass functions P(X = x|Y = y) and P(Y = y|X = x)?
Exercise 45. (Problem 13.7 [6]) Let X and Y be two continuous random variables with
the joint probability density f (x, y) = xe−x(y+1) for x, y > 0 and f (x, y) = 0 otherwise. What are
the conditional probability densities fX (x|y) and fY (y|x)? What is the probability that Y is larger
than 1 given that X = 1?
Exercise 46. (Problem 13.8 [6]) Let X and Y be two continuous random variables with
the joint probability density f(x, y) = (2/7)(2x + 5y) for 0 < x, y < 1 and f(x, y) = 0 otherwise.
What are the conditional probability densities fX(x|y) and fY(y|x)? What is the probability that
X is larger than 0.5 given that Y = 0.2?
Exercise 47. (Problem 13.23 [6]) You generate three random numbers from (0,1). Let
X be the smallest of these three numbers and Y the largest. What are the conditional expected
values E(X|Y = y) and E(Y|X = x)?
Exercise 48. (Homework exercises 15) Let X be a positive random variable with cumulative
distribution function
P(X ≤ x) = FX(x) = 1 − c/(2 + x).
(a) Determine the constant c.
(b) Calculate the conditional probability P(X > 8|X ≤ 10).
4
Manufacturing Models
The essence of any manufacturing system is to transform raw material into physical products to
meet demand. This transformation (or manufacturing) is done by a system, which is a collection
of elements. This system can be viewed at various levels, see Figure 4.1:
• Factory level. This is the whole factory (also referred to as plant or fab), and may consist
of several (functional) areas.
• Area level. In an area we find several machines or groups of machines.
• Equipment level. At this level we have machines or groups of machines (also referred to
as manufacturing cells or workstations).
Figure 4.3: BIM (Breakthrough in Manufacturing) lines of NXP for the assembly of ICs
• Continuous flow processes. Continuous products (such as food and chemicals) literally flow
through the production line along a fixed routing.
We will mainly focus our attention on manufacturing systems producing discrete parts on
disconnected flow lines. Disconnected flow lines can be found in, for example, wafer fabs,
producing integrated circuits (ICs) on silicon wafers, whereas the assembly of these ICs in their
supporting packages is done on high volume dedicated assembly lines, see Figure 4.3.
The ideal situation is that demand is perfectly aligned with the manufacturing process. However,
in practice, this is almost never the case (since machines may break down, raw material is missing,
demand suddenly increases, and so on). To align these two processes, buffers are needed. So a
buffer is an excess resource to correct for the misalignment between demand and manufacturing
processes. It can appear in the following three forms:
4.1 Terminology
In this section we introduce some useful terminology.
• Throughput or throughput rate is the number of good (i.e., non-defective) parts or jobs
produced per unit time.
• Capacity (or maximal throughput) is an upper limit on the throughput of a production
process.
• Work-In-Process (WIP) is all the products from the start to the end point of a product
routing.
• Cycle time or flow time, throughput time or sojourn time is the time it takes from the
release of a job at the beginning of its routing to go through the system and to reach
the end of its routing. In other words, it is the time a job spends as WIP.
• Utilization of a machine is the fraction of time it is not idle for lack of parts. This is not
necessarily the fraction of time the machine is processing jobs, but typically also includes
failures, setups, and so on. It can be calculated as
u = Tnonidle / Ttotal,
where Tnonidle is the time the machine is not idle during the total time frame Ttotal , or
alternatively,
u = realized production rate / effective production rate,
where the effective production rate is the maximum rate at which the machine can process
jobs, including effects of failures, setups and so on.
• Cycle time factor is the ratio of the cycle time and the raw process time t0 , so
Cycle Time Factor = Cycle Time / t0,
where t0 is the sum of the average process times of each workstation in the line, or in
other words, it is the average time it takes a single job to go through the empty line.
Example 4.1. (Capacity) A workstation consists of 2 identical machines. The raw process time
of a machine is t0 = 0.2 hours. Then the capacity (or maximal throughput) of each machine is
1/t0 = 5 jobs per hour, and the capacity of the workstation is 10 jobs per hour.
Example 4.2. (Utilization) The raw process time of a machine is t0 = 0.15 hours and its
(realized) production rate is 5 lots per hour. Then the utilization u of the machine is u =
5 · 0.15 = 0.75.
[Figure 4.4: flow line λ → B1 → M1 → B2 → M2 → δ consisting of two buffers and two machines.]
Figure 4.5: Lot-time diagram for release rate of (a) 1/3 lots/hour and (b) 1/2 lots/hour
For a release rate of 1/3 lots/hour we determine from Figure 4.5(a) that the flow time is 5
hours. For a release rate of 1/2 lots/hour we see in Figure 4.5(b) that the flow time keeps
increasing. For any release rate below the maximal throughput the flow time is 5 hours; for
any release rate above the maximal throughput the flow time grows without bound. We can also
use the lot-time diagram to determine the mean WIP level. In Figure 4.6 we derive the WIP
Figure 4.6: Lot-time diagram and w-t-diagram for release rate of (a) 1/3 lots/hour and (b)
1/2 lots/hour
level over time from the lot-time diagram. For instance, for a release rate of 1/3 lots/hour, at
t = 7 there are two lots in the system (lot 1 and 2).
For a release rate of 1/3 lots/hour, the behaviour becomes periodic for t > 3 hours with a period
of 3 hours, see Figure 4.6(a). The mean WIP level is (1/3) · 1 + (2/3) · 2 = 5/3 lots. For Figure 4.6(b)
the WIP level keeps increasing. For a release rate higher than the maximal throughput, the
WIP level grows to infinity.
For the flow line in Figure 4.4 it is easy to determine the maximal throughput, and for constant
release rate and constant process times, lot-time diagrams can be used to determine the flow
time and mean WIP level. Below we describe how to determine the maximal throughput
for an arbitrary configuration of workstations. Treatment of the more complicated problem of
estimating mean flow time and mean WIP level in an arbitrary configuration with stochastic
arrivals and process times starts in Section 4.8.
Consider a manufacturing system with N workstations, labeled W1 , . . . , WN , see Figure 4.7.
Workstation Wi consists of mi parallel identical machines. The raw process time of a machine
in workstation Wi is t0i . A fraction pij of the throughput of workstation Wi is diverted to Wj ,
and a fraction pi0 of the throughput is finished and leaves the system. The arrival rate to
the manufacturing system is λ jobs per unit time, and a fraction qi is immediately diverted to
workstation Wi .
The inflow or throughput of workstation Wi consists of both internal jobs (coming from other
workstations) and external jobs. What is the throughput of workstation Wi ?
Let δi denote the throughput (or total outflow) of workstation Wi . The job flow in the
manufacturing system obeys the principle of conservation of flow. This means that the total
flow of jobs out of Wi is equal to the total flow into Wi . Hence, for each workstation we
obtain
δi = λ qi + Σ_{j=1}^{N} δj pji, i = 1, . . . , N,
where the left hand side is the flow out, the first term at the right hand side is the fresh inflow
in Wi and the second term is the total internal inflow. These linear equations have a unique
solution for the throughput δi , provided the routing is such that every job eventually leaves the
manufacturing system. The capacity of workstation Wi is mi/t0i.
[Figure 4.7: manufacturing system with workstations W1, . . . , WN; a fraction qi of the external arrival stream λ enters workstation Wi and pij is the fraction of the throughput of Wi routed to Wj.]
Workstation Wi is able to handle all the work offered to it as long as
δi < mi/t0i,
or equivalently, as long as its utilization satisfies
ui = δi t0i / mi < 1.
Note that “=” is only feasible in the ideal deterministic world (but as soon as there is variability,
and “=” holds, the WIP in workstation Wi will grow without bounds). The bottleneck workstation
Wb is the one with the highest utilization ub . This station also dictates the maximal inflow λmax
for which the system is stable. That is, λmax is the rate for which ub becomes equal to 1.
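Solving the balance equations numerically is straightforward. The sketch below (NumPy; the routing matrix, arrival rate and machine data are hypothetical and not one of the systems in these notes) solves (I − Pᵀ)δ = λq and then computes the utilizations:

import numpy as np

# Hypothetical routing: P[i, j] is the fraction of workstation i's throughput sent to j;
# the remaining fraction leaves the system.
P = np.array([[0.0, 0.8],
              [0.3, 0.0]])
q = np.array([1.0, 0.0])        # all external arrivals enter workstation 1
lam = 3.0                       # arrival rate (jobs per hour)
m = np.array([1, 2])            # machines per workstation
t0 = np.array([0.2, 0.5])       # raw process times (hours)

# Conservation of flow: delta = lam*q + P^T delta  <=>  (I - P^T) delta = lam*q
delta = np.linalg.solve(np.eye(2) - P.T, lam * q)
u = delta * t0 / m
print("throughputs:", delta)
print("utilizations:", u)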
Example 4.3. Consider the manufacturing system consisting of 4 workstations, listed in Figure
4.8. The inflow in stations W1 and W4 is λ jobs per hour. The throughput of the system is δ.
[Figure 4.8: manufacturing system with workstations W1, W2, W3 and W4, inflow λ into W1 and W4, routing fractions 0.3 and 0.8 at the arrows, and throughput δ.]
Exercise 49. (Exercise 2.3.1 [1]) Consider the manufacturing system with rework and
bypassing in Figure 4.9. The manufacturing system consists of three buffers and four machines.
Lots are released at a rate of λ lots/hour. The numbers near the arrows indicate the fraction
of the lots that follow that route. For instance, of the lots leaving buffer B1 90% goes to
machine M1 and 10% goes to buffer B3 . The process time of each machine is listed in the
table in Figure 4.9.
Exercise 50. (Exercise 2.3.2 [1]) Consider a three-workstation flowline. The workstations
each have an infinite buffer and contain 1, 3, and 2 machines respectively. The process time
of the machines in workstation 1,2 and 3 is 0.9, 3.0, and 1.9 hours respectively.
2. Use a lot-time-diagram to determine the flowtime of a lot for release rates under the
maximum throughput.
3. Determine the mean WIP level in the line for the maximum throughput.
Exercise 51. (Exercise 2.3.3 [1]) We have a three-workstation flowline. Each workstation
consists of an infinite buffer and a number of machines. The number of machines has still to
be determined. The machines in workstation 1,2, and 3 have a process time of 0.10, 0.15,
and 0.06 hours respectively. We want to establish a throughput of 60 lots/hour.
1. Determine the flowtime of a lot for release rates under the maximum throughput.
2. Determine the number of machines in workstation 1,2, and 3 required to attain the desired
throughput of 60 lots/hour.
3. Which workstation(s) is (are) the bottleneck?
[Figure 4.10: (a) flow line λ → B1 → M1 (t0 = 2.0 hr) → B2 → M2 (t0 = 3.0 hr) → δ; (b) the same line with the machines in reversed order, M2 (t0 = 3.0 hr) first and M1 (t0 = 2.0 hr) second.]
Exercise 52. (Exercise 2.3.4 [1]) Consider a two-workstation flowline with two machines.
The two machines have a process time of 2.0 and 3.0 hours respectively. The machines may
be placed in any order. The two alternatives are shown in Figure 4.10.
Exercise 53. In the manufacturing system in Figure 4.11, jobs arrive with rate λ jobs per hour
in workstations W1 and W3 . The number of (identical) machines mi in workstation Wi and the
[Figure 4.11: manufacturing system with workstations W1, W2, W3 and W4, arrival rate λ into W1 and W3, scrap flows and routing fractions at the arrows, and throughput δ.]
Exercise 54. Jobs arrive with rate λ jobs per hour at the manufacturing system in Figure
4.12. 20% of the jobs are of type A, 80% is of type B. Type A jobs are processed in
workstations W1 , W2 and W5 , type B jobs in W3 , W4 and W5 . Every workstation has a single
machine. The numbers at the arrows in Figure 4.12 indicate which fraction of the throughput
of that workstation moves in that direction. In W2 10% of the throughput is scrapped. For
every workstation, mean process times ti (A) of type A and ti (B) of type B (in hours) are listed
in Table 4.2.
[Figure 4.12: routing of type A jobs (λA = 0.2λ) through W1, W2 and W5 and of type B jobs (λB = 0.8λ) through W3, W4 and W5, with the routing fractions at the arrows.]
Exercise 55. In the manufacturing system in Figure 4.13, 2 types of jobs arrive, type 1 (solid)
and type 2 (dashed). The total arrival rate is λ jobs per hour, 40% type 1 and 60% type
2. The number of (identical) machines mi in workstation Wi and the mean natural processing
times t0i (1) of type 1 jobs and t0i (2) of type 2 jobs are listed in Table 1.
[Figure 4.13: routing of type 1 jobs (arrival rate (2/5)λ) and type 2 jobs (arrival rate (3/5)λ) through workstations W1, W2 and W3, with scrap flows and routing fractions at the arrows.]
1. Formulate the balance equations for the throughput δi (1) of type 1 jobs of workstation
Wi .
2. Express the throughput δi (1) of each workstation in terms of λ.
3. Formulate the balance equations for the throughput δi (2) of type 2 jobs of workstation
Wi .
4. Express the throughput δi (2) of each workstation in terms of λ.
5. What percentage of all jobs is ultimately scrapped?
6. Express the utilization ui of each workstation in terms of λ.
7. What is the maximal inflow λmax for the manufacturing system to remain stable?
4.4 Little’s law

Figure 4.14: System with WIP w, throughput δ and average flow time ϕ

Figure 4.15: Illustration of Little’s law, where the solid line is the total input to the system in
(0, t), and the dashed line the total output in (0, t)
The power of Little’s law is its generality (no specific assumptions for the system are required)
and flexibility (as it applies to any system). For example, it can be applied to the buffer of a
specific workstation W, in which case it gives the relation
WIP in the buffer of workstation W = Throughput of W × Time spent in buffer,
and when it is applied to the whole manufacturing system, we get
WIP in the whole manufacturing system = Throughput of the system × Cycle time.
According to Little’s law, the same throughput δ can be achieved with
• large WIP w and long flow times ϕ, but also with
• small WIP w and short flow times ϕ.
What causes the difference? In most cases the answer is: Variability! The concept of variability
will be studied in the following sections, and in particular, its corrupting effect on system
performance.
Exercise 56. (Exercise 2.3.5 [1]) Consider the re-entrant flowline in Figure 4.16. Lots
are released in the line by generator G at a rate of λ lots/hour. Each lot passes through the
flowline twice and has the same fixed route, namely M1 , M2 , M1 , M2 . If a newly released lot
and a re-entrant lot arrive at a machine at the same time, the re-entrant lot is processed first.
[Figure 4.16: re-entrant flow line G → B1 → M1 (t0 = 2.0 hr) → B2 → M2 (t0 = 2.0 hr) → E, where each lot passes through the line twice.]
1. Express the utilisation of machine 1 in terms of release rate λ and determine the maximum
throughput.
2. Construct a lot-time diagram (in which the state of individual lots is shown over time) for
λ = 1/5 lots/hour and determine the (mean) flowtime of a lot.
3. From the lot-time diagram, derive a WIP-time diagram (in which the WIP is shown over
time), and determine the mean WIP-level in steady state.
4. Verify your result using Little’s law.
[Figure 4.17: line λ → W1 → W2 → W3 → W4; 20% of the lots leaving W4 go to W5 (stripping) and re-enter at W2, the remaining 80% go to W6 and leave the system.]
Exercise 57. (Exercise 2.4.1 [1]) The manufacturing system in Figure 4.17 consists of
6 workstations. 20% of the lots processed by workstation W4 need rework. The lot is first
stripped in workstation W5 and then reprocessed by workstation W2 and W3 . The numbers at
the arrows indicate the fraction of the lots that follow that route.
1. Calculate the total throughput of workstation 2,3, and 5 in terms of release rate λ.
2. Calculate the throughput δW4 W5 and δW4 W6 for workstation 4.
3. Verify that conservation of mass holds for system W2 W3 W4 W5 .
[Figure 4.18: flow line λ → W1 → W2 → W3 → W4 → δ with bypassing loops from workstation 1 to workstations 2, 3 and 4 (fractions 0.7, 0.1, 0.1 and 0.1 at the arrows).]
Exercise 58. (Exercise 2.4.2 [1]) Figure 4.18 shows a flow line with bypassing loops
going from workstation 1 to workstation 2, 3, and 4. The numbers at the arrows indicate the
fraction of the lots that follow that route.
Exercise 59. (Exercise 2.4.3 [1]) Figure 4.19 shows a flowline of workstations with rework
loops. 20% of the lots processed by workstation 2 need rework on workstation 1 and 2. 20%
of the lots processed by workstation 3 need rework on workstation 2 and 3. Finally, 30% of the
lots processed by workstation 4 need rework on 3 and 4.
[Figure 4.20: job shop with workstations W1, W2, W3 and W4; lots enter via workstation 1 and leave via workstation 4.]

From-to-matrix (fraction of lots going from the row workstation to the column workstation):

from \ to     1     2     3     4    out
in           1.0    –     –     –     –
1             –    0.1   0.8   0.1    –
2            0.2    –    0.7   0.1    –
3            0.1   0.2    –    0.7    –
4             –    0.2   0.3    –    0.5
Exercise 60. (Exercise 2.4.4 [1]) Figure 4.20 shows a so-called job shop. In a job shop
lots can each have very different routings. The from-to-matrix is used to indicate the fraction
of the lots that go from a specific workstation to another workstation. For example, 10% of
lots that finish processing on workstation 1 go to workstation 2, 80% goes to workstation 3,
and 10% goes to workstation 4. All lots enter the job shop via workstation 1 and leave the
job shop via workstation 4.
1. Intuitively, what workstation processes the most lots in this job shop?
2. Write down the mass conservation equations for each workstation.
66 Manufacturing Models
3. Show that we can write these equations as the following matrix equation.

⎡ −1    0.2   0.1    0  ⎤ ⎡ δW1 ⎤   ⎡ −λ ⎤
⎢ 0.1   −1    0.2   0.2 ⎥ ⎢ δW2 ⎥ = ⎢  0 ⎥
⎢ 0.8   0.7   −1    0.3 ⎥ ⎢ δW3 ⎥   ⎢  0 ⎥
⎣ 0.1   0.1   0.7   −1  ⎦ ⎣ δW4 ⎦   ⎣  0 ⎦
4.5 Variability
Variability may be formally defined as the quality of non-uniformity of entities [2], and it
is closely related to randomness. Therefore, to understand the effect of variability, we must
understand the concept of randomness which is studied in Probability Theory, and its basic results
are presented in Chapter 3 of these lecture notes.
A distinction can be made between two types of variability:
Controllable variation. This is the result of (bad) decisions. For example, variability is introduced
by the mix of products produced in the plant, where each type of product can have its
own processing characteristics and routing through the plant. Another example is use of
batch transportation of material, where the first finished part in the batch has to wait longer
for transportation than the last one in the batch.
Random variation. This is the result of events beyond our control. For example, the time
elapsing between customer demands, or machine failures.
Our intuition is quite good with respect to first-moment effects (i.e., the mean): we understand
that we get more products out by speeding up the bottleneck machine, or by adding more
product carriers. This type of intuition is based on a deterministic world. However, most of
us have a much less developed intuition for second-moment effects (i.e., the variance). For
example:
• Which is more variable: the time to process an individual part or the time to produce a
whole batch of those parts?
• Which results in greater improvement of line performance: Reduce variability of process
times closer to raw materials (upstream the line), or closer to the customers (downstream
the line)?
• Which are more disruptive to line performance: Short frequent machine failures, or long
infrequent machine failures?
In the next section we will first pay attention to understanding the mean and variance in process
times and flows, and then to the interaction of these two sources of randomness.
[Figure: timeline with natural process time X0 = 10, run intervals F1 = 3.5, F2 = 5, F3 = 3 and repairs R1 = 6, R2 = 2.5.]
Example 4.5. Consider two machines, M1 and M2 . For machine M1 we have t0 = 15,
σ0 = 3.35, c0 = 0.223, mf = 744, mr = 248 minutes and cr = 1. For M2 the parameters
are t0 = 15, σ0 = 3.35, c0 = 0.223, mf = 114, mr = 38 minutes and cr = 1. So M1 has
infrequent long stops, M2 has frequent short ones. For both machines, the availability is
A = 0.75,
so te = 20 minutes. Hence, both machines have the same effective capacity re = 1/te = 1/20 jobs
per minute (or 3 jobs per hour). From (4.3) we obtain that for machine M1, c²e = 6.25, and for
M2, c²e = 1.
4.6.4 Rework
Another source of variability is quality problems, whose effect can be quantified in the same
way as in the previous sections. Suppose that a workstation performs a task, and then checks
whether it has been done correctly. If not, the task is repeated until it is eventually correct.
[Figure 4.22: workstation with mean inter-arrival time ta and coefficient of variation ca of the arrivals, and mean inter-departure time td and coefficient of variation cd of the departures.]
An arrival process that deserves special attention because of its practical importance is the Poisson
process, for which the times between arrivals are independent and Exponential with rate λ. This
process has the following properties:
• Memoryless property. Since the inter-arrival times are Exponential, and thus memoryless
(see Property 3.3), we have for small ∆ > 0,
P(arrival in (t, t + ∆)) = 1 − e−λ∆ ≈ λ∆
So, in each small interval of length ∆, there is an arrival with probability λ∆ (and none
otherwise). This means that a Poisson process is a “truly random” arrival process.
• Binomial distribution. By dividing the interval (0, t) into many small intervals of length ∆,
then we will observe in each interval 0 or 1 arrivals. Hence, the total number of arrivals
in (0, t) is binomial with parameters n = t/ ∆ and p = λ∆.
• Poisson distribution. Since n is large and p is small, this number is Poisson distributed with
parameter np = λt, see Remark 3.1. Hence, as ∆ tends to 0,
P(k arrivals in (0, t)) = e^{−λt} (λt)^k / k!, k = 0, 1, 2, . . .
This explains the name “Poisson process”.
• Clustered arrivals. Since the density f (x) = λe−λx is maximal for x = 0, short inter-arrival
times occur more frequently than long ones. So arrivals tend to cluster, as seen in Figure
4.24.
• Many rare arrival flows. The superposition of many independent rarely occurring arrival
flows is close to Poisson (and the more flows, the more it will look like Poisson). This is
why Poisson flows so often occur in practice!
• Merging. By merging two independent Poisson flows, say red arrivals with rate λ1 and
blue arrivals with rate λ2 (see Figure 4.25), we again obtain a Poisson flow with rate
λ1 + λ2 , since
P(arrival in (t, t + ∆)) ≈ λ1 ∆ + λ2 ∆ = (λ1 + λ2 )∆.
• What is the probability that the next job to arrive is type A? Given that a job arrives in
(t, t + ∆), it is of type A with probability 2/(2 + 3) = 2/5.
• What is the probability that during 2 hours at least 2 type B jobs arrive? The number of
type B arrivals in the interval (0, 2) is Poisson distributed with parameter 3 · 2 = 6. Hence,
P(at least 2 arrivals in (0, 2)) = 1 − P(0 or 1 arrivals in (0, 2))
= 1 − e−6 (1 + 6) = 1 − 7e−6 .
• Merging. By merging two independent arrival flows, say red arrivals with mean inter-arrival
time ta (1) and coefficient of variation ca (1), and blue arrivals with mean inter-arrival time
ta (2) and coefficient of variation ca (2), we obtain an arrival flow with
ra(merged) = ra(1) + ra(2) = 1/ta(1) + 1/ta(2).
The coefficient of variation of the time between arrivals of the merged flow is hard to
estimate. A simple (though rough) approximation for the squared coefficient of variation is
c²a(merged) = [ra(1)/(ra(1) + ra(2))] c²a(1) + [ra(2)/(ra(1) + ra(2))] c²a(2).
It should be noted, however, that the inter-arrival times of the merged flow are, in general,
no longer independent.
• Random splitting. Randomly splitting (or thinning) an arrival flow with mean inter-arrival
time ta and coefficient of variation ca , which means that with probability p an arrival is
colored red and otherwise ignored (or colored blue), yields a new red arrival flow with
ra(red) = p ra = p/ta,  ta(red) = ta/p,  c²a(red) = p c²a + 1 − p.
We now look at the departure flow. The same measures can be used to describe departures,
namely the departure rate rd and the coefficient of variation cd of the time between departures,
defined as
rd = 1/td,  cd = σd/td,
where td is the mean time between departures or mean inter-departure time, and σd is the
standard deviation of the time between departures, see Figure 4.22.
Clearly, rd = ra by conservation of flow. Variability in departures from a workstation depends
on both variability in arrivals and process times at that workstation. The relative contribution of
these two sources of variability depends on the utilization of the workstation, defined as
u = ra te / m,
where m is the number of machines in the workstation. A simple approximation for c2d in a
single-machine workstation (m = 1) is
c2d = (1 − u2 )c2a + u2 c2e ,
and for multi-machine stations (m ≥ 1),
c²d = 1 + (1 − u²)(c²a − 1) + (u²/√m)(c²e − 1). (4.8)
This approximation for cd makes sense, since, if m = 1 and u is close to 1, then the machine
is nearly always busy, so
cd ≈ ce .
On the other hand, if u is close to 0, then te is very small compared to ta , so
cd ≈ ca .
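As a small illustration of (4.8) (an illustrative sketch, not part of the Pych library), the approximation can be coded and its limiting behaviour checked:

def cd2(u, ca2, ce2, m=1):
    """Squared coefficient of variation of inter-departure times, approximation (4.8)."""
    return 1 + (1 - u**2) * (ca2 - 1) + (u**2 / m**0.5) * (ce2 - 1)

# For m = 1: a nearly saturated machine passes on its process variability,
# an almost idle machine passes on its arrival variability.
print(cd2(0.99, ca2=2.0, ce2=0.25))   # close to ce^2 = 0.25
print(cd2(0.01, ca2=2.0, ce2=0.25))   # close to ca^2 = 2.0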
In serial production lines, all departures from workstation i are arrivals to the next workstation
i + 1, so
ta (i + 1) = td (i), ca (i + 1) = cd (i),
where ta (j ), ca (j ), td (j ), cd (j ) are the descriptors of arrivals and departures at workstation j, see
Figure 4.26.
[Figure 4.26: workstations i and i + 1 in series; the departures of workstation i form the arrivals of workstation i + 1.]
Remark 4.2.
• Successive times between departures are typically not independent. For example, an inter-
departure time following a very long one will most likely correspond to a process time
(since it is likely that during the long inter-departure time at least one job arrived, which
can immediately enter service upon the departure instant). However, the assumption of
independence is usually a reasonable approximation.
• If both inter-arrival times and process times are independent and Exponential (which implies
that ca = ce = 1), then it can be shown that inter-departure times are also independent
and Exponential (so cd = 1). The above simple approximation agrees with this property,
i.e., if ca = ce = 1, then the approximation indeed yields cd = 1.
• Arrival process: Arrivals can consist of single jobs or batches, and there may be a single flow
or multiple flows of different jobs.
• Service (or production) process: The workstation can have a single machine, or multiple
(non-) identical ones. Processing discipline can be first-come first-served (FCFS), last-come
first-served (LCFS), shortest process time first (SPTF), random, and so on.
[Figure: workstation consisting of a buffer with room for at most b jobs and m parallel machines; arrivals with ta, ca and rate ra = 1/ta, process times with te, ce and capacity re = m/te.]
• Queue (or buffer): There may be ample (unlimited) queue space for jobs, or limited (or
even no) queue space.
• δ, throughput of station.
where u is the machine utilization. In case queueing space is unlimited (b = ∞), we have δ = ra
(since outflow is then equal to inflow) and the utilization u is given by
u = ra/re = ra te/m = δ te/m.
In the next section we start with the zero-buffer model.
[Figure: zero-buffer line consisting of machine G with mean process time ta followed by machine M with mean process time te.]
The first machine is G (i.e., the generator of arrivals), and the second one M. Machine G is
never starved (i.e., there is always raw material available). If machine G completes a job, and
M is still busy, then G blocks and has to wait till M finishes the job, before it can move the job
to M and start processing the next one. Machine M never blocks (i.e., it can always get rid
of the job), but may have to wait for input from G. The mean process time of G is ta , and
the mean process time of M is te . What is the throughput δ? The throughput can be easily
estimated with the following Pych model (the complete code is in Appendix B).
def model():
    ta = 1.0
    te = 1.0
    n = 1000
    env = Environment()
    a = Channel(env)
    b = Channel(env)
    env.run()
In this model we specified constant process times with mean 1. Process E is only counting
the number of completed jobs (and once n = 1000 jobs are completed, the program stops
env = Environment()
a = Channel(env)
b = Channel(env)
c = Channel(env)
env.run()
For a buffer of size N = 10, the estimated throughput is already around 0.93. In the next
section we study the finite buffer model with Exponential process times, and derive an exact
formula for the throughput as a function of the buffer size.
Remark 4.3. For zero buffer and Exponential process times the throughput can be exactly
determined. Suppose A and B are Exponential with rate λ1 and λ2 . Then
P(max{A, B} ≤ t) = P(A ≤ t) P(B ≤ t) = (1 − e^{−λ1 t})(1 − e^{−λ2 t}) = 1 − e^{−λ1 t} − e^{−λ2 t} + e^{−(λ1+λ2)t}.
Hence, the density of max{A, B} is
f(t) = d/dt F(t) = λ1 e^{−λ1 t} + λ2 e^{−λ2 t} − (λ1 + λ2) e^{−(λ1+λ2)t},
so
E [max{A, B}] = ∫_{t=0}^{∞} t f(t) dt = (λ1² + λ1λ2 + λ2²)/(λ1λ2(λ1 + λ2)),
and thus we get for the throughput
δ = 1/E [max{A, B}] = λ1λ2(λ1 + λ2)/(λ1² + λ1λ2 + λ2²).
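A quick numerical check of this remark (an illustrative sketch): for λ1 = λ2 = 1 the formula gives δ = 2/3, and estimating E[max{A, B}] by simulation gives the same value.

import random

random.seed(0)
lam1 = lam2 = 1.0
n = 200_000
# In the zero-buffer line, the time between consecutive job transfers from G to M equals
# max(A, B), with A and B the exponential process times of G and M.
mean_max = sum(max(random.expovariate(lam1), random.expovariate(lam2)) for _ in range(n)) / n
exact = lam1 * lam2 * (lam1 + lam2) / (lam1**2 + lam1 * lam2 + lam2**2)
print("formula:", exact, "1 / E[max(A, B)] by simulation:", 1 / mean_max)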
4.10 Finite-buffer model
[Figure: transition diagram of the finite-buffer model with states 0, 1, 2, . . . , b and service rate µ.]
So the lesson is that the only way to reduce WIP without sacrificing too much throughput is:
variability reduction!
Example 4.8. Consider the two machine line with 1/λ = 21 minutes and 1/μ = 20 minutes
(see Section 8.7.1 in [2]). Then u = 20/21 = 0.9524. For b = ∞, we get
w = 20 jobs, δ = 0.0476 jobs per minute, ϕ = 420.14 minutes
and for b = 4 (so two buffer places in between G and M),
w = 1.894 jobs, δ = 0.039 jobs per minute, ϕ = 48.57 minutes
This shows that limiting the buffer space in between the two machines greatly reduces the WIP
and flow time, but at the price of also reducing the throughput, the cost of which may not be
covered by the savings in inventory cost.
[Figure: transition diagram of the single machine station with arrival rate λ and service rate µ.]
which is assumed to be less than 1 (since otherwise the machine can not handle the work).
Let pn be the probability (or long-run fraction of time) of finding n jobs in the system, and in
the same way as in the previous section, these probabilities can be determined through balance
equations stating that, in equilibrium, the number of transitions per unit time from state n to
n − 1 is equal to the number from n − 1 to n. So (see Figure 4.30)
pn μ = pn−1 λ, n = 1, 2, . . .
whence
pn = pn−1 u = · · · = p0 uⁿ = (1 − u)uⁿ, n = 0, 1, 2, . . .
since p0 = 1 − u. Hence, the number of jobs in system is Geometric with parameter u. For the
average WIP we get (see also (4.10))
w = Σ_{n=0}^{∞} n pn = Σ_{n=0}^{∞} n(1 − u)uⁿ = u/(1 − u),
Figure 4.31: Mean waiting time ϕB as function of utilization u for ca = 1 and ce = 0 (black),
1 (blue), 2 (red)
env = Environment()
a = Channel(env)
b = Channel(env)
c = Channel(env)
env.run(until=E)
In this model we take Uniform inter-arrival times on (0, 2) and Uniform process times on (0, 1).
Based on 10⁵ departures, simulation produces an average waiting time of approximately 0.15.
Since ta = 1, c²a = 1/3, te = 1/2, c²e = 1/3, γ = 1/3 and u = 1/2, approximation (4.12) yields ϕB = 1/6.
Another example is Constant inter-arrival times of 3/5, and Exponential process times with mean
1/2. Then we get ta = 3/5, c²a = 0, te = 1/2, c²e = 1, γ = 1/2 and u = 5/6, so ϕB = 5/4, where, based on
10⁵ departures, the simulation estimate is 1.11. Hence, both examples show that approximation
(4.12) is reasonably accurate.
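The two numerical checks above are easy to reproduce. The following sketch (plain Python) assumes that (4.12) has the form ϕB = γ · u/(1 − u) · te with γ = (c²a + c²e)/2, which reproduces both numbers above:

def phi_B_gg1(ta, ca2, te, ce2):
    """Mean waiting time in the buffer of a single machine station,
    assuming (4.12) reads gamma * u/(1-u) * te with gamma = (ca^2 + ce^2)/2."""
    u = te / ta
    gamma = (ca2 + ce2) / 2
    return gamma * u / (1 - u) * te

# Uniform(0, 2) inter-arrival times and Uniform(0, 1) process times -> 1/6
print(phi_B_gg1(ta=1.0, ca2=1/3, te=0.5, ce2=1/3))
# Constant inter-arrival times 3/5 and Exponential process times with mean 1/2 -> 5/4
print(phi_B_gg1(ta=3/5, ca2=0.0, te=0.5, ce2=1.0))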
By applying Little’s law we can also obtain an approximation for the mean WIP in the buffer,
wB(G/G/1) = γ u²/(1 − u),
and, by adding the mean number of jobs in process,
w(G/G/1) = γ u²/(1 − u) + u.
The above expression for w provides an estimate for the long-run average WIP in the station,
but it does not tell anything about how the WIP behaves over time. In Figure 4.32 we show
realizations of the WIP over time for an Exponential single machine system.
Figure 4.32: Behavior of WIP over time of Exponential single machine system, ta = 1.0,
te = 0.5 (left), 0.9 (middle), 0.95 (right)
The realization for u = 0.95 shows that, although the long-run average WIP is 19 and there is
surplus capacity of 5%, the WIP is very large for very long periods! In practice, such “disasters”
will not be observed, since in situations with extremely high WIP levels, one will typically try to
get additional processing capacity to clear the high WIP.
Exercise 61. A miniload is a compact storage system for totes, with an automated crane
that can quickly move horizontally and vertically to retrieve totes from the rack for an order
picker at the entrance of the aisle (see Figure 4.33). The horizontal speed of the crane is
2.5 meters per second and the vertical speed is 0.4 meters per second. The crane can move
simultaneously in horizontal and vertical direction. The rack with totes is schematically displayed
in Figure 4.34. The length of the rack is 25 meters and its height is 4 meters. The horizontal
location of the retrieved tote is Uniform between 0 and 25 meters, and independent of its vertical
location. The vertical location is Uniform between 0 and 4 meters. The crane always starts at
the lower left corner, at the entrance of the aisle. The random variable X is the travel time of
the crane to the tote in horizontal direction, and Y is the travel time in vertical direction.
[Figure 4.34: the rack of 25 m long and 4 m high; the crane starts at the lower left corner, at the entrance of the aisle.]
2. Since the crane can move simultaneously in horizontal and vertical direction, the travel
time to the tote is Z = max(X, Y). Show that the distribution function F(t) = P(Z ≤ t) of
the travel time is given by
F(t) = P(Z ≤ t) = 1 for t > 10,  F(t) = t²/100 for 0 < t ≤ 10,  and F(t) = 0 for t ≤ 0.
Poisson stream with a rate of 100 orders per hour. The picker processes the orders in order of
arrival. The locations of the two requested items are independent and Uniform on the circle.
The pick time of one item is 5 sec. Let the random variable R be the retrieval time of a
pick order, that is, the rotation time of the carousel to the two items plus the pick time of the
two items. Hence R can be written as R = max(U1 , U2 ) + 10 sec, where U1 and U2 are the
Uniform locations of item 1 and 2 on (0, 30).
1. Calculate the probability that during 1 minute at least two pick orders arrive.
2. Show that
P(max(U1 , U2) ≤ x) = x²/900, 0 ≤ x ≤ 30.
3. Argue that the density of R is given by
f(t) = (t − 10)/450 for 10 < t < 40, and f(t) = 0 elsewhere.
4. Calculate the probability that the retrieval time R is greater than 30 seconds.
5. Determine the mean te , variance σ2e and squared coefficient of variation c2e of the retrieval
time R.
6. What is the utilization of the picker?
7. Determine the mean flow time of an order (waiting time plus retrieval time).
Exercise 63. Milling tools are stored in a rack of 12 meters (see Figure 4.37), where requests
for tool sets arrive according to a Poisson stream with a rate of ra requests per hour. Tool
sets consist of two tools. These two tools can be located at any position in the rack (and
independently from each other). Requests for tool sets are handled in order of arrival by an
operator. The operator uses a cart to go along the rack to retrieve the tools, starting from the
left hand side of the rack. When both tools have been collected, he returns to the beginning
of the rack (at the left hand side) to deliver the tools. His walking speed is v = 0.5 meter per
second. The retrieval time of a tool set is denoted by the random variable R which consists of
the walking time of the operator plus the pick time of the two tools. So R can be expressed
as R = W + P1 + P2 where W is the walking time and Pi is the pick time of tool i, i = 1, 2.
The pick times are independent. The mean pick time (per tool) is 10 seconds and the standard
deviation is 4 seconds. The locations of the two tools are indicated by Ui , i = 1, 2. The
locations Ui are independent and Uniform on (0, 12) meters.
[Figure 4.37: rack of 12 meters with tool locations U1 and U2; requests arrive at rate ra.]
1. Argue that the walking time W can be expressed as W = 2 max{U1 , U2 }/v seconds.
Exercise 64. Products are stored in a long aisle of 20 m and are collected by an automated
shuttle (see Figure 4.38). Orders arrive according to a Poisson process with a rate of λ = 100
orders per hour. Each order requests exactly one product. To collect an order, the shuttle starts
at the entrance of the aisle and drives with a speed of v = 0.8 m/s to the location of the
product. The location is random. Since the aisle is long and the number of products large, the
location can be well approximated by a continuous Uniform distribution on [0, 20] m. The
time to pick a product is (exactly) P = 5 s. After picking, the shuttle drives back to the entrance
and delivers the product (and will start to pick the next order if there is one).
1. Let D be the distance from the entrance to the location of the product. Calculate the
mean and standard deviation of D.
[Figure 4.38: aisle of 20 m with the shuttle at the entrance, where orders arrive.]
2. The time to collect an order is the sum of the driving time (back and forth) and the pick
time of the product. Argue that the time to collect an order can be expressed as P + 2D/v
and calculate its mean and standard deviation.
3. Calculate the capacity of the shuttle, that is, the maximum arrival rate of orders λmax that
can be handled by the shuttle.
4. Calculate the utilization of the shuttle.
5. Calculate the mean order throughput time. This is the mean time that elapses from the
arrival of an order till it has been collected by the shuttle.
To reduce order throughput times and increase the capacity of the shuttle, more frequently
requested products are relocated near the entrance of the aisle. After relocating the products,
the new density f (x) of the location of a requested product is given by
x
f (x) = ce− 10 , 0 ≤ x ≤ 20,
where c is a normalizing constant.
4.12 Multi machine station

[Figure: workstation with an unlimited buffer and m parallel identical machines; jobs arrive with rate ra = λ (ta = 1/λ) and are processed with rate µ (te = 1/µ), so the capacity is re = mµ.]
So we consider a workstation with unlimited buffer space and m parallel identical machines. Jobs
arrive one at a time. The inter-arrival times are Exponential with rate ra = λ. Processing is in
order of arrival and jobs are processed one at a time. The process times are Exponential with
rate μ. Hence the capacity of the station is re = mμ and the utilization of each machine is
u = ra/re = λ/(mμ),
which is assumed to be less than 1.
[Figure: transition diagram of the multi machine station with arrival rate λ and service rates µ, 2µ, . . . , mµ (and mµ for states above m).]
Let pn be the probability (or long-run fraction of time) of finding n jobs in the system. As we
have seen before, these probabilities can be determined through balance equations stating that,
in equilibrium, the number of transitions per unit time from state n to n − 1 is equal to the
number from n − 1 to n, yielding (see Figure 4.30)
pn nμ = pn−1 λ n ≤ m,
and
pn mμ = pn−1 λ n > m.
Hence, for n ≤ m we obtain
pn = (λ/(nμ)) pn−1 = (1/(n(n − 1))) (λ/μ)² pn−2 = · · · = (1/n!) (λ/μ)ⁿ p0,
or, with u = λ/(mμ),
pn = ((mu)ⁿ/n!) p0, n ≤ m. (4.13)
For n > m we get
pn = (λ/(mμ)) pn−1 = (λ/(mμ))² pn−2 = · · · = (λ/(mμ))^{n−m} pm,
or
pn = pm u^{n−m} = ((mu)^m/m!) u^{n−m} p0, n > m. (4.14)
Finally, the probability p0 follows from the requirement that the probabilities pn have to add up
to 1,
Σ_{n=0}^{∞} pn = 1,
so
1/p0 = Σ_{n=0}^{m−1} (mu)ⁿ/n! + ((mu)^m/m!) · 1/(1 − u).
From (4.13) we obtain for n = m that
pm = ((mu)^m/m!) p0 = ((mu)^m/m!) / ( Σ_{n=0}^{m−1} (mu)ⁿ/n! + ((mu)^m/m!) · 1/(1 − u) ). (4.15)
An important quantity is the probability Q that all machines are busy. In case of a single
machine (m = 1) we immediately have Q = u, but for multiple machines this is more involved.
From (4.14) and (4.15) we get
Q = Σ_{n=m}^{∞} pn = Σ_{n=m}^{∞} pm u^{n−m} = pm/(1 − u)
= ((mu)^m/m!) / ( (1 − u) Σ_{n=0}^{m−1} (mu)ⁿ/n! + (mu)^m/m! ). (4.16)
For the calculation of Q there is a simple and reasonable approximation available, namely
Q = u^{√(2(m+1)) − 1}. (4.17)
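The exact expression (4.16) and the approximation (4.17) are easy to compare numerically; the sketch below (illustrative Python) does so for a few values of m and u:

from math import factorial, sqrt

def Q_exact(u, m):
    """Probability that all m machines are busy, formula (4.16)."""
    num = (m * u) ** m / factorial(m)
    den = (1 - u) * sum((m * u) ** n / factorial(n) for n in range(m)) + num
    return num / den

def Q_approx(u, m):
    """Simple approximation (4.17)."""
    return u ** (sqrt(2 * (m + 1)) - 1)

for m in (1, 2, 5):
    for u in (0.5, 0.9):
        print(m, u, round(Q_exact(u, m), 3), round(Q_approx(u, m), 3))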
Based on the expressions for the probabilities pn we can now determine the average WIP in
the buffer,
wB = Σ_{n=m}^{∞} (n − m) pn = pm u/(1 − u)² = Qu/(1 − u),
and by Little’s law,
ϕB = wB/λ = Q/(1 − u) · te/m,
from which the mean flow time and average total WIP directly follow,
ϕ = ϕB + te = Q/(1 − u) · te/m + te,
w = λϕ = Qu/(1 − u) + mu.
[Figure: workstation with an unlimited buffer and m parallel machines; arrivals with ta, ca (rate ra = 1/ta) and process times te, ce (capacity re = m/te).]

For such a general G/G/m workstation the mean waiting time in the buffer can be approximated by
ϕB(G/G/m) = γ · Q/(1 − u) · te/m, (4.18)
with γ = (c²a + c²e)/2.
The above expression (4.18) separates ϕB into three terms V × U × T: a dimensionless Variability
term γ, a dimensionless Utilization term Q/(1 − u), and the process Time term te/m. In (4.18) we can
use for Q either (4.16) (which is exact for the Exponential model) or the simple approximation
(4.17).
Example 4.9. We consider a workstation with m = 2 machines. Inter-arrival times are Exponential
with rate ra = 9 (and c²a = 1) and the process times are Uniform on (0, 2/5), so te = 1/5
and c²e = 1/3. Then u = ra te/m = 9/10. Hence γ = ½(1 + 1/3) = 2/3, Q ≈ 0.9^{√6 − 1} = 0.86 (the exact
value is Q = 81/95 = 0.85), so the mean waiting time can be approximated by
ϕB = γ · Q/(1 − u) · te/2 = (2/3) · 0.86 = 0.58.
The quality of this approximation can be evaluated by the following Pych model (see Appendix
E).
def model():
    m = 2
    n = 1000000
    env = Environment()
    a = Channel(env)
    b = Channel(env)
    c = Channel(env)
    G = Generator(env, a, lambda: random.exponential(0.111))
    B = Buffer(env, a, b)
    Ms = [ Machine(env, b, c, lambda: random.uniform(0.0, 0.4)) for j in range(m) ]
    E = Exit(env, c, n)
    env.run(until=E)
Based on 10⁶ departures, the simulation estimate is 0.58. Hence, in this example, approximation
(4.18) is quite accurate.
From the approximation (4.18) for the mean waiting time in the buffer we get, by Little’s law,
the following approximation for the average WIP in the buffer,
wB(G/G/m) = δϕB = γ Qu/(1 − u),
where δ = ra is the throughput of the workstation. For the mean flow time and the average
total WIP we obtain the approximations
ϕ = ϕB + te = γ · Q/(1 − u) · te/m + te,
w = δϕ = γ Qu/(1 − u) + mu.
Note that for m = 1 these approximations coincide with the approximations for the single
machine model, presented in the previous section.
Carousels are mostly used for storage and retrieval of small and medium sized goods. The
picker has a fixed position in front of the carousel, which rotates the required items to the
picker. The carousel loop can be schematically represented by a circle of length 10 meter and
it rotates counter-clockwise with speed 0.5 m/sec, see Figure 4.43. Pick orders for one item
arrive according to a Poisson stream with a rate of 4 orders per min. The picker processes the
orders in order of arrival. The location of a requested item is Uniform on the circle and the
pick time of an item is 4 sec.
1. Let R be the retrieval time of an item, that is, the rotation time of the carousel plus the pick
time. Hence R can be written as R = U/0.5 + 4 sec, where U is the Uniform location of the
requested item on (0, 10). Argue that the density of R is given by
f(t) = 1/20 for 4 < t < 24, and f(t) = 0 elsewhere.
2. Determine the mean te and squared coefficient of variation c2e of the retrieval time R.
3. What is the utilization of the picker?
4. Determine the mean flow time of an order (waiting time plus retrieval time).
Two options are considered to reduce the mean flow time: (i) Installing a second identical
carousel parallel to the existing one, and (ii) Replacing the existing carousel by a bi-directional
carousel (rotating at the same speed as the existing one). The bi-directional carousel always
rotates along the shortest route to the requested item.
5. Estimate the mean flow time for option (i), thus two parallel identical (uni-directional)
carousels both processing the orders in order of arrival.
6. Determine the mean flow time for option (ii).
4.13 Serial production lines

[Figure: serial production line with process time variabilities ce(i) and ce(i + 1) at consecutive workstations.]
For stability we assume that each workstation can handle all work that is offered to that stations,
so
u(i) = te(i)/(m(i) ta(i)) < 1, i = 1, . . . , n.
Conservation of flow implies that the flow out of station i is equal to the flow into station i,
td (i) = ta (i)
and also that the flow into station i + 1 is equal to the flow out of station i,
ta (i + 1) = td (i).
The above relations link the flow rates in the serial production lines. The following relations link
the variability of the flows through the production line and, in particular, the relations describe
how variability propagates through the production line. For the variability of the output of
station i we have as approximation (see (4.8))
c²d(i) = 1 + (1 − u²(i))(c²a(i) − 1) + (u²(i)/√m(i)) (c²e(i) − 1),
and since departures from workstation i are arrivals to workstation i + 1,
ca (i + 1) = cd (i).
The mean waiting time in each workstation i can be estimated by (see (4.18))
ϕB(i) = γ(i) · Q(i)/(1 − u(i)) · te(i)/m(i).
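These recursions can be put together into a small helper that walks down the line, propagating the utilization and the arrival variability from station to station (an illustrative sketch; Q is taken from approximation (4.17) and γ(i) = (c²a(i) + c²e(i))/2, as in the examples):

from math import sqrt

def serial_line_flow_times(ra, ca2, stations):
    """stations: list of (m, te, ce2) per workstation, in line order.
    Returns the estimated mean flow time per station."""
    phis = []
    for m, te, ce2 in stations:
        u = ra * te / m
        gamma = (ca2 + ce2) / 2
        Q = u ** (sqrt(2 * (m + 1)) - 1)          # approximation (4.17)
        phi_B = gamma * Q / (1 - u) * te / m      # waiting time, cf. (4.18)
        phis.append(phi_B + te)
        # departure variability (4.8) becomes the next station's arrival variability
        ca2 = 1 + (1 - u**2) * (ca2 - 1) + (u**2 / sqrt(m)) * (ce2 - 1)
    return phis

# The two machine line of Example 4.10 below: Poisson arrivals with rate 2,
# Constant process times te(1) = 1/3 and Uniform process times with te(2) = 2/5.
phis = serial_line_flow_times(ra=2.0, ca2=1.0, stations=[(1, 1/3, 0.0), (1, 2/5, 1/3)])
print(phis, sum(phis))   # roughly 0.67 + 1.11 = 1.78, close to the simulated 1.86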
Example 4.10. We consider a production line with two machines, see Figure 4.45.
Machine 1 has Poisson inflow with rate ra (1) = 2 (and ca (1) = 1). The process times on
Machine 1 are Constant with te(1) = 1/3 and on Machine 2 the process times are Uniform with
te(2) = 2/5. So c²e(1) = 0 and c²e(2) = 1/3. What is the mean total flow time of jobs? For
Machine 1 we have u(1) = ra(1)te(1) = 2/3, γ(1) = ½(1 + 0) = ½, and thus
ϕ(1) = γ(1) · u(1)/(1 − u(1)) · te(1) + te(1) = 2/3.
Figure 4.45: Production line with two machines
def model():
    env = Environment()
    a = [ Channel(env) for i in range(3) ]
    b = [ Channel(env) for i in range(2) ]
    G = Generator(env, a[0], lambda: random.exponential(0.5))
    B1 = Buffer(env, a[0], b[0])
    M1 = Machine(env, b[0], a[1], lambda: 0.33)
    B2 = Buffer(env, a[1], b[1])
    M2 = Machine(env, b[1], a[2], lambda: random.uniform(0.0, 0.8))
    E = Exit(env, a[2], n)
    env.run(until=E)
    return E.value
The Pych model estimates ϕ = 1.86, so we can conclude that the analytical approximation is
reasonably accurate.
Example 4.11. Let us consider the previous example and reverse the two machines: So first
Machine 2 and then Machine 1, see Figure 4.46. Does mean total flow time increase or
decrease by reversing the two machines?
The mean total flow time for the reversed system is equal to ϕ = 1.99, which is greater than
for the original configuration. Why? The reason is that variability propagates through the line! The
process variability c2e (2) on Machine 2 is higher than c2e (1) on Machine 1, so it is better to
locate Machine 2 toward the end of the production line.
Figure 4.46: Production line with Machine 2 first
Exercise 66. At the genetics department of an academic hospital DNA material is analysed. The analysis is sequencing, which tries to identify whether or not certain mutations are present. The analysis is performed by a series of robots and (essentially) consists of two steps, each of which is done in batches. The first step is preprocessing of DNA material and is performed by the machine Pre (see Figure 4.47). The second machine MPS carries out the DNA analysis by using the method Massive Parallel Sequencing. Samples of DNA material arrive according to a Poisson process with a rate of 4 samples per hour. The preprocessing step on machine Pre is done in batches of 100 samples. The mean time to preprocess a batch is te(1) = 20 hours and the standard deviation is σe(1) = 10 hours. The analysis of a batch of 100 samples on machine MPS takes on average te(2) = 23 hours with a standard deviation of σe(2) = 10 hours. The total flow time of a sample is the time elapsing from arrival till the moment the sample has been analysed. This flow time is a key performance indicator.
Exercise 67. Parts arrive according to a Poisson stream with a rate of 10 parts/hour and they need to be processed on machine M, see Figure 4.48. Before processing, parts are tested (in order of arrival) on test facility T. It takes an Exponential time to test a single part. The mean test time is 5 min. With probability 1/10 a part is scrapped due to quality issues. Parts that successfully pass the test (with probability 9/10) proceed to machine M for processing. The processing time of a part on machine M takes on average t0 = 6 minutes with a standard deviation of 3 minutes. However, a critical component of machine M is subject to random
failures: when this component fails, the machine is down. The mean time to failure is mf = 10
hours, and the repair time is exactly mr = 30 minutes.
1. Calculate the utilization u(T) of the test facility.
2. Calculate the mean flow time ϕ(T) of a part at test facility T (waiting time plus test
time).
3. Calculate the arrival rate ra (M) and squared coefficient of variation c2a (M) of the inflow to
machine M, and argue that the inflow to machine M is Poisson.
4. Calculate the availability A = mf / (mf + mr ) of machine M.
5. Calculate the mean effective processing time te and squared coefficient of variation c2e of
machine M.
6. Calculate the utilization u(M) of machine M and estimate the mean flow time ϕ(M) of a
part at machine M (waiting time plus processing time).
To improve the availability of machine M and to reduce flow times, it is decided to keep some
spare components for the critical component of machine M. So, when the critical component of
machine M fails, it can be immediately replaced by a spare one in negligible time, and thus it
can be assumed that the effect of spare components is 100% availability of machine M.
7. Calculate te and c2e of machine M.
8. Estimate the mean flow time ϕ(M) of a part at machine M.
Exercise 68. (Exercise 3.7.1 [1]) We have a manufacturing line with three workstations,
see Figure 4.49. From measurements done, we have the following data available for each
workstation:
We assume that lots arrive at workstation 1 with an inter-arrival time that is distributed according
to an exponential distribution with mean 4.0 minutes.
Exercise 69. (Exercise 3.7.2 [1]) Consider a manufacturing line consisting of three
workstations. Each workstation consists of an infinite buffer and a single machine. In addition
we have the following data:
• Lots arrive according to a distribution with mean ta and squared coefficient of variation
c2a = 2.0.
• Workstations 1,2, and 3 have a mean process time of t0,1 = 1.0 hr, t0,2 = 4.0 hr, and
t0,3 = 1.0 hr respectively and a squared coefficient of variation c20,1 = 4.0, c20,2 = 1.0, and
c20,3 = 1.0.
(Figure: Alternative 1 and Alternative 2, each a line of three workstations W1, W2, W3.)
Exercise 70. Part of an assembly line for car roof systems is shown in Figure 4.52. The
car roof systems produced on this line consist of two glass panels, a front and rear glass panel.
Frames arrive at machine M1 for assembly of the front glass panel with a rate of ra = 25 frames per hour. The squared coefficient of variation of the inter-arrival times is c²a = 1/10. The assembly time of the front glass panel takes exactly t0(1) = 2 minutes. However, machine M1
is subject to random breakdowns. The mean time to failure is mf (1) = 2 hours, and the repair
time is exactly mr (1) = 10 minutes. After assembly at machine M1 the frame is transferred to
the buffer in front of machine M2 by a robot. This transfer time of the robot is exactly tr = 1
minute. Machine M2 attaches the rear glass panel to the frame. The assembly time of the
rear glass panel takes exactly t0 (2) = 2 minutes. For this machine the mean time to failure is
mf (2) = 8 hours, and the repair time is exactly mr (2) = 40 minutes.
1. Calculate the availability A(i) = mf (i)/ (mf (i) + mr (i)) of machine Mi for i = 1, 2.
(Figure 4.52: part of the assembly line, with machine M1, the transfer robot, and machine M2.)
2. Calculate the mean effective assembly time te (i) and squared coefficient of variation c2e (i)
of machine Mi for i = 1, 2.
3. Calculate the utilization u(1) and u(2) of both machines.
4. Estimate the mean flow time ϕ(1) of a frame at machine M1 (waiting time plus assembly
time).
5. Determine the rate rd (1) at which frames depart from M1 and estimate the squared
coefficient of variation c2d (1) of inter-departure times from M1 .
6. Estimate the mean flow time ϕ(2) of a frame at machine M2 (waiting time plus assembly
time).
7. Estimate the mean total flow time (which is the time elapsing from arrival of a frame at
M1 till departure from M2 ).
4.14 Batching
In manufacturing systems we can distinguish between two kinds of batching.
• Process batch: Parts are processed together on a workstation. There are two types of
process batches:
– Simultaneous:
Parts are produced simultaneously in a (true) batch workstation (e.g., in a furnace).
– Sequential:
Parts from a common part family are produced sequentially before the workstation is
changed-over to another part family.
• Transfer batch: Parts are moved together from one station to another. Note that:
– The smaller the transfer batch, the less time parts have to wait to form the batch.
– The smaller the transfer batch, the more material handling is needed.
Below we investigate through examples the effect of batching on the manufacturing system
performance.
Example 4.12. (Simultaneous batching interactions)
Parts arrive one by one at a machine with rate ra , see Figure 4.53. The coefficient of variation
of the inter-arrival times is ca. Parts are processed by the machine in batches of size k.
Figure 4.53: Simultaneous batching machine
The mean batch process time on the machine is te, and ce is the coefficient of variation. What is the mean flow time of a part? The part flow time can be decomposed into the wait-to-batch time plus the batch flow time,
ϕ(part) = ϕ(w2b) + ϕ(batch).
The wait-to-batch time is the waiting time of a part for the process batch to form. So the first
part has to wait for the remaining k − 1 parts to arrive, the second one for the remaining k − 2
parts, and so on (see Figure 4.54). The wait-to-batch time for the last, kth part, is equal to
0. Hence, the average wait-to-batch time is equal to
ϕ(w2b) = (1/k)((k − 1)ta + (k − 2)ta + · · · + ta + 0)
= (ta/k) · k(k − 1 + 0)/2
= (k − 1)ta/2
= (k − 1)/(2ra).
The batch arrival rate is ra(batch) = ra/k. The batch process time consists of a setup time plus k individual part process times, so te = ts + k tp and σ²e = σ²s + k σ²p, and hence
c²e = σ²e/te² = (σ²s + k σ²p)/(ts + k tp)²,
u = ra(batch) te = (ra/k) te = ra(ts/k + tp).
Note that the requirement u < 1 implies that there is a minimal feasible batch size: k > ra ts/(1 − ra tp).
Furthermore, from the expression for u, we can conclude that the batch size k affects the machine utilization: a larger batch size k leads to a lower utilization. The mean part flow time is the average wait-to-batch time plus the mean batch flow time, so
ϕ(part) = (k − 1)/(2ra) + ½(c²a/k + c²e) · u/(1 − u) · te + te.
Apparently, there is a batch size trade-off: A larger batch size k leads to a lower machine
utilization, but also to a higher wait-to-batch and wait-in-batch time. This trade-off is illustrated
in Figure 4.55 for the parameter setting ra = 0.4, ca = 1, ts = 5, σ²s = 6 1/4, tp = 1 and σ²p = 1/4.
In this example the optimal batch size is k = 5.
Figure 4.55: Mean part flow time as a function of the batch size k for the parameter setting ra = 0.4, ca = 1, ts = 5, σ²s = 6 1/4, tp = 1 and σ²p = 1/4
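The trade-off in Figure 4.55 can be reproduced with a few lines of plain Python. The sketch below evaluates the flow-time expression assembled above, with the batch-arrival SCV taken as c²a/k (as in the job-splitting expression further on); the function name is illustrative, and the default parameters are those of the figure.
def part_flow_time(k, ra=0.4, ca2=1.0, ts=5.0, ss2=6.25, tp=1.0, sp2=0.25):
    te = ts + k * tp                                # mean batch process time
    ce2 = (ss2 + k * sp2) / te**2                   # SCV of the batch process time
    u = ra * te / k                                 # utilization of the batch machine
    if u >= 1:
        return float("inf")                         # infeasible batch size
    w2b = (k - 1) / (2 * ra)                        # average wait-to-batch time
    wq = 0.5 * (ca2 / k + ce2) * u / (1 - u) * te   # mean waiting time of a batch
    return w2b + wq + te

best = min(range(4, 16), key=part_flow_time)
print(best, part_flow_time(best))                   # k = 5, mean part flow time about 20.5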
individually after processing. Then the mean effective process time of a part is
te(part) = ts + (1/k)(tp + 2tp + · · · + k tp)
= ts + (tp/k)(1 + 2 + · · · + k)
= ts + (tp/k) · k(1 + k)/2
= ts + (k + 1)tp/2.
Hence, the mean flow time of a part with job splitting is
ϕ(part) = (k − 1)/(2ra) + ½(c²a/k + (σ²s + k σ²p)/(ts + k tp)²) · (ra(ts/k + tp))/(1 − ra(ts/k + tp)) · (ts + k tp) + ts + (k + 1)tp/2.
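As a quick numerical check of this expression, a plain-Python helper along the same lines as above can be used (the function name is illustrative, not part of the notes):
def part_flow_time_split(k, ra, ca2, ts, ss2, tp, sp2):
    te = ts + k * tp                                # mean batch process time
    ce2 = (ss2 + k * sp2) / te**2                   # SCV of the batch process time
    u = ra * (ts / k + tp)                          # utilization
    if u >= 1:
        return float("inf")
    w2b = (k - 1) / (2 * ra)                        # wait-to-batch time
    wq = 0.5 * (ca2 / k + ce2) * u / (1 - u) * te   # waiting time of a batch
    return w2b + wq + ts + (k + 1) * tp / 2         # parts leave after their own processing
Compared with the variant without job splitting, the last term drops from ts + k tp to ts + (k + 1)tp/2, so the mean part flow time is lower by (k − 1)tp/2.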
We consider a production line with two machines. Single parts arrive with rate ra at Machine
1. After processing on Machine 1, parts are transferred in batches of size k to Machine 2, the
transfer time is 0 (negligible). Machine i (i = 1, 2) processes parts one by one, with mean te (i)
and coefficient of variation ce(i). How does the mean flow time of a part depend on the batch size k?
Exercise 71. (Exercise 3.7.3 [1]) We have a batch machine that processes batches of
fixed size k, see Figure 4.57. Lots arrive with mean inter-arrival time ta,l = 4.0 hours and
squared coefficient of variation c2a,l = 1.0. The process time for a batch is proportional with the
batch size: t0,b = k · t0 , the squared coefficient of variation is inversely proportional with the batch
size: c20,b = c20 /k. For this machine we have t0 = 3.0 hours and c20 = 0.50.
1. Express the mean total flow time for this batch machine in batch size k.
2. For what value of k is the flow time minimal? (Do not forget to assure a stable system.)
Exercise 72. (Exercise 3.7.4 [1]) We have three identical batch machines in parallel, see
Figure 4.58. The batch machines have a single buffer in which batches are formed. The
machines process batches of a fixed size k = 4 lots. Lots arrive at the buffer with ta,l = 0.50
hours and c2a,l = 1.0. The batch process time is characterized by t0,b = 5.0 hours and c20,b = 0.5.
Exercise 73. Consider the two-stage production line in Figure 4.59. The first processing stage
is done by machine M, the second stage by furnace F. Parts arrive one by one according to
a Poisson stream with a rate of ra = 20 parts per hour. The processing time on machine M is
exactly t0 = 2 minutes. Machine M is subject to random breakdowns. The mean time to failure
is mf = 2 hours, and the repair time is exactly mr = 10 minutes. After processing at machine
M the part is transferred to the buffer in front of the furnace. The furnace is processing parts
in batches of size 10. The mean processing time of a batch at the furnace is exactly tf = 20
minutes.
Figure 4.59: Two-stage production line consisting of machine M and batch furnace F
Exercise 74. Consider the two-stage semi-conductor assembly line in Figure 4.60. The first
processing stage is die-bonding done by machine D, the second stage is wire-bonding done by
machine W. Production batches arrive one by one according to a Poisson stream with a rate of
ra = 2 batches per hour. The mean processing time of a production batch on the die-bonding
machine D is td = 20 minutes with a standard deviation of σd = 20 minutes. Each batch
requires a setup time: this is the time required to install the right wafer on the die-bonding
machine. The setup time is ts = 5 minutes with a standard deviation of σs = 15 minutes.
After die-bonding, the batch is transferred to the buffer in front of the wire-bonding machine
W. The mean processing time of a batch on the wire-bonding machine is tw = 24 minutes and
the standard deviation is σw = 36 minutes. The wire-bonding machine W is subject to random
breakdowns. The mean time to failure is mf = 2 hours, and the repair time is exactly mr = 10
minutes.
Figure 4.60: Two-stage semi-conductor assembly line consisting of die-bonding machine D and wire-bonding machine W
1. Calculate the mean te (D) and squared coefficient of variation c2e (D) of the effective pro-
cessing time (setup plus die bonding) at machine D.
2. Calculate the utilization u(D) of machine D.
3. Calculate the mean flow time ϕ(D) of a batch at machine D (waiting time plus processing).
4. Determine the departure rate rd (D) of batches from D and estimate the squared coefficient
of variation c2d (D) of the inter-departure times from D.
5. Calculate the availability A = mf / (mf + mr ) of machine W.
6. Calculate the mean te (W) and squared coefficient of variation c2e (W) of the effective
processing time (wire bonding plus breakdowns) at machine W.
7. Calculate the utilization u(W) of the wire-bonding machine W.
8. Estimate the mean flow time ϕ(W) of a batch at the wire-bonding machine W (that is,
waiting time plus effective processing time of a batch).
9. Estimate the mean total flow time of a batch (from arrival at D till departure from W)
and the mean total WIP in the two-stage assembly line.
Bibliography
[1] I.J.B.F. Adan, A.T. Hofkamp, J.E. Rooda and J. Vervoort, Analysis of Manufacturing
Systems, 2012.
[2] W.J. Hopp and M.L. Spearman, Factory Physics, 3rd ed., McGraw-Hill, 2008.
[3] D. Morin, Probability - For the Enthusiastic Beginner, 2016.
[4] T. Lamballais, D. Roy and M.B.M. De Koster, Estimating performance in a Robotic Mobile
Fulfillment System. European Journal of Operational Research, 256, 976–990, 2017.
[5] R. Suri, Quick Response Manufacturing, Taylor&Francis Inc, 1998.
[6] H.C. Tijms, Understanding Probability, 3rd ed., Cambridge, 2012.
A
KIVA model
# =================================
# Pod definition
# =================================
@dataclass
class Pod:
id: int = 0
# =================================
# Generator definition
# =================================
@process
def Generator(env, c_out, N):
for i in range(N):
x = Pod()
send = c_out.send(x)
yield env.execute(send)
# =================================
# Storage definition
# =================================
@process
def Storage(env, c_in, c_out, la):
while True:
receive = c_in.receive()
x = yield env.execute(receive)
delay = random.exponential(1.0 / la)   # storing and retrieving a rack takes on average 1/la hours
yield env.timeout(delay)
send = c_out.send(x)
yield env.execute(send)
# =================================
# Buffer definition
# =================================
@process
def Buffer(env, c_in, c_out):
xs = [] # list of pods
while True:
receiving = c_in.receive()
sending = c_out.send(xs[0]) if len(xs)>0 else None
yield env.select(sending,receiving)
if selected(sending):
xs = xs[1:]
if selected(receiving):
x = receiving.entity
xs = xs + [x]
# =================================
# Pick definition
# =================================
@process
def Pick(env, c_in, c_out, mu, n):
for i in range(n):
receive = c_in.receive()
x = yield env.execute(receive)
delay = random.exponential(1.0 / mu)   # picking from a rack takes on average 1/mu hours
yield env.timeout(delay)
send = c_out.send(x)
yield env.execute(send)
print(f"TH = {n / env.now}")
# =================================
# KIVA Model
# =================================
def model():
N = 1 # Number of robots
la = 4.0 # average rate / hour for storing and retrieving racks
mu = 20.0 # average rate / hour for picking of racks
env = Environment()
a = Channel(env)
b = Channel(env)
c = Channel(env)
G = Generator(env, a, N)
Ss = [Storage(env, a, b, la) for j in range(N)]
B = Buffer(env, b, c)
P = Pick(env, c, a, mu, 10000)
env.run()
# =================================
# Main
# =================================
model()
B
Zero-buffer model
# =================================
# Job definition
# =================================
@dataclass
class Job:
id: int = 0
# =================================
# Generator definition
# =================================
@process
def Generator(env, c_out, u):
i = 0
while True:
x = Job(id = i)
send = c_out.send(x)
yield env.execute(send)
delay = u()
yield env.timeout(delay)
i = i+1
# =================================
# Machine definition
# =================================
@process
def Machine(env, c_in, c_out, u):
while True:
receive = c_in.receive()
x = yield env.execute(receive)
delay = u()
yield env.timeout(delay)
send = c_out.send(x)
yield env.execute(send)
# =================================
# Exit definition
# =================================
@process
def Exit(env, c_in, n):
x = Job()
while x.id < n:
receive = c_in.receive()
x = yield env.execute(receive)
print(f"TH = {x.id / env.now}")
# =================================
# GME Model
# =================================
def model():
ta = 1.0
te = 1.0
n = 1000
env = Environment()
a = Channel(env)
b = Channel(env)
G = Generator(env, a, lambda: ta)
M = Machine(env, a, b, lambda: te)
E = Exit(env, b, n)
env.run()
# =================================
# Main
# =================================
model()
C
Finite-buffer model
# =================================
# Job definition
# =================================
@dataclass
class Job:
id: int = 0
# =================================
# Generator definition
# =================================
@process
def Generator(env, c_out, u):
i = 0
while True:
x = Job(id = i)
send = c_out.send(x)
yield env.execute(send)
delay = u()
yield env.timeout(delay)
i = i+1
# =================================
# Buffer definition
# =================================
@process
def Buffer(env, c_in, c_out, N):
xs = []
while True:
receiving = c_in.receive() if len(xs)<N else None
sending = c_out.send(xs[0]) if len(xs)>0 else None
yield env.select(sending,receiving)
if selected(sending):
xs = xs[1:]
if selected(receiving):
x = receiving.entity
xs = xs + [x]
# =================================
# Machine definition
# =================================
@process
def Machine(env, c_in, c_out, u):
while True:
receive = c_in.receive()
x = yield env.execute(receive)
delay = u()
yield env.timeout(delay)
send = c_out.send(x)
yield env.execute(send)
# =================================
# Exit definition
# =================================
@process
def Exit(env, c_in, n):
x = Job()
while x.id < n:
receive = c_in.receive()
x = yield env.execute(receive)
print(f"TH = {x.id / env.now}")
# =================================
# GBME Model
# =================================
def model():
ta = 1.0
te = 1.0
n = 1000
N = 10
env = Environment()
a = Channel(env)
b = Channel(env)
c = Channel(env)
G = Generator(env, a, lambda: random.exponential(ta))
B = Buffer(env, a, b, N)
M = Machine(env, b, c, lambda: random.exponential(te))
E = Exit(env, c, n)
env.run()
# =================================
# Main
# =================================
model()
D
Single machine model
# =================================
# Job definition
# =================================
@dataclass
class Job:
entrytime: float = 0.0
# =================================
# Generator definition
# =================================
@process
def Generator(env, c_out, u):
while True:
x = Job(entrytime = env.now)
send = c_out.send(x)
yield env.execute(send)
delay = u()
yield env.timeout(delay)
# =================================
# Buffer definition
# =================================
@process
def Buffer(env, c_in, c_out):
xs = []
while True:
receiving = c_in.receive()
sending = c_out.send(xs[0]) if len(xs)>0 else None
yield env.select(sending,receiving)
if selected(sending):
xs = xs[1:]
if selected(receiving):
x = receiving.entity
xs = xs + [x]
# =================================
# Machine definition
# =================================
@process
def Machine(env, c_in, c_out, u):
while True:
receive = c_in.receive()
x = yield env.execute(receive)
send = c_out.send(x)
yield env.execute(send)
delay = u()
yield env.timeout(delay)
# =================================
# Exit definition
# =================================
@process
def Exit(env, c_in, n):
sumw = 0.0
for i in range(n):
receive = c_in.receive()
x = yield env.execute(receive)
sumw = sumw + (env.now - x.entrytime)
print(f"Mean waiting time spent in buffer = {(sumw / n):g}")
# =================================
# GBME Model
# =================================
def model():
n = 100000
env = Environment()
a = Channel(env)
b = Channel(env)
c = Channel(env)
G = Generator(env, a, lambda: random.uniform(0.0, 2.0))
B = Buffer(env, a, b)
M = Machine(env, b, c, lambda: random.uniform(0.0, 1.0))
E = Exit(env, c, n)
env.run(until=E)
# =================================
# Main
# =================================
model()
E
Multi machine model
# =================================
# Job definition
# =================================
@dataclass
class Job:
entrytime: float = 0.0
# =================================
# Generator definition
# =================================
@process
def Generator(env, c_out, u):
while True:
x = Job(entrytime = env.now)
send = c_out.send(x)
yield env.execute(send)
delay = u()
yield env.timeout(delay)
# =================================
# Buffer definition
# =================================
@process
def Buffer(env, c_in, c_out):
xs = []
while True:
receiving = c_in.receive()
sending = c_out.send(xs[0]) if len(xs)>0 else None
yield env.select(sending,receiving)
if selected(sending):
xs = xs[1:]
if selected(receiving):
x = receiving.entity
xs = xs + [x]
# =================================
# Machine definition
# =================================
@process
def Machine(env, c_in, c_out, u):
while True:
receive = c_in.receive()
x = yield env.execute(receive)
send = c_out.send(x)
yield env.execute(send)
delay = u()
yield env.timeout(delay)
# =================================
# Exit definition
# =================================
@process
def Exit(env, c_in, n):
sumw = 0.0
for i in range(n):
receive = c_in.receive()
x = yield env.execute(receive)
sumw = sumw + (env.now - x.entrytime)
print(f"Mean waiting time spent in buffer = {(sumw / n):g}")
# =================================
# GBME Model
# =================================
def model():
m = 2
n = 100000
env = Environment()
a = Channel(env)
b = Channel(env)
c = Channel(env)
G = Generator(env, a, lambda: random.exponential(0.111))
B = Buffer(env, a, b)
Ms = [ Machine(env, b, c, lambda: random.uniform(0.0, 0.4)) for j in range(m) ]
E = Exit(env, c, n)
env.run(until=E)
# =================================
# Main
# =================================
model()
F
Serial production line
# =================================
# Job definition
# =================================
@dataclass
class Job:
entrytime: float = 0.0
# =================================
# Generator definition
# =================================
@process
def Generator(env, c_out, u):
while True:
x = Job(entrytime = env.now)
send = c_out.send(x)
yield env.execute(send)
delay = u()
yield env.timeout(delay)
# =================================
# Buffer definition
# =================================
@process
def Buffer(env, c_in, c_out):
xs = []
while True:
receiving = c_in.receive()
sending = c_out.send(xs[0]) if len(xs)>0 else None
yield env.select(sending, receiving)
if selected(sending):
xs = xs[1:]
if selected(receiving):
x = receiving.entity
xs = xs + [x]
# =================================
# Machine definition
# =================================
@process
def Machine(env, c_in, c_out, u):
while True:
receive = c_in.receive()
x = yield env.execute(receive)
delay = u()                  # process the job before passing it on
yield env.timeout(delay)
send = c_out.send(x)
yield env.execute(send)
# =================================
# Exit definition
# =================================
@process
def Exit(env, c_in, n):
sumw = 0.0
for i in range(n):
receive = c_in.receive()
x = yield env.execute(receive)
sumw = sumw + (env.now - x.entrytime)
return sumw / n
# =================================
# Mline Model
# =================================
def model():
n = 10000
env = Environment()
a = [ Channel(env) for i in range(3)]
b = [ Channel(env) for i in range(2)]
G = Generator(env, a[0], lambda: random.exponential(0.5))
B1 = Buffer(env, a[0], b[0])
M1 = Machine(env, b[0], a[1], lambda: 0.33)
B2 = Buffer(env, a[1], b[1])
M2 = Machine(env, b[1], a[2], lambda: random.uniform(0.0, 0.8))
E = Exit(env, a[2], n)
env.run(until=E)
return E.value
# =================================
# Experiment
# =================================
def Experiment():
m = 10
phi = 0.0
sum1 = 0.0
sum2 = 0.0
smean = 0.0
svar = 0.0
for i in range(m):
phi = model()
sum1 = sum1 + phi
sum2 = sum2 + phi*phi
print(f"Mean flow time in run {i+1:d} is {phi:g}")
smean = sum1 / m
svar = sum2 / m - smean * smean
print(f"Mean flow time estimate is {smean:g} +- {1.96 * math.sqrt(svar/m):g}")
# =================================
# Main
# =================================
Experiment()
G
Solutions to selected exercises
Exercise 1. Label the nine socks as 1, 2, . . . , 9. If the order in which the socks are chosen is considered as being important, then we take an ordered sample space being the set of all pairs (s1, s2), where s1 is the label of the first sock chosen and s2 is the label of the second sock chosen. The sample space has 9 × 8 = 72 equally likely outcomes. There are 4 × 5 = 20 outcomes for which the first sock chosen is black and the second is white. Also, there are 5 × 4 = 20 outcomes for which the first sock chosen is white and the second is black. Hence the desired probability is (20 + 20)/72 = 5/9. Another probability model for this problem is one in which the order of selection of the socks is not considered as being relevant. Then we take an unordered sample space in which an outcome is a set of two different socks from the nine socks. This sample space has C(9, 2) = 36 equally likely outcomes. The number of outcomes for which the two socks in the set have different colors is C(5, 1) × C(4, 1) = 20. This leads to the same probability 20/36 = 5/9.
Exercise 2. The sample space is Ω = {(i, j) | i, j = 1, . . . , 6}, where i is the score of player A and j is the score of player B. All outcomes are equally likely and thus each outcome gets assigned a probability of 1/36. Let E be the event that the absolute difference of the scores of players A and B is 0, 1, or 2, then E = {(i, j) | |i − j| ≤ 2}. The number of outcomes in the set E is 24. Thus, the desired probability is P(E) = 24/36.
Exercise 4. Let A be the event that the truck is used on a given day and let B be the event
that the van is used on a given day. Using the basic relation P(A ∪ B) = P(A) + P(B) − P(AB),
it follows that the desired probability is given by P(B) = 0.9 − 0.75 + 0.3 = 0.45.
Exercise 11.
1. Let Ai (ki ) be the availability of component i if it has ki copies, so
A1(k1) = 1 − 0.4^k1,  A2(k2) = 1 − 0.5^k2,  A3(k3) = 1 − 0.3^k3
The availability of the system is
A = A1 (k1 )A2 (k2 )A3 (k3 )
So for the system in Figure 3.9 with k1 = 1, k2 = 2, k3 = 1,
A = 0.6 · 0.75 · 0.7 = 0.315
Exercise 12. The random variable X can take on the values 0, 1, and 2. By the law of conditional probability, P(X = 0) = 1/3 × 0 + 2/3 × 1/4 = 1/6, P(X = 1) = 1/3 × 0 + 2/3 × 2 × 1/4 = 1/3, and P(X = 2) = 1/3 × 1 + 2/3 × 1/4 = 1/2.
Exercise 13. Take as sample space the set {0} ∪ {(x, y) : x² + y² ≤ 25}, where the outcome 0 means that the dart has missed the target and the outcome (x, y) that the dart has hit the target at the point (x, y). Let the random variable X be the score of a single throw of the dart. The random variable X can take on the values 0, 5, 8, and 15 with respective probabilities 0.25, 0.75 × (25 − 9)/25 = 0.48, 0.75 × (9 − 1)/25 = 0.24, and 0.75 × 1/25 = 0.03. The expected value of the random variable X is E[X] = 0.48 × 5 + 0.24 × 8 + 0.03 × 15 = 4.77.
Exercise 22.
1.
P(none loaded) = (47/50)(46/49)(45/48)(44/47) = 0.77
or
P(none loaded) = C(3, 0) C(47, 4) / C(50, 4)
2.
P(one tool loaded) = 4 × (3/50)(47/49)(46/48)(45/47) = 0.21
or
P(one tool loaded) = C(3, 1) C(47, 3) / C(50, 4)
3.
E[number of tools loaded] = (3/50) × 4 = 12/50 = 6/25 = 0.24
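These counting probabilities are easy to verify numerically; a quick check of the hypergeometric formulation with math.comb (Python 3.8+):
from math import comb

p_none = comb(3, 0) * comb(47, 4) / comb(50, 4)   # no loaded tool among the 4 drawn
p_one = comb(3, 1) * comb(47, 3) / comb(50, 4)    # exactly one loaded tool among the 4 drawn
mean = 4 * 3 / 50                                 # expected number of loaded tools drawn
print(round(p_none, 2), round(p_one, 2), mean)    # 0.77 0.21 0.24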
Exercise 23.
1.
P(at least 2 PCBs fail) = 1 − P(at most 1 PCB fails) = 1 − ((9/10)^10 + 10 × (1/10)(9/10)^9) = 0.264
2.
P(at least 2 PCBs fail) = P(at least 2 PCBs fail | N = 10)P(N = 10)
+ P(at least 2 PCBs fail | N = 11)P(N = 11)
+ P(at least 2 PCBs fail | N = 12)P(N = 12)
= (0.264 + 0.303 + 0.341)/3 = 0.303
Exercise 24. The density function should integrate to 1 over (0, ∞). This gives
∫_0^∞ c/(1 + x)³ dx = ∫_1^∞ c/z³ dz = 1,
and so −c (1/2) z^(−2) |_1^∞ = 1. Hence, c = 2. Further,
P(X ≤ 0.5) = ∫_0^0.5 2/(1 + x)³ dx = 0.5556
and
P(0.5 < X ≤ 1.5) = ∫_0.5^1.5 2/(1 + x)³ dx = 0.2844.
Further, P(0.5 < X ≤ 1.5 | X > 0.5) = 0.2844/(1 − 0.5556) = 0.64, using the relation P(A|B) = P(AB)/P(B).
Exercise 25. Let the random variable X denote the size of any particular claim. Then
P(X > 5 | X > 2) = P(X > 5)/P(X > 2) = ∫_5^10 (1/50)(10 − x) dx / ∫_2^10 (1/50)(10 − x) dx = 0.25/0.64 = 0.3906.
Exercise 26. Let the random variable X be the length of any particular phone call made by the travel agent. Then,
P(X > 7) = ∫_7^∞ 0.25 e^(−0.25x) dx = −e^(−0.25x) |_7^∞ = e^(−1.75) = 0.1738.
Exercise 33. Let the random variable X be the lifetime of the light bulb. The density function f(x) of X is given by f(x) = 1/10 for 2 < x < 12 and f(x) = 0 otherwise. Define the random variable Y as the age of the bulb upon replacement. The random variable Y = g(X), where the function g(x) is defined by
g(x) = x for x ≤ 10, and g(x) = 10 for x > 10.
By the substitution rule,
E[Y] = ∫_2^10 x (1/10) dx + ∫_10^12 10 (1/10) dx = 6.8
and
E[Y²] = ∫_2^10 x² (1/10) dx + ∫_10^12 10² (1/10) dx = 53.0667.
This gives Var[Y] = 53.0667 − 6.8² = 6.8267 and so σ[Y] = 2.613. Hence the expected value and the standard deviation of the age of the bulb upon replacement are 6.8 and 2.613.
Exercise 38.
1.
1 = ∫_40^∞ f(x) dx = ∫_40^∞ c e^(−x/10+4) dx = 10c,
so c = 1/10. Hence X can be written as X = 40 + Y where Y is Exponential with mean 10.
2. The primitives of the functions x e^(ax) and x² e^(ax) are given by
∫ x e^(ax) dx = (x/a − 1/a²) e^(ax),   ∫ x² e^(ax) dx = (x²/a − 2x/a² + 2/a³) e^(ax).
The above integrals can be directly used to calculate E[X] and E[X²], but it is easier to use that X = 40 + Y with Y Exponential with mean 10, yielding
E[X] = E[40 + Y] = 40 + E[Y] = 40 + 10 = 50,  Var[X] = Var[40 + Y] = Var[Y] = 100,  σ[X] = 10.
3.
P(X > 50) = ∫_50^∞ f(x) dx = e^(−50/10+4) = e^(−1) = 0.368
4. Let Q (> 40) denote the number on stock. Then
P(X > Q) = e^(−Q/10+4) = 1/20,
so
Q = 40 + 10 ln(20) = 70.
Exercise 39.
1. The primitives of the functions x e^(ax) and x² e^(ax) are given by
∫ x e^(ax) dx = (x/a − 1/a²) e^(ax),   ∫ x² e^(ax) dx = (x²/a − 2x/a² + 2/a³) e^(ax).
Hence
∫_0^∞ c x e^(−x/2) dx = 4c = 1,
so c = 1/4.
2.
P(X > 4) = ∫_4^∞ (1/4) x e^(−x/2) dx = 3e^(−2) = 0.406
3.
E[X] = ∫_0^∞ (1/4) x² e^(−x/2) dx = 4 hours.
Exercise 40.
1. The time till A or B fails is min(TA, TB), where TA and TB are the lifetimes of A and B, respectively. The minimum of Exponential random variables is again Exponential with parameter the sum of the parameters. So min(TA, TB) is Exponential with parameter 2 and the mean time till A or B fails is 0.5 hour. If A or B fails, the residual lifetime of the remaining component is Exponential with parameter 1 (since Exponential random variables are memoryless). Hence the mean time till both components fail is 0.5 + 1 = 1.5 hours.
2. This is the probability that all components still work after 1 hour, so e^(−(1+1+0.5+0.5)) = e^(−3) = 0.05.
3. This is the probability that components A, C and D still work after 1 hour, so e^(−(1+0.5+0.5)) = e^(−2) = 0.14.
Exercise 53.
1.
δ1 = λ + (1/5)δ3 + (1/5)δ4
δ2 = δ1
δ3 = λ + (2/3)δ2
δ4 = (4/5)δ3
2.
δ1 = (34/19)λ = 1.79λ
δ2 = (34/19)λ
δ3 = (125/57)λ = 2.19λ
δ4 = (100/57)λ = 1.75λ
3. Percentage scrapped is
(1/3)δ2 / (2λ) × 100% = (34/114) × 100% = 29.8%
4.
u1 = δ1 t01/m1 = (102/19)λ = (306/57)λ
u2 = (136/19)λ = (408/57)λ
u3 = (375/57)λ
u4 = (400/57)λ
5. Workstation W2 is the bottleneck (i.e. highest utilization).
6. What is the maximal inflow λmax for the manufacturing system to remain stable?
Set ub = u2 = 1, yielding
(408/57) λmax = 1,
so λmax = 57/408 = 0.14 jobs per hour.
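The traffic equations of item 1 can also be solved numerically, for instance with numpy; a small sketch with λ = 1 (so that the solution directly gives the factors in front of λ):
import numpy as np

# d1 = 1 + d3/5 + d4/5,  d2 = d1,  d3 = 1 + (2/3) d2,  d4 = (4/5) d3
A = np.array([[1.0, 0.0, -1/5, -1/5],
              [-1.0, 1.0, 0.0, 0.0],
              [0.0, -2/3, 1.0, 0.0],
              [0.0, 0.0, -4/5, 1.0]])
b = np.array([1.0, 0.0, 1.0, 0.0])
print(np.linalg.solve(A, b))   # approximately [1.79, 1.79, 2.19, 1.75]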
Exercise 54.
1.
δ1 =0.2λ + 0.5δ2 ,
δ2 =δ1 ,
δ3 =0.8λ + 0.2δ4 ,
δ4 =δ3 ,
δ5 =0.4δ2 + 0.74δ4 .
2.
δ1 =0.4λ,
δ2 =0.4λ,
δ3 =λ,
δ4 =λ,
δ5 =0.9λ.
u1 =0.2λ,
u2 =0.6λ,
u3 =0.5λ,
u4 =λ,
u5 =(0.24 + 0.37)λ = 0.61λ.
5. Workstation W4 .
Exercise 55.
1.
δ1(1) = (2/5)λ
δ2(1) = (1/2)δ1(1) + (4/5)δ3(1)
δ3(1) = (1/2)δ1(1)
2.
δ1(1) = (2/5)λ
δ2(1) = (9/25)λ
δ3(1) = (1/5)λ
3.
δ1(2) = (1/3)δ3(2)
δ2(2) = (4/5)δ1(2) + (2/3)δ3(2)
δ3(2) = (3/5)λ
4.
δ1(2) = (1/5)λ
δ2(2) = (14/25)λ
δ3(2) = (3/5)λ
5. Total scrap is (2/25)λ, so fraction 2/25 is scrapped, which is 8%.
Exercise 61.
1. The location in horizontal direction is Uniform on (0, 25) meter and the speed is 2.5 meter per second, so the travel time is Uniform on (0, 10) seconds. The same holds for the vertical travel time.
2. For the travel time Z we have
F(t) = P(Z ≤ t) = P(max(X, Y) ≤ t) = P(X ≤ t, Y ≤ t) = P(X ≤ t)P(Y ≤ t) = (t/10)(t/10) = t²/100,  0 < t ≤ 10.
3.
f(t) = d/dt F(t) = t/50,  0 < t ≤ 10,
and f(t) = 0 for t < 0 and t > 10.
4.
E[Z] = ∫_0^10 t²/50 dt = 6 2/3 (sec),
E[Z²] = ∫_0^10 t³/50 dt = 50 (sec²),
Var[Z] = E[Z²] − (E[Z])² = 50/9 = 5.55 (sec²)
5. The mean retrieval time is 5 + 2E[Z] + 5 = 23 1/3 seconds. Hence the capacity is 3600/23 1/3 = 154 totes per hour.
6. The mean time to retrieve a tote is te = 23 1/3 seconds, the variance is Var[2Z] = 4Var[Z], so the squared coefficient of variation is c²e = 4Var[Z]/te² = 2/49. The order arrival rate is 1/60 orders per second (and ca = 1, since orders arrive according to a Poisson process). The utilisation u is equal to te/60 = 0.39. Hence, the mean flow time ϕ is equal to
ϕ = ½(c²a + c²e) · u/(1 − u) · te + te = ½(1 + 2/49) · (0.39/(1 − 0.39)) · 23 1/3 + 23 1/3 = 31.1 sec.
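The flow-time computation in item 6 is a direct instance of the waiting-time approximation used throughout these notes; a two-line plain-Python check (the helper name is illustrative):
def vut_flow_time(ca2, ce2, u, te):
    return 0.5 * (ca2 + ce2) * u / (1 - u) * te + te   # mean waiting time plus process time

print(vut_flow_time(1.0, 2/49, 0.39, 70/3))            # about 31.1 seconds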
Exercise 62.
1. Arrival rate is 5/3 pick orders per min. Number of arrivals during 1 minute is Poisson distributed with mean 5/3. So
P(at least 2 arrivals) = 1 − P(at most 1 arrival) = 1 − (e^(−5/3) + (5/3)e^(−5/3)) = 1 − (8/3)e^(−5/3) = 0.496
2. Since U1 and U2 are independent and P(U1 ≤ x) = P(U2 ≤ x) = x/30,
P(max(U1, U2) ≤ x) = P(U1 ≤ x, U2 ≤ x) = P(U1 ≤ x)P(U2 ≤ x) = x²/900
3. Since R = max(U1, U2) + 10,
P(R ≤ t) = P(max(U1, U2) ≤ t − 10) = (t − 10)²/900,  10 < t < 40
Differentiating yields
f(t) = d/dt P(R ≤ t) = (t − 10)/450,  10 < t < 40
4.
P(R > 30) = 1 − P(R ≤ 30) = 5/9
5.
te = ∫_10^40 t f(t) dt = 30 sec,  σ²e = ∫_10^40 (t − 30)² f(t) dt = 50 sec²,  c²e = 1/18
6. Arrival rate ra = 5/3 pick orders per min, so
u = ra te = 5/6
7. Since c²a = 1 (Poisson), c²e = 1/18, u = 5/6 and te = 1/2 min,
ϕ = ½(c²a + c²e) · u/(1 − u) · te + te = 131/72 = 1.8 min
Exercise 63.
1. Walking distance is 2 max{U1, U2} (forth and back to the farthest location). Walking time is obtained by dividing distance by speed, so it is 2 max{U1, U2}/v.
2. We have
P(W ≤ t) = P((2/v) max{U1, U2} ≤ t) = P(max{U1, U2} ≤ vt/2) = P(U1 ≤ vt/2, U2 ≤ vt/2) = P(U1 ≤ vt/2)P(U2 ≤ vt/2) = (t/48)²
Hence
fW(t) = d/dt P(W ≤ t) = t/1152
4. We have
tr = E[R] = E[W + P1 + P2] = E[W] + E[P1] + E[P2] = 32 + 10 + 10 = 52 [sec]
and
σ²r = Var[R] = Var[W] + Var[P1] + Var[P2] = 128 + 16 + 16 = 160 [sec²]
5.
(ra/3600) tr < 1,
so the maximum arrival rate is 3600/tr = 69.2 requests per hour.
6. We have c²a = 1 (Poisson arrivals), c²r = σ²r/tr² = 0.06, u = (ra/3600) tr = 0.867. Hence
ϕ = ½(c²a + c²r) · u/(1 − u) · tr + tr = ½(1 + 0.06) · (0.867/(1 − 0.867)) · 52 + 52 = 231.7 [sec]
7. On one hand, the variability of the retrieval time increases, and on the other hand, the mean walking distance (which is 2((1/4)·6 + (1/2)·8 + (1/4)·9) = 15 1/2 < 16) decreases, and thus the mean retrieval time decreases. Hence it will depend on the utilization of the operator whether the mean flow time decreases or increases!
Exercise 64.
1.
E[D] = ∫_0^20 x/20 dx = 10 m,  E[D²] = ∫_0^20 x²/20 dx = 400/3 m²,
Var[D] = E[D²] − (E[D])² = 100/3 m²,  σ[D] = √Var[D] = 10/√3 = 5.8 m
2. The time to collect an order is the pick time P plus the driving time, which is the driving distance divided by the speed. Hence, the mean time to collect an order is
te = E[P + 2D/v] = P + 2E[D]/v = 5 + 25 = 30 sec
and the variance is given by
σ²e = Var[P + 2D/v] = 2² Var[D]/v² = 625/3 = 208 1/3 sec²
so
σe = σ[P + 2D/v] = √Var[P + 2D/v] = 25/√3 = 14.4 sec
3. λmax = 120 orders per hour.
ϕ = (1.23/2) · 5 · 30 + 30 = 122.25 sec = 2.03 min
6. We have
1 = ∫_0^20 f(x) dx = c ∫_0^20 e^(−x/10) dx = c · 10(1 − e^(−2)),
so
c = 1/(10(1 − e^(−2))) = 0.116
7. Since f(x) is decreasing, we get
max_{0≤x1,x2≤20} f(x1)/f(x2) = f(0)/f(20) = e² = 7.39
8. We have
E[D] = c ∫_0^20 x e^(−x/10) dx = c · 100(1 − 3e^(−2)) = 10(1 − 3e^(−2))/(1 − e^(−2)) = 6.87 m
so the mean time to collect an order is
te = E[P + 2D/v] = P + 2E[D]/v = 5 + 17.17 = 22.17 sec
9. The capacity increases from 120 orders per hour to 3600/22.17 = 162.4 orders per hour.
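Items 6 and 8 can be checked numerically by approximating the integrals on a fine grid, for instance with numpy (a small sketch; the grid size is arbitrary):
import numpy as np

x = np.linspace(0.0, 20.0, 200001)
dx = x[1] - x[0]
w = np.exp(-x / 10.0)
c = 1.0 / (w.sum() * dx)          # normalizing constant of f(x) = c e^(-x/10) on (0, 20)
ED = c * (x * w).sum() * dx       # mean storage distance E[D]
print(round(c, 3), round(ED, 2))  # about 0.116 and 6.87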
Exercise 65.
1. Note that U/0.5 is Uniform on (0, 20), so the retrieval time R = U/0.5 + 4 is Uniform on (4, 24), the density of which is constant on (4, 24).
2. We have
te = E[R] = ∫_4^24 t/20 dt = 14 sec,  E[R²] = ∫_4^24 t²/20 dt = 229 1/3 sec²,
σ²e = Var[R] = E[R²] − (E[R])² = 33 1/3 sec²,  c²e = σ²e/te² = 25/147 = 0.17.
Or, since U is Uniform on (0, 10), we have E[U] = 5 and Var[U] = 10²/12 = 8 1/3, so
te = E[R] = E[U/0.5 + 4] = E[U]/0.5 + 4 = 14 sec,  σ²e = Var[R] = Var[U]/0.5² = 33 1/3 sec².
Exercise 66.
3. The mean flow time of a batch at machine Pre is
ϕ1 = ½(c²a(batch) + c²e(1)) · u(1)/(1 − u(1)) · te(1) + te(1) = ½(0.01 + 0.25) · (0.8/(1 − 0.8)) · 20 + 20 = 30.4 hours
and the mean flow time of a sample is 12.375 + 30.4 = 42.775 hours.
4. For departures of batches from machine Pre we have (as approximation)
c²d(1) = u²(1)c²e(1) + (1 − u²(1))c²a(batch) = (16/25)(1/4) + (9/25)(1/100) = 0.1636.
So c²a(2) = c²d(1) = 0.1636 and c²e(2) = (10/23)² = 0.189. Then, for the mean flow time we get
ϕ2 = ½(c²a(2) + c²e(2)) · u(2)/(1 − u(2)) · te(2) + te(2) = ½(0.1636 + 0.189) · ((23/25)/(2/25)) · 23 + 23 = 69.6 hours.
5. 12.375 + ϕ1 + ϕ2 = 112.375 hours.
6. The mean waiting time to form a batch of 2 batches of 100 samples is (1 + 0) · 25/2 = 12.5 hours. Since ta(2) = 50 hours and c²a(2) = ½c²d(1) = 0.0818, we get u(2) = 23/50 = 0.46. It then follows that the mean flow time of a batch at MPS is equal to
ϕ2 = ½(c²a(2) + c²e(2)) · u(2)/(1 − u(2)) · te(2) + te(2) = ½(0.0818 + 0.189) · ((23/50)/(27/50)) · 23 + 23 = 25.7 hours.
The mean total flow time of a sample then becomes 12.375 + ϕ1 + 12.5 + ϕ2 = 80.975 hours. A reduction of almost 30%!
Exercise 67.
1. Arrival rate is ra = 1/6 parts/min, te(T) = 5 min, so
u(T) = ra te(T) = 5/6
2.
ϕ(T) = u(T)/(1 − u(T)) · te(T) + te(T) = 30 min
3.
c²d(T) = u²(T) · 1 + (1 − u²(T)) · 1 = 1
and
c²a(M) = (9/10)c²d(T) + 1/10 = 1
Output of an Exponential machine with Poisson input is again Poisson (so c²d(T) = 1), and randomly splitting a Poisson stream is also Poisson.
4.
A = 10/(10 + 1/2) = 20/21
5.
te = t0/A = 63/10 min,  c²e = c²0 + A(1 − A) mr/t0 = 1/4 + (20/441)(30/6) = 841/1764 = 0.48
6. ra(M) = (9/10) ra = 3/20 parts/min, so
u(M) = ra(M) te = 189/200 = 0.945
and since c²a(M) = 1 (Poisson),
ϕ(M) = ½(c²a(M) + c²e) · u(M)/(1 − u(M)) · te + te = ½(1 + 0.48) · (0.945/0.055) · 6.3 + 6.3 = 86.4 min
7.
te = t0 = 6 min,  c²e = c²0 = 1/4
8. Now
u(M) = ra(M) te = 0.9,
so
ϕ(M) = ½(1 + 0.25) · (0.9/0.1) · 6 + 6 = 159/4 = 39.75 min,
which is a reduction of more than 50%.
Exercise 70.
1. We have mf(1) = 2 hours = 120 min, mr(1) = 10 min, so
A(1) = mf(1)/(mf(1) + mr(1)) = 12/13
and since mf(2) = 480 min and mr(2) = 40 min,
A(2) = mf(2)/(mf(2) + mr(2)) = 12/13.
4.
ϕ(1) = ½(c²a(1) + c²e(1)) · u(1)/(1 − u(1)) · te(1) + te(1) = ½(1/10 + 60/169) · ((65/72)/(1 − 65/72)) · 13/6 + 13/6 = 6.74 min
5. Since the flow out of M1 is equal to the flow into M1 we have rd(1) = ra. The squared coefficient of variation c²d(1) is (approximately) given by
c²d(1) = (1 − u²(1))c²a + u²(1)c²e(1) = (1 − (65/72)²)(1/10) + (65/72)²(60/169) = 0.308.
6. Estimate the mean flow time ϕ(2) of a frame at machine M2 (waiting time plus assembly time).
Note that ra(2) = rd(1) and c²a(2) = c²d(1). Hence
ϕ(2) = ½(c²d(1) + c²e(2)) · u(2)/(1 − u(2)) · te(2) + te(2) = ½(0.308 + 1.42) · ((65/72)/(1 − 65/72)) · 13/6 + 13/6 = 19.55 min
7.
ϕ = ϕ(1) + tr + ϕ(2) = 6.74 + 1 + 19.55 = 27.29 min.
By Little's law we get for the total WIP w (number of frames) in this part of the assembly line,
w = ra ϕ = (5/12) · 27.29 = 11.37 frames.
Exercise 73.
1.
A = 120/(120 + 10) = 12/13
2.
te = t0/A = 13/6 = 2.167 [min]
and
c²e = c²0 + A(1 − A) mr/t0 + c²r A(1 − A) mr/t0 = 0 + (12/169)(10/2) + 0 = 60/169 = 0.36
3.
u(M) = (ra/60) te = 0.72
4. We have c²a = 1 (Poisson arrivals), so
ϕ(M) = ½(c²a + c²e) · u(M)/(1 − u(M)) · te + te = ½(1 + 0.36) · (0.72/(1 − 0.72)) · 2.167 + 2.167 = 5.96 [min]
6.
u(F) = (1/10)(ra/60) tf = 2/3
7. The mean inter-departure time td(M) from M is td(M) = 1/rd(M) = 0.05 [hours] = 3 [min]. So
ϕ(w2b) = (9/2) td(M) = 27/2 = 13 1/2 [min]
8. We have
c²a(batch) = c²d(M)/10 = 0.067
Hence
ϕ(batch) = ½(c²a(batch) + c²f) · u(F)/(1 − u(F)) · tf + tf = ½(0.067 + 0) · ((2/3)/(1/3)) · 20 + 20 = 21.34 [min]
9.
ϕ = ϕ(M) + ϕ(w2b) + ϕ(batch) = 5.96 + 13.5 + 21.34 = 40.8 [min]
The mean total WIP follows from Little's law,
WIP = (ra/60) ϕ = (1/3) · 40.8 = 13.6
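Items 1 to 4 follow the effective-process-time recipe for preemptive outages; a small plain-Python sketch (helper name illustrative) reproduces the numbers up to rounding:
def effective_process_time(t0, c0sq, mf, mr, crsq=0.0):
    A = mf / (mf + mr)                                 # availability
    te = t0 / A                                        # mean effective process time
    cesq = c0sq + (1 + crsq) * A * (1 - A) * mr / t0   # SCV of the effective process time
    return A, te, cesq

A, te, cesq = effective_process_time(t0=2.0, c0sq=0.0, mf=120.0, mr=10.0)
u = (20 / 60) * te                                     # 20 parts per hour arriving at M
phi = 0.5 * (1 + cesq) * u / (1 - u) * te + te
print(round(A, 3), round(te, 3), round(cesq, 2), round(u, 2), round(phi, 2))
# 0.923, 2.167, 0.36 (= 60/169), 0.72, and a flow time close to the 5.96 min found above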
Exercise 74.
1.
te(D) = ts + td = 25 min
σ²e(D) = σ²s + σ²d = 15² + 20² = 625 min²
c²e(D) = σ²e(D)/te²(D) = 1
2.
u(D) = ra te(D) = 2 · 25/60 = 5/6
3. Note that c²a = c²e(D) = 1, so
ϕ(D) = u(D)/(1 − u(D)) · te(D) + te(D) = 150 min
5.
A = 2/(2 + 1/6) = 12/13 = 0.92
6.
te(W) = tw/A = 26 min
Note that c²w = σ²w/tw² = 9/4, so
c²e(W) = c²w + A(1 − A) mr/tw = 9/4 + (12/13)(1/13)(10/24) = 2.28
7.
u(W) = rd(D) te(W) = 2 · 26/60 = 13/15 = 0.87
8.
ϕ(W) = ½(c²d(D) + c²e(W)) · u(W)/(1 − u(W)) · te(W) + te(W) = ½(1 + 2.28) · (0.87/0.13) · 26 + 26 = 311.4 min