Chapter-3
Note: This module is prepared from the text/reference books of the Probability and Statistics course (MATH F113) to help the students. The study material is expected to be useful but is not exhaustive. For detailed study, students are advised to attend the lecture/tutorial classes regularly and to consult the text/reference books.
3.1 Definitions
Discrete Random Variable
Suppose a random experiment results in finitely or countably infinitely many outcomes, with sample space S. Then a variable X taking real values x corresponding to each outcome of the random experiment (i.e., to each element of S) is called a discrete random variable. In other words, a discrete random variable X is a function from the sample space S to the set of real numbers; being a function, X can be defined in any convenient way.
Ex. Consider the toss of two coins, so that

S = {HH, HT, TH, TT}.
Let X denote the number of heads. Then X is a discrete random variable, that is, a function from the sample space S onto the set {0, 1, 2}, since X(HH) = 2, X(HT) = 1, X(TH) = 1 and X(TT) = 0. In tabular form, it can be displayed as

Outcome    HH   HT   TH   TT
X = x       2    1    1    0

We find that P(X = 0) = 1/4, P(X = 1) = 1/2 and P(X = 2) = 1/4. It is easy to see that the function f given by

X = x               0     1     2
f(x) = P(X = x)    1/4   1/2   1/4

is the probability mass function (pmf) of X, and the cumulative distribution function (cdf) F of X is given by

X = x               0     1     2
F(x) = P(X ≤ x)    1/4   3/4    1
Remark: Note that X is a function with domain the sample space S. So, in the above example, X could also be defined as the number of tails, and accordingly we could write its pmf and cdf.
Ex. Consider a discrete random variable X with the pmf f(x) = P(X = x) = (1/2)^x, x = 1, 2, 3, ... (for instance, the number of tosses of a fair coin needed to obtain the first head):

X = x                1       2        3      ...
f(x) = P(X = x)     1/2   (1/2)²   (1/2)³    ...
The cumulative distribution function F of X is given by

F(x) = Σ_{t ≤ x} f(t) = (1/2)(1 − (1/2)^x)/(1 − 1/2) = 1 − (1/2)^x, where x = 1, 2, 3, .........
Note. Determining the cdf can be very useful. For instance, in the above example, suppose it is required to calculate P(10 ≤ X ≤ 30). One option is to sum all the probabilities from P(X = 10) to P(X = 30). Instead, we use the cdf to obtain

P(10 ≤ X ≤ 30) = F(30) − F(9) = (1 − 1/2^30) − (1 − 1/2^9) = 1/2^9 − 1/2^30.
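As a quick numerical check, here is a minimal Python sketch (our own; the function names f and F are not from the text) that computes P(10 ≤ X ≤ 30) both by direct summation of the pmf and via the cdf difference F(30) − F(9):

```python
# pmf f(x) = (1/2)**x and cdf F(x) = 1 - (1/2)**x for x = 1, 2, 3, ...

def f(x):
    """pmf: f(x) = (1/2)^x."""
    return 0.5 ** x

def F(x):
    """cdf: F(x) = 1 - (1/2)^x."""
    return 1 - 0.5 ** x

direct = sum(f(x) for x in range(10, 31))   # sum of 21 pmf values
via_cdf = F(30) - F(9)                      # a single subtraction

print(direct, via_cdf)  # both print the same value, approximately 0.0019531
```

The cdf route replaces 21 additions with one subtraction, which is the point of the note above.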
Ex. Find the probability distribution of the number of heads in a toss of four coins. Also, plot
the probability mass function and probability histogram.
Ex. If a car agency sells 50% of its inventory of a certain foreign car equipped with side airbags,
find a formula for the probability distribution of the number of cars with side airbags among the
next 4 cars sold by the agency.
Sol. f(x) = C(4, x)/16, x = 0, 1, 2, 3, 4, where C(4, x) denotes the binomial coefficient "4 choose x".
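A small Python sketch (our own illustration) tabulates this pmf and checks that it sums to 1; the same numbers also answer the previous four-coin exercise, since both pmfs are C(4, x)/16:

```python
from math import comb

# pmf of the number of cars with side airbags among the next 4 sold:
# f(x) = C(4, x) / 16, x = 0, 1, 2, 3, 4
pmf = {x: comb(4, x) / 16 for x in range(5)}

print(pmf)                 # {0: 0.0625, 1: 0.25, 2: 0.375, 3: 0.25, 4: 0.0625}
print(sum(pmf.values()))   # 1.0 -- a valid pmf
```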
3.1.1 Expectation
Let X be a random variable with pmf f. Then the expectation of X, denoted by E(X), is defined as

E(X) = Σ_x x f(x).
Ex. Let X denote the number of heads in a toss of two fair coins. Then X assumes the values 0, 1 and 2 with probabilities 1/4, 1/2 and 1/4, respectively. So E(X) = 0 × (1/4) + 1 × (1/2) + 2 × (1/4) = 1.
Note: (1) The expectation E(X) of the random variable X is the theoretical average or mean value of X. In a statistical setting, the average value, mean value¹ and expected value are synonyms. The mean value is denoted by µ. So E(X) = µ.
(2) The expected or mean value of the random variable X is a measure of the location of the center of the values of X.
Ex. A lot of 7 components contains 4 good and 3 defective components. A sample of 3 components is selected at random. Find the expected number of good components in the sample.
Sol. Let X represent the number of good components in the sample. Then the probability distribution of X is

f(x) = C(4, x) C(3, 3 − x) / C(7, 3), x = 0, 1, 2, 3.
¹From your high school mathematics, you know that if we have n distinct values x1, x2, ...., xn with frequencies f1, f2, ...., fn respectively, and Σ_{i=1}^{n} f_i = N, then the mean value is

µ = Σ_{i=1}^{n} (f_i x_i)/N = Σ_{i=1}^{n} x_i (f_i/N) = Σ_{i=1}^{n} f(x_i) x_i,

where f(x_i) = f_i/N is the probability of occurrence of x_i in the given data set. Obviously, the final expression for µ is the expectation of a random variable X assuming the values x_i with probabilities f(x_i).
Simple calculations yield f(0) = 1/35, f(1) = 12/35, f(2) = 18/35, and f(3) = 4/35. Therefore,

µ = E(X) = Σ_{x=0}^{3} x f(x) = 12/7 ≈ 1.7.
Thus, if a sample of size 3 is selected at random over and over again from a lot of 4 good
components and 3 defective components, it will contain, on average, 1.7 good components.
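The following Python sketch (our own check, using exact fractions) reproduces this expectation directly from the sampling-without-replacement pmf:

```python
from fractions import Fraction
from math import comb

# f(x) = C(4, x) * C(3, 3 - x) / C(7, 3): good components in a sample of 3
total = comb(7, 3)
pmf = {x: Fraction(comb(4, x) * comb(3, 3 - x), total) for x in range(4)}

mean = sum(x * p for x, p in pmf.items())
# probabilities 1/35, 12/35, 18/35, 4/35; mean 12/7 ≈ 1.714, i.e. about 1.7
print(pmf)
print(mean, float(mean))
```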
Ex. A salesperson for a medical device company has two appointments on a given day. At the
first appointment, he believes he has a 70% chance to make the deal, from which he can earn a $1000 commission if successful. On the other hand, he thinks he only has a 40% chance to make
the deal at the second appointment, from which, if successful, he can make $1500. What is his
expected commission based on his own probability belief? Assume that the appointment results
are independent of each other.
Sol. First, we know that the salesperson, for the two appointments, can have 4 possible commission totals: $0, $1000, $1500, and $2500. We then need to calculate their associated probabilities. By independence, we obtain

f(0) = (1 − 0.7)(1 − 0.4) = 0.18,
f(1000) = (0.7)(1 − 0.4) = 0.42,
f(1500) = (1 − 0.7)(0.4) = 0.12,
f(2500) = (0.7)(0.4) = 0.28.

Therefore, the expected commission is

E(X) = 0(0.18) + 1000(0.42) + 1500(0.12) + 2500(0.28) = $1300.
Ex. Suppose that the number of cars X that pass through a car wash between 4:00 P.M. and 5:00
P.M. on any sunny Friday has the following probability distribution:
X = x                4      5      6     7     8     9
f(x) = P(X = x)    1/12   1/12   1/4   1/4   1/6   1/6
Let g(X) = 2X − 1 represent the amount of money, in dollars, paid to the attendant by the
manager. Find the attendant’s expected earnings for this particular time period.
Sol. We find

E(g(X)) = E(2X − 1) = Σ_{x=4}^{9} (2x − 1) f(x) = $12.67.
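A short Python check (our own sketch) of this computation, using exact fractions:

```python
from fractions import Fraction as Fr

# pmf of the number of cars X between 4 and 9
pmf = {4: Fr(1, 12), 5: Fr(1, 12), 6: Fr(1, 4), 7: Fr(1, 4), 8: Fr(1, 6), 9: Fr(1, 6)}

expected_pay = sum((2 * x - 1) * p for x, p in pmf.items())
print(expected_pay, float(expected_pay))  # 38/3 ≈ 12.67 dollars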
3.1.2 Variance
Let X and Y be two random variables assuming the values X = 1, 9 and Y = 4, 6 (each value with probability 1/2). We observe that both variables have the same mean, µX = µY = 5. However, the values of X are farther from the mean or central value 5 than the values of Y. Thus, the mean value of a random variable does not account for its variability. In this regard, we define a new parameter known as variance, as follows.
If X is a random variable with mean µ, then its variance, denoted by V(X), is defined as the expectation of (X − µ)². So, we have

V(X) = E((X − µ)²) = E(X²) + µ² − 2µE(X) = E(X²) + E(X)² − 2E(X)E(X) = E(X²) − E(X)².
Ex. Let X denote the number of heads in a toss of two fair coins. Then X assumes the values 0, 1 and 2 with probabilities 1/4, 1/2 and 1/4, respectively. So

E(X) = 0 × (1/4) + 1 × (1/2) + 2 × (1/4) = 1,
E(X²) = 0² × (1/4) + 1² × (1/2) + 2² × (1/4) = 3/2.
∴ V(X) = 3/2 − 1 = 1/2.
Note: (i) The variance V(X) of the random variable X is also denoted by σ². So V(X) = σ².
(ii) If X is a random variable and c is a constant, then it is easy to verify that V(c) = 0 and V(cX) = c²V(X).
Ex. Let the random variable X represent the number of automobiles used for official business purposes on a given workday. The probability distribution of X for company A is

x       1     2     3
f(x)   0.3   0.4   0.3

and that for company B is

x       0     1     2     3     4
f(x)   0.2   0.1   0.3   0.3   0.1

Show that the variance of the probability distribution for company B is greater than that for company A.
Ex. Calculate the variance of g(X) = 2X + 3, where X is a random variable with probability
distribution
x       0     1     2     3
f(x)   1/4   1/8   1/2   1/8

Sol. µ_{2X+3} = 6, σ²_{2X+3} = 4.
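Since V(cX) = c²V(X) and adding a constant does not change the spread, σ²_{2X+3} = 4σ²_X. A minimal Python sketch (our own) verifies both numbers:

```python
from fractions import Fraction as Fr

pmf = {0: Fr(1, 4), 1: Fr(1, 8), 2: Fr(1, 2), 3: Fr(1, 8)}

def expect(g):
    """E(g(X)) for the pmf above."""
    return sum(g(x) * p for x, p in pmf.items())

mu = expect(lambda x: 2 * x + 3)                  # mean of 2X + 3
var = expect(lambda x: (2 * x + 3 - mu) ** 2)     # variance of 2X + 3
print(mu, var)                                    # 6, 4
```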
Ex. Find the mean and variance of a random variable X with the pmf given by

f(x) = c/2^{|x|}, x = ±1, ±2, ±3, ...,

where c is a constant. If g(X) = (−1)^{|X|−1} 2^{|X|}/(2|X| − 1), then show that E(g(X)) exists but E(|g(X)|) does not exist.

Sol. Using the condition Σ_{x=±1,±2,...} f(x) = 1, we find 2c Σ_{k=1}^{∞} (1/2)^k = 1, so c = 1/2. By symmetry, the mean is E(X) = Σ x f(x) = 0, and E(X²) = 2 Σ_{k=1}^{∞} k² (1/2)^{k+1} = Σ_{k=1}^{∞} k²/2^k = 6, so V(X) = 6.

Now E(g(X)) = Σ_{x=±1,±2,...} g(x)f(x) = Σ_{x=±1,±2,...} (−1)^{|x|−1}/(2(2|x| − 1)) = Σ_{k=1}^{∞} (−1)^{k−1}/(2k − 1), which is an alternating and convergent series. So E(g(X)) exists. But E(|g(X)|) = Σ_{x=±1,±2,...} 1/(2(2|x| − 1)) = Σ_{k=1}^{∞} 1/(2k − 1) is a divergent series, so E(|g(X)|) does not exist.
3.1.4 Moments and moment generating function
Let X be a random variable and k be any positive integer. Then E(X^k) defines the kth ordinary moment of X.
Obviously, E(X) = µ is the first ordinary moment, E(X²) is the second ordinary moment, and so on. Further, the ordinary moments can be obtained from the function E(e^{tX}): the ordinary moment E(X^k) is the coefficient of t^k/k! in the expansion

E(e^{tX}) = 1 + tE(X) + (t²/2!)E(X²) + ............

Also, we observe that

E(X^k) = [d^k/dt^k E(e^{tX})]_{t=0}.
Thus, the function E(etX ) generates all the ordinary moments. That is why, it is known as the
moment generating function and is denoted by mX (t). Thus, mX (t) = E(etX ).
In general, the kth moment of a random variable X about any point a is defined as E((X − a)^k). Obviously, a = 0 for the ordinary moments. Further, E(X − µX) = 0 and E((X − µX)²) = σ²_X.
So the first moment about mean is 0 while the second moment about mean yields the variance.
Mean, variance and mgf of geometric random variable
For the geometric random variable X, with pmf g(x; p) = pq^{x−1}, x = 1, 2, 3, ... (where q = 1 − p), we have

(i) µX = E(X) = Σ_{x=1}^{∞} x g(x; p)
= Σ_{x=1}^{∞} x q^{x−1} p
= p Σ_{x=1}^{∞} x q^{x−1}
= p (d/dq)(Σ_{x=1}^{∞} q^x)
(∵ term-by-term differentiation is permissible for the convergent power series Σ_{x=1}^{∞} q^x within its interval of convergence |q| < 1)
= p (d/dq)[q/(1 − q)]
= p · 1/(1 − q)²
= 1/p.
(ii) σ²_X = E(X²) − E(X)² = E(X(X − 1)) + E(X) − E(X)²
= p Σ_{x=1}^{∞} x(x − 1) q^{x−1} + 1/p − 1/p²
= pq Σ_{x=1}^{∞} x(x − 1) q^{x−2} − q/p²   (since 1/p − 1/p² = −q/p²)
= pq (d²/dq²)(Σ_{x=1}^{∞} q^x) − q/p²
= pq (d²/dq²)[q/(1 − q)] − q/p²
= pq (d/dq)[1/(1 − q)²] − q/p²
= pq · 2/(1 − q)³ − q/p²
= 2q/p² − q/p²
= q/p².
(iii) mX(t) = E(e^{tX}) = Σ_{x=1}^{∞} e^{tx} g(x; p)
= p Σ_{x=1}^{∞} e^{tx} q^{x−1}
= (p/q) Σ_{x=1}^{∞} (qe^t)^x
= (p/q) · qe^t/(1 − qe^t)   (for t < −ln q)
= pe^t/(1 − qe^t).
Remark: Note that we can easily obtain E(X) and E(X²) from the moment generating function mX(t) by using

E(X^k) = [d^k/dt^k mX(t)]_{t=0},

for k = 1 and k = 2, respectively. In other words, the first and second t-derivatives of mX(t) at t = 0 give us E(X) and E(X²), respectively. Hence we easily get the mean and variance from the moment generating function. Verify!
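For instance, the following sketch (our own, assuming the sympy library is available) differentiates the geometric mgf pe^t/(1 − qe^t) at t = 0 and recovers µ = 1/p and σ² = q/p²:

```python
import sympy as sp

t, p = sp.symbols('t p', positive=True)
q = 1 - p
m = p * sp.exp(t) / (1 - q * sp.exp(t))   # mgf of the geometric distribution

EX = sp.diff(m, t).subs(t, 0)             # first derivative at t = 0
EX2 = sp.diff(m, t, 2).subs(t, 0)         # second derivative at t = 0

print(sp.simplify(EX))                    # 1/p
print(sp.simplify(EX2 - EX**2))           # (1 - p)/p**2, i.e. q/p^2
```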
Ex. For a certain manufacturing process, it is known that, on the average, 1 in every 100 items is
defective. What is the probability that the fifth item inspected is the first defective item found?
Ex. At a busy time, a telephone exchange is very near capacity, so callers have difficulty placing
their calls. It may be of interest to know the number of attempts necessary in order to make a
connection. Suppose that we let p = 0.05 be the probability of a connection during a busy time.
Find the probability that the first successful call occurs on the fifth attempt.
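Both exercises ask for a geometric probability g(5; p) = pq⁴. A quick Python sketch (our own; the function name is ours) evaluates them:

```python
def geom_pmf(x, p):
    """g(x; p) = p * q**(x - 1): first success on trial x."""
    return p * (1 - p) ** (x - 1)

print(geom_pmf(5, 0.01))   # defective items: 0.01 * 0.99**4 ≈ 0.0096
print(geom_pmf(5, 0.05))   # telephone calls: 0.05 * 0.95**4 ≈ 0.0407
```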
Negative binomial random variable
Recall that if X denotes the number of the trial on which the kth success occurs in repeated independent Bernoulli trials with success probability p, then X has the negative binomial pmf

nb(x; k, p) = C(x − 1, k − 1) p^k q^{x−k}, x = k, k + 1, k + 2, ....

If we make a change of variable via y = x − k, then

nb(y; k, p) = C(k + y − 1, k − 1) p^k q^y = C(k + y − 1, y) p^k q^y, y = 0, 1, 2, ....

Here we have used the well-known result C(n, x) = C(n, n − x).

Note that (1 − q)^{−k} = Σ_{y=0}^{∞} C(k + y − 1, y) q^y is a negative binomial series.

Now, let us show that Σ_{x=k}^{∞} nb(x; k, p) = 1. For,
Σ_{x=k}^{∞} nb(x; k, p) = Σ_{x=k}^{∞} C(x − 1, k − 1) p^k q^{x−k}
= Σ_{y=0}^{∞} C(y + k − 1, k − 1) p^k q^y, where y = x − k
= p^k Σ_{y=0}^{∞} C(y + k − 1, y) q^y
= p^k (1 − q)^{−k}
= p^k p^{−k}
= 1.
(i) Proceeding as in the calculation below (or by differentiating the mgf in (iii)), one finds E(X) = k/p.

(ii) E((X + 1)X) = Σ_{x=k}^{∞} (x + 1)x nb(x; k, p)
= Σ_{x=k}^{∞} (x + 1)x C(x − 1, k − 1) p^k q^{x−k}
= (k + 1)k Σ_{x=k}^{∞} C(x + 1, k + 1) p^k q^{x−k}
= [k(k + 1)/p²] Σ_{x=k}^{∞} C(x + 1, k + 1) p^{k+2} q^{x−k}
= [k(k + 1)/p²] Σ_{y=k+2}^{∞} C(y − 1, (k + 2) − 1) p^{k+2} q^{y−(k+2)}, where x = y − 2
= [k(k + 1)/p²] Σ_{y=k+2}^{∞} nb(y; k + 2, p)
= [k(k + 1)/p²] · 1
= k(k + 1)/p².

So V(X) = E((X + 1)X) − E(X) − E(X)² = k(k + 1)/p² − k/p − k²/p² = kq/p².
(iii) mX(t) = E(e^{tX}) = Σ_{x=k}^{∞} e^{tx} nb(x; k, p) = Σ_{x=k}^{∞} e^{tx} C(x − 1, k − 1) p^k q^{x−k}
= Σ_{y=0}^{∞} C(y + k − 1, k − 1) p^k q^y e^{t(y+k)}, where y = x − k
= p^k e^{tk} Σ_{y=0}^{∞} C(y + k − 1, y) (qe^t)^y
= (pe^t)^k (1 − qe^t)^{−k}
= [pe^t/(1 − qe^t)]^k   (for t < −ln q).
Ex. In an NBA (National Basketball Association) championship series, the team that wins four
games out of seven is the winner. Suppose that teams A and B face each other in the championship
games and that team A has probability 0.55 of winning a game over team B.
(a) What is the probability that team A will win the series in 6 games?
(b) What is the probability that team A will win the series?
Sol. (a) The required probability is nb(6; 4, 0.55) = C(5, 3)(0.55)^4 (0.45)² ≈ 0.1853.
(b) Team A can win the championship series in the 4th, 5th, 6th or 7th game. So the required probability is

nb(4; 4, 0.55) + nb(5; 4, 0.55) + nb(6; 4, 0.55) + nb(7; 4, 0.55) = 0.6083.
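A Python sketch (our own; the function name nb is ours) reproduces both answers from the negative binomial pmf:

```python
from math import comb

def nb(x, k, p):
    """nb(x; k, p): probability that the k-th success occurs on trial x."""
    return comb(x - 1, k - 1) * p**k * (1 - p) ** (x - k)

print(nb(6, 4, 0.55))                            # (a) ≈ 0.1853
print(sum(nb(x, 4, 0.55) for x in range(4, 8)))  # (b) ≈ 0.6083
```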
Let X denote the number of successes in n independent Bernoulli trials, each with success probability p and failure probability q = 1 − p. If Sx denotes the outcome consisting of x successes, then the distribution of X is

Outcome      S0           S1              .....   Sn
X = x        0            1               .....   n
P(X = x)    C(n, 0)q^n   C(n, 1)q^{n−1}p  .....   C(n, n)p^n

The random variable X with this pmf is called a binomial random variable. The name 'binomial' arises because the probabilities C(n, 0)q^n, C(n, 1)q^{n−1}p, ....., C(n, n)p^n in succession are the terms in the binomial expansion of (q + p)^n. Once the values of the parameters n and p are given/determined, the pmf b(x; n, p) = C(n, x) q^{n−x} p^x uniquely describes the binomial distribution of X.
Mean, variance and mgf of binomial random variable

(i) µX = E(X) = Σ_{x=0}^{n} x b(x; n, p) = Σ_{x=1}^{n} x C(n, x) q^{n−x} p^x
= np Σ_{x=1}^{n} C(n − 1, x − 1) q^{n−x} p^{x−1}
= np Σ_{y=0}^{n−1} C(n − 1, y) q^{n−1−y} p^y (where y = x − 1)
= np(p + q)^{n−1} = np.
(ii) E(X(X − 1)) = Σ_{x=0}^{n} x(x − 1) b(x; n, p)
= Σ_{x=0}^{n} x(x − 1) C(n, x) q^{n−x} p^x
= n(n − 1)p² Σ_{x=2}^{n} C(n − 2, x − 2) q^{n−x} p^{x−2}
= n(n − 1)p² Σ_{y=0}^{n−2} C(n − 2, y) q^{n−2−y} p^y (where y = x − 2)
= n(n − 1)p² (p + q)^{n−2} = n(n − 1)p².

So σ²_X = E(X²) − E(X)² = E(X(X − 1)) + E(X) − E(X)² = n(n − 1)p² + np − n²p² = npq.
(iii) mX(t) = E(e^{tX}) = Σ_{x=0}^{n} e^{tx} b(x; n, p)
= Σ_{x=0}^{n} e^{tx} C(n, x) q^{n−x} p^x
= Σ_{x=0}^{n} C(n, x) q^{n−x} (pe^t)^x
= (q + pe^t)^n.
Note: In the particular case n = 1, the binomial distribution is called Bernoulli distribution:
b(x; 1, p) = q^{1−x} p^x, x = 0, 1.
Ex. The probability that a certain kind of component will survive a shock test is 3/4. Find the
probability that exactly 2 of the next 4 components tested survive.
Ex. The probability that a patient recovers from a rare blood disease is 0.4. If 15 people are
known to have contracted this disease, what is the probability that (a) at least 10 survive, (b)
from 3 to 8 survive, and (c) exactly 5 survive?
Ex. A large chain retailer purchases a certain kind of electronic device from a manufacturer. The
manufacturer indicates that the defective rate of the device is 3%.
(a) The inspector randomly picks 20 items from a shipment. What is the probability that there
will be at least one defective item among these 20?
(b) Suppose that the retailer receives 10 shipments in a month and the inspector randomly tests
20 devices per shipment. What is the probability that there will be exactly 3 shipments each containing at least one defective device among the 20 that are selected and tested from the shipment?
Sol. (a) Denote by X the number of defective devices among the 20. Then X follows a binomial
distribution with n = 20 and p = 0.03. Hence, P (X ≥ 1) = 1 − P (X = 0) = 0.4562.
(b) In this case, each shipment can either contain at least one defective item or not. Hence,
testing of each shipment can be viewed as a Bernoulli trial with p = 0.4562 from part (a). Assuming independence from shipment to shipment and denoting by Y the number of shipments
containing at least one defective item, Y follows another binomial distribution with n = 10 and
p = 0.4562. Therefore,
P (Y = 3) = 0.1602.
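The two-stage reasoning is easy to mirror in Python (our own sketch; the function name is ours):

```python
from math import comb

def binom_pmf(x, n, p):
    """b(x; n, p) = C(n, x) p^x (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# (a) at least one defective among 20 items, with defect rate 3%
p_ship = 1 - binom_pmf(0, 20, 0.03)
print(p_ship)                       # ≈ 0.4562

# (b) exactly 3 of 10 shipments contain at least one defective device
print(binom_pmf(3, 10, p_ship))     # ≈ 0.1602
```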
Ex. In a bombing attack, there is a 50% chance that any bomb will strike the target. At least two direct hits are required to destroy the target. What is the minimum number of bombs that must be dropped so that the probability of hitting the target at least twice is more than 0.99?
Sol. Suppose n bombs are dropped. Let X be the random variable representing the number of bombs striking the target. Then X follows a binomial distribution with parameters n and p = 1/2, and we require

P(X ≥ 2) ≥ 0.99, or 1 − P(X = 0) − P(X = 1) ≥ 0.99.

Since P(X = 0) = 1/2^n and P(X = 1) = n/2^n, this simplifies to 2^n ≥ 100 + 100n, which is satisfied if n ≥ 11. So at least 11 bombs must be dropped so that there is at least a 99% chance of hitting the target at least twice.
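A brute-force Python search (our own sketch) confirms n = 11:

```python
# find the smallest n with P(X >= 2) >= 0.99 when X ~ Binomial(n, 1/2)
n = 1
while 2**n < 100 + 100 * n:   # equivalent to 1 - (1 + n)/2**n < 0.99
    n += 1
print(n)                      # 11
```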
3.3.1 Multinomial Distribution
The binomial experiment becomes a multinomial experiment if we let each trial have more than
two possible outcomes. For example, the drawing of a card from a deck with replacement is a
multinomial experiment if the 4 suits are the outcomes of interest.
In general, if a given trial can result in any one of k possible outcomes o1 , o2 ,. . . , ok with
probabilities p1 , p2 ,. . . , pk , then the multinomial distribution gives the probability that o1 occurs
x1 times, o2 occurs x2 times, . . . , and ok occurs xk times in n independent trials, as follows:
f(x1, x2, . . . , xk) = C(n; x1, x2, ..., xk) p1^{x1} p2^{x2} . . . pk^{xk},

where the multinomial coefficient is

C(n; x1, x2, ..., xk) = n!/(x1! x2! . . . xk!),

and
x1 + x2 + · · · + xk = n, p1 + p2 + · · · + pk = 1.
Clearly, when k = 2, the multinomial distribution reduces to the binomial distribution.
Ex. The probabilities that a person goes to office by car, bus and train are 1/2, 1/4 and 1/4,
respectively. Find the probability that the person will go to office 2 days by car, 3 days by bus
and 1 day by train in the 6 days.
Sol. [6!/(2! 3! 1!)] (1/2)² (1/4)³ (1/4) = 15/256 ≈ 0.0586.
Ex. The complexity of arrivals and departures of planes at an airport is such that computer
simulation is often used to model the “ideal” conditions. For a certain airport with three runways,
it is known that in the ideal setting the following are the probabilities that the individual runways
are accessed by a randomly arriving commercial jet:
Runway 1: p1 = 2/9,
Runway 2: p2 = 1/6,
Runway 3: p3 = 11/18.
What is the probability that 6 randomly arriving airplanes are distributed in the following fashion?
Runway 1: 2 airplanes,
Runway 2: 1 airplane,
Runway 3: 3 airplanes
Sol. [6!/(2! 1! 3!)] (2/9)² (1/6) (11/18)³ ≈ 0.1127.
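Both multinomial answers can be checked with a few lines of Python (our own sketch; the helper name is ours):

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """P(o1 occurs x1 times, ..., ok occurs xk times) in n = sum(counts) trials."""
    n = sum(counts)
    coef = factorial(n) // prod(factorial(x) for x in counts)
    return coef * prod(p**x for p, x in zip(probs, counts))

print(multinomial_pmf([2, 3, 1], [1/2, 1/4, 1/4]))     # commuting: 15/256 ≈ 0.0586
print(multinomial_pmf([2, 1, 3], [2/9, 1/6, 11/18]))   # runways: ≈ 0.1127
```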
3.4 Hypergeometric Distribution
The hypergeometric distribution arises under the following conditions:
(i) The random experiment consists of choosing n objects without replacement from a lot of N objects
given that r objects possess a trait or property of our interest in the lot of N objects.
(ii) X denotes the number of objects possessing the trait or property in the selected sample of size n.
The following Venn-type diagram illustrates the setting.

[Diagram: a population of N objects split into r objects with the trait and N − r without; the sample of n objects contains x objects with the trait and n − x without.]
It is easy to see that the x objects with the trait (by definition of X) are to be chosen from the r objects in C(r, x) ways, while the remaining n − x objects are to be chosen from the N − r objects in C(N − r, n − x) ways. So the n objects carrying x items with the trait can be chosen from the N objects in C(r, x) C(N − r, n − x) ways, while C(N, n) is the total number of ways in which n objects can be chosen from N. Therefore,

h(x; N, r, n) = C(r, x) C(N − r, n − x) / C(N, n).
The random variable X with this pmf is called a hypergeometric random variable. The hypergeometric distribution is characterized by the three parameters N, r and n. Note that X lies in the range max(0, n + r − N) ≤ x ≤ min(n, r). So the minimum value of x can be n + r − N instead of 0. To understand this, let N = 30, r = 20 and n = 15. Then the minimum value of x is n + r − N = 15 + 20 − 30 = 5. For, there are only N − r = 10 objects without the trait among the 30 items, so a sample of 15 items certainly contains at least 5 objects with the trait. In this case, therefore, the random variable X takes the values 5, 6, ..., 15. Notice that the maximum value of x is min(n, r) = min(15, 20) = 15. Similarly, if we choose n = 25, the random variable X takes the values 15, 16, 17, 18, 19 and 20. If instead we choose n = 8, the random variable X takes the values 0, 1, 2, ..., 8.
Next, let us check whether h(x; N, r, n) is a valid pmf. Note that x ∈ [max(0, n + r − N), min(n, r)]. But we can take x ∈ [0, n], because in situations where this range is not [0, n], we have h(x; N, r, n) = 0. Also, recall Vandermonde's identity:

Σ_{x=0}^{n} C(a, x) C(b, n − x) = C(a + b, n), or equivalently, Σ_{x=0}^{n} [C(a, x) C(b, n − x) / C(a + b, n)] = 1.
This identity is understandable in view of the following example. Suppose a team of n persons is chosen from a group of a men and b women. The number of ways of choosing the team of n persons from the group of a + b persons is C(a + b, n), the right-hand side of Vandermonde's identity. We can also count these ways by noting that the team of n persons consists of x men and n − x women for some x; summing over x then gives the left-hand side of Vandermonde's identity.
Now, from Vandermonde's identity, it follows that

Σ_{x=0}^{n} h(x; N, r, n) = Σ_{x=0}^{n} C(r, x) C(N − r, n − x)/C(N, n) = 1. Thus, h(x; N, r, n) is a valid pmf.
The calculation of the moments of the hypergeometric distribution involves the hypergeometric function

2F1(a, b; c; z) = 1 + (ab/c)(z/1!) + [a(a + 1)b(b + 1)/(c(c + 1))](z²/2!) + .......,

where a, b, c are constants, and z is the variable of the hypergeometric function. In particular,

d²/dz² [2F1(a, b; c; z)] = [a(a + 1)b(b + 1)/(c(c + 1))] 2F1(a + 2, b + 2; c + 2; z).
Following this, it can be shown that

µX = E(X) = [d/dt mX(t)]_{t=0} = n(r/N).

Similarly, by calculating the second derivative of mX(t) at t = 0, the variance can be found as

σ²_X = E(X²) − E(X)² = n (r/N) ((N − r)/N) ((N − n)/(N − 1)).
Ex. Five cards are drawn at random without replacement from an ordinary deck of 52 playing cards. Find the probability of getting exactly 2 red cards.
Sol. Here N = 52, r = 26, n = 5, x = 2, and therefore P(X = 2) = h(2; 52, 26, 5) = 0.3251.
Ex. Lots of 40 components each are deemed unacceptable if they contain 3 or more defectives.
The procedure for sampling a lot is to select 5 components at random and to reject the lot if a
defective is found. What is the probability that exactly 1 defective is found in the sample if there
are 3 defectives in the entire lot?
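A Python sketch (our own; the helper name is ours) evaluates this hypergeometric probability:

```python
from math import comb

def hypergeom_pmf(x, N, r, n):
    """h(x; N, r, n) = C(r, x) C(N - r, n - x) / C(N, n)."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

# exactly 1 defective in a sample of 5 from a lot of 40 containing 3 defectives
print(hypergeom_pmf(1, 40, 3, 5))   # ≈ 0.3011
```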
Relation to the binomial distribution. For large N, the hypergeometric pmf approaches the binomial pmf with p = r/N. To see this, write

h(x; N, r, n) = C(r, x) C(N − r, n − x)/C(N, n)
= [r!/(x!(r − x)!)] · [(N − r)!/((n − x)!(N − r − (n − x))!)] · [n!(N − n)!/N!]
= C(n, x) · [r!/(r − x)!] · [(N − r)!/(N − r − (n − x))!] · [(N − n)!/N!]
= C(n, x) · Π_{k=1}^{x} (r − x + k)/(N − x + k) · Π_{m=1}^{n−x} (N − r − (n − x) + m)/(N − n + m),

since N!/(N − n)! = Π_{j=1}^{n} (N − n + j) splits as Π_{m=1}^{n−x} (N − n + m) · Π_{k=1}^{x} (N − x + k). Now, taking the large-N limit for fixed r/N = p, n and x, we get the binomial pmf

b(x; n, p) = C(n, x) p^x q^{n−x},
since

lim_{N→∞} (r − x + k)/(N − x + k) = lim_{N→∞} r/N = p

and

lim_{N→∞} (N − r − (n − x) + m)/(N − n + m) = lim_{N→∞} (N − r)/N = 1 − p = q.
In practice, this means that we can approximate hypergeometric probabilities with binomial probabilities provided N ≫ n. As a rule of thumb, if the population size is more than 20 times the sample size (N/n > 20), then we may use binomial probabilities in place of hypergeometric probabilities.
Ex. A manufacturer of automobile tires reports that among a shipment of 5000 sent to a local
distributor, 1000 are slightly blemished. If one purchases 10 of these tires at random from the
distributor, what is the probability that exactly 3 are blemished?
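Here N/n = 5000/10 = 500 > 20, so the binomial approximation with p = 1000/5000 = 0.2 applies. The Python sketch below (our own) compares the approximation with the exact hypergeometric value:

```python
from math import comb

N, r, n, x = 5000, 1000, 10, 3

exact = comb(r, x) * comb(N - r, n - x) / comb(N, n)   # hypergeometric
approx = comb(n, x) * 0.2**x * 0.8**(n - x)            # binomial, p = r/N = 0.2
print(exact, approx)   # the two values agree to about three decimal places (≈ 0.201)
```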
Ex. Ten cards are randomly chosen without replacement from a deck of 52 playing cards. What is the probability of getting 2 spades, 3 clubs, 4 diamonds and 1 heart?
3.5 Poisson Distribution
Consider the pmf of the binomial random variable X:

b(x; n, p) = C(n, x) p^x (1 − p)^{n−x}, x = 0, 1, 2, · · · , n.

Let us calculate the limiting form of the binomial distribution as n → ∞, p → 0, with np = k a constant. We have

b(x; n, p) = C(n, x) p^x (1 − p)^{n−x}
= [n!/(x!(n − x)!)] p^x (1 − p)^{n−x}
= [n(n − 1)...(n − x + 1)/x!] p^x (1 − p)^{n−x}
= [(np)(np − p)...(np − xp + p)/x!] (1 − p)^{n−x}
= [(np)(np − p)...(np − xp + p)/x!] (1 − p)^{−x} (1 − p)^n
= [k(k − p)...(k − xp + p)/x!] (1 − p)^{−x} (1 − p)^{k/p}   (using np = k).

As p → 0, each factor k − jp → k, while (1 − p)^{−x} → 1 and (1 − p)^{k/p} → e^{−k}. Thus, in the limit p → 0, we get

p(x; k) = e^{−k} k^x / x!, x = 0, 1, 2, ....

The Poisson distribution is characterized by the single parameter k.
Mean, variance and mgf of Poisson random variable
(i) µX = E(X) = Σ_{x=0}^{∞} x p(x; k)
= Σ_{x=1}^{∞} x e^{−k} k^x / x!
= k e^{−k} Σ_{x=1}^{∞} k^{x−1}/(x − 1)!
= k e^{−k} e^k
= k.

(ii) σ²_X = E(X²) − E(X)² = E(X(X − 1)) + E(X) − E(X)²
= Σ_{x=2}^{∞} x(x − 1) e^{−k} k^x / x! + k − k²
= k² e^{−k} Σ_{x=2}^{∞} k^{x−2}/(x − 2)! + k − k²
= k² e^{−k} e^k + k − k²
= k.

We notice that

µX = k = σ²_X.
(iii) mX(t) = E(e^{tX}) = Σ_{x=0}^{∞} e^{tx} p(x; k)
= Σ_{x=0}^{∞} e^{tx} e^{−k} k^x / x!
= e^{−k} Σ_{x=0}^{∞} (ke^t)^x / x!
= e^{−k} e^{ke^t}
= e^{k(e^t − 1)}.
Ex. Suppose events occur according to a Poisson process at an average rate of λ = 6000 per unit time. What is the probability of exactly 3 occurrences in an interval of length s = 0.001?
Sol. Here λ = 6000, s = 0.001, k = λs = 6 and x = 3, and therefore P(X = 3) = p(3; 6) = e^{−6} 6³/3!.
Ex. In the last 5 years, 10 students of BITS-Pilani are placed with a package of more than one
crore. Find the probability that exactly 7 students will be placed with a package of more than one
crore in the next 3 years.
Sol. Here λ = 10/5 = 2 (per year), s = 3, k = λs = 6 and x = 7, and therefore P(X = 7) = p(7; 6) = e^{−6} 6⁷/7!.
Ex. During a laboratory experiment, the average number of radioactive particles passing through
a counter in 1 millisecond is 4. What is the probability that 6 particles enter the counter in a given
millisecond?
Ex. Ten is the average number of oil tankers arriving each day at a certain port. The facilities
at the port can handle at most 15 tankers per day. What is the probability that on a given day
tankers have to be turned away?
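For the tanker exercise, the required probability is P(X > 15) = 1 − Σ_{x=0}^{15} p(x; 10); one way to compute it is the Python sketch below (our own; the helper name is ours):

```python
from math import exp, factorial

def poisson_pmf(x, k):
    """p(x; k) = e^(-k) k^x / x!."""
    return exp(-k) * k**x / factorial(x)

# tankers turned away: more than 15 arrivals when the mean is 10 per day
p_turned_away = 1 - sum(poisson_pmf(x, 10) for x in range(16))
print(p_turned_away)   # ≈ 0.0487
```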
Note: We proved that the Binomial distribution tends to the Poisson distribution as n → ∞,
p → 0 and np = k remains constant. Thus, we may use Poisson distribution to approximate
binomial probabilities when n is large and p is small. As a rule of thumb this approximation can
safely be applied if n > 50 and np < 5.
Ex. In a certain industrial facility, accidents occur infrequently. It is known that the probability of an accident on any given day is 0.005 and that accidents are independent of each other. (a) What is the probability that in any given period of 400 days there will be an accident on one day? (b) What is the probability that there are at most three days with an accident?
Sol. Let X be a binomial random variable with n = 400 and p = 0.005. Thus, np = 2. Using the Poisson approximation,
(a) P(X = 1) = e^{−2} 2¹ = 0.271 and
(b) P(X ≤ 3) = Σ_{x=0}^{3} e^{−2} 2^x / x! = 0.857.
Ex. In a manufacturing process where glass products are made, defects or bubbles occur, occasionally rendering the piece undesirable for marketing. It is known that, on average, 1 in every
1000 of these items produced has one or more bubbles. What is the probability that a random
sample of 8000 will yield fewer than 7 items possessing bubbles?
Sol. This is essentially a binomial experiment with n = 8000 and p = 0.001. Since p is very close to 0 and n is quite large, we approximate with the Poisson distribution using k = (8000)(0.001) = 8. Hence, if X represents the number of items with bubbles, the required probability is P(X < 7) = 0.3134.
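The following Python sketch (our own) computes the Poisson approximation and the exact binomial value side by side, confirming how good the approximation is here:

```python
from math import exp, factorial, comb

k, n, p = 8, 8000, 0.001

poisson = sum(exp(-k) * k**x / factorial(x) for x in range(7))          # P(X < 7), Poisson(8)
binomial = sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(7))  # exact binomial
print(poisson, binomial)   # both ≈ 0.3134
```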
3.6 Uniform Distribution
A random variable X is said to follow the (discrete) uniform distribution if it assumes a finite number of values, all with the same chance of occurrence, i.e., with equal probabilities. For instance, if the random variable X assumes n values x1, x2, ...., xn with equal probabilities P(X = xi) = 1/n, then it is a uniform random variable with pmf given by

u(x) = 1/n, x = x1, x2, ...., xn.
The moment generating function, mean and variance of the uniform random variable, respectively, read as

mX(t) = (1/n) Σ_{i=1}^{n} e^{t xi},   µ = (1/n) Σ_{i=1}^{n} xi,

σ² = (1/n) Σ_{i=1}^{n} xi² − ((1/n) Σ_{i=1}^{n} xi)².
Ex. Suppose a fair die is thrown once. Let X denote the number appearing on the die. Then X is a discrete random variable assuming the values 1, 2, 3, 4, 5, 6. Also, P(X = 1) = P(X = 2) = P(X = 3) = P(X = 4) = P(X = 5) = P(X = 6) = 1/6. Thus, X is a uniform random variable.
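A final Python sketch (our own) computes the die's mean and variance directly from the formulas above:

```python
vals = list(range(1, 7))   # faces of a fair die, each with probability 1/6
n = len(vals)

mu = sum(vals) / n                          # (1/n) * sum of x_i = 3.5
var = sum(x**2 for x in vals) / n - mu**2   # (1/n) * sum of x_i^2 - mu^2 = 35/12 ≈ 2.9167
print(mu, var)
```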