
MA1201 - Probability

Spring 2025
Dr. Sushmitha P

Random Variable (R.V.)


- A random variable is a function that assigns a real number to each outcome in the
sample space of a random experiment.
- A random variable is usually denoted by X, and the values of the random variable
are denoted by x.

Types of Random Variables


1. Discrete R.V.: A random variable with a finite (or countably infinite) range.
2. Continuous R.V.: A random variable with an interval of real numbers as its range.
Examples:
- Discrete: Number of scratches on a surface, proportion of defective parts among 1000
tested, etc.
- Continuous: Electrical current, length, pressure, temperature, etc.
In some cases, X may be discrete, but because the range is so large, it might be
convenient to analyze X as a continuous random variable.
Example: Current measurements from a digital instrument that displays current
to the nearest 1/100th of a milliampere.

Note:
1. If X(x) ≠ 0 for x ∈ S (X : S → R), then 1/X(x) and |X(x)| are also random variables.
2. If X and Y are random variables on S, then X ± Y , XY , and aX + bY (where a, b
are constants) are all random variables on S.

Probability Distribution for a Discrete Random Variable
The probability distribution of a random variable X is a description of the proba-
bilities associated with the possible values of X.
For a discrete random variable:

X = xi :        x1   x2   · · ·   xi   · · ·
P (X = xi ):    p1   p2   · · ·   pi   · · ·

Here, xi, i = 1, 2, . . . are the values of the random variable X, and pi = P(X = xi) are the corresponding probabilities, which satisfy Σi P(X = xi) = 1.
It can also be represented as a frequency polygon or histogram.

1
Probability Mass Function (PMF)
For a discrete random variable X, with possible values x1 , x2 , . . . , xn , a PMF is a func-
tion f such that:

1. f(xi) ≥ 0,

2. Σi f(xi) = 1,

3. f(xi) = P(X = xi).

Example: Two dice are rolled. Let X represent the sum of the values on the dice. The possible sums are 2 through 12.

X :            2     3     4     5     6     7     8     9     10    11    12
P (X = xi ):   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
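As a quick illustration (a Python sketch added here, not part of the original notes), the PMF above can be tabulated by enumerating all 36 equally likely outcomes:

```python
from itertools import product
from collections import Counter

# Count each possible sum over the 36 equally likely (die1, die2) outcomes.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
pmf = {s: c / 36 for s, c in sorted(counts.items())}

print(pmf[7])             # 6/36 ≈ 0.1667, the most likely sum
print(sum(pmf.values()))  # 1.0: the probabilities sum to one
```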

Example: Let the random variable X denote the number of semiconductor wafers
that need to be analyzed to detect a large particle of contamination. Assume that the
probability that a wafer contains a large particle is p = 0.01, and that the wafers are
independent. Determine the probability distribution of X.
Let a = particle absent, p = particle present.
Sample space: S = {p, ap, aap, aaap, . . .}.
We have:
P(X = 1) = P(p) = 0.01,
P(X = i) = P(a a · · · a p) = (0.99)^(i−1) (0.01), for i = 1, 2, . . .
(Verify that this is a PMF.)
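A short sketch (illustrative, with the same p = 0.01) confirms numerically that these probabilities form a PMF:

```python
p = 0.01

def pmf(i):
    # P(X = i): i - 1 particle-free wafers, then one with a particle
    return (1 - p)**(i - 1) * p

print(pmf(1))                               # 0.01
print(sum(pmf(i) for i in range(1, 3000)))  # ≈ 1, truncating the infinite sum
```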

Cumulative Distribution Functions


The probability P(X ≤ x) is the probability of the event {X ≤ x}. It is a function of x and is denoted by F(x).
The cumulative distribution function of a discrete random variable X is:

F(x) = P(X ≤ x) = Σ_{xi ≤ x} P(X = xi),   −∞ < x < ∞

Properties:
• F (−∞) = 0 and F (∞) = 1

• 0 ≤ F (x) ≤ 1

• If x ≤ y, F (x) ≤ F (y) (monotonically increasing function)

• P(x1 < X ≤ x2) = F(x2) − F(x1)

Example 1: Two dice rolled
X takes values 2, 3, . . . , 12. The CDF is the running sum of the PMF:

F(x) = 0 for x < 2;  1/36 for 2 ≤ x < 3;  3/36 for 3 ≤ x < 4;  6/36 for 4 ≤ x < 5;  . . . ;  1 for x ≥ 12.

(A piecewise constant step function.)

Example 2: Determine the PMF of X from the following cumulative distribution function

F(x) = 0 for x < 1;  1/4 for 1 ≤ x < 2;  1/2 for 2 ≤ x < 3;  7/8 for 3 ≤ x < 4;  1 for x ≥ 4.

The size of the step at xi is the probability at X = xi:

P(X = xi) = F(xi) − F(xi−1)

For example:
P(X = 2) = F(2) − F(1) = 1/2 − 1/4 = 1/4.

Altogether: P(X = 1) = 1/4, P(X = 2) = 1/4, P(X = 3) = 3/8, P(X = 4) = 1/8.
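The step-size idea translates directly into a few lines of Python (an illustrative sketch using the F-values of this example):

```python
# F-values just after each jump, taken from the example above.
F = {1: 1/4, 2: 1/2, 3: 7/8, 4: 1.0}
pmf, prev = {}, 0.0
for x in sorted(F):
    pmf[x] = F[x] - prev  # the size of the step at x
    prev = F[x]
print(pmf)  # {1: 0.25, 2: 0.25, 3: 0.375, 4: 0.125}
```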

Note:
For any interval probabilities:

P (a < X ≤ b) = F (b) − F (a)


P (a ≤ X ≤ b) = F (b) − F (a) + P (X = a)
P (a < X < b) = F (b) − F (a) − P (X = b)
P (a ≤ X < b) = F (b) − F (a) + P (X = a) − P (X = b)

Mean and Variance of a Discrete Random Variable


Two numbers that summarize a probability distribution:

• Mean – center or middle value.

• Variance – measure of dispersion.

They don’t uniquely identify a distribution.

Mean or Expected Value
The mean or expected value of a discrete random variable X, denoted by µ or E(X), is given by:

E(X) = Σi xi P(X = xi) = Σi xi pi

Variance
The variance of X, denoted by σ² or V(X), is:

V(X) = E((X − µ)²) = Σi (xi − µ)² P(X = xi)

Expanding,

V(X) = Σi xi² pi − 2µ Σi xi pi + µ² Σi pi

Since Σi pi = 1 and Σi xi pi = µ, this simplifies to:

V(X) = Σi xi² pi − µ²

The standard deviation is σ = √σ².

Key Points
• Mean: a weighted average of the possible values of X.
• Think of f(x) as the loading on a long, thin beam.
• E(X) is then the point at which the beam balances (the center of the distribution).
• Variance: a measure of how scattered the distribution is about the mean.

Example
Let X have the PMF:
x = xi −2 −1 0 1 2 3
P (X = xi ) 0.1 k 0.2 2k 0.3 k

Find: 1. k 2. Mean 3. Variance


Step 1: Solve for k using Σ P(X = xi) = 1:

0.1 + k + 0.2 + 2k + 0.3 + k = 1

Simplifying:
0.6 + 4k = 1 =⇒ k = 0.1
Step 2: Compute Mean:

E(X) = (−2)(0.1) + (−1)(0.1) + (0)(0.2) + (1)(0.2) + (2)(0.3) + (3)(0.1)

Simplifying:
E(X) = −0.2 − 0.1 + 0 + 0.2 + 0.6 + 0.3 = 0.8
Step 3: Compute Variance:

V (X) = (−2)2 (0.1) + (−1)2 (0.1) + (0)2 (0.2) + (1)2 (0.2) + (2)2 (0.3) + (3)2 (0.1) − (0.8)2

Simplifying:

V (X) = (4)(0.1) + (1)(0.1) + (0)(0.2) + (1)(0.2) + (4)(0.3) + (9)(0.1) − 0.64

V (X) = 2.8 − 0.64 = 2.16
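A small sketch (added for illustration) verifies k, the mean, and the variance:

```python
xs = [-2, -1, 0, 1, 2, 3]
k = 0.1
ps = [0.1, k, 0.2, 2 * k, 0.3, k]
assert abs(sum(ps) - 1) < 1e-12  # valid PMF with k = 0.1

mean = sum(x * p for x, p in zip(xs, ps))
var = sum(x * x * p for x, p in zip(xs, ps)) - mean**2
print(mean, var)  # 0.8 2.16
```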


NOTE: Expected value of a function of a random variable: If h(X) is a function of the random variable X, then:

E(h(X)) = Σi h(xi) pi

Example: Let X have the PMF:

X :      0     1     2
P(X) :   1/2   1/4   1/4

E(X²) = 0²·(1/2) + 1²·(1/4) + 2²·(1/4) = 0 + 1/4 + 1 = 5/4

Note that E(X)² ≠ E(X²):

E(X) = 0·(1/2) + 1·(1/4) + 2·(1/4) = 0 + 1/4 + 1/2 = 3/4

E(X)² = (3/4)² = 9/16

For a linear function aX + b, however, expectation does pass through:

E(aX + b) = aE(X) + b

Variance:

V(aX + b) = a² V(X)   (variance is unchanged by translation)

Discrete Uniform Distribution
A random variable X has a discrete uniform distribution if each of the n values in its range x1, x2, . . . , xn has equal probability.
Then f(x) = 1/n.
Mean:

µ = E(X) = Σi xi pi with pi = 1/n, so E(X) = (Σ_{i=1}^{n} xi)/n

Variance:

σ² = V(X) = E(X²) − (E(X))² = (Σi xi²)/n − ((Σi xi)/n)²

Suppose the xi are consecutive integers from a to b (a ≤ b). Then:

µ = (a + b)/2,   σ² = ((b − a + 1)² − 1)/12

Suppose the xi are consecutive integers from 1 to n. Then:

µ = (n + 1)/2,   σ² = (n² − 1)/12

Rolling a fair die and tossing a fair coin are examples of discrete uniform distributions.


Example: Voice Communication System
A voice communication system for a business has 48 external lines. At a particular
time, the system is observed, and some of the lines are being used. Let the random
variable X denote the number of lines in use.

Range of X : {0, 1, 2, . . . , 48}


Assume X is a discrete uniform random variable.
E(X) = (0 + 48)/2 = 24

σ = √(((48 − 0 + 1)² − 1)/12) = √((49² − 1)/12) = √200 ≈ 14.14
On average, 24 lines are in use. But since σ is large, at many times far more or
fewer than 24 lines are used.

These formulas extend to a random variable taking equally spaced values such as 5, 10, 15, . . . , 30: write Y = 5X, where X takes values 1, 2, . . . , 6.

E(Y) = 5E(X) = 5 · (1 + 6)/2 = 17.5

V(Y) = 5² V(X) = 25 · ((6 − 1 + 1)² − 1)/12 = 25 · (36 − 1)/12 = 72.92
If Y denotes the proportion of the 48 voice lines in use at a particular time, then Y = X/48.

E(Y) = E(X)/48 = 24/48 = 0.5

V(Y) = V(X)/48² = 200/48² ≈ 0.087
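A sketch (illustrative) checks the discrete uniform formulas against a direct computation for the voice-line example:

```python
vals = range(0, 49)           # lines in use: 0..48
n = len(vals)
mean = sum(vals) / n          # (0 + 48)/2 = 24
var = sum(x * x for x in vals) / n - mean**2
print(mean, var, var**0.5)    # 24.0 200.0 ≈ 14.14
```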

Bernoulli Distribution
A random experiment with only two possible outcomes, success or failure, is called a Bernoulli trial. A random variable X that takes only the values 0 and 1 is called a Bernoulli variable, and the corresponding distribution is the Bernoulli distribution.
Let X ∼ Bernoulli(p), with probability of success p and probability of failure q = 1 − p.

P(X = x) = p^x (1 − p)^(1−x) for x = 0, 1, and 0 otherwise.

X :           0       1
P(X = xi):    1 − p   p

Mean: µ = E(X) = 0·(1 − p) + 1·p = p
Variance: σ² = E(X²) − E(X)² = (0²(1 − p) + 1²·p) − p² = p − p² = p(1 − p) = pq
Standard deviation: σ = √(pq)

Binomial Distribution B(n, p)


Consider the following experiments and random variables:
1. Flip a coin 10 times. Let X be the number of heads.
2. In the next 20 births at a hospital, let X be the number of female births.
In all cases:
- The result of each experiment (performed repeatedly) is either a success or a failure.
- The experiment consists of n identical trials, where n is finite.
- Each trial is a Bernoulli trial (outcome is failure or success).
- All trials are independent.
- The probability of success p in each trial remains constant.

The random variable X, which equals the number of trials that result in success, is
a binomial random variable with parameters n = 1, 2, ... and 0 < p < 1.
The PMF is given by:

P(X = x) = nCx p^x (1 − p)^(n−x),   x = 0, 1, 2, . . . , n

The term “bi-nomial” (two outcomes) comes from the binomial expansion:

(p + q)^n = Σ_{k=0}^{n} nCk p^k q^(n−k),   q = 1 − p

which shows that the probabilities sum to 1.



Define a new random variable for each independent trial: for i = 1, 2, . . . , n, let

Xi = 1 if the i-th trial is a success, and Xi = 0 otherwise.

Then,

X = Σ_{i=1}^{n} Xi

Expected value:

E(X) = Σ_{i=1}^{n} E(Xi) = np

Variance (by independence):

V(X) = Σ_{i=1}^{n} V(Xi) = npq.
Note:
If p = 0.5, the binomial distribution is symmetric.
If p < 0.5, then the binomial distribution is skewed to the right.
If p > 0.5, then the binomial distribution is skewed to the left.

Example 1:
A die is thrown 4 times. Getting a number greater than 2 is a success. Find the probability of getting (a) exactly one success, (b) fewer than 3 successes.
We have n = 4 and p = 4/6 = 2/3, so q = 1 − p = 1/3.
(a) Probability of exactly one success:

P(X = 1) = 4C1 (2/3)^1 (1/3)^3 = 4 · (2/3) · (1/27) = 8/81

(b) Probability of fewer than 3 successes:

P(X < 3) = P(X = 0) + P(X = 1) + P(X = 2)

P(X = 0) = 4C0 (2/3)^0 (1/3)^4 = 1 · 1 · (1/81) = 1/81

P(X = 2) = 4C2 (2/3)^2 (1/3)^2 = 6 · (4/9) · (1/9) = 24/81

Thus,

P(X < 3) = 1/81 + 8/81 + 24/81 = 33/81.
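These binomial probabilities are easy to check with Python's math.comb (an illustrative sketch):

```python
from math import comb

def binom_pmf(x, n, p):
    # P(X = x) for a binomial random variable with parameters n and p
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 4, 2/3
print(binom_pmf(1, n, p))                         # 8/81 ≈ 0.0988
print(sum(binom_pmf(x, n, p) for x in range(3)))  # 33/81 ≈ 0.4074
```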

Example 2:
The average percentage of failures in a certain examination is 40%. What is the probability that out of a group of 6 candidates, at least 4 pass?
Here, n = 6, p = 0.6, and q = 0.4. We need P(X ≥ 4):

P(X ≥ 4) = P(X = 4) + P(X = 5) + P(X = 6)

P(X = 4) = 6C4 (0.6)^4 (0.4)^2 = 15(0.1296)(0.16) = 0.31104

P(X = 5) = 6C5 (0.6)^5 (0.4)^1 = 6(0.07776)(0.4) = 0.186624

P(X = 6) = 6C6 (0.6)^6 (0.4)^0 = 0.046656

Adding these probabilities:

P(X ≥ 4) = 0.31104 + 0.186624 + 0.046656 = 0.54432.

Example 3:
X follows a binomial distribution such that 4P(X = 4) = P(X = 2). If n = 6, find p.
We know: 4 · 6C4 p⁴q² = 6C2 p²q⁴. Since 6C4 = 6C2 = 15, this gives 4p² = q² = (1 − p)², which simplifies to 3p² + 2p − 1 = 0. Solving, p = −1 or p = 1/3. Since p = −1 is not possible, p = 1/3.

Example 4:
Find the maximum n such that the probability of getting no head in tossing a coin n times is greater than 0.1.

p = 0.5, q = 0.5

P(X = 0) > 0.1 =⇒ nC0 p⁰ qⁿ > 0.1 =⇒ (0.5)ⁿ > 0.1

Calculating for different values of n:

n = 1 =⇒ 0.5 > 0.1
n = 2 =⇒ 0.25 > 0.1
n = 3 =⇒ 0.125 > 0.1
n = 4 =⇒ 0.0625 < 0.1

Thus, the maximum n is 3.

Example 5:
If the sum of the mean and variance of a binomial distribution with 5 trials is 9/5, find the binomial distribution.
For a binomial distribution:

np + npq = 9/5

With n = 5:

5p + 5p(1 − p) = 9/5

10p − 5p² = 9/5

p² − 2p + 9/25 = 0

Solving with the quadratic formula gives p = 9/5 or p = 1/5. Since p is a probability, p = 1/5. The binomial distribution is:

P(X = x) = 5Cx (1/5)^x (4/5)^(5−x),   x = 0, 1, 2, . . . , 5.

Geometric and Negative Binomial Distribution


Assume a series of Bernoulli trials. Trials are conducted until a success is obtained.
Let X denote the number of trials until the first success.
For example: The probability that a bit transmitted through a digital transmission channel is received in error is 0.1. Assume that transmissions are independent events, and let the random variable X denote the number of bits transmitted until the first error.
The probability P(X = 5) is the probability that the first four bits are transmitted correctly and the fifth bit is in error.
S = error free, E = error

P(X = 5) = P(SSSSE) = (0.9)⁴ (0.1)

If the first trial is a success, then X = 1. The range of X is:

X ∈ {1, 2, . . . }.

Geometric Distribution
In a series of Bernoulli trials (independent trials with constant probability p of a suc-
cess), the random variable X that equals the number of trials until the first success is
a geometric random variable with parameter p.

f(x) = (1 − p)^(x−1) p,   x = 1, 2, . . .

The probabilities decrease in a geometric progression, which gives the distribution its name.

Mean: E(X)

E(X) = Σ_{k=1}^{∞} k p (1 − p)^(k−1)

Let q = 1 − p and use Σ_{k=1}^{∞} q^k = q/(1 − q). Then:

E(X) = p · d/dq [ Σ_{k=1}^{∞} q^k ] = p · d/dq [ q/(1 − q) ]

Differentiating:

E(X) = p · [(1 − q) + q]/(1 − q)² = p · 1/(1 − q)²

Substituting q = 1 − p:

E(X) = 1/p

Variance: V(X)
The variance is given by:

V(X) = E(X²) − (E(X))²

First, compute E(X²):

E(X²) = Σ_{k=1}^{∞} k² p (1 − p)^(k−1)

Writing k² = k(k − 1) + k, with q = 1 − p:

E(X²) = p Σ_{k=1}^{∞} [k(k − 1) + k] q^(k−1)
      = p ( Σ_{k=1}^{∞} k(k − 1) q^(k−1) + Σ_{k=1}^{∞} k q^(k−1) )
      = p q Σ_{k=2}^{∞} k(k − 1) q^(k−2) + 1/p
      = p q · d²/dq² [ q/(1 − q) ] + 1/p
      = p q · 2/(1 − q)³ + 1/p

Substituting q = 1 − p:

E(X²) = 2pq/p³ + 1/p = (2q + p)/p²

Finally, compute V(X):

V(X) = E(X²) − (E(X))² = (2q + p)/p² − (1/p)² = q/p²
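A simulation sketch (added for illustration; the sampler and seed are arbitrary choices) confirms E(X) = 1/p and V(X) = q/p²:

```python
import random
random.seed(0)  # arbitrary seed, for reproducibility

def geometric(p):
    """Number of Bernoulli(p) trials until the first success."""
    n = 1
    while random.random() >= p:  # failure; keep trying
        n += 1
    return n

p = 0.25
samples = [geometric(p) for _ in range(100_000)]
mean = sum(samples) / len(samples)
var = sum((x - mean)**2 for x in samples) / len(samples)
print(mean, var)  # ≈ 1/p = 4 and q/p² = 12
```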

Memoryless Property of Geometric Distribution


P (X > t + s|X > s) = P (X > t)
Equivalently:
P (X > t + s) = P (X > t)P (X > s)
For example, if 100 bits have been transmitted without error, the probability that the first error occurs on bit 106 is the probability that the next six outcomes are SSSSSE (identical to the probability that the first error occurs at the 6th bit).

Example:
A typist types 3 letters erroneously for every 100 letters. What is the probability that the fourth letter typed is the first erroneous letter?
Solution: Let X denote the number of letters typed until the first erroneous letter. With p = 0.03:

P(X = 4) = (1 − p)³ p = (0.97)³ (0.03)

Negative Binomial Distribution

Let X denote the number of trials required to obtain r successes.
For X = x, the first r − 1 successes must occur somewhere in the first x − 1 trials, and the x-th trial must be a success. Since trials are independent, with success probability p:

f(x) = P(X = x) = C(x−1, r−1) p^r (1 − p)^(x−r),   x = r, r + 1, . . .

At least r trials are required to get r successes, so the range is x = r, r + 1, . . .

When r = 1, the negative binomial distribution reduces to the geometric distribution.

X = X1 + X2 + · · · + Xr

where Xi denotes the number of trials needed to get the i-th success after the (i − 1)-th success. Each Xi is a geometric random variable, and by the memoryless property the Xi are independent. Hence:

E(X) = r/p

V(X) = r(1 − p)/p²
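The decomposition X = X1 + · · · + Xr can be checked by simulation (an illustrative sketch along the same lines as the geometric sampler above):

```python
import random
random.seed(1)  # arbitrary seed

def geometric(p):
    # trials until the first success
    n = 1
    while random.random() >= p:
        n += 1
    return n

r, p = 3, 0.5
# A negative binomial draw as a sum of r independent geometric draws.
samples = [sum(geometric(p) for _ in range(r)) for _ in range(50_000)]
print(sum(samples) / len(samples))  # ≈ r/p = 6
```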

Poisson Distribution
The Poisson distribution is a type of probability distribution useful in describing the
number of events that will occur in a specific period of time or in a specific area/volume.
Examples:

• Number of car accidents in a year, on a road.

• Number of printing mistakes on each page of a book.

• Number of calls to a telephone exchange.

In general:
• The intervals are considered to be of small length ∆t, and assume that ∆t → 0.
• X is discrete.
• Number of trials is indefinitely very large (n → ∞).
• Probability of success is very small (p → 0).
• np = λ, which is a constant.
The probability mass function for the Poisson distribution may be derived from the PMF of the binomial distribution:

P(X = x) = nCx p^x q^(n−x),   q = 1 − p.

Now np = λ, so p = λ/n. Substituting and simplifying:

P(X = x) = [n!/(x!(n − x)!)] (λ/n)^x (1 − λ/n)^(n−x).

As n → ∞,

P(X = x) = λ^x e^(−λ)/x!,   x = 0, 1, 2, . . .

Here, λ is the parameter.
We may observe that the probabilities sum to 1:

Σ_{x=0}^{∞} λ^x e^(−λ)/x! = e^(λ) e^(−λ) = 1

The mean and variance of the Poisson distribution are as follows:
E(X) = Σ_{x=0}^{∞} x λ^x e^(−λ)/x! = Σ_{x=1}^{∞} x λ^x e^(−λ)/x!
     = λ e^(−λ) Σ_{x=1}^{∞} λ^(x−1)/(x − 1)! = λ e^(−λ) e^(λ) = λ.

For E(X²), write x² = x(x − 1) + x:

E(X²) = Σ_{x=0}^{∞} x² λ^x e^(−λ)/x! = Σ_{x=0}^{∞} [x(x − 1) + x] λ^x e^(−λ)/x!
      = λ² e^(−λ) Σ_{x=2}^{∞} λ^(x−2)/(x − 2)! + λ e^(−λ) Σ_{x=1}^{∞} λ^(x−1)/(x − 1)!
      = λ² e^(−λ) e^(λ) + λ = λ² + λ

Variance:

V (X) = E(X 2 ) − [E(X)]2 = (λ2 + λ) − (λ)2 = λ.

For large n and small p (with np = λ held constant), the Poisson distribution serves as an approximation to the binomial distribution.
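A numeric sketch (added for illustration) evaluates the Poisson PMF directly; truncating the sums at x = 59 leaves a negligible tail for λ = 5:

```python
from math import exp, factorial

lam = 5.0
pmf = [exp(-lam) * lam**x / factorial(x) for x in range(60)]  # tail ≈ 0
mean = sum(x * p for x, p in enumerate(pmf))
var = sum(x * x * p for x, p in enumerate(pmf)) - mean**2
print(sum(pmf), mean, var)  # ≈ 1, 5, 5: total probability 1, mean = variance = λ
```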

Example 1:
There are 50 telephone lines in an exchange. The probability of any line being busy is p = 0.1. What is the probability that all lines are busy?
Solution: Let X be the number of busy lines. Given n = 50, p = 0.1, so λ = np = 5.

P(X = 50) = λ⁵⁰ e^(−λ)/50! = 5⁵⁰ e⁻⁵/50!

Example 2:
If X follows a Poisson distribution with P(X = 2) = (2/3) P(X = 1), find P(X = 0).
Solution:

(λ²/2!) e^(−λ) = (2/3) (λ/1!) e^(−λ)

λ = 4/3

P(X = 0) = (λ⁰/0!) e^(−λ) = e^(−4/3).

Example 3:
If X follows a Poisson distribution with P(X = 2) = 9P(X = 4) + 90P(X = 6), find the mean.
Solution:

(λ²/2!) e^(−λ) = 9 (λ⁴/4!) e^(−λ) + 90 (λ⁶/6!) e^(−λ)

Dividing through by e^(−λ) λ²/2 and simplifying: λ⁴ + 3λ² − 4 = 0, so (λ² − 1)(λ² + 4) = 0.

Thus the mean is λ = 1.

Example 4:
Show that three successive values of a Poisson variate cannot have equal probability.
(Hint: P(X = k + 1)/P(X = k) = λ/(k + 1), so equal probabilities at k, k + 1, k + 2 would require λ = k + 1 and λ = k + 2 simultaneously, which is impossible.)

Continuous Random Variable
Probability Density Function
For a continuous random variable X, a probability density function (pdf) is a function
such that:

(i) f(x) ≥ 0, x ∈ (−∞, ∞),

(ii) ∫_{−∞}^{∞} f(x) dx = 1,

(iii) P(a ≤ X ≤ b) = ∫_a^b f(x) dx.

Histogram approximates PDF (Riemann Integration)


For any arbitrary continuous random variable X:

P (X = x) = 0.

Here:
P (a < X < b) = P (a ≤ X < b) = P (a < X ≤ b) = P (a ≤ X ≤ b).

Cumulative Distribution Function


The cumulative distribution function (CDF) is defined as:

F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt,   −∞ < x < ∞.

The relationship between the PDF and CDF is:

f(x) = F′(x).

Hence:

P(a ≤ X ≤ b) = F(b) − F(a).
In the discrete case, F (x) is not continuous. But, here F (x) is a continuous function.

Mean and Variance


Suppose X is a continuous random variable with pdf f(x):

µ = E(X) = ∫_{−∞}^{∞} x f(x) dx.

The variance is given by:

σ² = V(X) = E((X − µ)²) = ∫_{−∞}^{∞} (x − µ)² f(x) dx.

Expanding the formula:

σ² = ∫_{−∞}^{∞} x² f(x) dx − µ².

For the expected value of a function of X:

E(h(X)) = ∫_{−∞}^{∞} h(x) f(x) dx.

For linear transformations:

E(aX + b) = aE(X) + b.

Example 1:
Let the continuous random variable X denote the current measurement in a thin copper wire, in mA. Assume the range of X is [4.9, 5.1] and f(x) = 5 for x ∈ [4.9, 5.1]. What is the probability that a current measurement is less than 5 mA?

P(X < 5) = ∫_{4.9}^{5} 5 dx = 5(5 − 4.9) = 0.5

What is the expected value of power when the resistance is 100 Ω?

P = I²R, where I = 10⁻³X is the current in A (X is in mA) and R is the resistance in Ω, so P = 10⁻⁶ X² R.

E(P) = E(10⁻⁶ X² R) = 10⁻⁶ R ∫_{4.9}^{5.1} x² f(x) dx

Substituting R = 100 Ω and f(x) = 5:

E(P) = 10⁻⁶ (100)(5) ∫_{4.9}^{5.1} x² dx = 5 × 10⁻⁴ [x³/3]_{4.9}^{5.1} = 5 × 10⁻⁴ (5.1³ − 4.9³)/3

Simplifying:

E(P) ≈ 0.0025 W
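A one-line check of this value (illustrative sketch, using E(X²) = Var(X) + E(X)² for the uniform density on [4.9, 5.1]):

```python
# E(P) = 1e-6 * R * E(X^2), with X ~ Uniform[4.9, 5.1] (in mA), R = 100 ohms
E_X2 = 0.2**2 / 12 + 5**2  # Var + mean^2 = 0.00333... + 25
print(1e-6 * 100 * E_X2)   # ≈ 0.0025 W
```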

Example 2:
If f(x) = x/6 + k for 0 ≤ x ≤ 3 and f(x) = 0 otherwise, find k. Also find P(1 ≤ X ≤ 2).
To find k, f(x) must satisfy:

∫_{−∞}^{∞} f(x) dx = 1

Substituting:

∫_0^3 (x/6 + k) dx = 1

(1/6)[x²/2]₀³ + k[x]₀³ = 1

3/4 + 3k = 1

Thus:

k = 1/12

To find P(1 ≤ X ≤ 2):

P(1 ≤ X ≤ 2) = ∫_1^2 f(x) dx = ∫_1^2 (x/6 + 1/12) dx = 1/3.

Example 3:
Given F(x) = 0 for x < 0, F(x) = x² for 0 ≤ x ≤ 1, and F(x) = 1 for x > 1. Find f(x), P(0.5 < X < 0.75), and the mean.

f(x) = F′(x) = 2x for 0 ≤ x ≤ 1, and 0 otherwise.

P(0.5 < X < 0.75) = ∫_{0.5}^{0.75} 2x dx = [x²]_{0.5}^{0.75} = 0.3125

µ = ∫_{−∞}^{∞} x f(x) dx = ∫_0^1 2x² dx = 2[x³/3]₀¹ = 2/3

Continuous Uniform Distribution (Rectangular Distribution)

f(x) = k for a ≤ x ≤ b, and 0 otherwise.

To find k:

∫_a^b k dx = 1 =⇒ k = 1/(b − a)

Expected Value and Variance

Expected value:

E(X) = ∫_{−∞}^{∞} x f(x) dx = ∫_a^b x/(b − a) dx = (1/(b − a)) [x²/2]_a^b = (b² − a²)/(2(b − a)) = (a + b)/2

Variance:

σ² = E(X²) − [E(X)]²

First, calculate E(X²):

E(X²) = ∫_a^b x²/(b − a) dx = (1/(b − a)) [x³/3]_a^b = (b³ − a³)/(3(b − a)) = (a² + ab + b²)/3

Now:

σ² = (a² + ab + b²)/3 − ((a + b)/2)²

Simplifying:

σ² = (b − a)²/12

Cumulative Distribution Function (CDF)

The CDF is given by:

F(x) = 0 for x < a;  (x − a)/(b − a) for a ≤ x < b;  1 for x ≥ b.

Note: F(x) is not differentiable at the endpoints.

Example 1:
If X is uniformly distributed over (−a, a), find a so that P(X > 1) = 1/3.
Given:

P(X > 1) = 1 − F(1) = 1/3, so F(1) = 2/3.

From the CDF, with f(x) = 1/(2a):

F(1) = P(X < 1) = ∫_{−a}^{1} (1/(2a)) dx = (1 + a)/(2a) = 2/3

Solving for a, we get a = 3.

Example 2:
Subway trains run on a certain line every half an hour between midnight and 6 AM. What is the probability that a man entering the station at a random time during this period will have to wait at least 20 minutes?
Let X be the man's waiting time in minutes. Then:

f(x) = 1/30 for 0 < x < 30, and 0 otherwise.

Find P(X > 20):

P(X > 20) = ∫_{20}^{30} (1/30) dx = [x/30]_{20}^{30} = 1/3

Normal Distribution (Gaussian Distribution)


• Most widely used.

• Limiting case of binomial distribution.

• Whenever a random experiment is replicated, the random variable that equals the average result over the replicates tends to have a normal distribution.

• De Moivre discovered it but it is credited to Gauss.

A continuous random variable X with probability density function (PDF):

f(x) = (1/(σ√(2π))) e^(−(x−µ)²/(2σ²)),   −∞ < x < ∞

has E(X) = µ and V(X) = σ²; we write X ∼ N(µ, σ²).


Symmetry of f (x) gives:

P (X < µ) = P (X > µ) = 0.5

Example:
P (|X − µ| ≤ 3σ) ≈ 0.9973

Normalized/Standard Normal Random Variable

A normal variable with µ = 0 and σ² = 1:

Z = (X − µ)/σ

PDF of Z:

f(z) = (1/√(2π)) e^(−z²/2),   −∞ < z < ∞

Cumulative distribution function (CDF):

Φ(z) = P(Z ≤ z) = ∫_{−∞}^{z} f(t) dt

Properties of Φ(z):

• Φ(−z) = 1 − Φ(z)

• P(a ≤ X ≤ b) = Φ((b − µ)/σ) − Φ((a − µ)/σ)

• P(Z ≤ a) = P(Z ≥ −a)

Exercise: Derive the mean and variance of the normal distribution.

The curve of a normal distribution is unimodal and bell-shaped with the highest
point over the mean µ. It is symmetrical about a vertical line through µ.

Example 1
A normal distribution has mean 20 and standard deviation 4. Find the probability that a value of X lies between 20 and 24.
Solution:

µ = 20, σ = 4

Z = (X − µ)/σ = (X − 20)/4

P(20 < X < 24) = P(0 < Z < (24 − 20)/4) = P(0 < Z < 1)

From standard normal tables:

P(0 < Z < 1) = 0.3413
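Such table lookups can be reproduced with the standard library's statistics.NormalDist (Python 3.8+); a brief sketch:

```python
from statistics import NormalDist

X = NormalDist(mu=20, sigma=4)
print(X.cdf(24) - X.cdf(20))  # P(20 < X < 24) ≈ 0.3413
```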

Example 2
The mean and standard deviation of a normal variable X are 50 and 4, respectively. Find the corresponding standard normal values when x = 42, 54, 84.
Solution: Using the formula:

Z = (X − µ)/σ = (X − 50)/4

For X = 42: Z = (42 − 50)/4 = −2
For X = 54: Z = (54 − 50)/4 = 1
For X = 84: Z = (84 − 50)/4 = 8.5

Example 3
The ABC company uses a machine to fill boxes with soap powder. Assume that the
net weight of the boxes is normally distributed with mean 15 and standard deviation
0.8 ounces.
(i) What proportion of boxes will have net weight of more than 14 ounces?
(ii) 25% of the boxes will be heavier than a certain net weight w, and 75% of the
boxes will be lighter than this net weight. Find w.

Solution: Given:
µ = 15, σ = 0.8
(i) For P(X > 14): using Z = (X − µ)/σ, we find:

P(X > 14) = P(Z > (14 − 15)/0.8) = P(Z > −1.25)
From standard normal tables:

P (Z > −1.25) = 0.8944

Thus, 89.44% of the boxes will have net weight greater than 14 ounces.
(ii) To find w, we use P(X > w) = 0.25, which corresponds to P(Z > z) = 0.25. From standard normal tables, z ≈ 0.674. Using:

w = µ + zσ

We find:

w = 15 + (0.674)(0.8) ≈ 15.54

Thus, w is approximately 15.54 ounces: 25% of the boxes are heavier than this weight and 75% are lighter.

Example 4:
In a normal distribution exactly 7% of the items are under 35 and 89% of the items are under 63. What are the values of the mean and s.d. of the distribution?

Solution: P(Z < (35 − µ)/σ) = 0.07 =⇒ (35 − µ)/σ = −1.48

P(Z < (63 − µ)/σ) = 0.89 =⇒ (63 − µ)/σ = 1.23

Solving both equations (subtracting gives 28 = 2.71σ), we get σ ≈ 10.33 and µ ≈ 50.3.

Example 5:
The income of a group of 10000 persons was found to be normally distributed with mean Rs 750 and s.d. Rs 50. What was the least income among the richest 250?

Solution: µ = 750, σ = 50
Let x′ be the lowest income among the richest 250 people.

P(X > x′) = 250/10000 = 0.025

=⇒ P(Z > (x′ − 750)/50) = 0.025 =⇒ (x′ − 750)/50 = 1.96 =⇒ x′ = 848

Exponential Distribution
The random variable X that equals the distance between successive events of a Poisson process with mean number of events λ > 0 per unit interval is an exponential random variable with parameter λ.

The pdf of X is f(x) = λe^(−λx), x ≥ 0.

Mean: E(X) = ∫_0^∞ x λe^(−λx) dx = 1/λ

Variance: Var(X) = E[X²] − (E[X])² = 1/λ²

Example: The number of flaws in a copper wire follows a Poisson process; the distance between flaws is an exponential variable. The starting point doesn't matter; only the length of the interval matters.

Example:
In a large corporate computer network, user log-ons to the system can be modeled as a
Poisson process with a mean of 25 log-ons per hour. What is the probability that there
are no log-ons in an interval of 6 minutes?
Solution: Let X denote the time (in hours) from the start of the interval until the
first log-on. Then X has an exponential distribution with λ = 25 log-ons per hour.
6 minutes = 0.1 hours.
P(X > 0.1) = ∫_{0.1}^{∞} 25e^(−25x) dx = [−e^(−25x)]_{0.1}^{∞} = e^(−2.5) ≈ 0.082.

The cumulative distribution function (CDF) is given by:

F(x) = P(X ≤ x) = ∫_0^x λe^(−λt) dt = 1 − e^(−λx).
Determine the interval of time such that the probability of no log-on in the interval is 0.9:

P(X > x) = 0.9

e^(−25x) = 0.9

Taking the natural logarithm of both sides:

−25x = ln(0.9)

x = −ln(0.9)/25

Using ln(0.9) ≈ −0.105:

x ≈ 0.105/25 ≈ 0.0042 hours ≈ 0.25 minutes.

The mean time until the next log-on is:

E[X] = 1/λ = 1/25 = 0.04 hours.

The standard deviation is:

σ = 1/λ = 0.04 hours.

Notes:
• The probability that there are no log-ons in a six-minute interval is e^(−2.5) irrespective of the starting time of the interval.
• A Poisson process assumes that events occur uniformly over the interval of observation (no clustering).
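A brief sketch (illustrative) reproduces both numbers:

```python
from math import exp, log

lam = 25                     # log-ons per hour
print(exp(-lam * 0.1))       # P(no log-on in 6 min) = e^{-2.5} ≈ 0.0821
print(-log(0.9) / lam * 60)  # interval (minutes) with P(no log-on) = 0.9 ≈ 0.25
```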

Gamma Distribution
An exponential random variable describes the length until the first count is obtained in
a Poisson process. A generalization of the exponential distribution is the length until r
events occur in a Poisson process.

Gamma Function

Γ(r) = ∫_0^∞ x^(r−1) e^(−x) dx,   r > 0

The random variable X with probability density function (PDF):

f(x) = λ^r x^(r−1) e^(−λx) / Γ(r),   x > 0

is a gamma random variable with parameters λ > 0, r > 0.

If r is an integer, X has an Erlang distribution; in this case,

Γ(r) = (r − 1)!

Here, λ is the scale parameter and r is the shape parameter.

E(X) = r/λ,   Var(X) = r/λ²

(These follow from the exponential distribution: X is a sum of r independent exponential waiting times.)

Example
The time to prepare a micro-array slide for high-throughput genomics is a Poisson process with a mean of two hours per slide. What is the probability that 10 slides require more than 25 hours to prepare?
Let X denote the time to prepare 10 slides. Then X has a gamma distribution with λ = 1/2 per hour and r = 10.
We need to calculate:

P(X > 25) = 1 − P(X < 25) = 1 − ∫_0^25 (1/2)^10 x^9 e^(−x/2) / Γ(10) dx

Substituting Γ(10) = 9!:

P(X > 25) = 1 − (1/9!) ∫_0^25 (1/2)^10 x^9 e^(−x/2) dx = 0.2014

Expected value:

E(X) = r/λ = 10/(1/2) = 20 hours

Variance:

V(X) = r/λ² = 10/(1/2)² = 40,   σ = √V(X) = 6.32 hours.
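Since r = 10 is an integer (the Erlang case), P(X > t) equals the probability that the underlying Poisson process produces fewer than r events in time t; a sketch using only the standard library:

```python
from math import exp, factorial

lam, r, t = 0.5, 10, 25
# P(X > t) = P(Poisson(lam * t) counts fewer than r events)
p = sum(exp(-lam * t) * (lam * t)**k / factorial(k) for k in range(r))
print(p)  # P(X > 25) ≈ 0.2014
```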

Lognormal Distribution
If X = exp(W), where W follows a normal distribution, then X follows a lognormal distribution (equivalently, ln(X) follows a normal distribution).

Range: (0, ∞)
Let W ∼ N(θ, w²). Then:

F(x) = P(X ≤ x) = P(exp(W) ≤ x) = P(W ≤ ln(x)) = P(Z ≤ (ln(x) − θ)/w) = Φ((ln(x) − θ)/w)

The PDF is obtained as f(x) = F′(x). Thus X = exp(W) is a lognormal random variable with PDF:

f(x) = (1/(x w √(2π))) exp(−(ln(x) − θ)²/(2w²)),   x > 0

Mean:

E(X) = e^(θ + w²/2)

Variance:

V(X) = e^(2θ + w²) (e^(w²) − 1)

Note: θ and w² are the mean and variance of the normal random variable W.

The lifetime of a part that degrades over time is often modeled by a lognormal
random variable.

Example:
The lifetime (in hours) of a semiconductor laser has a lognormal distribution with
θ = 10 and w = 1.5. What is the probability that the lifetime exceeds 10000 hours?

Solution:
We need to find:

P(X > 10000) = 1 − P(exp(W) ≤ 10000)
             = 1 − P(W ≤ ln(10000))
             = 1 − P(Z ≤ (ln(10000) − 10)/1.5)
             = 1 − P(Z ≤ −0.52)
             = 1 − 0.30 = 0.70

Thus, the probability that the lifetime exceeds 10,000 hours is 0.70.

What lifetime is exceeded by 99% of lasers?

Find x such that P(X > x) = 0.99:

P(X > x) = P(exp(W) > x) = P(W > ln(x)) = P(Z > (ln(x) − 10)/1.5)

Here Z follows the standard normal distribution. From the standard normal table:

P(Z > z) = 0.99 =⇒ z = −2.33

Substituting back:

(ln(x) − 10)/1.5 = −2.33 =⇒ ln(x) = 10 − 2.33 × 1.5 = 6.505

x = e^(6.505) ≈ 668.45 hrs

The mean lifetime is:

E(X) = e^(10 + 1.5²/2) = e^(11.125) ≈ 67,846.3 hrs

The variance is given by:

V(X) = e^(2(10) + 1.5²) (e^(1.5²) − 1) = e^(22.25) (e^(2.25) − 1) ≈ 3.907 × 10¹⁰

Standard deviation:

σ = √V(X) ≈ 197,661.5 hrs
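A sketch (illustrative) reproduces these numbers via statistics.NormalDist applied to W = ln(X):

```python
from math import exp, log
from statistics import NormalDist

theta, w = 10.0, 1.5
W = NormalDist(mu=theta, sigma=w)  # W = ln(X)
print(1 - W.cdf(log(10_000)))      # P(X > 10000) ≈ 0.70
print(exp(W.inv_cdf(0.01)))        # lifetime exceeded by 99% of lasers ≈ 670 h
print(exp(theta + w**2 / 2))       # E(X) ≈ 67846 h
```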

Beta Distribution
A continuous distribution that is flexible but bounded over a finite range.
For example, the proportion of solar radiation absorbed by a material or the pro-
portion required to complete a task in a project.

The random variable X with probability density function (p.d.f.) given by:

f(x) = [Γ(α + β)/(Γ(α)Γ(β))] x^(α−1) (1 − x)^(β−1),   x ∈ [0, 1]

is a beta random variable with parameters α > 0, β > 0.

Shape parameters: α, β.

Example:
Consider the completion time of a large commercial development project. The propor-
tion of the maximum allowed time to complete a task is modelled as a beta random
variable with parameters α = 2.5, β = 1. What is the probability that the proportion
of the maximum time exceeds 0.7?
Let X denote the random variable that represents the proportion of max time re-
quired to complete the task. We need to calculate:
P(X > 0.7) = ∫_{0.7}^{1} f(x) dx

Substituting f(x):

P(X > 0.7) = ∫_{0.7}^{1} [Γ(3.5)/(Γ(2.5)Γ(1))] x^(2.5−1) (1 − x)^(1−1) dx = 2.5 ∫_{0.7}^{1} x^(1.5) dx

using Γ(3.5) = (2.5)(1.5)(0.5)√π, Γ(2.5) = (1.5)(0.5)√π, and Γ(1) = 1, so Γ(3.5)/Γ(2.5) = 2.5. Hence:

P(X > 0.7) = 2.5 [x^(2.5)/2.5]_{0.7}^{1} = 1 − (0.7)^(2.5) ≈ 0.59

Thus, the probability that the proportion exceeds 0.7 is approximately 0.59.
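Because β = 1, the CDF reduces to F(x) = x^α, so the survival probability is a one-liner (illustrative sketch):

```python
alpha = 2.5
print(1 - 0.7**alpha)  # P(X > 0.7) ≈ 0.59
```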

Mean and Variance
If X ∼ Beta(α, β), then:

E(X) = α/(α + β),   V(X) = αβ/[(α + β)²(α + β + 1)]

A beta variate X has range [0, 1]. If W is a random variable following a beta distribution on the range [a, b], then

W = a + (b − a)X.

Some Inequalities
Markov's Inequality
If X is a random variable that takes only non-negative values, then for any a > 0:

P(X ≥ a) ≤ E(X)/a

Proof: For a > 0, let I = 1 if X ≥ a and I = 0 otherwise. Since X ≥ 0, we have I ≤ X/a, so:

E(I) ≤ E(X/a) = E(X)/a

But E(I) = P(X ≥ a), so:

P(X ≥ a) ≤ E(X)/a

Chebyshev's Inequality
If X is a random variable with finite mean µ and variance σ², then for any k > 0:

P(|X − µ| ≥ k) ≤ σ²/k²

Proof: Since (X − µ)² ≥ 0, apply Markov's inequality with a = k²:

P((X − µ)² ≥ k²) ≤ E((X − µ)²)/k² = σ²/k²

Thus:

P(|X − µ| ≥ k) = P((X − µ)² ≥ k²) ≤ σ²/k²

These inequalities provide bounds on probabilities when only the mean, or the mean and variance, are known.

Example 1:
Suppose that it is known that the number of items produced in a factory during a week
is a random variable with mean 50.

(a) What can be said about the probability that this week’s production will exceed
75?

(b) If the variance of a week’s production is known to equal 25, then what can be said
about the probability that this week’s production will be between 40 and 60?

Solution:

(a) By Markov's inequality:

P(X > 75) ≤ E(X)/75 = 50/75 = 2/3

(b) By Chebyshev's inequality:

P(|X − 50| > 10) ≤ σ²/10² = 25/100 = 1/4

P(40 ≤ X ≤ 60) = 1 − P(|X − 50| > 10) ≥ 1 − 1/4 = 3/4

Example 2:
X is uniformly distributed over the interval [0, 10], with

E(X) = 5 and V(X) = 25/3.

By Chebyshev's inequality,

P(|X − 5| > 4) ≤ V(X)/4² = (25/3)/16 ≈ 0.52

But the exact probability is:

P(|X − 5| > 4) = 1 − P(1 < X < 9) = 1 − 8/10 = 0.2

The bound can be quite loose, but the inequality has many theoretical applications.
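A simulation sketch (illustrative; the seed is arbitrary) makes the gap between the bound and the exact probability concrete:

```python
import random
random.seed(2)  # arbitrary seed, for reproducibility

samples = [random.uniform(0, 10) for _ in range(100_000)]
actual = sum(abs(x - 5) > 4 for x in samples) / len(samples)
bound = (25 / 3) / 4**2
print(actual, bound)  # ≈ 0.2 versus ≈ 0.52: the bound holds but is loose
```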
