Lecture 3

The document covers topics in probability including random variables, probability mass functions, cumulative distribution functions, expectation, variance, and several common probability distributions. Examples are provided to illustrate key concepts. Questions and exercises are also included throughout the document.

Nguyễn Ngọc Tứ

Lecture 3

Nguyễn Ngọc Tứ Lecture 3 2023-2024 1 / 42


Lecture outline

Random variables
Probability mass function (PMF)
Cumulative distribution function (CDF)
Expectation and Variance
Binomial distribution
Poisson distribution
Geometric distribution
Negative binomial distribution
Hypergeometric distribution



Random variables

Definition – Random variable


- A random variable is a function $X : \Omega \to \mathbb{R}$ defined on the sample space $\Omega$.
- A discrete random variable is a real-valued function of the outcome
of the experiment that can take a finite or countably infinite number
of values.
- We will denote random variables by capital letters X , Y , . . . and
their values by lowercase x, y , . . .

Example

Tossing a coin has sample space $\Omega = \{H, T\}$. We can then define a random variable $X$ by

\[ X(H) = 1, \quad X(T) = 0. \]



Probability mass function (PMF)
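Consider an experiment whose numerical outcome $X$ takes integer values. The
probability distribution (probability mass function) of $X$ is the function
$p_X : \mathbb{Z} \to \mathbb{R}$ defined by

\[ p_X(x) = P(X = x) = P(\{\omega \in \Omega : X(\omega) = x\}) \quad \text{for all } x \in \mathbb{Z}. \]

It satisfies

\[ p_X(x) \ge 0, \qquad \sum_x p_X(x) = 1. \]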

Example

Consider rolling a fair die. The possible outcomes are the six faces of the die,
which we convert to a numerical outcome $X \in \{1, 2, \dots, 6\}$ in the
obvious way. Then

\[ p(x) = \begin{cases} \dfrac{1}{6}, & x \in \{1, 2, 3, 4, 5, 6\} \\ 0, & \text{otherwise.} \end{cases} \]
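
The die PMF can be checked against the two defining properties with a few lines of Python. This is only an illustrative sketch; the names `die_pmf` and `pmf` are not from the lecture.

```python
from fractions import Fraction

# PMF of a fair six-sided die, stored as a mapping value -> probability.
die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def pmf(x):
    """Return p(x); any value outside {1, ..., 6} has probability 0."""
    return die_pmf.get(x, Fraction(0))

# The two defining properties of a PMF: non-negativity and total mass 1.
assert all(p >= 0 for p in die_pmf.values())
assert sum(die_pmf.values()) == 1

print(pmf(3))  # 1/6
print(pmf(7))  # 0
```

Exact fractions are used here so that the sum is exactly 1 rather than 1 up to floating-point error.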



Probability mass function (PMF)
13. A mail-order computer business has six telephone lines. Let X denote
the number of lines in use at a specified time. Suppose the pmf of X is as
given in the accompanying table.
x 0 1 2 3 4 5 6
p(x) 0.10 0.15 0.20 0.20 0.25 0.04 0.06
Calculate the probability of each of the following events.
a. at most three lines are in use
b. fewer than three lines are in use
c. at least three lines are in use
d. between three and six lines, inclusive, are in use
e. between three and five lines, inclusive, are not in use
f. at least three lines are not in use
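
One way to approach questions of this kind is to describe each event as a condition on the number of lines in use and add up the PMF over the values that satisfy it. The sketch below does this for three of the events; the helper `prob` and the constant `TOTAL_LINES` are illustrative names, not part of the exercise.

```python
# PMF of X = number of telephone lines in use (from the table above).
pmf = {0: 0.10, 1: 0.15, 2: 0.20, 3: 0.20, 4: 0.25, 5: 0.04, 6: 0.06}
TOTAL_LINES = 6

def prob(event):
    """P(event), where event is a predicate on the number of lines in use."""
    # Rounding only removes floating-point noise from the sum.
    return round(sum(p for x, p in pmf.items() if event(x)), 10)

at_most_three = prob(lambda x: x <= 3)
fewer_than_three = prob(lambda x: x < 3)
# "k lines not in use" is the same event as "TOTAL_LINES - k lines in use".
at_least_three_idle = prob(lambda x: TOTAL_LINES - x >= 3)

print(at_most_three, fewer_than_three, at_least_three_idle)
```

The remaining parts follow the same pattern with a different predicate.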



Cumulative distribution function (CDF)

Definition

The function $F : \mathbb{R} \to \mathbb{R}$ defined by

\[ F(x) = P(X \le x) = \sum_{t \le x} p(t) \]

is called the cumulative distribution function of $X$.

Example

A store carries flash drives with either 1 GB, 2 GB, 4 GB, 8 GB, or 16 GB
of memory. The accompanying table gives the distribution of X = the amount of
memory (in GB) in a purchased drive:

x 1 2 4 8 16
p(x) 0.05 0.10 0.35 0.40 0.10



Distribution function (CDF)
At the possible values of X, F(x) is computed as follows:

\[
\begin{aligned}
F(1) &= P(X \le 1) = P(X = 1) = p(1) = 0.05 \\
F(2) &= P(X \le 2) = P(X \in \{1, 2\}) = p(1) + p(2) = 0.15 \\
F(4) &= P(X \le 4) = P(X \in \{1, 2, 4\}) = p(1) + p(2) + p(4) = 0.5 \\
F(8) &= P(X \le 8) = P(X \in \{1, 2, 4, 8\}) = p(1) + p(2) + p(4) + p(8) = 0.9 \\
F(16) &= P(X \le 16) = 1.
\end{aligned}
\]

The cumulative distribution function is therefore

\[
F(x) = \begin{cases}
0, & x < 1 \\
0.05, & 1 \le x < 2 \\
0.15, & 2 \le x < 4 \\
0.5, & 4 \le x < 8 \\
0.9, & 8 \le x < 16 \\
1, & 16 \le x
\end{cases}
\]
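
The same step function can be built from the PMF table by taking running totals. A minimal sketch, assuming the flash-drive table above (the names `values`, `probs`, and `cdf` are illustrative):

```python
from itertools import accumulate

values = [1, 2, 4, 8, 16]                 # possible memory sizes (GB)
probs = [0.05, 0.10, 0.35, 0.40, 0.10]    # p(x) from the table

# Running totals of p(x) give F at each possible value.
# Rounding only removes floating-point noise.
cum = [round(c, 10) for c in accumulate(probs)]

def cdf(x):
    """F(x) = P(X <= x): a step function that jumps at each possible value."""
    result = 0.0
    for v, c in zip(values, cum):
        if v <= x:
            result = c
    return result

for x in [0.5, 1, 3, 4, 8, 20]:
    print(x, cdf(x))
```

Between two possible values, for example at x = 3, the function returns the value reached at the last jump, F(2) = 0.15.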
CDF

Proposition

For any two numbers a and b with a ≤ b,

\[ P(a \le X \le b) = F(b) - F(a^-) \]

where $a^-$ represents the largest possible value of X that is strictly less
than a. In particular, if the only possible values of X are integers and if a
and b are integers, then

\[ P(a \le X \le b) = F(b) - F(a - 1). \]

Taking a = b gives P(X = a) = F(a) − F(a − 1).



CDF

Example

Let X = the number of days of sick leave taken by a randomly selected
employee of a large company during a particular year. If the maximum
number of allowable sick days per year is 14, possible values of X are
0, 1, . . . , 14. With F(0) = 0.58, F(1) = 0.72, F(2) = 0.76, F(3) = 0.81,
F(4) = 0.88, and F(5) = 0.94, we have

P(2 ≤ X ≤ 5) = F(5) − F(1) = 0.94 − 0.72 = 0.22

and

P(X = 3) = F(3) − F(2) = 0.81 − 0.76 = 0.05.
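
The integer case of the proposition translates directly into code. The sketch below uses only the CDF values stated in the example; the function name `interval_prob` is an illustrative choice.

```python
# CDF values given in the sick-leave example (only F(0)..F(5) are stated).
F = {0: 0.58, 1: 0.72, 2: 0.76, 3: 0.81, 4: 0.88, 5: 0.94}

def interval_prob(a, b):
    """P(a <= X <= b) = F(b) - F(a - 1) for an integer-valued X >= 0."""
    lower = 0.0 if a == 0 else F[a - 1]   # F is 0 below the smallest value
    return F[b] - lower

print(round(interval_prob(2, 5), 2))  # 0.22
print(round(interval_prob(3, 3), 2))  # 0.05
```

Note that the lower endpoint uses F(a − 1), not F(a), so that the value a itself is included in the interval.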



CDF
23. A consumer organization that evaluates new automobiles customarily
reports the number of major defects in each car examined. Let X denote
the number of major defects in a randomly selected car of a certain type.
The cdf of X is as follows:

\[
F(x) = \begin{cases}
0, & x < 0 \\
0.06, & 0 \le x < 1 \\
0.18, & 1 \le x < 2 \\
0.39, & 2 \le x < 3 \\
0.65, & 3 \le x < 4 \\
0.92, & 4 \le x < 5 \\
0.97, & 5 \le x < 6 \\
1, & 6 \le x
\end{cases}
\]

Calculate the following probabilities directly from the cdf:
a. P(X = 2) b. P(X > 3) c. P(2 ≤ X ≤ 5) d. P(2 < X < 5)


CDF
24. An insurance company offers its policyholders a number of different premium
payment options. For a randomly selected policyholder, let X = the number of
months between successive payments. The cdf of X is as follows:

\[
F(x) = \begin{cases}
0, & x < 1 \\
0.25, & 1 \le x < 3 \\
0.40, & 3 \le x < 4 \\
0.55, & 4 \le x < 6 \\
0.80, & 6 \le x < 12 \\
1, & 12 \le x
\end{cases}
\]

a. What is the pmf of X?
b. Using just the cdf, compute P(3 ≤ X ≤ 6) and P(4 ≤ X).



Expectation

Definition
Let X be a discrete random variable with PMF $p_X(x)$. The expectation
(expected value) of X is defined by

\[ E(X) = \sum_x x \, p_X(x). \]

Properties of expectations

If α, β are constants, then


1. E (α) = α
2. E (αX ) = αE (X )
3. E (αX + β) = αE (X ) + β



Variance
Second moment: $E(X^2) = \sum_x x^2 \, p_X(x)$
Variance:

\[
\begin{aligned}
\mathrm{Var}(X) &= E[(X - E(X))^2] \\
&= \sum_x (x - E(X))^2 \, p_X(x) \\
&= E(X^2) - E^2(X)
\end{aligned}
\]

Standard deviation: $\sigma(X) = \sqrt{\mathrm{Var}(X)}$

Properties

1. Var (X ) ≥ 0
2. $\mathrm{Var}(\alpha X + \beta) = \alpha^2 \, \mathrm{Var}(X)$
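
Both the shortcut formula for the variance and property 2 follow from the linearity of expectation. A short derivation, writing $\mu = E(X)$ (a symbol introduced here only for brevity):

```latex
\begin{aligned}
\mathrm{Var}(X) &= E[(X - \mu)^2] = E(X^2) - 2\mu E(X) + \mu^2 = E(X^2) - \mu^2, \\
\mathrm{Var}(\alpha X + \beta) &= E\big[(\alpha X + \beta - (\alpha\mu + \beta))^2\big]
  = \alpha^2 E[(X - \mu)^2] = \alpha^2 \, \mathrm{Var}(X).
\end{aligned}
```

The shift β cancels because it moves X and its mean by the same amount, which is why it does not appear in the result.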



Expectation and Variance - Example

A store carries flash drives with either 1 GB, 2 GB, 4 GB, 8 GB, or 16 GB
of memory. The accompanying table gives the distribution of X = the amount of
memory (in GB) in a purchased drive:
x 1 2 4 8 16
p(x) 0.05 0.10 0.35 0.40 0.10

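Then, we have

\[
\begin{aligned}
\mu_X = E(X) &= 1 \cdot 0.05 + 2 \cdot 0.1 + 4 \cdot 0.35 + 8 \cdot 0.4 + 16 \cdot 0.1 = 6.45 \\
E(X^2) &= 1^2 \cdot 0.05 + 2^2 \cdot 0.1 + 4^2 \cdot 0.35 + 8^2 \cdot 0.4 + 16^2 \cdot 0.1 = 57.25 \\
\sigma_X^2 = V(X) &= E(X^2) - (E(X))^2 = 57.25 - 6.45^2 = 15.6475 \\
\sigma_X &= \sqrt{15.6475} \approx 3.95569
\end{aligned}
\]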
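
These numbers can be reproduced with a short script. This is a sketch of the computation only; the variable names are illustrative, and the final check compares the shortcut formula with the definition of the variance.

```python
import math

values = [1, 2, 4, 8, 16]                 # possible memory sizes (GB)
probs = [0.05, 0.10, 0.35, 0.40, 0.10]    # p(x) from the table

mean = sum(x * p for x, p in zip(values, probs))
second_moment = sum(x**2 * p for x, p in zip(values, probs))
variance = second_moment - mean**2        # shortcut: E(X^2) - (E X)^2
std_dev = math.sqrt(variance)

# The definition E[(X - E X)^2] gives the same value as the shortcut.
var_by_definition = sum((x - mean)**2 * p for x, p in zip(values, probs))
assert abs(variance - var_by_definition) < 1e-9

print(round(mean, 4), round(second_moment, 4),
      round(variance, 4), round(std_dev, 5))
# 6.45 57.25 15.6475 3.95569
```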



Expectation and Variance - Exercises
32. An appliance dealer sells three different models of upright freezers having
14.5, 17.9, and 19.1 cubic feet of storage space, respectively. Let X = the
amount of storage space purchased by the next customer to buy a freezer.
Suppose that X has pmf
x 13.5 16.9 19.1
p(x) 0.2 0.5 0.3
a. Compute $E(X)$, $E(X^2)$, and $V(X)$.
b. If the price of a freezer having capacity X cubic feet is 25X − 8.5, what
is the expected price paid by the next customer to buy a freezer?
c. What is the variance of the price 25X − 8.5 paid by the next customer?
d. Suppose that although the rated capacity of a freezer is X, the actual
capacity is $h(X) = X - 0.01X^2$. What is the expected actual capacity of
the freezer purchased by the next customer?


The Binomial distribution
A biased coin is tossed n times. Let P(H) = p.
X: number of heads in n independent coin tosses.
For n = 4, what is P(X = 2)?

\[
\begin{aligned}
p_X(2) &= P(HHTT) + P(HTHT) + P(HTTH) + P(THHT) + P(THTH) + P(TTHH) \\
&= 6p^2(1 - p)^2 = \binom{4}{2} p^2 (1 - p)^2
\end{aligned}
\]

In general, X is a binomial random variable with parameters n and p.
Then the PMF of X is

\[ p_X(k) = P(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}, \quad k = 0, 1, \dots, n \]

Notation: $X \sim B(n, p)$, with $E(X) = np$ and $\mathrm{Var}(X) = np(1 - p)$.
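
The binomial PMF is easy to evaluate with the standard library. The sketch below checks the n = 4 count against the general formula and confirms the stated mean and variance numerically; the bias p = 0.3 is an arbitrary value chosen for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 4, 0.3   # p = 0.3 is an arbitrary illustrative bias
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# P(X = 2) agrees with the hand count 6 p^2 (1 - p)^2.
assert abs(pmf[2] - 6 * p**2 * (1 - p)**2) < 1e-12

# The PMF sums to 1; mean and variance match np and np(1 - p).
mean = sum(k * q for k, q in enumerate(pmf))
var = sum(k**2 * q for k, q in enumerate(pmf)) - mean**2
print(round(sum(pmf), 10), round(mean, 10), round(var, 10))
# 1.0 1.2 0.84
```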


The Binomial distribution - Exercises
50. A particular telephone number is used to receive both voice calls and
fax messages. Suppose that 35% of the incoming calls involve fax messages,
and consider a sample of 30 incoming calls. What is the probability that
a. Exactly 6 of the calls involve a fax message?
b. At least 6 of the calls involve a fax message?
c. More than 6 of the calls involve a fax message?
d. What are the expected value and the standard deviation of the number among
the 30 calls that involve a fax message?
e. What is the probability that the number of calls among the 30 that involve
a fax transmission exceeds the expected number by more than 2 standard
deviations?
Solution.
X = the number of calls that involve a fax message. Then
X ∼ B(n, p) where n = . . . , p = . . .



The Binomial distribution - Exercises
52. Suppose that 40% of all students who have to buy a text for a particular
course want a new copy (the successes!), whereas the other 60% want a used copy.
Consider randomly selecting 30 purchasers.
a. What are the mean value and standard deviation of the number who want a new
copy of the book?
b. What is the probability that the number who want new copies is more than two
standard deviations away from the mean value?
c. The bookstore has 18 new copies and 18 used copies in stock. If 30 people come
in one by one to purchase this text, what is the probability that all 30 will get the
type of book they want from current stock?
[Hint: Let X = the number who want a new copy. For what values of X will all 30
get what they want?]
d. Suppose that new copies cost $100 and used copies cost $70. Assume the
bookstore currently has 50 new copies and 50 used copies. What is the expected
value of total revenue from the sale of the next 30 copies purchased?
[Hint: Let h(X) = the revenue when X of the 30 purchasers want new copies. Express
this as a linear function.]


The Binomial distribution - Homework
54. A particular type of tennis racket comes in a midsize version and an oversize
version. Sixty percent of all customers at a certain store want the oversize version.
a. Among ten randomly selected customers who want this type of racket, what is
the probability that at least five want the oversize version?
b. Among ten randomly selected customers, what is the probability that the number
who want the oversize version is within 1 standard deviation of the mean value?
c. The store currently has seven rackets of each version. What is the probability
that all of the next ten customers who want this racket can get the version they
want from current stock?



The Poisson distribution
Let λ be a positive real number. The Poisson distribution with
parameter λ is defined by

\[ p_X(k) = e^{-\lambda} \frac{\lambda^k}{k!}, \quad k = 0, 1, \dots \]

Notation: $X \sim P(\lambda)$, with $E(X) = \lambda = \mathrm{Var}(X)$.

85. Suppose small aircraft arrive at a certain airport according to a Poisson
process with rate α = 7 per hour, so that the number of arrivals during a
time period of t hours is a Poisson rv with parameter µ = 7t.
a. What is the probability that exactly 6 small aircraft arrive during a 1-hour
period? At least 7?
b. What are the expected value and standard deviation of the number of
small aircraft that arrive during a 90-min period?
c. What is the probability that at least 20 small aircraft arrive during a
2.5-hour period? That at most 10 arrive during this period?


The Poisson distribution
The Poisson PMF with parameter λ is a good approximation for a
binomial PMF with parameters n and p, provided λ = np, n is very
large, and p is very small, i.e.,

\[ e^{-\lambda} \frac{\lambda^k}{k!} \approx \frac{n!}{k!\,(n - k)!} \, p^k (1 - p)^{n-k}, \quad k = 0, 1, \dots, n \]

Example: the number of typos in a book with a total of n words, where the
probability p that any one word is misspelled is very small.
- Take n = 100 and p = 0.01, and consider the probability of k = 5 successes
in n = 100 trials.
1. Using the binomial PMF:

\[ \frac{100!}{95!\,5!} \, 0.01^5 (1 - 0.01)^{95} \approx 0.00290. \]

2. Using the Poisson PMF with λ = np = 100 · 0.01 = 1:

\[ e^{-1} \frac{1}{5!} \approx 0.00306. \]
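
The comparison can be reproduced numerically. In this sketch the two helper functions are illustrative names; the printed values are shown to six decimals, which is consistent with the figures quoted above.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

n, p, k = 100, 0.01, 5
lam = n * p   # matching parameter for the approximation

exact = binomial_pmf(k, n, p)
approx = poisson_pmf(k, lam)
print(f"{exact:.6f}")    # ≈ 0.002898
print(f"{approx:.6f}")   # ≈ 0.003066
```

The two values agree to about two significant figures here; the agreement improves as n grows and p shrinks with np held fixed.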
The Poisson distribution
101. Of the people passing through an airport metal detector, 0.5% activate
it; let X = the number among a randomly selected group of 800 who activate the
detector.
a. What is the (approximate) pmf of X ?
b. Compute P(X = 4).
c. Compute P(X ≥ 4).



The Poisson distribution - Exercises
88. In proof testing of circuit boards, the probability that any particular
diode will fail is 0.015. Suppose a circuit board contains 600 diodes.
a. How many diodes would you expect to fail, and what is the standard
deviation of the number that are expected to fail?
b. What is the (approximate) probability that at least four diodes will fail
on a randomly selected board?
c. If five boards are shipped to a particular customer, how likely is it that
at least four of them will work properly? (A board works properly only if all
its diodes work.)


The geometric distribution
Tossing a coin until it comes up H. How long must we wait for the
game to end?
X = number of coin tosses until first head.
Assume independent tosses, 0 < P(H) = p < 1.
The PMF of X is

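\[ p_X(k) = P(X = k) = P(\underbrace{TT \dots T}_{k-1}H) = (1 - p)^{k-1} p, \quad k = 1, 2, \dots \]

and

\[ \sum_{k=1}^{\infty} p_X(k) = \sum_{k=1}^{\infty} (1 - p)^{k-1} p = p \sum_{k=1}^{\infty} (1 - p)^{k-1} = p \cdot \frac{1}{1 - (1 - p)} = 1. \]

Notation: $X \sim \mathrm{Geo}(p)$, with $E(X) = \dfrac{1}{p}$ and $\mathrm{Var}(X) = \dfrac{1 - p}{p^2}$.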
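
The normalization, mean, and variance can be checked numerically by truncating the infinite sums. This is only a sketch: p = 0.25 and the cutoff `K` are arbitrary illustrative choices, and the tail beyond `K` is negligible for this p.

```python
def geometric_pmf(k, p):
    """P(X = k): first head on toss k, for k = 1, 2, ..."""
    return (1 - p)**(k - 1) * p

p = 0.25    # arbitrary illustrative value
K = 2000    # truncation point for the infinite sums

total = sum(geometric_pmf(k, p) for k in range(1, K + 1))
mean = sum(k * geometric_pmf(k, p) for k in range(1, K + 1))
var = sum(k**2 * geometric_pmf(k, p) for k in range(1, K + 1)) - mean**2

# Compare with 1, 1/p = 4 and (1 - p)/p^2 = 12.
print(round(total, 8), round(mean, 6), round(var, 6))
# 1.0 4.0 12.0
```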
The geometric distribution - Exercise
1. When Anh plays chess against his favorite computer program, he wins
with probability 0.60. Assume independence. Find the probability that
a. Anh’s first win happens when he plays his third game.
b. Anh’s fifth win happens when he plays his eighth game.



The negative binomial distribution

Tossing a coin until it comes up heads k times. How long must we wait for
the game to end?
X = number of coin tosses until the kth head is observed.
Assume independent tosses, 0 < P(H) = p < 1.
The trial is repeated until we attain k successes (heads).
The event X = n means that, in n trials:
* there were k − 1 successes in the first n − 1 trials, and
* the nth trial was a success.
By independence,

\[ P(X = n) = C_{n-1}^{k-1} \, p^{k-1} (1 - p)^{n-k} \cdot p = C_{n-1}^{k-1} \, p^k (1 - p)^{n-k}, \quad n \ge k. \]

Notation: X ∼ NB(n; k, p)
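
A small sketch of this PMF, counting the number of trials n needed for the kth success. The function name is illustrative, and the fair-coin value p = 0.5 is chosen only for the demonstration. With k = 1 the formula reduces to the geometric PMF.

```python
from math import comb

def neg_binomial_trials_pmf(n, k, p):
    """P(X = n): the kth success occurs exactly on trial n (n >= k)."""
    if n < k:
        return 0.0
    # k - 1 successes among the first n - 1 trials, then a success on trial n.
    return comb(n - 1, k - 1) * p**k * (1 - p)**(n - k)

p = 0.5   # fair coin, for illustration only

# With k = 1 the formula reduces to the geometric PMF (1 - p)^(n-1) p.
assert all(
    abs(neg_binomial_trials_pmf(n, 1, p) - (1 - p)**(n - 1) * p) < 1e-15
    for n in range(1, 20)
)

# Probability that the 2nd head appears exactly on the 3rd toss.
print(neg_binomial_trials_pmf(3, 2, p))  # 0.25
```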



The negative binomial distribution - Example
2. A medical researcher is recruiting 10 subjects for a study on an
experimental drug for COVID-19. Each person that she interviews has a 65%
chance of being eligible to participate in the study.
a. What is the probability that she will have to interview 30 people?
b. What is the probability that she will have to interview more than 30
people?
Solution.
X = the number of people who must be interviewed in order to select 10 subjects.



The negative binomial distribution - Example
Y = the number of failures that occur before the kth success is observed.
Then Y = X − k and, writing n = y + k,

\[ P(Y = y) = C_{n-1}^{k-1} \, p^k (1 - p)^{n-k} = C_{y+k-1}^{k-1} \, p^k (1 - p)^y, \quad y = 0, 1, \dots \]

Notation: $Y \sim NB(y; k, p)$, with $E(Y) = \dfrac{k(1 - p)}{p}$ and $\mathrm{Var}(Y) = \dfrac{k(1 - p)}{p^2}$.
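
The stated mean and variance of Y can be checked numerically by truncating the sum over y. In this sketch, k = 3, p = 0.4 and the cutoff `Y_MAX` are arbitrary illustrative choices.

```python
from math import comb

def neg_binomial_failures_pmf(y, k, p):
    """P(Y = y): exactly y failures occur before the kth success."""
    return comb(y + k - 1, k - 1) * p**k * (1 - p)**y

k, p = 3, 0.4   # arbitrary illustrative parameters
Y_MAX = 500     # truncation point; the tail beyond it is negligible here

pmf = [neg_binomial_failures_pmf(y, k, p) for y in range(Y_MAX + 1)]
mean = sum(y * q for y, q in enumerate(pmf))
var = sum(y**2 * q for y, q in enumerate(pmf)) - mean**2

print(round(mean, 6), round(k * (1 - p) / p, 6))      # 4.5 4.5
print(round(var, 6), round(k * (1 - p) / p**2, 6))    # 11.25 11.25
```

Since X = Y + k, the expected number of trials is E(X) = E(Y) + k = k/p, while the variance is unchanged by the shift.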
75. Suppose that P(male birth) = 0.49. A couple wishes to have exactly two female
children in their family. They will have children until this condition is fulfilled.
a. What is the probability that the family has y male children?
b. What is the probability that the family has four children?
c. What is the probability that the family has at most four children?
d. How many male children would you expect this family to have? How many
children would you expect this family to have?
Solution.
Y = the number of male children born before the couple has two female children.


The hypergeometric distribution

The assumptions leading to the hypergeometric distribution are as follows:


1. The set to be sampled consists of N elements (a finite population).
2. Each individual can be characterized as a success (S) or a failure (F),
and there are M successes in the population.
3. A sample of n individuals is selected without replacement in such a
way that each subset of size n is equally likely to be chosen.
The random variable of interest is X = the number of S’s in the
sample. The probability distribution of X depends on the parameters
n, M, and N, so we wish to obtain

P(X = x) = h(x; n, M, N).



The hypergeometric distribution

Proposition

If X is the number of S’s in a completely random sample of size n
drawn from a population consisting of M S’s and (N − M) F’s, then
the probability distribution of X, called the hypergeometric distribution,
is given by

\[ P(X = x) = h(x; n, M, N) = \frac{C_M^x \, C_{N-M}^{n-x}}{C_N^n} \]

for x an integer satisfying max(0, n − N + M) ≤ x ≤ min(n, M). Moreover,

\[ E(X) = n \cdot \frac{M}{N}, \qquad V(X) = \frac{N - n}{N - 1} \cdot n \cdot \frac{M}{N} \cdot \left(1 - \frac{M}{N}\right) \]


The hypergeometric distribution - Example
Five individuals from an animal population thought to be near extinction in a certain
region have been caught, tagged, and released to mix into the population. After
they have had an opportunity to mix, a random sample of 10 of these animals is
selected. Let X = the number of tagged animals in the second sample. If there are
actually 25 animals of this type in the region, compute
a) P(X = 2) b) P(X ≤ 2) c) EX, V(X).
Solution. We have n = 10, M = 5, N = 25 and

\[ P(X = x) = h(x; 10, 5, 25) = \frac{C_5^x \, C_{20}^{10-x}}{C_{25}^{10}}, \quad x = 0, 1, 2, 3, 4, 5 \]

a) $P(X = 2) = h(2; 10, 5, 25) = \dfrac{C_5^2 \, C_{20}^8}{C_{25}^{10}} = 0.385$
b) $P(X \le 2) = P(X \in \{0, 1, 2\}) = \sum_{x=0}^{2} h(x; 10, 5, 25) = 0.699$
c) $EX = 10 \cdot \dfrac{5}{25} = 2$, $V(X) = \dfrac{15}{24} \cdot 10 \cdot \dfrac{5}{25} \cdot \left(1 - \dfrac{5}{25}\right) = 1$
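
The numbers in this solution can be reproduced with exact binomial coefficients from the standard library. A minimal sketch; the function name `hypergeom_pmf` is an illustrative choice.

```python
from math import comb

def hypergeom_pmf(x, n, M, N):
    """h(x; n, M, N): x successes in a sample of n drawn without replacement
    from a population of N that contains M successes."""
    if x < max(0, n - N + M) or x > min(n, M):
        return 0.0
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

n, M, N = 10, 5, 25   # sample size, tagged animals, population size

p2 = hypergeom_pmf(2, n, M, N)
p_le2 = sum(hypergeom_pmf(x, n, M, N) for x in range(3))
mean = n * M / N
var = (N - n) / (N - 1) * n * (M / N) * (1 - M / N)

print(round(p2, 3), round(p_le2, 3), mean, round(var, 6))
# 0.385 0.699 2.0 1.0
```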



The hypergeometric distribution - Exercise
68. An electronics store has received a shipment of 22 table radios that have
connections for an iPod or iPhone. Twelve of these have two slots (so they can
accommodate both devices), and the other ten have a single slot. Suppose that
six of the 22 radios are randomly selected to be stored under a shelf where the
radios are displayed, and the remaining ones are placed in a storeroom. Let X =
the number among the radios stored under the display shelf that have two slots.
a. What kind of a distribution does X have (name and values of all parameters)?
b. Compute P(X = 3), P(X ≤ 3), and P(X ≥ 3).
c. Calculate the mean value and standard deviation of X .



The hypergeometric distribution - Homework
69. Each of 17 refrigerators of a certain type has been returned to a distributor
because of an audible, high-pitched, oscillating noise when the refrigerators are
running. Suppose that 8 of these refrigerators have a defective compressor and the
other 6 have less serious problems. If the refrigerators are examined in random order,
let X be the number among the first 7 examined that have a defective compressor.
Compute the following:
a. P(X = 5), P(X ≤ 5).
b. The probability that X exceeds its mean value by more than 1 standard deviation.



The hypergeometric distribution - Homework
71. A geologist has collected 10 specimens of basaltic rock and 10 specimens of
granite. The geologist instructs a laboratory assistant to randomly select 15 of the
specimens for analysis.
a. What is the pmf of the number of granite specimens selected for analysis?
b. What is the probability that all specimens of one of the two types of rock are
selected for analysis?
c. What is the probability that the number of granite specimens selected for analysis
is within 1 standard deviation of its mean value?
