Stats 2 GA
1. A customer will purchase a shirt with probability 0.5. The customer will purchase a
pant with probability 0.4 and will purchase both a shirt and a pant with probability 0.2.
What is the probability that the customer will purchase neither a shirt nor a pant?
Solution:
Let A be the event that the customer will purchase a shirt and B be the event that the
customer will purchase a pant.
Given that, P (A) = 0.5 and P (B) = 0.4.
Also given that the customer will purchase both a shirt and a pant with probability 0.2.
i.e. P (A ∩ B) = 0.2.
We have to find the probability that the customer will purchase neither a shirt nor a
pant, i.e. $P(A^C \cap B^C)$.
We know that $P(A^C \cap B^C) = P((A \cup B)^C) = 1 - P(A \cup B)$.
And $P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.5 + 0.4 - 0.2 = 0.7$
$\Rightarrow P(A^C \cap B^C) = 1 - P(A \cup B) = 1 - 0.7 = 0.3$
2. Suppose that we roll a pair of fair dice, so each of the 36 possible outcomes is equally
likely. Let A denote the event that the first die shows 5, B be the event such that the
sum of the outcomes of rolling the pair of dice is 10, and C be the event such that the
sum of the outcomes of rolling the pair of dice is 7. Then
Solution:
We are rolling a pair of fair dice and all 36 outcomes are equally likely, which means
each outcome occurs with the same probability, i.e. 1/36.
A is the event that the first die shows 5.
⇒ A = {(5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6)}
B is the event that the sum of the outcomes of rolling the pair of dice is 10.
⇒ B = {(4, 6), (5, 5), (6, 4)}
C is the event that the sum of the outcomes of rolling the pair of dice is 7.
⇒ C = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}
Also, A ∩ B = {(5, 5)} and A ∩ C = {(5, 2)}
Since each outcome is equally likely,
$P(A) = \frac{6}{36}$, $P(B) = \frac{3}{36}$, $P(C) = \frac{6}{36}$, $P(A \cap B) = \frac{1}{36}$ and $P(A \cap C) = \frac{1}{36}$.
Since $P(A \cap B) = \frac{1}{36} \neq P(A)P(B)$, events A and B are not independent.
Also, $P(A \cap C) = \frac{1}{36} = \frac{1}{6} \times \frac{1}{6} = P(A)P(C)$, so events A and C are independent.
Hence, option (b) and (c) are correct.
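The independence checks above can be confirmed by enumerating the 36 outcomes. This is a minimal sketch; the helper `prob` and the predicate names are illustrative choices, not part of the assignment.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (first die, second die).
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate on an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

A = lambda o: o[0] == 5          # first die shows 5
B = lambda o: o[0] + o[1] == 10  # sum is 10
C = lambda o: o[0] + o[1] == 7   # sum is 7

p_ab = prob(lambda o: A(o) and B(o))
p_ac = prob(lambda o: A(o) and C(o))

# Independence holds exactly when P(A ∩ E) = P(A) P(E).
a_b_independent = p_ab == prob(A) * prob(B)
a_c_independent = p_ac == prob(A) * prob(C)
print(a_b_independent, a_c_independent)
```

Exact fractions are used so that the equality test is not affected by floating-point rounding.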
3. Let A and B be two independent events of a random experiment. Then, which of the
following is/are always true?
Solution:
Given that A and B are two independent events ⇒ P (A ∩ B) = P (A)P (B).
P (A ∪ B) = P (A) + P (B) − P (A ∩ B)
= P (A) + P (B) − P (A)P (B)
= P (A)[1 − P (B)] + P (B)
= P (A)P (B C ) + P (B)
4. The probability that a student registered for IITM online degree program will pass the
qualifier exam is 0.6 independent of all other students. Find the probability that out of
10,000 registered students, 7,000 students will pass the qualifier exam.
a) $(0.6)^{3000} (0.4)^{7000}$
b) $(0.6)^{7000} (0.4)^{3000}$
c) ${}^{10000}C_{7000}\,(0.6)^{3000} (0.4)^{7000}$
d) ${}^{10000}C_{7000}\,(0.6)^{7000} (0.4)^{3000}$
Solution:
Probability(p) that the student registered for IITM online degree program will pass the
qualifier exam is 0.6.
We have to find the probability that out of 10,000 registered students, 7,000 students will
pass the qualifier exam and passing qualifier exam for any student will be independent
of the other.
So here we can use the binomial distribution, where X is the number of students who
pass the exam, with p = 0.6, n = 10,000, and k = 7,000.
For the binomial distribution, $P(X = k) = {}^{n}C_{k}\, p^{k} (1 - p)^{n-k}$.
Hence, the probability that out of 10,000 registered students, 7,000 students will pass the
qualifier exam is ${}^{10000}C_{7000}\,(0.6)^{7000} (0.4)^{3000}$.
a) $(0.05)^{6} \times 0.95$
b) $(0.95)^{6} \times 0.05$
c) $(0.95)^{5} \times 0.05$
d) $(0.05)^{5} \times 0.95$
Solution:
We have to find the probability that the first defect is observed when the sixth component is tested.
The probability of a defective computer component is 0.05.
Here we can treat getting a defective component as a success. That means we have
to find the probability of the first success at the 6th trial with p = 0.05.
So here we can use the geometric distribution, with X representing the number of components
tested until the first defect and p = 0.05.
For the geometric distribution, $P(X = k) = (1 - p)^{k-1} p$.
$\Rightarrow P(X = 6) = (1 - 0.05)^{6-1} \times 0.05$
$\Rightarrow P(X = 6) = (0.95)^{5} \times 0.05$
Hence the probability that the first defect is observed when the sixth component is tested
is $(0.95)^{5} \times 0.05$.
6. If Aarushi and Ansh play a game of chess, Aarushi wins with probability 0.5 and Ansh
wins with probability 0.4 and the game ends in a draw with probability 0.1, independent
of all other games. They agree to play a match consisting of 5 games. Find the proba-
bility that Aarushi wins 4-1 (win gives 1 pt to winner and draw gives 0.5 pts to both).
Enter your answer correct to 3 decimals accuracy.
Solution:
Let Ai be the event that Aarushi will win the ith game and Bj be the event that Ansh
will win the jth game.
From given information we have P (Ai ) = 0.5, P (Bj ) = 0.4
There are two disjoint ways that Aarushi wins 4-1.
i) Aarushi wins 4 games and Ansh wins one game.
The probability of this is ${}^{5}C_{4} (0.5)^{4} \times 0.4 = 0.125$.
ii) Aarushi wins 3 games and 2 games are drawn.
The probability of this is ${}^{5}C_{3} (0.5)^{3} \times (0.1)^{2} = 0.0125$.
So, the probability that Aarushi wins 4-1 is 0.125 + 0.0125 = 0.1375.
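As a cross-check on the two-case argument, the match can be enumerated game by game. This is a sketch under the stated per-game probabilities; the labels "A", "B", "D" for an Aarushi win, an Ansh win, and a draw are illustrative.

```python
from itertools import product

# Per-game outcome probabilities from the problem statement.
p = {"A": 0.5, "B": 0.4, "D": 0.1}       # Aarushi wins, Ansh wins, draw
points = {"A": 1.0, "B": 0.0, "D": 0.5}  # points Aarushi earns per game

prob_4_1 = 0.0
for match in product("ABD", repeat=5):
    # Aarushi wins 4-1 exactly when she scores 4 of the 5 available points.
    if sum(points[g] for g in match) == 4.0:
        w = 1.0
        for g in match:
            w *= p[g]
        prob_4_1 += w

print(round(prob_4_1, 4))
```

The enumeration picks up both disjoint cases (four wins and a loss, or three wins and two draws) without listing them by hand.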
7. The probability of someone catching flu in a particular winter when they have been
given the flu vaccine is 0.2. Without the vaccine, the probability of catching flu is 0.5. If
40% of the population has been given the vaccine, what is the probability that a person
chosen at random from the population will catch flu over that winter? Enter the answer
correct to 2 decimals accuracy.
Solution:
Let A be the event that the person will catch flu and V be the event that the person
has been given the vaccine.
Given that $P(A \mid V) = 0.2$, $P(A \mid V^C) = 0.5$ and $P(V) = 0.4$.
We have to find the probability that a person chosen at random from the population
will catch flu over that winter, i.e. $P(A)$.
By the law of total probability, $P(A) = P(A \mid V)P(V) + P(A \mid V^C)P(V^C)$
⇒ P (A) = 0.2 × 0.4 + 0.5 × (1 − 0.4)
⇒ P (A) = 0.38
8. Suppose you are playing a game of cards with your friend. Your friend is supposed
to give you 13 cards one by one. With a well-shuffled pack of 52 cards, what is the
probability that you are dealt a perfect hand(13 of one suit)?
a) $\frac{13!}{52!}$
b) $\frac{12! \times 39!}{51!}$
c) $\frac{13! \times 39!}{51!}$
d) $\frac{13! \times 39!}{52!}$
Solution:
Your friend is supposed to give you 13 cards one by one. We need to find the probability
that you are dealt a perfect hand, i.e. that all 13 cards are of one suit.
The first card can be any of the 52 cards, so its probability is 1.
Once the first card is given to you, it fixes the suit, so the probability that the second card
is of the same suit is $\frac{12}{51}$ (12 cards of that suit remain among the 51 remaining cards).
Similarly, for the third card the probability is $\frac{11}{50}$.
Continuing like this, the probability that you are dealt a perfect hand is
$$1 \times \frac{12}{51} \times \frac{11}{50} \times \frac{10}{49} \times \frac{9}{48} \times \frac{8}{47} \times \frac{7}{46} \times \frac{6}{45} \times \frac{5}{44} \times \frac{4}{43} \times \frac{3}{42} \times \frac{2}{41} \times \frac{1}{40} = \frac{12! \times 39!}{51!}$$
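The card-by-card product can be compared with a direct counting argument (4 perfect hands out of all 13-card hands). The sketch below uses exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial

# Sequential argument: the first card is free, each later card must match its suit.
sequential = Fraction(1)
for k in range(1, 13):
    sequential *= Fraction(13 - k, 52 - k)  # 12/51, 11/50, ..., 1/40

# Closed form from the solution.
closed_form = Fraction(factorial(12) * factorial(39), factorial(51))

# Counting argument: 4 single-suit hands among all C(52, 13) hands.
counting = Fraction(4, comb(52, 13))

print(sequential == closed_form == counting)
```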
9. A person has bought a bed from an online furniture store. The seller delivers the
disassembled bed parts along with some screws to assemble it. The probability of a
screw being defective is 0.1 independent of all other screws. To compensate for the
manufacturing error, the seller sends two extra screws in the package where the bed
needs exactly 8 screws to assemble. What is the probability that the buyer will be able
to assemble the bed? (Enter the answer correct to 4 decimal accuracy)
Solution:
Let X represent the number of non-defective screws among the screws that the seller sends with the bed.
We need exactly 8 screws to assemble the bed and the seller sends two extra i.e. seller
sends ten screws.
The buyer will be able to assemble the bed if 8, 9, or 10 of the ten screws are non-defective,
i.e. if at least 8 screws are non-defective.
We can model this with the binomial distribution as X ∼ Binomial(10, p), where p is the
probability of a screw being non-defective, so p = 1 − 0.1 = 0.9.
So, the probability that the buyer will be able to assemble the bed is $P(X \geq 8)$, and
$$\begin{aligned}
P(X \geq 8) &= P(X = 8) + P(X = 9) + P(X = 10) \\
&= {}^{10}C_{8} (0.9)^{8} (0.1)^{2} + {}^{10}C_{9} (0.9)^{9} (0.1)^{1} + {}^{10}C_{10} (0.9)^{10} (0.1)^{0} \\
&= (0.9)^{8} \left[ 45 \times (0.1)^{2} + 10 \times 0.9 \times 0.1 + 0.81 \right] \\
&= (0.9)^{8} \times 2.16 \\
&= 0.9298
\end{aligned}$$
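The binomial tail above can be evaluated numerically. This is a minimal sketch using only the standard library; `n`, `p`, and `p_assemble` are illustrative names.

```python
from math import comb

n, p = 10, 0.9  # ten screws sent, each non-defective with probability 0.9

# P(X >= 8) for X ~ Binomial(10, 0.9): sum the pmf over k = 8, 9, 10.
p_assemble = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(8, n + 1))

print(round(p_assemble, 4))
```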
10. In a pizza shop 40% of the customers order medium size pizza, 50% order small size
pizza, and 10% order large size pizza. Of those ordering medium size pizza, 2/3 also ask to
add extra toppings. Of those ordering small size pizza, 1/5 also ask to add extra toppings,
and of those ordering large size pizza, 4/5 also ask to add extra toppings. Given that a
customer asked to add extra toppings, find the conditional probability that the customer
ordered a medium pizza.
a) $\frac{15}{67}$
b) $\frac{40}{67}$
c) $\frac{12}{67}$
d) $\frac{52}{67}$
Solution:
Let S, M and L denote the event that customer will order small, medium and large size
pizza, respectively.
Given that P (S) = 0.50, P (M ) = 0.40 and P (L) = 0.10.
Also, let T be the event that customer will ask to add extra toppings.
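From the question, $P(T \mid S) = \frac{1}{5}$, $P(T \mid M) = \frac{2}{3}$ and $P(T \mid L) = \frac{4}{5}$.
We need to find $P(M \mid T)$.
$$\begin{aligned}
P(M \mid T) &= \frac{P(M \cap T)}{P(T)} \\
&= \frac{P(T \mid M)P(M)}{P(T \mid S)P(S) + P(T \mid M)P(M) + P(T \mid L)P(L)} \\
&= \frac{\frac{2}{3} \times 0.40}{\frac{1}{5} \times 0.50 + \frac{2}{3} \times 0.40 + \frac{4}{5} \times 0.10} \\
&= \frac{0.80}{3} \times \frac{15}{6.7} \\
&= \frac{40}{67}
\end{aligned}$$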
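The Bayes computation above can be reproduced with exact fractions. This is a sketch; the dictionary keys "S", "M", "L" follow the event names in the solution, and the variable names are illustrative.

```python
from fractions import Fraction

# Prior probabilities of each pizza size and P(extra toppings | size).
prior = {"S": Fraction(1, 2), "M": Fraction(2, 5), "L": Fraction(1, 10)}
toppings = {"S": Fraction(1, 5), "M": Fraction(2, 3), "L": Fraction(4, 5)}

# Law of total probability for P(T), then Bayes' rule for P(M | T).
p_t = sum(prior[s] * toppings[s] for s in prior)
posterior_m = prior["M"] * toppings["M"] / p_t

print(p_t, posterior_m)
```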
Statistics for Data Science - 2
Practice Assignment 0.2.2 Solution
Events and Probabilities
1. The probability that an electrical machine will work more than 5 years but less than
8 years is 0.6 and the probability that it will work at least 8 years is 0.1. What is the
probability that the machine will work for more than 5 years? [1 mark]
Solution:
Define events A and B as follows:
A = Event that electrical machine will work more than 5 years.
B = Event that electrical machine will work at least 8 years.
From the given information,
P (A \ B) = 0.6
P (B) = 0.1
Now,
A = (A \ B) ∪ (A ∩ B)
Note that A ∩ B = B
⇒A = (A \ B) ∪ B
⇒P (A) = P ((A \ B) ∪ B)
⇒P (A) = P (A \ B) + P (B) (Since A \ B and B are disjoint events.)
⇒P (A) = 0.6 + 0.1 = 0.7
2. Five cards are drawn from a well-shuffled pack of playing cards with replacement. Find
the probability that there will be at least two aces. [1 mark]
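(a) $\left(\frac{1}{13}\right)^{5}$
(b) $\left(\frac{12}{13}\right)^{5}$
(c) $1 - \left(\frac{12}{13}\right)^{5} - 5 \cdot \frac{12^{4}}{13^{5}}$
(d) $1 - \left(\frac{1}{13}\right)^{5} - 5 \cdot \frac{12}{13^{5}}$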
Solution:
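Since the cards are drawn with replacement, the probability of drawing an ace is the same
in every draw and equal to $\frac{4}{52} = \frac{1}{13}$.
$$\begin{aligned}
P(\text{at least two aces}) &= 1 - P(\text{no ace}) - P(\text{exactly one ace}) \\
&= 1 - {}^{5}C_{0} \left(\frac{1}{13}\right)^{0} \left(\frac{12}{13}\right)^{5} - {}^{5}C_{1} \left(\frac{1}{13}\right)^{1} \left(\frac{12}{13}\right)^{4} \\
&= 1 - \left(\frac{12}{13}\right)^{5} - 5 \cdot \frac{12^{4}}{13^{5}}
\end{aligned}$$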
3. Choose the correct statements for any two non empty events A and B. [2 mark]
Solution:
We know that A ∩ B ⊆ A,
Then by using subset property, we have
P (A) = P (A ∩ B) + P (A \ (A ∩ B))
⇒P (A) = P (A ∩ B) + P (A \ B) (Since,A \ (A ∩ B) = A \ B)
⇒P (A \ B) = P (A) − P (A ∩ B) ....(1)
Therefore, option (a) is not necessarily true while option (b) is correct.
P (A \ B) = P (A) − P (A ∩ B)
= P (A) − [P (A) + P (B) − P (A ∪ B)] (By addition rule)
= P (A ∪ B) − P (B)
B ⊂ A ⇒ A ∩ B = B. ...(2)
From equation (1) and (2), we have
If B ⊂ A, then P (A \ B) = P (A) − P (B)
Therefore, option (d) is correct.
If A and B are disjoint events, then P (A ∩ B) = 0 .. (3)
From equation (1) and (3), we have
If A and B are disjoint events, then P (A \ B) = P (A)
Therefore, option (e) is correct.
$A \cup B \cup C = S$ ...(1)
And
$P(A \cup B) = \frac{1}{2}$ ...(2)
Now, we know (proved in the previous question) that $P(A \setminus B) = P(A \cup B) - P(B)$
for any two events A and B. Using this, we have
$$\begin{aligned}
P(C \setminus (A \cup B)) &= P(A \cup B \cup C) - P(A \cup B) \\
&= P(S) - P(A \cup B) \\
&= 1 - \frac{1}{2} = \frac{1}{2}
\end{aligned}$$
5. Two friends Ravi and Sonali are playing a game in which they are hitting a target in
rounds. In each round, both hit the target independent of each other with a probability
of 0.5. The first one who hits the target three times wins the game. What is the
probability that in the fifth round Sonali wins the game? [2 marks]
1. $6 \times (0.5)^{5}$
2. $30 \times (0.5)^{10}$
3. $96 \times (0.5)^{5}$
4. $96 \times (0.5)^{10}$
Solution:
Define Events A and B as follows:
A = Ravi hits the target.
B = Sonali hits the target.
Given that
P (A) = P (B) = 0.5 ...(1)
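Sonali will win in the fifth round if she hits the target for the third time in the fifth round
and Ravi hits the target 0, 1, or 2 times in the five rounds.
Probability that Sonali hits the target for the third time in the fifth round (two hits in the
first four rounds, then a hit) $= {}^{4}C_{2} (0.5)^{2} (0.5)^{3} = 6 \times (0.5)^{5}$
Probability that Ravi hits the target at most twice in five rounds
$= \left({}^{5}C_{0} + {}^{5}C_{1} + {}^{5}C_{2}\right)(0.5)^{5} = 16 \times (0.5)^{5}$
Therefore, the probability that Sonali wins in the fifth round $= 6 \times (0.5)^{5} \times 16 \times (0.5)^{5}$
$= 96 \times (0.5)^{10}$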
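The two factors in this answer can be computed separately and multiplied, which mirrors the independence of the two players. This is a sketch of the solution's argument; the variable names are illustrative.

```python
from math import comb

p_hit = 0.5

# Sonali: exactly two hits in the first four rounds, then a hit in round five.
p_sonali_third_hit_round5 = comb(4, 2) * p_hit**2 * (1 - p_hit)**2 * p_hit

# Ravi: at most two hits in his five rounds.
p_ravi_at_most_two = sum(
    comb(5, k) * p_hit**k * (1 - p_hit)**(5 - k) for k in range(3)
)

# The players hit independently, so the probabilities multiply.
p_sonali_wins_round5 = p_sonali_third_hit_round5 * p_ravi_at_most_two

print(p_sonali_wins_round5)
```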
6. A family has three children each of which is equally likely to be a boy or a girl indepen-
dently to each other. Let A be the event that at most one child is a boy. B be the event
that the family has at least one girl and one boy. C be the event that all three children
are of same-sex. Choose the correct options. [2 mark]
Solution:
Since, a family has three children each of which is equally likely to be a boy or a girl
independently to each other, sample space of gender of all three children (in the order
of elder to younger) will be
B = Family has at least one girl and one boy
B = {bbg, bgb, gbb, bgg, gbg, ggb}
A ∩ C = Event that at most one child is a boy and all three children are of same sex.
⇒ A ∩ C = Event that all three children are girls
A ∩ C = {ggg}
B ∩ C = Family has at least one boy and one girl and all three children are of same sex.
⇒ B ∩ C = Empty event
P (B ∩ C) = 0 ...(6)
⇒B and C are disjoint events.
7. In a town, 60% of the residents are eligible for voting in an election but only 80 % of the
eligible residents voted in the election. A person is randomly selected from the town.
What is the conditional probability that the person is eligible for the voting given that
he or she did not vote? [2 mark]
1. $\frac{2}{13}$
2. $\frac{3}{13}$
3. $\frac{4}{13}$
4. $\frac{6}{13}$
Define events A and B as follows:
A = randomly selected person is eligible for voting.
B = randomly selected person has voted.
Given that
P (A) = 0.6
P (B|A) = 0.8
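$\Rightarrow P(B^C \mid A) = 0.2$
Note that
$P(B^C \mid A^C) = 1$ (a resident who is not eligible cannot vote)
To find: $P(A \mid B^C)$
$$\begin{aligned}
P(A \mid B^C) &= \frac{P(B^C \mid A)\,P(A)}{P(B^C \mid A)\,P(A) + P(B^C \mid A^C)\,P(A^C)} \\
&= \frac{(0.2)(0.6)}{(0.2)(0.6) + (1)(0.4)} \\
&= \frac{0.12}{0.52} = \frac{3}{13}
\end{aligned}$$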
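The same posterior can be computed with exact fractions. This is a minimal sketch; it assumes, as the solution does, that an ineligible resident never votes, and the variable names are illustrative.

```python
from fractions import Fraction

p_eligible = Fraction(3, 5)                # P(A) = 0.6
p_novote_given_eligible = Fraction(1, 5)   # P(B^C | A) = 1 - 0.8
p_novote_given_ineligible = Fraction(1)    # assumption: ineligible residents cannot vote

# Total probability of not voting, then Bayes' rule.
p_novote = (p_novote_given_eligible * p_eligible
            + p_novote_given_ineligible * (1 - p_eligible))
posterior = p_novote_given_eligible * p_eligible / p_novote

print(posterior)
```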
8. Urn A contains 3 red and 2 blue marbles while urn B contains 2 red and 8 blue marbles.
A fair coin is tossed. If the coin turns up head, a marble is chosen from urn A. If it
turns up tail, a marble is chosen from urn B. Suppose Shreya who tosses the coin gets
a red color marble. What is the conditional probability that the marble is drawn from
the urn A? (Answer the question correctly up to two decimal points.) [2 marks]
Solution:
Define the events as follows:
H = Coin turns up head.
T = Coin turns up tail.
R = Red marble is drawn.
B = Blue marble is drawn.
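$P(R \mid H) = \frac{3}{5}$
$P(B \mid H) = \frac{2}{5}$
$P(R \mid T) = \frac{2}{10} = \frac{1}{5}$
$P(B \mid T) = \frac{8}{10} = \frac{4}{5}$
The marble is drawn from urn A if the coin turns up head.
$$\begin{aligned}
P(H \mid R) &= \frac{P(R \mid H)\,P(H)}{P(R \mid H)\,P(H) + P(R \mid T)\,P(T)} \\
&= \frac{\frac{3}{5} \cdot \frac{1}{2}}{\frac{3}{5} \cdot \frac{1}{2} + \frac{1}{5} \cdot \frac{1}{2}} \\
&= \frac{3}{4}
\end{aligned}$$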
9. Three different tasks were assigned to three persons A, B, and C. Previous records show
that A, B, and C will complete their tasks independent of each other with probabilities
of $\frac{1}{2}$, $\frac{2}{3}$, and $\frac{3}{4}$, respectively. If it is known that exactly two of them have completed their
tasks, then what is the conditional probability that A has not completed his task? [3 marks]
(a) $\frac{3}{4}$
(b) $\frac{3}{11}$
(c) $\frac{6}{11}$
(d) $\frac{9}{11}$
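$D = (A \cap B \cap C^C) \cup (A \cap B^C \cap C) \cup (A^C \cap B \cap C)$
$P(D) = P((A \cap B \cap C^C) \cup (A \cap B^C \cap C) \cup (A^C \cap B \cap C))$
Since $A \cap B \cap C^C$, $A \cap B^C \cap C$, and $A^C \cap B \cap C$ are disjoint events, we have
$P(D) = P(A \cap B \cap C^C) + P(A \cap B^C \cap C) + P(A^C \cap B \cap C)$
Since A, B, and C are independent events, we have
$P(D) = P(A)P(B)P(C^C) + P(A)P(B^C)P(C) + P(A^C)P(B)P(C)$
Now,
$$\begin{aligned}
P(A^C \mid D) &= \frac{P(A^C \cap D)}{P(D)} \\
&= \frac{P(A^C \cap B \cap C)}{P(A)P(B)P(C^C) + P(A)P(B^C)P(C) + P(A^C)P(B)P(C)} \\
&= \frac{\frac{1}{2} \cdot \frac{2}{3} \cdot \frac{3}{4}}{\frac{1}{2} \cdot \frac{2}{3} \cdot \frac{1}{4} + \frac{1}{2} \cdot \frac{1}{3} \cdot \frac{3}{4} + \frac{1}{2} \cdot \frac{2}{3} \cdot \frac{3}{4}} \\
&= \frac{6}{11}
\end{aligned}$$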
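The conditional probability can also be obtained by enumerating the eight completion patterns. This is a sketch; `p_done` and the other names are illustrative, and independence of the three persons is taken from the question.

```python
from fractions import Fraction
from itertools import product

# Probability that each person completes the task.
p_done = {"A": Fraction(1, 2), "B": Fraction(2, 3), "C": Fraction(3, 4)}

p_exactly_two = Fraction(0)    # P(D)
p_two_and_not_a = Fraction(0)  # P(A^C ∩ D)
for done in product([True, False], repeat=3):
    status = dict(zip("ABC", done))
    # Independence: multiply the per-person probabilities.
    w = Fraction(1)
    for person, finished in status.items():
        w *= p_done[person] if finished else 1 - p_done[person]
    if sum(done) == 2:
        p_exactly_two += w
        if not status["A"]:
            p_two_and_not_a += w

answer = p_two_and_not_a / p_exactly_two
print(p_exactly_two, answer)
```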
10. There are twenty boxes out of which exactly fifteen contains gifts and five are empty.
Five boxes are removed randomly. Now, a person selects one box from the remaining
boxes, then what is the probability that the person selects the empty box? [3 marks]
(Hint: Consider all the cases of removing empty boxes and apply the law of total probability)
(a) $\frac{1}{4}$
(b) $\frac{2}{3}$
(c) $\frac{3}{4}$
(d) $\frac{1}{3}$
Solution:
Define the events A, B, C, D, E,and F as follows:
A = Removed boxes contain no empty box.
B = Removed boxes contain one empty box.
C = Removed boxes contain two empty boxes.
D = Removed boxes contain three empty boxes.
E = Removed boxes contain four empty boxes.
F = Removed boxes contain five empty boxes.
Let X be the event that person selects the empty box.
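$$\begin{aligned}
P(X) &= P(A)P(X \mid A) + P(B)P(X \mid B) + P(C)P(X \mid C) + P(D)P(X \mid D) + P(E)P(X \mid E) + P(F)P(X \mid F) \\
&= \frac{{}^{15}C_{5}\,{}^{5}C_{0}}{{}^{20}C_{5}} \cdot \frac{5}{15} + \frac{{}^{15}C_{4}\,{}^{5}C_{1}}{{}^{20}C_{5}} \cdot \frac{4}{15} + \frac{{}^{15}C_{3}\,{}^{5}C_{2}}{{}^{20}C_{5}} \cdot \frac{3}{15} + \frac{{}^{15}C_{2}\,{}^{5}C_{3}}{{}^{20}C_{5}} \cdot \frac{2}{15} + \frac{{}^{15}C_{1}\,{}^{5}C_{4}}{{}^{20}C_{5}} \cdot \frac{1}{15} + \frac{{}^{15}C_{0}\,{}^{5}C_{5}}{{}^{20}C_{5}} \cdot \frac{0}{15} \\
&= \frac{1}{15504 \times 15} (15015 + 27300 + 13650 + 2100 + 75) \\
&= \frac{1}{4}
\end{aligned}$$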
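The six-term sum above is a direct application of the law of total probability and can be evaluated exactly. This is a minimal sketch; the loop variable `k` (number of empty boxes among the five removed) and the other names are illustrative.

```python
from fractions import Fraction
from math import comb

total = Fraction(0)
for k in range(6):  # k = number of empty boxes among the 5 removed boxes
    # Hypergeometric probability of removing k empty and 5 - k gift boxes.
    p_removed = Fraction(comb(5, k) * comb(15, 5 - k), comb(20, 5))
    # 5 - k empty boxes remain among the 15 boxes left.
    p_pick_empty = Fraction(5 - k, 15)
    total += p_removed * p_pick_empty

print(total)
```

The result equals 5/20, which is what one would expect by symmetry: removing boxes at random without looking does not change the chance that any particular box is empty.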
Statistics for Data Science - 2
Practice Assignment 0.3.1 Solution
Discrete random variable
1. A random variable X is defined as the length of the hypotenuse of the right-angled tri-
angle whose other two sides are determined by the roll of two 6-sided dice. How many
values does X take? [1 mark]
Solution:
When two dice are rolled then there are a total of 36 outcomes.
The outcomes are:
{(1, 1) , (1, 2), ... , (1, 6),
(2, 1), (2, 2), ... , (2, 6),
...
...
(6, 1), (6, 2), ... , (6, 6)}
But the outcomes like (1, 2) (2, 1) will give the same length of the hypotenuse, hence a
total of 21 values are possible for the random variable X.
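The count of 21 relies on no two unordered pairs giving the same hypotenuse, which can be checked by enumeration. This sketch compares squared lengths so that only integers are involved; the names are illustrative.

```python
from itertools import product

# The hypotenuse is sqrt(a^2 + b^2); distinct squared lengths give distinct lengths.
squared_lengths = {a * a + b * b for a, b in product(range(1, 7), repeat=2)}
num_values = len(squared_lengths)

print(num_values)
```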
2. Two cards are drawn from a well shuffled pack of 52 cards one after other without
replacement. A random variable is defined as:
$$X = \begin{cases} 0 & \text{if both cards are of the same color} \\ 1 & \text{if both cards are of different colors} \end{cases}$$
(a)  x        0       1
     f_X(x)   1/13    12/13
(b)  x        0       1
     f_X(x)   1/2     1/2
(c)  x        0       1
     f_X(x)   25/51   26/51
(d)  x        0       1
     f_X(x)   12/25   13/25
Solution:
$P(X = 0) = P(\text{both cards are of the same color})$
$= P(\text{first card is any one of the 52 cards}) \cdot P(\text{second card is of the same color as the first})$
$= 1 \cdot \frac{25}{51}$
3. In a group of fifteen people, 8 people have blood group type O, 4 people have blood group
type A, and 3 people have blood group type B. If five people are selected randomly from
these fifteen people, then what is the probability that out of these five people 2 people
have blood group type O, 2 have blood group type A and one has blood group type B?
(Answer the question correct up to two decimal places.) [2 mark]
Solution:
Number of ways of selecting five people out of 15 $= {}^{15}C_{5}$
Number of ways of selecting 2 people with blood group type O out of the 8 people with
type O $= {}^{8}C_{2}$
Number of ways of selecting 2 people with blood group type A out of the 4 people with
type A $= {}^{4}C_{2}$
Number of ways of selecting 1 person with blood group type B out of the 3 people with
type B $= {}^{3}C_{1}$
Therefore, required probability $= \frac{{}^{8}C_{2}\,{}^{4}C_{2}\,{}^{3}C_{1}}{{}^{15}C_{5}} = \frac{28 \times 6 \times 3}{3003} \approx 0.168$, i.e. 0.17 to two decimal places.
x        -2    -1    0    1     2
f_X(x)   a     0.2   b    0.1   0.2
Table: PMF of X
If $P(X \leq 1 \mid X \geq -1) = \frac{3}{4}$, then find the value of $P(X = -2)$. [2 marks]
Solution:
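We know that
$$\sum_{x \in T_X} f_X(x) = 1$$
$\Rightarrow a + 0.2 + b + 0.1 + 0.2 = 1 \Rightarrow a + b = 0.5$ ...(1)
$$\begin{aligned}
& P(X \leq 1 \mid X \geq -1) = \frac{3}{4} \\
\Rightarrow\ & \frac{P(X \leq 1, X \geq -1)}{P(X \geq -1)} = \frac{3}{4} \\
\Rightarrow\ & \frac{P(\{-1, 0, 1\})}{P(\{-1, 0, 1, 2\})} = \frac{3}{4} \\
\Rightarrow\ & \frac{b + 0.3}{b + 0.5} = \frac{3}{4} \\
\Rightarrow\ & 4b + 1.2 = 3b + 1.5 \\
\Rightarrow\ & b = 0.3 \quad ...(2)
\end{aligned}$$
From equations (1) and (2),
a = 0.2
b = 0.3
$P(X = -2) = a$
$\Rightarrow P(X = -2) = 0.2$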
5. Siberian seagulls migrate to Ganga river to escape harsh winter weather in the months
of October to March. It is seen that the number of Siberian seagulls reaching Ganga
river on one day in January is Poisson distributed with an average of 1000. What is the
probability that 650 seagulls will arrive on a given day of January? [2 marks]
(a) $\frac{e^{-650} (650)^{1000}}{650!}$
(b) $\frac{e^{-650} (650)^{1000}}{1000!}$
(c) $\frac{e^{-1000} (650)^{1000}}{650!}$
(d) $\frac{e^{-1000} (1000)^{650}}{650!}$
Solution:
Let X be the number of Siberian seagulls reaching the Ganga river on a given day in January.
By the given condition, we have
$X \sim \text{Poisson}(1000)$
$P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}$
$\Rightarrow P(X = 650) = \frac{e^{-1000} (1000)^{650}}{650!}$
x        -1    0     1     2     3
f_X(x)   0.1   0.3   0.2   0.1   0.3
Table: PMF of X
If another random variable Y is defined as Y = X(X − 1), then find the smallest value
of y in the range of Y such that $P(Y \leq y) > \frac{1}{2}$ and $P(Y \geq y) \leq \frac{1}{2}$. [2 marks]
Solution:
Y is defined as Y = X(X − 1)
At X = −1, Y = −1(−2) = 2
At X = 0, Y = 0(−1) = 0
At X = 1, Y = 1(0) = 0
At X = 2, Y = 2(1) = 2
At X = 3, Y = 3(2) = 6
Therefore, TY = {0, 2, 6}
First required condition is not satisfied at Y = 0.
7. Three friends toss three fair coins to decide who is going to pay for the dinner. The per-
son getting an outcome different from the other two outcomes will pay for the dinner. If
all three coins result in the same outcome, they will toss the coins again. If X denotes
the number of trials needed to decide who is going to pay, then what is the probability
that X is at most 3? (Answer the question correct up to two decimal places.) [2 marks]
Solution:
Let X be the number of trials to decide who is going to pay.
Sample space on tossing three coins are:
{HHH, HHT, HTH, THH, HTT, THT, TTH, TTT }
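$P(\text{they decide who is going to pay}) = P(\{HHT, HTH, THH, HTT, THT, TTH\}) = \frac{6}{8} = \frac{3}{4}$
$P(\text{they do not decide who is going to pay}) = P(\{HHH, TTT\}) = \frac{2}{8} = \frac{1}{4}$
X takes the values 1, 2, 3, 4, ...
and $X \sim \text{Geometric}\left(\frac{3}{4}\right)$
$$\begin{aligned}
P(X \leq 3) &= P(X = 1) + P(X = 2) + P(X = 3) \\
&= \frac{3}{4} + \frac{1}{4} \cdot \frac{3}{4} + \left(\frac{1}{4}\right)^{2} \cdot \frac{3}{4} \\
&= \frac{63}{64} \approx 0.98
\end{aligned}$$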
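The geometric sum can be checked against the complement rule (all of the first three trials are ties). This is a minimal sketch with exact fractions; the names are illustrative.

```python
from fractions import Fraction

p_decide = Fraction(3, 4)  # probability that one trial settles who pays

# P(X <= 3) as a sum of geometric pmf terms (1 - p)^(k - 1) * p for k = 1, 2, 3.
p_at_most_3 = sum((1 - p_decide) ** (k - 1) * p_decide for k in range(1, 4))

# Complement: the first three trials all end with three equal coins.
p_complement = 1 - (1 - p_decide) ** 3

print(p_at_most_3, float(p_at_most_3))
```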
8. Let X ∼ Uniform({1, 2, 3, ..., n}). If the probability that X is an odd number is $\frac{6}{11}$,
then what can be the value of n? [2 marks]
(a) 11 only
(b) 12 only
(c) Any multiple of 11.
(d) Any odd multiple of 11.
Solution:
Since X ∼ Uniform({1, 2, 3, ..., n}), let A be the event that X takes an odd value.
Therefore,
$P(A) = \frac{\text{number of outcomes in } A}{\text{number of outcomes in } S}$ ...(1)
where S = {1, 2, 3, ..., n}.
It is given that
$P(A) = \frac{6}{11}$ ...(2)
By equations (1) and (2), n should be a multiple of 11 and the number of odd numbers less
than or equal to n should be the corresponding multiple of 6.
This is possible only for n = 11: if n is even, $P(A) = \frac{1}{2}$, and if n is odd, $P(A) = \frac{n+1}{2n}$,
which equals $\frac{6}{11}$ only when n = 11.
9. The number of customers arriving per day at a certain automobile service facility is
assumed to follow a Poisson distribution with an average of 50 customers arriving each
day. Assume that number of customers on different days are independent. What is the
probability that exactly 40 customers will come for at least 5 days over a 30 days period?
[3 marks]
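(a) $1 - \sum_{x=0}^{4} {}^{30}C_{x} \left(\frac{e^{-50} (50)^{40}}{40!}\right)^{x} \left(1 - \frac{e^{-50} (50)^{40}}{40!}\right)^{30-x}$
(b) $\sum_{x=0}^{4} {}^{30}C_{x} \left(\frac{e^{-50} (50)^{40}}{40!}\right)^{x} \left(1 - \frac{e^{-50} (50)^{40}}{40!}\right)^{30-x}$
(c) ${}^{30}C_{5} \left(\frac{e^{-50} (50)^{40}}{40!}\right)^{5} \left(1 - \frac{e^{-50} (50)^{40}}{40!}\right)^{25}$
(d) ${}^{30}C_{5} \left(1 - \frac{e^{-50} (50)^{40}}{40!}\right)^{5} \left(\frac{e^{-50} (50)^{40}}{40!}\right)^{25}$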
Solution:
Let X be the number of customers arriving per day at the automobile service facility.
$X \sim \text{Poisson}(50)$
$P(X = 40) = \frac{e^{-50}\, 50^{40}}{40!}$
Let Y be the number of days in the 30-day period on which exactly 40 customers arrive
at the facility.
Then, $Y \sim \text{Binomial}\left(30, \frac{e^{-50}\, 50^{40}}{40!}\right)$
Now,
$$\begin{aligned}
P(Y \geq 5) &= 1 - P(Y < 5) \\
&= 1 - \sum_{x=0}^{4} {}^{30}C_{x} \left(\frac{e^{-50} (50)^{40}}{40!}\right)^{x} \left(1 - \frac{e^{-50} (50)^{40}}{40!}\right)^{30-x}
\end{aligned}$$
10. A biased coin with the probability of 0.4 of showing head is tossed until it shows either
two consecutive heads or two consecutive tails. If X denotes the number of tosses
required, what is the value of P (X = 5)? [3 marks]
(a) 0.03456
(b) 0.02304
(c) 0.01675
(d) 0.0576
Solution:
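X = 5 requires the first four tosses to alternate and the fifth toss to repeat the fourth, so
$$\begin{aligned}
P(X = 5) &= P(HTHTT) + P(THTHH) \\
&= (0.4)^{2} (0.6)^{3} + (0.4)^{3} (0.6)^{2} \\
&= 0.0576
\end{aligned}$$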
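The claim that only HTHTT and THTHH stop at the fifth toss can be checked by enumerating all sequences of five tosses. This is a sketch; the helper `stops_at` is a hypothetical name introduced for illustration.

```python
from itertools import product

p_head = 0.4

def stops_at(seq):
    """Toss number at which two equal consecutive results first appear, or None."""
    for i in range(1, len(seq)):
        if seq[i] == seq[i - 1]:
            return i + 1
    return None

p_x5 = 0.0
stopping_sequences = []
for seq in product("HT", repeat=5):
    if stops_at(seq) == 5:
        stopping_sequences.append("".join(seq))
        w = 1.0
        for c in seq:
            w *= p_head if c == "H" else 1 - p_head
        p_x5 += w

print(stopping_sequences, round(p_x5, 4))
```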
Statistics for Data Science - 2
1. Toss a coin 50 times. Let the random variable X be defined as the number of tails
observed. Find the average of the values in the range of the random variable.
Solution:
Random variable X is defined as the number of tails observed while tossing the coin 50
times.
So the possible values taken by X are 0, 1, 2, 3, ..., 48, 49, 50.
⇒ Range of X = {0, 1, 2, 3, ..., 48, 49, 50}
Average of range values = (sum of all values in the range) / (total number of values)
⇒ Average of range values $= \frac{0 + 1 + 2 + 3 + \dots + 48 + 49 + 50}{51} = \frac{1275}{51} = 25$
2. Suppose that 5 fruits are randomly chosen from a basket containing 20 fruits, of which
16 are good and 4 are rotten. Let Y denote the number of rotten fruits chosen. Find
the possible values taken by Y .
a) {1, 2, 3, 4, 5}
b) {0, 1, 2, 3, 4, 5}
c) {1, 2, 3, 4}
d) {0, 1, 2, 3, 4}
Solution:
Random variable Y is defined as the number of rotten fruits chosen from the basket
while drawing 5 fruits. Since there are only 4 rotten fruits, so Y cannot take values more
than 4. Also there are 16 good fruits, so while drawing fruits there can be 0 rotten fruit
or 1 rotten fruit or 2 rotten fruits or 3 rotten fruits or 4 rotten fruits.
Hence, the possible values taken by Y i.e Range = {0, 1, 2, 3, 4}.
3. Let X be the number of candies present in a box. We have the following information:
There are at most four candies in the box.
The probability of having 2 candies in the box is the same as the probability of having
one candy.
The probability of having no candy in the box is the same as the probability of having
3 candies.
The probability of having four candies is twice of the probability of having three candies
and four times of having two candies.
What will be the PMF of X?
a)  X          0      1      2      3      4
    P(X = x)   1/10   1/10   2/10   2/10   4/10
b)  X          0      1      2      3      4
    P(X = x)   2/10   1/10   1/10   2/10   4/10
c)  X          0      1      2      3      4
    P(X = x)   1/10   2/10   1/10   2/10   4/10
d)  X          0      1      2      3      4
    P(X = x)   4/10   2/10   1/10   1/10   2/10
Solution:
Given that there are at most four candies in the box, so X cannot take values more than
4.
Also given that
P (X = 2) = P (X = 1), P (X = 0) = P (X = 3), P (X = 4) = 2P (X = 3) and
P (X = 4) = 4P (X = 2).
Let P(X = 2) = p and P(X = 0) = q, so that P(X = 1) = p, P(X = 3) = q, and P(X = 4) = 2q = 4p.
⇒ 2q = 4p
⇒ q = 2p
And we know that P (X = 0) + P (X = 1) + P (X = 2) + P (X = 3) + P (X = 4) = 1
⇒ q + p + p + q + 2q = 1
⇒ 4q + 2p = 1
Using the above relation, we will get 4 × 2p + 2p = 1
⇒ p = 1/10 and hence q = 2/10.
So, P (X = 0) = 2/10, P (X = 1) = 1/10, P (X = 2) = 1/10, P (X = 3) = 2/10, and
P (X = 4) = 4/10.
Therefore, option b is the correct answer.
X          0   1   2    3    4    5         6
P(X = x)   0   k   4k   6k   4k   $10k^2$   $6k^2$
Table: PMF of X
Find the value of P (X ≤ 4). Enter your answer correct up to 4 decimals accuracy.
Solution:
We know that $\sum_{x=0}^{6} P(X = x) = 1$
⇒ P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5) + P(X = 6) = 1
⇒ $0 + k + 4k + 6k + 4k + 10k^{2} + 6k^{2} = 1$
⇒ $16k^{2} + 15k - 1 = 0$
⇒ $(16k - 1)(k + 1) = 0$
⇒ $k = -1$ or $k = \frac{1}{16}$
Since k cannot be negative, k must be 1/16.
Now,
P(X ≤ 4) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4)
= 0 + k + 4k + 6k + 4k
= 15k
$= 15 \times \frac{1}{16}$
= 0.9375
5. I roll two fair six sided dice and observe the two outcomes. Let the random variables
Y and Z denote the outcomes observed on the two dice and let X = Y + Z. Find
P (Y = 3|X = 6).
Solution:
Y and Z denotes the outcomes observed on the two dice.
Given X = Y + Z, so the favourable outcomes for X = 6 will be {(1,5),(2,4),(3,3),(4,2),
(5,1)}.
From the reduced sample space the favourable outcomes for (Y = 3|X = 6) will be
{(3,3)}.
Hence, $P(Y = 3 \mid X = 6) = \frac{1}{5} = 0.2$
At X = 3
Y = (3 − 1)(3 + 1)(3 + 3) = 48
This implies that Y is taking values -3, 0, 15, and 48.
So,
7. A shopkeeper sells mobile phones. The demand for mobile phone follows a Poisson dis-
tribution with mean 4.6 per week. The shopkeeper has 5 mobile phones in his shop at
the beginning of a week. Find the probability that this will not be enough to satisfy the
demand for mobile phones in that week. Enter your answer correct up to two decimals
accuracy.
Solution:
The shopkeeper has 5 mobile phones in his shop at the beginning of a week. The shop-
keeper will not be able to satisfy the demand for mobile phones in that week only if
the demand of mobile phone is more than 5 phones. So, we need to find the value of
P (X > 5).
Also given that demand for mobile phone follows a Poisson distribution with mean 4.6
per week. i.e. λ = 4.6
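$$\begin{aligned}
P(X > 5) &= 1 - P(X \leq 5) \\
&= 1 - [P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5)] \\
&= 1 - \left[ \frac{e^{-4.6} (4.6)^{0}}{0!} + \frac{e^{-4.6} (4.6)^{1}}{1!} + \frac{e^{-4.6} (4.6)^{2}}{2!} + \frac{e^{-4.6} (4.6)^{3}}{3!} + \frac{e^{-4.6} (4.6)^{4}}{4!} + \frac{e^{-4.6} (4.6)^{5}}{5!} \right] \\
&= 1 - e^{-4.6} [1 + 4.6 + 10.58 + 16.22 + 18.66 + 17.16] \\
&= 1 - e^{-4.6} \times 68.22 \\
&\approx 1 - 0.686 \\
&\approx 0.31
\end{aligned}$$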
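Because the final value is sensitive to where the intermediate sum is rounded, it is worth evaluating the Poisson tail without rounding the partial results. This is a minimal sketch using only the standard library; `lam` and the other names are illustrative.

```python
from math import exp, factorial

lam = 4.6  # mean weekly demand

# P(X <= 5) for X ~ Poisson(4.6), summed term by term.
p_at_most_5 = sum(exp(-lam) * lam**k / factorial(k) for k in range(6))

# Stock of 5 phones is insufficient exactly when demand exceeds 5.
p_shortfall = 1 - p_at_most_5

print(round(p_at_most_5, 4), round(p_shortfall, 4))
```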
8. Suppose that in the end semester paper of Statistics there are 18 multiple-choice ques-
tions (only one option is correct for each question). Each question has 4 possible options.
You know the answer to 8 questions, but you have no idea about the other 10 questions
and choose answers randomly and independently. Your score X of the exam is the total
number of correct answers. Find the value of P (X ≥ 12). Enter your answer correct up
to 2 decimals accuracy.
Solution:
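Your score is the total number of correct answers, and you already know the answers to 8
questions.
So, instead of finding $P(X \geq 12)$ directly, define a new random variable Y as the number of
correct answers among the 10 questions that you answer at random, and find $P(Y \geq 4)$.
There are four options for each question and only one is correct, so the probability of
answering a guessed question correctly is 1/4, independently of the other questions.
So we can use the binomial distribution with n = 10 and p = 0.25.
Now,
$$\begin{aligned}
P(Y \geq 4) &= 1 - P(Y < 4) \\
&= 1 - [P(Y = 0) + P(Y = 1) + P(Y = 2) + P(Y = 3)] \\
&= 1 - \left[ {}^{10}C_{0} \left(\tfrac{1}{4}\right)^{0} \left(\tfrac{3}{4}\right)^{10} + {}^{10}C_{1} \left(\tfrac{1}{4}\right)^{1} \left(\tfrac{3}{4}\right)^{9} + {}^{10}C_{2} \left(\tfrac{1}{4}\right)^{2} \left(\tfrac{3}{4}\right)^{8} + {}^{10}C_{3} \left(\tfrac{1}{4}\right)^{3} \left(\tfrac{3}{4}\right)^{7} \right] \\
&= 1 - \left(\tfrac{3}{4}\right)^{7} \left[ \left(\tfrac{3}{4}\right)^{3} + 10 \left(\tfrac{1}{4}\right)^{1} \left(\tfrac{3}{4}\right)^{2} + 45 \left(\tfrac{1}{4}\right)^{2} \left(\tfrac{3}{4}\right)^{1} + 120 \left(\tfrac{1}{4}\right)^{3} \right] \\
&= 1 - \left(\tfrac{3}{4}\right)^{7} \left[ \tfrac{372}{64} \right] \\
&= 1 - 0.78 \\
&= 0.22
\end{aligned}$$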
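The reduction from X to Y can be checked numerically. This is a sketch of the binomial tail for the ten guessed questions; the variable names are illustrative.

```python
from math import comb

n_guessed, p_correct = 10, 0.25

# P(Y < 4) for Y ~ Binomial(10, 0.25).
p_fewer_than_4 = sum(
    comb(n_guessed, k) * p_correct**k * (1 - p_correct)**(n_guessed - k)
    for k in range(4)
)

# X >= 12 exactly when at least 4 of the guessed questions are correct.
p_score_at_least_12 = 1 - p_fewer_than_4

print(round(p_score_at_least_12, 4))
```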
9. A fruit owner sells fruit in a lot that contains 50 fruits. A customer selects 5 fruits at
random from a lot and rejects the lot (will not purchase) if one of the 5 selected fruits
is rotten. What is the probability that the customer will purchase the lot if there are 4
rotten fruits in the lot? Enter your answer correct up to 2 decimals accuracy.
Solution:
Given that there are 4 rotten fruits in the lot that contains 50 fruits.
Customer will purchase the lot if out of 5 selected fruits there is no rotten fruit.
The probability that there is no rotten fruit among the 5 selected fruits is
$$\frac{{}^{4}C_{0}\,{}^{46}C_{5}}{{}^{50}C_{5}} = \frac{1370754}{2118760} \approx 0.647$$
Accepted range: 0.61 - 0.67
10. Suppose the probability that any given person will independently believe a tale about
the existence of a parallel universe is 0.6. What is the probability that the eighth person
to hear this tale about existence of a parallel universe is the fifth one to believe it?
a) ${}^{8}C_{5} (0.6)^{5} (0.4)^{3}$
b) ${}^{7}C_{4} (0.6)^{5} (0.4)^{3}$
c) ${}^{8}C_{5} (0.6)^{3} (0.4)^{5}$
d) ${}^{7}C_{4} (0.6)^{3} (0.4)^{5}$
Solution:
Given that the probability that any given person will believe a tale about the existence
of parallel universe is 0.6.
We need to find the probability that the eighth person to hear this tale about existence
of parallel universe is the fifth one to believe it.
In other words, we need exactly 4 successes in the first 7 trials and a success on the 8th
trial. (Here a success means that the person believes the tale about the existence of a
parallel universe.)
The probability of getting 4 successes out of 7 is ${}^{7}C_{4} (0.6)^{4} (0.4)^{3}$.
Including the success on the 8th trial, the probability is ${}^{7}C_{4} (0.6)^{4} (0.4)^{3} \times 0.6$.
This implies that the probability that the eighth person to hear this tale about the existence
of a parallel universe is the fifth one to believe it is ${}^{7}C_{4} (0.6)^{5} (0.4)^{3}$.
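The closed form can be compared with a brute-force sum over all belief patterns of the first eight listeners. This is a sketch; the names `closed_form` and `brute_force` are illustrative.

```python
from itertools import product
from math import comb

p = 0.6  # probability that a given person believes the tale

# Closed form: 4 believers among the first 7 listeners, then the 8th believes.
closed_form = comb(7, 4) * p**5 * (1 - p)**3

# Brute force over all 2^8 belief patterns (True = believes).
brute_force = 0.0
for pattern in product([True, False], repeat=8):
    # The 8th listener believes and is the 5th believer overall.
    if pattern[7] and sum(pattern) == 5:
        w = 1.0
        for believes in pattern:
            w *= p if believes else 1 - p
        brute_force += w

print(round(closed_form, 6), round(brute_force, 6))
```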
11. Suppose the number of visitors arriving at a zoo can be modeled to be Poisson dis-
tributed. On an average 20 visitors arrive per hour. Let X be the number of visitors
arriving from 2pm to 4pm. Then the probability that at least 35 visitors will arrive in
the given duration is
a) $\sum_{k=35}^{\infty} \frac{e^{-20} (20)^{k}}{k!}$
b) $1 - \sum_{k=0}^{34} \frac{e^{-20} (20)^{k}}{k!}$
c) $\sum_{k=35}^{\infty} \frac{e^{-40} (40)^{k}}{k!}$
d) $1 - \sum_{k=0}^{34} \frac{e^{-40} (40)^{k}}{k!}$
Solution:
Given that on an average 20 visitors arrive per hour and X is the number of visitors
arriving from 2pm to 4pm. So, here λ = 20 × 2 = 40
Now we have to find the probability that at least 35 visitors will arrive in the given
duration, that is from 2pm to 4pm.
Also we can write
Statistics for Data Science - 2
Week 1 Graded Assignment
Multiple random variables
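        X = 0   X = 1
Y = 1    1/4     1/8
Y = 2     k      1/4
Y = 3     0      1/8
\[ \Rightarrow \frac14 + \frac18 + \frac14 + k + 0 + \frac18 = 1 \]
\[ \Rightarrow k = 1 - \frac34 = \frac14 \]
Now,
\[ f_{Y|X=1}(2) = \frac{f_{XY}(1,2)}{f_X(1)} = \frac{f_{XY}(1,2)}{f_{XY}(1,1) + f_{XY}(1,2) + f_{XY}(1,3)} = \frac{\frac14}{\frac18 + \frac14 + \frac18} = \frac{\frac14}{\frac12} = \frac12 \]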
2. Customers at a fast-food restaurant buy both sandwiches and drinks. The following
joint distribution summarizes the numbers of sandwiches (X) and drinks (Y ) purchased
by customers.
        X = 1   X = 2
Y = 1    0.4     0.2
Y = 2    0.1     0.25
Y = 3     0      0.05
Find the probability that a customer will buy two sandwiches given that he has bought
three drinks.
Solution:
X denotes the number of sandwiches purchased by a customer and Y denotes the number of drinks purchased by a customer.
To find: $f_{X|Y=3}(2)$
Now,
\[ f_{X|Y=3}(2) = \frac{f_{XY}(2,3)}{f_Y(3)} = \frac{f_{XY}(2,3)}{f_{XY}(1,3) + f_{XY}(2,3)} = \frac{0.05}{0 + 0.05} = 1 \]
3. A fair coin is tossed 4 times. Let X be the total number of heads and Y be the number
of heads before the first tail (If there is no tail in all the four tosses, then Y = 4). What
is the value of fY |X=2 (0)? [2 marks]
(a) $\dfrac{5}{16}$
(b) $\dfrac{1}{8}$
(c) $\dfrac{9}{16}$
(d) $\dfrac{1}{2}$
Solution:
A fair coin is tossed four times. X denotes the number of heads and Y denotes the
number of heads before first tail (If there is no tail in all the four tosses, then Y = 4).
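Clearly, $X \sim \text{Binomial}(4, \tfrac12)$.
Now,
\[ f_{Y|X=2}(0) = \frac{f_{XY}(2,0)}{f_X(2)} = \frac{f_{X|Y=0}(2)\, f_Y(0)}{f_X(2)} \quad \ldots(1) \]
The event Y = 0 means that there is no head before the first tail, that is, the first outcome is a tail.
It implies that $f_Y(0) = \tfrac12$.
Given Y = 0, the remaining three tosses must contain exactly two heads, so $f_{X|Y=0}(2) = {}^{3}C_{2}\left(\tfrac12\right)^{3}$.
And $f_X(2) = {}^{4}C_{2}\left(\tfrac12\right)^{4}$.
Putting the values in equation (1), we get
\[ f_{Y|X=2}(0) = \frac{{}^{3}C_{2}\left(\frac12\right)^{3} \cdot \frac12}{{}^{4}C_{2}\left(\frac12\right)^{4}} = \frac36 = \frac12 \]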
Solution:
We know that
\[ f_{X|(Y=y,Z=z)}(x) = \frac{f_{XYZ}(x,y,z)}{f_{YZ}(y,z)} \]
\[ \Rightarrow f_{XYZ}(x,y,z) = f_{X|(Y=y,Z=z)}(x)\, f_{YZ}(y,z) \]
Hence, option (a) is correct and option (b) is incorrect.
$f_{XY}(x,y) = f_X(x)\, f_Y(y)$ is true only when X and Y are independent. Therefore, option (d) need not always be true.
5. Two random variables X and Y are jointly distributed with joint pmf
\[ \Rightarrow f_{XY}(0,0) + f_{XY}(0,1) + f_{XY}(0,2) + f_{XY}(0,3) + f_{XY}(1,0) + f_{XY}(1,1) + f_{XY}(1,2) + f_{XY}(1,3) + f_{XY}(2,0) + f_{XY}(2,1) + f_{XY}(2,2) + f_{XY}(2,3) = 1 \]
Now, using the given condition,
\[ P(X \ge 1, Y \le 2) = \frac47 \]
\[ \Rightarrow P(X=1,Y=0) + P(X=1,Y=1) + P(X=1,Y=2) + P(X=2,Y=0) + P(X=2,Y=1) + P(X=2,Y=2) = \frac47 \]
\[ \Rightarrow ab + ab + a + ab + 2a + 2ab + 2ab + a + 2ab + 2a = \frac47 \]
\[ \Rightarrow 6a + 9ab = \frac47 \quad \ldots(2) \]
6. Akshat draws a card randomly from a well-shuffled pack of 52 cards. If the drawn card
is a face card, then he draws two balls randomly from bag A which contains 5 Red, 6
Black and 4 Green balls. If the drawn card is not a face card, then he draws three balls
randomly from bag B which contains 7 Red, 8 Black and 5 Green balls. Let two random variables X and Y be defined as follows:
\[ X = \begin{cases} 0 & \text{if the drawn card is a face card} \\ 1 & \text{if the drawn card is not a face card} \end{cases} \]
and Y be the number of Red balls drawn. Find the value of fY (1). Write your answer
correct up to two decimal places.
Solution:
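Akshat draws a card randomly from a well-shuffled pack of 52 cards. Random variable X is defined as
\[ X = \begin{cases} 0 & \text{if the drawn card is a face card} \\ 1 & \text{if the drawn card is not a face card} \end{cases} \]
If the drawn card is a face card, then he draws two balls randomly from bag A which contains 5 Red, 6 Black and 4 Green balls. If the drawn card is not a face card, then he draws three balls randomly from bag B which contains 7 Red, 8 Black and 5 Green balls. Random variable Y is the number of Red balls drawn.
To find: $f_Y(1)$
We know that
\[ f_Y(1) = P(X=0)\,P(Y=1 \mid X=0) + P(X=1)\,P(Y=1 \mid X=1) \]
Since 12 of the 52 cards are face cards,
\[ f_Y(1) = \frac{12}{52} \cdot \frac{{}^{5}C_{1}\,{}^{10}C_{1}}{{}^{15}C_{2}} + \frac{40}{52} \cdot \frac{{}^{7}C_{1}\,{}^{13}C_{2}}{{}^{20}C_{3}} \approx 0.48 \]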
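The law of total probability used for $f_Y(1)$ in question 6 can be evaluated exactly. This is a minimal sketch, assuming the standard count of 12 face cards in a 52-card pack; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

p_face = Fraction(12, 52)  # P(X = 0): assumes 12 face cards (J, Q, K in 4 suits)

# P(Y = 1 | X = 0): exactly one red among 2 balls from bag A (5 red, 10 non-red)
p_one_red_bag_a = Fraction(comb(5, 1) * comb(10, 1), comb(15, 2))

# P(Y = 1 | X = 1): exactly one red among 3 balls from bag B (7 red, 13 non-red)
p_one_red_bag_b = Fraction(comb(7, 1) * comb(13, 2), comb(20, 3))

f_Y_1 = p_face * p_one_red_bag_a + (1 - p_face) * p_one_red_bag_b
print(f_Y_1, round(float(f_Y_1), 2))
```

Using `Fraction` keeps the arithmetic exact, so rounding happens only once at the end.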
7. Three fair coins are tossed. If the first head occurs on the first toss, you score 1 point.
If the first head occurs on toss 2 or on toss 3, you score 2 or 3 points, respectively. If
no heads appear, you lose 1 point (that is score −1 point). Let X denote the number
of heads and Y denote the points scored. What is the probability that fewer than three
heads will occur and you will score 1 or less? Write your answer correct to two decimal
places.
Solution:
Given that X denotes the number of heads and Y denotes the point scored.
Clearly, TX = {0, 1, 2, 3} and TY = {−1, 1, 2, 3}.
To find: P (X < 3, Y ≤ 1).
Outcome   X    Y
HHH       3    1
HHT       2    1
HTH       2    1
THH       2    2
HTT       1    1
THT       1    2
TTH       1    3
TTT       0   −1
The outcomes HHT, HTH, HTT and TTT correspond to the event (X < 3, Y ≤ 1).
Therefore, since all 8 outcomes are equally likely,
\[ P(X < 3, Y \le 1) = \frac48 = 0.5 \]
8. Contracts for two construction jobs are each assigned uniformly at random to one or
more of three firms, A, B, and C. Let X denote the number of contracts assigned to firm
A and Y the number of contracts assigned to firm B. Find the value of fX|Y =0 (2). Write
your answer correct to two decimal places.
Solution:
Given that X denotes the number of contracts assigned to firm A and Y denotes the
number of contracts assigned to firm B.
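Since each job is randomly assigned to one or more of the three firms, the probability of assigning one job to any of the three firms is $\tfrac13$. (Notice that one firm can be assigned either 0 or 1 or 2 jobs.)
Clearly, $T_X = T_Y = \{0, 1, 2\}$. Therefore,
\[ P(X=0, Y=0) = P(\text{both jobs go to firm C}) = \frac13 \times \frac13 = \frac19 \]
Similarly,
\[ P(X=1, Y=0) = P(\text{one job goes to A and the other to C}) = 2 \times \frac13 \times \frac13 = \frac29 \]
and
\[ P(X=2, Y=0) = P(\text{both jobs go to firm A}) = \frac19 \]
Therefore,
\[ P(Y=0) = P(X=0,Y=0) + P(X=1,Y=0) + P(X=2,Y=0) = \frac19 + \frac29 + \frac19 = \frac49 \]
Now,
\[ f_{X|Y=0}(2) = \frac{P(X=2, Y=0)}{P(Y=0)} = \frac{1/9}{4/9} = \frac14 \]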
9. A fair coin is tossed five times, and the number of heads, N , is counted. The coin is
then tossed N more times. Find the probability that heads will appear for a total of
four times in this process. Write your answer correct to two decimal places.
Solution:
Given that N denotes the number of heads in five tosses of a coin.
Clearly, N ∼ Binomial(5, 1/2).
It implies that $P(N = n) = {}^{5}C_{n}\left(\tfrac12\right)^{5}$ for $n = 0, 1, \ldots, 5$.
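One way to finish question 9 is to condition on N: if N = n heads appear in the first five tosses, then 4 − n heads are needed in the n additional tosses. The sketch below sums over n under that assumption; `binom_pmf` is a hypothetical helper, not something defined in the original solution.

```python
from fractions import Fraction
from math import comb

def binom_pmf(k: int, n: int, p: Fraction) -> Fraction:
    """P(K = k) for K ~ Binomial(n, p); zero when k is outside 0..n."""
    if k < 0 or k > n:
        return Fraction(0)
    return comb(n, k) * p**k * (1 - p)**(n - k)

half = Fraction(1, 2)

# P(total heads = 4) = sum over n of P(N = n) * P(4 - n heads in n more tosses)
p_total_four = sum(
    binom_pmf(n, 5, half) * binom_pmf(4 - n, n, half) for n in range(6)
)
print(p_total_four, round(float(p_total_four), 2))
```

Only n = 2, 3, 4 contribute, because for the other values 4 − n lies outside the range 0..n.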
10. From a group of three members of party A, two members of party B, and one member
of party C, a committee of two people is to be selected uniformly at random. Let X
denote the number of party A members and Y denote the number of party B members
on the committee. Find the value of fXY (1, 1).
Solution:
Given that X denotes the number of party A members in the selected two-member committee and Y denotes the number of party B members in the selected two-member committee.
\[ f_{XY}(1,1) = P(X=1, Y=1) = \frac{{}^{3}C_{1}\cdot{}^{2}C_{1}}{{}^{6}C_{2}} = \frac{3 \times 2}{15} = 0.4 \]
Statistics for Data Science - 2
1. Let X and Y be two random variables with joint distribution given in Table 3.1.P, where
a and b are two unknown values.
        X = 0   X = 1   X = 2
Y = 0   1/12    3/12     a
Y = 1   2/12     b      1/12
Y = 2   3/12    1/12    1/12
i) Find P(Y = 1).
a) $\dfrac{4}{12}$
b) $\dfrac{3}{12}$
c) $\dfrac{5}{12}$
d) $\dfrac{1}{12}$
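Solution:
We know that $\sum_{x \in T_X,\, y \in T_Y} f_{XY}(x,y) = 1$
\[ \Rightarrow \frac{1}{12} + \frac{3}{12} + a + \frac{2}{12} + b + \frac{1}{12} + \frac{3}{12} + \frac{1}{12} + \frac{1}{12} = 1 \]
\[ \Rightarrow a + b = 0 \]
Since a and b cannot take negative values ⇒ a = b = 0.
Now,
\[ P(Y=1) = \sum_{x \in T_X} f_{XY}(x,1) = \frac{2}{12} + b + \frac{1}{12} = \frac{3}{12} + 0 = \frac{3}{12} \]
\[ P(Y=1 \mid X=2) = \frac{P(Y=1, X=2)}{P(X=2)} = \frac{\frac{1}{12}}{a + \frac{1}{12} + \frac{1}{12}} = \frac12 \]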
Solution:
\[ P(X=0, Y \ge 1) = P(X=0,Y=1) + P(X=0,Y=2) = \frac{2}{12} + \frac{3}{12} = \frac{5}{12} \]
2. Let X ∼ Uniform({1, 2, 3, 4, 5, 6}) and let Y be the number of times 2 occurs in X throws
of a fair die. Choose the incorrect option(s) among the following.
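a) $P(Y=2 \mid X=2) = \dfrac{1}{6}$
b) $P(Y=2 \mid X=4) = \dfrac{5^{2}}{6^{3}}$
c) $P(Y=5 \mid X=6) = \dfrac{5}{6^{5}}$
d) $P(Y=6 \mid X=5) = \dfrac{5}{6^{6}}$
Solution:
Given X = 4, $Y \sim \text{Binomial}(4, 1/6)$, so
\[ P(Y=2 \mid X=4) = {}^{4}C_{2}\left(\frac16\right)^{2}\left(\frac56\right)^{2} = \frac{5^{2}}{6^{3}} \]
Given X = 6, $Y \sim \text{Binomial}(6, 1/6)$, so
\[ P(Y=5 \mid X=6) = {}^{6}C_{5}\left(\frac16\right)^{5}\left(\frac56\right)^{1} = \frac{5}{6^{5}} \]
Given X = 5, $Y \sim \text{Binomial}(5, 1/6)$, so Y cannot exceed 5 and
\[ P(Y=6 \mid X=5) = 0 \]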
3. Let the random variables X and Y each have range {1, 2, 3}. The following formula
gives the joint PMF
\[ P(X=i, Y=j) = \frac{i + 2j}{c}, \]
where c is an unknown value. Find P(1 ≤ X ≤ 3, 1 < Y ≤ 3).
a) $\dfrac59$
b) $\dfrac79$
c) $\dfrac29$
d) $\dfrac49$
Solution:
We know that $\sum_{x \in T_X,\, y \in T_Y} P(X=x, Y=y) = 1$
\[ \Rightarrow P(X=1,Y=1) + P(X=1,Y=2) + P(X=1,Y=3) + P(X=2,Y=1) + P(X=2,Y=2) + P(X=2,Y=3) + P(X=3,Y=1) + P(X=3,Y=2) + P(X=3,Y=3) = 1 \]
\[ \Rightarrow \frac3c + \frac5c + \frac7c + \frac4c + \frac6c + \frac8c + \frac5c + \frac7c + \frac9c = 1 \]
\[ \Rightarrow c = 54 \]
Now,
\[ P(1 \le X \le 3, 1 < Y \le 3) = P(X=1,Y=2) + P(X=1,Y=3) + P(X=2,Y=2) + P(X=2,Y=3) + P(X=3,Y=2) + P(X=3,Y=3) \]
\[ \Rightarrow P(1 \le X \le 3, 1 < Y \le 3) = \frac1c\,[5 + 7 + 6 + 8 + 7 + 9] = \frac{42}{54} = \frac79 \]
4. The joint PMF of the random variables X and Y is given in Table 1.2.P.
        X = 1   X = 2   X = 3
Y = 1     k       k      2k
Y = 2    2k       0      4k
Y = 3    3k       k      6k
Consider the random variable $Z = X^{2}Y$.
i) Find the range of Z | Y = 2.
a) {1, 4, 9}
b) {4, 8, 18}
c) {1, 9}
d) {2, 18}
e) {2, 8, 18}
Solution:
We know that $\sum_{x \in T_X,\, y \in T_Y} P(X=x, Y=y) = 1$
\[ \Rightarrow k + k + 2k + 2k + 0 + 4k + 3k + k + 6k = 1 \]
\[ \Rightarrow k = \frac{1}{20} \]
When Y = 2, P(X = 2, Y = 2) = 0. So for the range we will not consider the pair (2, 2).
Since $Z = X^{2}Y$, the range of Z | Y = 2 will be $\{1^{2} \times 2,\ 3^{2} \times 2\}$, which is equal to {2, 18}.
ii) Find the value of P(Z = 18 | Y = 2).
a) $\dfrac13$
b) $\dfrac23$
c) $\dfrac34$
d) $\dfrac14$
Solution:
\[ P(Z=18 \mid Y=2) = \frac{P(Z=18, Y=2)}{P(Y=2)} = \frac{P(X=3, Y=2)}{P(X=1,Y=2) + P(X=3,Y=2)} = \frac{4k}{2k + 4k} = \frac23 \]
5. From a sack of fruits containing 3 mangoes, 2 kiwis, and 3 guavas, a random sample of
4 pieces of fruit is selected. If X is the number of mangoes and Y is the number of kiwis
in the sample, then find the joint probability distribution of X and Y .
a)
        X = 0   X = 1   X = 2   X = 3
Y = 0     0     3/70    9/70    3/70
Y = 1   2/70   18/70    2/70   18/70
Y = 2   3/70    9/70    3/70     0
b)
        X = 0   X = 1   X = 2   X = 3
Y = 0     0     3/70    9/70    3/70
Y = 1   2/70   18/70   18/70    2/70
Y = 2   3/70    9/70    3/70     0
c)
        X = 0   X = 1   X = 2   X = 3
Y = 0     0     3/70    9/70    3/70
Y = 1   2/70   18/70   18/70    2/70
Y = 2   9/70    3/70    3/70     0
d)
        X = 0   X = 1   X = 2   X = 3
Y = 0     0     3/70    3/70    9/70
Y = 1   2/70   18/70   18/70    2/70
Y = 2   3/70    9/70    3/70     0
Solution:
X is the number of mangoes and Y is the number of kiwis in the sample. The number of mangoes and kiwis in the sack is 3 and 2, respectively.
So X will take values in {0, 1, 2, 3} and Y will take values in {0, 1, 2} when the random sample of 4 pieces is selected.
P(X = 0, Y = 0) = P(no mango and no kiwi) = 0 (not possible, since there are only 3 guavas and 4 pieces are selected)
\[ P(X=0, Y=1) = P(\text{no mango and one kiwi}) = \frac{{}^{2}C_{1}\,{}^{3}C_{3}}{{}^{8}C_{4}} = \frac{2}{70} \]
\[ P(X=0, Y=2) = P(\text{no mango and two kiwis}) = \frac{{}^{2}C_{2}\,{}^{3}C_{2}}{{}^{8}C_{4}} = \frac{3}{70} \]
\[ P(X=1, Y=0) = P(\text{one mango and no kiwi}) = \frac{{}^{3}C_{1}\,{}^{3}C_{3}}{{}^{8}C_{4}} = \frac{3}{70} \]
\[ P(X=1, Y=1) = P(\text{one mango and one kiwi}) = \frac{{}^{3}C_{1}\,{}^{2}C_{1}\,{}^{3}C_{2}}{{}^{8}C_{4}} = \frac{18}{70} \]
\[ P(X=1, Y=2) = P(\text{one mango and two kiwis}) = \frac{{}^{3}C_{1}\,{}^{2}C_{2}\,{}^{3}C_{1}}{{}^{8}C_{4}} = \frac{9}{70} \]
\[ P(X=2, Y=0) = P(\text{two mangoes and no kiwi}) = \frac{{}^{3}C_{2}\,{}^{3}C_{2}}{{}^{8}C_{4}} = \frac{9}{70} \]
\[ P(X=2, Y=1) = P(\text{two mangoes and one kiwi}) = \frac{{}^{3}C_{2}\,{}^{2}C_{1}\,{}^{3}C_{1}}{{}^{8}C_{4}} = \frac{18}{70} \]
\[ P(X=2, Y=2) = P(\text{two mangoes and two kiwis}) = \frac{{}^{3}C_{2}\,{}^{2}C_{2}}{{}^{8}C_{4}} = \frac{3}{70} \]
Similarly, you can check the remaining values.
Answer: b
6. Suppose you flip a fair coin. If the coin lands heads, you roll a fair six-sided die 50 times.
If the coin lands tails, you roll the die 51 times. Let X be 1 if the coin lands heads and
0 if the coin lands tails. Let Y be the total number of times you get the number 5 while
throwing the dice. Find P (X = 1|Y = 10).
a) $\dfrac{85}{157}$
b) $\dfrac{82}{167}$
c) $\dfrac{72}{157}$
d) $\dfrac{85}{167}$
Solution:
\[ \Rightarrow P(X=1 \mid Y=10) = \frac{246}{255} \times P(X=0 \mid Y=10) \]
\[ \Rightarrow P(X=1 \mid Y=10) + \frac{255}{246}\, P(X=1 \mid Y=10) = 1 \]
\[ \Rightarrow P(X=1 \mid Y=10) = \frac{246}{501} = \frac{82}{167} \]
7. Three balls are selected at random from a box containing five red, four blue, three yellow
and six green coloured balls. If X, Y and Z are the number of red balls, blue balls and
green balls respectively, choose the correct option(s) among the following.
a) $P(X=1, Y=0, Z=2) = \dfrac{25}{272}$
b) $P(X=1, Y=1, Z=1) = \dfrac{5}{34}$
c) $P(X=1, Y=0 \mid Z=2) = \dfrac{1}{4}$
d) $P(X=0, Y=0, Z=3) = \dfrac{5}{204}$
Solution:
\[ P(X=1, Y=0, Z=2) = P(\text{one red ball and 2 green balls}) = \frac{{}^{5}C_{1}\,{}^{6}C_{2}}{{}^{18}C_{3}} = \frac{25}{272} \]
\[ P(X=1, Y=1, Z=1) = P(\text{one red ball, one blue ball and 1 green ball}) = \frac{{}^{5}C_{1}\,{}^{4}C_{1}\,{}^{6}C_{1}}{{}^{18}C_{3}} = \frac{5}{34} \]
\[ P(X=0, Y=0, Z=3) = P(\text{3 green balls}) = \frac{{}^{6}C_{3}}{{}^{18}C_{3}} = \frac{5}{204} \]
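And
\[ P(X=1, Y=0 \mid Z=2) = \frac{P(X=1, Y=0, Z=2)}{P(Z=2)} = \frac{{}^{5}C_{1}\,{}^{6}C_{2} \,/\, {}^{18}C_{3}}{{}^{6}C_{2}\,{}^{12}C_{1} \,/\, {}^{18}C_{3}} = \frac{{}^{5}C_{1}}{{}^{12}C_{1}} = \frac{5}{12} \]
(Given that exactly two of the three balls are green, the remaining ball is one of the 12 non-green balls, of which 5 are red.)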
8. A computer system receives messages over three communications lines. Let Xi be the
number of messages received on line i in one hour. Suppose that the joint pmf of X1 , X2 ,
and X3 is given by
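$f_{X_1X_2X_3}(x_1, x_2, x_3) = (1-p_1)(1-p_2)(1-p_3)\,p_1^{x_1}p_2^{x_2}p_3^{x_3}$ for $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$ and $0 < p_i < 1$.
i) Find $f_{X_1X_2}(x_1, x_2)$.
a) $(1-p_1)(1-p_2)$
b) $(1-p_1)(1-p_2)(1-p_3)\,p_1^{x_1}p_2^{x_2}$
c) $(1-p_1)(1-p_2)\,p_1^{x_1}p_2^{x_2}$
d) $p_1^{x_1}p_2^{x_2}$
Solution:
\[ f_{X_1X_2}(x_1, x_2) = \sum_{x_3=0}^{\infty} f_{X_1X_2X_3}(x_1, x_2, x_3) = \sum_{x_3=0}^{\infty} (1-p_1)(1-p_2)(1-p_3)\,p_1^{x_1}p_2^{x_2}p_3^{x_3} \]
\[ = (1-p_1)(1-p_2)\,p_1^{x_1}p_2^{x_2}\,(1-p_3)\sum_{x_3=0}^{\infty} p_3^{x_3} = (1-p_1)(1-p_2)\,p_1^{x_1}p_2^{x_2} \]
ii) Find $f_{X_2}(x_2)$.
a) $(1-p_1)$
b) $(1-p_1)(1-p_2)\,p_2^{x_2}$
c) $(1-p_1)\,p_1^{x_1}$
d) $(1-p_2)\,p_2^{x_2}$
Solution:
\[ f_{X_2}(x_2) = \sum_{x_1=0}^{\infty} f_{X_1X_2}(x_1, x_2) = \sum_{x_1=0}^{\infty} (1-p_1)(1-p_2)\,p_1^{x_1}p_2^{x_2} = (1-p_2)\,p_2^{x_2} \]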
P (Y = 2) = P (Y = 2 | X = 0)P (X = 0) + P (Y = 2 | X = 1)P (X = 1)
= 0 + 0.4 × 0.4
= 0.16
ii) Find P (X = 1). Enter your answer correct to two decimals accuracy.
P (X = 1) = 0.40
iii) Find P (X = 1, Y = 1). Enter your answer correct to two decimals accuracy.
P (X = 1, Y = 1) = P (Y = 1 | X = 1)P (X = 1)
= 0.6 × 0.4
= 0.24
10. Let $X_1, X_2, X_3 \sim f_{X_1X_2X_3}$ where $X_i \in \{-1, 1\}$ for each i. If $f_{X_1X_2X_3}(-1,-1,1) = \tfrac18$, $f_{X_2}(1) = \tfrac16$, $f_{X_3|X_2=-1}(1) = \tfrac15$, find $f_{X_1|X_2=-1,X_3=1}(-1)$. Enter your answer correct to two decimals accuracy.
Solution:
11. Suppose that the number of people who visit a yoga academy each day is a Poisson
random variable with mean 30. Suppose further that each person who visits is, indepen-
dently, a girl with probability 0.5 or a boy with probability 0.5. Find the joint probability
that exactly 10 boys and 15 girls visit the yoga academy on any given day.
a) $\dfrac{e^{-30}\,30^{25}}{15!\,10!}$
b) $\dfrac{e^{-15}\,30^{25}}{15!\,10!}$
c) $\dfrac{e^{-8}\,15^{25}}{15!\,10!}$
d) $\dfrac{e^{-30}\,15^{25}}{15!\,10!}$
Solution:
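Let X denote the number of boys who visit on a particular day.
Let Y denote the number of girls who visit on a particular day.
Let Z = X + Y be the total number of people who visit.
Observe that if $x + y \ne z$, then $P(X=x, Y=y \mid Z=z) = 0$.
Therefore
\[ P(X=10, Y=15) = P(X=10, Y=15 \mid Z=25)\,P(Z=25) = {}^{25}C_{10}\left(\frac12\right)^{25} \cdot \frac{e^{-30}\,30^{25}}{25!} = \frac{e^{-30}\,15^{25}}{15!\,10!} \]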
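The conditioning argument for question 11 can be checked against the product of two independent Poisson(15) probabilities, which is what splitting a Poisson(30) count with probability 0.5 produces. The sketch below compares three expressions; the names are illustrative and not part of the original solution.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(K = k) for K ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Conditioning on the total Z = 25: Binomial(25, 1/2) split times Poisson(30) total
via_conditioning = math.comb(25, 10) * 0.5**25 * poisson_pmf(25, 30)

# Boys and girls treated as independent Poisson(15) counts
via_product = poisson_pmf(10, 15) * poisson_pmf(15, 15)

# Closed form e^{-30} 15^{25} / (15! 10!)
closed_form = math.exp(-30) * 15**25 / (math.factorial(15) * math.factorial(10))

print(via_conditioning, via_product, closed_form)
```

All three values should coincide up to floating-point error.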
Statistics for Data Science - 2
Week 2 Graded Assignment
Multiple random variables
1. Let X and Y be two independent discrete random variables with CDFs FX and FY ,
respectively. Define another random variable Z = min(X, Y ), then the CDF of Z is
a) min(FX , FY )
b) FX FY
c) FX + FY + FX FY
d) FX + FY − FX FY
Solution:
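\[ F_Z(z) = P(Z \le z) = P(\min(X,Y) \le z) = 1 - P(\min(X,Y) > z) = 1 - P(X > z, Y > z) \]
Since X and Y are independent,
\[ F_Z(z) = 1 - (1 - F_X(z))(1 - F_Y(z)) = F_X(z) + F_Y(z) - F_X(z)F_Y(z) \]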
\[ f_Z(3) = P(Z=3) = P(X - Y = 3) = P(X=4,Y=1) + P(X=5,Y=2) + P(X=6,Y=3) \]
\[ = P(X=4)P(Y=1) + P(X=5)P(Y=2) + P(X=6)P(Y=3) = \frac16 \times \frac16 + \frac16 \times \frac16 + \frac16 \times \frac16 = \frac{3}{36} = \frac{1}{12} \]
a) p > 0.02
b) p < 0.04
c) p > 0.15
d) p < 0.30
e) p = 0.05
Solution:
If X ∼ Geometric(p) and Y ∼ Geometric(p) are two independent random variables and Z = X + Y, then
\[ P(Z = n) = (n-1)\,p^{2}(1-p)^{n-2} \]
(Try the derivation yourself.)
We have to find the values of p for which P(Z = 26) > P(Z = 25).
\[ P(Z=26) = (26-1)\,p^{2}(1-p)^{26-2} \quad \text{and} \quad P(Z=25) = (25-1)\,p^{2}(1-p)^{25-2} \]
\[ P(Z=26) > P(Z=25) \Rightarrow 25(1-p) > 24 \Rightarrow 1 - p > \frac{24}{25} \Rightarrow p < 0.04 \]
4. Each of the following options gives a joint PMF of the random variables X and Y. If the random variables X and Y are independent, then which of the following option(s) can be the joint PMF of X and Y?
a)
        Y = 0   Y = 1   Y = 2
X = 0   0.01      0       0
X = 1   0.09    0.09      0
X = 2     0       0     0.81
b)
        Y = 0   Y = 1   Y = 2
c)
        Y = 0   Y = 1   Y = 2
X = 0   1/12    1/24    1/24
X = 1   1/6     1/12    1/8
X = 2   1/4     1/8     1/12
d)
        Y = 0   Y = 1   Y = 2
X = 0   1/10    1/5     1/5
X = 1   1/10    1/10    3/10
e)
        Y = 0   Y = 1
X = 0   0.10    0.15
X = 1   0.20    0.30
X = 2   0.10    0.15
Answer: e
Solution:
In option a)
P(X = 0, Y = 1) = 0 but P(X = 0) = 0.01 + 0 + 0 = 0.01 and P(Y = 1) = 0 + 0.09 + 0 = 0.09
⇒ P(X = 0, Y = 1) ≠ P(X = 0)P(Y = 1)
Therefore, option (a) cannot be the joint PMF of X and Y.
In option b)
P(X = 0, Y = 0) = 0.06 but P(X = 0) = 0.06 + 0.18 + 0.12 = 0.36 and P(Y = 0) = 0.06 + 0.04 = 0.10
⇒ P(X = 0, Y = 0) = 0.06 ≠ 0.036 = P(X = 0)P(Y = 0)
Therefore, option (b) cannot be the joint PMF of X and Y.
In option c)
P(X = 1, Y = 0) = 1/6 but P(X = 1) = 1/6 + 1/12 + 1/8 = 3/8 and P(Y = 0) = 1/12 + 1/6 + 1/4 = 1/2
⇒ P(X = 1, Y = 0) = 1/6 ≠ 3/16 = P(X = 1)P(Y = 0)
Therefore, option (c) cannot be the joint PMF of X and Y.
In option d)
P(X = 0, Y = 1) = 1/5 but P(X = 0) = 1/10 + 1/5 + 1/5 = 1/2 and P(Y = 1) = 1/5 + 1/10 = 3/10
⇒ P(X = 0, Y = 1) = 1/5 ≠ 3/20 = P(X = 0)P(Y = 1)
Therefore, option (d) cannot be the joint PMF of X and Y.
In option e)
For every (x, y), P(X = x, Y = y) = P(X = x)P(Y = y) (check yourself)
Hence option (e) is the joint PMF of X and Y.
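The "check yourself" step for option (e) can be automated by comparing every joint probability with the product of its marginals. The following is a minimal sketch; the function name `is_independent` and the dictionary layout are illustrative assumptions, and the tables are transcribed from options (a) and (e).

```python
from fractions import Fraction
from itertools import product

def is_independent(joint: dict) -> bool:
    """True if P(X=x, Y=y) = P(X=x) P(Y=y) for every pair (x, y) in the grid."""
    xs = sorted({x for x, _ in joint})
    ys = sorted({y for _, y in joint})
    p_x = {x: sum(joint.get((x, y), 0) for y in ys) for x in xs}
    p_y = {y: sum(joint.get((x, y), 0) for x in xs) for y in ys}
    return all(joint.get((x, y), 0) == p_x[x] * p_y[y] for x, y in product(xs, ys))

# Option (e): keys are (x, y); Fractions avoid floating-point comparison issues
option_e = {
    (0, 0): Fraction(10, 100), (0, 1): Fraction(15, 100),
    (1, 0): Fraction(20, 100), (1, 1): Fraction(30, 100),
    (2, 0): Fraction(10, 100), (2, 1): Fraction(15, 100),
}

# Option (a): missing cells are treated as probability 0
option_a = {
    (0, 0): Fraction(1, 100), (1, 0): Fraction(9, 100),
    (1, 1): Fraction(9, 100), (2, 2): Fraction(81, 100),
}

print(is_independent(option_e), is_independent(option_a))
```

A single failing pair is enough to rule out independence, which is how options (a) to (d) are rejected in the solution.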
5. Let X and Y be two independent random variables such that X ∼ Bernoulli(0.2) and
Y ∼ Bernoulli(0.4). Let another random variable Z be defined as Z = X + Y . Find
the value of fX|Z=1 (1). Enter the answer correct to two decimal places.
Answer: [0.25, 0.29]
Solution:
X ∼ Bernoulli(0.2)
Y ∼ Bernoulli(0.4)
Z =X +Y
\[ f_{X|Z=1}(1) = \frac{P(X=1, Z=1)}{P(Z=1)} = \frac{P(X=1, Y=0)}{P(X=1,Y=0) + P(X=0,Y=1)} = \frac{P(X=1)P(Y=0)}{P(X=1)P(Y=0) + P(X=0)P(Y=1)} \]
\[ = \frac{0.2 \times 0.6}{(0.2 \times 0.6) + (0.8 \times 0.4)} = \frac{0.12}{0.44} \approx 0.27 \]
Solution:
First we will find the probability that $X_1 = 0$, $X_2 = 1$ and $X_3$ takes neither of the values 0 and 1.
Since the random variables are independent, we have
\[ P(X_1 = 0, X_2 = 1, X_3 \notin \{0,1\}) = P(X_1 = 0)\,P(X_2 = 1)\,P(X_3 \notin \{0,1\}) \]
Now, $P(X_1 = 0) = e^{-3}$, $P(X_2 = 1) = 3e^{-3}$
\[ P(X_3 \notin \{0,1\}) = 1 - P(X_3 \in \{0,1\}) = 1 - [P(X_3 = 0) + P(X_3 = 1)] = 1 - 4e^{-3} \]
Now, $P(X_1 = 0, X_2 = 1, X_3 \notin \{0,1\}) = e^{-3} \cdot 3e^{-3}\,(1 - 4e^{-3}) = 3e^{-6}(1 - 4e^{-3})$
The variable that equals 0 and the variable that equals 1 can be chosen in 3! = 6 ways.
Therefore, the probability that exactly one $X_i$ equals 0 and exactly one $X_i$ equals 1 is given by
\[ 6 \times 3e^{-6}(1 - 4e^{-3}) = 18e^{-6}(1 - 4e^{-3}) \]
7. Let X ∼ Bernoulli(0.3) and Y ∼ Bernoulli(0.5) be independent. Define Z = X+Y −XY ,
find the distribution of Z.
(a) Z ∼ Bernoulli(0.35)
(b) Z ∼ Bernoulli(0.65)
(c) Z ∼ Bernoulli(0.15)
(d) Z ∼ Bernoulli(0.85)
Solution:
Given X ∼ Bernoulli(0.3) and Y ∼ Bernoulli(0.5) are independent.
Z = X + Y − XY
Since X ∼ Bernoulli(0.3), it will take the values in {0, 1}.
Similarly Y will take values in {0, 1}.
Also X and Y are given to be independent, therefore fXY (x, y) = fX (x)fY (y) for all
(x, y).
Consider the following joint distribution table of X and Y .
X   Y   X + Y − XY   f_XY(x, y)
1   1       1        (0.3)(0.5) = 0.15
1   0       1        (0.3)(0.5) = 0.15
0   1       1        (0.7)(0.5) = 0.35
0   0       0        (0.7)(0.5) = 0.35
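The table for question 7 can be collapsed into the distribution of Z by adding the probabilities of all (x, y) pairs that give the same value of X + Y − XY. A minimal sketch of that enumeration follows; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

p_x = Fraction(3, 10)  # X ~ Bernoulli(0.3)
p_y = Fraction(1, 2)   # Y ~ Bernoulli(0.5)

pmf_z = {0: Fraction(0), 1: Fraction(0)}
for x, y in product((0, 1), repeat=2):
    # Independence: the joint probability is the product of the marginals
    prob = (p_x if x else 1 - p_x) * (p_y if y else 1 - p_y)
    pmf_z[x + y - x * y] += prob

print(pmf_z)
```

Z equals 0 only when both X and Y are 0, so P(Z = 1) = 1 − 0.35 = 0.65, which corresponds to Z ∼ Bernoulli(0.65).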
8. Let X and Y be independent and identically distributed Geometric random variables
with parameter 0.6.
(a) Find P (X = 2 | X + Y = 11). Enter the answer correct to one decimal place.
Answer: 0.1
Solution:
Given X, Y ∼ i.i.d. Geometric(0.6).
\[ P(X=2 \mid X+Y=11) = \frac{P(X=2, X+Y=11)}{P(X+Y=11)} = \frac{P(X=2)\,P(Y=9)}{\sum_{i=1}^{10} P(X=i, Y=11-i)} \]
\[ = \frac{(0.4)(0.6) \times (0.4)^{8}(0.6)}{\sum_{i=1}^{10} P(X=i)\,P(Y=11-i)} = \frac{(0.4)^{9}(0.6)^{2}}{10\,(0.4)^{9}(0.6)^{2}} = 0.1 \]
(b) Find P (X = 7 | X + Y = 11). Enter the answer correct to one decimal place.
Answer: 0.1
Solution:
This can also be solved using the above steps.
Remark: $P(X = i \mid X + Y = n) = \dfrac{1}{n-1}$ for $i = 1, 2, \ldots, n-1$.
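The remark can be checked for the two cases in question 8 by computing the conditional probability directly from the geometric pmf. This is a small sketch under the convention used in the solution, $P(X = k) = (1-p)^{k-1}p$ for k = 1, 2, ...; the helper names are illustrative.

```python
from fractions import Fraction

def geom_pmf(k: int, p: Fraction) -> Fraction:
    """P(X = k) for a Geometric(p) variable supported on 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p

def cond_prob(i: int, n: int, p: Fraction) -> Fraction:
    """P(X = i | X + Y = n) for i.i.d. Geometric(p) variables X and Y."""
    numerator = geom_pmf(i, p) * geom_pmf(n - i, p)
    denominator = sum(geom_pmf(j, p) * geom_pmf(n - j, p) for j in range(1, n))
    return numerator / denominator

p = Fraction(6, 10)
print(cond_prob(2, 11, p), cond_prob(7, 11, p))
```

Every term in the denominator equals $p^{2}(1-p)^{n-2}$, so the result does not depend on i or on p.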
9. The joint distribution of X and Y is given by
\[ f_{XY}(x, y) = \frac{9}{16 \times 4^{x+y}}, \]
where $T_X = T_Y = \{0, 1, 2, \ldots\}$.
Solution:
X and Y take values in {0, 1, 2, . . .}.
The range of X + Y is {0, 1, 2, . . .}.
X + Y    f_{X+Y}(k)
0        $\dfrac{9}{16}$
1        $2 \cdot \dfrac{9}{16 \times 4}$
2        $3 \cdot \dfrac{9}{16 \times 4^{2}}$
...      ...
k        $(k+1)\,\dfrac{9}{16 \cdot 4^{k}}$
Therefore, $P(X + Y = k) = (k+1)\,\dfrac{9}{16 \cdot 4^{k}}$
(ii) Find the probability mass function of Z = max{X, Y }.
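(a) $f_Z(k) = \dfrac{3(4^{k} - 1)}{2 \cdot 4^{2k}}$ for k = 1, 2, . . .
(b) $f_Z(k) = \dfrac{3(4^{k} - 1)}{2 \cdot 4^{2k}}$ for k = 0, 1, . . .
(c) $f_Z(k) = \dfrac{9}{16 \cdot 4^{2k}} + \dfrac{3(4^{k} - 1)}{2 \cdot 4^{2k}}$ for k = 0, 1, . . .
(d) $f_Z(k) = \dfrac{9}{16 \cdot 4^{2k}} + \dfrac{3(4^{k} - 1)}{2 \cdot 4^{2k}}$ for k = 1, 2, . . .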
Solution:
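The joint distribution of X and Y is given by
\[ f_{XY}(x, y) = \frac{9}{16 \times 4^{x+y}} \]
Let Z = max{X, Y}.
Clearly, range of Z will be $T_Z = \{0, 1, 2, \ldots\}$.
Z      f_Z(k)
0      $\dfrac{9}{16}$
1      $2 \cdot \dfrac{9}{16 \cdot 4} + \dfrac{9}{16 \cdot 4^{2}}$
2      $2 \cdot \dfrac{9}{16 \cdot 4^{2}} + 2 \cdot \dfrac{9}{16 \cdot 4^{3}} + \dfrac{9}{16 \cdot 4^{4}}$
...    ...
k      $2 \cdot \dfrac{9}{16 \cdot 4^{k}}\left[1 + \dfrac14 + \dfrac{1}{4^{2}} + \ldots + \dfrac{1}{4^{k-1}}\right] + \dfrac{9}{16 \cdot 4^{2k}}$
Now,
\[ f_Z(k) = 2 \cdot \frac{9}{16 \cdot 4^{k}}\left[1 + \frac14 + \frac{1}{4^{2}} + \ldots + \frac{1}{4^{k-1}}\right] + \frac{9}{16 \cdot 4^{2k}} \]
\[ = 2 \cdot \frac{9}{16 \cdot 4^{k}} \cdot \frac{1 - \left(\frac14\right)^{k}}{1 - \frac14} + \frac{9}{16 \cdot 4^{2k}} \]
\[ = \frac{9}{16 \cdot 4^{2k}} + \frac{3(4^{k} - 1)}{2 \cdot 4^{2k}} \]
Therefore, $f_Z(k) = \dfrac{9}{16 \cdot 4^{2k}} + \dfrac{3(4^{k} - 1)}{2 \cdot 4^{2k}}$ for k = 0, 1, . . .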
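The closed form for $f_Z(k)$ can be compared with a brute-force sum of the joint pmf over all pairs whose maximum is k. The sketch below does this for the first few values of k; the function names are illustrative.

```python
from fractions import Fraction

def joint_pmf(x: int, y: int) -> Fraction:
    """f_XY(x, y) = 9 / (16 * 4^(x+y)) for x, y = 0, 1, 2, ..."""
    return Fraction(9, 16 * 4 ** (x + y))

def pmf_max_bruteforce(k: int) -> Fraction:
    """P(max{X, Y} = k) by summing the joint pmf over the relevant pairs."""
    return sum(
        joint_pmf(x, y)
        for x in range(k + 1)
        for y in range(k + 1)
        if max(x, y) == k
    )

def pmf_max_formula(k: int) -> Fraction:
    """Closed form 9/(16*4^(2k)) + 3(4^k - 1)/(2*4^(2k)) from the solution."""
    return Fraction(9, 16 * 4 ** (2 * k)) + Fraction(3 * (4**k - 1), 2 * 4 ** (2 * k))

for k in range(6):
    print(k, pmf_max_bruteforce(k), pmf_max_formula(k))
```

At k = 0 the second term vanishes and the formula reduces to 9/16, matching the first row of the table.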
10. Let the random variables X and Y , which represent the number of calls received by call
centers A and B, respectively, in a one-hour interval follow the Poisson distribution. The
average number of calls received in call centers A and B is 2 per hour and 3 per hour,
respectively. Assume that X and Y are independent. If Z denotes the total number of
calls received in call centers A and B, find the conditional probability fY |Z=5 (3). Enter
the answer correct to two decimal places.
Answer: [0.33, 0.36]
Solution:
Given, X and Y denote the number of calls received by call center A and B in a one-hour
interval.
Also, X ∼ Poisson(2) and Y ∼ Poisson(3) are independent.
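For question 10, a sum of independent Poisson variables is again Poisson, so Z ∼ Poisson(5) and the conditional probability is P(X = 2)P(Y = 3)/P(Z = 5). The sketch below evaluates it and compares it with the equivalent Binomial(5, 3/5) probability; the helper name is illustrative.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(K = k) for K ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# f_{Y|Z=5}(3) = P(X = 2, Y = 3) / P(Z = 5), with Z = X + Y ~ Poisson(2 + 3)
cond_prob = poisson_pmf(2, 2) * poisson_pmf(3, 3) / poisson_pmf(5, 5)

# Equivalent view: given Z = 5, Y ~ Binomial(5, 3/5)
binom_prob = math.comb(5, 3) * (3 / 5) ** 3 * (2 / 5) ** 2

print(cond_prob, binom_prob)
```

The value falls inside the accepted range [0.33, 0.36] stated for this question.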
Statistics for Data Science - 2
Week 2 Practice Assignment
Multiple random variables
1. Consider an experiment of tossing a fair coin twice. Let X be the number of heads that
occurs in the two tosses and Y be the number of tails that occurs in the two tosses.
Based on the given information, choose the correct statements.
        X = 0   X = 1   X = 2
Y = 0     0       0      1/4
Y = 1     0      1/2      0
Y = 2    1/4      0       0
It is clear that
\[ f_{XY}(0,0) \ne f_X(0)\, f_Y(0) \]
It implies that X and Y are dependent random variables.
So, option (a) is incorrect and option (b) is correct.
\[ f_{Y|X=0}(1) = \frac{f_{XY}(0,1)}{f_X(0)} = 0 \quad (\text{since } f_{XY}(0,1) = 0) \]
So, option (d) is incorrect.
2. Two fair dice are thrown simultaneously. Let X be the outcome on the first die and Y
be the sum of the outcomes on both the dice. Find the value of P (Y − X ≥ 6).
(a) $\dfrac{1}{6}$
(b) $\dfrac{1}{12}$
(c) $\dfrac{5}{12}$
(d) $\dfrac{1}{24}$
Solution:
X denotes the outcome on the first die and Y denotes the sum of the outcomes on both
the dice.
Notice that Y − X will denote the outcome on the second die.
Let Z = Y − X, then Z ∼ Uniform({1, 2, 3, 4, 5, 6})
\[ P(Y - X \ge 6) = P(Z \ge 6) = P(Z = 6) = \frac16 \]
3. Let X and Y denote the number of cars and number of bikes reaching a street corner
during a certain 15-minute time period, respectively. Joint distribution of X and Y is
given as
\[ f_{XY}(x, y) = \frac{9}{16\,(4^{x+y})} \]
Choose the correct option(s).
(a) Marginal pmf of X is $f_X(x) = \dfrac{3}{4^{x+1}}$.
(b) Marginal pmf of X is $f_X(x) = \dfrac{3}{4^{x}}$.
(c) X and Y are independent random variables.
(d) X and Y are dependent random variables.
Solution:
X and Y denote the number of cars and number of bikes reaching a street corner during
a certain 15-minute time period, respectively.
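Range of X and Y will be $T_X = T_Y = \{0, 1, 2, \ldots\}$
Now,
\[ f_X(x) = \sum_{y=0}^{\infty} f_{XY}(x, y) = \sum_{y=0}^{\infty} \frac{9}{16\,(4^{x+y})} = \frac{9}{16 \cdot 4^{x}} \sum_{y=0}^{\infty} \frac{1}{4^{y}} = \frac{9}{16 \cdot 4^{x}}\left[1 + \frac14 + \frac{1}{4^{2}} + \ldots\right] \]
\[ = \frac{9}{16 \cdot 4^{x}} \cdot \frac{1}{1 - \frac14} = \frac{9}{4^{2} \cdot 4^{x}} \cdot \frac43 = \frac{3}{4^{x+1}} \]
By symmetry, $f_Y(y) = \dfrac{3}{4^{y+1}}$.
Now, choose two arbitrary points x and y in the range of X and Y, respectively, then
\[ f_X(x)\, f_Y(y) = \frac{3}{4^{x+1}} \cdot \frac{3}{4^{y+1}} \]
\[ \Rightarrow f_X(x)\, f_Y(y) = \frac{9}{16\,(4^{x+y})} \]
\[ \Rightarrow f_X(x)\, f_Y(y) = f_{XY}(x, y) \]
It implies that X and Y are independent random variables.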
Solution:
First we will find the probability that $X_1 = 0$, $X_2 = 1$ and the other two random variables take neither of the values 0 and 1.
Now,
\[ P(X_1 = 0) = \frac{e^{-4}\,4^{0}}{0!} = e^{-4} \]
\[ P(X_2 = 1) = \frac{e^{-4}\,4^{1}}{1!} = 4e^{-4} \]
\[ P(X_i \notin \{0, 1\}) = 1 - [e^{-4} + 4e^{-4}] = 1 - 5e^{-4} \]
The pair of $X_i$ for which exactly one $X_i$ equals 0 and exactly one $X_i$ equals 1 can be chosen in ${}^{4}P_{2}$ ways.
Therefore, the probability that exactly one of the $X_i$ equals 0 and exactly one of the $X_i$ equals 1 is given by
\[ {}^{4}P_{2}\; e^{-4}\,(4e^{-4})\,(1 - 5e^{-4})^{2} = 48e^{-8}(1 - 5e^{-4})^{2} \]
5. A person tosses a fair coin until it shows a head and rolls a fair die until it shows the
number six. Assume that tossing the coin is independent of rolling the die. What is the
probability that number of tosses to get first head is equal to number of rolls to get the
first six? Write your answer correct to two decimal places.
Solution
Let X denote the number of tosses of the coin to get the first head and let Y denote the
number of rolls of the die to get the first six.
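Clearly, $X \sim \text{Geometric}\left(\tfrac12\right)$ and $Y \sim \text{Geometric}\left(\tfrac16\right)$.
The number of tosses to get the first head will be equal to the number of rolls to get the first six if (X = 1, Y = 1) or (X = 2, Y = 2) or (X = 3, Y = 3) and so on. Therefore,
\[ P(X = Y) = P(X=1,Y=1) + P(X=2,Y=2) + P(X=3,Y=3) + \ldots = \sum_{i=1}^{\infty} P(X=i, Y=i) = \sum_{i=1}^{\infty} P(X=i)\,P(Y=i) \]
\[ = \sum_{i=1}^{\infty} \left(\frac12\right)^{i}\left(\frac56\right)^{i-1}\frac16 = \frac{1}{12}\sum_{i=1}^{\infty} \left(\frac12\right)^{i-1}\left(\frac56\right)^{i-1} = \frac{1}{12}\sum_{i=1}^{\infty} \left(\frac{5}{12}\right)^{i-1} \]
\[ = \frac{1}{12}\left[1 + \frac{5}{12} + \left(\frac{5}{12}\right)^{2} + \ldots\right] = \frac{1}{12} \cdot \frac{1}{1 - 5/12} = \frac17 \]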
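The geometric series for P(X = Y) in question 5 can be checked by summing a large number of terms and comparing with the closed form. The sketch below truncates the series at 200 terms, which is an assumed cutoff chosen only because the remaining terms are negligible.

```python
from fractions import Fraction

def geom_pmf(k: int, p: Fraction) -> Fraction:
    """P(X = k) for a Geometric(p) variable supported on 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p

p_head = Fraction(1, 2)  # fair coin
p_six = Fraction(1, 6)   # fair die

# Truncated sum of P(X = i) P(Y = i) over i = 1..200
partial_sum = sum(geom_pmf(i, p_head) * geom_pmf(i, p_six) for i in range(1, 201))

# Closed form of the series: (1/12) / (1 - 5/12)
closed_form = Fraction(1, 12) / (1 - Fraction(5, 12))

print(float(partial_sum), closed_form)
```

To two decimal places the answer is 0.14.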
6. Let X and Y be i.i.d. Geometric(p), where p ∈ [0, 1] is a constant. Find the value of
P (X = 6|X + Y = 10). Write your answer correct to two decimal places.
Solution:
Given that X and Y are i.i.d. Geometric(p).
To find: P (X = 6|X + Y = 10).
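\[ P(X=6 \mid X+Y=10) = \frac{P(X=6, X+Y=10)}{P(X+Y=10)} = \frac{P(X=6, Y=4)}{P(X+Y=10)} = \frac{P(X=6)\,P(Y=4)}{P(X+Y=10)} \quad \ldots(1) \]
Now,
\[ P(X+Y=10) = \sum_{i=1}^{9} P(X=i, Y=10-i) = \sum_{i=1}^{9} P(X=i)\,P(Y=10-i) = \sum_{i=1}^{9} (1-p)^{i-1}p\,(1-p)^{9-i}p \]
\[ = \sum_{i=1}^{9} p^{2}(1-p)^{8} = 9p^{2}(1-p)^{8} \quad \ldots(2) \]
Putting (2) in (1), and using $P(X=6)\,P(Y=4) = (1-p)^{5}p\,(1-p)^{3}p = p^{2}(1-p)^{8}$, we get
\[ P(X=6 \mid X+Y=10) = \frac{p^{2}(1-p)^{8}}{9p^{2}(1-p)^{8}} = \frac19 \approx 0.11 \]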
7. Two dice are rolled simultaneously. Let X denote the greatest outcome (if both the
outcomes are same, then X will be that outcome) and Y denote the lowest outcome (if
both the outcomes are same, then Y will be that outcome). Choose the correct options.
(a) X is uniformly distributed.
(b) Y is uniformly distributed.
(c) X and Y are independent random variables.
(d) X and Y are not independent.
(e) Joint distribution of X and Y is Uniform{1, 2, 3, 4, 5, 6}.
Solution:
Given that X denotes the greater outcome and Y denotes the lower outcome.
Clearly, $T_X = T_Y = \{1, 2, \ldots, 6\}$.
If x > y, $f_{XY}(x, y)$ = P(one of the two outcomes is x and the other one is y) $= 2 \times \dfrac{1}{36} = \dfrac{1}{18}$
If x = y, $f_{XY}(x, y)$ = P(both the outcomes are the same (x = y)) $= \dfrac{1}{36}$
        X = 1   X = 2   X = 3   X = 4   X = 5   X = 6
Y = 1   1/36    1/18    1/18    1/18    1/18    1/18
Y = 2     0     1/36    1/18    1/18    1/18    1/18
Y = 3     0       0     1/36    1/18    1/18    1/18
Y = 4     0       0       0     1/36    1/18    1/18
Y = 5     0       0       0       0     1/36    1/18
Y = 6     0       0       0       0       0     1/36
Option (a):
From the table, we know that $f_X(1) = \dfrac{1}{36}$, $f_X(2) = \dfrac{3}{36}$ and so on.
It implies that X is not uniformly distributed.
Hence, option (a) is incorrect.
Option (b):
From the table, we know that $f_Y(1) = \dfrac{11}{36}$, $f_Y(2) = \dfrac{9}{36}$ and so on.
It implies that Y is not uniformly distributed.
Hence, option (b) is incorrect.
(d) is correct.
Option (e):
It is clear from the table that the joint distribution of X and Y is not uniform.
\[ P(X=0 \mid Z=1) = \frac{P(X=0, |X+Y|=1)}{P(|X+Y|=1)} = \frac{P(X=0, |Y|=1)}{P(|X+Y|=1)} = \frac{P(X=0, Y=\pm 1)}{P(|X+Y|=1)} \]
\[ = \frac{P(X=0,Y=1) + P(X=0,Y=-1)}{P(X=0,Y=1) + P(X=0,Y=-1) + P(X=1,Y=0) + P(X=-1,Y=0)} = \frac{\frac19 + \frac19}{\frac19 + \frac19 + \frac19 + \frac19} = \frac12 \]
Solution:
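The joint distribution of X and Y is given by
\[ f_{XY}(x, y) = \frac{1}{2^{x+y}}, \]
Let Z = max{X, Y}.
Clearly, range of Z will be $T_Z = \{1, 2, 3, \ldots\}$.
\[ f_Z(k) = P(\max\{X, Y\} = k) = P(X=k, Y=k) + P(X=k, Y<k) + P(X<k, Y=k) \]
\[ = f_{XY}(k,k) + \sum_{i=1}^{k-1} f_{XY}(k,i) + \sum_{i=1}^{k-1} f_{XY}(i,k) \]
\[ = \frac{1}{2^{2k}} + \sum_{i=1}^{k-1} \frac{1}{2^{k+i}} + \sum_{i=1}^{k-1} \frac{1}{2^{k+i}} = \frac{1}{2^{2k}} + 2\sum_{i=1}^{k-1} \frac{1}{2^{k+i}} = \frac{1}{2^{2k}} + \frac{2}{2^{k}}\sum_{i=1}^{k-1} \frac{1}{2^{i}} \]
\[ = \frac{1}{2^{2k}} + \frac{1}{2^{k-1}}\left[\frac12 + \frac{1}{2^{2}} + \ldots + \frac{1}{2^{k-1}}\right] = \frac{1}{2^{2k}} + \frac{1}{2^{k}}\left[1 + \frac12 + \ldots + \frac{1}{2^{k-2}}\right] \]
\[ = \frac{1}{2^{2k}} + \frac{1}{2^{k}} \cdot \frac{1 - (1/2)^{k-1}}{1 - 1/2} = \frac{1}{2^{2k}} + \frac{2}{2^{k}} \cdot \frac{2^{k-1} - 1}{2^{k-1}} = \frac{1}{2^{2k}} + \frac{2^{k-1} - 1}{2^{2k-2}} \]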
10. Let X ∼ Binomial(5, 1/2) and Y ∼ Binomial(4, 1/4) be two independent random variables. Find the value of P(max{X, Y} = 1). Write your answer correct to two decimal places.
Solution:
Given that X ∼ Binomial(5, 1/2) and Y ∼ Binomial(4, 1/4) are two independent random variables.
To find: P (max{X, Y } = 1)
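One way to evaluate P(max{X, Y} = 1) is to use independence: max{X, Y} ≤ 1 means both X ≤ 1 and Y ≤ 1, so the required probability is P(X ≤ 1)P(Y ≤ 1) − P(X = 0)P(Y = 0). The sketch below follows that approach, which is an assumption about the intended method; the helper names are illustrative.

```python
from fractions import Fraction
from math import comb

def binom_pmf(k: int, n: int, p: Fraction) -> Fraction:
    """P(K = k) for K ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k: int, n: int, p: Fraction) -> Fraction:
    """P(K <= k) for K ~ Binomial(n, p)."""
    return sum(binom_pmf(j, n, p) for j in range(k + 1))

p_x, p_y = Fraction(1, 2), Fraction(1, 4)

# P(max = 1) = P(max <= 1) - P(max <= 0), using independence of X and Y
p_max_le_1 = binom_cdf(1, 5, p_x) * binom_cdf(1, 4, p_y)
p_max_le_0 = binom_cdf(0, 5, p_x) * binom_cdf(0, 4, p_y)
p_max_is_1 = p_max_le_1 - p_max_le_0

print(p_max_is_1, round(float(p_max_is_1), 2))
```

The same subtraction pattern works for any P(max{X, Y} = k) when X and Y are independent.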
Statistics for Data Science - 2
Week 3 Graded assignment solutions
1. Suppose 1 in 100 products that are coming out of a production line is defective. Suppose we randomly pick and keep aside products from the production line till the first defective item is obtained. Let the random variable X represent the number of products that are kept aside (Assume that the first defective item is also kept aside). Find Var(X).
(a) $\dfrac{1}{100}$
(b) $\dfrac{99}{100}$
(c) 100
(d) 9900
Solution:
The random variable X represents the number of products that are kept aside, up to and including the first defective item.
It is given that 1 out of 100 products is defective.
Therefore, $X \sim \text{Geometric}\left(\dfrac{1}{100}\right)$
Now,
\[ \text{Var}(X) = \frac{1-p}{p^{2}} = \frac{1 - \frac{1}{100}}{\left(\frac{1}{100}\right)^{2}} = 9900 \]
2. Two coins are tossed. The probabilities of occurrence of tail on the first and the second
coin are 0.6 and 0.4, respectively. If the random variable X represents the number of
heads obtained, find the expected value of X. (Enter the answer correct to 2 decimal
points).
Answer: 1
Solution:
Given,
Random variable X denotes the number of heads obtained after tossing the two coins.
Therefore, X will take the values in {0, 1, 2}.
Now,
\[ E(X) = \sum_{x \in T_X} x\,P(X = x) = 0 \cdot P(X=0) + 1 \cdot P(X=1) + 2 \cdot P(X=2) \]
P(X = 1) = P(H on first coin and T on second coin) + P(T on first coin and H on second coin)
= (0.4 × 0.4) + (0.6 × 0.6)
= 0.52
P(X = 2) = P(H on first coin and H on second coin) = 0.4 × 0.6 = 0.24
Therefore, E(X) = 0.52 + 2 × 0.24 = 1.
3. Let the two random variables X and Y be independent with means equal to 10 and
20, and variances equal to 2 and 4, respectively. Find the value of Var(XY ).
Hint: If X and Y are independent, X 2 and Y 2 are also independent.
Answer: 1208
Solution:
Mean and variance of X is 10 and 2, respectively.
Mean and variance of Y is 20 and 4, respectively.
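Using the hint, Var(XY) can be written as $E(X^{2})E(Y^{2}) - (E(X)E(Y))^{2}$ when X and Y are independent, with $E(X^{2}) = \text{Var}(X) + E(X)^{2}$. The following is a minimal sketch of that arithmetic; the variable names are illustrative.

```python
mean_x, var_x = 10, 2
mean_y, var_y = 20, 4

# Second moments from E(X^2) = Var(X) + E(X)^2
second_moment_x = var_x + mean_x**2
second_moment_y = var_y + mean_y**2

# Independence gives E(X^2 Y^2) = E(X^2) E(Y^2) and E(XY) = E(X) E(Y)
var_xy = second_moment_x * second_moment_y - (mean_x * mean_y) ** 2
print(var_xy)
```

The result agrees with the stated answer of 1208.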
4. Let X and Y be two independent discrete random variables. Define random variables
U and V as
\[ U = \frac{X - E(X)}{SD(X)}, \quad V = \frac{Y - E(Y)}{SD(Y)} \]
Find Cov(U, V ).
Answer: 0
Solution:
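Cov(U, V) = E(UV) − E(U)E(V).
Since U and V are the standardized forms of the random variables X and Y, respectively, E(U) = E(V) = 0. Therefore,
\[ \text{Cov}(U, V) = E(UV) = E\left[\frac{X - E(X)}{SD(X)} \cdot \frac{Y - E(Y)}{SD(Y)}\right] = \frac{1}{SD(X)\,SD(Y)}\, E[(X - E(X))(Y - E(Y))] \]
\[ = \frac{1}{SD(X)\,SD(Y)}\, E[XY - X E(Y) - Y E(X) + E(X)E(Y)] \]
\[ = \frac{1}{SD(X)\,SD(Y)}\,\big(E[XY] - E[X]E(Y) - E[Y]E(X) + E(X)E(Y)\big) = \frac{E[XY] - E(X)E(Y)}{SD(X)\,SD(Y)} \]
Since X and Y are independent, E[XY] = E(X)E(Y), and hence Cov(U, V) = 0.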
5. Using Markov’s inequality, find a bound on the probability that on a particular day,
the number of reservations will exceed 30.
(a) $P(X > 30) \le \dfrac{1}{4}$
(b) $P(X > 30) \ge \dfrac{1}{3}$
(c) $P(X > 30) \le \dfrac{10}{31}$
(d) $P(X > 30) > \dfrac{10}{31}$
Solution:
Random variable X represents the number of people who make a reservation in a restaurant. It is given that
\[ E(X) = 10 \]
Using Markov's inequality, we know that
\[ P(X \ge c) \le \frac{\mu}{c} \]
Therefore, $P(X > 30) = P(X \ge 31) \le \dfrac{10}{31}$.
Therefore, the correct option is (c).
6. Find a bound on the probability that on a particular day, number of reservations made
will lie in between 6 and 14 using Chebyshev’s inequality.
(a) $P(6 < X < 14) \le \dfrac{7}{8}$
(b) $P(6 < X < 14) \ge \dfrac{7}{8}$
(c) $P(6 < X < 14) > \dfrac{7}{8}$
(d) $P(6 < X < 14) \le \dfrac{1}{8}$
Solution:
\[ \Rightarrow k = \frac{4}{\sigma} \Rightarrow k^{2} = \frac{16}{2} = 8 \]
Therefore, $P(6 < X < 14) \ge 1 - \dfrac{1}{8} = \dfrac{7}{8}$
Hence, the correct option is (b).
7. The joint probability mass function of three discrete random variables X, Y and Z is
given as
\[ p(0, 1, 2) = p(0, 2, 3) = p(1, 0, -2) = \frac13 \]
Calculate Var(XY + 2Z).
(a) $\dfrac{52}{9}$
(b) $\dfrac{32}{9}$
(c) $\dfrac{80}{3}$
(d) $\dfrac{56}{3}$
Solution:
t1   t2   t3   t1 t2 + 2 t3   f_XYZ(t1, t2, t3)
0    1    2        4              1/3
0    2    3        6              1/3
1    0    −2      −4              1/3
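From the table, W = XY + 2Z takes the values 4, 6 and −4 with probability 1/3 each, so Var(W) follows from $E(W^{2}) - E(W)^{2}$. The sketch below carries out that computation exactly; the names `pmf` and `w` are illustrative.

```python
from fractions import Fraction

# Joint pmf of (X, Y, Z) as given in the question
pmf = {
    (0, 1, 2): Fraction(1, 3),
    (0, 2, 3): Fraction(1, 3),
    (1, 0, -2): Fraction(1, 3),
}

def w(x: int, y: int, z: int) -> int:
    """The transformed variable W = XY + 2Z."""
    return x * y + 2 * z

mean_w = sum(p * w(*point) for point, p in pmf.items())
second_moment_w = sum(p * w(*point) ** 2 for point, p in pmf.items())
var_w = second_moment_w - mean_w**2

print(mean_w, second_moment_w, var_w)
```

The variance obtained this way matches option (d).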
8. An urn contains 5 white balls and 5 red balls. 2 balls are selected at random. Let
X denote the number of red balls drawn and let Y denote the number of white balls
drawn. Find the correlation coefficient between X and Y .
(a) ρ(X, Y ) = 1
(b) ρ(X, Y ) = −1
(c) ρ(X, Y ) = 0
(d) ρ(X, Y ) = −0.5
Solution:
Two balls are selected at random from the urn containing 5 white and 5 red balls.
Random variable X represent the number of red balls drawn.
Therefore, X will take values in {0, 1, 2}.
Random variable Y represent the number of white balls drawn.
Therefore, Y will take values in {0, 1, 2}.
Joint probability distribution of X and Y is given by
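        X = 0   X = 1   X = 2
Y = 0     0       0     10/45
Y = 1     0     25/45     0
Y = 2   10/45     0       0
Now,
\[ E(X) = 0 \times \frac{10}{45} + 1 \times \frac{25}{45} + 2 \times \frac{10}{45} = 1 \]
Similarly, E(Y) = 1.
\[ E(X^{2}) = 0^{2} \times \frac{10}{45} + 1^{2} \times \frac{25}{45} + 2^{2} \times \frac{10}{45} = \frac{65}{45} \]
Similarly, $E(Y^{2}) = \dfrac{65}{45}$.
Now, $\text{Var}(X) = \text{Var}(Y) = \dfrac{65}{45} - (1)^{2} = \dfrac{20}{45}$
\[ E(XY) = (2 \times 0) \times \frac{10}{45} + (1 \times 1) \times \frac{25}{45} + (0 \times 2) \times \frac{10}{45} = \frac{25}{45} \]
\[ \rho(X, Y) = \frac{\text{Cov}(X, Y)}{SD(X)\,SD(Y)} = \frac{E(XY) - E(X)E(Y)}{\sqrt{\text{Var}(X)\,\text{Var}(Y)}} = \frac{\frac{25}{45} - 1}{\sqrt{\frac{20}{45} \times \frac{20}{45}}} = -1 \]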
9. Five students each from class 8, 9 and 10 have been nominated for the formation of
the school committee. The number of boys and girls who are selected from each of the
classes is given in Table 4.1.A.
If the committee comprises of two students from each class, find the expected number
of girls in the committee. (Enter the answer correct to 1 decimal point)
Answer: 2.8
Solution:
Let X1 represent the number of girls from class eight in the school committee.
Let X2 represent the number of girls from class nine in the school committee.
Let X3 represent the number of girls from class ten in the school committee.
We need to find E(X1 + X2 + X3 ).
We know that E(X1 + X2 + X3 ) = E(X1 ) + E(X2 ) + E(X3 ).
Since total number of girls selected from class eight is 2, therefore, the committee can
comprise of either 0 girl or 1 girl or 2 girls from class eight.
i.e. X1 will take values in {0, 1, 2}.
Now
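$P(X_1 = 0) = \frac{\binom{3}{2}}{\binom{5}{2}} = \frac{3}{10}$
$P(X_1 = 1) = \frac{\binom{3}{1} \times \binom{2}{1}}{\binom{5}{2}} = \frac{6}{10}$
$P(X_1 = 2) = \frac{\binom{2}{2}}{\binom{5}{2}} = \frac{1}{10}$
Therefore, $E(X_1) = 0 \times \frac{3}{10} + 1 \times \frac{6}{10} + 2 \times \frac{1}{10} = \frac{8}{10}$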
Similarly, total number of girls selected from class nine is 2, therefore, the committee
can comprise of either 0 girl or 1 girl or 2 girls from class nine.
i.e. X2 will take values in {0, 1, 2}, hence $E(X_2) = \frac{8}{10}$.
Total number of girls selected from class ten is 3 and we have to select 2 students from
each class, therefore, the committee can comprise of either 0 girl or 1 girl or 2 girls
from class ten.
i.e. X3 will take values in {0, 1, 2}.
Now
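$P(X_3 = 0) = \frac{\binom{2}{2}}{\binom{5}{2}} = \frac{1}{10}$
$P(X_3 = 1) = \frac{\binom{3}{1} \times \binom{2}{1}}{\binom{5}{2}} = \frac{6}{10}$
$P(X_3 = 2) = \frac{\binom{3}{2}}{\binom{5}{2}} = \frac{3}{10}$
Therefore, $E(X_3) = 0 \times \frac{1}{10} + 1 \times \frac{6}{10} + 2 \times \frac{3}{10} = \frac{12}{10}$
Now, $E(X_1 + X_2 + X_3) = \frac{8}{10} + \frac{8}{10} + \frac{12}{10} = 2.8$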
10. A share of a company costs ₹1000 today. Suppose today’s share price increases by
50% with probability 0.6 and decreases by 50% with probability 0.4. Independent of
today, suppose that tomorrow’s share price increases by 20% with probability 0.2, and
decreases by 30% with probability 0.8. If you decide to buy 3 shares today, find the
expected profit (in ₹) at the end of 2 days.
(a) -120
(b) 360
(c) 120
(d) -360
Solution:
The cost price of a share of the company is ₹1000.
Let the random variable X represent the price of the share at the end of 2 days.
The price can either go up by 50% with probability 0.6 or go down by 50% with
probability 0.4 on the first day.
Independent of today, the share price can either go up by 20% with probability 0.2 or
can go down by 30% with probability 0.8.
i.e. if the share price increases by 50% on the first day, the price of the share will
become ₹1500.
And the price of the share at the end of two days, if the share price then increases by 20%,
is ₹(1500 × 20/100 + 1500) = ₹1800 with probability (0.6 × 0.2) = 0.12.
Similarly, the price of the share at the end of two days, if the share price then decreases by
30%, is ₹(1500 − 1500 × 30/100) = ₹1050 with probability (0.6 × 0.8) = 0.48.
Again, if the share price decreases by 50% on the first day, the price of the share will
become ₹500.
And the price of the share at the end of two days, if the share price then increases by 20%,
is ₹(500 × 20/100 + 500) = ₹600 with probability (0.4 × 0.2) = 0.08.
Similarly, the price of the share at the end of two days, if the share price then decreases by
30%, is ₹(500 − 500 × 30/100) = ₹350 with probability (0.4 × 0.8) = 0.32.
P (X = 1800) = 0.12
P (X = 1050) = 0.48
P (X = 600) = 0.08
P (X = 350) = 0.32
Now,
E(X) = 1800 × 0.12 + 1050 × 0.48 + 600 × 0.08 + 350 × 0.32 = 880
The expected gain at the end of two days if you buy one share is ₹(880 − 1000) = −₹120.
Therefore, if you buy 3 shares of the company, the expected gain will be −₹360.
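The four-branch calculation above can be reproduced by enumerating the two independent daily moves. A small sketch, with the multipliers 3/2, 1/2, 6/5 and 7/10 standing for +50%, −50%, +20% and −30% (variable names are illustrative):

```python
from fractions import Fraction
from itertools import product

cost = 1000  # purchase price of one share
# (price multiplier, probability) for each day
day1 = [(Fraction(3, 2), Fraction(6, 10)), (Fraction(1, 2), Fraction(4, 10))]
day2 = [(Fraction(6, 5), Fraction(2, 10)), (Fraction(7, 10), Fraction(8, 10))]

# Independence of the two days lets the branch probabilities multiply.
expected_price = sum(
    cost * m1 * m2 * p1 * p2 for (m1, p1), (m2, p2) in product(day1, day2)
)
expected_profit_3_shares = 3 * (expected_price - cost)
print(expected_price, expected_profit_3_shares)  # 880 -360
```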
11. A lottery has 500 tickets out of which only 2 tickets contain prizes worth ₹500 and
₹1,000; the rest are worth ₹0. If one has bought 2 tickets, what will be his/her
expected gain (in ₹)?
Answer: 6
Solution:
In the lottery, only two tickets out of 500 contain prizes worth ₹500 and ₹1,000.
If one has bought two tickets, one can get prizes worth ₹0, ₹500, ₹1,000 or ₹1,500.
Let the random variable X represent the worth of the prizes of two tickets.
Therefore, X will take values in {0, 500, 1000, 1500}.
$P(X = 0) = P(\text{both tickets are worth ₹0}) = \frac{\binom{498}{2}}{\binom{500}{2}}$
$P(X = 500) = P(\text{one ticket is worth ₹0 and the other is worth ₹500}) = \frac{\binom{498}{1}\binom{1}{1}}{\binom{500}{2}}$
$P(X = 1000) = P(\text{one ticket is worth ₹0 and the other is worth ₹1000}) = \frac{\binom{498}{1}\binom{1}{1}}{\binom{500}{2}}$
$P(X = 1500) = P(\text{one ticket is worth ₹500 and the other is worth ₹1000}) = \frac{\binom{2}{2}}{\binom{500}{2}}$
$E(X) = 0 \times \frac{\binom{498}{2}}{\binom{500}{2}} + 500 \times \frac{\binom{498}{1}\binom{1}{1}}{\binom{500}{2}} + 1000 \times \frac{\binom{498}{1}\binom{1}{1}}{\binom{500}{2}} + 1500 \times \frac{\binom{2}{2}}{\binom{500}{2}}$
$= \frac{1}{\binom{500}{2}}\left[500 \times \binom{498}{1} + 1000 \times \binom{498}{1} + 1 \times 1500\right]$
$= \frac{1}{\binom{500}{2}}\left[249000 + 498000 + 1500\right]$
$= \frac{748500}{124750} = 6$
Therefore, the expected gain is ₹6.
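The same expectation can be confirmed by brute force over all pairs of tickets. This sketch assumes, as the solution does, that every pair of tickets is equally likely; the list `tickets` is an illustrative encoding of the 500 ticket values.

```python
from fractions import Fraction
from itertools import combinations

# 500 tickets: one worth 500, one worth 1000, the remaining 498 worth 0.
tickets = [500, 1000] + [0] * 498

pairs = list(combinations(range(len(tickets)), 2))  # all C(500, 2) purchases
total_prize = sum(tickets[i] + tickets[j] for i, j in pairs)
expected_gain = Fraction(total_prize, len(pairs))
print(len(pairs), expected_gain)  # 124750 6
```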
Statistics for Data Science - 2
Week 3 Practice Assignment
Expectation and variance
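1. If the expected value and variance of the Binomial random variable X are $\frac{5}{2}$ and $\frac{15}{8}$,
respectively, then find the value of P(X = 10).
(a) $\left(\frac{3}{4}\right)^{10}$
(b) $10\left(\frac{3}{4}\right)^{10}$
(c) $\left(\frac{1}{4}\right)^{10}$
(d) $10\left(\frac{1}{4}\right)^{10}$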
Solution: If X ∼ Binomial(n, p), then the expected value and variance of X are given by np
and np(1 − p), respectively.
Given that
$E[X] = np = \frac{5}{2}$ ...(1)
And
$\mathrm{Var}(X) = np(1 - p) = \frac{15}{8}$ ...(2)
Putting the value of np from equation (1) into equation (2), we get
$(1 - p) = \frac{3}{4} \Rightarrow p = \frac{1}{4}$.
Putting the value of p in equation (1), we get
n = 10
It implies that X ∼ Binomial(10, 1/4)
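The solution breaks off once n and p are identified. A minimal sketch of the remaining evaluation, assuming the standard Binomial PMF P(X = k) = C(n, k) p^k (1 − p)^(n − k):

```python
from fractions import Fraction
from math import comb

mean, var = Fraction(5, 2), Fraction(15, 8)

p = 1 - var / mean   # from np(1 - p) / np = 1 - p
n = mean / p         # from np = 5/2
n_int = int(n)

# Binomial PMF at k = 10
prob_10 = comb(n_int, 10) * p ** 10 * (1 - p) ** (n_int - 10)
print(p, n, prob_10)  # 1/4 10 1/1048576
```

Since 1/1048576 = (1/4)^10, this agrees with option (c) under the assumptions above.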
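2. X and Y are two independent geometric random variables with parameters $\frac{1}{2}$ and $\frac{1}{4}$,
respectively. Find the value of Var(X + 2Y).
Solution:
We know that if X ∼ Geometric(p), then $\mathrm{Var}(X) = \frac{1 - p}{p^2}$
Therefore, $\mathrm{Var}(X) = \frac{1 - \frac{1}{2}}{\frac{1}{4}} = 2$ ...(1)
$\mathrm{Var}(Y) = \frac{1 - \frac{1}{4}}{\frac{1}{16}} = 12$ ...(2)
Since X and Y are independent, Var(X + 2Y) = Var(X) + 4Var(Y) = 2 + 4 × 12 = 50.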
3. The number of spam messages (X) sent to a server in a day has Poisson distribution
with parameter λ = 21. Each spam message independently has a probability of p = 1/3
of not being detected by the spam filter. Let Y denote the number of spam messages
detected by the filter in a day. Calculate the expected value of X + Y.
Solution:
X denotes the number of spam messages sent to the server in a day and
X ∼ Poisson(21)
Each message is detected with probability 1 − 1/3 = 2/3, independently of the others.
Therefore, Y ∼ Poisson(21 × 2/3) = Poisson(14)
Hence, E[X + Y] = E[X] + E[Y] = 21 + 14 = 35.
4. Two random variables X and Y are jointly distributed with the joint pmf
$f_{XY}(x, y) = \frac{1}{9}(x + y)$,
where x and y are integers in 0 ≤ x ≤ 2 and 0 ≤ y ≤ 1. Let $Z = XY + Y^2$. Find the
expected value of Z.
(a) $\frac{1}{3}$
(b) $\frac{4}{3}$
(c) $\frac{2}{3}$
(d) $\frac{14}{9}$
Solution:
$E[Z] = E[XY + Y^2]$
$= \sum_{0 \le x \le 2;\, 0 \le y \le 1} (xy + y^2) f_{XY}(x, y)$
$= \frac{1}{9} \sum_{0 \le x \le 2;\, 0 \le y \le 1} (xy + y^2)(x + y)$
$= \frac{1}{9}(1 + 4 + 9)$
$= \frac{14}{9}$
5. The distribution of a certain company’s employees’ monthly salary has mean ₹60000 and
standard deviation ₹20000. The probability that a randomly selected employee from that
company has a salary either greater than or equal to ₹100000 or less than or equal to
₹20000 is:
(a) at least $\frac{1}{4}$
(b) at most $\frac{1}{4}$
(c) at least $\frac{1}{2}$
(d) at most $\frac{1}{2}$
Solution:
Let X denote the employees’ monthly salary.
Given that E[X] = µ = 60000 and SD = σ = 20000.
By Chebyshev's inequality with k = 2, $P(|X - \mu| \ge 2\sigma) \le \frac{1}{k^2} = \frac{1}{4}$, and the event |X − µ| ≥ 2σ is exactly X ≥ 100000 or X ≤ 20000.
Hence, the probability that a randomly selected employee from that company has a salary
either greater than or equal to ₹100000 or less than or equal to ₹20000 is at most $\frac{1}{4}$.
6. Two random variables X and Y are jointly distributed with the joint pmf
$f_{XY}(x, y) = \frac{1}{27}(xy + x + y + 1)$,
where x and y are integers in 0 ≤ x ≤ 1 and 1 ≤ y ≤ 3. Find the correlation coefficient
of X and Y.
Solution:
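$E[X] = \sum_{x \in T_X,\, y \in T_Y} x f_{XY}(x, y) = \frac{1}{27} \sum_{x \in T_X,\, y \in T_Y} x(xy + x + y + 1) = \frac{1}{27}(4 + 6 + 8) = \frac{18}{27} = \frac{2}{3}$
$E[Y] = \sum_{x \in T_X,\, y \in T_Y} y f_{XY}(x, y) = \frac{1}{27} \sum_{x \in T_X,\, y \in T_Y} y(xy + x + y + 1) = \frac{1}{27}(2 + 6 + 12 + 4 + 12 + 24) = \frac{60}{27} = \frac{20}{9}$
$E[XY] = \sum_{x \in T_X,\, y \in T_Y} xy f_{XY}(x, y) = \frac{1}{27} \sum_{x \in T_X,\, y \in T_Y} xy(xy + x + y + 1) = \frac{1}{27}(4 + 12 + 24) = \frac{40}{27}$
$\mathrm{Cov}(X, Y) = E[XY] - E[X]E[Y] = \frac{40}{27} - \frac{2}{3} \cdot \frac{20}{9} = 0$
We know that
$\text{Correlation coefficient} = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\mathrm{Var}(Y)}} = 0$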
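The three sums above are easy to mis-add by hand, so a short enumeration over the six support points may help as a check. The helper `expect` is an illustrative name, not part of the original solution.

```python
from fractions import Fraction

# Joint PMF f(x, y) = (xy + x + y + 1) / 27 on x in {0, 1}, y in {1, 2, 3}
pmf = {
    (x, y): Fraction(x * y + x + y + 1, 27)
    for x in (0, 1)
    for y in (1, 2, 3)
}
assert sum(pmf.values()) == 1  # sanity check: probabilities sum to one

def expect(g):
    """Expectation of g(X, Y) under the joint PMF."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())

e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)
e_xy = expect(lambda x, y: x * y)
cov = e_xy - e_x * e_y
print(e_x, e_y, e_xy, cov)  # 2/3 20/9 40/27 0
```

The zero covariance is no accident: xy + x + y + 1 = (x + 1)(y + 1), so the PMF factorises into a function of x times a function of y.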
7. Let X and Y be two independent random variables such that X ∼ Binomial(4, 1/2) and
Y ∼ Uniform({1, 2, 3}). Find the value of $\mathrm{Cov}(2X + Y, X + Y^2 X)$.
(a) 16.67
(b) 6.67
(c) 13.37
(d) 0
Solution:
Since X and Y are independent random variables, $(X^2, Y^2)$, $(X, Y^2)$, $(X, Y^3)$ are also
independent. It implies that
$E[X^2 Y^2] = E[X^2]E[Y^2]$
$E[Y^2 X] = E[Y^2]E[X]$
$E[XY^3] = E[X]E[Y^3]$
Therefore,
Now, X ∼ Binomial(4, 1/2)
Therefore, E[X] = np = 2
Var(X) = np(1 − p) = 1
$E[X^2] = \mathrm{Var}(X) + (E[X])^2 = np(1 - p) + (np)^2 = 1 + 4 = 5$
Therefore,
$\mathrm{Cov}(2X + Y, X + Y^2 X) = 2(1) + 2\left(\frac{70}{3} - \frac{56}{3}\right) + 24 - \frac{56}{3}$
$= 26 - \frac{28}{3}$
$= 16.67$
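Part of the expansion is missing from the text above, so an independent check by direct enumeration may be useful. This sketch builds the product PMF of X ∼ Binomial(4, 1/2) and Y ∼ Uniform({1, 2, 3}) and evaluates Cov(U, V) = E[UV] − E[U]E[V]; the names `u`, `v` and `expect` are illustrative.

```python
from fractions import Fraction
from math import comb

p_x = {k: Fraction(comb(4, k), 16) for k in range(5)}  # Binomial(4, 1/2)
p_y = {v: Fraction(1, 3) for v in (1, 2, 3)}           # Uniform({1, 2, 3})

def expect(g):
    """E[g(X, Y)] using independence: joint PMF = p_x * p_y."""
    return sum(p_x[x] * p_y[y] * g(x, y) for x in p_x for y in p_y)

def u(x, y):
    return 2 * x + y

def v(x, y):
    return x + y * y * x

cov = expect(lambda x, y: u(x, y) * v(x, y)) - expect(u) * expect(v)
print(cov, float(cov))  # 50/3, about 16.67
```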
X
0 1 2
Y
2 5
-1 0
17 17
1 2
0 0
17 17
3 4
1 0
17 17
Find the standard deviation of the product of the two random variables. (Write your
answer correct up to two decimal points.)
Solution:
To find: SD(XY )
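$E[XY] = \sum_{x \in T_X,\, y \in T_Y} xy f_{XY}(x, y) = -1\left(\frac{2}{17}\right) - 2\left(\frac{5}{17}\right) + 2\left(\frac{4}{17}\right) = \frac{-4}{17}$
$E[(XY)^2] = \sum_{x \in T_X,\, y \in T_Y} x^2 y^2 f_{XY}(x, y) = 1\left(\frac{2}{17}\right) + 4\left(\frac{5}{17}\right) + 4\left(\frac{4}{17}\right) = \frac{38}{17}$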
9. An ice-cream seller sells ice creams at three prices: ₹30, ₹40, and ₹50. A random
customer will buy an ice cream of ₹30, ₹40 and ₹50 with probabilities of 0.5, 0.3, and
0.2, respectively. If the number of customers in a day follows Poisson distribution with
λ = 60, what is the expected sales (in ₹) of the seller in a day?
Solution:
Let X denote the number of customers coming to the ice-cream seller in a day, then
X ∼ Poisson(60)
Let Y denote the price at which the customer buys the ice-cream, then
E[Y] = 30(0.5) + 40(0.3) + 50(0.2) = 37
But since X ∼ Poisson(60), on average 60 customers come to the ice-cream seller in
a day. It means that the expected sales of the day will be 60 × 37 = ₹2220.
10. An urn contains 10 balls numbered from 1 to 10. We remove six balls randomly and add
up their numbers. Let X denote the sum of the numbers of the removed balls. Find the
expected value of X.
(Hint: Suppose $X_i$ denotes the number of the i-th removed ball, then $X = \sum_{i=1}^{6} X_i$)
Solution:
Let $X_i$, i = 1, 2, ..., 6 denote the number on the i-th ball, then
$X = \sum_{i=1}^{6} X_i$
$\Rightarrow E[X] = E\left[\sum_{i=1}^{6} X_i\right]$
$\Rightarrow E[X] = \sum_{i=1}^{6} E(X_i)$
$\Rightarrow E[X] = 6E(X_i)$ ...(1)
Now, $E[X_i] = \frac{1}{10}[1 + 2 + 3 + \dots + 10] = \frac{11}{2}$
Putting the value in equation (1), we get
$E[X] = 6 \times \frac{11}{2} = 33$
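The linearity argument can be confirmed by exhaustive enumeration, since there are only C(10, 6) = 210 ways to remove six balls. A brief sketch, assuming each subset is equally likely:

```python
from fractions import Fraction
from itertools import combinations

# Every way of removing 6 of the 10 numbered balls
draws = list(combinations(range(1, 11), 6))
expected_sum = Fraction(sum(sum(d) for d in draws), len(draws))
print(len(draws), expected_sum)  # 210 33
```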
Statistics for Data Science - 2
$P(X > 4) = 1 - P(X \le 4) = 1 - F_X(4)$
$= 1 - (1 - e^{-3 \times 4})$
$= e^{-12}$
a) $1 - e^{-18}$
b) $e^{-5} - e^{-18}$
c) $e^{-18}$
d) $e^{-9}$
Solution:
i) Find the value of k.
Solution:
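We know that for the PDF of a random variable,
$\int_{-\infty}^{\infty} f_X(x)\,dx = 1$
$\Rightarrow \int_{0}^{\infty} k e^{-x}\,dx = 1$
$\Rightarrow k \left[\frac{e^{-x}}{-1}\right]_{0}^{\infty} = 1$
$\Rightarrow k(0 + 1) = 1 \Rightarrow k = 1$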
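a) $e^{-1}$
b) $e^{-3} e^{-4}$
c) $e^{-3} - e^{-4}$
d) $e^{-4} - e^{-3}$
Hint: Use $\int_a^b e^{-x}\,dx = e^{-a} - e^{-b}$
Solution:
$P(3 < X < 4) = \int_{3}^{4} k e^{-x}\,dx$
$= 1 \times \left[\frac{e^{-x}}{-1}\right]_{3}^{4}$
$= \frac{e^{-4}}{-1} - \frac{e^{-3}}{-1}$
$= e^{-3} - e^{-4}$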
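$P\left(X \le \frac{3}{4} \,\middle|\, X > \frac{1}{4}\right) = \frac{P\left(X \le \frac{3}{4} \text{ and } X > \frac{1}{4}\right)}{P\left(X > \frac{1}{4}\right)}$
$= \frac{\int_{1/4}^{3/4} 5x^4\,dx}{\int_{1/4}^{1} 5x^4\,dx}$
$= \frac{\left[\frac{5x^5}{5}\right]_{1/4}^{3/4}}{\left[\frac{5x^5}{5}\right]_{1/4}^{1}}$
$\Rightarrow P\left(X \le \frac{3}{4} \,\middle|\, X > \frac{1}{4}\right) = \frac{\left[x^5\right]_{1/4}^{3/4}}{\left[x^5\right]_{1/4}^{1}}$
$= \frac{\left(\frac{3}{4}\right)^5 - \left(\frac{1}{4}\right)^5}{1 - \left(\frac{1}{4}\right)^5}$
$= \frac{22}{93}$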
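The final simplification to 22/93 can be checked with exact rational arithmetic. This sketch assumes, as the integrals above indicate, that the density is 5x⁴ on (0, 1), so the CDF there is x⁵.

```python
from fractions import Fraction

def cdf(x):
    """CDF of the assumed density f_X(x) = 5x^4 on (0, 1)."""
    return x ** 5

a, b = Fraction(1, 4), Fraction(3, 4)
# P(X <= b | X > a) = (F(b) - F(a)) / (1 - F(a))
conditional = (cdf(b) - cdf(a)) / (1 - cdf(a))
print(conditional)  # 22/93
```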
4. The lifespan (in hours) of an electronic component used in an electric car has the density
function
$$f_X(x) = \begin{cases} \frac{1}{500} e^{-\frac{x}{500}} & x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$
Determine the probability that the component lasts more than 200 hours before it needs
to be replaced.
a) $e^{-0.4}$
b) $e^{200}$
c) 0.5
d) $e^{-2.5}$
Solution:
Let X denote the lifespan (in hours) of the electronic component. We have to find the
probability that the component lasts more than 200 hours before it needs to be replaced
i.e.
P (X > 200) = 1 − P (X ≤ 200)
Also, we can relate the given density with the exponential distribution with $\lambda = \frac{1}{500}$.
5. A firm produces machines with a lifespan, whose distribution has a mean of 200 months
and standard deviation of 50 months. The firm wishes to introduce a warranty scheme
in which it would like to replace all the dysfunctional machines with new ones within
warranty period. But they do not wish to do so for more than 11.9% of the machines
they produce. If the lifespan of the machine is assumed to follow a normal distribution,
how long a guarantee period should be offered? (Answer is expected in months)
Hint: Use P (Z < −1.18) = 0.119, where Z represents the standard normal distribution.
Solution:
Let X denote the lifespan of the machines in months. Given that µ = 200 and σ = 50.
The firm did not wish to replace more than 11.9% of the machines they produce.
If m be the guarantee period (in months), then
P (X ≤ m) = 0.119
$\Rightarrow P\left(\frac{X - 200}{50} \le \frac{m - 200}{50}\right) = 0.119$
Comparing this equation with the given value of the standard normal distribution, we get
$\frac{m - 200}{50} = -1.18$
$\Rightarrow m = 141$
a) $f_Y(y) = 1$ for 0 < y < 1, and 0 otherwise
b) $f_Y(y) = (1 - y)^3$ for 0 < y < 1, and 0 otherwise
c) $f_Y(y) = y^3$ for 0 < y < 1, and 0 otherwise
d) $f_Y(y) = 3y^{2/3}$ for 0 < y < 1, and 0 otherwise
Hint:
Apply the monotonic, differentiable function theorem and $\frac{d}{dx}(1 - x)^3 = -3(1 - x)^2$
Solution:
We know that in the range (0, 1), $(1 - x)^3$ is monotonic (decreasing function).
Therefore, we can use the formula $f_Y(y) = \frac{1}{|g'(g^{-1}(y))|} f_X(g^{-1}(y))$
Given $Y = (1 - X)^3 = g(X)$ (let)
$\Rightarrow y^{1/3} = 1 - x \Rightarrow x = 1 - y^{1/3} = g^{-1}(y)$
Therefore $g^{-1}(y) = 1 - y^{1/3}$
$g(x) = (1 - x)^3 \Rightarrow g'(x) = -3(1 - x)^2$, since $\frac{d}{dx}(1 - x)^3 = -3(1 - x)^2$
And
$g'(g^{-1}(y)) = g'(1 - y^{1/3}) = -3(1 - (1 - y^{1/3}))^2 = -3y^{2/3}$
$|g'(g^{-1}(y))| = 3y^{2/3}$, since $y^{2/3}$ is positive in the range (0, 1).
$f_X(g^{-1}(y)) = f_X(1 - y^{1/3}) = 3(1 - (1 - y^{1/3}))^2 = 3y^{2/3}$
Therefore, $f_Y(y) = \frac{3y^{2/3}}{3y^{2/3}}$
$\Rightarrow f_Y(y) = 1$
Therefore
$$f_Y(y) = \begin{cases} 1 & 0 < y < 1 \\ 0 & \text{otherwise} \end{cases}$$
Define $Y = \frac{1}{3}(12 - X)$. Find the PDF of the random variable Y.
a) $f_Y(y) = (12 - 3y)^2/27$ for −6 < y < 3, and 0 otherwise
b) $f_Y(y) = (12 - 3y)^2/27$ for 3 < y < 6, and 0 otherwise
c) $f_Y(y) = (12 - 3y)/27$ for −6 < y < 3, and 0 otherwise
d) $f_Y(y) = (12 - 3y)/27$ for 3 < y < 6, and 0 otherwise
Solution:
We know that in the range (−6, 3), $\frac{1}{3}(12 - x)$ is monotonic (decreasing function).
Therefore, we can use the formula $f_Y(y) = \frac{1}{|g'(g^{-1}(y))|} f_X(g^{-1}(y))$
Given $Y = \frac{1}{3}(12 - X) = g(X)$ (let)
$\Rightarrow 3y = 12 - x \Rightarrow x = 12 - 3y = g^{-1}(y)$
Therefore $g^{-1}(y) = 12 - 3y$
$g(x) = \frac{1}{3}(12 - x) \Rightarrow g'(x) = -\frac{1}{3}$
And
$g'(g^{-1}(y)) = g'(12 - 3y) = -\frac{1}{3}$
$|g'(g^{-1}(y))| = \frac{1}{3}$
$f_X(g^{-1}(y)) = f_X(12 - 3y) = \frac{(12 - 3y)^2}{81}$
Therefore, $f_Y(y) = \frac{(12 - 3y)^2 / 81}{1/3}$
$\Rightarrow f_Y(y) = \frac{(12 - 3y)^2}{27}$
When x = −6, y = 6 and when x = 3, y = 3.
Therefore
$$f_Y(y) = \begin{cases} \frac{(12 - 3y)^2}{27} & 3 < y < 6 \\ 0 & \text{otherwise} \end{cases}$$
8. Let X be a continuous random variable with the following PDF:
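$$f_X(x) = \begin{cases} x^3(6x^2 + 5x - 4) & 0 < x \le 1 \\ 0 & \text{otherwise} \end{cases}$$
$E[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx$
$= \int_{0}^{1} x \times x^3(6x^2 + 5x - 4)\,dx$
$= \int_{0}^{1} (6x^6 + 5x^5 - 4x^4)\,dx$
$= \left[\frac{6x^7}{7}\right]_0^1 + \left[\frac{5x^6}{6}\right]_0^1 - \left[\frac{4x^5}{5}\right]_0^1$
$= \frac{6}{7} + \frac{5}{6} - \frac{4}{5}$
$= \frac{187}{210}$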
Use $\int_a^b x^n\,dx = \frac{1}{n+1}(b^{n+1} - a^{n+1})$
Also, $\int_a^b x^n\,dx = \int_a^c x^n\,dx + \int_c^b x^n\,dx$ where a < c < b.
Solution:
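Var(Y) = Var(6X + 5) = 36Var(X)
And $\mathrm{Var}(X) = E[X^2] - (E[X])^2$
$E[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx = \int_0^2 x f_X(x)\,dx$
$= \int_0^1 x f_X(x)\,dx + \int_1^2 x f_X(x)\,dx$
$= \int_0^1 x \cdot x\,dx + \int_1^2 x(2 - x)\,dx$
$= \left[\frac{x^3}{3}\right]_0^1 + \left[\frac{2x^2}{2}\right]_1^2 - \left[\frac{x^3}{3}\right]_1^2$
$= \frac{1}{3} + (2^2 - 1^2) - \frac{(2^3 - 1^3)}{3}$
$= \frac{1}{3} + 3 - \frac{7}{3}$
$= 1$
$E[X^2] = \int_{-\infty}^{\infty} x^2 f_X(x)\,dx = \int_0^2 x^2 f_X(x)\,dx$
$= \int_0^1 x^2 f_X(x)\,dx + \int_1^2 x^2 f_X(x)\,dx$
$= \int_0^1 x^2 \cdot x\,dx + \int_1^2 x^2(2 - x)\,dx$
$= \left[\frac{x^4}{4}\right]_0^1 + \left[\frac{2x^3}{3}\right]_1^2 - \left[\frac{x^4}{4}\right]_1^2$
$= \frac{1}{4} + \frac{2}{3}(2^3 - 1^3) - \frac{1}{4}(2^4 - 1^4)$
$= \frac{1}{4} + \frac{14}{3} - \frac{15}{4}$
$= \frac{7}{6}$
Therefore,
$\mathrm{Var}(X) = \frac{7}{6} - 1 = \frac{1}{6}$
$\Rightarrow \mathrm{Var}(Y) = 36 \times \frac{1}{6} = 6$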
Statistics for Data Science - 2
(a) e
(b) 0
(c) e−
(d) e−2
Answer: b
Solution:
We know that $P(-\epsilon < X < 0) = \int_{-\epsilon}^{0} f_X(x)\,dx$
But the value of $f_X(x)$ is zero in the range −ε to zero.
Therefore, $P(-\epsilon < X < 0) = 0$.
Therefore, option b is the correct option.
2. Which of the following statements is/are true for a continuous random variable with
PDF fX (x)?
(a) If $f_X(2) = 2f_X(1)$, then $P(2 - \epsilon < X < 2 + \epsilon) = 2P(1 - \epsilon < X < 1 + \epsilon)$ for a small ε.
(b) If $f_X(2) = 2f_X(1)$, then $P(2 - \epsilon < X < 2 + \epsilon) \approx 2P(1 - \epsilon < X < 1 + \epsilon)$ for a small ε.
(c) P (X = x0 ) = 0 for any value of x0 .
(d) CDF FX (x) is continuous in the domain [−∞, ∞].
Answer: b, c, and d
Solution:
Option a: We know that for small ε, $P(x - \epsilon < X < x + \epsilon) \propto f_X(x)$.
Therefore, $P(1 - \epsilon < X < 1 + \epsilon) \propto f_X(1)$ and $P(2 - \epsilon < X < 2 + \epsilon) \propto f_X(2)$
But $P(x - \epsilon < X < x + \epsilon)$ is not an exact linear function of $f_X(x)$.
Therefore, when $f_X(2) = 2f_X(1)$, $P(2 - \epsilon < X < 2 + \epsilon) \ne 2P(1 - \epsilon < X < 1 + \epsilon)$
but $P(2 - \epsilon < X < 2 + \epsilon) \approx 2P(1 - \epsilon < X < 1 + \epsilon)$
Hence option a is wrong but option b is correct.
Option c: The probability at an instant (PX (x)) for a continuous random variable is
zero as there is no sudden spike in the CDF function for any value of x. Hence option
c is correct.
Option d: For a continuous random variable CDF is always continuous.
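3. If
$$f_X(x) = \begin{cases} \frac{1}{18}(x^2 - 8x + 16) & 1 \le x \le 7 \\ 0 & \text{otherwise} \end{cases}$$
What is the value of P(X ≤ 4)? Enter the answer correct to one decimal accuracy.
($\int x^a\,dx = \frac{x^{a+1}}{a+1}$)
Answer: 0.5
Solution:
$P(X \le 4) = \int_{-\infty}^{4} f_X(x)\,dx$
$\Rightarrow P(X \le 4) = \int_{1}^{4} f_X(x)\,dx$, since $f_X(x) = 0$ for x < 1.
$\Rightarrow P(X \le 4) = \int_{1}^{4} \frac{1}{18}(x^2 - 8x + 16)\,dx$
$\Rightarrow P(X \le 4) = \frac{1}{18}\left[\frac{x^3}{3} - \frac{8x^2}{2} + 16x\right]_1^4$
$\Rightarrow P(X \le 4) = \frac{1}{18}\left(\frac{4^3}{3} - 4 \cdot 4^2 + 16 \cdot 4\right) - \frac{1}{18}\left(\frac{1^3}{3} - 4 \cdot 1^2 + 16 \cdot 1\right)$
$\Rightarrow P(X \le 4) = 0.5$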
4. If X ∼ Normal(10, 25), what is the value of E[2X 2 ]?
Answer: 250
Solutions:
Given E[X]=10, Var(X)=25
We know that $\mathrm{Var}(X) = E[X^2] - E[X]^2$
$\Rightarrow E[X^2] = \mathrm{Var}(X) + E[X]^2$
$\Rightarrow E[X^2] = 25 + 10^2 = 125$
We know that E[cX] = cE[X], where c is a constant.
$\Rightarrow E[2X^2] = 2E[X^2]$
$\Rightarrow E[2X^2] = 2 \times 125 = 250$
5. If X ∼ Normal(10, 4), then what is the value of P (X ≥ 8|X ≤ 9)? Use the standard
normal distribution tables if necessary. Enter the answer up to two decimals accuracy.
Use the following CDF values of standard normal distribution.
FZ (−2) = 0.02275, FZ (−1.5) = 0.06681, FZ (−1) = 0.15866, FZ (−0.5) = 0.30854, FZ (0) =
0.5, FZ (0.5) = 0.69146, and FZ (1) = 0.84134
Answer: 0.485 accepted range 0.48 to 0.49
Solution:
Given µ = 10, σ 2 = 4 ⇒ σ = 2
We need to find P (X ≥ 8|X ≤ 9).
$P(X \ge 8 \mid X \le 9) = \frac{P(X \ge 8 \cap X \le 9)}{P(X \le 9)}$
$P(X \ge 8 \mid X \le 9) = \frac{F_X(9) - F_X(8)}{F_X(9)}$
Converting the given normal distribution to the standard normal distribution to get the values of $F_X(x)$:
For x = 8, $z = \frac{x - \mu}{\sigma} = \frac{8 - 10}{2} = -1 \Rightarrow F_X(8) = F_Z(-1)$
For x = 9, $z = \frac{x - \mu}{\sigma} = \frac{9 - 10}{2} = -0.5 \Rightarrow F_X(9) = F_Z(-0.5)$
$P(X \ge 8 \mid X \le 9) = \frac{F_X(9) - F_X(8)}{F_X(9)}$
$\Rightarrow P(X \ge 8 \mid X \le 9) = \frac{0.30854 - 0.15866}{0.30854} = 0.485$
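The table lookup can be reproduced with the standard library's `statistics.NormalDist`, which takes the mean and the standard deviation (not the variance). A short sketch:

```python
from statistics import NormalDist

# X ~ Normal(mean 10, variance 4), so sigma = 2
x_dist = NormalDist(mu=10, sigma=2)

# P(X >= 8 | X <= 9) = (F(9) - F(8)) / F(9)
conditional = (x_dist.cdf(9) - x_dist.cdf(8)) / x_dist.cdf(9)
print(round(conditional, 3))
```

The result should fall inside the accepted range 0.48 to 0.49 quoted with the answer.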
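(a) $f_Y(y) = \frac{2\log(y)}{y}$ for 1 ≤ y ≤ e, and 0 otherwise
(b) $f_Y(y) = \frac{\log(y)}{2ey}$ for 1 ≤ y ≤ e, and 0 otherwise
(c) $f_Y(y) = \frac{\log(y)}{y}$ for 1 ≤ y ≤ e, and 0 otherwise
(d) $f_Y(y) = \frac{\log(y)}{ey}$ for 1 ≤ y ≤ e, and 0 otherwise
(e) $f_Y(y) = \frac{\log(y)}{2y}$ for 1 ≤ y ≤ e, and 0 otherwise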
Answer: a
Solution:
Given $Y = g(X) = e^X$
$\Rightarrow \log y = x = g^{-1}(y)$
Therefore $g^{-1}(y) = \log(y)$
$g(x) = e^x \Rightarrow g'(x) = e^x$, since $\frac{d(e^x)}{dx} = e^x$
We know that in the range 0 to 1, $e^x$ is monotonic (increasing function).
Therefore, we can use the formula $f_Y(y) = \frac{1}{|g'(g^{-1}(y))|} f_X(g^{-1}(y))$
$g'(g^{-1}(y)) = g'(\log y) = e^{\log y} = y$
$|g'(g^{-1}(y))| = y$, since y is positive in the range [1, e]
$f_X(g^{-1}(y)) = f_X(\log y) = 2\log y$
Therefore, $f_Y(y) = \frac{1}{y} \times 2\log y$
$f_Y(y) = \frac{2\log y}{y}$
Hence option a is correct.
4
The CDF of random variable X is given below:
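$$F_X(x) = \begin{cases} 0 & x \le 0 \\ 2x^2 & 0 \le x \le \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \le x \le 1 \\ \frac{x}{2} & 1 \le x \le 2 \\ 1 & x \ge 2 \end{cases}$$
($\frac{d(x^a)}{dx} = ax^{a-1}$)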
7. Which of the following statements is/are correct?
Answer: a, d
Solution:
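We know that $f_X(x) = \frac{d(F_X(x))}{dx}$
Given
$$F_X(x) = \begin{cases} 0 & x \le 0 \\ 2x^2 & 0 \le x \le \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \le x \le 1 \\ \frac{x}{2} & 1 \le x \le 2 \\ 1 & x \ge 2 \end{cases}$$
$$\Rightarrow f_X(x) = \begin{cases} \frac{d(0)}{dx} = 0 & x \le 0 \\ \frac{d(2x^2)}{dx} = 4x & 0 \le x \le \frac{1}{2} \\ \frac{d(\frac{1}{2})}{dx} = 0 & \frac{1}{2} < x \le 1 \\ \frac{d(\frac{x}{2})}{dx} = \frac{1}{2} & 1 < x \le 2 \\ \frac{d(1)}{dx} = 0 & x > 2 \end{cases}$$
$$\text{Therefore, } f_X(x) = \begin{cases} 0 & x \le 0 \\ 4x & 0 \le x \le \frac{1}{2} \\ 0 & \frac{1}{2} < x \le 1 \\ \frac{1}{2} & 1 < x \le 2 \\ 0 & x > 2 \end{cases}$$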
Since, FX (x) is continuous in the given domain, hence X is a continuous random
variable.
8. What is the value of P (X ≥ 1|X ≤ 1.5)? Enter the answer correct to two decimals
accuracy.
Answer: 0.33, accepted range 0.31 to 0.35
Solution:
$P(X \ge 1 \mid X \le 1.5) = \frac{F_X(1.5) - F_X(1)}{F_X(1.5)} = \frac{1.5/2 - 1/2}{1.5/2} = 1/3$
9. The time taken by Rohith to complete a race follows the exponential distribution with
expected time of completion of 10 minutes. What is the probability that Rohith takes
less than 20 minutes but more than 10 minutes to complete the race? Enter the answer
correct to 2 decimals accuracy. ($\int e^{-ax}\,dx = \frac{e^{-ax}}{-a}$)
Answer: 0.2325, accepted range: 0.23 to 0.235
Solution:
Given E[X] = 10 minutes.
We know that for an exponential distribution $E[X] = \frac{1}{\lambda}$
$\Rightarrow \frac{1}{\lambda} = 10$, λ = 0.1
For the exponential distribution $F_X(x) = 1 - e^{-\lambda x}$
The probability that the athlete takes at most 10 minutes is
$F_X(10) = 1 - e^{-0.1 \times 10} = 1 - e^{-1}$
The probability that the athlete takes at most 20 minutes is
$F_X(20) = 1 - e^{-0.1 \times 20} = 1 - e^{-2}$
The probability that the athlete takes more than 10 minutes but less than 20 minutes to
complete the race is $F_X(20) - F_X(10) = e^{-1} - e^{-2} = 0.232$ approximately.
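The numerical value can be checked directly from the exponential CDF. A minimal sketch, with `cdf` as an illustrative helper name:

```python
from math import exp

mean_time = 10          # minutes, E[X] = 1 / lambda
lam = 1 / mean_time

def cdf(t):
    """Exponential CDF: P(X <= t) = 1 - exp(-lambda * t)."""
    return 1 - exp(-lam * t)

prob = cdf(20) - cdf(10)  # P(10 < X < 20) = e^-1 - e^-2
print(round(prob, 4))
```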
10. The PDFs of random variables X1, X2, X3, X4, and X5 are shown in Figure 4.2.P.
Based on the information, choose the correct option(s) from below.
Answer: a, d, and e
Solution:
We know that in the PDF of normal distribution, the peak value occurs at mean.
E[X] = µ(mean)
Also, the value of the PDF at the mean is inversely proportional to the standard deviation,
since $f_X(\mu) = \frac{1}{\sqrt{2\pi}\,\sigma}$.
The peak value, which is mean or E[X], of PDF occurs approximately for X1, X2, X3, X4,
and X5 at -10, 0, 20, 10, and -10 respectively.
Therefore, E(X1) ≈ E(X5) < E(X2) < E(X4) < E(X3)
The peak value (fX (µ)) for variables X1, X2, X3, X4, and X5 are such that fX1 (µ) ≈
fX2 (µ) > fX3 (µ) > fX4 (µ) > fX5 (µ).
Therefore, Var(X1) ≈ Var(X2) < Var(X3) < Var(X4) < Var(X5)
Hence, options a, d, and e are correct.
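11. The PDF of a continuous random variable is given as
$$f_X(x) = \begin{cases} 4x^3 & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$
What is the value of Var(X)? ($\int x^a\,dx = \frac{x^{a+1}}{a+1}$)
(a) $\frac{1}{75}$
(b) $\frac{2}{75}$
(c) $\frac{3}{75}$
(d) $\frac{4}{75}$
Answer: b
We know that $\mathrm{Var}(X) = E[X^2] - E[X]^2$
$E[X] = \int x f_X(x)\,dx$
$E[X] = \int_0^1 x \cdot 4x^3\,dx$
$\Rightarrow E[X] = \int_0^1 4x^4\,dx$
$\Rightarrow E[X] = \left.\frac{4x^5}{5}\right|_0^1$
$\Rightarrow E[X] = \frac{4}{5} - 0 = \frac{4}{5}$
$E[X^2] = \int x^2 f_X(x)\,dx$
$E[X^2] = \int_0^1 x^2 \cdot 4x^3\,dx$
$\Rightarrow E[X^2] = \left.\frac{4x^6}{6}\right|_0^1$
$\Rightarrow E[X^2] = \frac{4}{6} - 0 = \frac{2}{3}$
$\mathrm{Var}(X) = \frac{2}{3} - \left(\frac{4}{5}\right)^2$
$\mathrm{Var}(X) = \frac{2}{3} - \frac{16}{25}$
$\mathrm{Var}(X) = \frac{2}{75}$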
(a) If b2 − a2 = b1 − a1 , then Var(X) = Var(Y ).
(b) If b2 + a2 = b1 + a1 , then Var(X) = Var(Y ).
(c) If b2 − a2 = b1 − a1 , then E(X) = E(Y ).
(d) If b2 − b1 = a1 − a2 , then E(X) = E(Y ).
Answer: a and d
Solution:
We know that the mean (E(X)) and variance (Var(X)) of a uniform random variable
X ∼ Uniform(a, b) are $\frac{a + b}{2}$ and $\frac{(b - a)^2}{12}$ respectively.
Given X ∼ Uniform(a1, b1) and Y ∼ Uniform(a2, b2),
$E(X) = \frac{a_1 + b_1}{2}$, $E(Y) = \frac{a_2 + b_2}{2}$. So, for E(X) to be equal to E(Y), $a_1 + b_1 = a_2 + b_2$
or $b_2 - b_1 = a_1 - a_2$. Hence option d is correct and option c is incorrect.
Similarly, for Var(X) to be equal to Var(Y), $\frac{(b_1 - a_1)^2}{12} = \frac{(b_2 - a_2)^2}{12}$ or $b_1 - a_1 = b_2 - a_2$,
hence option a is correct and option b is incorrect.
13. The CDF of a random variable X is given as:
$$F_X(x) = \begin{cases} 0 & x < 0 \\ \frac{x}{\ln 4} & 0 \le x \le \ln 2 \\ 1 - e^{-x} & \ln 2 \le x < \infty \end{cases}$$
0 x<0
1
(d) fX (x) = 0 ≤ x < ln 2
lnx 2
e ln 2 ≤ x < ∞
Answer: a
Solution:
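We know that $f_X(x) = \frac{d(F_X(x))}{dx}$
Given,
$$F_X(x) = \begin{cases} 0 & x < 0 \\ \frac{x}{\ln 4} & 0 \le x \le \ln 2 \\ 1 - e^{-x} & \ln 2 \le x < \infty \end{cases}$$
Therefore,
$$f_X(x) = \begin{cases} \frac{d(0)}{dx} = 0 & x < 0 \\ \frac{d(\frac{x}{\ln 4})}{dx} = \frac{1}{\ln 4} & 0 \le x \le \ln 2 \\ \frac{d(1 - e^{-x})}{dx} = e^{-x} & \ln 2 \le x < \infty \end{cases}$$
Hence option a is correct.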
Statistics for Data Science - 2
Week 5 graded Assignment
Solution
1. A person randomly chooses a battery from a store which has 40 batteries of type A and
60 batteries of type B. Battery life of type A and type B batteries are exponentially
distributed with average life of 4 years and 6 years, respectively. If the chosen battery
lasts for 5 years, what is the probability that the battery is of type A?
(a) $\frac{1}{1 + e^{5/12}}$
(b) $\frac{1}{1 + e^{-5/12}}$
(c) $\frac{e^{-4/5}}{1 + e^{-6/5}}$
(d) $\frac{e^{-6/5}}{1 + e^{-4/5}}$
Solution:
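Define a random variable X as follows:
X = 1 if the chosen battery is of type A
X = 0 if the chosen battery is of type B
Let Y denote the life (in years) of the chosen battery. Then
Y | X = 1 ∼ Exp(1/4)
Y | X = 0 ∼ Exp(1/6)
It implies that
$f_{Y|X=1}(y) = \frac{1}{4} e^{-y/4}$; y > 0 and
$f_{Y|X=0}(y) = \frac{1}{6} e^{-y/6}$; y > 0
$P(X = 1) = \frac{40}{100} = \frac{2}{5}$ and
$P(X = 0) = \frac{60}{100} = \frac{3}{5}$
$f_{X|Y=5}(1) = \frac{f_{Y|X=1}(5) \cdot P(X = 1)}{f_Y(5)}$
$= \frac{f_{Y|X=1}(5) \cdot P(X = 1)}{f_{Y|X=1}(5) \cdot P(X = 1) + f_{Y|X=0}(5) \cdot P(X = 0)}$
$= \frac{\frac{1}{4} e^{-5/4} \cdot \frac{2}{5}}{\frac{1}{4} e^{-5/4} \cdot \frac{2}{5} + \frac{1}{6} e^{-5/6} \cdot \frac{3}{5}}$
$= \frac{\frac{1}{10} e^{-5/4}}{\frac{1}{10} e^{-5/4} + \frac{1}{10} e^{-5/6}}$
$= \frac{e^{-5/4}}{e^{-5/4} + e^{-5/6}}$
$= \frac{1}{1 + e^{5/12}}$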
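The Bayes computation above can be checked numerically against the closed form. This sketch uses the priors 0.4 and 0.6 and the exponential densities from the solution; the dictionary names are illustrative.

```python
from math import exp

prior = {"A": 0.4, "B": 0.6}   # 40 and 60 batteries out of 100
mean_life = {"A": 4, "B": 6}   # years; rate = 1 / mean

def density(kind, y):
    """Exponential density of the battery life for the given type."""
    rate = 1 / mean_life[kind]
    return rate * exp(-rate * y)

# Prior times likelihood at y = 5, then normalise
joint = {kind: prior[kind] * density(kind, 5) for kind in prior}
posterior_a = joint["A"] / sum(joint.values())

closed_form = 1 / (1 + exp(5 / 12))
print(round(posterior_a, 4), round(closed_form, 4))
```

Both numbers should agree, which supports option (a).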
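(a) $\frac{3\exp(\frac{1}{8})}{3\exp(\frac{1}{8}) + 6 + 2\exp(\frac{2}{9})}$
(b) $\frac{3\exp(\frac{-1}{8})}{3\exp(\frac{-1}{8}) + 6 + 2\exp(\frac{-2}{9})}$
(c) $\frac{2\exp(\frac{-2}{9})}{3\exp(\frac{-1}{8}) + 6 + 2\exp(\frac{-2}{9})}$
(d) $\frac{6}{3\exp(\frac{-1}{32}) + 6 + 2\exp(\frac{-1}{18})}$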
Solution:
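Given that X ∼ Uniform{1, 2, 3} and Z ∼ Normal(1, 4) are independent.
Y = XZ + X
It implies that
Y | X = 1 = Z + 1 ∼ Normal(2, 4)
Y | X = 2 = 2Z + 2 ∼ Normal(4, 16)
Y | X = 3 = 3Z + 3 ∼ Normal(6, 36)
Therefore,
$f_{Y|X=1}(y) = \frac{1}{2\sqrt{2\pi}} \exp\left(\frac{-(y - 2)^2}{8}\right)$
$f_{Y|X=2}(y) = \frac{1}{4\sqrt{2\pi}} \exp\left(\frac{-(y - 4)^2}{32}\right)$
$f_{Y|X=3}(y) = \frac{1}{6\sqrt{2\pi}} \exp\left(\frac{-(y - 6)^2}{72}\right)$
By Bayes' rule,
$f_{X|Y=2}(2) = \frac{\frac{1}{4\sqrt{2\pi}} \exp\left(\frac{-(2 - 4)^2}{32}\right) \cdot \frac{1}{3}}{\frac{1}{4\sqrt{2\pi}} \exp\left(\frac{-(2 - 4)^2}{32}\right) \cdot \frac{1}{3} + \frac{1}{2\sqrt{2\pi}} \exp\left(\frac{-(2 - 2)^2}{8}\right) \cdot \frac{1}{3} + \frac{1}{6\sqrt{2\pi}} \exp\left(\frac{-(2 - 6)^2}{72}\right) \cdot \frac{1}{3}}$
$= \frac{\frac{1}{4}\exp\left(\frac{-1}{8}\right)}{\frac{1}{4}\exp\left(\frac{-1}{8}\right) + \frac{1}{2}\exp(0) + \frac{1}{6}\exp\left(\frac{-2}{9}\right)}$
$= \frac{3\exp(\frac{-1}{8})}{3\exp(\frac{-1}{8}) + 6 + 2\exp(\frac{-2}{9})}$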
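The question text for this problem is missing here; judging from the numerator of the derivation, the quantity being computed is the posterior probability of X = 2 given Y = 2. Under that assumption, a short numerical check with `statistics.NormalDist` (which takes the standard deviation, so Normal(2x, (2x)²) has sigma = 2x):

```python
from math import exp
from statistics import NormalDist

y_observed = 2
prior = {x: 1 / 3 for x in (1, 2, 3)}  # X ~ Uniform{1, 2, 3}

# Y | X = x equals x*Z + x with Z ~ Normal(1, 4): mean 2x, standard deviation 2x
conditional = {x: NormalDist(mu=2 * x, sigma=2 * x) for x in prior}

joint = {x: prior[x] * conditional[x].pdf(y_observed) for x in prior}
posterior_2 = joint[2] / sum(joint.values())

closed_form = 3 * exp(-1 / 8) / (3 * exp(-1 / 8) + 6 + 2 * exp(-2 / 9))
print(round(posterior_2, 4), round(closed_form, 4))
```

The two values should coincide, consistent with option (b).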
1. Yes
2. No
Solution:
First we will calculate the marginal densities of X and Y .
For 0 ≤ x ≤ 1
$f_X(x) = \int_0^1 f_{XY}(x, y)\,dy = \int_0^1 4xy\,dy = \left[2xy^2\right]_0^1 = 2x$
For 0 ≤ y ≤ 1
$f_Y(y) = \int_0^1 f_{XY}(x, y)\,dx = \int_0^1 4xy\,dx = \left[2x^2 y\right]_0^1 = 2y$
Therefore,
$f_X(x) \cdot f_Y(y) = 4xy = f_{XY}(x, y)$
It implies that X and Y are independent random variables.
[Figure: region in the xy-plane with the line y = x]
The region X ≥ Y will be the lower half part of the circle.
Therefore,
$P(X \ge Y) = \frac{\text{Area of lower half circle}}{\text{Area of the circle}} = \frac{\pi(1)^2/2}{\pi(1)^2} = \frac{1}{2}$
5. Let (X, Y ) ∼ Uniform(D), where D = {(x, y) : y ≤ 2x, 0 < x < 1, 0 < y < 2} ∪ [1, 2] ×
[0, 2]. Find the marginal density of X.
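(a) $f_X(x) = \frac{2x}{3} + \frac{2}{3}$ for 0 ≤ x ≤ 2, and 0 otherwise
(b) $f_X(x) = \frac{2x}{3} + \frac{1}{3}$ for 0 ≤ x ≤ 2, and 0 otherwise
(c) $f_X(x) = \frac{2x}{3}$ for 0 ≤ x ≤ 1, $\frac{2}{3}$ for 1 ≤ x ≤ 2, and 0 otherwise
(d) $f_X(x) = \frac{2x}{3}$ for 0 ≤ x ≤ 1, $\frac{1}{3}$ for 1 ≤ x ≤ 2, and 0 otherwise
[Figure: the region D in the xy-plane, with the line y = 2x]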
6. The joint pdf of two random variables X and Y is given by
$$f_{XY}(x, y) = \begin{cases} 24xy & 0 \le x \le 1,\ 0 \le y \le 1,\ x + y \le 1 \\ 0 & \text{otherwise} \end{cases}$$
[Figure: the line x + y = 1, with axis ticks at 0.25 and 1]
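$= \int_{y=0}^{1/4} \int_{x=0}^{1/4 - y} 24xy\,dx\,dy$
$= \int_{y=0}^{1/4} \left[12x^2 y\right]_{x=0}^{1/4 - y} dy$
$= \int_{y=0}^{1/4} 12y\left(\frac{1}{4} - y\right)^2 dy$
$= \int_{y=0}^{1/4} \frac{12}{16}\, y(1 - 4y)^2\,dy$
$= \frac{3}{4} \int_{y=0}^{1/4} y(1 + 16y^2 - 8y)\,dy$
$= \frac{3}{4} \left[\frac{y^2}{2} + 4y^4 - \frac{8y^3}{3}\right]_{y=0}^{1/4}$
$= \frac{3}{4} \left(\frac{1}{32} + \frac{1}{64} - \frac{1}{24}\right)$
$= \frac{3}{4} \cdot \frac{1}{192} = \frac{1}{256}$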
Option (b)
[Figure: the lines x + y = 0.5 and x + y = 1, with axis ticks at 0.5 and 1]
The orange region denotes $X + Y \le \frac{1}{2}$. Now,
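$P\left(X + Y \le \frac{1}{2}\right) = \int_{y=0}^{1/2} \int_{x=0}^{1/2 - y} f_{XY}(x, y)\,dx\,dy$
$= \int_{y=0}^{1/2} \int_{x=0}^{1/2 - y} 24xy\,dx\,dy$
$= \int_{y=0}^{1/2} \left[12x^2 y\right]_{x=0}^{1/2 - y} dy$
$= \int_{y=0}^{1/2} 12y\left(\frac{1}{2} - y\right)^2 dy$
$= \int_{y=0}^{1/2} \frac{12}{4}\, y(1 - 2y)^2\,dy$
$= 3 \int_{y=0}^{1/2} y(1 + 4y^2 - 4y)\,dy$
$= 3 \left[\frac{y^2}{2} + y^4 - \frac{4y^3}{3}\right]_{y=0}^{1/2}$
$= 3 \left(\frac{1}{8} + \frac{1}{16} - \frac{1}{6}\right)$
$= 3 \times \frac{2}{96} = \frac{1}{16}$
[Figure: the line x + y = 1, with axis ticks at 0.5 and 1]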
7. The joint pdf of two random variables X and Y is given by
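$$f_{XY}(x, y) = \begin{cases} 3xy(1 - x) & 0 \le x \le 1,\ 0 \le y \le 2 \\ 0 & \text{otherwise} \end{cases}$$
$P\left(X > \frac{1}{2} \,\middle|\, Y = 1\right) = \frac{f_{XY}(X > \frac{1}{2}, Y = 1)}{f_Y(1)}$
$= 2 f_{XY}\left(X > \frac{1}{2}, Y = 1\right)$
$= \int_{x=1/2}^{1} 2(3x(1 - x))\,dx$
$= 6 \int_{1/2}^{1} (x - x^2)\,dx$
$= 6 \left[\frac{x^2}{2} - \frac{x^3}{3}\right]_{1/2}^{1}$
$= 6\left(\frac{1}{2} - \frac{1}{3}\right) - 6\left(\frac{1}{8} - \frac{1}{24}\right) = 1 - \frac{1}{2} = \frac{1}{2}$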
8. The amount of milk (in litres) in a shop at the beginning of any day is a random amount
X from which a random amount Y (in litres) is sold during that day. Assume that the
joint density function of X and Y is given by
$$f_{XY}(x, y) = \begin{cases} \frac{1}{50} & 0 \le x \le 10,\ 0 \le y \le x \\ 0 & \text{otherwise} \end{cases}$$
Find the probability that amount of milk left at the end of day is less than 5 litres. Write
your answer correct to two decimal points.
Solution:
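[Figure: the triangular support below the line y = x for 0 ≤ x ≤ 10, together with the line x − y = 5; axis ticks at 5 and 10]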
X denotes the amount of milk at the beginning of any day and Y denotes the amount
of milk which is sold during that day.
Therefore, amount of milk left at the end of the day will be denoted by X − Y .
To find: P (X − Y < 5)
In the diagram above, brown region denotes X −Y < 5 and brown + blue region denotes
the support of X and Y .
Area of the support(X, Y) = $\frac{1}{2} \times 10 \times 10 = 50$.
Therefore,
$P(X - Y < 5) = \frac{\text{area of brown region}}{\text{area of support}} = \frac{75/2}{50} = \frac{75}{100}$
9. The joint pdf of two continuous random variables X and Y is given by
$$f_{XY}(x, y) = \begin{cases} ke^{-(x + y)} & x \ge 0,\ y \ge 0 \\ 0 & \text{otherwise} \end{cases}$$
(a) $e^{-10}$
(b) $(e^{-5} - 1)e^{-5}$
(c) $(1 - e^{-5})e^{-5}$
(d) $(e^{-5} + 1)e^{-5}$
Solution:
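We know that $\iint_{\mathrm{Supp}(X, Y)} f_{XY}\,dx\,dy = 1$
Therefore,
$\int_{y=0}^{\infty} \int_{x=0}^{\infty} ke^{-(x + y)}\,dx\,dy = 1$
$\Rightarrow k \int_{y=0}^{\infty} \int_{x=0}^{\infty} e^{-y} e^{-x}\,dx\,dy = 1$
$\Rightarrow k \int_{y=0}^{\infty} e^{-y} \left[-e^{-x}\right]_0^{\infty} dy = 1$
$\Rightarrow k \int_{y=0}^{\infty} e^{-y}(0 + 1)\,dy = 1$
$\Rightarrow k \int_{y=0}^{\infty} e^{-y}\,dy = 1$
$\Rightarrow k \left[-e^{-y}\right]_0^{\infty} = 1$
$\Rightarrow k(0 + 1) = 1$
$\Rightarrow k = 1$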
To find: P (X ≥ 5, Y ≤ 5)
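Now,
$P(X \ge 5, Y \le 5) = \int_{y=0}^{5} \int_{x=5}^{\infty} e^{-(x + y)}\,dx\,dy$
$= \int_{y=0}^{5} \int_{x=5}^{\infty} e^{-y} e^{-x}\,dx\,dy$
$= \int_{y=0}^{5} e^{-y} \left[-e^{-x}\right]_5^{\infty} dy$
$= \int_{y=0}^{5} e^{-y}(0 + e^{-5})\,dy$
$= e^{-5} \int_{y=0}^{5} e^{-y}\,dy$
$= e^{-5} \left[-e^{-y}\right]_0^5$
$= e^{-5}(-e^{-5} + 1)$
$= e^{-5}(1 - e^{-5})$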
Therefore, $f_X\left(\frac{1}{2}\right) = \frac{3}{8}$
Now,
1 1 fXY (X = 12 , 12 ≤ Y ≤ 1)
P ( ≤ Y ≤ 1|X = ) =
2 2 fX ( 12 )
Z 1
8 1 1
= + y dy
1/2 3 8 2
Z 1
1 1
= + y dy
1/2 3 2
1
y y2
= +
6 6 1/2
1 1 1 1
= + − +
6 6 12 24
1 1 5
= − = = 0.20
3 8 24
Statistics for Data Science - 2
a) $0.6e^{-y} + 0.4e^{-3y}$
b) $0.4e^{-y} + 0.6e^{-3y}$
c) $0.6e^{-y} + 1.2e^{-3y}$
d) $0.4e^{-y} + 1.8e^{-3y}$
Solution:
Given that, X ∼ Bernoulli(0.6), therefore pX (1) = 0.6 and pX (0) = 0.4.
The marginal density of Y is given by
$f_Y(y) = \sum_{x \in T_X} p_X(x) f_{Y|X=x}(y)$
The marginal density of Y is given by
$f_Y(y) = \sum_{x \in T_X} p_X(x) f_{Y|X=x}(y)$
a) $\frac{2e^{-12}}{e^{-6} + 2e^{-12}}$
b) $\frac{e^{-6}}{e^{-6} + 2e^{-12}}$
c) $\frac{e^{-12}}{e^{-6} + e^{-12}}$
d) $\frac{e^{-6}}{e^{-6} + e^{-12}}$
Solution:
Given that X ∼ Uniform{1, 2}, therefore $p_X(1) = p_X(2) = \frac{1}{2}$.
The marginal density of Y is given by
$f_Y(y) = \sum_{x \in T_X} p_X(x) f_{Y|X=x}(y)$
And
$f_{X|Y=3}(2) = \frac{p_X(2) f_{Y|X=2}(3)}{f_Y(3)}$
$= \frac{\frac{1}{2} \times 4e^{-4 \times 3}}{e^{-2 \times 3} + 2e^{-4 \times 3}}$
$= \frac{2e^{-12}}{e^{-6} + 2e^{-12}}$
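4. The joint density function of two continuous random variables X and Y is given as
$$f_{XY}(x, y) = \begin{cases} kxy & 0 < x < 4,\ 0 < y < 1 \\ 0 & \text{otherwise} \end{cases}$$
Find the value of k. Enter your answer correct to two decimals accuracy.
Solution:
We know that for a joint PDF, $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{XY}(x, y)\,dx\,dy = 1$
Since $f_{XY}(x, y)$ is nonzero only in the region 0 < x < 4, 0 < y < 1,
$\Rightarrow \int_0^1 \int_0^4 f_{XY}(x, y)\,dx\,dy = 1$
$\Rightarrow \int_0^1 \int_0^4 kxy\,dx\,dy = 1$
$\Rightarrow \int_0^1 ky \left[\frac{x^2}{2}\right]_0^4 dy = 1$
$\Rightarrow \int_0^1 8ky\,dy = 1$
$\Rightarrow 8k \left[\frac{y^2}{2}\right]_0^1 = 1$
$\Rightarrow k = \frac{1}{4} = 0.25$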
5. Let (X, Y ) ∼ Uniform(D), where D = {(x, y) : x + y < 4, x > 0, y > 0}. Find the value
of P (2X + Y > 2).
a) $\frac{1}{8}$
b) $\frac{7}{8}$
c) $\frac{3}{4}$
d) $\frac{1}{4}$
Solution:
Area of the lower shaded region (A) will be $\frac{1}{2} \times 1 \times 2 = 1$
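Solution:
$P(X + Y < 1) = \int_0^1 \int_0^{1 - y} (x + y)\,dx\,dy$
$= \int_0^1 \left[\frac{x^2}{2} + xy\right]_0^{1 - y} dy$
$= \int_0^1 \left(\frac{(1 - y)^2}{2} + (1 - y)y\right) dy$
$= \left[-\frac{(1 - y)^3}{6} + \frac{y^2}{2} - \frac{y^3}{3}\right]_0^1$
$= \left(\frac{1}{2} - \frac{1}{3}\right) - \left(-\frac{1}{6}\right)$
$= \frac{1}{3}$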
7. The joint PDF of two continuous random variables X and Y is given by
$$f_{XY}(x, y) = \begin{cases} \frac{2}{7}(5x + 2y) & 0 \le x \le 1,\ 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases}$$
Find the marginal PDF of X.
a) $f_X(x) = 2x$ for 0 ≤ x ≤ 1, and 0 otherwise
b) $f_X(x) = \frac{2}{7}(5x + 1)$ for 0 ≤ x ≤ 1, and 0 otherwise
c) $f_X(x) = \frac{2}{7}(3x + 2)$ for 0 ≤ x ≤ 1, and 0 otherwise
d) $f_X(x) = \frac{2}{7}(5y + 1)$ for 0 ≤ x ≤ 1, and 0 otherwise
Solution:
For 0 ≤ x ≤ 1
$f_X(x) = \int_0^1 \frac{2}{7}(5x + 2y)\,dy$
$= \frac{2}{7}\left[5xy + \frac{2y^2}{2}\right]_0^1$
$= \frac{2}{7}(5x + 1)$
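a) $f_Y(y) = \frac{3}{2}y(2 - y)$ for 0 < y < 1, and 0 otherwise
b) $f_Y(y) = 2y$ for 0 < y < 1, and 0 otherwise
c) $f_Y(y) = \frac{3}{2}(1 - y^2)$ for 0 < y < 1, and 0 otherwise
d) $f_Y(y) = \frac{2}{3}(2 - y)$ for 0 < y < 1, and 0 otherwise
Solution:
We know that for a joint PDF, $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{XY}(x, y)\,dx\,dy = 1$
Since $f_{XY}(x, y)$ is nonzero only in the region 0 < x < 4, 0 < y < 1,
$\Rightarrow \int_0^1 \int_0^4 f_{XY}(x, y)\,dx\,dy = 1$
$\Rightarrow \int_0^1 \int_0^4 k(2 - y)\,dx\,dy = 1$
$\Rightarrow \int_0^1 \left[k(2 - y)x\right]_0^4 dy = 1$
$\Rightarrow \int_0^1 4k(2 - y)\,dy = 1$
$\Rightarrow 4k\left[2y - \frac{y^2}{2}\right]_0^1 = 1$
$\Rightarrow 4k \times \frac{3}{2} = 1$
$\Rightarrow k = \frac{1}{6}$
For 0 < y < 1
$f_Y(y) = \int_0^4 \frac{1}{6}(2 - y)\,dx$
$= \frac{1}{6}\left[(2 - y)x\right]_0^4$
$= \frac{2}{3}(2 - y)$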
9. Let X and Y be two independent continuous random variables with PDFs fX (x) and
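$f_Y(y)$ given as
$$f_X(x) = \begin{cases} 1 & 0 \le x < 1 \\ 0 & \text{otherwise} \end{cases}$$
$$f_Y(y) = \begin{cases} y/2 & 0 \le y < 2 \\ 0 & \text{otherwise} \end{cases}$$
Find the value of P(2X + Y > 1).
a) $\frac{1}{24}$
b) $\frac{11}{12}$
c) $\frac{1}{12}$
d) $\frac{23}{24}$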
Solution:
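Given that X and Y are two independent continuous random variables,
therefore $f_{XY}(x, y) = f_X(x) f_Y(y)$.
$$f_{XY}(x, y) = \begin{cases} y/2 & 0 \le x < 1,\ 0 \le y < 2 \\ 0 & \text{otherwise} \end{cases}$$
We have to find the value of P(2X + Y > 1).
And
P(2X + Y > 1) = 1 − P(2X + Y ≤ 1)
$P(2X + Y \le 1) = \int_0^1 \int_0^{\frac{1 - y}{2}} \frac{y}{2}\,dx\,dy$
$= \int_0^1 \frac{y}{2} \left[x\right]_0^{\frac{1 - y}{2}} dy$
$= \int_0^1 \frac{1}{4}\, y(1 - y)\,dy$
$= \frac{1}{4}\left[\frac{y^2}{2} - \frac{y^3}{3}\right]_0^1$
$= \frac{1}{24}$
$\Rightarrow P(2X + Y > 1) = 1 - \frac{1}{24} = \frac{23}{24}$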
10. The joint density function of two random variables X and Y is given by
$$f_{XY}(x, y) = \begin{cases} 8xy & 0 \le x \le 1,\ 0 \le y \le x \\ 0 & \text{otherwise} \end{cases}$$
a) Yes
b) No
Solution:
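For 0 ≤ x ≤ 1
$f_X(x) = \int_0^x 8xy\,dy = 8x\left[\frac{y^2}{2}\right]_0^x = 4x^3$
For 0 ≤ y ≤ 1, the density is nonzero only for y ≤ x ≤ 1, so
$f_Y(y) = \int_y^1 8xy\,dx = 8y\left[\frac{x^2}{2}\right]_y^1 = 4y(1 - y^2)$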
11. Let (X, Y ) ∼ Uniform(D), where D = [3, 5] × [2, 4]. Are X and Y independent?
a) Yes
b) No
Solution:
(X, Y ) ∼ Uniform(D), therefore
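$$f_{XY}(x, y) = \begin{cases} \frac{1}{4} & 3 \le x \le 5,\ 2 \le y \le 4 \\ 0 & \text{otherwise} \end{cases}$$
$f_X(x) = \int_2^4 \frac{1}{4}\,dy = \frac{1}{4}\left[y\right]_2^4 = \frac{1}{2}$
$f_Y(y) = \int_3^5 \frac{1}{4}\,dx = \frac{1}{4}\left[x\right]_3^5 = \frac{1}{2}$
$f_X(x) f_Y(y) = \frac{1}{2} \times \frac{1}{2} = \frac{1}{4} = f_{XY}(x, y)$.
Hence X and Y are independent.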
a) $f_{X|Y=0.5}(x) = 2x$ for 0 < x < 1, and 0 otherwise
b) $f_{X|Y=0.5}(x) = 3x^2$ for 0 < x < 1, and 0 otherwise
c) $f_{X|Y=0.5}(x) = 4x^3$ for 0 < x < 1, and 0 otherwise
d) $f_{X|Y=0.5}(x) = 1$ for 0 < x < 1, and 0 otherwise
Solution:
For 0 < y < 1
$f_Y(y) = \int_0^1 4xy\,dx = 4y\left[\frac{x^2}{2}\right]_0^1 = 2y$
The distribution of X | Y = 0.5, (0 < x < 1) is given by
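Solution:
For 0 < y < 1
$f_Y(y) = \int_0^1 \left(x^2 + \frac{xy}{3}\right) dx$
$= \left[\frac{x^3}{3} + \frac{x^2 y}{6}\right]_0^1$
$= \frac{1}{3} + \frac{1}{6}y$
$f_{X|Y=1}(x) = \frac{f_{XY}(x, 1)}{f_Y(1)}$
$= \frac{x^2 + \frac{x \times 1}{3}}{\frac{1}{3} + \frac{1}{6} \times 1}$
$= 2\left(x^2 + \frac{x}{3}\right)$
$P\left(\frac{1}{4} < X < \frac{1}{2} \,\middle|\, Y = 1\right) = \int_{1/4}^{1/2} 2\left(x^2 + \frac{x}{3}\right) dx$
$= 2\left[\frac{x^3}{3} + \frac{x^2}{6}\right]_{1/4}^{1/2}$
$= 2\left[\left(\frac{1}{24} + \frac{1}{24}\right) - \left(\frac{1}{192} + \frac{1}{96}\right)\right]$
$= 2\left(\frac{1}{12} - \frac{1}{64}\right)$
$= \frac{13}{96}$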
Statistics for Data Science - 2
Week 7 Graded assignment
1. Let X1, X2, X3 be three independent and identically distributed random variables with mean µ and variance σ². Given below are 3 different formulations of the sample mean. (Observe that E[A] = E[B] = E[C].)
$$A = \frac{X_1 + X_2 + X_3}{3}$$
$$B = 0.1X_1 + 0.3X_2 + 0.6X_3$$
$$C = 0.2X_1 + 0.3X_2 + 0.5X_3$$
Solution:
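Let X1, X2, X3 ∼ i.i.d. X, where E[X] = µ, Var(X) = σ²
$$\mathrm{Var}(A) = \mathrm{Var}\left(\frac{X_1 + X_2 + X_3}{3}\right) = \frac{1}{9}\left(\mathrm{Var}[X_1] + \mathrm{Var}[X_2] + \mathrm{Var}[X_3]\right) = \frac{1}{9}(3\sigma^2) = \frac{\sigma^2}{3}$$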
$$\mathrm{Var}(C) = \mathrm{Var}(0.2X_1 + 0.3X_2 + 0.5X_3) = 0.04\,\mathrm{Var}[X_1] + 0.09\,\mathrm{Var}[X_2] + 0.25\,\mathrm{Var}[X_3] = (0.04 + 0.09 + 0.25)\sigma^2 = 0.38\sigma^2$$
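For i.i.d. terms, the variance of a weighted sum is σ² times the sum of the squared weights. A small sketch (the helper name `variance_factor` is an assumption made for illustration) computes that multiplier exactly for the three formulations A, B and C:

```python
from fractions import Fraction


def variance_factor(weights):
    """Return c such that Var(sum of w_i * X_i) = c * sigma^2 for i.i.d. X_i."""
    return sum(Fraction(w) ** 2 for w in weights)


factor_a = variance_factor([Fraction(1, 3)] * 3)
factor_b = variance_factor([Fraction(1, 10), Fraction(3, 10), Fraction(6, 10)])
factor_c = variance_factor([Fraction(2, 10), Fraction(3, 10), Fraction(5, 10)])
print(factor_a, factor_b, factor_c)
```

The equal-weight mean A has the smallest multiplier of the three.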
2. A random sample of size 25 is collected from a normal population with mean of 50 and
standard deviation of 5. Find the variance of the sample mean.
Solution:
We know that the variance of the sample mean $\bar{X}$ is given by
$$\mathrm{Var}[\bar{X}] = \frac{\sigma^2}{n} = \frac{5^2}{25} = 1$$
3. Let X1, X2, . . . , X50 ∼ i.i.d. Poisson(0.04) and let $Y = \sum_{i=1}^{50} X_i$. Use the Central Limit Theorem to find P(Y > 3). Enter the answer correct to 2 decimal places.
Solution:
Let X ∼ Poisson(0.04).
Consider the samples X1 , X2 , . . . , X50 from X.
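E[X] = Var[X] = 0.04
$$E[Y] = E\left[\sum_{i=1}^{50} X_i\right] = 50 \times 0.04 = 2, \qquad \mathrm{Var}[Y] = \mathrm{Var}\left[\sum_{i=1}^{50} X_i\right] = 50 \times 0.04 = 2$$
$$P(Y > 3) = P(Y - 2 > 1) = P\left(\frac{Y - 2}{\sqrt{2}} > \frac{3 - 2}{\sqrt{2}}\right) = P(Z > 0.707) = 1 - F_Z(0.707) = 1 - 0.76 = 0.24$$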
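The standardisation step can be reproduced with the standard library. This is a sketch under the same CLT approximation (no continuity correction); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

lam, n = 0.04, 50
mean_y = n * lam  # E[Y] for a sum of n i.i.d. Poisson(lam) variables
var_y = n * lam   # Var[Y]; a Poisson variable has variance equal to its mean

z = (3 - mean_y) / sqrt(var_y)   # standardised threshold
p = 1 - NormalDist().cdf(z)      # CLT approximation of P(Y > 3)
print(round(z, 3), round(p, 2))
```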
4. Let the moment generating function of a random variable X be given by
$$M_X(\lambda) = \frac{1}{4}e^{-2\lambda} + \frac{1}{40} + \frac{3}{10}e^{-\lambda} + \frac{3}{40}e^{2\lambda} + \frac{7}{20}e^{\lambda}$$
Find the distribution of X.
(a)
X           −2     −1     0      1      2
P(X = x)    1/4    3/40   3/10   1/40   7/20
(b)
X           −2     −1     0      1      2
P(X = x)    1/4    1/40   3/10   3/40   7/20
(c)
X           −2     −1     0      1      2
P(X = x)    1/4    3/10   1/40   7/20   3/40
(d)
X           −2     −1     0      1      2
P(X = x)    1/4    3/40   1/40   7/20   3/10
Solution:
The MGF of a discrete random variable X with the PMF $f_X(x) = P(X = x)$, $x \in T_X$, is given by
$$M_X(\lambda) = E[e^{\lambda X}] = \sum_{x \in T_X} P(X = x)\,e^{\lambda x}$$
X           −2     −1     0      1      2
P(X = x)    1/4    3/10   1/40   7/20   3/40
5. A fair coin is tossed 1000 times. Use CLT to compute the probability that head appears
at most 520 times. Enter the answer correct to 3 decimal places.
Solution:
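Define a random variable X such that
$$X = \begin{cases} 1, & \text{if head appears on tossing a fair coin} \\ 0, & \text{otherwise} \end{cases}$$
Therefore, $E[X] = \mu = \frac{1}{2}$ and
$$\mathrm{Var}(X) = \sigma^2 = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}$$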
6. A fair die is rolled 100 times. Let X denote the number of times six is obtained. Find a bound for the probability that $\frac{X}{100}$ differs from $\frac{1}{6}$ by less than 0.1 using the weak law of large numbers.
(a) at least 5/36
(b) at least 31/36
(c) at most 5/36
(d) at most 31/36
Solution:
X denotes the number of times six is obtained on rolling a fair die 100 times.
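Let X1, X2, . . . , X100 be 100 i.i.d. samples such that
$$X_i = \begin{cases} 1, & \text{if six appears on rolling a fair die} \\ 0, & \text{otherwise} \end{cases}$$
$E[X_i] = \mu = \frac{1}{6}$ and $\mathrm{Var}(X_i) = \sigma^2 = \frac{5}{36}$
Notice that X = X1 + X2 + X3 + . . . + X100
To find: Bound on $P\left(\left|\frac{X}{100} - \frac{1}{6}\right| < 0.1\right)$.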
Let Y = Y1 + Y2 + . . . + Y500, where $Y_i = X_i^2$ for all i : 1 → 500
$E[Y_i] = \frac{0.5}{0.5} = 1$ and $\mathrm{Var}[Y_i] = \frac{0.5}{0.25} = 2$, for i : 1 → 500
$E[Y] = \frac{250}{0.5} = 500$ and $\mathrm{Var}[Y] = \frac{250}{0.5^2} = 1000$
Let X be a random variable having the gamma distribution with the parameters α = 2n and β = 1.
Hint:
• If X ∼ Gamma(α, β), then $E[X] = \frac{\alpha}{\beta}$ and $\mathrm{Var}[X] = \frac{\alpha}{\beta^2}$
• Sum of n independent Gamma(α, β) is Gamma(nα, β)
8. Use the Weak Law of Large Numbers to find the value of n such that
$$P\left(\left|\frac{X}{2n} - 1\right| > 0.01\right) < 0.01$$
(a) 505000
(b) 470000
(c) 498000
(d) 482000
Solution:
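Given X ∼ Gamma(2n, 1)
Let $X = X_1 + X_2 + X_3 + \ldots + X_{2n}$, where $X_i$ ∼ Gamma(1, 1).
$$P(|\bar{X} - \mu| > \delta) \le \frac{\sigma^2}{n\delta^2}$$
$$\Rightarrow P\left(\left|\frac{X}{2n} - 1\right| > 0.01\right) \le \frac{1}{2n \times 0.01^2}$$
Therefore, $\frac{1}{2n \times 0.01^2} < 0.01 \implies 2n > \frac{1}{0.01^3} \implies n > 500000$.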
(a) 34570
(b) 33500
(c) 32500
(d) 30000
Solution:
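$E[X_1 + \ldots + X_{2n}] = 2n$ and $\mathrm{Var}[X_1 + \ldots + X_{2n}] = 2n$
To find: The value of n such that $P\left(\left|\frac{X}{2n} - 1\right| > 0.01\right) < 0.01$.
By CLT,
$$\frac{X - 2n}{\sqrt{2n}} \sim \text{Normal}(0, 1)$$
Now,
$$P\left(\left|\frac{X}{2n} - 1\right| > 0.01\right) < 0.01$$
$$\implies P\left(\left|\frac{X_1 + \ldots + X_{2n}}{2n} - 1\right| > 0.01\right) < 0.01$$
$$\implies P\left(\left|\frac{X_1 + \ldots + X_{2n} - 2n}{\sqrt{2n}}\right| > 0.01\sqrt{2n}\right) < 0.01$$
$$\implies P(|Z| > 0.01\sqrt{2n}) < 0.01$$
$$\implies 2P(Z > 0.01\sqrt{2n}) < 0.01$$
$$\implies 1 - F_Z(0.01\sqrt{2n}) < \frac{0.01}{2}$$
$$\implies F_Z(0.01\sqrt{2n}) > 0.995$$
$$\implies F_Z(0.01\sqrt{2n}) > F_Z(2.58)$$
$$\implies n > 33282$$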
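The two thresholds on n (the weak-law bound and the CLT approximation) come from simple closed forms. The sketch below uses hypothetical helper names and takes the quantile 2.58 as given in the solution; it only evaluates those formulas.

```python
def n_from_wlln(delta: float, bound: float) -> float:
    # Weak law (Chebyshev): P(|X/(2n) - 1| > delta) <= 1 / (2n * delta^2).
    # Requiring the right-hand side to be below `bound` gives n > 1 / (2 * bound * delta^2).
    return 1.0 / (2.0 * bound * delta ** 2)


def n_from_clt(delta: float, z: float) -> float:
    # CLT: need delta * sqrt(2n) > z, i.e. n > (z / delta)^2 / 2.
    return (z / delta) ** 2 / 2.0


print(n_from_wlln(0.01, 0.01))  # threshold from the weak law
print(n_from_clt(0.01, 2.58))   # threshold from the CLT
```

The CLT threshold is far smaller because the Chebyshev-type bound holds for any distribution with the same variance and is therefore conservative.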
10. Let the time taken (in hours) for failure of an electric bulb follow the exponential distri-
bution with the parameter 0.05. Suppose that 100 such light bulbs say L1 , L2 , . . . , L100
are used in the following manner: For every i, as soon as the light Li fails, Li+1 be-
comes operative, where i : 1 → 99 (i.e. If L1 fails, L2 becomes operative, if L2 fails, L3
becomes operative, and so on). Let the total time of operation of 100 bulbs be denoted
by T. Using CLT, compute the probability that T exceeds 2500 hours.
(a) FZ (1.5)
(b) 1 − FZ (1.5)
(c) FZ (2.5)
(d) 1 − FZ (2.5)
Solution:
Given, time to failure (in hours) of an electric bulb has the exponential distribution
with the parameter λ = 0.05.
Since, the bulbs are used in such a way, that as soon as light L1 fails, L2 becomes
operative, L2 fails, L3 becomes operative, and so on.
We know that if X ∼ Gamma(α, β) with parameter α = 1, then X ∼ Exp(β).
Also, sum of n i.i.d. Exp(λ) is Gamma(n, λ).
Each of the $L_i$ is exponentially distributed with parameter 0.05. Let $T = L_1 + \ldots + L_{100}$.
$E[L_i] = \mu = \frac{1}{0.05} = 20$ and $SD[L_i] = \sigma = \frac{1}{0.05} = 20$
To find: P(T ≥ 2500)
By CLT, we know that
$$\frac{T - 100\mu}{\sigma\sqrt{n}} \sim \text{Normal}(0, 1)$$
$$\Rightarrow \frac{T - 2000}{20\sqrt{100}} \sim \text{Normal}(0, 1)$$
Now,
11. Suppose speeds of vehicles on a particular road are normally distributed with mean 36 mph and standard deviation 2 mph. Find the probability that the mean speed $\bar{X}$ of 20 randomly selected vehicles is between 35 and 38 mph.
(a) $F_Z(\sqrt{5}) - F_Z(-\sqrt{5})$
(b) $F_Z(\sqrt{20}) - F_Z(-\sqrt{20})$
(c) $F_Z(\sqrt{38}) - F_Z(-\sqrt{35})$
(d) $F_Z(\sqrt{20}) - F_Z(-\sqrt{5})$
Solution:
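Let X denote the speed of a vehicle on a particular road.
Given that X ∼ Normal(36, 2²).
Therefore, µ = 36 and σ = 2
Select samples X1, X2, . . . , X20 such that X1, X2, . . . , X20 ∼ iid X
Let $\bar{X} = \frac{X_1 + X_2 + \ldots + X_{20}}{20}$ and $S = X_1 + X_2 + \ldots + X_{20}$
To find: $P(35 < \bar{X} < 38)$
From CLT, we know that
$$\frac{X_1 + X_2 + \ldots + X_n - nE[X]}{\sqrt{n}\,\sigma} \sim \text{Normal}(0, 1)$$
$$\Rightarrow \frac{S - n\mu}{\sqrt{n}\,\sigma} \sim \text{Normal}(0, 1)$$
$$\Rightarrow \frac{S - 36(20)}{2\sqrt{20}} \sim \text{Normal}(0, 1)$$
Now,
$$P(35 < \bar{X} < 38) = P\left(35 < \frac{S}{20} < 38\right)$$
$$= P\left(-1 < \frac{S}{20} - 36 < 2\right)$$
$$= P\left(-1 < \frac{S - 36(20)}{20} < 2\right)$$
$$= P\left(-\frac{\sqrt{20}}{2} < \frac{S - 36(20)}{2\sqrt{20}} < \sqrt{20}\right)$$
$$= P\left(-\sqrt{5} < \frac{S - 36(20)}{2\sqrt{20}} < \sqrt{20}\right)$$
$$= F_Z(\sqrt{20}) - F_Z(-\sqrt{5})$$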
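As a cross-check of the standardisation, the sketch below evaluates the final expression and compares it with the probability computed directly from the distribution of the sample mean, Normal(36, 2²/20). Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 36.0, 2.0, 20

# Closed form from the derivation: F_Z(sqrt(20)) - F_Z(-sqrt(5)).
std_normal = NormalDist()
p_formula = std_normal.cdf(sqrt(20)) - std_normal.cdf(-sqrt(5))

# Direct computation: the sample mean has standard deviation sigma / sqrt(n).
sample_mean_dist = NormalDist(mu, sigma / sqrt(n))
p_direct = sample_mean_dist.cdf(38) - sample_mean_dist.cdf(35)

print(p_formula, p_direct)
```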
Statistics for Data Science - 2
Week 7 practice Assignment
Statistics from samples and Limit theorems
1. If X, Y ∼ i.i.d. Normal(0, 4), what will be the variance of $\frac{X}{Y}$?
(a) 4
(b) 2
(c) 1
(d) Undefined
Solution:
We know that if X, Y ∼ i.i.d. Normal(0, σ²), then $\frac{X}{Y}$ ∼ Cauchy(0, 1), and the variance of the Cauchy distribution is undefined.
Therefore, option(d) is correct.
2. A population has mean 60 and standard deviation 6. Random samples of size 100 from
this population are collected independently. Find the expected value of the sample mean.
Solution:
We know that expected value of the sample mean X is given by
E[X] = µ
= 60
1. FZ (0.3)
2. 1 − FZ (0.3)
3. FZ (−0.3)
4. 1 − FZ (−0.3)
Solution:
Now,
$$P(Y \ge 10) = P(Y - 16 \ge -6)$$
$$= P\left(\frac{Y - 16}{20} \ge \frac{-6}{20}\right)$$
$$= P\left(\frac{Y - 16}{20} \ge -0.3\right)$$
$$= P(Z \ge -0.3)$$
$$= 1 - P(Z < -0.3)$$
$$= 1 - F_Z(-0.3)$$
4. Random samples of size 100 are collected from a population of unknown parameters. If
the variance of the sample mean is 36, what will be the standard deviation of the actual
population?
Solution:
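We know that the variance of the sample mean is given by $\frac{\sigma^2}{n}$, where σ is the standard deviation of the actual population and n is the sample size.
$$\frac{\sigma^2}{n} = 36$$
$$\Rightarrow \frac{\sigma^2}{100} = 36$$
$$\Rightarrow \sigma^2 = 3600$$
$$\Rightarrow \sigma = 60$$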
Solution:
Given: standard deviation of the population, σ = 5
Sample size, n = 50
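To find: upper bound on $P(|\bar{X} - \mu| \ge 10)$, where $\bar{X}$ and µ are the sample mean and population mean, respectively.
$$P(|\bar{X} - \mu| \ge \delta) \le \frac{\sigma^2}{n\delta^2}$$
$$\Rightarrow P(|\bar{X} - \mu| \ge 10) \le \frac{25}{100 \times 50}$$
$$\Rightarrow P(|\bar{X} - \mu| \ge 10) \le 0.005$$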
6. A study shows that the average daily sleeping hours of teenagers is ten hours with a
standard deviation of two hours. If a sample of 100 teenagers is collected, what will be
the probability that the mean of the sleeping hours of these 100 teenagers is at least 0.4
hours away from the population mean? Assume that each observation in the sample is
independent. Assume that FZ denotes the CDF of standard normal distribution.
Solution:
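Let X denote the average daily sleeping hours of teenagers.
Given: standard deviation of X, σ = 2
Sample size, n = 100
To find: $P(|\bar{X} - \mu| \ge 0.4)$, where $\bar{X}$ and µ are the sample mean and population mean, respectively.
Now, with $S = X_1 + \ldots + X_n$,
$$P(|\bar{X} - \mu| \ge 0.4) = P\left(\left|\frac{S}{n} - \mu\right| \ge 0.4\right)$$
$$= P\left(\left|\frac{S - n\mu}{n}\right| \ge 0.4\right)$$
$$= P\left(\left|\frac{S - n\mu}{\sigma\sqrt{n}}\right| \ge \frac{0.4\sqrt{n}}{\sigma}\right)$$
$$= P(|Z| \ge 2)$$
$$= P(Z \ge 2) + P(Z \le -2)$$
$$= 1 - P(Z \le 2) + P(Z \le -2)$$
$$= 1 - F_Z(2) + F_Z(-2)$$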
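The moment generating function of Normal(0, σ²) is given by $e^{\lambda^2\sigma^2/2}$.
Let N ∼ Normal(0, 2²)
$$M_N(\lambda) = e^{\lambda^2 2^2/2}$$
$$= 1 + \frac{\lambda^2 2^2}{2} + \frac{\lambda^4 2^4}{2!\,(4)} + \ldots$$
$$= 1 + \frac{\lambda^2 2^2}{2} + 48\,\frac{\lambda^4}{4!} + \ldots$$
Therefore, the 4th moment of Normal(0, 2²) = coefficient of $\frac{\lambda^4}{4!}$ = 48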
places.
Solution:
We know that if X ∼ Gamma(α, k) and Y ∼ Gamma(β, k) are two independent random variables, then $\frac{X}{X + Y}$ ∼ Beta(α, β).
9. A study says that the delivery time of pizzas has a standard deviation of 10 minutes. A pizza shop collected the data of some deliveries and their delivery times. The probability that the mean delivery time of this sample is at least $\sqrt{5}$ minutes away from the actual mean delivery time is at most $\frac{1}{5}$ as per the weak law of large numbers. What is the size of the sample?
Solution:
Let X denote the delivery time of pizzas.
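Given that σ = 10
To find: size of the sample such that $P(|\bar{X} - \mu| \ge \sqrt{5}) \le \frac{1}{5}$ ...(1).
By the weak law of large numbers, we have
$$P(|\bar{X} - \mu| \ge \delta) \le \frac{\sigma^2}{n\delta^2}$$
$$\Rightarrow P(|\bar{X} - \mu| \ge \sqrt{5}) \le \frac{100}{n \times 5} \quad ...(2)$$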
10. A company sells eggs whose weights are normally distributed with a mean of 70g and a
standard deviation of 2g. Suppose that these eggs are sold in packages that each contain
four eggs. Assume that the weight of each egg is independent. What is the probability
that the mean weight of the four eggs in a package is greater than 68.5g? Write your
answer correct to two decimal places.
(Hint: Use the fact that linear combination of normal distributions is again a normal
distribution. FZ (−1.5) = 0.066)
Solution:
Let X denote the weight of an egg.
Given that E[X] = µ = 70
SD(X) = σ = 2
X ∼ Normal(70, 2²)
Let X1, X2, X3 and X4 denote the weights of four eggs in a package.
Suppose that
$$\bar{X} = \frac{X_1 + X_2 + X_3 + X_4}{4}$$
$E[\bar{X}] = \mu = 70$ and
$$\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n} = \frac{4}{4} = 1$$
Now,
11. Let X1, X2, X3, . . . , Xn be i.i.d. Poisson(4). What should be the value of n such that $P(3.8 \le \bar{X} \le 4.2) \ge 0.95$? [2 marks]
(Hint: Use FZ (1.96) = 0.975)
1. at least 200
2. at least 385
3. at least 450
4. at least 585
Solution:
Given that X1 , X2 , X3 , . . . Xn ∼ i.i.d. Poisson(4)
Mean of the distribution = µ = 4
Variance of the distribution = σ 2 = 4
Let S = X1 + X2 + . . . + Xn and
$$\bar{X} = \frac{X_1 + X_2 + \ldots + X_n}{n}$$
1.
X           −4     −2     0      2      4
P(X = x)    1/8    1/6    1/6    1/8    5/12
2.
X           −4     −2     0      2      4
P(X = x)    5/12   1/8    1/6    1/6    1/8
3.
X           −4     −2     0      2      4
P(X = x)    1/8    1/6    5/12   1/6    1/8
4.
X           −4     −2     0      2      4
P(X = x)    1/8    1/6    5/12   1/8    1/6
Solution:
The MGF of a discrete random variable X with the PMF fX (x) = P (X = x), x ∈ TX
is given by
$$M_X(\lambda) = E[e^{\lambda X}] = \sum_{x \in T_X} P(X = x)\,e^{\lambda x}$$
X           −4     −2     0      2      4
P(X = x)    1/8    1/6    5/12   1/6    1/8
13. A fair die is rolled 3600 times. Use CLT to compute the probability that six appears at
most 630 times. Enter the answer correct to two decimal places.
(Hint: Use FZ (1.341) = 0.91)
Solution:
Define a random variable X such that
$$X = \begin{cases} 1, & \text{if six appears on rolling a fair die} \\ 0, & \text{otherwise} \end{cases}$$
Therefore, $E[X] = \mu = \frac{1}{6}$ and
$$\mathrm{Var}(X) = \sigma^2 = \frac{1}{6} \cdot \frac{5}{6} = \frac{5}{36}$$
Let S = X1 + X2 + . . . + X3600
To find: P(S ≤ 630)
14. A fair die is rolled 1000 times. Let X denote the number of times six is obtained. Find a bound for the probability that $\frac{X}{1000}$ differs from $\frac{1}{6}$ by more than 0.2 using the weak law of large numbers.
1. at least 5/1440
2. at least 1436/1440
3. at most 5/1440
4. at most 1436/1440
Solution:
X denotes the number of times six is obtained on rolling the die 1000 times.
Let X1, X2, . . . , X1000 be 1000 i.i.d. samples such that
$$X_i = \begin{cases} 1, & \text{if six appears on rolling a fair die} \\ 0, & \text{otherwise} \end{cases}$$
$E[X_i] = \mu = \frac{1}{6}$ and $\mathrm{Var}(X_i) = \sigma^2 = \frac{5}{36}$
Notice that X = X1 + X2 + X3 + . . . + X1000
To find: Bound on $P\left(\left|\frac{X}{1000} - \frac{1}{6}\right| > 0.2\right)$.
15. Consider the following PDF curves and match them with the correct distribution. [1 mark]
[Figure: four PDF curves, labelled Graph 1, Graph 2, Graph 3 and Graph 4]
(a) Graph 1 → Gamma, Graph 2 → Normal, Graph 3 → Gamma, Graph 4 → Beta.
(b) Graph 1 → Beta, Graph 2 → Gamma, Graph 3 → Normal, Graph 4 → Gamma.
(c) Graph 1 → Beta, Graph 2 → Normal, Graph 3 → Normal, Graph 4 → Gamma.
(d) Graph 1 → Gamma, Graph 2 → Normal, Graph 3 → Normal, Graph 4 → Beta.
Solution:
Graph 1: The range of the distribution is [0, 1] and the shape of the graph resembles the Beta distribution.
Graph 2: The PDF curve is not symmetric about the mean and the shape of the graph resembles the Gamma distribution.
Graph 3: The PDF curve is symmetric about the mean and the shape of the graph resembles the Normal distribution.
Graph 4: The PDF curve is not symmetric about the mean and the shape of the graph resembles the Gamma distribution.
Therefore, Graph 1 → Beta, Graph 2 → Gamma, Graph 3 → Normal, Graph 4 → Gamma.
16. Let X1, X2 and X3 ∼ i.i.d. X where X has the following probability mass function:
x           −1     2
f_X(x)      2/3    1/3
(a)
Y           −3     0      3      6
P(Y = y)    1/6    1/6    1/3    1/3
(b)
Y           −3     0      3      6
P(Y = y)    8/27   4/9    2/9    1/27
(c)
Y           −3     0      3      6
P(Y = y)    8/27   1/27   4/9    2/9
(d)
Y           −3     0      3      6
P(Y = y)    2/9    8/27   1/27   4/9
Solution:
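The PMF of X is given by
x           −1     2
f_X(x)      2/3    1/3
$$M_Y(\lambda) = E[e^{\lambda Y}]$$
$$= E[e^{\lambda(X_1 + X_2 + X_3)}]$$
$$= E[e^{\lambda X_1} e^{\lambda X_2} e^{\lambda X_3}]$$
$$= E[e^{\lambda X_1}]\,E[e^{\lambda X_2}]\,E[e^{\lambda X_3}] \quad \text{(since } X_1, X_2 \text{ and } X_3 \text{ are independent)}$$
$$= E[e^{\lambda X}]\,E[e^{\lambda X}]\,E[e^{\lambda X}] \quad \text{(since } X_1, X_2 \text{ and } X_3 \sim \text{i.i.d. } X)$$
$$= [M_X(\lambda)]^3 \quad ...(1)$$
Now,
$$M_X(\lambda) = E[e^{\lambda X}] = e^{-\lambda}\,P(X = -1) + e^{2\lambda}\,P(X = 2) = \frac{2e^{-\lambda}}{3} + \frac{e^{2\lambda}}{3} \quad ...(2)$$
From equations (1) and (2), we have
$$M_Y(\lambda) = \left(\frac{2e^{-\lambda}}{3} + \frac{e^{2\lambda}}{3}\right)^3$$
$$= \frac{1}{27}\,(2e^{-\lambda} + e^{2\lambda})^3$$
$$= \frac{1}{27}\,(8e^{-3\lambda} + e^{6\lambda} + 12e^{-2\lambda}e^{2\lambda} + 6e^{-\lambda}e^{4\lambda}) \quad \text{(since } (a + b)^3 = a^3 + b^3 + 3a^2 b + 3ab^2)$$
$$= \frac{8}{27}e^{-3\lambda} + \frac{1}{27}e^{6\lambda} + \frac{4}{9} + \frac{2}{9}e^{3\lambda}$$
Therefore, the distribution of Y is given by
Y           −3     0      3      6
P(Y = y)    8/27   4/9    2/9    1/27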
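The same distribution can be obtained without MGFs by enumerating all outcomes of (X1, X2, X3) and adding up the probabilities for each value of the sum Y = X1 + X2 + X3. The sketch below is an illustrative brute-force check using exact fractions; the variable names are assumptions.

```python
from fractions import Fraction
from itertools import product

pmf_x = {-1: Fraction(2, 3), 2: Fraction(1, 3)}

# Enumerate every outcome (x1, x2, x3) and accumulate P(Y = x1 + x2 + x3).
pmf_y = {}
for values in product(pmf_x, repeat=3):
    prob = Fraction(1)
    for v in values:
        prob *= pmf_x[v]  # independence: the joint probability is the product
    total = sum(values)
    pmf_y[total] = pmf_y.get(total, Fraction(0)) + prob

for y in sorted(pmf_y):
    print(y, pmf_y[y])
```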
Statistics for Data Science - 2
1. Let X1, X2, . . . , Xn be i.i.d. samples from a distribution X with mean µ and standard deviation σ. Let $\hat{\mu} = 6\,\frac{X_1 + X_2 + \ldots + X_n}{n}$ be an estimator of µ.
i) Is the estimator unbiased?
a) Yes
b) No
Solution:
$$E[\hat{\mu}] = E\left[6\,\frac{X_1 + X_2 + \ldots + X_n}{n}\right] = \frac{6}{n}(n\mu) = 6\mu$$
And
$$\mathrm{Bias}(\hat{\mu}, \mu) = E[\hat{\mu}] - \mu = 6\mu - \mu = 5\mu$$
Since $\mathrm{Bias}(\hat{\mu}, \mu) \ne 0$, the estimator is not unbiased.
ii) Find the risk of µ̂.
(a) $\frac{36\sigma^2}{n} + 25\mu^2$
(b) $\frac{36\sigma^2}{n} + 5\mu$
(c) $\frac{6\sigma^2}{n} + 25\mu^2$
(d) $\frac{6\sigma^2}{n} + 5\mu$
Solution:
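$$\mathrm{Var}(\hat{\mu}) = \mathrm{Var}\left(6\,\frac{X_1 + X_2 + \ldots + X_n}{n}\right) = \frac{36}{n^2}(n\sigma^2) = \frac{36\sigma^2}{n}$$
$$\mathrm{Risk}(\hat{\mu}) = \mathrm{Bias}(\hat{\mu}, \mu)^2 + \mathrm{Var}(\hat{\mu}) = (5\mu)^2 + \frac{36\sigma^2}{n} = 25\mu^2 + \frac{36\sigma^2}{n}$$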
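2. Consider a sample of iid random variables X1, X2, . . . , Xn, where n > 20, $E[X_i] = \mu$, $\mathrm{Var}(X_i) = \sigma^2$ and the estimator of µ is $\hat{\mu}_n = \frac{1}{n - 20}\sum_{i=21}^{n} X_i$. Find the MSE of $\hat{\mu}_n$.
a) $\frac{\sigma}{n-20}$
b) $\frac{\sigma^2}{n-20}$
c) $\frac{\sigma^2}{n-21}$
d) $\frac{\sigma}{n}$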
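Solution:
$$E[\hat{\mu}_n] = E\left[\frac{1}{n - 20}\sum_{i=21}^{n} X_i\right] = \frac{(n - 20)\mu}{n - 20} = \mu$$
This implies that
$$\mathrm{Bias}(\hat{\mu}_n, \mu) = E[\hat{\mu}_n] - \mu = \mu - \mu = 0$$
$$\mathrm{Var}(\hat{\mu}_n) = \mathrm{Var}\left[\frac{1}{n - 20}\sum_{i=21}^{n} X_i\right] = \frac{1}{(n - 20)^2}\sum_{i=21}^{n}\mathrm{Var}(X_i) = \frac{1}{(n - 20)^2}\left[(n - 20)\sigma^2\right] = \frac{\sigma^2}{n - 20}$$
Since the bias is zero, $\mathrm{MSE}(\hat{\mu}_n) = \mathrm{Bias}(\hat{\mu}_n, \mu)^2 + \mathrm{Var}(\hat{\mu}_n) = \frac{\sigma^2}{n - 20}$.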
3. Let X1, X2, . . . , Xn ∼ iid X, where X is a random variable with density function
$$f_X(x) = \begin{cases} \frac{\theta}{x^{\theta+1}}, & x > 1, \\ 0, & \text{otherwise.} \end{cases}$$
The mean of the random variable X is $\frac{\theta}{\theta - 1}$. Find an estimator of θ using the method of moments.
(a) $\frac{X_1 + X_2 + \ldots + X_n}{X_1 + X_2 + \ldots + X_n - 1}$
(b) $\frac{X_1 + X_2 + \ldots + X_n}{1 - X_1 + X_2 + \ldots + X_n}$
(c) $\frac{X_1 + X_2 + \ldots + X_n}{X_1 + X_2 + \ldots + X_n - n}$
(d) $\frac{X_1 + X_2 + \ldots + X_n}{n - X_1 + X_2 + \ldots + X_n}$
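Solution: The mean of the random variable X is $\frac{\theta}{\theta - 1}$.
So,
$$M_1 = \frac{\theta}{\theta - 1}$$
$$\Rightarrow M_1\theta - M_1 = \theta$$
$$\Rightarrow \theta = \frac{M_1}{M_1 - 1}$$
$$\Rightarrow \theta = \frac{\frac{X_1 + X_2 + \ldots + X_n}{n}}{\frac{X_1 + X_2 + \ldots + X_n}{n} - 1}$$
$$\Rightarrow \theta = \frac{X_1 + X_2 + \ldots + X_n}{X_1 + X_2 + \ldots + X_n - n}$$
Therefore the estimator of θ is $\frac{X_1 + X_2 + \ldots + X_n}{X_1 + X_2 + \ldots + X_n - n}$.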
4. Let X1, X2, X3 ∼ iid Binomial(4, θ). Given a random sample (1, 4, 2), find the maximum likelihood estimate of θ.
a) 2/3
b) 7/12
c) 1/3
d) 5/12
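Solution: $X_i$ ∼ Binomial(4, θ)
$$\Rightarrow f_{X_i}(x) = \binom{4}{x}\theta^x(1 - \theta)^{4 - x}$$
Likelihood function is given by
$$L(x_1, x_2, x_3) = \prod_{i=1}^{3} f_{X_i}(x_i)$$
$$\Rightarrow L(x_1, x_2, x_3) = \binom{4}{x_1}\theta^{x_1}(1 - \theta)^{4 - x_1} \times \binom{4}{x_2}\theta^{x_2}(1 - \theta)^{4 - x_2} \times \binom{4}{x_3}\theta^{x_3}(1 - \theta)^{4 - x_3}$$
For the sample (1, 4, 2), the exponent of θ is 1 + 4 + 2 = 7 and the exponent of (1 − θ) is 12 − 7 = 5. Setting the derivative of log L with respect to θ to zero gives
$$\frac{7}{\theta} - \frac{5}{1 - \theta} = 0 \Rightarrow \theta = \frac{7}{12}$$
$$\Rightarrow \hat{\theta}_{ML} = \frac{7}{12}$$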
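Solution: The mean of the random variable X is θ + 1.
So,
$$M_1 = \theta + 1$$
$$\Rightarrow \theta = M_1 - 1$$
$$\Rightarrow \theta = \frac{X_1 + X_2 + \ldots + X_n}{n} - 1$$
$$\Rightarrow \theta = \frac{X_1 + X_2 + \ldots + X_n - n}{n}$$
Therefore the estimator of θ is $\frac{X_1 + X_2 + \ldots + X_n - n}{n}$.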
ii) Is the method of moments estimator unbiased?
a) Yes
b) No
Solution:
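Estimator of θ is
$$\hat{\theta} = \frac{X_1 + X_2 + \ldots + X_n - n}{n}$$
$$E[\hat{\theta}] = E\left[\frac{X_1 + X_2 + \ldots + X_n - n}{n}\right]$$
$$= \frac{1}{n}\left(E[X_1] + E[X_2] + \ldots + E[X_n] - n\right)$$
$$= \frac{1}{n}(n\theta + n - n)$$
$$= \theta$$
And
$$\mathrm{Bias}(\hat{\theta}, \theta) = E[\hat{\theta}] - \theta = \theta - \theta = 0$$
Since $\mathrm{Bias}(\hat{\theta}, \theta) = 0$, the estimator is unbiased.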
6. Suppose it is known that a sample consisting of the values 10, 12, 15, 16.5, 18, 19, 20 and 21.5 comes from a population with the density function
$$f(x) = \begin{cases} \frac{1}{\theta}e^{-x/\theta}, & x > 0, \\ 0, & \text{otherwise.} \end{cases}$$
Find the maximum likelihood estimate of θ. Enter your answer correct to one decimal.
Solution:
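$$L(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} f_X(x_i)$$
$$= \prod_{i=1}^{n} \frac{1}{\theta}e^{-x_i/\theta}$$
$$= \frac{1}{\theta^n}\,e^{-x_1/\theta}\,e^{-x_2/\theta} \cdots e^{-x_n/\theta}$$
$$= \frac{1}{\theta^n}\,e^{-(x_1 + x_2 + \ldots + x_n)/\theta}$$
$$\Rightarrow \log(L(x_1, x_2, \ldots, x_n)) = -n\log(\theta) - \frac{x_1 + x_2 + \ldots + x_n}{\theta}$$
Therefore, the ML estimator for θ is given by
$$\hat{\theta} = \arg\max_{\theta}\left[-n\log(\theta) - \frac{x_1 + x_2 + \ldots + x_n}{\theta}\right]$$
Let $Y = -n\log(\theta) - \frac{x_1 + x_2 + \ldots + x_n}{\theta}$
$$\Rightarrow \frac{dY}{d\theta} = -\frac{n}{\theta} + \frac{x_1 + x_2 + \ldots + x_n}{\theta^2}$$
Now we equate this to zero and solve for θ.
$$\Rightarrow -\frac{n}{\theta} + \frac{x_1 + x_2 + \ldots + x_n}{\theta^2} = 0$$
$$\Rightarrow \theta = \frac{x_1 + x_2 + \ldots + x_n}{n}$$
$$\Rightarrow \hat{\theta} = \frac{x_1 + x_2 + \ldots + x_n}{n}$$
Therefore, the maximum likelihood estimate of θ for the given sample will be
$$\hat{\theta} = \frac{10 + 12 + 15 + 16.5 + 18 + 19 + 20 + 21.5}{8} = \frac{132}{8} = 16.5$$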
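The closed-form estimate (the sample mean) can be sanity-checked against the log-likelihood itself. In the sketch below, `log_likelihood` is a hypothetical helper that evaluates −n log θ − Σxᵢ/θ; the check confirms that nearby values of θ give a lower log-likelihood than the sample mean.

```python
from math import log

sample = [10, 12, 15, 16.5, 18, 19, 20, 21.5]


def log_likelihood(theta: float, xs) -> float:
    """Log-likelihood of the density (1/theta) * exp(-x/theta) for the data xs."""
    return -len(xs) * log(theta) - sum(xs) / theta


theta_hat = sum(sample) / len(sample)  # ML estimate: the sample mean
print(theta_hat)
print(log_likelihood(theta_hat, sample) > log_likelihood(16.0, sample))
print(log_likelihood(theta_hat, sample) > log_likelihood(17.0, sample))
```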
7. Let X be a discrete random variable with the following probability mass function
x           1          2      3          4
f_X(x)      (1−p)/2    p/2    (1−p)/2    p/2
Table 8.1.G: PMF of X
Suppose a sample consisting of the values 2, 2, 4, 3, 1, 3, 1 and 2 is taken from the random
variable X. Find the estimate of p using method of moments. Enter your answer correct
to two decimals accuracy.
Solution:
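$$E[X] = 1 \times \frac{1 - p}{2} + 2 \times \frac{p}{2} + 3 \times \frac{1 - p}{2} + 4 \times \frac{p}{2}$$
$$= \frac{(1 - p) + 2p + 3(1 - p) + 4p}{2}$$
$$= p + 2$$
Now
$$M_1 = E[X] = p + 2$$
$$\Rightarrow p = M_1 - 2$$
Therefore, the estimate of p will be
$$\frac{X_1 + X_2 + \ldots + X_n}{n} - 2.$$
So, the estimate of p for the given sample will be
$$\hat{p} = \frac{2 + 2 + 4 + 3 + 1 + 3 + 1 + 2}{8} - 2 = \frac{18}{8} - 2 = 0.25$$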
Use the following values of CDF of standard normal distribution to answer the questions:
8. The weights (in grams) of mangoes grown in a certain area are normally distributed with
mean µ and standard deviation 40. The weights from a random sample of mangoes are
as follows:
220, 210, 240, 260, 235, 225, 270, 300, 200.
Find a 95% confidence interval for the mean weight of mangoes.
a) [203.87, 256.13]
b) [213.87, 266.13]
c) [230, 280]
d) [215.13, 235.87]
Solution:
n = 9, µ̂ = 240 and σ = 40.
β = 0.95, using the CDF of Normal(0, 1),
$$\frac{\alpha}{\sigma/\sqrt{n}} = 1.96$$
$$\alpha = 1.96 \times \frac{40}{\sqrt{9}} = 26.13$$
$$P(|\hat{\mu} - \mu| < 26.13) = 0.95$$
So, the 95% confidence interval is [240 − 26.13, 240 + 26.13], i.e. [213.87, 266.13].
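The interval can be recomputed from the raw sample. This sketch assumes the known-σ (z-based) interval used above and obtains the 0.975 quantile from the standard library instead of a table; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean

weights = [220, 210, 240, 260, 235, 225, 270, 300, 200]
sigma = 40.0  # known population standard deviation

z = NormalDist().inv_cdf(0.975)            # about 1.96 for a 95% interval
margin = z * sigma / sqrt(len(weights))    # half-width of the interval
center = mean(weights)
interval = (center - margin, center + margin)
print(center, round(interval[0], 2), round(interval[1], 2))
```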
9. From past experience it is known that the weights of seer fish grown at a commercial
hatchery are normal with a mean that varies from season to season but with a standard
deviation that remains fixed at 0.2 kilogram. If we want to be 90% certain that our
estimate of the present season’s mean weight of a seer fish is correct to within 0.01
kilograms, how large a sample is needed?
Solution:
Let X denote the weights of seer fish.
Given that σ = 0.2
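To find the value of n such that $P(|\hat{\mu} - \mu| \le 0.01) = 0.90$
$$\frac{0.01}{\sigma/\sqrt{n}} = 1.64$$
$$\Rightarrow \sqrt{n} = 0.2 \times \frac{1.64}{0.01}$$
$$\Rightarrow n = 1075.84$$
So a sample of at least 1076 fish is needed.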
a) [152.73, 164.15]
b) [156.67, 160.2]
c) [160.28, 167.72]
d) [150.34, 165.66]
Solution:
n = 9, µ̂ = 158.44 and S² = 55.52 ⇒ S = 7.45
Using the t-distribution, $\frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$
$$\frac{\alpha}{S/\sqrt{n}} = 2.30$$
$$\alpha = 2.30 \times \frac{7.45}{\sqrt{9}} = 5.71$$
$$P(|\hat{\mu} - \mu| < 5.71) = 0.95$$
So, the 95% confidence interval is [158.44 − 5.71, 158.44 + 5.71], i.e. [152.73, 164.15].
Statistics for Data Science - 2
Week 8 Practice assignment
1. Let X1 , . . . , Xn be n i.i.d. samples from a random variable X with mean µ and variance
σ 2 . Let X̄ 2 be an estimator of µ2 where X̄(sample mean) is an unbiased estimator of
µ. Is the estimator X̄ 2 unbiased always?
(a) Yes
(b) No
Solution:
$$\bar{X} = \frac{X_1 + \ldots + X_n}{n}$$
Given $\bar{X}$ is an unbiased estimator of µ and $\bar{X}^2$ is an estimator of µ².
$$\implies E[\bar{X}] = \mu$$
Now,
Solution:
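Given $\hat{\theta} = 3\bar{X}$ is an estimator of θ.
The expectation of X is given by
$$E[X] = \int_{-1}^{1} x f_X(x)\,dx$$
$$= \int_{-1}^{1} x\,\frac{1 + \theta x}{2}\,dx$$
$$= \frac{1}{2}\int_{-1}^{1} (x + \theta x^2)\,dx$$
$$= \left[\frac{x^2}{4} + \frac{\theta x^3}{6}\right]_{-1}^{1} = \frac{\theta}{3}$$
$$\mathrm{Bias}(\hat{\theta}, \theta) = E[\hat{\theta} - \theta] = E\left[3\,\frac{X_1 + \ldots + X_n}{n}\right] - \theta = 3\,\frac{n\theta}{3n} - \theta = 0$$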
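Therefore, $\mathrm{Var}[X] = \frac{1}{3} - \frac{\theta^2}{9}$
$$\mathrm{Var}(\hat{\theta}) = \mathrm{Var}\left(3\,\frac{X_1 + \ldots + X_n}{n}\right) = \frac{9}{n^2}\,(n\,\mathrm{Var}[X]) = \frac{9}{n^2}\,n\left(\frac{1}{3} - \frac{\theta^2}{9}\right) = \frac{3 - \theta^2}{n}$$
$$\mathrm{MSE}(\hat{\theta}) = \mathrm{Bias}(\hat{\theta})^2 + \mathrm{Var}[\hat{\theta}] = \frac{3 - \theta^2}{n}.$$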
(a) 17.74
(b) 17.91
(c) 1.5
(d) 2.25
Solution:
Given the distribution of X has mean equal to µ and variance equal to σ².
Also, $\sum_{i=1}^{100} X_i = 150$ and $\sum_{i=1}^{100} X_i^2 = 1999$
We know that $S^2 = \frac{1}{n - 1}\sum_{i=1}^{n}(X_i - \bar{X})^2$ is an unbiased estimator of Var[X].
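Therefore,
$$S^2 = \frac{1}{n - 1}\sum_{i=1}^{n}(X_i - \bar{X})^2$$
$$= \frac{1}{n - 1}\sum_{i=1}^{n}(X_i^2 + \bar{X}^2 - 2X_i\bar{X})$$
$$= \frac{1}{n - 1}\left(\sum_{i=1}^{n} X_i^2 + n\bar{X}^2 - 2\bar{X}\sum_{i=1}^{n} X_i\right)$$
$$= \frac{1}{n - 1}\left(\sum_{i=1}^{n} X_i^2 + n\bar{X}^2 - 2n\bar{X}^2\right)$$
$$= \frac{1}{n - 1}\left(\sum_{i=1}^{n} X_i^2 - n\bar{X}^2\right)$$
$$= \frac{1}{n - 1}\left(\sum_{i=1}^{n} X_i^2 - n\left(\frac{\sum_{i=1}^{n} X_i}{n}\right)^2\right) = \frac{1}{n - 1}\left(\sum_{i=1}^{n} X_i^2 - \frac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}\right)$$
Therefore, $S^2 = \frac{1}{100 - 1}\left(1999 - \frac{150^2}{100}\right) = 17.91$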
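The shortcut formula needs only n, ΣXᵢ and ΣXᵢ². A minimal sketch (the function name is an assumption) evaluates it exactly for the given sums:

```python
from fractions import Fraction


def sample_variance_from_sums(n: int, sum_x, sum_x2) -> Fraction:
    """Unbiased sample variance from n, the sum of X_i and the sum of X_i^2."""
    return (Fraction(sum_x2) - Fraction(sum_x) ** 2 / n) / (n - 1)


s2 = sample_variance_from_sums(100, 150, 1999)
print(s2, float(s2))  # 1774/99 as an exact fraction and as a decimal
```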
n
P
4. Let X1 , X2 , . . . , Xn ∼ i.i.d. X. Let a1 , . . . , an ≥ 0 such that ai = 1. Define the
i=1
n
ai xi . Define the estimator for the variance as S 2 =
P
estimator for mean as X̄ =
i=1
n
ai (Xi − X̄) with E[X] = µ and Var(X) = σ 2 . Choose the correct option(s) from
2
P
i=1
the following:
Solution:
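Given X1, X2, . . . , Xn ∼ i.i.d. X, E[X] = µ, Var[X] = σ²
$\bar{X} = \sum_{i=1}^{n} a_i X_i$ is an estimator of µ, where $\sum_{i=1}^{n} a_i = 1$.
(a) $E[\bar{X}] = E[a_1 X_1 + \cdots + a_n X_n] = \sum_{i=1}^{n} a_i E[X] = \mu$ (since $\sum_{i=1}^{n} a_i = 1$)
$\mathrm{Bias}(\bar{X}) = E[\bar{X}] - E[X] = \mu - \mu = 0$
Therefore, $\bar{X}$ is an unbiased estimator of µ.
(b) $\mathrm{Var}[\bar{X}] = \mathrm{Var}[a_1 X_1 + \cdots + a_n X_n] = \sum_{i=1}^{n} a_i^2\,\mathrm{Var}[X] = \sigma^2\sum_{i=1}^{n} a_i^2$
$$E[\bar{X}] = \mu \quad (1)$$
$$\mathrm{Var}[\bar{X}] = \sigma^2\sum_{i=1}^{n} a_i^2 \quad (2)$$
$$S^2 = \sum_{i=1}^{n} a_i (X_i - \bar{X})^2$$
$$= \sum_{i=1}^{n} (a_i X_i^2 + a_i\bar{X}^2 - 2a_i X_i\bar{X})$$
$$= \sum_{i=1}^{n} a_i X_i^2 + \sum_{i=1}^{n} a_i\bar{X}^2 - \sum_{i=1}^{n} 2a_i\bar{X}X_i$$
$$= \sum_{i=1}^{n} a_i X_i^2 + \bar{X}^2 - 2\bar{X}^2 = \sum_{i=1}^{n} a_i X_i^2 - \bar{X}^2$$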
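Now,
$$E[S^2] = E\left(\sum_{i=1}^{n} a_i X_i^2 - \bar{X}^2\right) = \sum_{i=1}^{n} E[a_i X_i^2] - E[\bar{X}^2]$$
$$= \sum_{i=1}^{n} a_i E[X_i^2] - E[\bar{X}^2]$$
$$= \sum_{i=1}^{n} a_i(\sigma^2 + \mu^2) - (\mathrm{Var}[\bar{X}] + \mu^2)$$
$$= \sigma^2 + \mu^2 - \sigma^2\sum_{i=1}^{n} a_i^2 - \mu^2 \quad \text{[from (2)]}$$
$$= \sigma^2 - \sigma^2\sum_{i=1}^{n} a_i^2$$
$$= \left(1 - \sum_{i=1}^{n} a_i^2\right)\sigma^2$$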
In order to maximise the likelihood function, we need to minimize a.
Since −a < xi < a for all i and | xi |< a, therefore, a = max(| x1 |, . . . , | xn |).
Therefore, the ML estimator of a is max(| X1 |, . . . , | Xn |).
6. Let X1, X2, X3 ∼ iid Normal(µ, σ²). Given a random sample (−1, 0, 1), find the maximum likelihood estimate of σ².
a) 2/3
b) 7/12
c) 1/3
d) 5/12
Solution:
The ML estimator of σ² is $\frac{\sum_{i=1}^{n}(X_i - \hat{\mu}_{ML})^2}{n}$, where $\hat{\mu}_{ML} = \bar{X}$.
Given the sample −1, 0, 1, $\bar{X} = \frac{-1 + 0 + 1}{3} = 0$
Therefore, the ML estimate of σ² is $\frac{(-1)^2 + 0^2 + 1^2}{3} = \frac{2}{3}$.
7. Let X1, . . . , Xn be n i.i.d. samples of a random variable X. Let X have the PDF
$f(x) = (\alpha + 1)x^{\alpha}$, where 0 < x < 1.
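The likelihood function of a sample X1, X2, . . . , Xn is given by
$$L(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} f_X(x_i)$$
$$= (\alpha + 1)^n\,x_1^{\alpha} \cdots x_n^{\alpha}$$
$$\Rightarrow \log(L) = n\log(\alpha + 1) + \alpha(\log(x_1) + \cdots + \log(x_n))$$
Now,
$$\frac{d\log(L)}{d\alpha} = 0$$
$$\Rightarrow \frac{n}{\alpha + 1} = -[\log(x_1) + \cdots + \log(x_n)]$$
$$\Rightarrow \hat{\alpha}_{ML} = -1 - \frac{n}{\sum_{i=1}^{n}\log X_i}$$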
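(b) The mean of the random variable X is $\frac{\alpha + 1}{\alpha + 2}$. Find the estimator of α using the method of moments.
i. $\hat{\alpha}_{MME} = \frac{1 + 2M_1}{M_1 - 1}$
ii. $\hat{\alpha}_{MME} = \frac{1 - M_1}{M_1 - 1}$
iii. $\hat{\alpha}_{MME} = \frac{1 + M_1}{M_1 - 1}$
iv. $\hat{\alpha}_{MME} = \frac{1 - 2M_1}{M_1 - 1}$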
Solution:
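The expected value of X, E(X), is given as $\frac{\alpha + 1}{\alpha + 2}$.
Using the method of moments,
$$\frac{\alpha + 1}{\alpha + 2} = m_1$$
$$\alpha = \frac{1 - 2m_1}{m_1 - 1}$$
The estimator is
$$\hat{\alpha}_{MME} = \frac{1 - 2M_1}{M_1 - 1}$$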
8. Let X be a discrete random variable taking the values −1, 0, 1 with probabilities $P(X = -1) = \frac{p}{2}$, $P(X = 0) = \frac{p}{2}$, $P(X = 1) = 1 - p$. Let X1, . . . , Xn ∼ i.i.d. {−1, 0, 1}. Find the estimator of p using the method of moments.
(a) $\frac{2 - 2M_1}{3}$
(b) $\frac{2 + 2M_1}{3}$
(c) $\frac{1 + 2M_1}{3}$
(d) $\frac{2 + M_1}{3}$
Solution:
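The expected value of X, E(X), is given by
$$E[X] = \sum_{x} x\,p_X(x) = \left(-1 \times \frac{p}{2}\right) + \left(0 \times \frac{p}{2}\right) + (1 \times (1 - p)) = \frac{2 - 3p}{2}$$
Using the method of moments,
$$\frac{2 - 3p}{2} = m_1 \Rightarrow p = \frac{2 - 2m_1}{3}$$
The estimator is
$$\hat{p} = \frac{2 - 2M_1}{3}$$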
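9. Let X be a random variable with PDF
$$f_X(x) = (\lambda a)\,x^{\alpha - 1} e^{-\lambda x^{\alpha}}, \quad x > 0,$$
where α and a are constants. Find the maximum likelihood estimator of λ for n i.i.d. samples of X.
(a) $\frac{\sum_{i=1}^{n} X_i^{\alpha}}{n}$
(b) $\frac{n}{\sum_{i=1}^{n} X_i^{\alpha}}$
(c) $\frac{n}{\alpha\sum_{i=1}^{n} X_i^{\alpha}}$
(d) $\frac{\sum_{i=1}^{n} X_i^{\alpha}}{n\alpha}$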
Solution:
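Given,
$$f_X(x) = (\lambda a)\,x^{\alpha - 1} e^{-\lambda x^{\alpha}}, \quad x > 0$$
The likelihood function of a sample X1, X2, . . . , Xn is given by
$$L(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} f_X(x_i)$$
$$= (\lambda a)^n (x_1 \cdots x_n)^{\alpha - 1} e^{-\lambda(x_1^{\alpha} + \cdots + x_n^{\alpha})}$$
The likelihood is a function of the parameter, so we can ignore the factors that do not depend on λ. Therefore,
$$L = \lambda^n e^{-\lambda(x_1^{\alpha} + \cdots + x_n^{\alpha})}$$
$$\Rightarrow \log(L) = n\log(\lambda) - \lambda(x_1^{\alpha} + \cdots + x_n^{\alpha})$$
Now,
$$\frac{d\log(L)}{d\lambda} = 0$$
$$\Rightarrow \frac{n}{\lambda} = \sum_{i=1}^{n} x_i^{\alpha}$$
$$\Rightarrow \hat{\lambda} = \frac{n}{\sum_{i=1}^{n} X_i^{\alpha}}$$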
10. A random sample of 1000 television screens taken from the household of a city shows
that the average running time of television is 7 hours per day with a standard deviation
of 2 hours. Assume the distribution of measurements to be approximately normal.
Calculate a 99% confidence interval for the daily average television running hours.
Hint: Use P (−2.58 < Z < 2.58) = 0.99.
Solution:
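Given β = 0.99, n = 1000, $\bar{X} = 7$ and σ = 2.
To find: α such that $P(|\bar{X} - \mu| \le \alpha) = 0.99$
$$P\left(\left|\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}\right| \le \frac{\alpha}{\sigma/\sqrt{n}}\right) = 0.99$$
$$\implies P\left(|Z| \le \frac{\alpha}{\sigma/\sqrt{n}}\right) = 0.99 \quad \text{where } Z \sim \text{Normal}(0, 1)$$
$$\implies P\left(-\frac{\alpha}{\sigma/\sqrt{n}} \le Z \le \frac{\alpha}{\sigma/\sqrt{n}}\right) = 0.99$$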
11. The distribution of the diameter of screws produced by a certain machine is normally
distributed with µ and σ unknown. We observe a random sample
9.8, 10.2, 10.4, 9.8, 10.0, 10.2 and 9.6 (in cm).
Find a 95% confidence interval for the mean diameter of screws.
Hint: Use P (−2.447 < t6 < 2.447) = 0.95 and S(sample standard deviation) = 0.283.
(a) [10.74, 11.26]
(b) [9.74, 10.26]
(c) [7.47, 8.26]
(d) [7.98, 8.75]
Solution:
Given that S = 0.283, n = 7, β = 0.95
Now,
$$\frac{15}{\sigma/\sqrt{n}} = 1.96$$
$$\Rightarrow \sqrt{n} = 40 \times \frac{1.96}{15}$$
$$\Rightarrow n = 27.31$$
Statistics for Data Science - 2
Week 9 graded Assignment
Bayesian estimation
1. Suppose that the number of buses reaching a particular stop in an one-hour time period
follows the Poisson distribution with an unknown parameter λ. Previous records suggest
that the prior probabilities of λ are P (λ = 0.25) = 0.3 and P (λ = 0.20) = 0.7. If in a
particular one-hour time period seven buses reach the bus stop, find the posterior mode
of λ. Write your answer correct to two decimal places.
Solution:
Prior probabilities of λ are P (λ = 0.25) = 0.3 and P (λ = 0.20) = 0.7.
Solution:
Let p denote the probability of getting an even number.
Prior distribution of p is fp ∼ Uniform[0, 1].
It implies that fp (p) = 1
4. Call duration of daily stand up meetings of employees of a certain company follows the exponential distribution with an unknown parameter λ. Duration (in minutes) of last ten meetings are 20, 30, 35, 30, 25, 25, 20, 28, 34, 30. Find the Bayesian estimate (posterior mean) of λ using the prior distribution of Exp(1/15) for λ. Write your answer correct to two decimal places.
Solution:
Let Λ be the prior distribution of λ.
From the given information, $f_\Lambda(\lambda)$ ∼ Exp(1/15).
It implies that $f_\Lambda(\lambda) = \frac{1}{15}e^{-\lambda/15}$.
5. Marks of tenth class students of a school follow the normal distribution with an un-
known mean µ and variance 25. Marks of 10 students of the tenth class are 50, 45, 70,
60, 75, 90, 45, 60, 80, 75. Find the Bayesian estimate (posterior mean) of µ assuming
the Normal(50, 25) prior distribution. Write your answer correct to two decimal places.
Solution:
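We know that the normal distribution is conjugate to the normal distribution. That is, if the prior distribution of µ is Normal(µ0, σ0²) and the sample is taken from Normal(µ, σ²), then the posterior distribution of µ will be Normal with mean
$$\bar{X}\,\frac{n\sigma_0^2}{n\sigma_0^2 + \sigma^2} + \mu_0\,\frac{\sigma^2}{n\sigma_0^2 + \sigma^2}$$
Here the sample mean is $\bar{X} = \frac{650}{10} = 65$, with n = 10, σ² = 25, µ0 = 50 and σ0² = 25. Therefore,
$$\text{Posterior mean} = \frac{65 \times 10 \times 25}{10(25) + 25} + \frac{50 \times 25}{10(25) + 25} = \frac{16250}{275} + \frac{1250}{275} = 59.09 + 4.55 = 63.64$$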
Let p denote the probability of heads.
Given that prior of p is Beta(2, β) with an average of 0.4.
It implies that E[Beta(2, β)] = 0.4
$$\Rightarrow \frac{2}{2 + \beta} = 0.4$$
⇒ β = 3
7. One out of the last ten candidates wins a treasure hunt game. Previous record shows
fraction of winners follows the Beta(20, b) distribution with an average of 20%. Estimate
the long-term fraction of winners of the treasure hunt game. Write your answer correct
to two decimal places.
Solution:
Let the long-term fraction of winners (probability of winning) be denoted by p.
Previous data shows that fraction of winners follows the Beta(20, b) distribution with an
average of 20%.
It implies that E[Beta(20, b)] = 0.2
$$\Rightarrow \frac{20}{20 + b} = 0.2$$
⇒ b = 80
8. Rainfall in the monsoon season in Delhi follows normal distribution with mean µ and
variance 200 mm. Rainfall (in mm) registered in the 2021 monsoon are 600, 300, 450,
700, 850, 150, 200, 750. Prior information about the average rainfall is that it has mean
600 mm and variance 225 mm. Use the normal prior that matches your prior information
and find the posterior mean.
Solution:
Prior distribution of µ is given Normal with mean 600 and variance 225.
We know that the normal distribution is conjugate to the normal distribution. That is, if the prior distribution of µ is Normal(µ0, σ0²) and the sample is taken from Normal(µ, σ²), then the posterior distribution of µ will be Normal with mean
$$\bar{X}\,\frac{n\sigma_0^2}{n\sigma_0^2 + \sigma^2} + \mu_0\,\frac{\sigma^2}{n\sigma_0^2 + \sigma^2}$$
Therefore,
$$\text{Posterior mean} = \frac{500 \times 8 \times 225}{8(225) + 200} + \frac{600 \times 200}{8(225) + 200} = \frac{900000}{2000} + \frac{120000}{2000} = 450 + 60 = 510$$
9. Following frequency data shows the number of patients (n) arriving in an emergency
room between 12:00 AM and 6:00 AM.
n frequency n frequency
0 1 6 14
1 4 7 4
2 17 8 4
3 17 9 1
4 17 10+ 0
5 21
(i) Fit the data into Poisson distribution (Find the parameter). Write your answer
correct to two decimal places.
Solution:
We know that $\hat{\lambda} = \bar{X}$ is an estimate of λ.
$$\text{Sample mean, } \bar{X} = \frac{\sum_i f_i n_i}{\sum_i f_i}$$
$$= \frac{0 + 4 + 34 + 51 + 68 + 105 + 84 + 28 + 32 + 9}{1 + 4 + 17 + 17 + 17 + 21 + 14 + 4 + 4 + 1}$$
$$= \frac{415}{100} = 4.15$$
Therefore, $\hat{\lambda} = 4.15$
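The frequency-weighted mean is straightforward to compute from the table. In this sketch the dictionary `freq` (an illustrative name) maps each count n to its observed frequency, with the empty "10+" bin omitted.

```python
# Number of patients n -> number of nights on which n patients arrived.
freq = {0: 1, 1: 4, 2: 17, 3: 17, 4: 17, 5: 21, 6: 14, 7: 4, 8: 4, 9: 1}

total_obs = sum(freq.values())
weighted_sum = sum(n * f for n, f in freq.items())
lam_hat = weighted_sum / total_obs  # sample mean, used as the Poisson parameter estimate
print(total_obs, weighted_sum, lam_hat)
```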
(ii) Find an approximate 95% confidence interval using a normal approximation for the
error distribution.
(Use the following information:
sample variance S 2 = 3.40 and P (−0.36 < N(0, 0.034) < 0.36) = 0.95)
(a) [3.87, 5, 134]
(b) [3.79, 4.51]
(c) [3.12, 5.21]
(d) [4.01, 5.23]
Solution:
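Error, e, is given by
$$e = \hat{\lambda} - \lambda$$
Now, $E[\hat{\lambda} - \lambda] = E[\hat{\lambda}] - \lambda = \lambda - \lambda = 0$
$$\mathrm{Var}(\hat{\lambda} - \lambda) = \mathrm{Var}(\hat{\lambda}) = \frac{\sigma^2}{n} \approx \frac{s^2}{n} = \frac{3.4}{100} = 0.034$$
It is given that P(−0.36 < N(0, 0.034) < 0.36) = 0.95
Therefore δ = 0.36, and the approximate 95% confidence interval is [4.15 − 0.36, 4.15 + 0.36] = [3.79, 4.51].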
10. Following frequency table shows the number of bankruptcies (n) filed by customers in a
time period of one month. The data consists of last 200 months.
n frequency n frequency
0 13 6 8
1 26 7 4
2 48 8 1
3 44 9 1
4 39 10+ 0
5 16
(i) Fit the data into Poisson distribution (Find the parameter). Write your answer
correct to two decimal places.
Solution:
We know that $\hat{\lambda} = \bar{X}$ is an estimate of λ.
$$\text{Sample mean, } \bar{X} = \frac{\sum_i f_i n_i}{\sum_i f_i}$$
$$= \frac{0 + 26 + 96 + 132 + 156 + 80 + 48 + 28 + 8 + 9}{13 + 26 + 48 + 44 + 39 + 16 + 8 + 4 + 1 + 1}$$
$$= \frac{583}{200} = 2.91$$
Therefore, $\hat{\lambda} = 2.91$
(ii) Find an approximate 95% confidence interval using a normal approximation for the
error distribution.
(Use the following information:
sample variance S 2 = 2.852 and P (−0.23 < N(0, 0.0142) < 0.23) = 0.95)
(a) [1.97, 4.14]
(b) [2.08, 3.34]
(c) [2.68, 3.14]
(d) [2.01, 4.232]
Solution:
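Error, e, is given by
$$e = \hat{\lambda} - \lambda$$
Now, $E[\hat{\lambda} - \lambda] = E[\hat{\lambda}] - \lambda = \lambda - \lambda = 0$
$$\mathrm{Var}(\hat{\lambda} - \lambda) = \mathrm{Var}(\hat{\lambda}) = \frac{\sigma^2}{n} \approx \frac{s^2}{n} = \frac{2.852}{200} = 0.0142$$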
Statistics for Data Science - 2
1. Let p be the proportion of students in IITM online degree programme who approve the
online proctored exams. The students’ committee is going to take a random sample of
n = 40 students from IITM online degree programme and ask if they approve the online
proctored exams. Suppose 10 out of the 40 students answered yes.
i) Calculate the posterior distribution if we use a continuous Uniform[0, 1] prior.
a) Beta(10, 30)
b) Beta(11, 31)
c) Beta(10, 40)
d) Beta(11, 40)
Solution:
Let fp (p) denote the prior distribution of p.
Then, by given information fp (p) = 1, since, p ∼ Uniform[0, 1].
If X1 , X2 , . . . , Xn ∼ iid Bernoulli(p)
⇒ posterior density = Beta(w + 1, n − w + 1) where w is the number of success.
Here n = 40, w = 10 ⇒ posterior density = Beta(11, 31)
ii) Find the Bayesian estimate (posterior mean) of p. Enter your answer correct to two
decimals accuracy.
Solution:
posterior mean $= \frac{11}{11 + 31} \approx 0.26$
iii) Find the Bayesian estimate (posterior mean) with Beta(5, 5) prior. Enter your an-
swer correct to two decimals accuracy.
Solution:
Given the prior distribution is Beta(α, β)
X1 , X2 , . . . , Xn ∼ iid Bernoulli(p)
⇒ posterior density = Beta(w + α, n − w + β)
Here n = 40, w = 10, α = 5, β = 5
⇒ posterior density = Beta(15, 35)
posterior mean $= \frac{15}{15 + 35} = 0.30$
iv) Find the Bayesian estimate (posterior mean) with Beta(10, 10) prior. Enter your
answer correct to two decimals accuracy.
Solution:
Here n = 40, w = 10, α = 10, β = 10
⇒ posterior density = Beta(20, 40)
posterior mean $= \frac{20}{20 + 40} \approx 0.33$
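The three posterior means in this question follow the same conjugate update, which can be sketched as below. The helper names `beta_posterior` and `beta_mean` are illustrative assumptions, not part of the course material; a Uniform[0, 1] prior is treated as Beta(1, 1).

```python
def beta_posterior(w, n, alpha=1.0, beta=1.0):
    """Posterior Beta parameters after w successes in n Bernoulli trials."""
    return alpha + w, beta + (n - w)

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

# 10 "yes" answers out of 40, under the three priors used in this question.
for prior in [(1, 1), (5, 5), (10, 10)]:
    a, b = beta_posterior(10, 40, *prior)
    print(prior, (a, b), beta_mean(a, b))
```

A stronger prior centred at 0.5 pulls the estimate further from the sample proportion 0.25, which is why the posterior mean rises from about 0.26 to about 0.33.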
2. The new method of screening for a disease fails to detect the presence of the disease in
20% of the patients from prior experience. A new random sample of n = 100 patients
who are known to have the disease is screened using the new method. Out of these 100
patients, the new method failed to detect the disease in 20 cases. Use a Beta(2, β) with
a suitable β to estimate the failure fraction. Enter your answer correct to two decimals
accuracy.
Solution:
The prior is given as Beta(2, β). From prior experience, the new method fails to detect
the disease in 20% of the patients, so we choose β such that the prior mean is 0.20:
$$\frac{2}{2 + \beta} = 0.20 \Rightarrow \beta = 8.$$
If X1 , X2 , . . . , Xn ∼ iid Bernoulli(p)
⇒ posterior density = Beta(w + α, n − w + β)
Here n = 100, w = 20, α = 2, β = 8
⇒ posterior density = Beta(22, 88)
posterior mean $= \frac{22}{22 + 88} = 0.20$
3. Suppose that the number of customers arriving in a restaurant in a one day time period
follows the Poisson distribution with unknown parameter λ. Previous records suggest
that the prior probabilities of λ are P (λ = 10) = 0.4 and P (λ = 8) = 0.6. If on a
particular day 15 people arrive at the restaurant, find the posterior mode of λ.
Solution:
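One way to carry out this computation is to weight each candidate value of λ by its prior probability times the Poisson likelihood of observing 15 arrivals, then normalise. The sketch below does exactly that; `poisson_pmf` is a small illustrative helper written here, not a library call.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

prior = {10: 0.4, 8: 0.6}   # prior probabilities of lambda from the question
observed = 15               # customers seen on the particular day

# Unnormalised posterior weights: prior * likelihood.
unnorm = {lam: p * poisson_pmf(observed, lam) for lam, p in prior.items()}
total = sum(unnorm.values())
posterior = {lam: w / total for lam, w in unnorm.items()}

# The posterior mode is the value of lambda with the largest posterior probability.
posterior_mode = max(posterior, key=posterior.get)
print(posterior, posterior_mode)
```

The observation 15 is far more likely under λ = 10 than under λ = 8, which outweighs the smaller prior probability of λ = 10.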
And
a) $\frac{Y_n}{n}$
b) $\frac{Y_n}{n+1}$
c) $\frac{nY_n}{n+1}$
d) $\frac{nY_n}{n+2}$
Solution:
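If $X_1, \ldots, X_n \sim$ iid Normal$(\mu, \sigma^2)$, and the prior is Normal$(\mu_0, \sigma_0^2)$,
then
$$\text{posterior mean} = \bar{X}\,\frac{n\sigma_0^2}{n\sigma_0^2 + \sigma^2} + \mu_0\,\frac{\sigma^2}{n\sigma_0^2 + \sigma^2}$$
Here $\sigma = 1$, $\mu_0 = 0$, $\sigma_0^2 = 1$
$$\Rightarrow \text{posterior mean} = \frac{X_1 + \ldots + X_n}{n} \cdot \frac{n \times 1}{n \times 1 + 1} = \frac{X_1 + \ldots + X_n}{n + 1}$$
$$\Rightarrow \text{posterior mean} = \frac{Y_n}{n+1}$$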
5. The marks distribution of IITM students in the end semester exam follows normal distri-
bution with unknown mean µ and variance 20. A random sample of marks of 8 students
are:
60, 60, 65, 65, 70, 70, 72, 75.
i) Assume that the prior distribution is Normal(50, 5). Find the posterior mean of µ .
Enter your answer correct to two decimals accuracy.
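Solution:
$$\bar{X} = \frac{60 + 60 + 65 + 65 + 70 + 70 + 72 + 75}{8} = 67.125$$
Here $\bar{X} = 67.125$, $\sigma^2 = 20$, $n = 8$, $\mu_0 = 50$, $\sigma_0^2 = 5$
$$\text{posterior mean} = 67.125 \cdot \frac{8 \times 5}{8 \times 5 + 20} + 50 \cdot \frac{20}{8 \times 5 + 20} = 67.125 \cdot \frac{40}{60} + 50 \cdot \frac{20}{60} \approx 61.42$$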
ii) Assume that the prior distribution is Normal(50, 25). Find the posterior mean of µ .
Enter your answer correct to two decimals accuracy.
Solution:
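$$\bar{X} = \frac{60 + 60 + 65 + 65 + 70 + 70 + 72 + 75}{8} = 67.125$$
Here $\bar{X} = 67.125$, $\sigma^2 = 20$, $n = 8$, $\mu_0 = 50$, $\sigma_0^2 = 25$
$$\text{posterior mean} = 67.125 \cdot \frac{8 \times 25}{8 \times 25 + 20} + 50 \cdot \frac{20}{8 \times 25 + 20} = 67.125 \cdot \frac{200}{220} + 50 \cdot \frac{20}{220} \approx 65.57$$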
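Both parts use the same weighted average of the sample mean and the prior mean, so the arithmetic can be sketched once. The function name `normal_posterior_mean` is a hypothetical helper; the formula is the one stated in the solution for a normal likelihood with known variance and a normal prior.

```python
from statistics import mean

def normal_posterior_mean(samples, sigma2, mu0, sigma0_2):
    """Posterior mean of mu for Normal(mu, sigma2) data and a Normal(mu0, sigma0_2) prior."""
    n = len(samples)
    xbar = mean(samples)
    weight = n * sigma0_2 / (n * sigma0_2 + sigma2)  # weight on the sample mean
    return weight * xbar + (1 - weight) * mu0

marks = [60, 60, 65, 65, 70, 70, 72, 75]
post_mean_i = normal_posterior_mean(marks, sigma2=20, mu0=50, sigma0_2=5)
post_mean_ii = normal_posterior_mean(marks, sigma2=20, mu0=50, sigma0_2=25)
print(post_mean_i, post_mean_ii)
```

With the wider prior (variance 25) the weight on the sample mean rises from 40/60 to 200/220, so the estimate moves closer to 67.125.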
6. Suppose X is a discrete random variable taking values {1, 2, 3} with respective proba-
bilities {p, 2(1 − p)/3, (1 − p)/3}, where 0 ≤ p ≤ 1 is a parameter. Consider the samples
1, 1, 3, 1, 3, 2, 1, 2, 3, 2 taken from X.
Use a Uniform[0, 1] prior on p to find the posterior mean. Enter your answer correct to
two decimals accuracy.
Solution:
Let fp (p) denote the prior distribution of p.
Then, by given information fp (p) = 1, since, p ∼ Uniform[0, 1].
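With a flat prior the posterior is proportional to the likelihood. Each observed 1 contributes a factor p and each observed 2 or 3 contributes a constant multiple of (1 − p), so the posterior is proportional to p^(number of 1s) (1 − p)^(number of 2s and 3s), which is a Beta density. The sketch below counts the samples and evaluates the resulting Beta mean; variable names are illustrative.

```python
from collections import Counter

samples = [1, 1, 3, 1, 3, 2, 1, 2, 3, 2]
counts = Counter(samples)

n_ones = counts[1]                 # factors of p in the likelihood
n_other = counts[2] + counts[3]    # factors of (1 - p), up to constants

# Uniform prior = Beta(1, 1), so the posterior is Beta(n_ones + 1, n_other + 1).
a, b = n_ones + 1, n_other + 1
posterior_mean = a / (a + b)
print((a, b), posterior_mean)
```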
7. The following ten samples are taken from the Geometric(p):
2, 4, 10, 8, 12, 6, 14, 6, 3, 5.
Find the posterior mean of p using Uniform[0, 1] prior. Enter your answer correct to two
decimals accuracy.
Solution:
If X1 , . . . , Xn ∼ iid Geometric(p), and the prior is Uniform[0, 1],
then posterior density = Beta(n + 1, x1 + x2 + . . . + xn − n + 1)
Here n = 10, x1 + x2 + . . . + xn = 2 + 4 + 10 + 8 + 12 + 6 + 14 + 6 + 3 + 5 = 70
⇒ posterior density = Beta(11, 61).
⇒ posterior mean $= \frac{11}{11 + 61} \approx 0.15$
8. Consider the samples 7, 5, 0, 2, 10, 4, 9, 8, 3 taken from Poisson(λ), where λ is unknown.
Using a Gamma(4, 11) prior, find the posterior mean of λ. Enter your answer correct to
one decimal accuracy.
Solution:
If X1 , . . . , Xn ∼ iid Poisson(λ), and the prior is Gamma(α, β)
then posterior density = Gamma(x1 + x2 + . . . + xn + α, β + n)
Here n = 9, x1 + x2 + . . . + xn = 7 + 5 + 0 + 2 + 10 + 4 + 9 + 8 + 3 = 48, α = 4, β = 11
⇒ posterior density = Gamma(52, 20)
⇒ posterior mean $= \frac{52}{20} = 2.6$
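The Gamma-Poisson update used above is short enough to check directly. This is a minimal sketch; `gamma_poisson_posterior` is a hypothetical helper name, and Gamma(α, β) is taken in the shape-rate form so that its mean is α/β, as in the solution.

```python
def gamma_poisson_posterior(samples, alpha, beta):
    """Posterior Gamma(shape, rate) for Poisson data under a Gamma(alpha, beta) prior."""
    return alpha + sum(samples), beta + len(samples)

samples = [7, 5, 0, 2, 10, 4, 9, 8, 3]
a_post, b_post = gamma_poisson_posterior(samples, alpha=4, beta=11)
posterior_mean = a_post / b_post   # mean of Gamma(shape, rate) is shape / rate
print((a_post, b_post), posterior_mean)
```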
9. The number of defects per 10 meters of cloth produced by a weaving machine has the
Poisson distribution with mean λ. You examine 100 meters of cloth produced by the
machine and observe 61 defects. Your prior belief about λ is that it has mean 6 and
standard deviation 2. Use a Gamma(α, β) prior that matches your prior belief and find
the posterior distribution.
a) Gamma(70, 111.5)
b) Gamma(70, 11.5)
c) Gamma(61, 11.5)
d) Gamma(61, 13)
Solution:
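Prior is Gamma(α, β) with mean 6 and standard deviation 2.
$$\Rightarrow \frac{\alpha}{\beta} = 6 \quad \text{and} \quad \frac{\alpha}{\beta^2} = 4$$
$$\Rightarrow \alpha = 6\beta \quad \text{and} \quad \alpha = 4\beta^2$$
⇒ β = 1.5 and α = 9.
100 meters of cloth correspond to n = 10 units of 10 meters, with 61 defects in total.
⇒ posterior density = Gamma(α + 61, β + 10) = Gamma(70, 11.5), which is option b).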
10. Assume that the time that elapses from one call to the next at a 911 call center has
the exponential distribution with parameter λ. The time elapsed between ten calls (in
minutes) are: 3, 4, 6, 1, 7, 8, 2, 5, 1. Your prior belief about λ is that it has mean 3.5 and
standard deviation 1. Use a Gamma(α, β) prior that matches your prior belief and find
the posterior mean. Enter your answer correct to two decimals accuracy.
Solution:
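Prior is Gamma(α, β) with mean 3.5 and standard deviation 1.
$$\Rightarrow \frac{\alpha}{\beta} = 3.5 \quad \text{and} \quad \frac{\alpha}{\beta^2} = 1$$
$$\Rightarrow \alpha = 3.5\beta \quad \text{and} \quad \alpha = \beta^2$$
⇒ β = 3.5 and α = 12.25.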
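The remaining steps can be sketched numerically. The code below matches a Gamma(α, β) prior to a given mean and standard deviation, then applies the standard conjugate update for exponential data with rate λ, namely Gamma(α + n, β + Σx). The helper names are hypothetical, and the sketch assumes the nine listed gaps are the full data set.

```python
def gamma_from_moments(mean, sd):
    """Shape and rate of a Gamma distribution with the given mean and standard deviation."""
    rate = mean / sd ** 2      # from mean = shape / rate and variance = shape / rate^2
    shape = mean * rate
    return shape, rate

def gamma_exponential_posterior(samples, alpha, beta):
    """Posterior Gamma(shape, rate) for Exponential(rate) data under a Gamma(alpha, beta) prior."""
    return alpha + len(samples), beta + sum(samples)

gaps = [3, 4, 6, 1, 7, 8, 2, 5, 1]          # minutes between successive calls
alpha, beta = gamma_from_moments(3.5, 1.0)  # prior belief: mean 3.5, sd 1
a_post, b_post = gamma_exponential_posterior(gaps, alpha, beta)
posterior_mean = a_post / b_post
print((alpha, beta), (a_post, b_post), posterior_mean)
```

The same `gamma_from_moments` helper reproduces the prior of the cloth-defect question: a mean of 6 and a standard deviation of 2 give shape 9 and rate 1.5.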
11. The frequency data on number of deaths per month due to a certain disease is given
below:
Table 9.1.P
(i) Fit a Poisson distribution to the given frequency table and find the parameter.
Write your answer correct to two decimal places.
Solution:
Let $\hat{\lambda} = \bar{X}$ be an estimate of $\lambda$.
$$\text{Sample mean } (\bar{X}) = \frac{\sum n_i f_i}{\sum f_i}$$
Therefore, λ̂ = 0.47.
Therefore, the distribution is Poisson(0.47).
As we can observe from the table that the actual count is close to the expected
count, therefore, Poisson(0.47) is a reasonable fit for the given data.
(ii) Find an approximate 95% confidence interval using a normal approximation for the
sampling distribution.
(Use the following information:
sample variance S 2 = 0.498 and P (−0.07 < N(0, 0.0014) < 0.07) = 0.95)
(a) [0.40, 0.47]
(b) [0.40, 0.54]
(c) [0.44, 0.54]
(d) [0.44, 0.52]
Solution:
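Error: $\hat{\lambda} - \lambda$
$E[\hat{\lambda} - \lambda] = 0$
$$\mathrm{Var}(\hat{\lambda} - \lambda) = \mathrm{Var}(\hat{\lambda}) = \frac{\sigma^2}{n} \approx \frac{S^2}{n}$$
Therefore, we will assume the sampling distribution to be Normal$\left(0, \frac{s^2}{n}\right)$.
Given that, sample variance ($s^2$) = 0.498.
Therefore, the sampling distribution is Normal(0, 0.0014).
Now, 95% confidence interval for λ is $[\hat{\lambda} - \delta_1, \hat{\lambda} - \delta_2]$.
It is given that $P(-0.07 < N(0, 0.0014) < 0.07) = 0.95$, therefore, $\delta_1 = 0.07$ and $\delta_2 = -0.07$, and the interval is $[0.47 - 0.07, 0.47 + 0.07] = [0.40, 0.54]$, which is option (b).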
12. The number of emails received by Neeti in intervals of one hour is given in Table 9.2.P.
Table 9.2.P: Emails received by Neeti in one-hour interval for the last 100 hours.
(i) Fit a Poisson distribution to the given frequency table and find the parameter.
Write your answer correct to two decimal places.
Solution:
Let λ̂ = X̄ be an estimate of λ.
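$$\text{Sample mean } (\bar{X}) = \frac{\sum n_i f_i}{\sum f_i}$$
$$\Rightarrow \bar{X} = \frac{(1 \times 15) + (2 \times 22) + (3 \times 22) + (4 \times 17) + (5 \times 10) + (6 \times 5) + (7 \times 3) + (8 \times 1)}{5 + 15 + 22 + 22 + 17 + 10 + 5 + 3 + 1}$$
$$\Rightarrow \bar{X} = \frac{302}{100} = 3.02$$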
Therefore, λ̂ = 3.02.
Therefore, the distribution is Poisson(3.02).
We can check the fit, the same way we did in the previous question.
(ii) Find an approximate 95% confidence interval using a normal approximation for the
sampling distribution.
(Use the following information:
sample variance S 2 = 3.05 and P (−0.34 < N(0, 0.0305) < 0.34) = 0.95)
(a) [1.89, 4.15]
(b) [2.08, 4.34]
(c) [2.68, 3.36]
(d) [1.89, 3.35]
Solution:
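Error: $\hat{\lambda} - \lambda$
$E[\hat{\lambda} - \lambda] = 0$
$$\mathrm{Var}(\hat{\lambda} - \lambda) = \mathrm{Var}(\hat{\lambda}) = \frac{\sigma^2}{n} \approx \frac{S^2}{n}$$
Therefore, we will assume the sampling distribution to be Normal$\left(0, \frac{s^2}{n}\right)$.
Given that, sample variance ($s^2$) = 3.05.
Therefore, the sampling distribution is Normal(0, 0.0305).
Now, 95% confidence interval for λ is $[\hat{\lambda} - \delta_1, \hat{\lambda} - \delta_2]$.
It is given that $P(-0.34 < N(0, 0.0305) < 0.34) = 0.95$, therefore, $\delta_1 = 0.34$ and $\delta_2 = -0.34$, and the interval is $[3.02 - 0.34, 3.02 + 0.34] = [2.68, 3.36]$, which is option (c).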
Statistics for Data Science - 2
1. The average marks scored by students of a school in their board exams is reported to
be 400 with a standard deviation of 5. You suspect that the average may be lower,
possibly 390, and decide to sample students to find their marks.
(a) What sample size do you need for a test at the significance level 0.05 and power
0.95?
Answer: 3
Solution:
Let the random variable X represent the marks obtained by students in their
board exams with expected value µ and standard deviation σ.
Given µ = 400 and σ = 5.
Consider the Null and alternative hypothesis:
H0 : µ = 400
HA : µ < 400
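Test statistic: $\bar{X}$
Test: Reject $H_0$ if $\bar{X} < c$, at $\alpha = 0.05$
$$\alpha = P(\text{reject } H_0 \mid H_0 \text{ is true}) = P\left(\frac{\bar{X} - 400}{5/\sqrt{n}} < \frac{c - 400}{5/\sqrt{n}}\right)$$
$$\Rightarrow 0.05 = F_Z\left(\frac{c - 400}{5/\sqrt{n}}\right)$$
$$\Rightarrow F_Z^{-1}(0.05) = \frac{c - 400}{5/\sqrt{n}}$$
$$\Rightarrow -1.64 = \frac{c - 400}{5/\sqrt{n}}$$
$$\Rightarrow c = -1.64 \times \frac{5}{\sqrt{n}} + 400 \quad \cdots (1)$$
Again, when alternative hypothesis is true, we have
$$\frac{\bar{X} - 390}{5/\sqrt{n}} \sim \text{Normal}(0, 1)$$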
(b) Find the critical value c. Enter the answer correct to two decimal places.
394.73, [394.70, 395.30]
Solution: Substituting the value n = 3 in (2), we get c = 394.73.
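The sample-size calculation for a one-sided z-test with known σ can be sketched as follows. Equating the critical value implied by the significance level with the one implied by the power gives √n = (z_α + z_power)σ / |µ0 − µ1|. The function name is a hypothetical helper, and the sketch uses exact normal quantiles from Python's `statistics.NormalDist` instead of the rounded table value 1.64.

```python
import math
from statistics import NormalDist

def sample_size_one_sided(mu0, mu1, sigma, alpha, power):
    """Smallest n for a one-sided z-test of mu0 against mu1 with the given alpha and power."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)   # quantile controlling the type I error
    z_power = z.inv_cdf(power)       # quantile controlling the type II error
    n_exact = ((z_alpha + z_power) * sigma / abs(mu0 - mu1)) ** 2
    return math.ceil(n_exact)

n = sample_size_one_sided(400, 390, 5, alpha=0.05, power=0.95)

# Critical value from the significance-level condition (lower-tailed test).
c = 400 - NormalDist().inv_cdf(0.95) * 5 / math.sqrt(n)
print(n, c)
```

The same helper applied to the IQ question of the Week 10 practice set (µ0 = 107, µ1 = 110, σ = 4) gives n = 20.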
2. Suppose X ∼ Normal(µ, 9). For n = 100 iid samples of X, the observed sample mean
is 11.8. What conclusion would a z-test reach if the null hypothesis assumes µ = 10.5
(against an alternative hypothesis µ ≠ 10.5)?
Solution:
Given, X ∼ Normal(µ, 9).
X1 , . . . , X100 ∼ iid X. For 100 iid samples of X, X̄ ∼ Normal(µ, 9/100)
Sample mean, X̄ = 11.8.
Consider the null and alternative hypotheses: Null hypothesis,
H0 : µ = 10.5
HA : µ ≠ 10.5
Test for α = 0.05
Test: Reject H0 if |X̄ − µ| > c
3. Let X1 , . . . , X100 be a sample from a normal distribution having a variance of 25. We
wish to test the hypothesis H0 : µ = 0 versus HA : µ = 1.5. Consider a test that rejects
H0 for X̄ > c.
(a) Find the value of c at a significance level α = 0.05. Enter the answer correct to
two decimal places.
0.82
Solution:
Given, X ∼ Normal(µ, 25).
X1 , . . . , X100 ∼ iid X. For 100 iid samples of X, X̄ ∼ Normal(µ, 25/100)
The null and alternative hypothesis are
H0 : µ = 0
HA : µ > 0
Test for α = 0.05
Test: Reject H0 if X̄ > c
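$$\alpha = P(\bar{X} > c \mid \mu = 0)$$
$$\Rightarrow \alpha = P\left(\frac{\bar{X}}{\sqrt{25/100}} > \frac{c}{\sqrt{25/100}}\right)$$
$$\Rightarrow \alpha = P(z > 2c)$$
$$\Rightarrow 0.05 = 1 - F_z(2c)$$
$$\Rightarrow c = \frac{1}{2} F_z^{-1}(0.95) = 0.8224$$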
(b) Find the power of the test. Enter the answer correct to two decimal places.
0.91309, [0.90, 0.93]
Solution:
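$$\text{Power} = 1 - \beta = P(\bar{X} > c \mid \mu = 1.5)$$
$$\Rightarrow 1 - \beta = P\left(\frac{\bar{X} - 1.5}{\sqrt{25/100}} > \frac{c - 1.5}{\sqrt{25/100}}\right)$$
$$\Rightarrow 1 - \beta = P\left(z > \frac{c - 1.5}{\sqrt{25/100}}\right)$$
$$\Rightarrow 1 - \beta = 1 - F_z(2(c - 1.5))$$
Substituting the value of c from the above problem,
$$1 - \beta = 1 - F_z(2(0.8224 - 1.5))$$
Therefore,
Power = 1 − β = 0.9123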
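The critical value and the power can be reproduced with the standard normal distribution. This is a minimal sketch using `statistics.NormalDist`; variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()                 # standard normal
se = (25 / 100) ** 0.5           # standard deviation of the sample mean, sqrt(25/100)

# Critical value: reject H0 when the sample mean exceeds c, with alpha = 0.05 under mu = 0.
c = z.inv_cdf(0.95) * se

# Power: probability of rejecting when the true mean is 1.5.
power = 1 - z.cdf((c - 1.5) / se)
print(c, power)
```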
4. A manufacturer supplies fuses, approximately 90% of which function properly. A new
process is initiated whose purpose is to increase the proportion of properly functioning
fuses. We obtain a random sample of 100 such fuses manufactured by the new process
and find that 8 of them are not functioning properly. Let p denote the proportion
of properly functioning fuses. (Use normal approximation to binomial)
Solution:
X1 , . . . , X100 ∼ iid Bernoulli(p).
The null and alternative hypothesis are:
H0 : p = 0.9
HA : p > 0.9
Sample mean $(\bar{X}) = \frac{92}{100} = 0.92$
Test for α = 0.05
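Test statistic, $T = X_1 + \ldots + X_{100} \sim \text{Binomial}(n, p)$, which can be normally approximated as
$$T \approx \text{Normal}(100p, 100p(1 - p)), \qquad \bar{X} \approx \text{Normal}\left(p, \frac{p(1 - p)}{100}\right)$$
Test: Reject $H_0$ if $\bar{X} > c$
$$\alpha = P(\bar{X} > c \mid p = 0.9)$$
$$\alpha = P\left(\frac{\bar{X} - p}{\sqrt{p(1 - p)/n}} > \frac{c - p}{\sqrt{p(1 - p)/n}}\right)$$
$$\alpha = P\left(z > \frac{c - 0.9}{\sqrt{0.9 \times 0.1/100}}\right)$$
$$\Rightarrow 0.05 = 1 - F_z\left(\frac{c - 0.9}{0.03}\right)$$
$$\Rightarrow c = 0.9 + 0.03 \times F_z^{-1}(0.95)$$
c = 0.9493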
Since $\bar{X} = 0.92 < c$, the z-test at significance level α = 0.05 will accept H0.
Test for α = 0.10
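Test statistic, $T = X_1 + \ldots + X_{100} \sim \text{Binomial}(n, p)$, which can be normally approximated as
$$T \approx \text{Normal}(100p, 100p(1 - p)), \qquad \bar{X} \approx \text{Normal}\left(p, \frac{p(1 - p)}{100}\right)$$
Test: Reject $H_0$ if $\bar{X} > c$
$$\alpha = P(\bar{X} > c \mid p = 0.9)$$
$$\alpha = P\left(\frac{\bar{X} - p}{\sqrt{p(1 - p)/n}} > \frac{c - p}{\sqrt{p(1 - p)/n}}\right)$$
$$\alpha = P\left(z > \frac{c - 0.9}{\sqrt{0.9 \times 0.1/100}}\right)$$
$$\Rightarrow 0.10 = 1 - F_z\left(\frac{c - 0.9}{0.03}\right)$$
$$\Rightarrow c = 0.9 + 0.03 \times F_z^{-1}(0.90)$$
⇒ c = 0.9384
Since $\bar{X} = 0.92 < c$, the z-test at significance level α = 0.10 will accept H0.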
Hence, options (i) and (iii) are correct.
5. A commonly prescribed drug for relieving nervous tension is believed to be only 25%
effective. To determine if a new drug is superior in providing relief, suppose that 100
people who were suffering with nervous tension are chosen at random and inoculated.
If more than 36 of them are found to be relieved, we reject the null hypothesis that
p = 1/4 and the new drug will be considered superior to the one presently in use. (Use
normal approximation to binomial)
Solution: Since we will reject the null hypothesis if more than 36 out of 100
patients are found to be relieved, 36 is the critical value.
(b) Find P (Type I error). Enter the answer correct to four decimal places.
[0.0038, 0.0060]
Solution:
H0 : p = 0.25
HA : p > 0.25
Given, critical value (c) = 36
Test statistic, T = X1 + . . . + X100 ∼ Binomial(100, p) which can be normally
approximated as
T ≈ Normal(100p, 100p(1 − p))
Test: Reject H0 if T > c that is T > 36
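$$\alpha = P(T > c \mid p = 0.25)$$
$$\alpha = P(T > 36 \mid p = 0.25)$$
$$\alpha = P\left(z > \frac{36 - 100p}{\sqrt{100p(1 - p)}}\right)$$
$$\alpha = P\left(z > \frac{36 - 100(0.25)}{\sqrt{100 \times 0.25(0.75)}}\right)$$
$$\alpha = P\left(z > \frac{11}{\sqrt{18.75}}\right)$$
$$\alpha = 1 - F_z\left(\frac{11}{\sqrt{18.75}}\right)$$
$$\alpha = 1 - F_z(2.54)$$
P (Type I error) = α = 0.0055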
(c) Find P (Type II error) for p = 1/2. Enter the answer correct to four decimal
places.
[0.0024, 0.0035]
Solution:
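$$P(\text{Type II error}) = \beta = P(T \le c \mid p = 0.5)$$
$$\beta = P(T \le 36 \mid p = 0.5)$$
$$\beta = P\left(z \le \frac{36 - 100p}{\sqrt{100p(1 - p)}}\right)$$
$$\beta = P\left(z \le \frac{36 - 100(0.5)}{\sqrt{100 \times 0.5(0.5)}}\right)$$
$$\beta = P\left(z \le \frac{-14}{\sqrt{25}}\right)$$
$$\beta = F_z\left(\frac{-14}{5}\right)$$
$$\beta = F_z(-2.8)$$
P (Type II error) = β = 0.0025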
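Both error probabilities come from the same normal approximation to the binomial count, evaluated under different values of p. The sketch below follows the solution in using no continuity correction; `rejection_prob` is a hypothetical helper name.

```python
from statistics import NormalDist

def rejection_prob(n, p, threshold):
    """Approximate P(T > threshold) for T ~ Binomial(n, p), using Normal(np, np(1-p))."""
    mean = n * p
    sd = (n * p * (1 - p)) ** 0.5
    return 1 - NormalDist(mean, sd).cdf(threshold)

alpha = rejection_prob(100, 0.25, 36)      # type I error: reject although p = 0.25
beta = 1 - rejection_prob(100, 0.5, 36)    # type II error: fail to reject although p = 0.5
print(alpha, beta)
```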
6. The proportion of adults living in a small town who are college graduates is estimated
to be p = 0.6. To test this hypothesis against the alternative p < 0.6, you decide to
take a sample of adults from the town.
(a) What sample size do you need for a test (against the alternative hypothesis that
p = 0.4) at a significance level of 0.10 and power of 0.90?
40
Solution:
Null hypothesis, H0 : p = 0.6
Alternate hypothesis, HA : p < 0.6
Given, α = 0.10 and power 1 − β = 0.90
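Test statistic, $T \sim \text{Binomial}(n, p)$, which can be normally approximated as
$$T \approx \text{Normal}(np, np(1 - p)), \qquad \bar{X} \approx \text{Normal}\left(p, \frac{p(1 - p)}{n}\right)$$
Test: Reject $H_0$ if $\bar{X} < c$
$$\alpha = P(\bar{X} < c \mid p = 0.6)$$
$$\alpha = P\left(\frac{\bar{X} - p}{\sqrt{p(1 - p)/n}} < \frac{c - p}{\sqrt{p(1 - p)/n}}\right)$$
$$\alpha = P\left(z < \frac{c - p}{\sqrt{p(1 - p)/n}}\right)$$
$$\alpha = P\left(z < \frac{c - 0.6}{\sqrt{0.6 \times 0.4/n}}\right)$$
$$\Rightarrow 0.10 = F_z\left(\frac{c - 0.6}{\sqrt{0.24/n}}\right)$$
$$\Rightarrow c = 0.6 + \sqrt{\frac{0.24}{n}}\, F_z^{-1}(0.10)$$
$$c = 0.6 - \frac{0.6278}{\sqrt{n}} \quad (3)$$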
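Now, power
$$1 - \beta = P(\bar{X} < c \mid p = 0.4)$$
$$1 - \beta = P\left(\frac{\bar{X} - p}{\sqrt{p(1 - p)/n}} < \frac{c - p}{\sqrt{p(1 - p)/n}}\right)$$
$$1 - \beta = P\left(z < \frac{c - p}{\sqrt{p(1 - p)/n}}\right)$$
$$1 - \beta = P\left(z < \frac{c - 0.4}{\sqrt{0.4 \times 0.6/n}}\right)$$
$$\Rightarrow 0.90 = F_z\left(\frac{c - 0.4}{\sqrt{0.24/n}}\right)$$
$$\Rightarrow c = 0.4 + \sqrt{\frac{0.24}{n}}\, F_z^{-1}(0.90)$$
$$c = 0.4 + \frac{0.6278}{\sqrt{n}} \quad (4)$$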
Solving equations (3) and (4),
n ≈ 39.41
n = 40
(b) Find the critical value at a significance level of 0.10. Enter the answer correct to
two decimal places.
[0.48, 0.52]
Solution:
Substitute n = 40 in equation (3),
c = 0.5
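Equations (3) and (4) can be solved in one step: setting the two expressions for c equal gives √n = (z_α √(p0(1 − p0)) + z_power √(p1(1 − p1))) / |p0 − p1|. The sketch below implements that; the function name is a hypothetical helper and exact normal quantiles are used in place of rounded table values.

```python
import math
from statistics import NormalDist

def sample_size_proportion(p0, p1, alpha, power):
    """Exact (unrounded) n for a one-sided z-test of p0 against p1."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)
    z_power = z.inv_cdf(power)
    root_n = (z_alpha * math.sqrt(p0 * (1 - p0))
              + z_power * math.sqrt(p1 * (1 - p1))) / abs(p0 - p1)
    return root_n ** 2

n_exact = sample_size_proportion(0.6, 0.4, alpha=0.10, power=0.90)
n = math.ceil(n_exact)

# Critical value from equation (3) with the rounded-up sample size.
c = 0.6 - NormalDist().inv_cdf(0.90) * math.sqrt(0.6 * 0.4 / n)
print(n_exact, n, c)
```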
7. A random sample of 36 packets of marshmallow weighs, on average, 145 grams with
a standard deviation of 5 grams. Test the hypothesis that µ = 150 grams against the
alternative hypothesis, µ < 150 grams, at the 0.05 level of significance.
(a) On average, it weighs less than 150 grams.
(b) On average, it weighs 150 grams.
Solution:
The null and the alternative hypothesis are:
H0 : µ = 150
HA : µ < 150
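Test: Reject $H_0$ if $\bar{X} < c$.
Given α = 0.05, we have
$$\alpha = P(\bar{X} < c \mid \mu = 150)$$
$$\Rightarrow 0.05 = P\left(\frac{\bar{X} - 150}{5/\sqrt{36}} < \frac{c - 150}{5/\sqrt{36}}\right)$$
$$\Rightarrow 0.05 = F_Z\left(\frac{c - 150}{5/\sqrt{36}}\right)$$
$$\Rightarrow -1.64 = \frac{c - 150}{5/\sqrt{36}} \Rightarrow c = 148.63$$
Since the observed sample mean $\bar{X} = 145 < 148.63$, we reject $H_0$: on average, a packet weighs less than 150 grams, which is option (a).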
8. A survey of 225 randomly selected students from a city revealed that 89.4% of them
have participated in extra curricular activities in their schools. Can we conclude at
1% level of significance that 90% of the students have participated in extra curricular
activities?
(a) Yes
(b) No
Solution:
Null hypothesis, H0 : p = 0.9
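Alternate hypothesis, $H_A : p \neq 0.9$
Given, α = 0.1 and n = 225
$$\alpha = P(|\bar{X} - p| > c \mid p = 0.9)$$
$$\alpha = P\left(\left|\frac{\bar{X} - 0.9}{\sqrt{0.9 \times 0.1/225}}\right| > \frac{c}{\sqrt{0.9 \times 0.1/225}}\right)$$
$$\alpha = P\left(|z| > \frac{c}{\sqrt{0.9 \times 0.1/225}}\right)$$
$$0.1 = 2F_z\left(\frac{-15c}{\sqrt{0.9 \times 0.1}}\right)$$
$$c = -\frac{\sqrt{0.9 \times 0.1}}{15} \times F_z^{-1}(0.05) = 0.03289$$
Since $|\bar{X} - p| = |0.894 - 0.9| = 0.006 < c$, the z-test at significance level α = 0.1 will accept H0.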
9. A box of a certain brand of washing powder advertises that it weighs 2.5 kg, but the
actual weight is 2.4 kg with a standard deviation of 0.1 kg. The company wants to
test if the mean has changed. They take a random sample of 100 boxes and find that
the average weight is 2.35 kg.
iv. H0 : µ = 2.5, HA : µ ≠ 2.5
(b) What conclusion should be made using a significance level of α = 0.05?
i. Accept H0 .
ii. Reject H0 and accept HA .
Solution:
The company wants to check if the mean has changed. So, null and alternative
hypothesis are given by
$$H_0 : \mu = 2.4, \quad H_A : \mu \neq 2.4$$
Define a test statistic $T$ as $T = \bar{X}$.
By CLT, we can say that $\frac{\bar{X} - 2.4}{0.1/\sqrt{100}} = \frac{\bar{X} - 2.4}{1/100} \sim \text{Normal}(0, 1)$.
Now,
10. It is claimed that the lifetimes of light bulbs are normally distributed with a mean of
800 hours and a standard deviation of 40 hours. We wish to test the hypothesis that
µ = 800 hours against the alternative that µ ≠ 800 hours with a sample size of 30.
(a) If the acceptance region is defined as 780 ≤ X̄ ≤ 820, find the significance level.
Enter the answer correct to three decimal places.
0.006, [0.005, 0.008]
Solution:
Let the random variable X denote the lifetime of electric bulbs.
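Given, $X \sim \text{Normal}(\mu, 40^2)$.
$X_1, \ldots, X_{30} \sim$ iid $X$.
For 30 iid samples of X, $\bar{X} \sim \text{Normal}(\mu, 40^2/30)$
Null hypothesis, $H_0 : \mu = 800$
Alternate hypothesis, $H_A : \mu \neq 800$
$$\alpha = P(\text{Reject } H_0 \mid H_0 \text{ is true})$$
$$\alpha = P(\bar{X} > 820 \text{ or } \bar{X} < 780 \mid \mu = 800)$$
$$\alpha = P(|\bar{X} - 800| > 20)$$
$$\alpha = P\left(\left|\frac{\bar{X} - 800}{\sqrt{1600/30}}\right| > \frac{20}{\sqrt{1600/30}}\right)$$
$$\alpha = P\left(|z| > \frac{20}{\sqrt{1600/30}}\right)$$
$$\alpha = 2F_z\left(\frac{-20}{\sqrt{1600/30}}\right)$$
α = 0.006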
(b) Find the power of the test against the alternative that if the true mean life is 788
hours. Enter the answer correct to two decimal places.
[0.91, 0.94]
Solution:
Statistics for Data Science - 2
Week 10 practice Assignment
Hypothesis testing
1. Consider nine samples from Normal(100, 2²). We wish to test H0 : µ = 100 against
HA : µ ≠ 100.
(i) If the acceptance region is defined as 98.5 ≤ X ≤ 101.5, find the significance level.
Write your answer correct to two decimal places.
(Use P (−2.25 < Z < 2.25) = 0.975)
Solution:
Given that
H0 : µ = 100, HA : µ ≠ 100
The acceptance region is defined as 98.5 ≤ X ≤ 101.5.
Now,
(ii) Find the power of the test against an alternative that the mean is 103. Write your
answer correct to two decimal places.
(Use P (−6.75 < Z < −2.25) = 0.012)
Solution:
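$$1 - \beta = P(\text{reject } H_0 \mid H_A \text{ is true})$$
$$= P((\bar{X} > 101.5 \text{ or } \bar{X} < 98.5) \mid \mu = 103)$$
$$= P(\bar{X} < 98.5) + P(\bar{X} > 101.5)$$
$$= P(\bar{X} - 103 < -4.5) + P(\bar{X} - 103 > -1.5)$$
$$= P\left(\frac{\bar{X} - 103}{2/3} < \frac{-4.5}{2/3}\right) + P\left(\frac{\bar{X} - 103}{2/3} > \frac{-1.5}{2/3}\right)$$
$$= P(Z < -6.75) + P(Z > -2.25)$$
$$= 1 - P(-6.75 < Z < -2.25)$$
$$= 1 - 0.012 = 0.988 \approx 0.99$$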
2. Air crew escape systems are powered by a solid propellant. The mean burning rate of
this propellant must be 50 centimeters per second. We know that the standard deviation
of burning rate is σ = 2 centimeters per second. An engineer suspects that the mean
burning rate is greater than 50. The engineer decides to test at a significance level of
0.05 and selects a random sample of n = 25 and obtains a sample average burning rate
of 51.3 centimeters per second.
H0 : µ = 50, HA : µ > 50
(ii) What is the critical value (c) if the acceptance region is X ≤ c? Write your answer
correct to two decimal places.
(use: FZ (1.64) = 0.95)
Solution:
If the significance level of the test is 0.05, then
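$$P(\text{reject } H_0 \mid H_0 \text{ is true}) = 0.05$$
$$\Rightarrow P(\bar{X} > c \mid \mu = 50) = 0.05$$
$$\Rightarrow P(\bar{X} - 50 > c - 50) = 0.05$$
$$\Rightarrow P\left(\frac{\bar{X} - 50}{2/5} > \frac{c - 50}{2/5}\right) = 0.05$$
$$\Rightarrow P\left(Z > \frac{c - 50}{2/5}\right) = 0.05$$
$$\Rightarrow 1 - F_Z\left(\frac{c - 50}{2/5}\right) = 0.05$$
$$\Rightarrow F_Z\left(\frac{c - 50}{2/5}\right) = 0.95$$
$$\Rightarrow \frac{c - 50}{2/5} = 1.64$$
$$\Rightarrow c = 50 + \frac{2}{5}(1.64)$$
$$\Rightarrow c = 50.656 \approx 50.66$$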
3. Suppose a manufacturer of memory chips observes that the probability of chip failure
is p = 0.05. A new procedure is introduced to improve the design of chips and lower
the probability of chip failure. To test this new procedure, 200 chips are produced using
this new procedure and tested. We would accept the new procedure if the total number
of failed chips is less than 5 out of 200. Find the significance level of the test. Use the
normal approximation. Write your answer correct to three decimal places.
(Use P (Z < −1.62) = 0.052)
Solution:
A new procedure is introduced to improve the design of chips and lower the probability
of chip failure. Therefore, null and alternative hypothesis will be
Define a test statistic T as T = number of failed chips out of 200.
Given that: We would accept the new procedure if the total number of failed chips is
less than 5 out of 200.
4. The mean lifetime of a sample of 100 light bulbs produced by a company is computed
to be 1570 hours with a standard deviation of 120 hours. µ is the mean lifetime of all
the bulbs produced by the company,
(i) Test the hypothesis µ = 1600 against the alternative hypothesis µ ≠ 1600 at a level
of significance of 0.05.
(a) Reject the null hypothesis
(b) Accept the null hypothesis
Solution:
Given that
H0 : µ = 1600, HA : µ ≠ 1600
Define a test statistic T as T = X.
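Test: reject $H_0$ if $|\bar{X} - 1600| > c$. Notice that when the null hypothesis is true, we have
$$\frac{\bar{X} - 1600}{120/10} \sim \text{Normal}(0, 1)$$
Now, at the significance level 0.05, $c = 1.96 \times \frac{120}{10} = 23.52$.
It implies that we will reject the null hypothesis if $|\bar{X} - 1600| > 23.52$.
Since $|1570 - 1600| = 30 > 23.52$, we reject the null hypothesis, which is option (a).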
(ii) Find the P -value. Write your answer correct to three decimal places.
Solution:
P -value is the minimum significance level at which null hypothesis is rejected for
the observed test statistic value.
Therefore, P -value is given by
5. The average IQ of the students of a school is reported to be 107 with a standard deviation
of 4. You suspect that the average may be higher, possibly 110, and decide to sample
students to find their IQs. What sample size do you need for a test at the significance
level 0.05 and power 0.95?
(Use: FZ (1.64) = 0.95 and FZ (−1.64) = 0.05)
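Solution:
According to the question, we have
$$H_0 : \mu = 107, \quad H_A : \mu > 107$$
Test: reject $H_0$ if $\bar{X} > c$.
Notice that when the null hypothesis is true, we have
$$\frac{\bar{X} - 107}{4/\sqrt{n}} \sim \text{Normal}(0, 1)$$
Now, the significance level of the test is given to be 0.05. It implies that
$$P(\bar{X} > c \mid \mu = 107) = 0.05 \Rightarrow c = 107 + (1.64)\frac{4}{\sqrt{n}} \quad ...(1)$$
When the alternative hypothesis (µ = 110) is true, we have
$$\frac{\bar{X} - 110}{4/\sqrt{n}} \sim \text{Normal}(0, 1)$$
$$1 - \beta = P(\text{reject } H_0 \mid H_A \text{ is true}) = 0.95$$
$$\Rightarrow P(\bar{X} > c) = 0.95$$
$$\Rightarrow P\left(\frac{\bar{X} - 110}{4/\sqrt{n}} > \frac{c - 110}{4/\sqrt{n}}\right) = 0.95$$
$$\Rightarrow P\left(Z > \frac{c - 110}{4/\sqrt{n}}\right) = 0.95$$
$$\Rightarrow 1 - P\left(Z \le \frac{c - 110}{4/\sqrt{n}}\right) = 0.95$$
$$\Rightarrow P\left(Z \le \frac{c - 110}{4/\sqrt{n}}\right) = 0.05$$
$$\Rightarrow \frac{c - 110}{4/\sqrt{n}} = -1.64$$
$$\Rightarrow c = 110 - (1.64)\frac{4}{\sqrt{n}} \quad ...(2)$$
From equations (1) and (2), $(1.64 + 1.64)\frac{4}{\sqrt{n}} = 3$, so $\sqrt{n} \approx 4.37$, $n \approx 19.13$, and we take n = 20.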
6. An instructor gives a quiz involving 10 true-false questions. To test the hypothesis that
the student is guessing, the following decision rule is decided: (i) If 7 or more are correct,
the student is not guessing; (ii) if fewer than 7 are correct, the student is guessing. Find
the significance level of the test. Write your answer correct to two decimal places.
(Hint: If student is guessing then, probability of getting a question correct is p = 0.5)
Solution:
If a student is guessing, each answer is equally likely to be right or wrong,
that is, p = 0.5. If the student is not guessing, the probability of getting a
question correct is more than 0.5, that is, p > 0.5.
It implies that
H0 : p = 0.5, HA : p > 0.5
Define a test statistic T as T = number of correct answers out of ten.
As per the given information, we will reject the null hypothesis if T ≥ 7
Now,
$$\alpha = P(T \ge 7 \mid p = 0.5) = \sum_{k=7}^{10} \binom{10}{k} (0.5)^{10}$$
$$= (120 + 45 + 10 + 1)(0.00097)$$
$$= 0.17$$
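Because n = 10 is small, the significance level can be computed exactly from the binomial distribution without any approximation. A minimal sketch follows; `binom_tail` is a hypothetical helper name.

```python
from math import comb

def binom_tail(n, p, k_min):
    """Exact P(T >= k_min) for T ~ Binomial(n, p)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(k_min, n + 1))

# Reject "the student is guessing" (p = 0.5) when 7 or more answers are correct.
alpha = binom_tail(10, 0.5, 7)
print(alpha)
```

The binomial coefficients for k = 7, 8, 9, 10 are 120, 45, 10 and 1, matching the sum in the solution.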
7. A cricket ball production line must produce balls weighing 163 g with a standard
deviation of 4 g in order to get a top rating. To test the hypothesis that the mean weight of the
balls is 163 g, a sample of 16 balls is considered. If we want a 0.01 level of significance,
what will be the acceptance region?
(Use FZ (2.57) = 0.995)
Solution:
Since a cricket ball production line must produce balls weighing 163 g, the null and
alternative hypotheses are given by
$$H_0 : \mu = 163, \quad H_A : \mu \neq 163$$
Notice that when the null hypothesis is true, $\frac{\bar{X} - 163}{4/4} = \bar{X} - 163 \sim \text{Normal}(0, 1)$.
Now, the significance level of the test is given to be 0.01. It implies that
P (reject H0 |H0 is true) = 0.01
⇒P (|X − 163| > c) = 0.01
⇒P (|Z| > c) = 0.01
⇒2P (Z < −c) = 0.01
⇒FZ (−c) = 0.005
⇒ − c = −2.57
⇒c = 2.57
Therefore, acceptance region will be [163 − 2.57, 163 + 2.57] = [160.43, 165.57].
8. A researcher has recently come into contact with a number of left-handed artists and
wonders whether artists are more likely to be left-handed than people in the general
population. She selects a random sample of 150 members of the Artists and asks each
whether they are left-handed or not. The sample proportion (who are left-handed) is
0.15. Suppose that 10% of people are left-handed in the general population.
(i) Does the data provide strong evidence that artists are more likely than the general
public to be left-handed if she decides a significance level of 0.05?
(a) Yes
(b) No
Solution:
10% of people are left-handed in the general population but a researcher wonders
whether artists are more likely to be left-handed. So, probability of an artist being
left-handed will be more than 0.1. Therefore, null and alternative hypothesis are
given by
H0 : p = 0.1, HA : p > 0.1
Define a test statistic $T$ as $T = \bar{X} = \frac{X_1 + X_2 + \ldots + X_{150}}{150}$, where each $X_i \sim$
Bernoulli(0.1) (if the null hypothesis is true).
Therefore, $E[\bar{X}] = p = 0.1$ and $\mathrm{Var}(\bar{X}) = \frac{p(1 - p)}{n} = \frac{(0.1)(0.9)}{150} = \frac{0.09}{150}$
Then, by CLT, $\frac{\bar{X} - 0.1}{\sqrt{0.09/150}} \sim \text{Normal}(0, 1)$.
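$$P(\text{reject } H_0 \mid H_0 \text{ is true}) = 0.05$$
$$\Rightarrow P(\bar{X} > c) = 0.05$$
$$\Rightarrow P\left(\frac{\bar{X} - 0.1}{\sqrt{0.09/150}} > \frac{c - 0.1}{\sqrt{0.09/150}}\right) = 0.05$$
$$\Rightarrow P\left(Z > \frac{c - 0.1}{\sqrt{0.09/150}}\right) = 0.05$$
$$\Rightarrow 1 - P\left(Z \le \frac{c - 0.1}{\sqrt{0.09/150}}\right) = 0.05$$
$$\Rightarrow F_Z\left(\frac{c - 0.1}{\sqrt{0.09/150}}\right) = 0.95$$
$$\Rightarrow c = 0.1 + (1.64)\frac{0.3}{\sqrt{150}}$$
$$\Rightarrow c = 0.14$$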
Since, X = 0.15 > 0.14, we will reject the null hypothesis. It implies that artists
are more likely than the general public to be left-handed if she decides a significance
level of 0.05.
(ii) Find the P -value. Write your answer correct to three decimal places.
Solution:
P -value is the minimum significance level at which null hypothesis is rejected for
the observed test statistic value.
Therefore, P -value is given by
$$\alpha = P(\bar{X} > 0.15)$$
$$= P(\bar{X} - 0.1 > 0.15 - 0.1)$$
$$= P\left(\frac{\bar{X} - 0.1}{\sqrt{0.09/150}} > \frac{0.05}{\sqrt{0.09/150}}\right)$$
$$= P(Z > 2.04)$$
$$= P(Z < -2.04)$$
$$= 0.02$$
9. A cereal manufacturer tests its equipment weekly to be assured that the correct weight of
cereal is in each box. The company wants to test if the weight differs from the expected
weight. The weight of each box is expected to be 500g with a standard deviation of
100g. The manufacturer takes a random sample of 100 boxes and finds that the average
weight is 520g. What is the sample’s P -value? Write your answer correct to two decimal
places.
(Use FZ (−2) = 0.022)
Solution:
The company wants to test if the weight differs from the expected weight and the weight
of each box is expected to be 500g. So, null and alternative hypothesis are given by
$$H_0 : \mu = 500, \quad H_A : \mu \neq 500$$
By CLT, we can say that $\frac{\bar{X} - 500}{100/\sqrt{100}} = \frac{\bar{X} - 500}{10} \sim \text{Normal}(0, 1)$.
P -value is the minimum significance level at which null hypothesis is rejected for the
observed test statistic value.
Therefore, P -value is given by
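For a two-sided z-test, the P-value is twice the tail probability beyond the observed standardised statistic. The sketch below computes it with exact normal values, so it may differ slightly from an answer based on the rounded hint F_Z(−2) = 0.022; `two_sided_p_value` is a hypothetical helper name.

```python
from statistics import NormalDist

def two_sided_p_value(xbar, mu0, sigma, n):
    """P-value of a two-sided z-test for the mean with standard deviation sigma."""
    z = (xbar - mu0) / (sigma / n ** 0.5)     # standardised test statistic
    return 2 * NormalDist().cdf(-abs(z))      # both tails beyond |z|

# Cereal boxes: mean 520 g observed against 500 g, sigma = 100 g, n = 100 (z = 2).
p_cereal = two_sided_p_value(520, 500, 100, 100)

# Light bulbs from question 4: mean 1570 h against 1600 h, s = 120 h, n = 100 (z = -2.5).
p_bulbs = two_sided_p_value(1570, 1600, 120, 100)
print(p_cereal, p_bulbs)
```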
10. A machine produces iron rods of mean weight 12kg with a standard deviation of 2kg. An
engineer suspects that average weight is less than 12kg, probably 10kg. So, he collects
the weights of n iron rods. He wants the significance level to be less than $10^{-4}$ and
probability of type two error to be less than $10^{-8}$.
(use $F_Z(-3.74) = 10^{-4}$ and $F_Z(5.61) = 1 - 10^{-8}$)
Solution:
According to the question, we have
H0 : µ = 12, HA : µ < 12
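Test: reject $H_0$ if $\bar{X} < c$.
Notice that when the null hypothesis is true, we have
$$\frac{\bar{X} - 12}{2/\sqrt{n}} \sim \text{Normal}(0, 1)$$
Now, the significance level of the test is given to be less than $10^{-4}$. It implies that
$$P(\bar{X} < c \mid \mu = 12) = F_Z\left(\frac{c - 12}{2/\sqrt{n}}\right) \le 10^{-4} \Rightarrow c \le 12 - (3.74)\frac{2}{\sqrt{n}} \quad ...(1)$$
When the alternative hypothesis (µ = 10) is true, we have
$$\frac{\bar{X} - 10}{2/\sqrt{n}} \sim \text{Normal}(0, 1)$$
Now, the probability of type two error is to be less than $10^{-8}$. It implies that
$$P(\bar{X} \ge c \mid \mu = 10) = 1 - F_Z\left(\frac{c - 10}{2/\sqrt{n}}\right) \le 10^{-8} \Rightarrow c \ge 10 + (5.61)\frac{2}{\sqrt{n}} \quad ...(2)$$
From equations (1) and (2), we have
$$12 - (3.74)\frac{2}{\sqrt{n}} = 10 + (5.61)\frac{2}{\sqrt{n}}$$
$$\Rightarrow (5.61 + 3.74)\frac{2}{\sqrt{n}} = 2$$
$$\Rightarrow \sqrt{n} = 9.35$$
$$\Rightarrow n = 87.42$$
$$\Rightarrow n = 88$$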
(ii) Find the critical value (for the acceptance region to be defined as X ≥ c, where
X is the mean weight of the rods). Write your answer correct to two decimal places.
Solution:
Putting the value of n in the equation (1), we have
$$c \le 12 - (3.74)\frac{2}{\sqrt{88}}$$
$$\Rightarrow c \le 11.20 \quad ...(3)$$
Statistics for Data Science - 2
1. The IQs (intelligence quotients) of 25 students from one batch of IITM students showed
a mean of 110 with a standard deviation of 8, while the IQs of 25 students from another
batch of IITM students showed a mean of 115 with a standard deviation of 7. Is there
a significant difference between the IQs of the two groups at a 0.05 level of significance?
a) Yes
b) No
2. A sociologist focusing on popular culture and media believes that the average number
of hours per week (hrs/week) spent on social media is different for men and women.
The researcher knows that the standard deviations of amount of time spent on social
media are 5 hrs/week and 6 hrs/week for men and women, respectively. Examining two
independent random samples of 64 individuals each, if the average number of hrs/week
spent on social media for the sample of men is 1.5 hours greater than that for the sample
of women, what conclusion can be made from a hypothesis test where H0 : µM = µW
and HA : µM ≠ µW ? Take α = 0.05.
a) Reject H0
b) Accept H0
Solution:
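Let $X_i$ and $Y_i$ represent the number of hrs/week spent on social media by the sampled men
and women respectively.
$X_1, X_2, \ldots, X_{64} \sim N(\mu_1, 5^2)$ and $Y_1, Y_2, \ldots, Y_{64} \sim N(\mu_2, 6^2)$
$|\bar{X} - \bar{Y}| = 1.5$
Consider, $H_0 : \mu_1 = \mu_2$, $H_A : \mu_1 \neq \mu_2$
$T = \bar{X} - \bar{Y} \sim N\left(\mu_1 - \mu_2, \frac{25}{64} + \frac{36}{64}\right)$, i.e. $N\left(\mu_1 - \mu_2, \frac{61}{64}\right)$
Test: Reject $H_0$ if $|T| > c$.
$$\alpha = P(|T| > c \mid H_0) = P\left(\left|\frac{T}{\sqrt{61/64}}\right| > \frac{c}{\sqrt{61/64}}\right) = P\left(|Z| > \frac{c}{\sqrt{61/64}}\right) = 2F_Z\left(\frac{-c}{\sqrt{61/64}}\right)$$
$$\Rightarrow c = -\sqrt{\frac{61}{64}}\, F_Z^{-1}(\alpha/2)$$
$$\Rightarrow c = -\sqrt{\frac{61}{64}}\, F_Z^{-1}(0.025)$$
$$\Rightarrow c = -\sqrt{\frac{61}{64}} \times (-1.96) = 1.913$$
Since $|\bar{X} - \bar{Y}| = 1.5 < 1.913$,
we will accept $H_0$.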
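The critical value for the difference of two sample means with known standard deviations can be sketched as below. The function name `two_sample_z_critical` is a hypothetical helper; the quantile comes from `statistics.NormalDist`.

```python
from statistics import NormalDist

def two_sample_z_critical(sigma1, n1, sigma2, n2, alpha):
    """Two-sided critical value c for |mean1 - mean2| with known standard deviations."""
    se = (sigma1 ** 2 / n1 + sigma2 ** 2 / n2) ** 0.5   # sd of the difference of means
    return NormalDist().inv_cdf(1 - alpha / 2) * se

c = two_sample_z_critical(5, 64, 6, 64, alpha=0.05)
observed_difference = 1.5
reject_h0 = observed_difference > c
print(c, reject_h0)
```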
3. An IITM instructor conducts two live sessions for two different classes, call it A and
B, in Statistics. Session A had 25 students attending while session B had 36 students.
The instructor conducted a test for the two sessions. Although there was no significant
difference in mean grades, session A had a standard deviation of 10 while session B had
a standard deviation of 14. Can we conclude at the 0.01 level of significance that the
variability in marks of class B is greater than that of A?
a) Yes
b) No
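Test: Reject $H_0$ if $\frac{S_B^2}{S_A^2} > 1 + c_R$
We know that $\frac{S_B^2}{S_A^2} \sim F(n_2 - 1, n_1 - 1)$
$n_1 = 25$, $n_2 = 36$
$\Rightarrow \frac{S_B^2}{S_A^2} \sim F(35, 24)$
Therefore,
$$\alpha = 1 - F_{F(35,24)}(1 + c_R)$$
$$\Rightarrow 1 + c_R = F_{F(35,24)}^{-1}(1 - \alpha) = F_{F(35,24)}^{-1}(0.99)$$
$$\Rightarrow 1 + c_R = 2.529$$
Since $\frac{S_B^2}{S_A^2} = \frac{14^2}{10^2} = 1.96 < 2.529$,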
Therefore, we will accept H0 .
This implies that at the 0.01 level of significance the variability in marks of class B is
not greater than that of A.
4. The manufacturer of a new car claims that a typical car gets a mileage of 40 kilometres
per litre. We think that the mileage is less. To test our suspicion, we perform the
hypothesis test with H0 : µ = 40 and HA : µ < 40. Suppose we take a random sample of
900 new cars and find that their average mileage is 39.8 kilometres per litre and sample
standard deviation is 2, what does a t-test say about a null hypothesis with a significance
level of 0.05?
a) Reject H0
b) Accept H0
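$\alpha = P\left(\frac{\bar{X} - 40}{\sqrt{4/900}} < \frac{c - 40}{\sqrt{4/900}}\right)$
$\alpha = F_{t_{899}}\left(\frac{c - 40}{\sqrt{4/900}}\right)$
$0.05 = F_{t_{899}}\left(\frac{c - 40}{\sqrt{4/900}}\right)$
$c = 40 + \sqrt{\frac{4}{900}}\, F^{-1}_{t_{899}}(0.05)$
$c = 39.89$
Since $\bar{X} < c$, reject $H_0$.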
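A rough numerical check of this cutoff is sketched below. It makes one loudly stated assumption: with 899 degrees of freedom the $t$ quantile is approximated by the standard normal quantile, because the Python standard library has no $t$-distribution. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, n, sample_mean, sample_sd, alpha = 40, 900, 39.8, 2, 0.05

se = sample_sd / sqrt(n)                  # estimated standard error, sqrt(4/900)
# Assumption: the normal quantile stands in for the t quantile with 899 d.o.f.
c_approx = mu0 + se * NormalDist().inv_cdf(alpha)
t_stat = (sample_mean - mu0) / se         # standardised test statistic
reject_h0 = sample_mean < c_approx        # left-tailed test: reject for small means
print(round(c_approx, 2), round(t_stat, 2), reject_h0)
```

The approximation is used here only because the sample is large; for small samples the exact $t$ quantile should be used instead.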
5. The standard deviation of weights of 70 gram bags of white cheddar popcorn is expected
to be 2.5 grams. A random sample of 20 packages showed a standard deviation of 3
grams. Is the apparent increase in variability significant at the 0.05 level?
a) Yes
b) No
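$P(S^2 > c^2) = 0.05$
$\Rightarrow P\left(\frac{19S^2}{2.5^2} > \frac{19c^2}{2.5^2}\right) = 0.05$
$\Rightarrow P\left(\chi^2_{19} > \frac{19c^2}{2.5^2}\right) = 0.05$
$\Rightarrow 1 - P\left(\chi^2_{19} < \frac{19c^2}{2.5^2}\right) = 0.05$
$\Rightarrow \frac{19c^2}{2.5^2} = 30.14$
$\Rightarrow c^2 = \frac{6.25 \times 30.14}{19} = 9.91$
Since $S^2 = 9 < 9.91$, we will not reject the null hypothesis.
Therefore, the apparent increase in variability is not significant at the 0.05 level.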
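The arithmetic for the cutoff can be reproduced as follows. This is a minimal sketch: the function name `variance_upper_cutoff` is hypothetical, and the chi-square quantile 30.14 is taken from the solution rather than computed.

```python
def variance_upper_cutoff(sigma0, n, chi2_quantile):
    """Cutoff c^2 such that H0: sigma = sigma0 is rejected when S^2 > c^2."""
    # (n - 1) * S^2 / sigma0^2 follows a chi-square law with n - 1 d.o.f. under H0
    return sigma0 ** 2 * chi2_quantile / (n - 1)


# 30.14 is the 0.95 quantile of chi-square(19) used in the solution
c_sq = variance_upper_cutoff(sigma0=2.5, n=20, chi2_quantile=30.14)
s_sq = 3 ** 2                 # observed sample variance
reject_h0 = s_sq > c_sq
print(round(c_sq, 2), reject_h0)
```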
6. Independent random samples of ceramic produced by two different processes were tested
for hardness. The results are:
Process 1 Process 2
8.5 9.0
9.5 9.5
8.0 10.5
9.0 9.5
10.0 10.0
9.5 9.0
10.5 9.0
10.0 9.5
Table 11.1.G
Can we conclude at 5% level of significance that the variances in hardness are equal?
a) Yes
b) No
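We know that $\frac{S_X^2}{S_Y^2} \sim F(n_1 - 1, n_2 - 1)$, where $n_1 = 8$, $n_2 = 8$.
$\Rightarrow \frac{S_X^2}{S_Y^2} \sim F(7, 7)$
Therefore,
$\alpha/2 = F_{F(7,7)}(1 - c_L)$
$\Rightarrow 1 - c_L = F^{-1}_{F(7,7)}(\alpha/2) = F^{-1}_{F(7,7)}(0.025)$
$\Rightarrow 1 - c_L = 0.2$
From the data, $S_X^2 = 0.6964$ and $S_Y^2 = 0.2857$, so $\frac{S_X^2}{S_Y^2} = \frac{0.6964}{0.2857} = 2.437 > 0.2$.
Similarly, we can check the other condition.
$\alpha/2 = 1 - F_{F(7,7)}(1 + c_R)$
$\Rightarrow 1 + c_R = F^{-1}_{F(7,7)}(1 - \alpha/2) = F^{-1}_{F(7,7)}(0.975)$
$\Rightarrow 1 + c_R = 4.99$
Since $\frac{S_X^2}{S_Y^2} = \frac{0.6964}{0.2857} = 2.437 < 4.99$, we will accept $H_0$.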
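The sample variances can be recomputed directly from Table 11.1.G. The sketch below uses `statistics.variance` (which divides by $n - 1$) and takes the two F(7, 7) cutoffs, 0.2 and 4.99, from the solution rather than computing them.

```python
from statistics import variance

process_1 = [8.5, 9.5, 8.0, 9.0, 10.0, 9.5, 10.5, 10.0]
process_2 = [9.0, 9.5, 10.5, 9.5, 10.0, 9.0, 9.0, 9.5]

s_x_sq = variance(process_1)   # sample variance of Process 1 (n - 1 denominator)
s_y_sq = variance(process_2)   # sample variance of Process 2
ratio = s_x_sq / s_y_sq

# F(7, 7) quantiles at 0.025 and 0.975, as quoted in the solution
lower, upper = 0.2, 4.99
reject_h0 = ratio < lower or ratio > upper
print(round(s_x_sq, 4), round(s_y_sq, 4), round(ratio, 3), reject_h0)
```

The ratio falls between the two cutoffs, which is the condition for not rejecting equality of variances in a two-sided test.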
H0 : σ = 4
HA : σ = 5
Samples X1 , . . . , X10 are observed.
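Now, the likelihood ratio is
$L = \frac{\prod_{i=1}^{10} \text{Normal}(0, 5^2)}{\prod_{i=1}^{10} \text{Normal}(0, 4^2)}$
$= \frac{\left(\frac{1}{5}\right)^{10} \exp\left(-\frac{\sum_{i=1}^{10} X_i^2}{50}\right)}{\left(\frac{1}{4}\right)^{10} \exp\left(-\frac{\sum_{i=1}^{10} X_i^2}{32}\right)}$
$= \left(\frac{4}{5}\right)^{10} \exp\left(\sum_{i=1}^{10} X_i^2 \left(\frac{-1}{50} + \frac{1}{32}\right)\right)$
$= \left(\frac{4}{5}\right)^{10} \exp\left(\frac{9}{800} \sum_{i=1}^{10} X_i^2\right)$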
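As a sanity check on the simplification, the closed form can be compared with a direct ratio of normal densities. This is a sketch under stated assumptions: the sample values are hypothetical (the observed data are not given in the text above), and both function names are illustrative.

```python
from math import exp, isclose, prod
from statistics import NormalDist


def likelihood_ratio_closed_form(samples):
    """(4/5)^n * exp((9/800) * sum of x^2) for H0: sigma = 4 versus HA: sigma = 5."""
    n = len(samples)
    return (4 / 5) ** n * exp(9 / 800 * sum(x * x for x in samples))


def likelihood_ratio_direct(samples):
    """Ratio of the joint Normal(0, 5^2) density to the joint Normal(0, 4^2) density."""
    numerator = prod(NormalDist(0, 5).pdf(x) for x in samples)
    denominator = prod(NormalDist(0, 4).pdf(x) for x in samples)
    return numerator / denominator


# Hypothetical sample of size 10, used only to compare the two expressions
samples = [1.2, -3.4, 0.5, 4.1, -2.2, 6.0, -0.7, 3.3, -5.1, 2.8]
closed = likelihood_ratio_closed_form(samples)
direct = likelihood_ratio_direct(samples)
agree = isclose(closed, direct, rel_tol=1e-9)
print(agree)
```

The $\sqrt{2\pi}$ factors cancel between numerator and denominator, which is why they do not appear in the closed form.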
Table 11.2.G
a) Yes
b) No
Hint: Use $F^{-1}_{\chi^2_9}(0.95) = 16.9$
Solution:
The value of the test statistic $T$ is given by
$T = \frac{(17 - 25)^2}{25} + \frac{(31 - 25)^2}{25} + \frac{(29 - 25)^2}{25} + \frac{(18 - 25)^2}{25} + \frac{(14 - 25)^2}{25}$
$+ \frac{(20 - 25)^2}{25} + \frac{(35 - 25)^2}{25} + \frac{(30 - 25)^2}{25} + \frac{(20 - 25)^2}{25} + \frac{(36 - 25)^2}{25}$
$= \frac{582}{25}$
$= 23.28$
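The statistic can be recomputed from the ten observed counts that appear in the sum. This is a minimal sketch, assuming an expected count of 25 in each of the 10 categories (as used in the solution) and taking the critical value 16.9 from the hint.

```python
observed = [17, 31, 29, 18, 14, 20, 35, 30, 20, 36]
expected = sum(observed) / len(observed)   # 25 per category, as in the solution

# Chi-square goodness-of-fit statistic: sum of (observed - expected)^2 / expected
t_stat = sum((o - expected) ** 2 / expected for o in observed)

critical = 16.9                            # 0.95 quantile of chi-square(9), from the hint
reject_h0 = t_stat > critical
print(round(t_stat, 2), reject_h0)
```

With these numbers the statistic exceeds the quoted critical value, so the test would reject the null hypothesis at the 0.05 level.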
9. On two major e-commerce websites A and B, the sales on a particular day are given as a
contingency table [Table 11.3.G].
A B Total
Bought item 1000 1500 2500
Didn’t buy item 1200 2000 3200
Total 2200 3500 5700
Can we say that sales are independent of the website at a significance level of 0.05?
a) Yes
b) No
$0.05 = P(T > c)$
$\Rightarrow 0.05 = 1 - P(T \leq c)$
$\Rightarrow P(T \leq c) = 0.95$
$\Rightarrow F_{\chi^2_1}(c) = 0.95$
$\Rightarrow c = 3.84$
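The test statistic for Table 11.3.G can be computed from the expected counts under independence (row total times column total, divided by the grand total). The following is a sketch, with the critical value 3.84 taken from the derivation rather than computed; the variable names are illustrative.

```python
table = [[1000, 1500],   # bought item:      website A, website B
         [1200, 2000]]   # didn't buy item:  website A, website B

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

t_stat = 0.0
for i, row in enumerate(table):
    for j, obs in enumerate(row):
        # Expected count in cell (i, j) if purchase and website were independent
        exp_ij = row_totals[i] * col_totals[j] / n
        t_stat += (obs - exp_ij) ** 2 / exp_ij

critical = 3.84          # 0.95 quantile of chi-square(1)
reject_h0 = t_stat > critical
print(round(t_stat, 3), reject_h0)
```

A 2 x 2 table has (2 - 1)(2 - 1) = 1 degree of freedom, which is why the chi-square distribution with one degree of freedom is used for the cutoff.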
Statistics for Data Science - 2
1. 46 people are divided into two groups: experimental and control. The experimental
group is inoculated against a disease while the control group is not. Both groups
are then exposed to the disease, and the data obtained are recorded in the contingency
table given below:
Use the Chi-squared test to check whether inoculation and contracting the disease are
independent at a significance level of 5%.
(a) Yes
(b) No
2. Consider the following cross tabulation of status of completion of courses by learners
across three different websites:
(a) Yes
(b) No
3. Suppose that a sample of 150 electron tubes are tested and the following summary of
their life length (in hours), T is reported:
Life length 0 ≤ T < 100 100 ≤ T < 200 200 ≤ T < 300 T > 300
Number of electron tubes 47 40 35 28
The sample mean is recorded to be 200 hours. Test the hypothesis that T is exponentially
distributed at a significance level of 0.01 using the Chi-squared test.
4. The number of accidental deaths in a country are tabulated each day for a specified
period of 400 days, along with expected frequencies according to a Poisson fit.
Number of deaths 0 1 2 3 4+
Observed frequency 1448 805 206 34 7
Expected frequency 1450 775 200 25 50
Use Chi-squared test to check if the above fit is acceptable at a significance level of
5%.
(a) Yes
(b) No
5. Let X1, . . . , Xn ∼ Normal(0, σ²), and consider testing H0 : σ = 1 versus HA : σ = 2.
Find the likelihood ratio of the observed samples 2, 3, 1, 4.4, 5, 5, 3.6, 6, 4, 6.
6. A random sample of 40 fibres manufactured using process A has a mean length of 16.7
cm, and standard deviation of 0.5 cm. A random sample of 60 fibres manufactured
using process B has mean length of 16.4 cm and standard deviation of 0.6 cm. Test
the hypothesis that the mean length of fibres manufactured using process A and B are
the same.
7. One wants to check if the average IQs of girls and boys are the same. It is known
that the IQs of both boys and girls have a standard deviation of 10. The mean IQ of 200
randomly selected boys is 99 and the mean IQ of 300 randomly selected girls is 97. Using
a 1% level of significance, comment on the IQs of girls and boys.
Hint: $F_Z(-2.58) = 0.005$
(a) The average IQs of boys and girls are the same.
(b) The average IQ of boys is higher than that of girls.
(c) The average IQ of boys is lower than that of girls.
8. The amount of saturated fat present in 100 grams of cheese of two different brands are
measured. The data are as follows:
Brand A Brand B
21 20
19 39
20 24
23 33
22 30
28 28
32 30
19 22
13 33
18 24
Can we conclude at 5% level of significance that the two variances are equal?
Hint: $s_A = 5.3177$, $s_B = 5.8699$, $F_{F(9,9)}(0.248) = 0.025$
(a) Yes
(b) No
9. A study is conducted to compare the duration of time a certain dose of pain reliever
works when administered to men and women. Standard deviation for a random sample
of 11 men is found to be 6.1 and, for a random sample of 14 women, it is found to be
5.3. Use a significance level of α = 0.05 to check the hypothesis that the variation in
time of relief is equal for both genders against the alternative that it is larger for men.
Hint: Use $F^{-1}_{F(10,13)}(0.95) = 2.67$.
(a) Reject H0 .
(b) Fail to reject H0 .
10. Past experience indicates that the time required for athletes to complete a 200 m race
is a normal random variable with a mean µ = 35 seconds. If a random sample of 20
athletes took an average of 33.1 seconds to complete the race with a standard deviation
of 4.3 seconds, test the hypothesis, at the 0.05 level of significance, that µ = 35 seconds
against the alternative that µ < 35 seconds.
11. A company manufactures mobile chargers with an output voltage of 5 V and variance
0.5 V². The company wants to test the variance. They take a random sample of 12
chargers and the following voltages are obtained:
5.34, 5.65, 4.76, 5.00, 5.55, 5.54, 5.07, 5.35, 5.44, 5.25, 5.35, 4.61
Test the hypothesis that σ² = 0.5 at the 0.05 level of significance.
(a) Accept H0
(b) Reject H0
12. The standard deviation of a component in a drug is expected to be 0.00002 kg. A
pharmacist, suspecting the variability to be higher, obtains a sample of 8 drugs and
finds the sample standard deviation to be 0.00005 kg.